# Post-processing the transcribed audio

Transcriptions of lectures and conferences contain a lot of speech that carries no content:
- Repetitions of words and fragments of phrases
- Incomplete sentences and interactions with the audience

None of these add meaning to the chunks or to their embeddings.

***The goal of this notebook is to clean up the transcriptions with a GPT-style language model.***

The input will be a set of .json files containing the transcription chunks.
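
Judging from the sample output printed further down in this notebook, each chunk is an SRT-style string: an index line, a timing line, then the text. A minimal sketch of splitting one chunk into its parts follows. The names `SrtChunk` and `parse_srt_chunk` are hypothetical helpers, not part of the notebook, and the layout is assumed from that sample rather than from a specification.

```python
import re
from typing import NamedTuple


class SrtChunk(NamedTuple):
    """One transcription chunk (hypothetical container for illustration)."""
    index: int
    start: str
    end: str
    text: str


# Timing line such as "00:00:00,000 --> 00:00:59,820"
_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})"
)


def parse_srt_chunk(raw: str) -> SrtChunk:
    """Split an 'index\\ntiming\\ntext' string into its fields."""
    lines = raw.strip().split("\n")
    if len(lines) < 3:
        raise ValueError("expected an index line, a timing line and text")
    match = _TIMING.fullmatch(lines[1].strip())
    if match is None:
        raise ValueError(f"unrecognized timing line: {lines[1]!r}")
    # Text may span several lines; join them into a single string
    text = " ".join(line.strip() for line in lines[2:])
    return SrtChunk(int(lines[0]), match.group(1), match.group(2), text)
```

Only the `text` field would need to go through the language model; the index and timings can be kept aside and reattached afterwards.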

# We will use ollama

See the Ollama API documentation: https://www.postman.com/postman-student-programs/workspace/ollama-api/documentation/21521806-f48dc31a-a9f1-4dad-9082-fd07f5cd2fda

We will use two API endpoints:
- /api/generate -> for single prompt generation
- /api/chat -> for chat conversations

## Important!!! Check the models that DO have Chat 

- llama2 models offer chat
- mistral and mixtral do not!
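
Before picking a model it can help to see which ones are installed locally. The sketch below queries Ollama's `/api/tags` endpoint, which lists local models. The function names are hypothetical, the base URL is assumed to be the default local one, and the endpoint only reports what is installed, not whether a model supports chat.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list:
    """Extract the model names from a parsed /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list:
    """Ask a running Ollama server which models are installed locally."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=10) as resp:
        return model_names(json.load(resp))
```

With a server running, `print(list_local_models())` shows the installed model tags.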


# Setup


In [4]:
import requests
import json

# Define the ollama URL
ollama_url = "http://localhost:11434"
ollama_generate_url = ollama_url+"/api/generate"
ollama_chat_url = ollama_url+"/api/chat"
# Choose your preferred model
ollama_generate_model = "mixtral"
ollama_chat_model = "mixtral"
#ollama_model = "llama2"
default_temperature = 0.7 # as suggested by Mistral API docs https://docs.mistral.ai/api/

# Helper functions



In [11]:
# Helper functions  
def ollama_chat(messages, temperature=default_temperature):
    """Send a conversation to /api/chat and return the assistant message dict."""
    data = {
        'model': ollama_chat_model,
        'messages': messages,
        # Ollama reads sampling parameters from 'options', not from the top level
        'options': {'temperature': temperature},
        'stream': False
    }

    # json=... serializes the payload and sets the Content-Type header
    response = requests.post(ollama_chat_url, json=data)

    if response.status_code == 200:
        # The reply is a dict with 'role' and 'content' keys
        return response.json()["message"]
    else:
        return f"Request failed with status code {response.status_code}"

def ollama_generate(prompt, system="You are a helpful assistant", temperature=default_temperature):
    """Send a single prompt to /api/generate and return the generated text."""
    data = {
        'model': ollama_generate_model,
        'prompt': prompt,
        'system': system,
        # Ollama reads sampling parameters from 'options', not from the top level
        'options': {'temperature': temperature},
        'stream': False
    }

    # json=... serializes the payload and sets the Content-Type header
    response = requests.post(ollama_generate_url, json=data)

    if response.status_code == 200:
        # With 'stream': False the whole completion arrives in 'response'
        return response.json()["response"]
    else:
        return f"Request failed with status code {response.status_code}"
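
One way the cleanup goal of this notebook could be wired onto these helpers is sketched below. The prompt wording, the names `CLEANUP_SYSTEM_PROMPT`, `build_cleanup_prompt` and `clean_chunk`, and the choice of temperature 0.0 are all assumptions for illustration, not the notebook's actual method. The `generate` argument is any callable with the same signature as `ollama_generate`.

```python
# Hypothetical system prompt; the wording is an assumption, tune it as needed
CLEANUP_SYSTEM_PROMPT = (
    "You clean up lecture transcriptions. Remove repeated words, "
    "incomplete sentences and exchanges with the audience. Keep the meaning "
    "and the original language. Return only the cleaned text."
)


def build_cleanup_prompt(chunk_text: str) -> str:
    """Wrap one chunk of transcription text in a cleanup instruction."""
    return "Clean up the following transcription chunk:\n\n" + chunk_text.strip()


def clean_chunk(chunk_text: str, generate) -> str:
    """Clean one chunk with any callable shaped like ollama_generate."""
    cleaned = generate(
        build_cleanup_prompt(chunk_text),
        system=CLEANUP_SYSTEM_PROMPT,
        temperature=0.0,  # assumed: low temperature to keep rewrites conservative
    )
    return cleaned.strip()
```

In the notebook this would be called as `clean_chunk(text, ollama_generate)`. Passing the generator in as an argument keeps the sketch testable without a running server.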

    

# Sample usage of ollama_generate

In [3]:
# Sample usage of ollama_generate

prompt="what is the meaning of life?"
system="talk like a pirate"
response=ollama_generate(prompt,system=system)
print(response)
   

 Arr matey, ye be askin' a question as old as time itself. The meaning o' life, you say? Well, I ain't no philosopher, but I'll give it me best shot.

To me, the meaning o' life be about findin' yer own path and makin' the most of what ye've got. It be about seekin' adventure, discoverin' new lands, and makin' memories with those ye hold dear.

But at the end o' the day, the meaning o' life be different for every soul on this earth. So set sail on yer own journey, and discover what it means to you. After all, life be too short to spend it wonderin', so get out there and start livin', ye scurvy dog!


# Sample usage of ollama_chat

In [10]:
# Sample usage of ollama_chat
messages= [
        {"role": "system", "content": "talk like a surfer dude."},
        {"role": "user", "content": "What is the meaning of life."},
##        {"role": "assistant", "content": "possible responses in the context?"},
##        {"role": "user", "content": "more stuff"},
]
#print(json.dumps(messages,indent=4))

response=ollama_chat(messages)

print(response)
#print(type(response))
#show_response(response)    

{"model":"mixtral","created_at":"2024-01-15T16:56:36.450374Z","message":{"role":"assistant","content":" Dude, you're really going deep with that question! Like, you want to know the meaning of life, ya? Well, I reckon it's all about hanging ten on those gnarly waves of experience, feeling the rush of living in the present moment, and being stoked about the good vibes.\n\nYou see, brah, life ain't always about scoring the perfect barrel or landing that sick trick. Sometimes it's about wiping out, learning from our mistakes, and getting back up to ride again. It's about cherishing those moments with our bros and brahs, creating unforgettable memories, and sharing some aloha spirit.\n\nSo, I'd say the meaning of life is finding your own flow, staying true to yourself, and spreading good vibes all around. Just remember, when in doubt, paddle out! Catch you on the next swell, dude!"},"done":true,"total_duration":5683339500,"load_duration":854833,"prompt_eval_duration":476231000,"eval_count"

# Loading the JSON file with the SRT chunks



In [4]:
json_file_url="http://192.168.1.44:8001/1_1_Planteando_objetivos_politica_economica.json"



In [5]:
import requests
import json



try:
    response = requests.get(json_file_url)
    response.raise_for_status()  # Check for any HTTP errors

    json_data = response.text  # Get the JSON content from the response

    # Parse the JSON data into a Python object (for this file, a list of SRT chunk strings, despite the variable name)
    data_dict = json.loads(json_data)

    # Now you can work with the data_dict as needed
    print(type(data_dict))
    print(data_dict)

except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
except json.JSONDecodeError as e:
    print(f"Error decoding JSON: {e}")

['1\n00:00:00,000 --> 00:00:59,820\n¿Por qué la inflación y el pleno empleo están en conflicto? ¿Por qué? ¿Qué pensáis? ¿Por qué intentar controlar la inflación nos perjudica la plena ocupación? ¿Sería esto? Y al revés, ¿no? ¿Por qué favorecer la plena ocupación nos va a perjudicar la inflación? Esa es tu pregunta, ¿no? Venga, va, un minuto. A ver. Venga, va. ¿Quién nos dice algo? Vale. Arturo, te lo voy a pasar y os lo vais pasando, ¿vale? No os rompáis que te vale una pasta esto. Venga, Arturo. No, no, ya está, ya está. Porque si tenemos plena ocupación significa que, bueno, si estamos favoreciendo la plena ocupación, significa que no hay problema.', '2\n00:00:59,820 --> 00:01:32,880\nSignifica que estamos intentando que el mayor número de personas tengan trabajo y, por tanto, generen bienes y riquezas. Por tanto, estas bajarán de valor porque hay mucha más oferta de bienes y riquezas. Entonces, mucha más gente que te pueda hacer lo mismo, por tanto. Y, bueno, ¿qué hay que...? Habrá 