# GenAI Homework 1 Part 1

#### Make sure that you read your API key from a file and submit the API key file with your homework. Only the TA and I will have access to your API key to check your homework. I will ask for the API keys shared with us to be deleted after the homework is checked.

**Task 1:** Create an AI assistant that answers questions such as "What's the temperature outside now" or "What's the temperature in Tokyo now" (the location can be any big city). Ask the user for their location if it is not in the question. Use the OpenAI APIs, the [openweathermap](http://api.openweathermap.org/data/2.5/weather) API (it is free), and function calling.

* Step 1: Use the OpenAI Chat Completions API to get the location of the user if it is not given. If it is given, use function calling to get the weather API parameter(s).
* Step 2: Call the weather API for the given location.
* Step 3: Call the Chat Completions API again to process the response of the weather API. Make it provide a short answer like this: "The temperature in Yerevan is -1.91 degrees Celsius".
* Step 4: Call the Chat Completions API again to translate the output of Step 3 into Armenian.
* Step 5: Use the OpenAI Text to Speech API to create an audio version (mp3) of the output of Step 4.
* Step 6: Use one of OpenAI's APIs to create an image based on the output of Step 3 (text to image).
* Step 7: Use one of OpenAI's APIs to extract text (if any) from the output of Step 6.
* Step 8: Create a Chainlit app that answers the questions mentioned at the beginning of the task and displays the outputs of Steps 3, 4, 5, 6, and 7.

Useful links:
* [Chainlit App Creation](https://docs.chainlit.io/get-started/pure-python)
* [Text, Image, Audio and Video response with Chainlit](https://docs.chainlit.io/api-reference/elements/text)

**Check a Chainlit app example at the end of this notebook.**

Good luck!

In [1]:
from openai import OpenAI

# Strip whitespace so a trailing newline in the file does not end up in the key
with open('apikey.txt', 'r') as f:
    openai_api_key = f.read().strip()

client = OpenAI(api_key=openai_api_key)

### Step 1: Use the OpenAI Chat Completions API to get the location of the user if it is not given. If it is given, use function calling to get the weather API parameter(s)

In [2]:
import requests

def get_current_temperature(city_name, openweathermap_api_key):
    """Return the current temperature in Celsius for city_name, or None on failure."""
    base_url = "https://api.openweathermap.org/data/2.5/weather"
    params = {
        'q': city_name,
        'appid': openweathermap_api_key,
        'units': 'metric',  # metric units return Celsius
    }
    # A timeout keeps the call from hanging if the service does not respond
    response = requests.get(base_url, params=params, timeout=10)
    if response.status_code == 200:
        data = response.json()
        return data['main']['temp']
    return None
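
Separating the response parsing from the HTTP call makes it possible to check the parsing without network access. The following is a minimal sketch under that assumption: the helper name `parse_temperature` and the sample payloads are illustrative and not part of the original solution; only the `main` → `temp` field is taken from the code above.

```python
from typing import Optional


def parse_temperature(payload: dict) -> Optional[float]:
    """Return the temperature from a weather payload, or None if it is missing."""
    main = payload.get("main")
    if not isinstance(main, dict):
        return None
    temp = main.get("temp")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        return None
    return float(temp)


# Hypothetical payloads (values are made up for illustration).
sample_ok = {"main": {"temp": 9.09}, "name": "Yerevan"}
sample_error = {"cod": "404", "message": "city not found"}

print(parse_temperature(sample_ok))
print(parse_temperature(sample_error))
```

A helper like this could be called on `response.json()` so that an unexpected payload yields `None` instead of raising a `KeyError`.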

### Step 2: Call the weather API for the given location

In [3]:
def generate_temperature_response(city_name, client, temperature, openai_api_key):
    response = client.chat.completions.create(
        model="gpt-4",  
        messages=[
          {"role": "system", "content": "You are LLM assistant for providing weather."},
          {"role": "user", "content": f"Create a short sentence that tells the current temperature in {city_name}, which is {temperature:.2f} degrees Celsius"}
        ]
    ) 
    return response.choices[0].message.content

### Step 3: Call the Chat Completions API again to process the response of the weather API. Make it provide a short answer like this: "The temperature in Yerevan is -1.91 degrees Celsius"

In [4]:
import re

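def extract_city_name(user_query):
    """Return the word that follows a location preposition, or None if there is none."""
    # Drop punctuation so that "Yerevan?" becomes "Yerevan"
    cleaned_query = re.sub(r'[^\w\s]', '', user_query)

    # Only prepositions that introduce a place are used. 'today' and 'temperature'
    # are deliberately not keywords: the word after them (e.g. "outside") is not a city
    key_words = ['in', 'for', 'at']

    words = cleaned_query.split()

    for key in key_words:
        if key in words:
            key_index = words.index(key)
            if key_index + 1 < len(words):
                # Only a single word is returned, so "New York" yields "New"
                return words[key_index + 1]

    return None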
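
The task asks for function calling, whereas the helper above relies on a keyword heuristic that is brittle for multi-word city names. The following is a hedged sketch of how the city could be obtained through the Chat Completions `tools` parameter instead. The tool name mirrors `get_current_temperature`; the parameter name `city_name`, the system prompt wording, and the helper names are assumptions of this sketch, not the author's method.

```python
import json
from typing import Optional

# Tool definition in the Chat Completions "tools" format.
# The parameter name "city_name" is an assumption of this sketch.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_temperature",
        "description": "Get the current temperature in a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city_name": {
                    "type": "string",
                    "description": "City name, e.g. Tokyo",
                }
            },
            "required": ["city_name"],
        },
    },
}


def city_from_tool_arguments(arguments_json: str) -> Optional[str]:
    """Extract the city from the JSON string in tool_call.function.arguments."""
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError:
        return None
    if not isinstance(arguments, dict):
        return None
    city = arguments.get("city_name")
    if isinstance(city, str) and city.strip():
        return city.strip()
    return None


def extract_city_with_function_calling(user_query, client, model="gpt-4"):
    """Return the city chosen by the model, or None if it did not call the tool."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "Call the tool only if the user names a city; "
                           "otherwise ask which city they mean.",
            },
            {"role": "user", "content": user_query},
        ],
        tools=[WEATHER_TOOL],
        tool_choice="auto",
    )
    tool_calls = response.choices[0].message.tool_calls
    if not tool_calls:
        # No tool call: the model replied in text, e.g. to ask for the location.
        return None
    return city_from_tool_arguments(tool_calls[0].function.arguments)
```

When the function returns `None`, the assistant message content can be shown to the user as the follow-up question about their location.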

In [5]:
def process_temperature_query(user_query, client, openweathermap_api_key, openai_api_key):
    city_name = extract_city_name(user_query)
    if city_name:
        temperature = get_current_temperature(city_name, openweathermap_api_key)
        
        if temperature is not None:
            response_message = generate_temperature_response(city_name, client, temperature, openai_api_key)
            return response_message  
        else:
            return f"Could not retrieve the temperature for {city_name}."
    else:
        return "Please specify the city for which you'd like to know the temperature."

In [6]:
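# Example
user_query = "What's the temperature in Yerevan now?"
# Read the OpenWeatherMap key from a file instead of hardcoding it
# (the file name is an assumption; use the name of your own key file)
with open('openweathermap_apikey.txt', 'r') as f:
    openweathermap_api_key = f.read().strip()
english_response = process_temperature_query(user_query, client, openweathermap_api_key, openai_api_key)
english_response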

'The current temperature in Yeravan is 9.09 degrees Celsius.'

### Step 4: Call Chat Completions API again to translate the output of Step 3 into Armenian

In [7]:
def generate_response_in_armenian(english_response, client, openai_api_key):
    translation_prompt = f"Translate the following sentence into Armenian: '{english_response}'"
    response = client.chat.completions.create(
        model="gpt-4",  
        messages=[
            {"role": "system", "content": "You are a helpful assistant capable of translating English to Armenian."},
            {"role": "user", "content": translation_prompt},
        ],
    )
    return response.choices[0].message.content
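
The prompt above wraps the sentence in single quotes, and the model may echo those quotes back in its reply. The following is a small sketch of one way to avoid that: pass the sentence unquoted and strip any wrapping quotes from the answer. Both helper names and the system prompt wording are assumptions of this sketch.

```python
def build_translation_messages(english_text: str) -> list:
    """Build Chat Completions messages that ask for an Armenian translation only."""
    return [
        {
            "role": "system",
            "content": (
                "You translate English to Armenian. "
                "Reply with the translation only, without quotation marks."
            ),
        },
        {"role": "user", "content": english_text},
    ]


def strip_wrapping_quotes(text: str) -> str:
    """Remove surrounding whitespace and one layer of matching quotes."""
    cleaned = text.strip()
    if len(cleaned) >= 2 and cleaned[0] == cleaned[-1] and cleaned[0] in "'\"":
        cleaned = cleaned[1:-1].strip()
    return cleaned


print(strip_wrapping_quotes("'Hello there. '"))
```

The list returned by `build_translation_messages` can be passed as the `messages` argument of `client.chat.completions.create`, and `strip_wrapping_quotes` applied to the returned content before it is sent to text to speech.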

In [8]:
# Example
armenian_response = generate_response_in_armenian(english_response, client, openai_api_key)
armenian_response

"'Երևանի ընթացիկ ջերմաստիճանը կազմում է 9.09 աստիճան Ցելսիուս։ '"

### Step 5: Use OpenAI Text to Speech API to create an audio version (mp3) of the output of Step 4

In [9]:
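def convert_text_to_speech_and_save(input_text, client, output_file="output.mp3"):
    # Stream the audio to disk through the streaming-response context manager
    # (calling stream_to_file() on a non-streaming response is deprecated)
    with client.audio.speech.with_streaming_response.create(
        model="tts-1-hd",
        voice="alloy",
        input=input_text,
    ) as response:
        response.stream_to_file(output_file)

    print(f"Audio content successfully saved to '{output_file}'")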

In [10]:
# Example 
convert_text_to_speech_and_save(armenian_response, client, "Temperature_in_Armenian.mp3")

Audio content successfully saved to 'Temperature_in_Armenian.mp3'


### Step 6: Use one of OpenAI's APIs to create an image based on the output of Step 3 (text to image)

In [11]:
def generate_image_and_get_url(prompt, client, size="1024x1024", quality="hd", n=1):
    response = client.images.generate(
        model="dall-e-3",  
        prompt=prompt,     
        size=size,         
        quality=quality,   
        n=n                
    )

    image_url = response.data[0].url  
    return image_url

In [12]:
# Example 
image_url = generate_image_and_get_url(english_response, client)
image_url

'https://oaidalleapiprodscus.blob.core.windows.net/private/org-ej7auLzDqe4O6uOw5EHRgwBR/user-ZFHofgMljsoZMuNDhY1cl8h7/img-hIeOtyzsE2iQSzI0UplsZMIz.png?st=2024-03-11T08%3A49%3A07Z&se=2024-03-11T10%3A49%3A07Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2024-03-10T12%3A04%3A18Z&ske=2024-03-11T12%3A04%3A18Z&sks=b&skv=2021-08-06&sig=za%2BgIbFatTFTJdXROoUaM7Bp6u9/psyySmT8wgvOwI8%3D'
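
The returned URL is a signed link with an expiry time (the `se=` query parameter), so it may stop working later. The following is a hedged sketch that requests the image as base64 data and stores it locally instead. The helper names and the default file name `weather.png` are assumptions of this sketch.

```python
import base64
from pathlib import Path


def save_base64_image(b64_data: str, output_path: str) -> Path:
    """Decode base64 image data and write it to output_path."""
    path = Path(output_path)
    path.write_bytes(base64.b64decode(b64_data))
    return path


def generate_image_to_file(prompt: str, client, output_path: str = "weather.png") -> Path:
    """Generate one image and store it locally instead of returning a URL."""
    response = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size="1024x1024",
        n=1,
        response_format="b64_json",  # ask for image data rather than a URL
    )
    return save_base64_image(response.data[0].b64_json, output_path)
```

A local file also fits the Chainlit example at the end of the notebook, which loads its image with `cl.Image(path=...)`.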

### Step 7: Use one of OpenAI's APIs to extract text (if any) from the output of Step 6

In [13]:
def analyze_image_with_text(image_url, client, question="What’s in this image?", model="gpt-4-vision-preview", max_tokens=300):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": image_url,
                            "detail": "high"
                        },
                    },
                ],
            }
        ],
        max_tokens=max_tokens,
    )

    return response.choices[0].message.content

In [14]:
# Example 
response_content = analyze_image_with_text(image_url, client)
response_content

'The image features a cityscape with a prominent electronic billboard showing the temperature and weather for Yerevan, the capital city of Armenia. The temperature is displayed as 9.09 degrees Celsius along with a symbol indicating sunny or partly sunny weather. The architecture of the buildings suggests a blend of modern and historical styles, typical of Yerevan, which combines contemporary structures with traditional Armenian designs. The city is set against a backdrop of mountains, likely representing Mount Ararat, which is a significant symbol for Armenians and visible from many parts of Yerevan on clear days.'
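
Step 7 asks for text extraction, while the default question above requests a general description of the image. The following is a minimal sketch of a narrower request. The instruction wording, the `NO_TEXT` marker, and the helper name are assumptions of this sketch.

```python
def build_text_extraction_messages(image_url: str) -> list:
    """Build a vision request that asks only for the text visible in the image."""
    instruction = (
        "Transcribe all text visible in this image exactly as written. "
        "If the image contains no text, reply with NO_TEXT."
    )
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": instruction},
                {
                    "type": "image_url",
                    "image_url": {"url": image_url, "detail": "high"},
                },
            ],
        }
    ]


messages = build_text_extraction_messages("https://example.com/image.png")
print(messages[0]["content"][1]["image_url"]["url"])
```

The same instruction can also be passed through the existing `question` parameter of `analyze_image_with_text`, which leaves that function unchanged.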

**Task 2:** Use a Python library to download a short video from YouTube (e.g. https://github.com/pytube/pytube ), transcribe the text of the video with OpenAI's Whisper API, and use OpenAI's Moderation API to check it. Print the transcribed text and the Moderation API response.

### Use a python library to download a short video from YouTube (e.g. https://github.com/pytube/pytube )

In [16]:
from pytube import YouTube

# Download the highest-resolution progressive MP4 stream (video and audio in one file)
yt = YouTube('https://www.youtube.com/watch?v=c0m6yaGlZh4')
yt.streams \
  .filter(progressive=True, file_extension='mp4') \
  .order_by('resolution') \
  .desc() \
  .first() \
  .download()

'/Users/tigrangaplanyan/Downloads/Tigran_Gaplanyan_HW1/What is Artificial Intelligence  Quick Learner.mp4'

### Transcribe the text of the video with OpenAI's Whisper API

In [17]:
import whisper

model = whisper.load_model("base")
result = model.transcribe("What is Artificial Intelligence  Quick Learner.mp4")
print(result["text"])



 When you hear the term artificial intelligence, what comes to mind? Superpowered robots? Hyperintelligent devices? Science fiction has familiarized the world with the concept, but outside of Hollywood, what is artificial intelligence and what can AI actually do? For starters, AI involves using computers to do things that usually require human intelligence. Humans can see with their eyes and process what they see. We can understand our environments and move around within them. Our brains have the ability to see patterns, and we have the ability to understand spoken language. Artificial intelligence is a broad branch of computer science that includes many different terms. In order for computers to achieve these capabilities, they require lots and lots of data. Artificial intelligence sets allow AI algorithms to identify patterns, make predictions, and recommend actions. Artificial intelligence is already all around us, but today's best AI still can't compete with the human brain in some
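
The cell above runs the open-source `whisper` package locally, while the task refers to OpenAI's hosted Whisper API. The following is a hedged sketch of the hosted variant, assuming the `client` created at the top of the notebook and the `whisper-1` model. The helper name is an assumption of this sketch. The endpoint limits upload size, so a long video may need its audio track extracted first.

```python
def transcribe_with_openai(file_path: str, client, model: str = "whisper-1") -> str:
    """Send an audio or video file to the hosted transcription endpoint."""
    with open(file_path, "rb") as media_file:
        transcript = client.audio.transcriptions.create(model=model, file=media_file)
    # The default response format is an object with a .text attribute.
    return transcript.text
```

It could be called as `transcribe_with_openai("What is Artificial Intelligence  Quick Learner.mp4", client)`, using the file downloaded earlier.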

### Use OpenAI's Moderation API to check it

In [18]:
import requests

url = "https://api.openai.com/v1/moderations"
headers = {
    "Content-Type": "application/json",
    "Authorization": f'Bearer {openai_api_key}'  # key read from apikey.txt at the top of the notebook
}
data = {
    "input": result["text"]
}

response = requests.post(url, headers=headers, json=data)

response.json()

{'id': 'modr-91WZqoFhzYCuiwoBpx2cEOP55xN2S',
 'model': 'text-moderation-007',
 'results': [{'flagged': False,
   'categories': {'sexual': False,
    'hate': False,
    'harassment': False,
    'self-harm': False,
    'sexual/minors': False,
    'hate/threatening': False,
    'violence/graphic': False,
    'self-harm/intent': False,
    'self-harm/instructions': False,
    'harassment/threatening': False,
    'violence': False},
   'category_scores': {'sexual': 1.858225687101367e-06,
    'hate': 0.0003464908804744482,
    'harassment': 0.0012262610252946615,
    'self-harm': 9.81941905564554e-09,
    'sexual/minors': 3.648816360168894e-08,
    'hate/threatening': 6.078220593508377e-08,
    'violence/graphic': 3.0797647923463956e-05,
    'self-harm/intent': 3.2770035574003487e-08,
    'self-harm/instructions': 1.2896180123789236e-07,
    'harassment/threatening': 9.040778422786389e-06,
    'violence': 8.514392538927495e-05}}]}
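
The raw response is long, so a short summary can make it easier to read. The following is a minimal sketch that lists the flagged categories and the scores above a chosen threshold. The helper name and the threshold values are arbitrary assumptions, and the example dictionary is an abbreviated copy of the response printed above.

```python
def summarize_moderation(result: dict, threshold: float = 0.01) -> dict:
    """Summarize one entry of the 'results' list from a moderation response."""
    flagged_categories = sorted(
        name for name, is_flagged in result["categories"].items() if is_flagged
    )
    # Categories whose score is at or above the threshold, highest first.
    notable_scores = sorted(
        (
            (name, score)
            for name, score in result["category_scores"].items()
            if score >= threshold
        ),
        key=lambda item: item[1],
        reverse=True,
    )
    return {
        "flagged": result["flagged"],
        "flagged_categories": flagged_categories,
        "notable_scores": notable_scores,
    }


# Abbreviated copy of the response printed above.
example_result = {
    "flagged": False,
    "categories": {"hate": False, "harassment": False, "violence": False},
    "category_scores": {
        "hate": 0.0003464908804744482,
        "harassment": 0.0012262610252946615,
        "violence": 8.514392538927495e-05,
    },
}

print(summarize_moderation(example_result, threshold=0.001))
```

With the notebook's variables, the input would be `response.json()["results"][0]`. The OpenAI SDK also offers `client.moderations.create(input=...)` as an alternative to the raw HTTP request.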

# END

In [None]:
# Chainlit app example with text, image and audio
# Write below code in app.py and run it locally
import chainlit as cl
from openai import OpenAI

@cl.on_message
async def main(message: cl.Message):
    # Your custom logic goes here...
    image = cl.Image(path="files/nature.png", name="Nature", display="inline")
    audio = cl.Audio(name="output.mp3", path="files/output.mp3", display="inline")
    elements = [image, audio]


    # Read the API key from a file and create the client
    with open('apikey.txt', 'r') as f:
        OPENAI_API_KEY = f.read().strip()

    client = OpenAI(api_key=OPENAI_API_KEY)

    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        response_format={"type": "text"},
        messages=[
            {"role": "system", "content": "You are an LLM assistant that provides short answers."},
            {"role": "user", "content": message.content},
        ],
    )

    f_response = response.choices[0].message.content

    # Send a response back to the user
    await cl.Message(
        content=f"Received: {f_response}",
        elements=elements,
    ).send()