# GenAI Homework 1 Part 1 Vyacheslav Stepanyan

#### Make sure that you read your API key from a file and submit the API key file with your homework. Only the TA and I will have access to your API key to check your homework. I will ask that the API keys shared with us be deleted after the homework is checked.

**Task 1:** Create an AI assistant that answers questions such as "What's the temperature outside now?" or "What's the temperature in Tokyo now?" (the location can be any big city). Ask the user for their location if it is not in the question. Use the OpenAI APIs, the free [openweathermap](http://api.openweathermap.org/data/2.5/weather) API, and function calling.

* Step 1: Use the OpenAI Chat Completions API to get the user's location if it is not given. If it is given, use function calling to get the weather API parameter(s).
* Step 2: Call the weather API for the given location.
* Step 3: Call the Chat Completions API again to process the weather API response. Have it provide a short answer like this: "The temperature in Yerevan is -1.91 degrees Celsius".
* Step 4: Call the Chat Completions API again to translate the output of Step 3 into Armenian.
* Step 5: Use the OpenAI Text to Speech API to create an audio version (mp3) of the output of Step 4.
* Step 6: Use one of OpenAI's APIs to create an image based on the output of Step 3 (text to image).
* Step 7: Use one of OpenAI's APIs to extract text (if any) from the output of Step 6.
* Step 8: Create a Chainlit app that answers the questions mentioned at the beginning of the task and shows the outputs of Steps 3, 4, 5, 6, and 7.

Useful links:
* [Chainlit App Creation](https://docs.chainlit.io/get-started/pure-python)
* [Text, Image, Audio and Video response with Chainlit](https://docs.chainlit.io/api-reference/elements/text)

**Check a Chainlit app example at the end of this notebook.**

Good luck!

In [8]:
import json
import requests
import openai
from tenacity import retry, wait_random_exponential, stop_after_attempt
from termcolor import colored  
from openai import OpenAI

GPT_MODEL = "gpt-3.5-turbo"

In [9]:
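with open('api_key.txt', 'r') as f:
    # strip() drops the trailing newline that would otherwise end up in the Authorization header
    openai.api_key = f.read().strip()
api_key = openai.api_key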

In [10]:
@retry(wait=wait_random_exponential(multiplier=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, tools=None, tool_choice=None, model=GPT_MODEL):
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + openai.api_key,
    }
    json_data = {"model": model, "messages": messages}
    if tools is not None:
        json_data.update({"tools": tools})
    if tool_choice is not None:
        json_data.update({"tool_choice": tool_choice})
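    try:
        response = requests.post(
            "https://api.openai.com/v1/chat/completions",
            headers=headers,
            json=json_data,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        # Re-raise so that @retry sees the failure; returning the exception would
        # count as success and break callers that expect a Response object.
        raise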
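
Retry decorators such as the `@retry` used above act only on exceptions that propagate out of the wrapped function. The sketch below illustrates that idea with the standard library alone. It is not tenacity's implementation, and `retry_call`, `flaky`, and the delay values are hypothetical names chosen for this example.

```python
import random
import time


def retry_call(func, attempts=3, base_delay=1.0, max_delay=40.0, sleep=time.sleep):
    """Call func() up to `attempts` times, backing off between failures.

    Illustrative stand-in for a retry decorator; not tenacity's implementation.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the real error
            # Random exponential backoff: wait up to base_delay * 2**attempt, capped.
            delay = min(max_delay, random.uniform(0, base_delay * 2 ** attempt))
            sleep(delay)


calls = {"count": 0}


def flaky():
    """Hypothetical request that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = retry_call(flaky, sleep=lambda seconds: None)  # skip real sleeping in the demo
print(result, calls["count"])
```

If `flaky` swallowed its own exception and returned it, the loop would stop after the first call, because a returned value always counts as success.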

In [11]:
def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
        "function": "magenta",
    }
    
    for message in messages:
        if message["role"] == "system":
            print(colored(f"system: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "user":
            print(colored(f"user: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and message.get("function_call"):
            print(colored(f"assistant: {message['function_call']}\n", role_to_color[message["role"]]))
        elif message["role"] == "assistant" and not message.get("function_call"):
            print(colored(f"assistant: {message['content']}\n", role_to_color[message["role"]]))
        elif message["role"] == "function":
            print(colored(f"function ({message['name']}): {message['content']}\n", role_to_color[message["role"]]))

In [12]:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        }
    }
]
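
The schema above lists two required string fields and restricts `format` to an enum, but the model's reply is still free-form JSON text. A minimal sketch of checking a tool call's argument string against those constraints before using it could look like this; `validate_weather_args` and the sample payload are hypothetical and not part of the OpenAI SDK.

```python
import json

ALLOWED_FORMATS = ("celsius", "fahrenheit")  # mirrors the enum in the schema
REQUIRED_KEYS = ("location", "format")       # mirrors the "required" list


def validate_weather_args(arguments_json):
    """Parse a tool call's JSON argument string and check it against the schema."""
    args = json.loads(arguments_json)
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    if not isinstance(args["location"], str) or not args["location"].strip():
        raise ValueError("location must be a non-empty string")
    if args["format"] not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {args['format']!r}")
    return args["location"].strip(), args["format"]


# Hypothetical payload in the shape of a tool call's "arguments" string.
sample = '{"location": "Yerevan", "format": "celsius"}'
print(validate_weather_args(sample))
```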

## STEP 1

In [13]:
messages = []
message = input('Ask a question about the weather: ')
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})
messages.append({"role": "user", "content": message})
chat_response = chat_completion_request(
    messages, tools=tools, tool_choice = 'auto'
)
assistant_message = chat_response.json()["choices"][0]["message"]
messages.append(assistant_message)
while chat_response.json()["choices"][0]["finish_reason"] != 'tool_calls': #assistant_message.content != None:  other option 
    message = input(assistant_message["content"])
    messages.append({"role": "user", "content": message})
    chat_response = chat_completion_request(
        messages, tools=tools, tool_choice = 'auto'
    )
    assistant_message = chat_response.json()["choices"][0]["message"]
    messages.append(assistant_message)
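else:
    # The arguments arrive as a JSON string; parse it instead of splitting on quotes,
    # which breaks as soon as the key order or the quoting changes.
    args = json.loads(assistant_message['tool_calls'][0]['function']['arguments'])
    loc = args['location']
    unit = args['format']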

## STEP 2

In [14]:
url = "http://api.openweathermap.org/data/2.5/weather"
params = {
    "q": loc,
    "APPID": "3128f76a7551a274746e884fd29a0e8f"
}

response = requests.get(url, params=params)
info = {'location' : loc, 'temp' : response.json()['main']['temp'], 'unit' : unit}

In [15]:
info

{'location': 'Yerevan', 'temp': 278.24, 'unit': 'celsius'}
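
Indexing `response.json()['main']['temp']` directly raises a bare `KeyError` when the city is not found. The sketch below shows one way to fail with a clearer message. The helper name `extract_temperature` is hypothetical, and both payloads are illustrative samples trimmed to the fields read here, assuming error replies carry a `message` field instead of `main`.

```python
def extract_temperature(payload):
    """Return the Kelvin temperature from a weather payload or raise ValueError."""
    main = payload.get("main")
    if not isinstance(main, dict) or "temp" not in main:
        # Assumption: error replies carry a human-readable "message" instead of "main".
        reason = payload.get("message", "no temperature in response")
        raise ValueError(f"weather lookup failed: {reason}")
    return float(main["temp"])


# Illustrative payloads, trimmed to the fields this sketch reads.
ok_payload = {"name": "Yerevan", "main": {"temp": 278.24}, "cod": 200}
error_payload = {"cod": "404", "message": "city not found"}

print(extract_temperature(ok_payload))
try:
    extract_temperature(error_payload)
except ValueError as exc:
    print(exc)
```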

## STEP 3

In [16]:
messages = []
messages.append({"role": "system", "content": "Provide a short description about the weather. Adjust the temperature unit from Kelvin to the given unit, e.g. The temperature in Yerevan is -1.91 degrees Celsius"})
# Double quotes outside, single quotes inside: reusing the same quote type in an f-string needs Python 3.12+
messages.append({"role": "user", "content": f"{info['location']} , {info['temp']} Kelvin, {info['unit']}"})
chat_response = chat_completion_request(
    messages, tools=None, tool_choice = None
)
assistant_message = chat_response.json()["choices"][0]["message"]["content"]
messages.append({"role": "assistant", "content": assistant_message})  # the model's own reply belongs under the assistant role
print(assistant_message)

The temperature in Yerevan is 5.09 degrees Celsius.
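
The Kelvin conversion in this step is delegated to the model, so it is worth checking the arithmetic independently. The following is a small deterministic sketch, not the approach used in the notebook; `kelvin_to_unit` and `describe` are hypothetical helper names.

```python
def kelvin_to_unit(kelvin, unit):
    """Convert a Kelvin reading to 'celsius' or 'fahrenheit', rounded to 2 places."""
    celsius = kelvin - 273.15
    if unit == "celsius":
        return round(celsius, 2)
    if unit == "fahrenheit":
        return round(celsius * 9 / 5 + 32, 2)
    raise ValueError(f"unknown unit: {unit!r}")


def describe(location, kelvin, unit):
    """Build the short sentence that Step 3 asks for."""
    value = kelvin_to_unit(kelvin, unit)
    return f"The temperature in {location} is {value} degrees {unit.capitalize()}."


print(describe("Yerevan", 278.24, "celsius"))
# The temperature in Yerevan is 5.09 degrees Celsius.
```

For the reading returned in Step 2 (278.24 K), this agrees with the model's answer above.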


## STEP 4

In [17]:
messages.append({"role": "user", "content": "Translate your last response to Armenian."})
# gpt-4-turbo-preview is used here because gpt-3.5-turbo could not handle this task
chat_response = chat_completion_request(
    messages, tools=None, tool_choice=None, model='gpt-4-turbo-preview'
)
assistant_message = chat_response.json()["choices"][0]["message"]["content"]
messages.append({"role": "assistant", "content": assistant_message})
print(assistant_message)

Երևանի ջերմաստիճանը 5.09 աստիճան է Ցելսիուսով։


In [18]:
# The text-to-speech model cannot read digits aloud in Armenian, so ask for the numbers spelled out as words
messages.append({"role": "user", "content": "change the numbers of your last message to words. e.g. Երևանում ջերմաստիճանը մինուս մեկ ու կես ցելսիուս է։"})
# gpt-4-turbo-preview is used here because gpt-3.5-turbo could not handle this task
chat_response = chat_completion_request(
    messages, tools=None, tool_choice=None, model='gpt-4-turbo-preview'
)
assistant_message = chat_response.json()["choices"][0]["message"]["content"]
messages.append({"role": "assistant", "content": assistant_message})
print(assistant_message)

Երևանի ջերմաստիճանը հինգ կոտորակ աստիճան է Ցելսիուսով։


## STEP 5

In [19]:
api_key = openai.api_key
client = OpenAI(api_key=api_key)
response = client.audio.speech.create(
  model="tts-1-hd",
  voice="alloy",
  input=messages[-1]['content']
)

response.stream_to_file("եղանակ.mp3")



## STEP 6

In [20]:
response = client.images.generate(
  model="dall-e-3",
  prompt= messages[2]['content'],
  size="1024x1024",
  quality="hd",
  n=1,
)

image_url = response.data[0].url
image_url
img_data = requests.get(image_url).content
with open('weather_image.jpg', 'wb') as handler:
    handler.write(img_data)

## STEP 7

In [21]:
response = client.chat.completions.create(
  model="gpt-4-vision-preview",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Is there text in the image? If yes what is it? DONT DESCRIBE THE IMAGE. OUTPUT ONLY TEXT THAT THERE IS ON IT"},
        {
          "type": "image_url",
          "image_url": {
            "url": image_url,
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0].message.content)

4⁰
5.09
Chily
5.1⁰


## STEP 8

In [23]:
!chainlit run app.py

2024-02-11 23:28:17 - Your app is available at http://localhost:8000


ERROR:    [Errno 10048] error while attempting to bind on address ('0.0.0.0', 8000): only one usage of each socket address (protocol/network address/port) is normally permitted
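
The bind error above means that something is already listening on port 8000, most likely an earlier Chainlit run. As a sketch, the standard library can check the port and pick a free one. The helper names are hypothetical, and the printed command assumes the Chainlit CLI accepts a `--port` option.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for an unused port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = 8000 if port_is_free(8000) else pick_free_port()
# Assumption: the Chainlit CLI accepts --port to override the default of 8000.
print(f"chainlit run app.py --port {port}")
```

Stopping the earlier process frees port 8000 as well, which avoids changing the URL.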


**Task 2:** Use a Python library to download a short video from YouTube (e.g. https://github.com/pytube/pytube), transcribe the speech in the video with OpenAI's Whisper API, and use OpenAI's Moderation API to check the transcript. Print the transcribed text and the Moderation API response.

In [22]:
import os
import openai
import requests
from pytube import YouTube
from openai import OpenAI

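with open('api_key.txt', 'r') as f:
    # strip() drops the trailing newline that would otherwise end up in the Authorization header
    openai.api_key = f.read().strip()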

client = OpenAI(api_key=openai.api_key)
OPENAI_API_KEY = openai.api_key


def download_video(url, output_path):
    # Download a short video from YouTube
    yt = YouTube(url)
    ys = yt.streams.first()
    ys.download(output_path)

def transcribe_video(video_path):
    # Use OpenAI's Whisper API to transcribe the video
    # The context manager closes the file even if the API call raises
    with open(video_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="text",
        )

    return transcript

def moderation_check(text):
    # Use OpenAI's Moderation API to check the transcribed text
    url = "https://api.openai.com/v1/moderations"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f'Bearer {OPENAI_API_KEY}' 
    }
    data = {
        "input": text
    }

    response = requests.post(url, headers=headers, json=data)

    return response.json()

youtube_url = "https://youtu.be/jNQXAC9IVRw?si=hwO-jjLC9Vta1s7g" 
output_path = "./downloaded_video"

# Step 1: Download video from YouTube
download_video(youtube_url, output_path)

# Step 2: Transcribe the video using OpenAI Whisper API
transcribed_text = transcribe_video(output_path + "/" + os.listdir(output_path)[0])

# Step 3: Check the transcribed text using OpenAI Moderation API
moderation_response = moderation_check(transcribed_text)

# Step 4: Print the results
print(f"Transcribed Text:\n{transcribed_text}\n")
print(f"Moderation API Response:\n{moderation_response}\n")

Transcribed Text:
Alright, so here we are in front of the elephants. The cool thing about these guys is that they have really, really, really long trunks. And that's cool. And that's pretty much all there is to say.


Moderation API Response:
{'id': 'modr-8r9YdEKdal6N02Cpjuk5ZcHGkrAFU', 'model': 'text-moderation-007', 'results': [{'flagged': False, 'categories': {'sexual': False, 'hate': False, 'harassment': False, 'self-harm': False, 'sexual/minors': False, 'hate/threatening': False, 'violence/graphic': False, 'self-harm/intent': False, 'self-harm/instructions': False, 'harassment/threatening': False, 'violence': False}, 'category_scores': {'sexual': 0.0004069973365403712, 'hate': 0.0001445058878744021, 'harassment': 0.004312983714044094, 'self-harm': 6.655918127762561e-07, 'sexual/minors': 8.603204150858801e-06, 'hate/threatening': 4.2790756538124697e-07, 'violence/graphic': 0.0003304301353637129, 'self-harm/intent': 6.307342346190126e-07, 'self-harm/instructions': 3.646016921265982e
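
The raw moderation response is hard to scan. A short sketch of reducing it to the flag and the highest-scoring category follows; `summarize_moderation` is a hypothetical helper, and the sample dictionary is trimmed to a few of the scores printed above.

```python
def summarize_moderation(response):
    """Return (flagged, top_category, top_score) for the first moderation result."""
    result = response["results"][0]
    top_category, top_score = max(
        result["category_scores"].items(), key=lambda item: item[1]
    )
    return result["flagged"], top_category, top_score


# Trimmed sample using values from the response printed above.
sample_response = {
    "results": [
        {
            "flagged": False,
            "category_scores": {
                "sexual": 0.0004069973365403712,
                "hate": 0.0001445058878744021,
                "harassment": 0.004312983714044094,
                "violence/graphic": 0.0003304301353637129,
            },
        }
    ]
}

flagged, category, score = summarize_moderation(sample_response)
print(f"flagged={flagged}, highest score: {category} ({score:.4f})")
```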

# END