# ChatGPT Tutorial

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/08-chatgpt.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run on Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/blob/master/tutorials/08-chatgpt.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" /> View source on GitHub</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/raw/master/tutorials/08-chatgpt.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" /> Download notebook</a>
  </td>
</table><br><br>

## Connect to EvaDB

In [1]:
%pip install --quiet "evadb[document,notebook]"
import evadb
cursor = evadb.connect().cursor()

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
grpcio-tools 1.56.0 requires protobuf<5.0dev,>=4.21.6, but you have protobuf 3.20.1 which is incompatible.
ray 2.4.0 requires grpcio<=1.51.3,>=1.42.0; python_version >= "3.10" and sys_platform != "darwin", but you have grpcio 1.56.0 which is incompatible.

Note: you may need to restart the kernel to use updated packages.


## Download News Video and ChatGPT UDF 

In [2]:
# Download News Video
!wget -nc "https://www.dropbox.com/s/rfm1kds2mv77pca/russia_ukraine.mp4?dl=0" -O russia_ukraine.mp4

# Download ChatGPT UDF if needed
!wget -nc https://raw.githubusercontent.com/georgia-tech-db/eva/master/evadb/udfs/chatgpt.py -O chatgpt.py

File ‘russia_ukraine.mp4’ already there; not retrieving.


File ‘chatgpt.py’ already there; not retrieving.
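
`wget -nc` skips the download when the target file already exists, which is why the output above reports "already there; not retrieving". Where `wget` is unavailable, the same skip-if-present behaviour can be sketched in Python. The helper name `download_if_missing` is hypothetical (it is not part of EvaDB), and whether a given host serves the raw file to Python's default user agent is an assumption you would need to verify.

```python
import os
import urllib.request


def download_if_missing(url: str, path: str) -> bool:
    """Download `url` to `path` unless `path` already exists.

    Returns True if a download happened, False if the target was
    already present. Hypothetical helper, not part of EvaDB.
    """
    if os.path.exists(path):
        # Mirror `wget -nc`: never overwrite an existing target.
        return False
    urllib.request.urlretrieve(url, path)
    return True
```

Calling it twice with the same `path` downloads at most once; the second call returns `False` without touching the network.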


## Set your OpenAI API key here

In [3]:
# Set your OpenAI key as an environment variable
import os
# Uncomment the next line and paste your key if OPENAI_KEY is not already set
# os.environ['OPENAI_KEY'] = 'sk-....................'
open_ai_key = os.environ.get('OPENAI_KEY')
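
`os.environ.get` returns `None` when the variable is unset, so a missing key would only surface later as a failed ChatGPT call. A minimal sketch of an earlier check follows. `require_env` is a hypothetical helper, and it assumes the key is read from `OPENAI_KEY`, as in the cell above.

```python
import os


def require_env(name: str, env=None) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or empty.
    `env` defaults to os.environ; it is a parameter only so the
    check can be exercised with a plain dict.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; define it before running the ChatGPT UDF")
    return value
```

In the notebook this would replace the plain lookup: `open_ai_key = require_env("OPENAI_KEY")`.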

In [4]:
# Drop the UDF if it already exists
cursor.query("DROP UDF IF EXISTS ChatGPT;").df()

# Register the ChatGPT UDF in EvaDB (implementation downloaded above as chatgpt.py)
create_udf_query = """CREATE UDF ChatGPT
                       IMPL 'chatgpt.py' """
cursor.query(create_udf_query).df()


|   | 0 |
|---|---|
| 0 | UDF ChatGPT successfully added to the database. |


## Run the ChatGPT UDF

![OPENAI UDF](chatgpt.png)

In [5]:
# Load the news video, dropping any existing VIDEOS table first
cursor.drop_table("VIDEOS", if_exists=True).df()
cursor.query("LOAD VIDEO 'russia_ukraine.mp4' INTO VIDEOS;").df()

|   | 0 |
|---|---|
| 0 | Number of loaded VIDEO: 1 |


In [6]:
# Drop the speech recognition UDF if needed
cursor.query("DROP UDF IF EXISTS SpeechRecognizer;").df()

# Create a speech recognition UDF using the Whisper model from Hugging Face
speech_recognizer_udf_creation = """
        CREATE UDF SpeechRecognizer 
        TYPE HuggingFace 
        'task' 'automatic-speech-recognition' 
        'model' 'openai/whisper-base';
        """
cursor.query(speech_recognizer_udf_creation).df()





|   | 0 |
|---|---|
| 0 | UDF SpeechRecognizer successfully added to the... |


In [7]:
# Drop the table if needed
cursor.query("DROP TABLE IF EXISTS TEXT_SUMMARY;").df()


# Create a materialized view of the SpeechRecognizer output (the transcript of the video's audio)
text_summarization_query = """
    CREATE MATERIALIZED VIEW 
    TEXT_SUMMARY(text) AS 
    SELECT SpeechRecognizer(audio) FROM VIDEOS; 
    """
cursor.query(text_summarization_query).df()



In [8]:
# Run ChatGPT over the transcript extracted by Whisper (stored in TEXT_SUMMARY)
chatgpt_udf = """
      SELECT ChatGPT('Is this video summary related to Ukraine russia war',text) 
      FROM TEXT_SUMMARY;
      """
cursor.query(chatgpt_udf).df()

|   | chatgpt.response |
|---|---|
| 0 | Yes, the video summary is related to the Ukrai... |


## Check if it works on an SNL Video

In [9]:
# Download Entertainment Video
!wget -nc "https://www.dropbox.com/s/u66im8jw2s1dmuw/snl.mp4?dl=0" -O snl.mp4

cursor.query("DROP TABLE IF EXISTS SNL_VIDEO;").df()

cursor.query("LOAD VIDEO 'snl.mp4' INTO SNL_VIDEO;").df()

File ‘snl.mp4’ already there; not retrieving.


|   | 0 |
|---|---|
| 0 | Number of loaded VIDEO: 1 |


In [10]:
# Drop the table if needed
cursor.query("DROP TABLE IF EXISTS SNL_TEXT_SUMMARY;").df()


# Create a materialized view of the SpeechRecognizer output (the transcript of the video's audio)
text_summarization_query = """
    CREATE MATERIALIZED VIEW 
    SNL_TEXT_SUMMARY(text) AS 
    SELECT SpeechRecognizer(audio) FROM SNL_VIDEO;
    """
cursor.query(text_summarization_query).df()
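
The view-creation query is now written out twice, differing only in the view and table names. One way to avoid the repetition is a small query builder, sketched below. `transcript_view_query` is a hypothetical helper rather than an EvaDB API, and it assumes the `SpeechRecognizer` UDF registered earlier.

```python
def transcript_view_query(view_name: str, table_name: str,
                          udf_name: str = "SpeechRecognizer") -> str:
    """Build the CREATE MATERIALIZED VIEW statement used in this tutorial.

    Names are checked with str.isidentifier() because they are
    interpolated directly into the query text.
    """
    for ident in (view_name, table_name, udf_name):
        if not ident.isidentifier():
            raise ValueError(f"not a plain identifier: {ident!r}")
    return (
        f"CREATE MATERIALIZED VIEW {view_name}(text) AS "
        f"SELECT {udf_name}(audio) FROM {table_name};"
    )
```

With the notebook's `cursor`, the cell above would become `cursor.query(transcript_view_query("SNL_TEXT_SUMMARY", "SNL_VIDEO")).df()`.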



### ChatGPT: Is this video summary related to the Ukraine war?

In [11]:
# Run ChatGPT over the transcript extracted by Whisper (stored in SNL_TEXT_SUMMARY)
chatgpt_udf = """
      SELECT ChatGPT('Is this video summary related to Ukraine russia war',text) 
      FROM SNL_TEXT_SUMMARY;
      """
cursor.query(chatgpt_udf).df()

|   | chatgpt.response |
|---|---|
| 0 | No, this video summary is not related to the U... |


### ChatGPT: Is this video summary related to a hospital?

In [12]:
# Run ChatGPT over the transcript extracted by Whisper (stored in SNL_TEXT_SUMMARY)
chatgpt_udf = """
      SELECT ChatGPT('Is this video summary related to a hospital',text) 
      FROM SNL_TEXT_SUMMARY;
      """
cursor.query(chatgpt_udf).df()

|   | chatgpt.response |
|---|---|
| 0 | Yes, the video summary is related to a hospita... |
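
The responses in this tutorial are free text that happens to begin with "Yes" or "No". To use them as labels, for example to filter videos by topic, one option is to reduce each response to a boolean. The sketch below does that. `is_affirmative` is a hypothetical helper, and it assumes the model opens its answer with "yes" or "no", which the prompts used here do not guarantee.

```python
from typing import Optional


def is_affirmative(response: str) -> Optional[bool]:
    """Map a free-text answer to True ("yes"), False ("no"), or None.

    Only the first word is inspected; surrounding punctuation and
    letter case are ignored. Anything else yields None so that
    unexpected answers are not silently treated as "no".
    """
    words = response.strip().split()
    if not words:
        return None
    first = words[0].strip(".,!:;\"'").lower()
    if first == "yes":
        return True
    if first == "no":
        return False
    return None
```

Applied to the query result, this could look like `cursor.query(chatgpt_udf).df()["chatgpt.response"].map(is_affirmative)`, using the column name shown in the outputs above.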
