# ChatGPT Tutorial

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/08-chatgpt.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run on Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/blob/master/tutorials/08-chatgpt.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" /> View source on GitHub</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/raw/master/tutorials/08-chatgpt.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" /> Download notebook</a>
  </td>
</table><br><br>

## Connect to EvaDB

In [21]:
%pip install --quiet "evadb[document,notebook]"
import evadb
cursor = evadb.connect().cursor()

Note: you may need to restart the kernel to use updated packages.



## Download News Video and ChatGPT UDF 

In [22]:
# Download News Video
!wget -nc "https://www.dropbox.com/s/rfm1kds2mv77pca/russia_ukraine.mp4?dl=0" -O russia_ukraine.mp4

# Download ChatGPT UDF if needed
!wget -nc https://raw.githubusercontent.com/georgia-tech-db/eva/master/evadb/udfs/chatgpt.py -O chatgpt.py

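The `-nc` flag in the cell above tells `wget` to skip the download when the target file already exists. Outside a notebook, the same behaviour can be approximated in plain Python. The following is a minimal sketch, not part of EvaDB: the helper name `download_if_missing` and its `fetch` parameter are hypothetical, and only the standard library is assumed.

```python
import os
import urllib.request


def download_if_missing(url, dest, fetch=urllib.request.urlretrieve):
    """Download `url` to `dest` unless `dest` already exists (like `wget -nc`).

    Returns True if a download was attempted, False if the file was already
    present. `fetch` is injectable (hypothetical parameter) so the function
    can be exercised without network access.
    """
    if os.path.exists(dest):
        return False
    fetch(url, dest)
    return True
```

For example, `download_if_missing("https://raw.githubusercontent.com/georgia-tech-db/eva/master/evadb/udfs/chatgpt.py", "chatgpt.py")` would fetch the UDF file only on the first call.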

## Set your OpenAI API key here

In [23]:
# Read the OpenAI API key from the OPENAI_KEY environment variable
import os
# Uncomment the next line to set the key from inside the notebook
#os.environ['OPENAI_KEY'] = 'sk-....................'
open_ai_key = os.environ.get('OPENAI_KEY')
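`os.environ.get` returns `None` without complaint when the variable is unset, so a missing key would presumably only surface later, when the ChatGPT UDF is first called. A small guard can fail early with a clearer message. This is a sketch under assumptions: the helper name `require_openai_key` is hypothetical, and the variable name `OPENAI_KEY` is taken from the cell above.

```python
import os


def require_openai_key(env=os.environ, name="OPENAI_KEY"):
    """Return the API key stored under `name`, or raise with a clear message.

    `env` defaults to the process environment; any mapping can be passed
    instead, which keeps the check easy to try out.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before registering the ChatGPT UDF."
        )
    return key
```

Calling `open_ai_key = require_openai_key()` in place of the plain lookup stops the notebook at this cell if the key is missing.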

In [24]:
# Drop the UDF if it already exists
cursor.query("DROP UDF IF EXISTS ChatGPT;").df()

# Register the ChatGPT UDF in EvaDB
create_udf_query = """CREATE UDF ChatGPT
                      IMPL 'chatgpt.py' """
cursor.query(create_udf_query).df()


Unnamed: 0,0
0,UDF ChatGPT successfully added to the database.


## Run the ChatGPT UDF

![OPENAI UDF](chatgpt.png)

In [25]:
# Load the news video into the VIDEOS table
cursor.drop_table("VIDEOS", if_exists=True).df()
cursor.query("LOAD VIDEO 'russia_ukraine.mp4' INTO VIDEOS;").df()



Unnamed: 0,0
0,Number of loaded VIDEO: 1


In [26]:
# Drop the SpeechRecognizer UDF if it already exists
cursor.query("DROP UDF IF EXISTS SpeechRecognizer;").df()

# Create a speech recognition UDF backed by the Hugging Face Whisper model
speech_recognizer_udf_creation = """
        CREATE UDF SpeechRecognizer 
        TYPE HuggingFace 
        'task' 'automatic-speech-recognition' 
        'model' 'openai/whisper-base';
        """
cursor.query(speech_recognizer_udf_creation).df()



Unnamed: 0,0
0,UDF SpeechRecognizer successfully added to the...


In [27]:
# Drop the table if needed
cursor.query("DROP TABLE IF EXISTS TEXT_SUMMARY;").df()


# Create a materialized view holding the Whisper transcript of the video's audio
# (the view is named TEXT_SUMMARY, but it stores a transcript, not a summary)
transcription_query = """
    CREATE MATERIALIZED VIEW 
    TEXT_SUMMARY(text) AS 
    SELECT SpeechRecognizer(audio) FROM VIDEOS; 
    """
cursor.query(transcription_query).df()



In [29]:
# Ask ChatGPT a question about the transcript extracted by Whisper
chatgpt_udf = """
      SELECT ChatGPT('Is this video summary related to Ukraine russia war',text) 
      FROM TEXT_SUMMARY;
      """
cursor.query(chatgpt_udf).df()

Unnamed: 0,chatgpt.response
0,"Yes, the video summary is related to the Ukrai..."


## Check if it works on an SNL Video

In [30]:
# Download Entertainment Video
!wget -nc "https://www.dropbox.com/s/u66im8jw2s1dmuw/snl.mp4?dl=0" -O snl.mp4

cursor.query("DROP TABLE IF EXISTS SNL_VIDEO;").df()

cursor.query("LOAD VIDEO 'snl.mp4' INTO SNL_VIDEO;").df()




Unnamed: 0,0
0,Number of loaded VIDEO: 1


In [31]:
# Drop the table if needed
cursor.query("DROP TABLE IF EXISTS SNL_TEXT_SUMMARY;").df()


# Create a materialized view holding the Whisper transcript of the SNL video's audio
# (the view is named SNL_TEXT_SUMMARY, but it stores a transcript, not a summary)
transcription_query = """
    CREATE MATERIALIZED VIEW 
    SNL_TEXT_SUMMARY(text) AS 
    SELECT SpeechRecognizer(audio) FROM SNL_VIDEO;
    """
cursor.query(transcription_query).df()
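Both videos go through the same drop-and-create steps, differing only in the table and view names. When more videos are added, the query text could be generated rather than copied. The sketch below is one possible way to do that; the helper names are hypothetical, the SQL text mirrors the cells above, and the identifier check is an added precaution rather than anything EvaDB requires.

```python
import re

# Accept only plain SQL-style identifiers so names cannot smuggle in extra SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_identifier(name):
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def transcript_view_queries(video_table, view_name, udf="SpeechRecognizer"):
    """Return the DROP and CREATE statements used above for one table/view pair."""
    video_table = _check_identifier(video_table)
    view_name = _check_identifier(view_name)
    udf = _check_identifier(udf)
    drop = f"DROP TABLE IF EXISTS {view_name};"
    create = (
        f"CREATE MATERIALIZED VIEW {view_name}(text) AS "
        f"SELECT {udf}(audio) FROM {video_table};"
    )
    return [drop, create]


# Usage inside the notebook, where `cursor` already exists:
# for query in transcript_view_queries("SNL_VIDEO", "SNL_TEXT_SUMMARY"):
#     cursor.query(query).df()
```

Keeping the statements in one function means a change to the view definition only has to be made once.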



### ChatGPT: Is this video summary related to the Ukraine war?

In [32]:
# Ask ChatGPT a question about the SNL transcript extracted by Whisper
chatgpt_udf = """
      SELECT ChatGPT('Is this video summary related to Ukraine russia war',text) 
      FROM SNL_TEXT_SUMMARY;
      """
cursor.query(chatgpt_udf).df()

Unnamed: 0,chatgpt.response
0,"No, this video summary is not related to the U..."


### ChatGPT: Is this video summary related to a hospital?

In [33]:
# Ask ChatGPT a different question about the same SNL transcript
chatgpt_udf = """
      SELECT ChatGPT('Is this video summary related to a hospital',text) 
      FROM SNL_TEXT_SUMMARY;
      """
cursor.query(chatgpt_udf).df()

Unnamed: 0,chatgpt.response
0,"Yes, the video summary is related to a hospita..."
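The responses shown in this tutorial all open with "Yes" or "No", which suggests they could be turned into a boolean flag for filtering videos. The sketch below assumes that pattern holds; model replies are free-form text, so the hypothetical helper `is_affirmative` returns `None` when the reply does not start with either word instead of guessing.

```python
import re


def is_affirmative(response):
    """Map a reply to True (starts with yes), False (starts with no), or None."""
    match = re.match(r"\s*(yes|no)\b", response, flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"
```

Applied to the `chatgpt.response` column of the returned DataFrame, this would give a yes/no/unknown label per row; rows labelled `None` still need a manual look.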
