# Extract motivational messages

I conceived of this project the same week that ChatGPT3 exploded into conversations at work, online, and at home. My original plan had been to practice my NLP skills, but I switched my goal to learning how to use this new tool.

1. Load transcripts from the output of the [previous notebook](https://github.com/ltran17/motivational_messages/blob/main/notebooks/02-transcripts.ipynb).
2. Interpret response from ChatGPT3
3. Play with the parameters of request
4. Summarize motivational talks for each video
5. Save the motivational messages


### 1. Load transcripts from the output of the previous notebook

In [1]:
import pandas as pd

* I do not push data to code repositories, and encourage you to do the same. To keep everything organized on my machine, I have a `data` folder in my project and include this folder in the project's `.gitignore` file.

In [2]:
filename = '../data/2023-01-effort-motivations.csv'

In [12]:
data = pd.read_csv(filename, parse_dates=['date'])
data.head()

| Unnamed: 0 | date | vid | motivation |
|---|---|---|---|
| 0 | 2023-01-02 | qLKflJjWDi4 | big exhale shift your hips up and forward and... |
| 1 | 2023-01-03 | NzIylrLhkJ4 | amazing job rest back on your glutes you did ... |
| 2 | 2023-01-04 | A5kHPgAi3I0 | break deep breath nice wide stance hinge forw... |
| 3 | 2023-01-06 | A67VMX9hlmc | give me two minutes here nice wide stance exh... |
| 4 | 2023-01-07 | lpv4RdNefyU | amazing work give me two minutes do not leave... |
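
* The `Unnamed: 0` column in the output above is typically what pandas produces when a CSV was saved along with its index. A minimal sketch, assuming that is the case here: passing `index_col=0` reads that column back as the index. The `csv_text` sample is a made-up stand-in, not the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV. The empty first header cell is
# what pandas would otherwise label 'Unnamed: 0'.
csv_text = (
    ",date,vid,motivation\n"
    "0,2023-01-02,video_a,first transcript\n"
    "1,2023-01-03,video_b,second transcript\n"
)

# index_col=0 uses the first column as the index instead of a data column
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=['date'])
print(list(sample.columns))
```

* Whether to do this is a matter of taste; the extra column is harmless for the rest of this notebook.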


### 2. Interpret response from ChatGPT3

* OpenAI's API offers several GPT-3 language models. It looks like [Davinci](https://platform.openai.com/docs/models/davinci) will be the right one for this task of interpreting and summarizing.

* Different models will give different results. The creators of an endless, AI-generated, Seinfeld parody [learned this the hard way](https://wapo.st/3I7BSHW) when they switched from Davinci to Curie.

* For this project, I am not worried about generating hateful or harmful content. I am taking note of [OpenAI's moderation tools](https://platform.openai.com/docs/guides/moderation/overview) for future work.

* The next three cells are setup.

In [13]:
with open('../config/chat_gpt3_api') as file:
    # strip() drops the trailing newline that readline() keeps
    CHAT_KEY = file.readline().strip()

* Even more important than not pushing data to a code repository is to **never push a key to a code repository**! I have stored the keys in a folder called `config` and added this folder to the project's `.gitignore` file. There are other ways to accomplish this; choose your preferred method.
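
* One of those other ways is an environment variable. The sketch below is only an illustration: `load_api_key` is a hypothetical helper, and the variable name `OPENAI_API_KEY` is an assumption you can change. It checks the environment first and falls back to the key file.

```python
import os


def load_api_key(path, env_var='OPENAI_API_KEY'):
    '''
    Return the key from env_var if it is set, otherwise the first line of
    the file at path. Surrounding whitespace is stripped either way.
    '''
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    with open(path) as file:
        return file.readline().strip()
```

* With this in place, the setup cell would become `CHAT_KEY = load_api_key('../config/chat_gpt3_api')`.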

In [14]:
import openai
openai.api_key = CHAT_KEY
models = openai.Model.list()

In [18]:
test_text = data.loc[0,'motivation']
test_text[:100]

' big exhale shift your hips up and forward and before you head out if this is your first workout or '

* The template for the function in the next cell comes from OpenAI. If you used the playground when you were setting up API access, you already encountered the `temperature` and `max_tokens` parameters.

* I interpret `temperature` as a creativity setting. Set to 0, multiple calls with the same text will return similar (if not identical) responses. Set to 1, the model gets very creative! 

* You can find no better explanation of OpenAI's tokens than the [documentation](https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them) from the source. If you want a more general discussion of tokens and tokenization, send "nlp tokens" through your favorite search engine to learn more.
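
* To make the `temperature` intuition concrete, here is a sketch of the general idea behind temperature sampling. I do not know how OpenAI implements it internally, and the scores below are made up. Dividing the scores by a small temperature concentrates the probability on the top candidate; a temperature of 1 leaves the spread as it is. A temperature of exactly 0 would divide by zero in this sketch, so read 0 as "always pick the top candidate".

```python
import math


def softmax_with_temperature(scores, temperature):
    '''
    Turn raw scores into probabilities after dividing by temperature.
    '''
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words
scores = [2.0, 1.0, 0.5]
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(scores, t)
    print(t, [round(p, 3) for p in probs])
```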

In [26]:
def get_response(text):
    prompt = 'extract the motivational message from the following text and summarize in three sentences: '
    response = openai.Completion.create(
        model='text-davinci-003',
        prompt=prompt+text,
        temperature=0.7,
        max_tokens=25
    )
    return response

In [28]:
response = get_response(test_text)
response

<OpenAIObject text_completion id=cmpl-6hfQV5TTlYKCopTdgWTXgb5zauFPo at 0x7fbf50bd6160> JSON: {
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "\n\nThis motivational message encourages applying effort not just when we're in the mood, but when we show up consistently. It"
    }
  ],
  "created": 1675865367,
  "id": "cmpl-6hfQV5TTlYKCopTdgWTXgb5zauFPo",
  "model": "text-davinci-003",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 25,
    "prompt_tokens": 464,
    "total_tokens": 489
  }
}

* For a first pass, this is a pretty good summary! Let me see how to extract it from the response.

In [32]:
response['choices'][0]['text']

"\n\nThis motivational message encourages applying effort not just when we're in the mood, but when we show up consistently. It"

* I notice that it cut off the summary -- no doubt because I limited the `max_tokens` to 25.

* I also notice that it really sounds like a summary! I would rather it sound like a motivational message. I need to adjust the prompt.
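
* The response itself records why the text stopped: the `finish_reason` above is `"length"`, which means the completion ran into `max_tokens`. A small sketch of checking for that, using a plain dict that mirrors the fields printed above; `was_truncated` is a hypothetical helper, not part of the OpenAI library.

```python
def was_truncated(response):
    '''
    Return True if the first choice stopped because it hit max_tokens.
    '''
    return response['choices'][0]['finish_reason'] == 'length'


# A plain dict standing in for the response object shown earlier
example = {
    'choices': [
        {
            'finish_reason': 'length',
            'text': "\n\nThis motivational message encourages applying effort "
                    "not just when we're in the mood, but when we show up "
                    "consistently. It",
        }
    ]
}
print(was_truncated(example))
```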

In [33]:
def get_response(text):
    prompt = 'extract the motivational message from the following text in three sentences: '
    response = openai.Completion.create(
        model='text-davinci-003',
        prompt=prompt+text,
        temperature=0.7,
        max_tokens=100
    )
    return response

In [34]:
response = get_response(test_text)
response['choices'][0]['text']

'\n\nEffort is an important part of achieving success and making progress towards your goals. Showing up and not giving up is the key to making progress and reaching your goals. Subscribe to the channel to help more people have access to high quality fitness and stay dedicated to putting in the effort every day.'

* This sounds better! However, the model is interpreting some of the boilerplate text as motivational. Adjust the prompt further.

In [42]:
def get_response(text):
    directions = 'extract the motivational message from the following text in three sentences: '
    # leading space keeps the conditions from running into the end of the transcript
    conditions = ' Ignore sentences that direct viewers to subscribe, like, buy, or comment.'
    response = openai.Completion.create(
        model='text-davinci-003',
        prompt=directions+text+conditions,
        temperature=0.7,
        max_tokens=100
    )
    return response
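
* The function above builds the prompt by plain string concatenation, so the transcript and the instructions can blur together. The sketch below is one possible alternative, not the prompt used for the outputs in this notebook: `build_prompt` and the `Text:` / `Message:` labels are hypothetical choices that put all instructions first and mark off the transcript explicitly.

```python
def build_prompt(text):
    '''
    Return a prompt with the instructions first and the transcript
    set off by blank lines and labels.
    '''
    directions = 'Extract the motivational message from the following text in three sentences.'
    conditions = 'Ignore sentences that direct viewers to subscribe, like, buy, or comment.'
    return f'{directions} {conditions}\n\nText:\n{text.strip()}\n\nMessage:'


print(build_prompt(' big exhale shift your hips up and forward '))
```

* I have not compared this layout against the original, so treat it as something to experiment with.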

In [36]:
response = get_response(test_text)
response['choices'][0]['text']

" Motivation comes from within, so remember to stay determined and dedicated to your goals. Show up and never give up, even when you don't feel motivated. Effort will show you that you can do more than you thought and help you reach your goals."

* Not bad! This sounds almost like Sydney herself. I am going to move on with the `get_response` function as defined above.

* Before I get the motivations from all of the videos, I'll write a couple of functions to streamline this process.

In [37]:
def extract_text(response):
    '''
    Return the text from the response object
    '''
    text = response['choices'][0]['text']
    return text

* Also, I will be polite by pausing between requests. 

In [38]:
import time

In [39]:
def get_motivation(text, sleep_time = 3):
    '''
    Return the extracted motivation from the text, pausing for sleep_time seconds after request.
    '''
    response = get_response(text)
    motivation = extract_text(response)
    time.sleep(sleep_time)
    return motivation

* Test this out...

In [40]:
get_motivation(test_text)

" Motivate yourself to stay dedicated, consistent, and determined with your lifestyle or goal. Put in the effort and don't give up - even when you're not feeling motivated or in the mood. Show up and keep growing together."

* Alright! I am ready to get the motivations for the month! 

* This next call will take a minute because of (1) the sleep parameter and (2) however busy the ChatGPT3 servers are. In fact, the first time I ran the next cell, it returned a rate limit error. That's what happens when the entire US is awake and pinging the servers.

In [43]:
motivations = data['motivation'].apply(get_motivation)
motivations[0]

RateLimitError: The server had an error while processing your request. Sorry about that!
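
* One common way to cope with an error like this is to wait and try again, doubling the wait each time. The sketch below is a generic illustration rather than anything provided by OpenAI: `call_with_retries` and its parameters are hypothetical names.

```python
import time


def call_with_retries(func, *args, retries=3, base_delay=2.0, retry_on=(Exception,)):
    '''
    Call func(*args). If it raises one of the retry_on exceptions, sleep
    and try again, doubling the delay after each failed attempt.
    '''
    for attempt in range(retries + 1):
        try:
            return func(*args)
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * 2 ** attempt)
```

* With it, the cell above could become `data['motivation'].apply(lambda text: call_with_retries(get_motivation, text))`. Passing a narrower `retry_on`, such as the library's rate limit exception class, would avoid retrying on unrelated errors; where that class lives depends on the installed version of `openai`.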