# Fine-tuning GPT-3.5 for LiveLongMotivate(LLM)!

Use OpenAI's GPT-3.5 model to create synthetic training data.

In [1]:
!pip install openai

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting openai
  Downloading openai-0.27.4-py3-none-any.whl (70 kB)
Collecting aiohttp
  Downloading aiohttp-3.8.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Collecting yarl<2.0,>=1.0
  Downloading yarl-1.8.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (264 kB)
Collecting multidict<7.0,>=4.5
  Downloading multidict-6.0.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)

In [2]:
import os

# Set the OPENAI_API_KEY environment variable.
# Paste your own key between the quotes before running the cells below.
os.environ["OPENAI_API_KEY"] = ""

In [3]:
import os

import openai
from IPython.display import display, Markdown


class Prompter:
    """Thin wrapper around the OpenAI chat completion endpoint."""

    def __init__(self, gpt_model):
        # Fail early if the key from the previous cell is missing or empty.
        if not os.environ.get("OPENAI_API_KEY"):
            raise Exception("Please set the OPENAI_API_KEY environment variable")

        openai.api_key = os.environ.get("OPENAI_API_KEY")

        self.gpt_model = gpt_model

    def prompt_model_return(self, messages: list):
        """Send the chat messages and return the text of the first choice."""
        response = openai.ChatCompletion.create(model=self.gpt_model, messages=messages)
        return response["choices"][0]["message"]["content"]

    def prompt_model_print(self, messages: list):
        """Send the chat messages and render the reply as Markdown in the notebook."""
        display(Markdown(self.prompt_model_return(messages)))

Pass either `"gpt-3.5-turbo"` or `"gpt-4"` as the model string when creating the `Prompter`.

In [4]:
prompter = Prompter("gpt-3.5-turbo")

In [6]:
datagen_prompts = [
    {"role" : "system", "content" : "You are an energetic health motivational coach."},
    {"role" : "user", "content" : "Please generate a Python list of health coaching goals and a short description of each goal. Goals should include diet, rest, exercise, environment, and sleep."},
]

In [7]:
prompter.prompt_model_print(datagen_prompts)

Sure, here are some health coaching goals that you can consider and customize based on your needs and preferences:

```
goals = [
    {
        'name': 'Balanced Diet',
        'description': 'Eating a variety of foods from each food group in appropriate portions to get essential nutrients and maintain a healthy weight.'
    },
    {
        'name': 'Hydration',
        'description': 'Drinking enough water to keep the body hydrated, improve digestion, increase energy, and optimize brain function.'
    },
    {
        'name': 'Physical Exercise',
        'description': 'Engaging in regular physical activity to improve cardiovascular health, build muscle, increase flexibility, and reduce stress levels.'
    },
    {
        'name': 'Rest and Recovery',
        'description': 'Taking breaks from work or study, engaging in relaxation techniques, and getting enough restful sleep to reduce stress, avoid burnout, and enhance productivity.'
    },
    {
        'name': 'Mindful Eating',
        'description': 'Eating slowly, chewing food thoroughly, and paying attention to hunger and fullness cues to prevent overeating, improve digestion, and enjoy food more fully.'
    },
    {
        'name': 'Reducing Sugar Intake',
        'description': 'Eating less sugar and processed foods to lower the risk of developing chronic diseases such as obesity, diabetes, and heart disease, and to boost energy levels.'
    },
    {
        'name': 'Improving Sleep Habits',
        'description': 'Establishing a consistent sleep schedule, creating a relaxing sleep environment, and avoiding caffeine and electronic use before bed to improve sleep quality, duration, and overall health.'
    },
    {
        'name': 'Reducing Stress Levels',
        'description': 'Incorporating stress-management techniques such as meditation, deep breathing, or yoga into daily routines to reduce stress and boost overall well-being.'
    },
    {
        'name': 'Creating a Healthy Environment',
        'description': 'Making changes to the physical and social environment, such as reducing exposure to toxins, increasing natural light, and building supportive relationships to enhance health and wellness.'
    },
    {
        'name': 'Building Healthy Habits',
        'description': 'Incorporating new healthy habits into daily routines, such as walking, meal prepping, or journaling, to improve overall health and create long-term positive changes.'
    }
]
```

These goals are not exhaustive and can be revised based on specific needs and goals.

Now that we have our initial data, let's create the health tips. First we copy the generated goals into a Python list, then we prompt the model once per goal.

In [8]:
goals_list = [
    {
        'name': 'Balanced Diet',
        'description': 'Eating a variety of foods from each food group in appropriate portions to get essential nutrients and maintain a healthy weight.'
    },
    {
        'name': 'Hydration',
        'description': 'Drinking enough water to keep the body hydrated, improve digestion, increase energy, and optimize brain function.'
    },
    {
        'name': 'Physical Exercise',
        'description': 'Engaging in regular physical activity to improve cardiovascular health, build muscle, increase flexibility, and reduce stress levels.'
    },
    {
        'name': 'Rest and Recovery',
        'description': 'Taking breaks from work or study, engaging in relaxation techniques, and getting enough restful sleep to reduce stress, avoid burnout, and enhance productivity.'
    },
    {
        'name': 'Mindful Eating',
        'description': 'Eating slowly, chewing food thoroughly, and paying attention to hunger and fullness cues to prevent overeating, improve digestion, and enjoy food more fully.'
    },
    {
        'name': 'Reducing Sugar Intake',
        'description': 'Eating less sugar and processed foods to lower the risk of developing chronic diseases such as obesity, diabetes, and heart disease, and to boost energy levels.'
    },
    {
        'name': 'Improving Sleep Habits',
        'description': 'Establishing a consistent sleep schedule, creating a relaxing sleep environment, and avoiding caffeine and electronic use before bed to improve sleep quality, duration, and overall health.'
    },
    {
        'name': 'Reducing Stress Levels',
        'description': 'Incorporating stress-management techniques such as meditation, deep breathing, or yoga into daily routines to reduce stress and boost overall well-being.'
    },
    {
        'name': 'Creating a Healthy Environment',
        'description': 'Making changes to the physical and social environment, such as reducing exposure to toxins, increasing natural light, and building supportive relationships to enhance health and wellness.'
    },
    {
        'name': 'Building Healthy Habits',
        'description': 'Incorporating new healthy habits into daily routines, such as walking, meal prepping, or journaling, to improve overall health and create long-term positive changes.'
    }
]
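
The list above was copied by hand from the model's reply. As a rough alternative, the reply text could be parsed directly. The sketch below assumes the reply contains one fenced block holding a list literal with an optional `name = ` prefix, as in the output shown earlier; `extract_goals` and `sample_reply` are hypothetical names, not part of the original notebook.

```python
import ast
import re

FENCE = "`" * 3  # built this way so the example can itself sit inside a fenced block


def extract_goals(reply: str) -> list:
    """Parse the list literal inside the first fenced block of a model reply."""
    pattern = re.escape(FENCE) + r"[a-zA-Z]*\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, reply, re.DOTALL)
    if match is None:
        raise ValueError("no fenced code block found in the reply")
    # Strip an optional leading "goals = " so only the literal remains.
    literal = re.sub(r"^\s*\w+\s*=\s*", "", match.group(1), count=1)
    # literal_eval only accepts Python literals, so model-written code is never executed.
    goals = ast.literal_eval(literal.strip())
    if not isinstance(goals, list):
        raise ValueError("expected a list of goals")
    for goal in goals:
        if not isinstance(goal, dict) or not {"name", "description"} <= goal.keys():
            raise ValueError(f"unexpected goal entry: {goal!r}")
    return goals


# Hypothetical, shortened reply in the same shape as the output above.
sample_reply = (
    "Sure, here are some goals:\n\n"
    + FENCE + "\n"
    + "goals = [\n"
    + "    {'name': 'Hydration', 'description': 'Drinking enough water.'},\n"
    + "]\n"
    + FENCE + "\n\nThese goals are not exhaustive."
)
parsed_goals = extract_goals(sample_reply)
print(parsed_goals)
```

With the real notebook, the input would be the string returned by `prompter.prompt_model_return(datagen_prompts)`. The model is not guaranteed to follow this format on every call, so the `ValueError` paths matter.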

In [9]:
system_prompt = {"role" : "system", "content" : "You are a health coach with a very upbeat attitude, like coach Ted Lasso. Your job is to create motivational healthy messages for your clients."}

In [10]:
coach = []
for item in goals_list:
    user_prompt = {"role" : "user", "content" : f"Tip: {item['name']} Description: {item['description']}"}
    coach.append(prompter.prompt_model_return([system_prompt, user_prompt]))
    print(coach[-1])


"Hey there champions! Remember, a balanced diet is your fuel for success! Fill your plate with a variety of foods in just the right amount from each food group to keep your body energized and healthy. Let's make every meal a winning one! Keep it up!"
"Stay hydrated and keep your body happy! Sip on water throughout the day and enjoy the benefits of improved digestion, increased energy, and optimal brain function. Remember, sometimes the simplest changes can make the biggest difference!"
Message: Get your body moving! Whether it's a brisk walk, a jog in the park, or a yoga class, make sure you take some time each day to engage in physical exercise. Not only will it help you feel better physically, but it will also reduce stress and boost your mood. So, put on your workout clothes, grab some water, and get ready to feel energized and alive! Your body will thank you.
Great work so far! Here's another motivational tip for your clients:

Tip: Drink Plenty of Water
Description: Staying hydrat
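
The replies printed above are not uniform: the first two are wrapped in double quotes, the third starts with a `Message:` label, and the fourth echoes the `Tip:`/`Description:` prompt format instead of giving a message. Below is a minimal clean-up sketch, assuming these are the only quirks worth handling. `clean_message` and `looks_off_topic` are hypothetical helpers, not part of the original notebook, and the off-topic check is only a heuristic.

```python
import re


def clean_message(raw: str) -> str:
    """Normalize one generated message: trim whitespace, labels, and wrapping quotes."""
    text = raw.strip()
    # Remove a leading "Message:" label, as seen in the third reply.
    text = re.sub(r"^message:\s*", "", text, flags=re.IGNORECASE)
    # Remove one pair of wrapping double quotes, as seen in the first two replies.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        text = text[1:-1].strip()
    return text


def looks_off_topic(text: str) -> bool:
    """Heuristic: flag replies that contain a line starting with "Tip:"."""
    return bool(re.search(r"^\s*Tip:", text, flags=re.MULTILINE))


# Shortened samples in the same shapes as the replies above.
samples = [
    '"Stay hydrated and keep your body happy!"',
    "Message: Get your body moving!",
    "Great work so far!\n\nTip: Drink Plenty of Water",
]
for sample in samples:
    cleaned = clean_message(sample)
    print(looks_off_topic(cleaned), repr(cleaned))
```

Flagged replies could be regenerated or dropped before the dataset is built; which of the two is better depends on how much data is needed.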

In [11]:
!pip install huggingface_hub

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting huggingface_hub
  Downloading huggingface_hub-0.13.4-py3-none-any.whl (200 kB)
Installing collected packages: huggingface_hub
Successfully installed huggingface_hub-0.13.4


Log in to Hugging Face.

In [12]:
!huggingface-cli login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|
    
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) n
Token is valid.
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [13]:
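def create_dataset(goals_list, coach_messages):
    # Attach each generated message to its goal under the "coach" key.
    # This updates the dicts in goals_list in place and returns the same list.
    for goal, message in zip(goals_list, coach_messages):
        goal["coach"] = message

    return goals_list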

In [14]:
dataset = create_dataset(goals_list, coach)
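
Note that `zip` stops at the shorter input, so a missing reply would silently drop a goal, and the call above also modifies `goals_list` itself. A possible variant, sketched here under the assumption that a length mismatch should be treated as an error, copies each record instead. `build_records` and the example data are hypothetical names used only for illustration.

```python
def build_records(goals: list, messages: list) -> list:
    """Return new dicts that pair each goal with its coach message."""
    if len(goals) != len(messages):
        raise ValueError(f"{len(goals)} goals but {len(messages)} messages")
    # {**goal, ...} copies each dict, so the input list is left unchanged.
    return [{**goal, "coach": message} for goal, message in zip(goals, messages)]


example_goals = [{"name": "Hydration", "description": "Drinking enough water."}]
example_messages = ["Stay hydrated and keep your body happy!"]
records = build_records(example_goals, example_messages)
print(records[0]["coach"])
print("coach" in example_goals[0])  # the original goal dict has no new key
```

Because the inputs stay untouched, the cell can be re-run with a fresh batch of messages without leftover `coach` values from an earlier run.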

In [15]:
!pip install datasets

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting datasets
  Downloading datasets-2.11.0-py3-none-any.whl (468 kB)
Collecting responses<0.19
  Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Collecting dill<0.3.7,>=0.3.0
  Downloading dill-0.3.6-py3-none-any.whl (110 kB)
Collecting multiprocess
  Downloading multiprocess-0.70.14-py39-none-any.whl (132 kB)
Collecting xxhash
  Downloading xxhash-3.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)

In [16]:
from datasets import Dataset  # load_dataset is not needed in this notebook
import pandas as pd

In [17]:
hf_large_dataset = Dataset.from_pandas(pd.DataFrame(data=dataset))

In [18]:
hf_large_dataset

Dataset({
    features: ['name', 'description', 'coach'],
    num_rows: 10
})

In [19]:
hf_username = "curtpond"

In [21]:
hf_large_dataset.push_to_hub(f"{hf_username}/live-long-motivate")

Pushing dataset shards to the dataset hub:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

Upload 1 LFS files:   0%|          | 0/1 [00:00<?, ?it/s]