<a href="https://colab.research.google.com/github/hochschule-pforzheim/ml-examples/blob/main/chatgpt/example_chatpgt.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How to use ChatGPT in a notebook

Hi there! This simple notebook demonstrates how to

* call ChatGPT 👾 through the official OpenAI API,
* let ChatGPT work with sample content provided by Google Colaboratory (`california_housing_train.csv`), and
* train a scikit-learn model for a simple prediction.

Have fun using and extending it. 🥳

## Set up the OpenAI API

In [None]:
# install the required Python package (the %pip magic installs into the notebook's kernel)
%pip install --upgrade openai


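To use ChatGPT (or other OpenAI models) you need to provide an API key. Log in to OpenAI and create a personal key on this page: [API keys](https://platform.openai.com/account/api-keys). Then run the following cells to enter the key and pass it to the OpenAI client. You can either type the key in with `getpass` or read it from a Colab secret named `OPENAI_API_KEY`; whichever cell runs last determines the value of `secret`.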

In [None]:
# use getpass to provide API key
from getpass import getpass
secret = getpass('Enter the secret value: ')

In [None]:
from google.colab import userdata

# alternative to getpass: read the key from a Colab secret named OPENAI_API_KEY
secret = userdata.get('OPENAI_API_KEY')
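
Outside of Colab there is no secrets panel, so a common pattern is to look for the key in an environment variable first and only prompt when it is missing. The following is a minimal sketch of that idea; `load_api_key` and its `prompt_fn` parameter are hypothetical names introduced here, not part of the OpenAI library.

```python
import os
from getpass import getpass


def load_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the API key from the environment, else ask for it interactively.

    `load_api_key` and `prompt_fn` are hypothetical names used for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # nothing in the environment: fall back to a hidden prompt
        key = prompt_fn(f"Enter the value for {env_var}: ").strip()
    if not key:
        raise ValueError(f"No API key provided for {env_var}")
    return key


# usage in the notebook (commented out so the sketch does not prompt on its own):
# secret = load_api_key()
```

Passing the prompt function as a parameter keeps the helper easy to try out without typing a real key.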

In [None]:
import openai

openai.api_key = secret

In [None]:
from google.colab import userdata
from openai import OpenAI

# name of the Colab secret that is expected to hold the API key
openai_api_secret_name = 'OPENAI_API_KEY'

try:
  client = OpenAI(
    api_key=secret
  )
except userdata.SecretNotFoundError as e:
  print(f'''Secret not found\n\nThis expects you to create a secret named {openai_api_secret_name} in Colab\n\nVisit https://platform.openai.com/api-keys to create an API key\n\nStore that in the secrets section on the left side of the notebook (key icon)\n\nName the secret {openai_api_secret_name}''')
  raise e
except userdata.NotebookAccessError as e:
  print(f'''You need to grant this notebook access to the {openai_api_secret_name} secret in order for the notebook to access OpenAI on your behalf.''')
  raise e
except Exception as e:
  # unknown error
  print(f"There was an unknown error. Ensure you have a secret {openai_api_secret_name} stored in Colab and it's a valid key from https://platform.openai.com/api-keys")
  raise e

Now make a first call to the OpenAI API, as described in the official [documentation](https://platform.openai.com/docs/api-reference/models/list).

In [None]:
# Example input
prompt = "Once upon a time"
chatgpt_model = "gpt-3.5-turbo-0125"
messages = [{"role": "user", "content": prompt}]

# Request to the chat completions API
response = client.chat.completions.create(
    model=chatgpt_model,
    messages=messages,
    max_tokens=1024,
    n=1,
    stop=None,
    temperature=0.5,
)

# Print the content of the reply
print(response.choices[0].message.content)

## Setting up ChatGPT
So far so good... now let's connect with ChatGPT.

In [None]:
class ChatBot:
    def __init__(self, initial_prompt: str, model="gpt-3.5-turbo-0125"):
        self.initial_prompt = initial_prompt
        self.model = model
        # requests go through the global `client` created during setup

    def __call__(self, prompt: str):
        response = client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": self.initial_prompt},
                {"role": "user", "content": prompt}
            ]
        )
        return response.choices[0].message.content.strip()

In [None]:
from IPython.display import Markdown, display

def display_markdown_output(output):
    # Format the output string
    formatted_output = output.replace("\\n", "\n")
    display(Markdown(formatted_output))

## Call your Bot
Let's initialize our ChatGPT bot:

In [None]:
chat_gpt = ChatBot("You are a chatbot imitating ChatGPT.")
chat_gpt("Tell me about yourself")

Let's try to create a regression model with [scikit-learn](https://scikit-learn.org/stable/index.html):

In [None]:
display_markdown_output(chat_gpt("Please create a Python program to train a regression model with scikit-learn."))

In [None]:
# will it work? give it a try in this code cell
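
ChatGPT usually wraps generated code in fenced Markdown blocks, mixed with explanatory text. To copy only the code into a cell, the fenced parts can be pulled out of the reply with a regular expression. This is a rough sketch under the assumption that the reply uses standard triple-backtick fences; `extract_code_blocks` is a hypothetical helper, and the sample reply is made up for illustration.

```python
import re

# three backticks, built this way so the sketch can itself sit inside a fenced block
FENCE = "`" * 3

# matches a fenced block with an optional language tag and captures tag and body
CODE_BLOCK = re.compile(FENCE + r"[ \t]*([\w+-]*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(reply, language="python"):
    """Return the bodies of the fenced code blocks found in `reply`.

    Blocks without a language tag are kept; blocks tagged with a
    different language are skipped.
    """
    blocks = []
    for tag, body in CODE_BLOCK.findall(reply):
        if tag and tag.lower() != language:
            continue
        blocks.append(body.strip())
    return blocks


# made-up reply, standing in for the text returned by chat_gpt(...)
sample_reply = (
    "Here is the program:\n"
    + FENCE + "python\nprint('hello')\n" + FENCE + "\n"
    + "Run it with:\n"
    + FENCE + "bash\npython main.py\n" + FENCE + "\n"
)
for block in extract_code_blocks(sample_reply):
    print(block)
```

Generated code should still be read before it is run; the helper only saves the copy-and-paste step.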

Now we could also let ChatGPT create some test data...

In [None]:
chat_gpt("Please create a csv file as input to this code.")
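
Note that the `ChatBot` class above sends only the system prompt and the current prompt, so a follow-up such as "this code" reaches the model without the earlier answer. One way around this is to keep the message history and send it along with every request. The sketch below assumes this approach; `ChatSession`, `send` and `echo_send` are hypothetical names, and `echo_send` is an offline stand-in for the real API call.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Keeps the message history so follow-up prompts can refer to earlier replies."""

    def __init__(self, initial_prompt: str, send: Callable[[List[Message]], str]):
        self.messages: List[Message] = [{"role": "system", "content": initial_prompt}]
        self.send = send  # any function that maps a message list to a reply text

    def __call__(self, prompt: str) -> str:
        self.messages.append({"role": "user", "content": prompt})
        reply = self.send(list(self.messages))
        # store the reply so the next request contains it as context
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_send(messages: List[Message]) -> str:
    """Offline stand-in for the API call: reports how much context it received."""
    return f"I received {len(messages)} messages."


session = ChatSession("You are a chatbot imitating ChatGPT.", echo_send)
print(session("Please create a Python program to train a regression model."))
print(session("Please create a csv file as input to this code."))
```

In the notebook, `send` could wrap the existing client, for example `lambda messages: client.chat.completions.create(model="gpt-3.5-turbo-0125", messages=messages).choices[0].message.content`. Keep in mind that a growing history also increases the number of tokens sent with each request.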

But wouldn't it be more interesting to train a model on "real" data? Let's try with some data and see how our model performs. First, let ChatGPT create a short program that loads a sample file provided by [Google Colab](https://colab.research.google.com/). We will use [pandas](https://pandas.pydata.org/) to handle the data in the CSV file. Apart from a few comments (made by ChatGPT), your output may look like this:

```python
# Import the necessary libraries
import pandas as pd
from google.colab import files

# Load the data into a Pandas dataframe
df = pd.read_csv('sample_data/california_housing_train.csv')

# Print the first five rows of the data
print(df.head())
```

If the generated code contains an upload dialog (such as `uploaded = files.upload()`), you can remove it. You may also need to add `sample_data/` to the path in order to load the CSV file.

In [None]:
# let ChatGPT create our data handling code
chat_gpt("Please create a Python program to load the sample csv file california_housing_train.csv provided by Google Colab.")

In [None]:
# place your code (generated by ChatGPT) here


In [None]:
# Import the necessary libraries
import pandas as pd
from google.colab import files

# Upload the file from your computer to Colab
# uploaded = files.upload()

# Load the data into a Pandas dataframe
df = pd.read_csv('sample_data/california_housing_train.csv')

# Print the first five rows of the data
print(df.head())

Now let's train a model that uses our California housing dataframe.

In [None]:
chat_gpt("Please create a Python program to train a regression model with scikit-learn based on the california_housing_train.csv.")

In [None]:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Separate the features and target variable
X = df.drop('median_house_value', axis=1)
y = df['median_house_value']

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Create a Linear Regression model and fit it with the training data
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate the model on the testing data
score = model.score(X_test, y_test)

# Print the R^2 score of the model
print("R^2 Score:", score)
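
For a regressor, `model.score` returns the coefficient of determination R^2, that is, one minus the ratio of the residual sum of squares to the total sum of squares. To make that number less abstract, here is a small plain-Python sketch that computes R^2 alongside the root mean squared error (RMSE), which is expressed in the unit of the target. The function names and the toy values are invented for illustration and are not taken from the housing data.

```python
import math


def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse_manual(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# toy values, invented purely for illustration
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

print("R^2 :", r2_score_manual(y_true, y_pred))
print("RMSE:", rmse_manual(y_true, y_pred))
```

In the notebook, the same functions could be applied to `list(y_test)` and `list(model.predict(X_test))`; an RMSE in dollars is often easier to judge than R^2 alone.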


## Challenge Task

You did it! 🌴🎉🥳

Quite nice so far. But is that a good result? Ask yourself (and maybe ChatGPT 😃) how to further test and improve your model. Possible questions to think about:

1. How to analyze input data? (preprocessing and visualization)
1. How to test a machine learning model?
1. What is a test data set?
1. How to train other machine learning models?
1. Which inputs and outputs are possible for ChatGPT?

### Further reading

- Overview of [OpenAI models](https://platform.openai.com/docs/models/overview)
- Machine Learning Mastery blog: [How Do I Get Started?](https://machinelearningmastery.com/start-here/#process)
- Open-source library for machine learning: [scikit-learn](https://scikit-learn.org/stable/index.html)
- [GPT-4](https://openai.com/product/gpt-4)
