# Working with OpenAI GPT models

Just like any other API, you can send a request to the OpenAI server and get a response back for your query.

[<img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6UwyTHKO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/t637s1yazyyxfl31ymmq.jpg">](https://res.cloudinary.com/practicaldev/image/fetch/s--6UwyTHKO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/t637s1yazyyxfl31ymmq.jpg)


You need to:
* [Set up an account](https://auth0.openai.com/u/signup/identifier?state=hKFo2SBLZVEyMlJSRDNkbWVMUWVYdU5SVGZKQWltY016ek1POaFur3VuaXZlcnNhbC1sb2dpbqN0aWTZIEJxeTRsb191RnZySEV0b2dlYnRZdGNzQWpZdkRWZjI4o2NpZNkgRFJpdnNubTJNdTQyVDNLT3BxZHR3QjNOWXZpSFl6d0Q)
* [Get an API Key](https://platform.openai.com/api-keys)
* Add credit to your account (API calls are billed by usage)

In [None]:
!pip install --upgrade openai

In [17]:
from openai import OpenAI
from datasets import load_dataset

import random
import pandas as pd
from pprint import pprint

# I created a local config.py file to manage my secret keys
from config import API_KEY 
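
If you would rather not keep a `config.py` file, one common alternative is to read the key from an environment variable. The sketch below assumes the key is stored in `OPENAI_API_KEY`, which is also the variable the `openai` client looks for by default. The helper name `load_api_key` is hypothetical, not part of the library.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Hypothetical helper for illustration; not part of the openai package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With this in place, `API_KEY = load_api_key()` could stand in for the `config` import.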

In [2]:
client = OpenAI(api_key=API_KEY)

### Text classification with an LLM

In [3]:
# download and cache the dataset:
raw_datasets = load_dataset("imdb")

Found cached dataset imdb (/Users/Amiros/.cache/huggingface/datasets/imdb/plain_text/1.0.0/d613c88cf8fa3bab83b4ded3713f1f74830d1100e171db75bbddb80b3345c9c0)


  0%|          | 0/3 [00:00<?, ?it/s]

In [4]:
raw_datasets['train']['text'][1]

'"I Am Curious: Yellow" is a risible and pretentious steaming pile. It doesn\'t matter what one\'s political views are because this film can hardly be taken seriously on any level. As for the claim that frontal male nudity is an automatic NC-17, that isn\'t true. I\'ve seen R-rated films with male nudity. Granted, they only offer some fleeting views, but where are the R-rated films with gaping vulvas and flapping labia? Nowhere, because they don\'t exist. The same goes for those crappy cable shows: schlongs swinging in the breeze but not a clitoris in sight. And those pretentious indie movies like The Brown Bunny, in which we\'re treated to the site of Vincent Gallo\'s throbbing johnson, but not a trace of pink visible on Chloe Sevigny. Before crying (or implying) "double-standard" in matters of nudity, the mentally obtuse should take into account one unavoidably obvious anatomical difference between men and women: there are no genitals on display when actresses appears nude, and the s

In [5]:
raw_datasets['train']['label'][1]

0

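### Model parameters

`temperature` controls how random the output is: values near 0 give almost the same answer on every call, while higher values give more varied text. `max_tokens` caps the length of the generated reply.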

In [43]:
# model settings sent with every request
params = {
    "model_name": "gpt-3.5-turbo",
    "temperature": 0.1,
    "max_tokens": 256,
}

def classifier(input_text, parameters, client=client):
    """Ask the model to label a movie review and return its raw reply text."""
    # the system message sets the task; the review itself goes in the user message
    messages = [
        {"role": "system", "content": "You are a useful assistant for the IMDb website. Read the movie review submitted by a user below and decide whether it is positive or negative. Return 0 for negative or 1 for positive."},
        {"role": "user", "content": input_text},
    ]

    response = client.chat.completions.create(
        model=parameters["model_name"],
        messages=messages,
        temperature=parameters["temperature"],
        max_tokens=parameters["max_tokens"],
    )

    # the reply text of the first (and only) completion
    return response.choices[0].message.content

In [44]:
classifier(raw_datasets['train']['text'][1], params)

'0'

I can now run the same query on a sample of rows and collect the responses.

In [38]:
# select 20 random reviews and their labels
train = raw_datasets['train']
random_idx = random.sample(range(len(train)), 20)

# index by row first so the whole column is not loaded for every lookup
sel_text = [train[i]['text'] for i in random_idx]
sel_labels = [train[i]['label'] for i in random_idx]
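
Because the sample is random, the selected reviews (and the scores further down) change on every run. Here is a minimal sketch of a reproducible draw, assuming a fixed seed is acceptable. The helper `sample_indices` and the seed value are illustrative and not part of the original notebook.

```python
import random


def sample_indices(n_rows, k, seed=42):
    """Pick k distinct row indices from range(n_rows), reproducibly."""
    # a private generator leaves the global random state untouched
    rng = random.Random(seed)
    return rng.sample(range(n_rows), k)


# 25000 matches the upper bound used for the IMDb training split above
idx = sample_indices(25000, 20)
```

Calling `sample_indices` again with the same seed returns the same indices, which makes the evaluation repeatable.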

In [39]:
# turn into a DataFrame to make the data easier to inspect and manipulate
df = pd.DataFrame({'text': sel_text, 'label': sel_labels})

df.head()

|   | text | label |
|---|------|-------|
| 0 | This is my third comment here attempting to co... | 1 |
| 1 | This impossible tale is of a female witch purs... | 1 |
| 2 | Jimmy Dean could not have been more hammy or a... | 0 |
| 3 | A fine story about following your dreams and a... | 1 |
| 4 | "The 700 Club" has to be the single most bigot... | 0 |


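**Pro tip: partial functions**

`functools.partial` builds a new function from an existing one with some of its arguments fixed in advance. Here it fixes `parameters`, so the resulting function takes only the review text and can be passed straight to `apply`.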

In [40]:
from functools import partial

classifier_pd = partial(classifier, parameters=params)
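
As a stand-alone illustration of the same idea, here is a small sketch with a toy function. `power` and `square` are made-up names for the example.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


# fix exponent=2; the new function only needs `base`
square = partial(power, exponent=2)

print(square(5))                     # 25
print(list(map(square, [1, 2, 3])))  # [1, 4, 9]
```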


In [55]:
%%time
df['predicted'] = df['text'].apply(classifier_pd)

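You can't get away from data cleaning! The model does not always reply with a bare `0` or `1`; some replies contain the word "positive" or "negative" instead, so they need to be normalized before scoring.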

In [69]:
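# map worded replies to '1' / '0' (case-insensitive)
df['predicted'] = df['predicted'].apply(lambda x: '1' if "positive" in x.lower() else x)
df['predicted'] = df['predicted'].apply(lambda x: '0' if "negative" in x.lower() else x)

# convert both columns to numbers so they can be compared
df[["label", "predicted"]] = df[["label", "predicted"]].apply(pd.to_numeric)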
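
The two substitutions above cover the replies seen in this run. A slightly more defensive sketch, assuming replies may also carry stray whitespace, capital letters, or surrounding words, could look like this. `parse_label` is a hypothetical helper, not something the notebook defines.

```python
import re


def parse_label(reply):
    """Map a raw model reply to 0, 1, or None when it cannot be interpreted."""
    text = reply.strip().lower()
    # worded answers first; a reply naming both classes is treated as ambiguous
    if "negative" in text and "positive" not in text:
        return 0
    if "positive" in text and "negative" not in text:
        return 1
    # otherwise look for a stand-alone 0 or 1
    match = re.search(r"\b[01]\b", text)
    return int(match.group()) if match else None
```

It could be applied with `df['predicted'].apply(parse_label)`; rows that come back as `None` would need a manual look.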

In [67]:
from sklearn.metrics import classification_report

print(classification_report(df["label"], df["predicted"]))

              precision    recall  f1-score   support

           0       1.00      0.70      0.82        10
           1       0.77      1.00      0.87        10

    accuracy                           0.85        20
   macro avg       0.88      0.85      0.85        20
weighted avg       0.88      0.85      0.85        20
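
To see where these numbers come from: the report is consistent with all 10 positive reviews being predicted as 1 and 7 of the 10 negative reviews being predicted as 0. These counts are inferred from the report itself, not read from the notebook's raw predictions. A short sketch recomputing the metrics from them:

```python
# counts inferred from the classification report (1 = positive, 0 = negative)
tp = 10  # positive reviews predicted positive
fn = 0   # positive reviews predicted negative
fp = 3   # negative reviews predicted positive
tn = 7   # negative reviews predicted negative

# class 1 (positive)
precision_pos = tp / (tp + fp)
recall_pos = tp / (tp + fn)
f1_pos = 2 * precision_pos * recall_pos / (precision_pos + recall_pos)

# class 0 (negative): the roles of the counts are swapped
precision_neg = tn / (tn + fn)
recall_neg = tn / (tn + fp)
f1_neg = 2 * precision_neg * recall_neg / (precision_neg + recall_neg)

accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"class 0: precision={precision_neg:.2f} recall={recall_neg:.2f} f1={f1_neg:.2f}")
print(f"class 1: precision={precision_pos:.2f} recall={recall_pos:.2f} f1={f1_pos:.2f}")
print(f"accuracy={accuracy:.2f}")
```

In other words, the model never mislabeled a positive review in this sample, but it called three negative reviews positive.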

