## Detailed Article Explanation

The detailed code explanation for this article is available at the following link:

https://www.daniweb.com/programming/computer-science/tutorials/542001/openai-gpt-4o-vs-meta-llama-3-for-zero-shot-text-classifiation

For my other articles for Daniweb.com, please see this link:

https://www.daniweb.com/members/1235222/usmanmalik57

## Importing and Installing Required Libraries

In [9]:
!pip install openai
!pip install groq
!pip install pandas
!pip install scikit-learn


In [2]:
import os
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

from openai import OpenAI
from groq import Groq


## Importing and Preprocessing the Dataset

In [3]:
dataset = pd.read_csv(r"/home/mani/Datasets/Tweets.csv")
print(dataset.shape)
dataset.head()

(14640, 15)


| | tweet_id | airline_sentiment | airline_sentiment_confidence | negativereason | negativereason_confidence | airline | airline_sentiment_gold | name | negativereason_gold | retweet_count | text | tweet_coord | tweet_created | tweet_location | user_timezone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 570306133677760513 | neutral | 1.0 | | | Virgin America | | cairdin | | 0 | @VirginAmerica What @dhepburn said. | | 2015-02-24 11:35:52 -0800 | | Eastern Time (US & Canada) |
| 1 | 570301130888122368 | positive | 0.3486 | | 0.0 | Virgin America | | jnardino | | 0 | @VirginAmerica plus you've added commercials t... | | 2015-02-24 11:15:59 -0800 | | Pacific Time (US & Canada) |
| 2 | 570301083672813571 | neutral | 0.6837 | | | Virgin America | | yvonnalynn | | 0 | @VirginAmerica I didn't today... Must mean I n... | | 2015-02-24 11:15:48 -0800 | Lets Play | Central Time (US & Canada) |
| 3 | 570301031407624196 | negative | 1.0 | Bad Flight | 0.7033 | Virgin America | | jnardino | | 0 | @VirginAmerica it's really aggressive to blast... | | 2015-02-24 11:15:36 -0800 | | Pacific Time (US & Canada) |
| 4 | 570300817074462722 | negative | 1.0 | Can't Tell | 1.0 | Virgin America | | jnardino | | 0 | @VirginAmerica and it's a really big bad thing... | | 2015-02-24 11:14:45 -0800 | | Pacific Time (US & Canada) |


In [4]:
def preprocess_data(dataset):

    # Remove rows where 'airline_sentiment' or 'text' are NaN
    dataset = dataset.dropna(subset=['airline_sentiment', 'text'])

    # Remove rows where 'airline_sentiment' or 'text' are empty strings
    dataset = dataset[(dataset['airline_sentiment'].str.strip() != '') & (dataset['text'].str.strip() != '')]

    # Filter the DataFrame for each sentiment
    neutral_df = dataset[dataset['airline_sentiment'] == 'neutral']
    positive_df = dataset[dataset['airline_sentiment'] == 'positive']
    negative_df = dataset[dataset['airline_sentiment'] == 'negative']

    # Randomly sample records from each sentiment
    neutral_sample = neutral_df.sample(n=34)
    positive_sample = positive_df.sample(n=33)
    negative_sample = negative_df.sample(n=33)

    # Concatenate the samples into one DataFrame
    dataset = pd.concat([neutral_sample, positive_sample, negative_sample])

    # Reset index if needed
    dataset.reset_index(drop=True, inplace=True)

    return dataset


dataset = preprocess_data(dataset)
# print value counts
print(dataset["airline_sentiment"].value_counts())

airline_sentiment
neutral     34
positive    33
negative    33
Name: count, dtype: int64


## Zero Shot Text Classification with GPT-4o

In [15]:
client = OpenAI(
    api_key = os.environ.get('OPENAI_API_KEY'),
)


def find_sentiment_gpt(tweet):

    content = """What is the sentiment expressed in the following tweet about an airline?
    Select sentiment value from positive, negative, or neutral. Return only the sentiment value in small letters.
    tweet: {}""".format(tweet)

    sentiment = client.chat.completions.create(
      model= "gpt-4o",
      temperature = 0,
      max_tokens = 10,
      messages=[
            {"role": "user", "content": content}
        ]
    )

    return sentiment.choices[0].message.content
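
Because accuracy is later computed by exact string comparison, a reply such as `Negative.` or one with stray whitespace would count as a miss. The following is a minimal sketch of a cleanup step, assuming the three labels named in the prompt; `normalize_label`, `VALID_LABELS`, and the `"unknown"` fallback are hypothetical names introduced here, not part of the original notebook.

```python
# Labels the prompt asks the model to choose from
VALID_LABELS = {"positive", "negative", "neutral"}


def normalize_label(raw, fallback="unknown"):
    """Map a raw model reply to one of the three labels, or to `fallback`."""
    if raw is None:
        return fallback
    # Trim whitespace, lowercase, then drop surrounding punctuation or quotes
    cleaned = raw.strip().lower().strip(".!\"'` ")
    return cleaned if cleaned in VALID_LABELS else fallback


print(normalize_label("Negative."))
print(normalize_label(" neutral\n"))
print(normalize_label("I think it is positive"))
```

One possible use is to wrap the return value, for example `normalize_label(sentiment.choices[0].message.content)`, so that unexpected replies become visible as `unknown` rather than silently lowering accuracy.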



In [18]:
%%time

all_sentiments = []

tweets_list = dataset["text"].tolist()

i = 0
exceptions = 0
while i < len(tweets_list):

    try:
        tweet = tweets_list[i]
        sentiment_value = find_sentiment_gpt(tweet)
        all_sentiments.append(sentiment_value)
        i = i + 1
        print(i, sentiment_value)

    except Exception as e:
        print("===================")
        print("Exception occurred", e)
        exceptions = exceptions + 1

print("Total exception count:", exceptions)

1 negative
2 neutral
3 negative
4 neutral
5 neutral
6 neutral
7 neutral
8 negative
9 neutral
10 negative
11 neutral
12 neutral
13 neutral
14 negative
15 neutral
16 neutral
17 neutral
18 neutral
19 neutral
20 neutral
21 negative
22 neutral
23 neutral
24 neutral
25 neutral
26 neutral
27 positive
28 neutral
29 negative
30 neutral
31 neutral
32 negative
33 neutral
34 neutral
35 positive
36 neutral
37 neutral
38 positive
39 positive
40 positive
41 positive
42 positive
43 positive
44 neutral
45 neutral
46 positive
47 positive
48 positive
49 positive
50 neutral
51 positive
52 neutral
53 positive
54 positive
55 neutral
56 positive
57 positive
58 positive
59 positive
60 neutral
61 positive
62 neutral
63 negative
64 positive
65 positive
66 positive
67 positive
68 negative
69 negative
70 negative
71 negative
72 negative
73 negative
74 negative
75 negative
76 negative
77 negative
78 negative
79 negative
80 negative
81 negative
82 negative
83 negative
84 neutral
85 negative
86 negative
87 negative
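
The loop above only advances `i` after a successful call, so a request that keeps failing would be retried without limit and without any pause. One possible alternative is a bounded retry with exponential backoff. This is a sketch under assumptions: `call_with_retries`, `max_attempts`, and `base_delay` are hypothetical names, and `flaky_classifier` is a stand-in for a real API call.

```python
import time


def call_with_retries(func, arg, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(arg), retrying on any exception up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(arg)
        except Exception as e:
            print(f"Attempt {attempt} failed: {e}")
            if attempt == max_attempts:
                # Give up and let the caller decide what to do
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Stand-in for an API call (hypothetical): fails twice, then succeeds
calls = {"count": 0}


def flaky_classifier(tweet):
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "neutral"


# The sleep function is replaced with a no-op so the demo runs instantly
result = call_with_retries(flaky_classifier, "some tweet", sleep=lambda seconds: None)
print(result)
```

In the notebook's loop, the call could then read `call_with_retries(find_sentiment_gpt, tweet)`, with the surrounding `try`/`except` deciding whether to skip a tweet that still fails after the last attempt.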


In [19]:
accuracy = accuracy_score(dataset["airline_sentiment"], all_sentiments)
print("Accuracy:", accuracy)

Accuracy: 0.78
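
A single accuracy figure does not show which classes get confused with each other. The sketch below computes a per-class breakdown using only the standard library; the helper names are hypothetical and the labels are toy values for illustration, not the results reported above. scikit-learn's `confusion_matrix` and `classification_report` provide a comparable breakdown.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return {label: fraction of that label's examples predicted correctly}."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


# Toy labels for illustration only
y_true = ["neutral", "neutral", "positive", "positive", "negative", "negative"]
y_pred = ["neutral", "negative", "positive", "neutral", "negative", "negative"]

print(per_class_accuracy(y_true, y_pred))
print(confusion_counts(y_true, y_pred)[("positive", "neutral")])
```

With the notebook's data, `y_true` would be `dataset["airline_sentiment"].tolist()` and `y_pred` would be `all_sentiments`.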


## Zero Shot Text Classification with Groq Llama-3

In [6]:
client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)



def find_sentiment_llama3(tweet):

    content = """What is the sentiment expressed in the following tweet about an airline?
    Select sentiment value from positive, negative, or neutral. Return only the sentiment value in small letters.
    tweet: {}""".format(tweet)

    sentiment = client.chat.completions.create(
      model="llama3-70b-8192",
      temperature = 0,
      max_tokens = 10,
      messages=[
            {"role": "user", "content": content}
        ]
    )

    return sentiment.choices[0].message.content
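
Both classifier functions build the same prompt, and the indented triple-quoted string sends the leading spaces of its continuation lines to the model. A shared helper keeps the two prompts identical and removes that indentation. This is only a sketch: `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, and removing the indentation changes the prompt slightly, so results may differ from those shown in this notebook.

```python
import textwrap

# Same wording as the prompt used above, with the indentation stripped
PROMPT_TEMPLATE = textwrap.dedent("""\
    What is the sentiment expressed in the following tweet about an airline?
    Select sentiment value from positive, negative, or neutral. Return only the sentiment value in small letters.
    tweet: {tweet}""")


def build_prompt(tweet):
    """Fill the shared template with one tweet."""
    return PROMPT_TEMPLATE.format(tweet=tweet)


print(build_prompt("@VirginAmerica What @dhepburn said."))
```

Each function could then pass `build_prompt(tweet)` as the `content` of the user message.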



In [8]:
%%time

all_sentiments = []

tweets_list = dataset["text"].tolist()

i = 0
exceptions = 0
while i < len(tweets_list):

    try:
        tweet = tweets_list[i]
        sentiment_value = find_sentiment_llama3(tweet)
        all_sentiments.append(sentiment_value)
        i = i + 1
        print(i, sentiment_value)

    except Exception as e:
        print("===================")
        print("Exception occurred", e)
        exceptions = exceptions + 1

print("Total exception count:", exceptions)

1 negative
2 neutral
3 neutral
4 neutral
5 neutral
6 neutral
7 neutral
8 neutral
9 negative
10 positive
11 negative
12 neutral
13 negative
14 neutral
15 negative
16 neutral
17 negative
18 negative
19 negative
20 neutral
21 positive
22 neutral
23 negative
24 negative
25 neutral
26 negative
27 neutral
28 negative
29 positive
30 neutral
31 neutral
32 positive
33 neutral
34 negative
35 positive
36 positive
37 positive
38 positive
39 positive
40 negative
41 positive
42 positive
43 positive
44 positive
45 negative
46 positive
47 positive
48 positive
49 positive
50 positive
51 positive
52 positive
53 positive
54 positive
55 positive
56 positive
57 neutral
58 positive
59 positive
60 positive
61 positive
62 positive
63 positive
64 positive
65 positive
66 positive
67 positive
68 negative
69 negative
70 negative
71 negative
72 negative
73 negative
74 negative
75 negative
76 negative
77 negative
78 negative
79 negative
80 negative
81 negative
82 negative
83 negative
84 negative
85 negative
86 nega

In [9]:
accuracy = accuracy_score(dataset["airline_sentiment"], all_sentiments)
print("Accuracy:", accuracy)

Accuracy: 0.78
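
Both runs store their predictions in the same `all_sentiments` list, so whichever loop runs second overwrites the other's results. Keeping the two lists under separate names would allow a tweet-by-tweet comparison of the models. The sketch below assumes hypothetical lists `gpt_preds` and `llama_preds` and uses toy values, not the notebook's outputs.

```python
def agreement_rate(preds_a, preds_b):
    """Fraction of positions where the two prediction lists agree."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        return 0.0
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)


def disagreements(texts, preds_a, preds_b):
    """Return (text, label_a, label_b) for every item the models label differently."""
    return [(t, a, b) for t, a, b in zip(texts, preds_a, preds_b) if a != b]


# Toy values for illustration only
tweets = ["tweet one", "tweet two", "tweet three", "tweet four"]
gpt_preds = ["neutral", "positive", "negative", "neutral"]
llama_preds = ["neutral", "positive", "negative", "negative"]

print(agreement_rate(gpt_preds, llama_preds))
print(disagreements(tweets, gpt_preds, llama_preds))
```

Inspecting the disagreements by hand is one way to see where two models with the same overall accuracy still behave differently.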
