## Detailed Article Explanation

The detailed code explanation for this article is available at the following link:

https://www.daniweb.com/programming/computer-science/tutorials/541335/comparing-google-gemini-pro-with-openai-gpt-4-for-zero-shot-classification

For my other articles on Daniweb.com, please see this link:

https://www.daniweb.com/members/1235222/usmanmalik57

## Importing and Installing Required Libraries 

In [1]:
## pip install --upgrade google-cloud-aiplatform
## pip install openai

In [6]:
import os
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from openai import OpenAI

import vertexai
from vertexai.preview.generative_models import GenerativeModel, Part

## Importing the Dataset

In [11]:
dataset = pd.read_csv(r"D:\Datasets\IMDB Dataset.csv")
dataset = dataset.sample(frac=1).reset_index(drop=True)
dataset = dataset.head(100)
print(dataset['sentiment'].value_counts())
dataset.head()

sentiment
positive    50
negative    50
Name: count, dtype: int64


|   | review | sentiment |
|---|--------|-----------|
| 0 | This film is a very good movie.The way how the... | positive |
| 1 | This film is just as bad as "The Birdman of Al... | negative |
| 2 | Bromwell High is nothing short of brilliant. E... | positive |
| 3 | "Tourist Trap" is an odd thriller that came ou... | positive |
| 4 | There is no greater disservice to do to histor... | positive |


## Zero Shot Text Classification with Google Gemini Pro

In [12]:
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r"PATH_TO_VERTEX_AI_SERVICE_ACCOUNT JSON FILE"

In [15]:
model = GenerativeModel("gemini-pro")
config = {
    "max_output_tokens": 100,
    "temperature": 0,
}

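def find_sentiment_gemini(review):

    # Build the zero-shot prompt: task description followed by the review text.
    content = """What is the sentiment expressed in the following IMDB movie review?
    Select sentiment value from positive or negative. Return only the sentiment value in small letters.
    Movie review: {}""".format(review)

    responses = model.generate_content(
        content,
        generation_config=config,
        stream=True,
    )

    # With stream=True the reply arrives in chunks. The expected answer is a
    # single word, so the text of the first chunk is returned.
    for response in responses:
        return response.text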
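
The accuracy computed later compares the raw model reply with the dataset label by exact string match, so a reply such as `Positive` or `negative.` would count as wrong. The following is a minimal sketch of a post-processing step, not part of the original notebook. The helper name `normalize_label` and its fallback rule are assumptions made for illustration.

```python
def normalize_label(raw_text, allowed=("positive", "negative")):
    """Map a raw model reply to one of the allowed labels, or None.

    Hypothetical helper: strips whitespace and common trailing punctuation,
    lowercases the reply, and accepts it only if it is unambiguous.
    """
    if raw_text is None:
        return None
    cleaned = raw_text.strip().strip(".!\"'").lower()
    if cleaned in allowed:
        return cleaned
    # Fall back to a substring search, accepting only a single match.
    matches = [label for label in allowed if label in cleaned]
    return matches[0] if len(matches) == 1 else None


print(normalize_label(" Positive\n"))
print(normalize_label("positive or negative"))
```

One possible use is to wrap the value returned by `find_sentiment_gemini(review)` in `normalize_label(...)` before appending it to `all_sentiments`. Replies that cannot be mapped come back as `None` and can be inspected separately.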
    

In [16]:
%%time

all_sentiments = []

reviews_list = dataset["review"].tolist()

i = 0
exceptions = 0
while i < len(reviews_list):

    try:
        review = reviews_list[i]
        sentiment_value = find_sentiment_gemini(review)
        all_sentiments.append(sentiment_value)
        i = i + 1
        print(i, sentiment_value)

    except Exception as e:
        print("===================")
        print("Exception occurred", e)
        exceptions = exceptions + 1
        
print("Total exception count:", exceptions)

1 positive
2 negative
3 positive
4 positive
5 negative
6 negative
7 negative
8 negative
9 positive
10 positive
11 negative
12 negative
13 positive
14 negative
15 negative
16 negative
17 positive
18 positive
19 negative
20 positive
21 negative
22 negative
23 negative
24 negative
25 negative
26 positive
27 negative
28 negative
29 positive
30 positive
31 positive
32 positive
33 positive
34 positive
35 negative
36 positive
37 positive
38 negative
39 positive
40 negative
41 positive
42 negative
43 negative
44 negative
45 positive
46 negative
47 negative
48 negative
49 positive
50 negative
51 negative
52 positive
53 negative
54 negative
55 positive
56 negative
57 positive
58 negative
59 negative
60 positive
61 negative
62 negative
63 positive
64 positive
65 positive
66 negative
67 positive
68 positive
69 negative
70 negative
71 positive
72 negative
73 negative
74 positive
75 negative
76 positive
77 positive
78 positive
79 positive
80 negative
81 positive
82 negative
83 negative
84 negative
8
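
The loop in the cell above only advances `i` after a successful call, so a review that keeps failing would be retried without limit. A bounded retry with exponential backoff is one way to avoid that. The sketch below is an assumption-laden illustration, not the notebook's method: the helper name `call_with_retries` and its parameters (`max_attempts`, `base_delay`, `sleep`) are hypothetical.

```python
import time


def call_with_retries(func, arg, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(arg), retrying on any exception up to max_attempts times.

    Hypothetical helper: waits base_delay, then 2 * base_delay, and so on
    between attempts, and raises RuntimeError once all attempts have failed.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return func(arg)
        except Exception as e:
            last_error = e
            # Sleep only if another attempt will follow.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Under these assumptions, `call_with_retries(find_sentiment_gemini, review)` could replace the direct call inside a plain `for review in reviews_list:` loop. The `sleep` parameter exists only so the delay can be replaced in tests.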

In [17]:
# accuracy_score expects the true labels first, then the predictions.
accuracy = accuracy_score(dataset["sentiment"], all_sentiments)
print("Accuracy:", accuracy)

Accuracy: 0.93
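
A single accuracy figure does not show whether the errors fall mostly on positive or on negative reviews. The sketch below computes accuracy per class in plain Python. The function name `per_class_accuracy` and the toy labels are assumptions for illustration and are not results from this notebook.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return {label: fraction of that label's examples predicted correctly}."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Toy data, made up for the example.
toy_true = ["positive", "positive", "negative", "negative"]
toy_pred = ["positive", "negative", "negative", "negative"]
print(per_class_accuracy(toy_true, toy_pred))
```

With the notebook's variables, the call would presumably be `per_class_accuracy(dataset["sentiment"].tolist(), all_sentiments)`.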


## Zero Shot Text Classification with GPT-4

In [18]:
client = OpenAI(
    # The key is read from a custom environment variable (OPENAI_KEY2).
    # The client's default variable is OPENAI_API_KEY.
    api_key = os.environ.get('OPENAI_KEY2'),
)

In [20]:
def find_sentiment_gpt(review):

    content = """What is the sentiment expressed in the following IMDB movie review? 
    Select sentiment value from positive or negative. Return only the sentiment value in small letters.
    Movie review: {}""".format(review)

    sentiment = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        max_tokens=100,
        messages=[
            {"role": "user", "content": content}
        ]
    )
    
    return sentiment.choices[0].message.content
  

In [21]:
%%time

all_sentiments = []

reviews_list = dataset["review"].tolist()

i = 0
exceptions = 0
while i < len(reviews_list):

    try:
        review = reviews_list[i]
        sentiment_value = find_sentiment_gpt(review)
        all_sentiments.append(sentiment_value)
        i = i + 1
        print(i, sentiment_value)

    except Exception as e:
        print("===================")
        print("Exception occurred", e)
        exceptions = exceptions + 1
        
print("Total exception count:", exceptions)

1 positive
2 negative
3 positive
4 positive
5 negative
6 negative
7 negative
8 negative
9 positive
10 positive
11 negative
12 negative
13 positive
14 negative
15 negative
16 negative
17 positive
18 positive
19 negative
20 positive
21 negative
22 negative
23 negative
24 negative
25 negative
26 positive
27 negative
28 negative
29 positive
30 positive
31 positive
32 positive
33 positive
34 positive
35 negative
36 positive
37 positive
38 negative
39 positive
40 negative
41 positive
42 negative
43 negative
44 negative
45 positive
46 negative
47 positive
48 negative
49 positive
50 negative
51 negative
52 positive
53 negative
54 negative
55 positive
56 negative
57 positive
58 negative
59 negative
60 positive
61 negative
62 negative
63 positive
64 positive
65 positive
66 negative
67 positive
68 positive
69 negative
70 negative
71 positive
72 negative
73 positive
74 positive
75 negative
76 positive
77 positive
78 positive
79 positive
80 negative
81 positive
82 negative
83 negative
84 negative
8

In [22]:
# accuracy_score expects the true labels first, then the predictions.
accuracy = accuracy_score(dataset["sentiment"], all_sentiments)
print("Accuracy:", accuracy)

Accuracy: 0.95
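
The two accuracy scores alone do not show where the models disagree. The notebook reuses `all_sentiments` for both runs, so a side-by-side comparison would require keeping each model's predictions in its own list. The sketch below assumes two such lists exist. The function name `compare_predictions` and the toy data are hypothetical.

```python
def compare_predictions(preds_a, preds_b, y_true):
    """Summarise agreement between two models' predictions on the same data."""
    if not (len(preds_a) == len(preds_b) == len(y_true)):
        raise ValueError("all three lists must have the same length")
    if not y_true:
        raise ValueError("need at least one example")
    rows = list(zip(preds_a, preds_b, y_true))
    return {
        # Fraction of examples where both models gave the same label.
        "agreement": sum(a == b for a, b, _ in rows) / len(rows),
        "both_correct": sum(a == t and b == t for a, b, t in rows),
        "only_a_correct": sum(a == t and b != t for a, b, t in rows),
        "only_b_correct": sum(a != t and b == t for a, b, t in rows),
        "both_wrong": sum(a != t and b != t for a, b, t in rows),
    }


# Toy data, made up for the example.
toy_a = ["positive", "negative", "positive", "negative"]
toy_b = ["positive", "positive", "positive", "positive"]
toy_true = ["positive", "negative", "negative", "positive"]
print(compare_predictions(toy_a, toy_b, toy_true))
```

The reviews counted under `only_a_correct` and `only_b_correct` are the ones worth reading by hand, since they are where the two models actually differ.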
