In [179]:
%pip install python-dotenv

Note: you may need to restart the kernel to use updated packages.





In [181]:
import google.generativeai as genai
import json
import pandas as pd
import re
from sklearn.metrics import accuracy_score
import time
import os
from dotenv import load_dotenv

In [182]:
load_dotenv()
google_api_key=os.getenv("GOOGLE_API_KEY")
genai.configure(api_key=google_api_key)
model=genai.GenerativeModel("gemini-2.0-flash")
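
The configuration cell above passes whatever `os.getenv` returns straight to `genai.configure`, so a missing `.env` entry would only surface later as an authentication error. A minimal sketch of a fail-fast check follows; the helper name `require_env` is hypothetical and not part of the notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or blank."""
    value = os.getenv(name)
    if value is None or not value.strip():
        raise RuntimeError(f"{name} is not set; add it to .env or the environment")
    return value.strip()
```

It could be called right after `load_dotenv()`, for example `google_api_key = require_env("GOOGLE_API_KEY")`.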

In [183]:
# for m in genai.list_models():
#     print(m.name)

In [184]:
res=model.generate_content("what is AI")
print(res.text)

AI, or Artificial Intelligence, is a broad field of computer science focused on creating machines that can perform tasks that typically require human intelligence. Think of it as teaching computers to "think" and "learn" like humans do.

Here's a breakdown of key aspects:

**Core Concepts:**

*   **Mimicking Human Intelligence:** AI aims to replicate or simulate human cognitive functions such as learning, problem-solving, decision-making, perception, reasoning, and understanding natural language.
*   **Algorithms and Data:** AI systems rely heavily on algorithms (sets of rules or instructions) and vast amounts of data to learn and improve their performance.
*   **Learning and Adaptation:** AI systems are designed to learn from data and experiences, adapting their behavior over time without explicit programming for every situation.

**Types of AI:**

AI is often categorized based on its capabilities and functionality:

*   **Narrow or Weak AI (ANI):** Designed and trained for a specific

In [169]:
df=pd.read_csv("yelp.csv")
df.head()

Unnamed: 0,business_id,date,review_id,stars,text,type,user_id,cool,useful,funny
0,9yKzy9PApeiPPOUJEtnvkg,2011-01-26,fWKvX83p0-ka4JS3dc6E5A,5,My wife took me here on my birthday for breakf...,review,rLtl8ZkDX5vH5nAx9C3q5Q,2,5,0
1,ZRJwVLyzEJq1VAihDhYiow,2011-07-27,IjZ33sJrzXqU-0X6U8NwyA,5,I have no idea why some people give bad review...,review,0a2KyEL0d3Yb1V6aivbIuQ,0,0,0
2,6oRAC4uyJCsJl1X0WZpVSA,2012-06-14,IESLBzqUCLdSzSqm0eCSxQ,4,love the gyro plate. Rice is so good and I als...,review,0hT2KtfLiobPvh6cDC8JQg,0,1,0
3,_1QQZuf4zZOyFCvXc0o6Vg,2010-05-27,G-WvGaISbqqaMHlNnByodA,5,"Rosie, Dakota, and I LOVE Chaparral Dog Park!!...",review,uZetl9T0NcROGOyFfughhg,1,2,0
4,6ozycU1RpktNG2-1BroVtw,2012-01-05,1uJFq2r5QfJG_6ExMRCaGw,5,General Manager Scott Petello is a good egg!!!...,review,vYmM4KTsC8ZfQBg-j5MWkw,0,0,0


In [170]:
# keep only the stars and text columns, then draw a reproducible sample of 200 rows
df=df[["stars","text"]]
df_sample=df.sample(n=200,random_state=42)
df_sample.head()

Unnamed: 0,stars,text
6252,4,We got here around midnight last Friday... the...
4684,5,Brought a friend from Louisiana here. She say...
1731,3,"Every friday, my dad and I eat here. We order ..."
4742,1,"My husband and I were really, really disappoin..."
4521,5,Love this place! Was in phoenix 3 weeks for w...


In [171]:
# 1st prompt: basic instruction
def prompt1(review):
    return f"""
Analyze the following Yelp review and predict the star rating from 1 to 5.
Provide the output as a JSON object with keys: 
{{
"predicted_stars":(int) 
"explanation":(string)
 }}

Review:{review}
"""

In [172]:
# 2nd prompt: reasoning
def prompt2(review):
    return f"""
Review:{review}

First analyze the tone and content of this review.
Then provide your star rating prediction from 1 to 5 with brief reasoning.

Return ONLY valid JSON in this exact format:
{{
  "predicted_stars": (integer 1-5),
  "explanation": "your reasoning here"
}}
"""

In [173]:
# 3rd prompt: explicit rating criteria plus two worked examples
def prompt3(review):
    return f"""
You are an expert review rating model.
Predict the star rating for this Yelp review based on these criteria:
- 5 stars: Excellent, highly satisfied, strong positive language
- 4 stars: Good, satisfied, mostly positive
- 3 stars: Average, mixed feelings, neutral language
- 2 stars: Poor, disappointed, mostly negative
- 1 star: Terrible, very dissatisfied, strong negative language

Here are examples to guide you:
Example 1:
Review: "Amazing food, great service. Will come again!"
Output:
{{
  "predicted_stars": 5,
  "explanation": "Very positive review about food and service."
}}

Example 2:
Review: "The food was terrible and cold. Worst experience ever."
Output:
{{
  "predicted_stars": 1,
  "explanation": "Strong negative sentiment."
}}


Analyze the review below. FIRST, weigh the positive vs. negative aspects. THEN assign the star rating.
Return STRICTLY JSON in this format:
{{
  "predicted_stars": (integer 1-5),
  "explanation": "your reasoning here"
}}

Review:{review}
"""

In [174]:
# strip Markdown code fences from the LLM response
def clean_json(json_str):
    cleaned=re.sub(r"```json\n?|```", "",json_str).strip()
    return cleaned

# send a prompt to the LLM and parse the JSON reply
def predict_rating(review,prompt_fun):
    prompt=prompt_fun(review)
    try:
        response=model.generate_content(prompt)
        text_response=response.text

        # parse the reply; int() also accepts a rating returned as a string such as "4"
        cleaned_json=clean_json(text_response)
        data=json.loads(cleaned_json)
        return int(data["predicted_stars"]),data["explanation"],True    # True: the reply parsed as valid JSON
    except Exception as e:
        # any failure lands here, including API errors, not only malformed JSON
        return -1, str(e), False
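
`clean_json` strips code fences, but a reply may still wrap the JSON object in prose, return the rating as a string, or give a value outside 1 to 5. A minimal sketch of a stricter parser, assuming the reply contains a single JSON object; `parse_prediction` is a hypothetical helper, not part of the notebook.

```python
import json
import re


def parse_prediction(text_response: str):
    """Extract (stars, explanation) from a model reply, or raise ValueError."""
    # Take the span from the first "{" to the last "}" so fences or prose are ignored.
    match = re.search(r"\{.*\}", text_response, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict) or "predicted_stars" not in data:
        raise ValueError("missing predicted_stars key")
    try:
        stars = int(data["predicted_stars"])  # also accepts "4" as a string
    except (TypeError, ValueError) as exc:
        raise ValueError("predicted_stars is not an integer") from exc
    if not 1 <= stars <= 5:
        raise ValueError(f"predicted_stars out of range: {stars}")
    return stars, str(data.get("explanation", ""))
```

Raising `ValueError` only for parsing problems would let the caller count API failures separately from invalid JSON.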
    
    

In [175]:
# run
results=[]
print("starting prediction.it will take few min.")

# testing all 3 prompts on the first 15 rows of df_sample
for index,row in df_sample.head(15).iterrows():
    actual_stars=row["stars"]
    review_text=row["text"]

    # prompt1
    p1_star, p1_exp, p1_valid = predict_rating(review_text,prompt1)
    # prompt2
    p2_star, p2_exp, p2_valid = predict_rating(review_text,prompt2)
    # prompt3
    p3_star, p3_exp, p3_valid = predict_rating(review_text,prompt3)

    results.append({
        "index": index,
        "actual_stars": actual_stars,
        "p1_pred": p1_star,"p1_valid": p1_valid,
        "p2_pred": p2_star,"p2_valid": p2_valid,
        "p3_pred": p3_star,"p3_valid": p3_valid
    })
    time.sleep(12)  # each iteration sends 3 requests; 12 s per iteration stays within the 15-requests-per-minute limit
result_df=pd.DataFrame(results)
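
The loop above issues three requests per review. Assuming the limit of 15 requests per minute mentioned in the comment, one way to pace calls is to enforce a minimum gap of 60/15 = 4 seconds between individual requests rather than sleeping once per iteration. A sketch with a hypothetical `Throttle` class; the `clock` and `sleep` parameters are there only so it can be tested without waiting.

```python
import time


class Throttle:
    """Make successive wait() calls return at least `min_interval` seconds apart."""

    def __init__(self, max_per_minute: int, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first one

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

With `throttle = Throttle(15)` created once, calling `throttle.wait()` immediately before each `model.generate_content(prompt)` would space the requests regardless of how many prompts are tested per review.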


starting prediction.it will take few min.


In [176]:
result_df.head(20)

Unnamed: 0,index,actual_stars,p1_pred,p1_valid,p2_pred,p2_valid,p3_pred,p3_valid
0,6252,4,4,True,4,True,4,True
1,4684,5,5,True,5,True,5,True
2,1731,3,4,True,4,True,-1,False
3,4742,1,1,True,1,True,1,True
4,4521,5,5,True,5,True,5,True
5,6340,4,4,True,4,True,4,True
6,576,4,4,True,4,True,4,True
7,5202,4,5,True,5,True,-1,False
8,6363,5,5,True,5,True,5,True
9,439,1,1,True,-1,False,-1,False


In [177]:
def calculate_metrics(col_pred,col_valid):
    # accuracy is computed only over rows whose response was parsed successfully
    valid_rows=result_df[result_df[col_valid].astype(bool)]
    if len(valid_rows)>0:
        accuracy=accuracy_score(valid_rows['actual_stars'],valid_rows[col_pred])
    else:
        accuracy=0.0

    # share of all rows that produced a parseable response
    validity_rate = result_df[col_valid].mean()
    return accuracy, validity_rate

# Calculate for all 3
acc1, val1 = calculate_metrics('p1_pred', 'p1_valid')
acc2, val2 = calculate_metrics('p2_pred', 'p2_valid')
acc3, val3 = calculate_metrics('p3_pred', 'p3_valid')

# Comparison Table
comparison_table = pd.DataFrame({
    "Prompt": ["basic prompt", "reasoning based prompt", "criteria based and examples"],
    "Accuracy": [acc1, acc2, acc3],
    "JSON Validity": [val1, val2, val3]
})
print("\n--- Comparison Table ---")
print(comparison_table)

# save results to csv
result_df.to_csv("task1_results.csv", index=False)


--- Comparison Table ---
                        Prompt  Accuracy  JSON Validity
0                 basic prompt  0.818182       0.733333
1       reasoning based prompt  0.818182       0.733333
2  criteria based and examples  1.000000       0.600000
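
Accuracy in the table above is measured only on rows with a valid response, so a prompt that fails often can still score 1.0. Multiplying the two printed columns gives the share of all rows that were both parsed and correct: 0.818182 × 0.733333 ≈ 0.60 for the first two prompts and 1.0 × 0.6 = 0.60 for the third. A small sketch that computes this figure directly; `end_to_end_accuracy` is a hypothetical helper name and takes plain sequences or DataFrame columns.

```python
def end_to_end_accuracy(actual, predicted, valid) -> float:
    """Fraction of all rows that are both valid and correctly rated."""
    rows = list(zip(actual, predicted, valid))
    if not rows:
        return 0.0
    # An invalid response counts as a miss instead of being dropped.
    hits = sum(1 for a, p, v in rows if v and a == p)
    return hits / len(rows)
```

For the first prompt this would be called as `end_to_end_accuracy(result_df["actual_stars"], result_df["p1_pred"], result_df["p1_valid"])`. With only 15 reviews, differences between prompts on either metric should be read cautiously.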
