In [9]:
import os
import time
from google import genai
from dotenv import load_dotenv

# Load API key from .env file
load_dotenv()
api_key_gemini = os.getenv('API_KEY_GEMINI')
client = genai.Client(api_key=api_key_gemini)

In [7]:
import pandas as pd


df = pd.read_csv("small_cleaned_data_amazon.csv")

In [8]:
result = df[df['product_title'] == 'PlayStation Headsets'].apply(
    lambda row: f"{row['customer_id']}: {row['review_headline']} {row['review_body']}",
    axis=1
).tolist()

final_string = '\n'.join(result)
data = final_string.replace("<br />", "")
print(data)

35574474: perfect, thank Its, perfect, thank you
37861475: Excellent A little spendy but the sound quality is amazing and it works perfect out of the box on PS4 or a PC (Windows 10) with Skype.
37987371: these are definitely more comfortable but flawed nonetheless I had the Sony Pulse before these and broke apart within a couple of months, these are definitely more comfortable but flawed nonetheless. Both sets sound great in Game, but poorly when streaming video or listening to music (except when you plug them to good sounding mobile devices from Apple and Sony where they just sound amazing). Sony should add a firmware to let the consoles do the sound processing instead of the earphones and have their design team test their durability before sending them into production.
50549864: pretty good, but wait until on sale... Sound quality is okay but not great (lacks low bass that I like), build quality a bit fragile (can't see hinges or plastic frame lasting long), and has a LOT of drop out

In [19]:
question = """The following text contains reviews of an Amazon product. \n
     Each row is a new review, which begins with the customer ID followed by the review text. \n
     What improvements should be made to the product, based on all reviews? \n
     How many reviews are there in total?
     """
response2 = client.models.generate_content(model="gemini-2.0-flash", contents=
    f"""{question}\n
     {data}"""
)

print(response2.text)

# Log the response text, not the repr of the response object
log_response = f"{question}\n{response2.text}\n-------------------\n\n"
with open('logs.txt', 'a') as file:
    file.write(log_response)

Okay, here's a breakdown of the Amazon product reviews you provided:

**1. Potential Product Improvements:**

Based on the aggregated reviews, here are the most common areas for potential improvement, ordered by frequency of mention:

*   **Durability & Build Quality:** A significant number of reviews mention the plastic hinges breaking, especially after a few months of use. The folding mechanism is seen as a weak point. The plastic frame feels cheap and brittle to many users.
*   **Microphone Quality:** Many reviewers reported issues with the microphone.  Common complaints included:
    *   Low volume or being difficult to hear.
    *   Muffled or "boxed-in" sound.
    *   Picking up excessive ambient noise.
*   **Comfort (Subjective):**  While many find the headset comfortable, some report discomfort after extended use, especially if they wear glasses. Ear cups can get hot and sweaty. Smaller heads are mentioned for fitting the head set well.
*   **Bass Response:** Some reviewers fin

In [20]:
question = """The following text lists a user number followed by that user's review. Each user number is numeric and is separated from the review by a colon. Each new review starts on the next row, which again begins with a number. \n
   How many reviews are there in total? \n How did you arrive at that number?
     """
response2 = client.models.generate_content(model="gemini-2.0-flash", contents=
    f"""{question}\n
     {data}"""
)

print(response2.text)

log_res = response2.text

log_response = f"{question}\n{log_res}\n-------------------\n\n"
with open('logs.txt', 'a') as file:
    file.write(log_response)

Based on the provided text, there are a total of **120** reviews.

I arrived at this number by counting each line of text where a "Usernumber:" preceeds the review and counted each of those. Each user number begins with a number and thus its easy to count the number of rows which have a reviews.
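
The model's explanation gives no way to verify the figure of 120. Because every review in `data` sits on its own line and starts with a numeric customer ID and a colon, the count can be checked locally instead. The following is a minimal sketch under that layout assumption; `count_reviews` and `sample` are illustrative names that do not appear in the notebook.

```python
import re

def count_reviews(text: str) -> int:
    """Count lines that start with a numeric customer ID followed by a colon."""
    return len(re.findall(r"^\d+:", text, flags=re.MULTILINE))

# Three lines in the same "ID: review" layout as `data`
sample = (
    "35574474: perfect, thank Its, perfect, thank you\n"
    "37861475: Excellent A little spendy but the sound quality is amazing\n"
    "50549864: pretty good, but wait until on sale..."
)
print(count_reviews(sample))
```

In the notebook this would be called as `count_reviews(data)`. A review body that itself contains a line break followed by digits and a colon would be overcounted, so `len(df[df['product_title'] == 'PlayStation Headsets'])` is the safer reference value.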



In [None]:
# The plain-text input clearly isn't working for counting the reviews, although the summary looks good

import json


result = df[df['product_title'] == 'PlayStation Headsets'].apply(
    lambda row: {
        "customer_id": row['customer_id'],
        "review_headline": row['review_headline'],
        "review_body": row['review_body']
    },
    axis=1
).tolist()

# Write the result to a JSON file (the analysis cells read 'reviews_playstation.json')
with open('reviews_playstation.json', 'w') as json_file:
    json.dump(result, json_file, indent=4)



In [30]:
def analyze_json_reviews(json_file_path, question):
    """Send `question` plus the JSON file's contents to Gemini and return the reply text."""
    try:
        with open(json_file_path, 'r') as f:
            json_data = json.load(f)
    except FileNotFoundError:
        return "Error: JSON file not found."
    except json.JSONDecodeError:
        return "Error: Invalid JSON format."

    # Re-serialize compactly (no indentation) to keep the prompt shorter
    json_string = json.dumps(json_data)

    # Name of the Gemini model to query
    model = "gemini-2.0-flash"

    # Build the prompt: the question first, then the JSON data
    contents = f"{question}\n{json_string}"
    try:
        response = client.models.generate_content(model=model, contents=contents)
        return response.text
    except Exception as e:
        return f"Error: {e}"



In [None]:
question = """The following JSON file has a customer_id, a review_headline, and a review_body for each review.
How many reviews are there in total? Count how many times the word customer_id appears.
"""
json_file_path = 'reviews_playstation.json'
result = analyze_json_reviews(json_file_path, question)


log_response = f"GEMINI: {question}\n{result}\n-------------------\n\n"
with open('logs.txt', 'a') as file:
    file.write(log_response)
print(result)

# Note: it looks like LLMs cannot count reliably when the text is long

Based on the provided JSON data, the word "customer_id" appears 202 times, indicating that there are **202 reviews** in total.
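
The JSON file written earlier is a list with one object per review, so the true count is simply the length of that list and can be compared with the model's answer of 202. The sketch below assumes that list-of-objects layout; `count_json_reviews` and the two stand-in records are hypothetical and only demonstrate the idea on a temporary file.

```python
import json
import os
import tempfile

def count_json_reviews(path: str) -> int:
    """Return the number of review objects stored in a JSON list."""
    with open(path, "r", encoding="utf-8") as f:
        reviews = json.load(f)
    return len(reviews)

# Made-up stand-in records with the same keys as the notebook's JSON file
sample_reviews = [
    {"customer_id": 1, "review_headline": "example headline", "review_body": "example body"},
    {"customer_id": 2, "review_headline": "another headline", "review_body": "another body"},
]

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample_reviews.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        json.dump(sample_reviews, f)
    sample_count = count_json_reviews(sample_path)

print(sample_count)
```

In the notebook, `count_json_reviews('reviews_playstation.json')` would give the number to compare against the model's reply.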


In [32]:
question = """The following text contains reviews of an Amazon product. \n
     What improvements should be made to the product, based on all reviews? \n
"""
json_file_path = 'reviews_playstation.json'
result = analyze_json_reviews(json_file_path, question)


log_response = f"GEMINI: {question}\n{result}\n-------------------\n\n"
with open('logs.txt', 'a') as file:
    file.write(log_response)
print(result)

Based on the reviews, here's a breakdown of potential improvements that should be made to the product:

**1. Durability & Build Quality (Highest Priority):**

*   **Problem:** Many, many reviews cite the plastic hinges breaking easily, often within a few months, even with careful use. This is the most consistently reported issue and a major source of dissatisfaction. Cover wearing away.
*   **Proposed Improvements:**
    *   **Redesign the hinge mechanism:**  Use stronger materials (e.g., metal reinforcement, thicker plastic) or a different design to prevent breakage.
    *   **Improve plastic quality:** Even aside from the hinge, some users report general cheapness and fragility in the plastic. Invest in more durable, higher-quality plastics.
    *   **Reinforce the headband:** Reinforce the top band to prevent cracking, especially in the sections near the pivot hinges, or top of head band.

**2. Microphone Quality:**

*   **Problem:** A significant number of users report issues with 

In [5]:
# Check whether Gemini is receiving the whole text (tokens sent: around 250,000)
text = data
word_count = len(text.split())
print(f"Word count: {word_count}")
print(text)

Word count: 154122
35574474: perfect, thank Its, perfect, thank you
37861475: Excellent A little spendy but the sound quality is amazing and it works perfect out of the box on PS4 or a PC (Windows 10) with Skype.
37987371: these are definitely more comfortable but flawed nonetheless I had the Sony Pulse before these and broke apart within a couple of months, these are definitely more comfortable but flawed nonetheless. Both sets sound great in Game, but poorly when streaming video or listening to music (except when you plug them to good sounding mobile devices from Apple and Sony where they just sound amazing). Sony should add a firmware to let the consoles do the sound processing instead of the earphones and have their design team test their durability before sending them into production.
50549864: pretty good, but wait until on sale... Sound quality is okay but not great (lacks low bass that I like), build quality a bit fragile (can't see hinges or plastic frame lasting long), and ha
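
A word count only approximates what the model receives, since the API measures input in tokens. A possible way to get the exact figure is the SDK's token-counting call. This sketch assumes the `google-genai` client exposes `client.models.count_tokens(model=..., contents=...)` and that the response carries a `total_tokens` field; `count_prompt_tokens` is an illustrative helper, not part of the notebook.

```python
def count_prompt_tokens(client, text: str, model: str = "gemini-2.0-flash") -> int:
    """Ask the API how many tokens `text` occupies for `model`.

    Assumes `client` is a google-genai Client, as created at the top of the notebook.
    """
    response = client.models.count_tokens(model=model, contents=text)
    return response.total_tokens

# Usage in the notebook (needs the `client` and `data` defined in earlier cells):
# print(f"Token count: {count_prompt_tokens(client, data)}")
```

Comparing this number with the model's documented input limit would show whether the full review text fits in a single request.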

In [None]:
question = 

start_time = time.time()
response = client.models.generate_content(model="gemini-2.0-flash", contents=f"""
     {data} \n {question}""") # question at the end 
end_time = time.time()
response_log = response.text
duration = end_time - start_time

print(response_log)

logit = f"gemini-2.0-flash: untrained, no previous prompt \n {question}\n{response_log}\n Duration:{duration:.3f} \n-------------------\n\n"

with open('logs_gemini-2.0-flash.txt', 'a') as file:
    file.write(logit)



Okay, I'm ready to analyze the PlayStation Gold Wireless Stereo Headset reviews. Here's a breakdown:

**Positive Highlights**

*   **Comfort:** Many customers emphasize how comfortable the headset is, even for extended gaming sessions. The large earcups, good padding, and lightweight design are frequently mentioned. Also good for glasses wearers.
*   **Sound Quality:** The virtual surround sound is considered impressive for the price point, enhancing the immersive experience in games and movies. Specific sounds like footsteps, gunshots, and environmental details are often cited as being clear and directional.
*   **Ease of Use:** The headset is generally described as plug-and-play, with a simple setup process, especially on PlayStation consoles.
*   **Value for Money:** A recurring theme is that the headset provides excellent value for its price, often compared favorably to more expensive brands like Turtle Beach and Astro.
*   **Wireless Convenience:** The wireless functionality is hi
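
The cells in this notebook repeat the same append-to-log pattern with slightly different labels, which makes mislabelled entries easy to produce. One option is a small helper that builds the entry in one place. This is only a sketch: `append_log`, its parameters, and the demo values are hypothetical, and the demo writes to a temporary file rather than to the notebook's log files.

```python
import os
import tempfile
import time

def append_log(path, label, question, answer, duration=None):
    """Append one question/answer entry to a log file and return the entry text."""
    entry = f"{label}: {question}\n{answer}\n"
    if duration is not None:
        entry += f"Duration: {duration:.3f}\n"
    entry += "-------------------\n\n"
    with open(path, "a", encoding="utf-8") as f:
        f.write(entry)
    return entry

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "logs_demo.txt")
    start = time.perf_counter()
    demo_answer = "placeholder answer"  # stands in for response.text
    elapsed = time.perf_counter() - start
    demo_entry = append_log(demo_path, "gemini-2.0-flash", "How many reviews?", demo_answer, elapsed)
    with open(demo_path, encoding="utf-8") as f:
        demo_logged = f.read()

print(demo_logged)
```

Passing the model name as `label` keeps each log entry tied to the model that actually produced the answer.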

'amazon_review_prompts.json'