In [2]:
# -------------------- 📌 Phase 4: Generative AI Recommendation System --------------------

# 🔍 Objective:
# In this phase, we use machine learning to find similar games based on text features,
# then apply two Generative AI models — GPT (OpenAI) and LLaMA-like (via Hugging Face) —
# to explain the recommendations in human-friendly language.

# ✅ Algorithm choice:
# - We use TF-IDF with Nearest Neighbors (cosine distance) because it is simple, interpretable, and well suited to sparse textual data.
# - GPT (gpt-3.5-turbo) and Mistral-7B-Instruct (served via Hugging Face, as a substitute for LLaMA) are used to demonstrate explanation styles.

# ⚠️ Notes:
# - GPT requires an OpenAI API key (free or paid account).
# - The Hugging Face model requires a free access token (from hf.co/settings/tokens).

# -----------------------------------------------------------------------

# 📦 Import required libraries
import os

import numpy as np
import pandas as pd
import openai
from huggingface_hub import InferenceClient
from openai import OpenAI
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors


# --------------------------------------------
# 🔐 Read your OpenAI and Hugging Face API keys from environment variables
#    (set OPENAI_API_KEY and HF_TOKEN before running; never paste keys into the notebook)
# --------------------------------------------
openai.api_key = os.environ["OPENAI_API_KEY"]
HUGGINGFACE_API_KEY = os.environ["HF_TOKEN"]
client_hf = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.1", token=HUGGINGFACE_API_KEY)

# --------------------------------------------
# 📥 Load dataset
# --------------------------------------------
# Note: the file name ends in .xls, but it is parsed here as CSV text.
url = "https://raw.githubusercontent.com/AljawharahAlotaibi/swe485/main/Dataset/updated_cleaned_games.xls"
df = pd.read_csv(url)

# 📄 Process the genre/category and text columns
df['genres'] = df['genres_y'].astype(str).apply(lambda x: x.strip("[]").replace("'", "").split(', '))
df['categories'] = df['categories_y'].astype(str).apply(lambda x: x.strip("[]").replace("'", "").split(', '))
df['text'] = df['detailed_description'].fillna("")

# --------------------------------------------
# ✍️ Ask user for game name
# --------------------------------------------
game_name = input("Enter a game name from the dataset: ").strip()  # examples from the dataset: Stacking, Planet Centauri, All Alone, Twelve Minutes

# --------------------------------------------
# 🧠 TF-IDF + Nearest Neighbors for Recommendation
# --------------------------------------------
tfidf = TfidfVectorizer(max_features=300)
tfidf_matrix = tfidf.fit_transform(df['text'])

# Ask for 4 neighbors: the query game itself plus three recommendations
nn = NearestNeighbors(n_neighbors=4, metric='cosine')
nn.fit(tfidf_matrix)

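def get_recommendations(name):
    """Return (base_game_row, recommended_rows), or (None, None) if the name is unknown."""
    matches = np.flatnonzero(df['name'].values == name)
    if len(matches) == 0:
        print("❌ Game not found.")
        return None, None

    pos = matches[0]  # positional index, so it lines up with the rows of tfidf_matrix
    distances, indices = nn.kneighbors(tfidf_matrix[pos])
    # Drop the game itself by position instead of assuming it is always the first neighbor
    neighbor_pos = [i for i in indices[0] if i != pos][:3]
    return df.iloc[pos], df.iloc[neighbor_pos]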

base_game, recos = get_recommendations(game_name)
if recos is None:
    raise ValueError("Game not found. Please run the cell again and enter a valid game name.")

# --------------------------------------------
# ✨ Compose prompt for GPT and Hugging Face
# --------------------------------------------
# Use the NaN-free 'text' column so slicing never fails on a missing description
gpt_prompt = f"""
The user is interested in a game called '{base_game['name']}'. Here's its description:

{base_game['text'][:600]}

Below are the descriptions of three recommended games:
1. {recos.iloc[0]['name']}: {recos.iloc[0]['text'][:400]}
2. {recos.iloc[1]['name']}: {recos.iloc[1]['text'][:400]}
3. {recos.iloc[2]['name']}: {recos.iloc[2]['text'][:400]}

Explain why these are good recommendations for someone who enjoyed '{base_game['name']}'.
"""

# --------------------------------------------
# 🤖 Get GPT Explanation (OpenAI)
# --------------------------------------------
client = OpenAI(api_key=openai.api_key)

response_gpt = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful game advisor."},
        {"role": "user", "content": gpt_prompt}
    ]
)

gpt_output = response_gpt.choices[0].message.content

# --------------------------------------------
# 🦙 Get LLaMA-like Explanation (Hugging Face - Mistral)
# --------------------------------------------
response_llama = client_hf.text_generation(
    gpt_prompt,
    max_new_tokens=400,
    temperature=0.7
)

# --------------------------------------------
# 📊 Print Outputs for Comparison
# --------------------------------------------
print("\n" + "="*60)
print("🤖 GPT Explanation:\n")
print(gpt_output)
print("\n" + "="*60)
print("🦙 LLaMA (Mistral) Explanation:\n")
print(response_llama)


Enter a game name from the dataset:  All Alone



🤖 GPT Explanation:

These recommendations might not align perfectly with the horror genre of 'All Alone,' but they do have some elements that could appeal to players who enjoyed the quiet terror and suspense experience of the game:

1. **The Vanishing of Ethan Carter**: This game also focuses on exploration and discovery rather than combat, similar to 'All Alone.' Players who enjoyed the mysterious and immersive atmosphere of 'All Alone' might appreciate the narrative-driven experience of trying to unravel the mysteries in 'The Vanishing of Ethan Carter.'

2. **Dorasyeoda**: While this game is set in a different time period and has a different premise, it also offers a unique world and story for players to explore. If someone enjoyed the sense of mystery and uncovering the secrets of the environment in 'All Alone,' they might appreciate the depth of lore and storytelling in 'Dorasyeoda.'

3. **Ultimate Fishing Simulator 2**: Despite being a fishing simulator, this game offers a relaxi
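
The retrieval step in the cell above (TF-IDF vectors ranked by cosine distance) can be illustrated with a small, dependency-free sketch. This is not the notebook's implementation: it is a simplified stand-in that assumes scikit-learn's default smoothed IDF formula and L2 normalization, uses a rough approximation of its default tokenizer, and runs on four invented toy descriptions (`toy_descriptions` and all helper names are hypothetical). Unlike the cell above, it excludes the query by index rather than by dropping the first neighbor.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep tokens of two or more word characters
    return re.findall(r"\b\w\w+\b", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalized {term: weight} dict per document."""
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term
    doc_freq = Counter(term for counts in counts_per_doc for term in counts)
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for counts in counts_per_doc:
        vec = {t: tf * idf[t] for t, tf in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_distance(a, b):
    # Vectors are unit length, so the dot product is the cosine similarity
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def nearest(vectors, query_idx, k):
    """Indices of the k documents closest to query_idx, excluding itself."""
    others = [i for i in range(len(vectors)) if i != query_idx]
    others.sort(key=lambda i: cosine_distance(vectors[query_idx], vectors[i]))
    return others[:k]


# Invented toy data, not taken from the games dataset
toy_descriptions = [
    "A quiet horror game about exploring an abandoned house alone",
    "Explore a haunted house in this atmospheric horror mystery",
    "Relaxing fishing simulator with realistic lakes and rivers",
    "Catch fish in rivers and lakes in a relaxing simulator",
]
vectors = tfidf_vectors(toy_descriptions)
neighbors = nearest(vectors, query_idx=0, k=2)
print(neighbors)
```

Because ranking depends only on shared vocabulary, a description with little word overlap with any other game can still receive unrelated neighbors, which may be why a fishing simulator appears among the recommendations for a horror title in the output above.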

In [3]:
# ================================================
# 🧠 Final Choice & Justification
# ================================================

# We tested two different Generative AI templates:
# 1. GPT (gpt-3.5-turbo from OpenAI)
# 2. Mistral-7B-Instruct (a LLaMA-like model served via Hugging Face)

# Both models explained the same TF-IDF recommendations, and both explanations were useful.
# We decided to move forward with GPT for the following reasons:

# ✅ GPT provided more immersive and emotionally engaging explanations.
# ✅ It matched the tone of the original game more naturally, which aligns 
#    with how gamers think when looking for similar experiences (themes, tone, mood).
# ✅ It required less prompt engineering to get relevant responses.

# 🛑 Mistral was still useful, but it produced slightly more mechanical and repetitive 
# outputs, and often restated the same sentence pattern for each recommendation. 

# ⚠️ Note: Although GPT is subject to payment or usage quotas, the assignment does not
# mandate fully free models; it only requires comparing two templates.
# Therefore, we based our choice on output quality rather than cost.

# 📚 Reference: The Phase 4 description explicitly says:
# "You must apply at least two templates and demonstrate the differences between their outcomes",
# which is what we have done.
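
The observation that one model's output was more repetitive could be backed by a simple number. The sketch below computes a distinct-n ratio (unique n-grams divided by total n-grams), a common text-diversity heuristic. It is an assumption for illustration, not a measurement made in this notebook; `distinct_n` is a hypothetical helper, and the two sample strings are invented rather than real model outputs.

```python
import re


def distinct_n(text, n=2):
    """Share of unique word n-grams in text (1.0 means no n-gram repeats)."""
    tokens = re.findall(r"\w+", text.lower())
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Invented sample strings, used only to show how the score behaves
varied = "This game rewards patience. Its world hides secrets. Fans of mystery will enjoy it."
repetitive = "This game is similar because it is fun. This game is similar because it is fun."

print(f"varied:     {distinct_n(varied):.2f}")
print(f"repetitive: {distinct_n(repetitive):.2f}")
```

Applied to `gpt_output` and `response_llama` from the previous cell, a lower score for one model would be consistent with the repetition described above, though a single prompt is too small a sample to draw firm conclusions.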
