# 🎮 GameIntel AI – Sentiment + Simulated Scene Classification

✅ Final Project Notebook: Fully Commented & Annotated

- **Part 1:** Real sentiment analysis (Steam reviews)
- **Part 2:** Simulated scene classification + Grad-CAM

⚠️ Note: The CV (scene classification) section is simulated. Replace the simulated data with real screenshots in the `screenshots/` directory when available.

---

### 📋 Run With Real Data Checklist
✅ Prepare folders like:
```
screenshots/
├── combat/
├── menu/
├── shop/
└── cutscene/
```
✅ Replace the simulated data cell with `ImageDataGenerator.flow_from_directory()`.
✅ Train ResNet50 as shown in Part 2.
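
Before wiring the folders into `flow_from_directory()`, a quick sanity check of the layout can help. The sketch below uses only the standard library; the helper names `index_screenshots` and `count_per_class` and the accepted file extensions are assumptions for illustration. It assigns label indices in alphabetical order of the subfolder names, which is how `flow_from_directory()` infers classes by default.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your screenshots.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def index_screenshots(root):
    """Return (class_names, samples) for a root folder with one subfolder per class.

    samples is a list of (file_path, label_index) pairs; label indices follow
    the alphabetical order of the subfolder names.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for label, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            if path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, label))
    return class_names, samples


def count_per_class(class_names, samples):
    """Count how many images each class folder contains."""
    counts = {name: 0 for name in class_names}
    for _, label in samples:
        counts[class_names[label]] += 1
    return counts
```

Calling `index_screenshots("screenshots")` on the layout above should list the classes as `combat`, `cutscene`, `menu`, `shop`, in that order. A class with a count of zero points to an empty or misnamed folder.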


## 🔍 Part 1: Sentiment Prediction – Real Data

In [None]:

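# Load the Steam reviews dataset
import pandas as pd

df = pd.read_csv("steam_reviews.csv")  # expects the file in the notebook's working directory

# Sentiment score = share of positive ratings among all ratings.
# Games with no ratings are dropped: 0/0 would give NaN, which the rule below would label 'Negative'.
total_ratings = df['positive_ratings'] + df['negative_ratings']
df = df[total_ratings > 0].copy()
df['sentiment_score'] = df['positive_ratings'] / (df['positive_ratings'] + df['negative_ratings'])
# Label a game 'Positive' when more than 75% of its ratings are positive
df['sentiment_label'] = df['sentiment_score'].apply(lambda x: 'Positive' if x > 0.75 else 'Negative')
df[['sentiment_score', 'sentiment_label']].head()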
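
The labeling rule in the cell above can be stated for a single game as follows. This is a minimal plain-Python sketch; the function name `label_sentiment` and the choice to return `None` for a game without ratings are assumptions, not part of the notebook.

```python
def label_sentiment(positive, negative, threshold=0.75):
    """Label one game from its rating counts.

    Returns 'Positive' when the share of positive ratings is strictly above
    the threshold, 'Negative' otherwise, and None when there are no ratings
    (the share is undefined in that case).
    """
    total = positive + negative
    if total == 0:
        return None
    score = positive / total
    return 'Positive' if score > threshold else 'Negative'


print(label_sentiment(90, 10))  # share 0.90
print(label_sentiment(75, 25))  # share 0.75, not strictly above the threshold
print(label_sentiment(0, 0))    # no ratings
```

Note that the comparison is strict, so a game with exactly 75% positive ratings is labeled `Negative`.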


In [None]:

# Train a Random Forest on price & playtime to predict sentiment
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_curve
import matplotlib.pyplot as plt
import seaborn as sns

# .copy() avoids pandas' SettingWithCopyWarning when the label column is reassigned below
df_model = df[['price', 'average_playtime', 'sentiment_label']].dropna().copy()
df_model['sentiment_label'] = df_model['sentiment_label'].map({'Positive':1, 'Negative':0})
X = df_model[['price', 'average_playtime']]
y = df_model['sentiment_label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(random_state=42)  # fixed seed so results are reproducible
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print(classification_report(y_test, y_pred))

# ROC curve
y_prob = model.predict_proba(X_test)[:,1]
fpr, tpr, _ = roc_curve(y_test, y_prob)
plt.plot(fpr, tpr)
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.show()
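
A ROC curve is often summarized by the area under it (AUC). As a rough illustration of what that number is, the sketch below integrates a curve with the trapezoidal rule in plain Python. The function name `trapezoid_auc` and the sample points are made up; in the notebook, the `fpr` and `tpr` arrays from `roc_curve` would take their place, and scikit-learn's `roc_auc_score(y_test, y_prob)` computes the same quantity directly.

```python
def trapezoid_auc(fpr, tpr):
    """Area under a ROC curve given points sorted by increasing false positive rate."""
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        mean_height = (tpr[i] + tpr[i - 1]) / 2
        area += width * mean_height
    return area


# Made-up ROC points for illustration only
fpr_demo = [0.0, 0.0, 0.5, 1.0]
tpr_demo = [0.0, 0.5, 1.0, 1.0]
print(trapezoid_auc(fpr_demo, tpr_demo))  # 0.875
```

A classifier that guesses at random traces the diagonal and scores 0.5, so values well above 0.5 are what to look for.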


In [None]:

# Feature importance
importance = model.feature_importances_
sns.barplot(x=importance, y=X.columns)
plt.title("Feature Importance")
plt.show()


## 🖼️ Part 2: Simulated Scene Classification + Grad-CAM

⚠️ This section uses simulated data.
- Replace `X_fake` and `y_fake` with real images and labels from `screenshots/` when available.
- Keep the folder structure described in the checklist above.


In [None]:

# Simulate images and labels
import numpy as np
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

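# 500 random "images"; generating float32 directly needs about 300 MB instead of 600 MB for float64
rng = np.random.default_rng(42)
X_fake = rng.random((500, 224, 224, 3), dtype=np.float32)
y_fake = rng.integers(0, 4, size=500)  # 4 classes
y_fake_cat = to_categorical(y_fake, num_classes=4)

X_train, X_val, y_train, y_val = train_test_split(X_fake, y_fake_cat, test_size=0.2, random_state=42)

# weights=None trains ResNet50 from scratch (no ImageNet pretraining)
base_model = ResNet50(weights=None, include_top=False, input_shape=(224,224,3))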
x = GlobalAveragePooling2D()(base_model.output)
output = Dense(4, activation='softmax')(x)
model_cv = Model(inputs=base_model.input, outputs=output)

model_cv.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model_cv.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=3)


In [None]:

# Simulated training accuracy plot
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='train acc')
plt.plot(history.history['val_accuracy'], label='val acc')
plt.legend()
plt.title("Simulated ResNet50 Training Accuracy")
plt.show()


## 🔥 Simulated Grad-CAM Example

In [None]:

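# Simulate Grad-CAM on a random image
import cv2

sample_img = np.random.rand(224, 224, 3).astype("float32")
img_tensor = np.expand_dims(sample_img, axis=0)  # add batch dimension: (1, 224, 224, 3)

# Model that returns both the last conv feature map and the class predictions
grad_model = tf.keras.models.Model(
    model_cv.inputs,
    [model_cv.get_layer('conv5_block3_out').output, model_cv.output],
)

with tf.GradientTape() as tape:
    conv_outputs, predictions = grad_model(img_tensor)
    top_class = tf.argmax(predictions[0])
    loss = predictions[:, top_class]  # score of the predicted class

# Gradient of the class score w.r.t. the feature map; [0] drops the batch axis
grads = tape.gradient(loss, conv_outputs)[0]
# Average over the two spatial axes only, leaving one weight per channel
pooled_grads = tf.reduce_mean(grads, axis=(0, 1))
cam = tf.reduce_sum(tf.multiply(pooled_grads, conv_outputs[0]), axis=-1)

# ReLU, then scale to [0, 1]; the epsilon guards against an all-zero map
cam = tf.maximum(cam, 0)
cam = (cam / (tf.reduce_max(cam) + 1e-8)).numpy()
cam = cv2.resize(cam, (224, 224))
heatmap = cv2.applyColorMap(np.uint8(255 * cam), cv2.COLORMAP_JET)
heatmap = cv2.cvtColor(heatmap, cv2.COLOR_BGR2RGB)  # OpenCV returns BGR; matplotlib expects RGB
# Clip before the uint8 cast so bright pixels do not wrap around
superimposed_img = np.clip(heatmap * 0.4 + sample_img * 255, 0, 255)

plt.imshow(superimposed_img.astype("uint8"))
plt.title("Simulated Grad-CAM Heatmap")
plt.axis("off")
plt.show()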
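
To make the tensor shapes in the Grad-CAM computation easier to follow, here is the weighting step on its own in NumPy. It is a sketch on tiny made-up arrays, not the notebook's model: `feature_map` stands in for the conv output and `grads` for the gradient of the class score with respect to it, both of shape (height, width, channels). The helper name `grad_cam_map` is hypothetical.

```python
import numpy as np


def grad_cam_map(feature_map, grads, eps=1e-8):
    """Combine a conv feature map and its gradients into a [0, 1] heatmap.

    Both inputs have shape (height, width, channels).
    """
    # One weight per channel: gradients averaged over the spatial axes
    channel_weights = grads.mean(axis=(0, 1))            # (channels,)
    # Weighted sum of the channels gives a single 2-D map
    cam = (feature_map * channel_weights).sum(axis=-1)   # (height, width)
    # Keep only positive evidence, then scale to [0, 1]
    cam = np.maximum(cam, 0)
    return cam / (cam.max() + eps)


# Made-up inputs for illustration only
demo_rng = np.random.default_rng(0)
feature_map = demo_rng.random((7, 7, 16))
grads = demo_rng.normal(size=(7, 7, 16))
cam_demo = grad_cam_map(feature_map, grads)
print(cam_demo.shape)  # (7, 7)
```

The key point is that the gradients are averaged over height and width only, so each channel keeps its own weight before the channels are summed.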


## 📝 Conclusion

- 📄 Part 1: Trained a Random Forest to predict a rating-based sentiment label from price and average playtime.
- 🖼️ Part 2: Demonstrated scene classification pipeline with simulated data.
- 🔜 When real screenshots are available, replace simulated data with real images.

🎯 Next steps: Integrate real data and deploy a Streamlit dashboard for interactive use.
