# **STEAM REVIEW SENTIMENT & PLAYER BEHAVIOR ANALYSIS**
Authors: `Krystal Bacalso` `Javier Raut` `Joseph Desyolong` `Jhon Omblero` `Hayah Apistar`

## **Phase 1: API Selection — Steam User Reviews API**
### **Overview**
Steam is one of the world's largest digital distribution platforms for PC gaming, hosting thousands of games and millions of active players. The Steam Web API allows developers and researchers to access various types of public data from the platform, such as game information, user statistics, and community content.
<br><br>
### **Focus of the Analysis**
Our project explores the question:

> *“What factors influence how players review games on Steam, and how do those reviews reflect player experience and engagement?”*

Through this analysis, we aim to identify patterns that connect:
- **Player engagement** (playtime, ownership history)  
- **Behavioral indicators** (helpful votes, purchase type)  
- **Sentiment or recommendation trends** (voted_up vs. review content)

We will analyze these relationships to uncover how players form opinions and how engagement behaviors influence review positivity or negativity.
<br><br>
### **What Kind of Data the Steam User Reviews API Provides**
The Steam User Reviews API returns a range of fields describing user behavior, playtime, and feedback. The fields most relevant to our analysis are listed below:

| Field | Description |
|:------|:-------------|
| `review` | The text content of the user’s review |
| `voted_up` | Indicates whether the user recommended the game (positive = True, negative = False) |
| `votes_up` | Number of users who found the review helpful |
| `author.num_games_owned` | Total number of games owned by the reviewer |
| `author.num_reviews` | Number of reviews written by the user |
| `author.playtime_forever` | Total lifetime playtime for the game, in minutes |
| `author.playtime_last_two_weeks` | Playtime in the past two weeks, in minutes |
| `author.playtime_at_review` | Playtime at the time the review was written, in minutes |
| `timestamp_created` | Unix timestamp when the review was created |
| `timestamp_updated` | Unix timestamp when the review was last updated |
| `steam_purchase` | True if the reviewer purchased the game on Steam |
| `received_for_free` | True if the reviewer indicated that they received the game for free |
| `written_during_early_access` | True if the review was posted while the game was in Early Access |

These selected fields will serve as the foundation for our analysis of player engagement, review behavior, and sentiment trends.

By combining metadata such as playtime and ownership with review outcomes (e.g., `voted_up`), we can uncover what factors drive positive or negative reviews and how engagement reflects player satisfaction across different games.
<br><br>
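
As a minimal sketch of how the fields above could be prepared for analysis, the block below flattens nested review records into a table. The `sample_reviews` list is made-up data shaped like the API's `reviews` array, and `reviews_to_frame` and `playtime_hours_at_review` are hypothetical names introduced only for this illustration.

```python
import pandas as pd

# Hypothetical records shaped like entries of the "reviews" array (values are made up).
sample_reviews = [
    {
        "review": "Relaxing and surprisingly deep.",
        "voted_up": True,
        "votes_up": 12,
        "timestamp_created": 1700000000,
        "timestamp_updated": 1700000000,
        "steam_purchase": True,
        "received_for_free": False,
        "written_during_early_access": False,
        "author": {
            "num_games_owned": 150,
            "num_reviews": 8,
            "playtime_forever": 6000,
            "playtime_last_two_weeks": 120,
            "playtime_at_review": 5400,
        },
    },
    {
        "review": "Not for me.",
        "voted_up": False,
        "votes_up": 0,
        "timestamp_created": 1700086400,
        "timestamp_updated": 1700086400,
        "steam_purchase": True,
        "received_for_free": False,
        "written_during_early_access": False,
        "author": {
            "num_games_owned": 20,
            "num_reviews": 1,
            "playtime_forever": 45,
            "playtime_last_two_weeks": 45,
            "playtime_at_review": 30,
        },
    },
]


def reviews_to_frame(reviews):
    """Flatten nested review dicts into one row per review."""
    # Nested keys become dotted column names such as "author.playtime_forever".
    df = pd.json_normalize(reviews)
    # Convert Unix seconds to timezone-aware datetimes.
    for col in ("timestamp_created", "timestamp_updated"):
        df[col] = pd.to_datetime(df[col], unit="s", utc=True)
    # Playtime is reported in minutes; add an hours column for readability.
    df["playtime_hours_at_review"] = df["author.playtime_at_review"] / 60
    return df


df = reviews_to_frame(sample_reviews)
print(df[["voted_up", "author.num_games_owned", "playtime_hours_at_review"]])
```

Keeping the dotted `author.*` column names means the table columns match the field names listed above one to one.
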
### **Example: How to Call the Steam User Reviews API**
The Steam User Reviews API endpoint is structured as follows:
```text
https://store.steampowered.com/appreviews/{app_id}?json=1
```
Replace `{app_id}` with the game’s unique Steam App ID. Adding `?json=1` ensures the response is returned in JSON format.

For example, Stardew Valley has the App ID `413150`. To retrieve its reviews:


```python
import requests
import pandas as pd

app_id = 413150  # Stardew Valley
url = f"https://store.steampowered.com/appreviews/{app_id}"

params = {
    "json": 1,
    "filter": "recent",           # Sort reviews by creation time, newest first
    "language": "all",            # Include all languages
    "day_range": 365,             # Only applies when filter is "all"; ignored for "recent"
    "review_type": "all",         # Include both positive and negative reviews
    "purchase_type": "all",       # Include both Steam and non-Steam purchases
    "num_per_page": 100,          # Fetch 100 reviews per request (the maximum)
    "cursor": "*"                 # Start from the first page of results
}

response = requests.get(url, params=params, timeout=30)
response.raise_for_status()       # Stop early on HTTP errors
data = response.json()
```
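
Before using a response, it helps to check that the query succeeded and to look at its summary. The sketch below assumes the parsed JSON contains the keys `success`, `query_summary`, `reviews`, and `cursor`. `summarize_response` and the `sample` payload are hypothetical and use made-up numbers.

```python
def summarize_response(data):
    """Return headline counts from a parsed appreviews response."""
    if data.get("success") != 1:
        raise ValueError("Steam reported an unsuccessful query")
    summary = data.get("query_summary", {})
    total = summary.get("total_reviews", 0)
    positive = summary.get("total_positive", 0)
    return {
        "reviews_on_page": len(data.get("reviews", [])),
        "total_reviews": total,
        # Share of all matching reviews that recommend the game.
        "positive_share": positive / total if total else None,
        # Value to send as "cursor" on the next request.
        "next_cursor": data.get("cursor"),
    }


# Hypothetical payload standing in for response.json() (numbers are made up).
sample = {
    "success": 1,
    "query_summary": {
        "num_reviews": 2,
        "total_positive": 300,
        "total_negative": 100,
        "total_reviews": 400,
    },
    "reviews": [{"voted_up": True}, {"voted_up": False}],
    "cursor": "AoJ4example",
}

print(summarize_response(sample))
```
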

### **Explanation of Parameters**

| Parameter | Description |
|------------|-------------|
| `filter` | Sort order: `recent` (creation time), `updated` (last update time), or `all` (helpfulness, the default). |
| `language` | Language of the reviews to return (e.g., `english`), or `all` to include every language. |
| `day_range` | Number of days back from today to search for helpful reviews (max 365). Only applies when `filter` is `all`. |
| `cursor` | Handles pagination. Pass `*` on the first request, then the `cursor` value returned with each response to fetch the next batch. |
| `review_type` | Filters reviews by recommendation: `all`, `positive`, or `negative`. |
| `purchase_type` | Filters by how the game was acquired: `all`, `steam`, or `non_steam_purchase`. |
| `num_per_page` | Number of reviews returned per request (default 20, max 100). |
| `filter_offtopic_activity` | Optional. Off-topic activity (“review bombs”) is excluded by default; set to `0` to include it. |

<br>
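
The `cursor` parameter can be used to collect more than one page of reviews. The following is a minimal sketch of that loop, not a definitive implementation. `fetch_reviews`, `max_pages`, `pause`, and the injectable `get` argument are hypothetical names introduced for illustration (`get` exists only so the loop can be exercised without network access).

```python
import time

import requests


def fetch_reviews(app_id, max_pages=3, pause=1.0, get=requests.get):
    """Collect up to max_pages pages of reviews by following the cursor."""
    url = f"https://store.steampowered.com/appreviews/{app_id}"
    params = {
        "json": 1,
        "filter": "recent",
        "language": "all",
        "num_per_page": 100,
        "cursor": "*",  # "*" requests the first page
    }
    reviews = []
    seen_cursors = {"*"}
    for _ in range(max_pages):
        response = get(url, params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
        batch = data.get("reviews", [])
        # Stop on a failed query or an empty page.
        if data.get("success") != 1 or not batch:
            break
        reviews.extend(batch)
        cursor = data.get("cursor")
        # Stop if no cursor is returned or it repeats (no further pages).
        if not cursor or cursor in seen_cursors:
            break
        seen_cursors.add(cursor)
        params["cursor"] = cursor  # requests URL-encodes the value for us
        time.sleep(pause)  # be polite between requests
    return reviews


# Example call (performs real HTTP requests, so it is left commented out):
# reviews = fetch_reviews(413150, max_pages=2)
```

Capping the number of pages and pausing between requests keeps the sketch conservative; both values are arbitrary and can be tuned to the size of the dataset needed.
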

### **Why This API is Useful**

Using these parameters allows us to **customize the dataset** we collect — for example:
- Focusing only on **recent reviews** to study new player sentiment.  
- Comparing **positive vs. negative** review trends.  
- Measuring **player engagement** via playtime, ownership, and purchase history.  

This flexible structure makes the Steam Reviews API a strong foundation for **data-driven analysis** on player experience, engagement, and review behavior.
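
As one illustration of the kind of comparison described above, the sketch below computes the share of positive reviews per playtime bucket. The rows are made up, and the bucket edges and the names `playtime_bucket` and `positive_share` are arbitrary choices for this example rather than part of the API.

```python
import pandas as pd

# Made-up rows: recommendation flag and playtime at review (minutes).
df = pd.DataFrame({
    "voted_up": [True, True, False, True, False, False],
    "playtime_at_review": [30, 200, 45, 3000, 90, 700],
})

# Convert minutes to hours and assign each review to a bucket.
hours = df["playtime_at_review"] / 60
df["playtime_bucket"] = pd.cut(
    hours,
    bins=[0, 2, 10, float("inf")],
    labels=["<2h", "2-10h", "10h+"],
    right=False,  # intervals are [0, 2), [2, 10), [10, inf)
)

# The mean of a boolean column is the share of True values.
positive_share = df.groupby("playtime_bucket", observed=False)["voted_up"].mean()
print(positive_share)
```
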
