# README

- **Author**: `黃書佑`
- **Created At**: `2025-10-08`
- **Last Modified At**: `2025-10-08`

---

## What does this file do?

- `Merge each game's data.`
- `Rename 'DateTime' to 'Date', keeping only the date part.`
- `Keep only rows dated from 2023-01-01 to 2025-08-31 (inclusive).`
- `Remove the column 'Average Players'.`
- `Remove the column 'Historical Low'.`
- `Add a column 'GameID', which is the game's ID.`
- `Add a column 'MultiPlayer', which indicates whether the game is multi-player.`
- `Add a column 'ReleaseYear', which is the game's release year.`
- `Add a column 'OriginalPrice', which is the game's price at release.`
- `Add a column 'FixDiscountRate', which indicates whether the game has a constant discount rate.`
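
The five added columns hold one value per game, so they can be attached as constants once the tables are merged. The following is a minimal sketch of that step, not the notebook's own code: the helper name `add_game_columns` and every attribute value except the game ID (taken from the file names) are placeholders chosen for illustration.

```python
import pandas as pd


def add_game_columns(df, game_id, multiplayer, release_year,
                     original_price, fix_discount_rate):
    """Return a copy of df with one constant-valued column per game attribute."""
    return df.assign(
        GameID=game_id,
        MultiPlayer=multiplayer,
        ReleaseYear=release_year,
        OriginalPrice=original_price,
        FixDiscountRate=fix_discount_rate,
    )


# Toy frame standing in for the merged data.
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-01", "2023-01-02"]).date,
    "Players": [10, 12],
})

# All attribute values below are placeholders, not verified facts about the game.
tagged = add_game_columns(
    toy,
    game_id=413150,
    multiplayer=True,
    release_year=2016,
    original_price=398,
    fix_discount_rate=False,
)
print(tagged)
```

`DataFrame.assign` returns a new frame, so the input table is left unchanged.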

---

## What does this file take?

- **Source Data Sets**:  
  1. `/data/raw/followers_413150` 
    - Description: `The data contains date and followers from 2019-07-20 to 2025-10-05 at 00:00:00.` 
  2. `/data/raw/players_413150`
    - Description: `The data contains date, players and average players from 2015-12-01 to 2025-10-07 at 00:00:00.` 
  3. `/data/raw/price_413150` 
    - Description: `The data contains datetime, price and historical lowest price from 2016-02-26 to 2025-10-07.` 
  4. `/data/raw/reviews_413150`
    - Description: `The data contains date, positive reviews and negative reviews from 2016-02-26 to 2025-09-26 at 00:00:00.` 
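
The "Load input here" cell is empty, so the tables used later (`players`, `price`, `reviews`) have to be read from the paths above first. A minimal loading sketch follows. The helper `load_raw` is hypothetical, and the `.csv` suffix is an assumption because the paths are listed without a file extension.

```python
from pathlib import Path

import pandas as pd

RAW_DIR = Path("/data/raw")  # directory named in the source list


def load_raw(kind, game_id, raw_dir=RAW_DIR, suffix=".csv"):
    """Read one raw table, e.g. kind="price" -> <raw_dir>/price_<game_id><suffix>.

    The ".csv" suffix is an assumption; the README gives no file extension.
    """
    return pd.read_csv(Path(raw_dir) / f"{kind}_{game_id}{suffix}")


# Intended use in the loading cell (commented out because the files are local):
# followers = load_raw("followers", 413150)
# players = load_raw("players", 413150)
# price = load_raw("price", 413150)
# reviews = load_raw("reviews", 413150)
```

If the files use another format, only the `suffix` default and the `pd.read_csv` call would need to change.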
  
  
---

## What does this file output?

- `/data/final/game_413150`  
  - Description: `Each row contains the columns 'Date', 'Followers', 'Players', 'Price', 'Positive reviews', 'Negative reviews', 'GameID', 'MultiPlayer', 'ReleaseYear', 'OriginalPrice', 'FixDiscountRate'.`
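
Before writing the output, the result can be checked against this column list. The sketch below is one possible check, with a hypothetical helper `missing_columns`. It assumes the column names are spelled exactly as in the description above.

```python
import pandas as pd

# Column names copied from the output description.
EXPECTED_COLUMNS = [
    "Date", "Followers", "Players", "Price",
    "Positive reviews", "Negative reviews",
    "GameID", "MultiPlayer", "ReleaseYear",
    "OriginalPrice", "FixDiscountRate",
]


def missing_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column names that df does not have, in order."""
    return [name for name in expected if name not in df.columns]


# Toy frame with only some of the expected columns.
partial = pd.DataFrame(columns=["Date", "Players", "Price"])
print(missing_columns(partial))
```

An empty list means every documented column is present.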


In [1]:
# Load packages here
import pandas as pd




In [None]:
# Load input here




In [None]:
# 1. Drop the 'Historical Low' column from the price table
if "Historical Low" in price.columns:
    price.drop(columns=["Historical Low"], inplace=True)

# 2. Drop the 'Average Players' column from the players table
if "Average Players" in players.columns:
    players.drop(columns=["Average Players"], inplace=True)

# 3. Convert the DateTime column to Date (keeping only the date part)
def fix_date_column(df):
    if "DateTime" in df.columns:
        df["Date"] = pd.to_datetime(df["DateTime"]).dt.date
        df.drop(columns=["DateTime"], inplace=True)
    elif "Date" in df.columns:
        df["Date"] = pd.to_datetime(df["Date"]).dt.date
    return df

players = fix_date_column(players)
price = fix_date_column(price)
reviews = fix_date_column(reviews)

# 4. Keep only rows dated from 2023-01-01 to 2025-08-31
start_date = pd.to_datetime("2023-01-01").date()
end_date = pd.to_datetime("2025-08-31").date()

players = players[(players["Date"] >= start_date) & (players["Date"] <= end_date)]
price = price[(price["Date"] >= start_date) & (price["Date"] <= end_date)]
reviews = reviews[(reviews["Date"] >= start_date) & (reviews["Date"] <= end_date)]

# 5. Left-join price and reviews onto players by Date
merged = players.merge(price, on="Date", how="left").merge(reviews, on="Date", how="left")

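# Fill missing price values with 398
# (assign the result back; in-place fillna on a selected column may not update merged)
if "price" in merged.columns:
    merged["price"] = merged["price"].fillna(398)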

# Show the first five rows
merged.head()
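
The README lists followers as a source and 'Followers' as an output column, but the cell above joins only price and reviews onto players. The sketch below shows one way the followers table could be joined in the same manner. It uses toy data and a hypothetical helper `merge_on_date`, and it assumes each joined table has at most one row per date, which `validate="many_to_one"` enforces.

```python
from functools import reduce

import pandas as pd


def merge_on_date(base, *others):
    """Left-join each frame in others onto base using the shared 'Date' column."""
    return reduce(
        lambda left, right: left.merge(
            right, on="Date", how="left", validate="many_to_one"
        ),
        others,
        base,
    )


# Toy tables standing in for the real data; all values are illustrative.
dates = pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]).date
players = pd.DataFrame({"Date": dates, "Players": [100, 120, 90]})
followers = pd.DataFrame({"Date": dates[:2], "Followers": [5000, 5010]})
price = pd.DataFrame({"Date": dates[:1], "Price": [398.0]})

merged = merge_on_date(players, followers, price)
print(merged)
```

With `validate="many_to_one"`, pandas raises an error when a joined table repeats a date, for example when several price changes fall on the same day, instead of silently duplicating rows.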
