# Project: Video Game Sales 2024 Data Analysis

In this project, I analyze the Video Game Sales 2024 dataset. Despite the name, it is not limited to 2024 releases: the rows previewed below were released between 2002 and 2015. The goal is to perform exploratory data analysis (EDA), build predictive models using machine learning, and visualize the results.

## Analysis Goals:
- Investigate how factors like genre, platform, and release year influence video game sales.
- Analyze regional sales distribution and identify the most profitable regions (North America, Europe, Japan, Other).
- Explore how factors such as ratings and other metadata can predict game success.
- Build a predictive model to estimate global video game sales based on available features.

## Data Preparation:
- Checking for missing values and anomalies.
- Data transformations: removing duplicates, normalizing, and categorizing when necessary.
- Handling columns with missing values and ensuring the data is ready for analysis.
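
The preparation steps listed above could be sketched roughly as follows. This is a minimal sketch, not the notebook's actual cleaning code: the helper name `prepare`, the choice to coerce unparseable dates to `NaT`, and the choice to drop rows without `total_sales` are all assumptions. The sample rows are copied from the `df.head()` preview in this notebook, with some columns omitted for brevity.

```python
import io

import pandas as pd

# Rows copied from this notebook's df.head() preview (subset of columns).
SAMPLE_CSV = """title,console,genre,critic_score,total_sales,release_date,last_update
Grand Theft Auto V,PS3,Action,9.4,20.32,2013-09-17,
Grand Theft Auto V,PS4,Action,9.7,19.39,2014-11-18,2018-01-03
Grand Theft Auto: Vice City,PS2,Action,9.6,16.15,2002-10-28,
Grand Theft Auto V,X360,Action,,15.86,2013-09-17,
Call of Duty: Black Ops 3,PS4,Shooter,8.1,15.09,2015-11-06,2018-01-14
"""


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a cleaned copy of the raw frame."""
    out = df.drop_duplicates().copy()
    # Parse date columns; anything unparseable or missing becomes NaT.
    for col in ("release_date", "last_update"):
        out[col] = pd.to_datetime(out[col], errors="coerce")
    # Store low-cardinality labels as categoricals.
    for col in ("console", "genre"):
        out[col] = out[col].astype("category")
    # Assumption: a row without a sales figure is not usable downstream.
    out = out.dropna(subset=["total_sales"])
    return out.reset_index(drop=True)


raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
missing_per_column = raw.isna().sum()  # count of missing values per column
clean = prepare(raw)

print(missing_per_column)
print(clean.dtypes)
```

On the full dataset the same function would be applied to the frame returned by `pd.read_csv`; whether missing `critic_score` values should be imputed or dropped is left open here.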

## Types of Analysis:
1. **Exploratory Data Analysis (EDA)**:
   - Analyzing the distribution of global sales and sales by region (North America, Europe, Japan, Other).
   - Investigating how `critic_score` and `genre` relate to `total_sales` (correlation for the numeric score, group comparisons for the categorical genre) to uncover factors influencing game success.
   - Examining sales trends over time based on `release_date`.
   - Exploring the impact of `publisher` and `developer` on sales, identifying industry leaders.
   - Identifying the most popular consoles (`console`) and comparing sales across regions (`na_sales`, `jp_sales`, `pal_sales`, `other_sales`).

2. **Prediction Models**:
   - Building a regression model to predict global sales.
   - Evaluating the impact of features like release year, platform, genre, and critic score on sales predictions.

3. **Visualization**:
   - Using histograms and scatter plots to examine the relationships between key variables (e.g., sales, critic scores, release year).
   - Creating heatmaps to visualize correlations between columns and identify significant relationships.
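
For the prediction step, one possible starting point is an ordinary least-squares baseline on two numeric features. This is only a sketch under stated assumptions: the function names are hypothetical, missing critic scores are filled with the median, and categorical features such as `console` and `genre` are left out (they would need one-hot encoding first). The five rows come from the `df.head()` preview in this notebook, so the fitted numbers carry no meaning; the point is the shape of the pipeline and the comparison against a mean-only baseline.

```python
import io

import numpy as np
import pandas as pd

# Rows copied from this notebook's df.head() preview (subset of columns).
SAMPLE_CSV = """console,genre,critic_score,total_sales,release_date
PS3,Action,9.4,20.32,2013-09-17
PS4,Action,9.7,19.39,2014-11-18
PS2,Action,9.6,16.15,2002-10-28
X360,Action,,15.86,2013-09-17
PS4,Shooter,8.1,15.09,2015-11-06
"""


def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Numeric design matrix: intercept, release year, critic score."""
    feats = pd.DataFrame(index=df.index)
    feats["intercept"] = 1.0
    feats["release_year"] = pd.to_datetime(df["release_date"]).dt.year.astype(float)
    # Assumption: fill missing scores with the median instead of dropping rows.
    feats["critic_score"] = df["critic_score"].fillna(df["critic_score"].median())
    return feats


def fit_least_squares(X: pd.DataFrame, y: pd.Series) -> pd.Series:
    """Solve min ||X b - y||^2 and return the coefficients by feature name."""
    coef, *_ = np.linalg.lstsq(X.to_numpy(), y.to_numpy(), rcond=None)
    return pd.Series(coef, index=X.columns)


def rmse(y_true, y_pred) -> float:
    """Root mean squared error between two equally long sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
X, y = build_features(df), df["total_sales"]
coef = fit_least_squares(X, y)

# In-sample errors only; a real evaluation needs a held-out test set.
model_rmse = rmse(y, X.to_numpy() @ coef.to_numpy())
baseline_rmse = rmse(y, np.full(len(y), y.mean()))

print(coef.round(3))
print(f"in-sample RMSE: model={model_rmse:.3f}, mean-only baseline={baseline_rmse:.3f}")
```

Comparing against the mean-only baseline is a cheap way to see whether the features add anything at all before moving to a richer model.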



## Dataset
This project uses the Video Game Sales dataset from Kaggle:  
🔗 [https://www.kaggle.com/datasets/hosammhmdali/video-game-sales-2024](https://www.kaggle.com/datasets/hosammhmdali/video-game-sales-2024)


In [1]:
import pandas as pd 

### Field Description:
- **img** : URL slug for the box art at vgchartz.com
- **title** : Game title
- **console** : Console the game was released for
- **genre** : Genre of the game
- **publisher** : Publisher of the game
- **developer** : Developer of the game
- **critic_score** : Metacritic score (out of 10)
- **total_sales** : Global sales, in millions of copies
- **na_sales** : North American sales, in millions of copies
- **jp_sales** : Japanese sales, in millions of copies
- **pal_sales** : European and African sales, in millions of copies
- **other_sales** : Rest-of-world sales, in millions of copies
- **release_date** : Date the game was released on
- **last_update** : Date the data was last updated
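
Given these field definitions, a quick sanity check is to compare `total_sales` with the sum of the four regional columns. The field list does not state that the total equals that sum, so treat it as an assumption; the helper name `regional_gap` and the 0.02 tolerance (to absorb rounding to two decimals) are also assumptions. The values below are copied from the `df.head()` preview in this notebook.

```python
import pandas as pd

REGION_COLS = ["na_sales", "jp_sales", "pal_sales", "other_sales"]

# Sales figures copied from this notebook's df.head() preview.
preview = pd.DataFrame(
    {
        "title": [
            "Grand Theft Auto V",
            "Grand Theft Auto V",
            "Grand Theft Auto: Vice City",
            "Grand Theft Auto V",
            "Call of Duty: Black Ops 3",
        ],
        "console": ["PS3", "PS4", "PS2", "X360", "PS4"],
        "total_sales": [20.32, 19.39, 16.15, 15.86, 15.09],
        "na_sales": [6.37, 6.06, 8.41, 9.06, 6.18],
        "jp_sales": [0.99, 0.60, 0.47, 0.06, 0.41],
        "pal_sales": [9.85, 9.71, 5.49, 5.33, 6.05],
        "other_sales": [3.12, 3.02, 1.78, 1.42, 2.44],
    }
)


def regional_gap(df: pd.DataFrame, tol: float = 0.02) -> pd.DataFrame:
    """Hypothetical helper: flag rows where regions do not add up to the total."""
    # Treat a missing regional figure as zero sales for the purpose of the check.
    regional_sum = df[REGION_COLS].fillna(0).sum(axis=1)
    gap = (df["total_sales"] - regional_sum).abs()
    return df.assign(
        regional_sum=regional_sum.round(2),
        gap=gap.round(2),
        consistent=gap <= tol,
    )


checked = regional_gap(preview)
print(checked[["title", "console", "total_sales", "regional_sum", "gap", "consistent"]])
```

Rows flagged as inconsistent on the full dataset would be candidates for the anomaly review described in the data preparation section.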

In [2]:
# Load the dataset
file_path = 'data/vgchartz-2024.csv'  # replace with the actual path
df = pd.read_csv(file_path)

# Show the first few rows of the dataset
df.head()


| Unnamed: 0 | img | title | console | genre | publisher | developer | critic_score | total_sales | na_sales | jp_sales | pal_sales | other_sales | release_date | last_update |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | /games/boxart/full_6510540AmericaFrontccc.jpg | Grand Theft Auto V | PS3 | Action | Rockstar Games | Rockstar North | 9.4 | 20.32 | 6.37 | 0.99 | 9.85 | 3.12 | 2013-09-17 | |
| 1 | /games/boxart/full_5563178AmericaFrontccc.jpg | Grand Theft Auto V | PS4 | Action | Rockstar Games | Rockstar North | 9.7 | 19.39 | 6.06 | 0.6 | 9.71 | 3.02 | 2014-11-18 | 2018-01-03 |
| 2 | /games/boxart/827563ccc.jpg | Grand Theft Auto: Vice City | PS2 | Action | Rockstar Games | Rockstar North | 9.6 | 16.15 | 8.41 | 0.47 | 5.49 | 1.78 | 2002-10-28 | |
| 3 | /games/boxart/full_9218923AmericaFrontccc.jpg | Grand Theft Auto V | X360 | Action | Rockstar Games | Rockstar North | | 15.86 | 9.06 | 0.06 | 5.33 | 1.42 | 2013-09-17 | |
| 4 | /games/boxart/full_4990510AmericaFrontccc.jpg | Call of Duty: Black Ops 3 | PS4 | Shooter | Activision | Treyarch | 8.1 | 15.09 | 6.18 | 0.41 | 6.05 | 2.44 | 2015-11-06 | 2018-01-14 |
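
With the data loaded, the first EDA steps described earlier (regional comparison, sales by console, score versus sales) might look like the sketch below. It runs on the five preview rows shown above rather than the full file, so the numbers it prints describe only those rows; the variable names are illustrative, not taken from the rest of the notebook.

```python
import io

import pandas as pd

# Rows copied from the df.head() preview above (subset of columns).
SAMPLE_CSV = """title,console,genre,critic_score,total_sales,na_sales,jp_sales,pal_sales,other_sales
Grand Theft Auto V,PS3,Action,9.4,20.32,6.37,0.99,9.85,3.12
Grand Theft Auto V,PS4,Action,9.7,19.39,6.06,0.6,9.71,3.02
Grand Theft Auto: Vice City,PS2,Action,9.6,16.15,8.41,0.47,5.49,1.78
Grand Theft Auto V,X360,Action,,15.86,9.06,0.06,5.33,1.42
Call of Duty: Black Ops 3,PS4,Shooter,8.1,15.09,6.18,0.41,6.05,2.44
"""
REGION_COLS = ["na_sales", "jp_sales", "pal_sales", "other_sales"]

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Total copies sold per region, and each region's share of the regional sum.
region_totals = df[REGION_COLS].sum().sort_values(ascending=False)
region_share = (region_totals / region_totals.sum()).round(3)

# Total sales per console across all titles in the frame.
sales_by_console = (
    df.groupby("console")["total_sales"].sum().sort_values(ascending=False)
)

# Pearson correlation; rows with a missing critic_score are skipped.
score_sales_corr = df["critic_score"].corr(df["total_sales"])

print(region_totals)
print(region_share)
print(sales_by_console)
print(f"critic_score vs total_sales correlation: {score_sales_corr:.3f}")
```

The same three statements apply unchanged to the full frame; swapping `"console"` for `"genre"`, `"publisher"`, or `"developer"` in the `groupby` covers the other comparisons in the EDA list.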


#### Analysis by: Stanislav Kuranov   
#### Date: [26.02.2025]  
