# Level Up: Exploring the World of Video Game Sales
***
Author: Kellie Leopold    
Date: February 18th, 2025  

#### *Purpose*
Create an exploratory data analysis (EDA) project on a 2024 video game sales dataset using GitHub, Git, Jupyter, pandas, Seaborn, and other popular data analytics tools.

#### *Introduction*
Welcome to the ultimate “press start” moment of our data analysis adventure! In this exploration, we’re diving deep into the world of video game sales, where the numbers are as varied as the genres of games themselves. From Mario’s iconic leaps to the Fortnite dance craze, the gaming industry has seen its fair share of highs, lows, and unexpected plot twists.

But behind every sold-out console and record-breaking title, there’s a story to be told — and it’s all about the data. How do game sales measure up across different platforms, genres, and decades? Which titles made players’ thumbs sore and wallets happy? And can we predict which future release might break all the records?

Grab your controller (or, you know, a cup of coffee), because we’re about to uncover the truth behind the numbers. Spoiler alert: The data doesn’t lie, but it does make for some pretty interesting analysis!

***

#### Imports
* pandas
* pathlib
* Seaborn
* matplotlib.pyplot

In [45]:
import pandas as pd
import pathlib
import seaborn as sns
import matplotlib.pyplot as plt

#### 1. Load the Data
* Load the CSV dataset downloaded from Kaggle.
* Inspect the first few lines of data to ensure they loaded correctly.

In [46]:
# Build the path to the CSV file, assuming it sits in the current working directory (the folder the notebook runs from)
csv_file_path = pathlib.Path.cwd() / 'vgchartz-2024.csv'

# Read the CSV file into a DataFrame
df = pd.read_csv(csv_file_path)

# Display the first few rows of the DataFrame
print(df.head())

                         title console    genre       publisher  \
0           Grand Theft Auto V     PS3   Action  Rockstar Games   
1           Grand Theft Auto V     PS4   Action  Rockstar Games   
2  Grand Theft Auto: Vice City     PS2   Action  Rockstar Games   
3           Grand Theft Auto V    X360   Action  Rockstar Games   
4    Call of Duty: Black Ops 3     PS4  Shooter      Activision   

        developer  critic_score  total_sales  north_amer_sales  jpn_sales  \
0  Rockstar North           9.4        20.32              6.37       0.99   
1  Rockstar North           9.7        19.39              6.06       0.60   
2  Rockstar North           9.6        16.15              8.41       0.47   
3  Rockstar North           NaN        15.86              9.06       0.06   
4        Treyarch           8.1        15.09              6.18       0.41   

   euro_africa_sales  other_sales release_date last_update  
0               9.85         3.12    9/17/2013         NaN  
1           
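
The `release_date` column loads as plain text by default. The sketch below shows one way it could be converted to a real datetime so that later steps can group by year. It uses a hypothetical one-row sample built from the first row printed above, not the full CSV, and it assumes every date follows the month/day/year pattern seen in `9/17/2013`.

```python
import io
import pandas as pd

# Hypothetical one-row sample copied from the first row of the output above;
# the notebook itself reads vgchartz-2024.csv instead.
sample_csv = io.StringIO(
    "title,console,total_sales,release_date\n"
    "Grand Theft Auto V,PS3,20.32,9/17/2013\n"
)
sample = pd.read_csv(sample_csv)

# Assumption: dates are month/day/year. errors="coerce" turns anything
# that does not match into NaT instead of raising an error.
sample["release_date"] = pd.to_datetime(
    sample["release_date"], format="%m/%d/%Y", errors="coerce"
)

# With a datetime column, the release year is one accessor away.
release_year = sample["release_date"].dt.year
print(sample.dtypes)
print(release_year)
```

If the full dataset mixes date formats, the coerced `NaT` values would need a closer look before any time-based analysis.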

#### 2. Initial Data Inspection
* Display the first 10 rows of the DataFrame.
* Check the shape.
* Display the data types of each column.

In [47]:
# Display the first few rows of the DataFrame
print(df.head(10)) # Prints the first 10 rows of data

# Display the data shape
print(df.shape) # Prints (rows, columns)

# Display the data type
print(df.dtypes) # Prints data types of each column

                            title console             genre       publisher  \
0              Grand Theft Auto V     PS3            Action  Rockstar Games   
1              Grand Theft Auto V     PS4            Action  Rockstar Games   
2     Grand Theft Auto: Vice City     PS2            Action  Rockstar Games   
3              Grand Theft Auto V    X360            Action  Rockstar Games   
4       Call of Duty: Black Ops 3     PS4           Shooter      Activision   
5  Call of Duty: Modern Warfare 3    X360           Shooter      Activision   
6         Call of Duty: Black Ops    X360           Shooter      Activision   
7           Red Dead Redemption 2     PS4  Action-Adventure  Rockstar Games   
8      Call of Duty: Black Ops II    X360           Shooter      Activision   
9      Call of Duty: Black Ops II     PS3           Shooter      Activision   

        developer  critic_score  total_sales  north_amer_sales  jpn_sales  \
0  Rockstar North           9.4        20.32         
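
The preview above already shows a `NaN` in `critic_score`, so a missing-value count is a natural companion to the shape and dtype checks. The sketch below is a minimal illustration on a hypothetical five-row sample copied from the rows printed above; in the notebook the same calls would be applied to `df`.

```python
import pandas as pd

# Hypothetical sample copied from the first five rows shown above.
sample = pd.DataFrame({
    "title": [
        "Grand Theft Auto V",
        "Grand Theft Auto V",
        "Grand Theft Auto: Vice City",
        "Grand Theft Auto V",
        "Call of Duty: Black Ops 3",
    ],
    "console": ["PS3", "PS4", "PS2", "X360", "PS4"],
    "critic_score": [9.4, 9.7, 9.6, float("nan"), 8.1],
    "total_sales": [20.32, 19.39, 16.15, 15.86, 15.09],
})

# Count missing values per column and express them as a share of all rows.
summary = pd.DataFrame({
    "missing": sample.isna().sum(),
    "share": sample.isna().mean().round(2),
})
print(summary)
```

Columns with a high share of missing values may need to be filtered or handled separately before plotting.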

#### 3. Initial Descriptive Statistics
* Display summary statistics for each numeric column.

In [48]:
print(df.describe())

       critic_score   total_sales  north_amer_sales    jpn_sales  \
count   6678.000000  18922.000000      12637.000000  6726.000000   
mean       7.220440      0.349113          0.264740     0.102281   
std        1.457066      0.807462          0.494787     0.168811   
min        1.000000      0.000000          0.000000     0.000000   
25%        6.400000      0.030000          0.050000     0.020000   
50%        7.500000      0.120000          0.120000     0.040000   
75%        8.300000      0.340000          0.280000     0.120000   
max       10.000000     20.320000          9.760000     2.130000   

       euro_africa_sales   other_sales  
count       12824.000000  15128.000000  
mean            0.149472      0.043041  
std             0.392653      0.126643  
min             0.000000      0.000000  
25%             0.010000      0.000000  
50%             0.040000      0.010000  
75%             0.140000      0.030000  
max             9.850000      3.120000  
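
By default, `describe()` summarizes only numeric columns, which is why `title`, `console`, `genre`, and the other text columns are absent from the table above. The sketch below shows one possible way to summarize the text columns as well, again on a hypothetical five-row sample taken from the preview rows rather than the full dataset.

```python
import pandas as pd

# Hypothetical sample copied from the first five rows of the preview.
sample = pd.DataFrame({
    "title": [
        "Grand Theft Auto V",
        "Grand Theft Auto V",
        "Grand Theft Auto: Vice City",
        "Grand Theft Auto V",
        "Call of Duty: Black Ops 3",
    ],
    "console": ["PS3", "PS4", "PS2", "X360", "PS4"],
    "genre": ["Action", "Action", "Action", "Action", "Shooter"],
    "total_sales": [20.32, 19.39, 16.15, 15.86, 15.09],
})

# Keep only the non-numeric columns, then describe them.
# The result reports count, unique, top (most frequent value), and freq.
cat_summary = sample.select_dtypes(exclude="number").describe()
print(cat_summary)
```

On the full DataFrame, the `unique` row gives a quick sense of how many distinct consoles, genres, and publishers the dataset covers.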
