
# Exploratory Data Analysis: Game Sales

This notebook explores the video game sales dataset.
The goal is to understand overall trends in the data, identify the top-selling genres, platforms, publishers, and regions, and examine how sales change from year to year.

## Dataset Description

- **Data Source**: Processed game sales data, loaded from `processed_game_sales_data.csv`
- **Columns**:
  - `Rank`: Overall ranking based on global sales.
  - `Name`: Title of the game.
  - `Platform`: Platform on which the game was released.
  - `Year`: Year of release.
  - `Genre`: Genre of the game.
  - `Publisher`: Publisher of the game.
  - `North_America_Sales`, `Europe_Sales`, `Japan_Sales`, `Other_Region_Sales`, `Total_Global_Sales`: Regional and global sales in millions of units.
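
The column list above can be turned into a quick sanity check once `df` is loaded. The following is a minimal sketch, not part of the original notebook: `check_schema`, the `tolerance` value, and the toy rows are hypothetical. It also assumes that the four regional columns should add up to `Total_Global_Sales` within rounding, which the description does not state.

```python
import pandas as pd

# Column names copied from the dataset description.
REGION_COLUMNS = [
    "North_America_Sales", "Europe_Sales", "Japan_Sales", "Other_Region_Sales",
]
EXPECTED_COLUMNS = [
    "Rank", "Name", "Platform", "Year", "Genre", "Publisher",
    *REGION_COLUMNS, "Total_Global_Sales",
]


def check_schema(frame, tolerance=0.02):
    """Return (missing_columns, mismatched_rows).

    mismatched_rows holds rows whose regional sales do not add up to
    Total_Global_Sales within `tolerance` (an assumed rounding allowance).
    """
    missing = [col for col in EXPECTED_COLUMNS if col not in frame.columns]
    if missing:
        # Cannot compare sales columns reliably; return an empty frame.
        return missing, frame.iloc[0:0]
    regional_total = frame[REGION_COLUMNS].sum(axis=1)
    gap = (regional_total - frame["Total_Global_Sales"]).abs()
    return missing, frame[gap > tolerance]


# Toy rows with made-up placeholder values, NOT real sales figures.
toy = pd.DataFrame({
    "Rank": [1, 2],
    "Name": ["Game A", "Game B"],
    "Platform": ["P1", "P2"],
    "Year": [2000, 2001],
    "Genre": ["Action", "Sports"],
    "Publisher": ["Pub X", "Pub Y"],
    "North_America_Sales": [1.0, 0.5],
    "Europe_Sales": [0.5, 0.25],
    "Japan_Sales": [0.25, 0.25],
    "Other_Region_Sales": [0.25, 0.0],
    "Total_Global_Sales": [2.0, 1.5],  # second row is deliberately inconsistent
})

missing_columns, mismatched_rows = check_schema(toy)
print("Missing columns:", missing_columns)
print("Rows where regions do not sum to the global total:")
print(mismatched_rows[["Name", "Total_Global_Sales"]])
```

In the notebook itself, `check_schema(df)` would replace the toy frame. If the regional columns are not meant to sum to the global figure, the second return value can simply be ignored.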


In [None]:

# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
data_path = 'processed_game_sales_data.csv'
df = pd.read_csv(data_path)

# Display dataset info
df.info()

# Display first few rows
df.head()


In [None]:

# Summary statistics
# (printed explicitly: a notebook cell only auto-displays its last expression)
print(df.describe())

# Top 10 games by global sales
top_games = df.nlargest(10, 'Total_Global_Sales')
print("Top 10 Games by Global Sales:")
print(top_games[['Name', 'Total_Global_Sales']])

# Sales distribution by genre
plt.figure(figsize=(10, 6))
sns.boxplot(x='Genre', y='Total_Global_Sales', data=df)
plt.xticks(rotation=45)
plt.title('Sales Distribution by Genre')
plt.show()

# Yearly sales trends
yearly_sales = df.groupby('Year')['Total_Global_Sales'].sum()
plt.figure(figsize=(10, 6))
sns.lineplot(x=yearly_sales.index, y=yearly_sales.values)
plt.title('Yearly Global Sales Trends')
plt.xlabel('Year')
plt.ylabel('Total Global Sales (in millions)')
plt.show()



## Conclusions

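1. **Top Games**: The top-10 table lists the best-selling titles ranked by `Total_Global_Sales`.
2. **Genre Insights**: The box plot compares per-game sales distributions across genres. Some genres (e.g., Action, Sports) appear to dominate sales while others show niche performance, though a per-genre sum is needed to confirm total-sales dominance, since the box plot shows spread rather than totals.
3. **Yearly Trends**: Total global sales peak in certain years, which likely reflects major releases or platform cycles.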
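
Conclusions 2 and 3 can be checked numerically rather than read off the plots. The sketch below is one possible way to do it, not part of the original analysis: the helper names `genre_totals` and `peak_year` are hypothetical, and the toy frame uses made-up placeholder values only so the block runs on its own.

```python
import pandas as pd


def genre_totals(frame):
    """Total global sales per genre, largest first."""
    totals = frame.groupby("Genre")["Total_Global_Sales"].sum()
    return totals.sort_values(ascending=False)


def peak_year(frame):
    """Year with the highest summed global sales.

    Rows with a missing Year are dropped by groupby's default behaviour.
    """
    yearly = frame.groupby("Year")["Total_Global_Sales"].sum()
    return yearly.idxmax()


# Toy rows with made-up placeholder values, NOT real sales figures.
toy = pd.DataFrame({
    "Genre": ["Action", "Sports", "Action", "Puzzle"],
    "Year": [2000, 2000, 2001, 2001],
    "Total_Global_Sales": [2.0, 1.0, 1.5, 0.5],
})

print(genre_totals(toy))
print("Peak year:", peak_year(toy))
```

With the real data, `genre_totals(df)` and `peak_year(df)` would give the figures needed to name the leading genres and the peak year explicitly.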

Further analysis can examine regional trends and platform-specific performance.
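
As a starting point for that follow-up, the sketch below computes each region's share of sales per platform. It is an illustration under stated assumptions: `regional_share` is a hypothetical helper, the shares are taken relative to the sum of the four regional columns (not `Total_Global_Sales`), and the toy rows are made-up placeholders.

```python
import pandas as pd

# Regional column names as given in the dataset description.
REGION_COLUMNS = [
    "North_America_Sales", "Europe_Sales", "Japan_Sales", "Other_Region_Sales",
]


def regional_share(frame, by="Platform"):
    """Per-group share of each region in the summed regional sales.

    Each row of the result sums to 1. A group whose regional sales are
    all zero yields NaN, because the division is 0 / 0.
    """
    totals = frame.groupby(by)[REGION_COLUMNS].sum()
    return totals.div(totals.sum(axis=1), axis=0)


# Toy rows with made-up placeholder values, NOT real sales figures.
toy = pd.DataFrame({
    "Platform": ["P1", "P1", "P2"],
    "North_America_Sales": [1.0, 1.0, 0.0],
    "Europe_Sales": [0.5, 0.5, 0.5],
    "Japan_Sales": [0.25, 0.25, 1.5],
    "Other_Region_Sales": [0.25, 0.25, 0.0],
})

shares = regional_share(toy)
print(shares.round(3))
```

Passing `by="Genre"` or `by="Publisher"` gives the same breakdown along a different dimension, and the result can be drawn as a stacked bar chart with `shares.plot(kind="bar", stacked=True)`.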
