# Video Game Industry Sales Analysis Project

In this project, we dive deep into global video game sales data to uncover key insights and trends driving the industry. Our analysis will address four main objectives:

1. Identify Top-Selling Titles Worldwide: We'll determine which games are leading the market and examine critical success factors, to understand what makes these titles resonate with audiences globally.

2. Examine Sales Trends Over Time: By analyzing historical sales data, we aim to capture growth and decline patterns within the industry, helping us understand shifts in consumer behavior.

3. Discover Genre Specializations Across Consoles: We'll explore how game genres vary across different consoles, uncovering preferences that may influence platform-specific marketing and development strategies.

4. Analyze Regional Popularity: We’ll dive into regional sales data to identify localized preferences, highlighting games, genres, and consoles that vary in popularity across different parts of the world.

This project will provide insights into the video game industry’s evolution, offering valuable takeaways for developers, marketers, and investors.

## 1. Overview

In [1]:
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
from scipy.stats import pearsonr
import seaborn as sns
import plotly.express as px


In [2]:
# Load the dataset
file_path = r'C:\Users\paulo\Documents\Freelance Job\Video+Game+Sales\vgchartz-2024.csv'
df = pd.read_csv(file_path)
total_rows = len(df)  # 64016 rows in this dataset

# Dataframe overview
df.head()

Unnamed: 0,img,title,console,genre,publisher,developer,critic_score,total_sales,na_sales,jp_sales,pal_sales,other_sales,release_date,last_update
0,/games/boxart/full_6510540AmericaFrontccc.jpg,Grand Theft Auto V,PS3,Action,Rockstar Games,Rockstar North,9.4,20.32,6.37,0.99,9.85,3.12,2013-09-17,
1,/games/boxart/full_5563178AmericaFrontccc.jpg,Grand Theft Auto V,PS4,Action,Rockstar Games,Rockstar North,9.7,19.39,6.06,0.6,9.71,3.02,2014-11-18,2018-01-03
2,/games/boxart/827563ccc.jpg,Grand Theft Auto: Vice City,PS2,Action,Rockstar Games,Rockstar North,9.6,16.15,8.41,0.47,5.49,1.78,2002-10-28,
3,/games/boxart/full_9218923AmericaFrontccc.jpg,Grand Theft Auto V,X360,Action,Rockstar Games,Rockstar North,,15.86,9.06,0.06,5.33,1.42,2013-09-17,
4,/games/boxart/full_4990510AmericaFrontccc.jpg,Call of Duty: Black Ops 3,PS4,Shooter,Activision,Treyarch,8.1,15.09,6.18,0.41,6.05,2.44,2015-11-06,2018-01-14


**Column	Description**
- img:	Link to the box art image of the game
- title:	Title of the video game
- console:	The gaming console the game was released on
- genre:	Genre of the video game
- publisher:	Publisher of the video game
- developer:	Developer of the video game
- critic_score:	Critic score of the game
- total_sales:	Total sales of the game (in millions of units)
- na_sales:	Sales in North America (in millions of units)
- jp_sales:	Sales in Japan (in millions of units)
- pal_sales:	Sales in PAL region (Europe and Australia) (in millions of units)
- other_sales:	Sales in other regions (in millions of units)
- release_date:	Release date of the game
- last_update:	Date when the sales data was last updated

In [3]:
# Display basic info
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 64016 entries, 0 to 64015
Data columns (total 14 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   img           64016 non-null  object 
 1   title         64016 non-null  object 
 2   console       64016 non-null  object 
 3   genre         64016 non-null  object 
 4   publisher     64016 non-null  object 
 5   developer     63999 non-null  object 
 6   critic_score  6678 non-null   float64
 7   total_sales   18922 non-null  float64
 8   na_sales      12637 non-null  float64
 9   jp_sales      6726 non-null   float64
 10  pal_sales     12824 non-null  float64
 11  other_sales   15128 non-null  float64
 12  release_date  56965 non-null  object 
 13  last_update   17879 non-null  object 
dtypes: float64(6), object(8)
memory usage: 6.8+ MB


**Here's a summary of the columns with null values and how I handled them:**

| Column | Number of Null Values | Imputation Method |
|---|---|---|
| img | 0 | Dropped as it's not relevant to our analysis |
| title | 0 | No action needed |
| console | 0 | No action needed |
| genre | 0 | No action needed |
| publisher | 0 | No action needed |
| developer | 17 | Filled with 'Unknown' |
| critic_score | 57338 | Filled with the mean of the column |
| total_sales | 45094 | Filled with 0 |
| na_sales | 51379 | Filled with 0 |
| jp_sales | 57290 | Filled with 0 |
| pal_sales | 51192 | Filled with 0 |
| other_sales | 48888 | Filled with 0 |
| release_date | 7051 | No action needed |
| last_update | 46137 | Dropped as it's not relevant to our analysis |
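
The null counts in the summary above can be derived from the frame instead of being transcribed by hand. The following is a minimal sketch on a small stand-in frame; `sample` and its values are hypothetical and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real dataset (values are illustrative only)
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "developer": ["Dev1", None, "Dev3", "Dev4"],
    "critic_score": [9.4, np.nan, np.nan, 8.1],
    "total_sales": [20.32, np.nan, 0.5, np.nan],
})

# Count nulls per column and express them as a share of all rows
null_summary = pd.DataFrame({
    "null_count": sample.isna().sum(),
    "null_pct": (sample.isna().mean() * 100).round(1),
})
print(null_summary)
```

Running the same two expressions on `df` should reproduce the counts listed above, with the percentage column making it easier to see how sparse `critic_score` and the regional sales columns are.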

In [4]:
# Filter rows where release_date is empty and sum total_sales
total_sales_sum = df[df['release_date'].isna()]['total_sales'].sum()

print("The sum of total_sales for rows with empty release_date is:", total_sales_sum)

The sum of total_sales for rows with empty release_date is: 6.59


In [5]:
# Filter rows where release_date is empty and count the unique titles
titles_without_date = df[df['release_date'].isna()]['title'].nunique()

print("The number of unique titles with empty release_date is:", titles_without_date)

The number of unique titles with empty release_date is: 3891


The rows without a release_date account for only 6.59 million units of total sales, which is negligible in a dataset where single titles sell tens of millions of units. Leaving these rows out of the time-based analysis therefore does not materially change the sales trends, and since they carry no date they could not be placed on a timeline anyway.

In short, these rows contribute only marginally to total sales and do not alter how sales behave over time. They do, however, affect the number of titles released per year, because 3,891 titles have no release date.
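
To put the two figures above in proportion, the undated rows can be expressed as a share of total sales and as a share of distinct titles. This is a minimal sketch on hypothetical stand-in rows (`sample` is not the real data); on the notebook's frame the same expressions would be applied to `df`.

```python
import pandas as pd

# Hypothetical stand-in rows (illustrative values only)
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "total_sales": [10.0, 5.0, 0.5, None, 4.5],
    "release_date": ["2013-09-17", "2014-11-18", None, None, "2002-10-28"],
})

missing_date = sample["release_date"].isna()

# Share of unit sales carried by undated rows (NaN sales are skipped by sum)
sales_share = sample.loc[missing_date, "total_sales"].sum() / sample["total_sales"].sum() * 100

# Share of distinct titles that have no release date
title_share = sample.loc[missing_date, "title"].nunique() / sample["title"].nunique() * 100

print(f"Sales on undated rows: {sales_share:.1f}% of total")
print(f"Undated titles: {title_share:.1f}% of all titles")
```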

In [6]:
# Drop duplicates
df.drop_duplicates(inplace=True)

# Drop the columns `img` and `last_update`
df = df.drop(['img', 'last_update'], axis=1)

# Fill null values in `developer` with 'Unknown'
# (assign the result back; calling fillna(..., inplace=True) on a selected column raises a FutureWarning)
df['developer'] = df['developer'].fillna('Unknown')

# Fill null values in `critic_score` with the mean of the column
df['critic_score'] = df['critic_score'].fillna(df['critic_score'].mean())

# Fill null values in `total_sales`, `na_sales`, `jp_sales`, `pal_sales` and `other_sales` with 0
sales_columns = ['total_sales', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']
df[sales_columns] = df[sales_columns].fillna(0)

# Convert `release_date` to datetime
df['release_date'] = pd.to_datetime(df['release_date'])
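
Because only 6,678 of the 64,016 rows have a `critic_score`, filling the rest with the column mean gives most rows the same value. One possible refinement, sketched below on hypothetical stand-in rows, is to record which scores were imputed before filling them, so later averages can be restricted to observed scores. The column name `critic_score_imputed` is an assumption made for this illustration and is not part of the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in rows (illustrative values only)
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "critic_score": [9.0, np.nan, 7.0, np.nan],
})

# Record which scores are missing BEFORE filling them
sample["critic_score_imputed"] = sample["critic_score"].isna()

# Assign the result back instead of calling fillna(..., inplace=True) on a column
sample["critic_score"] = sample["critic_score"].fillna(sample["critic_score"].mean())

# Averages over observed scores only are still recoverable via the flag
observed_mean = sample.loc[~sample["critic_score_imputed"], "critic_score"].mean()
print(sample)
print("Mean of observed scores:", observed_mean)
```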

In [7]:
# Display first few rows to understand the structure and columns
df.head()

Unnamed: 0,title,console,genre,publisher,developer,critic_score,total_sales,na_sales,jp_sales,pal_sales,other_sales,release_date
0,Grand Theft Auto V,PS3,Action,Rockstar Games,Rockstar North,9.4,20.32,6.37,0.99,9.85,3.12,2013-09-17
1,Grand Theft Auto V,PS4,Action,Rockstar Games,Rockstar North,9.7,19.39,6.06,0.6,9.71,3.02,2014-11-18
2,Grand Theft Auto: Vice City,PS2,Action,Rockstar Games,Rockstar North,9.6,16.15,8.41,0.47,5.49,1.78,2002-10-28
3,Grand Theft Auto V,X360,Action,Rockstar Games,Rockstar North,7.22044,15.86,9.06,0.06,5.33,1.42,2013-09-17
4,Call of Duty: Black Ops 3,PS4,Shooter,Activision,Treyarch,8.1,15.09,6.18,0.41,6.05,2.44,2015-11-06


### 1.1 Enrich dataset for future analysis

In [8]:
# Get the nine best-selling genres by total sales (nine named genres plus "Other" give ten categories)
top_genres = df.groupby('genre')['total_sales'].sum().nlargest(9).index

# Create a new column where less popular genres are replaced by "Other"
df['genre_top10'] = df['genre'].apply(lambda x: x if x in top_genres else 'Other')


In [9]:
# Get the nine best-selling consoles by total sales (nine named consoles plus "Other" give ten categories)
top_consoles = df.groupby('console')['total_sales'].sum().nlargest(9).index

# Create a new column where less popular consoles are replaced by "Other"
df['console_top10'] = df['console'].apply(lambda x: x if x in top_consoles else 'Other')
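
The bucketing used for `genre_top10` and `console_top10` can also be written without a row-wise `apply`. This is a minimal sketch on hypothetical stand-in rows; `n_top` and `genre_bucket` are illustrative names, and `n_top` is set to 2 only to keep the toy example small.

```python
import pandas as pd

# Hypothetical stand-in rows (illustrative values only)
sample = pd.DataFrame({
    "genre": ["Action", "Shooter", "Action", "Puzzle", "Sports", "Shooter"],
    "total_sales": [20.0, 15.0, 16.0, 0.4, 6.0, 9.0],
})

n_top = 2  # the notebook uses 9, so that 9 named labels + "Other" give 10 buckets

# Labels of the n_top best-selling genres
top_labels = sample.groupby("genre")["total_sales"].sum().nlargest(n_top).index

# Keep labels in the top set, replace everything else with "Other"
sample["genre_bucket"] = sample["genre"].where(sample["genre"].isin(top_labels), "Other")
print(sample["genre_bucket"].value_counts())
```

`Series.where` keeps values where the condition holds and substitutes the second argument elsewhere, which gives the same result as the lambda version while staying vectorized.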

In [10]:
# Define console groups by manufacturer (any console not listed below is labeled 'Other')
def categorize_console(console):
    if console in ['PS', 'PS2', 'PS3', 'PS4', 'PSP']:
        return 'Sony'
    elif console in ['DS', 'Wii']:
        return 'Nintendo'
    elif console in ['X360', 'XB', 'XOne']:
        return 'Microsoft'
    else:
        return 'Other'

# Create the new column 'manufacturer' by applying the categorization function
df['manufacturer'] = df['console'].apply(categorize_console)

# Calculate each row's share (in %) of the total sales of its console and of its manufacturer
df['percent_total_sales'] = df.groupby('console')['total_sales'].transform(lambda x: (x / x.sum() * 100)).round(2)
df['percent_total_sales_manufacturer'] = df.groupby('manufacturer')['total_sales'].transform(lambda x: (x / x.sum() * 100)).round(2)

# Overview
df.head()

Unnamed: 0,title,console,genre,publisher,developer,critic_score,total_sales,na_sales,jp_sales,pal_sales,other_sales,release_date,genre_top10,console_top10,manufacturer,percent_total_sales,percent_total_sales_manufacturer
0,Grand Theft Auto V,PS3,Action,Rockstar Games,Rockstar North,9.4,20.32,6.37,0.99,9.85,3.12,2013-09-17,Action,PS3,Sony,2.42,0.64
1,Grand Theft Auto V,PS4,Action,Rockstar Games,Rockstar North,9.7,19.39,6.06,0.6,9.71,3.02,2014-11-18,Action,PS4,Sony,3.59,0.61
2,Grand Theft Auto: Vice City,PS2,Action,Rockstar Games,Rockstar North,9.6,16.15,8.41,0.47,5.49,1.78,2002-10-28,Action,PS2,Sony,1.57,0.5
3,Grand Theft Auto V,X360,Action,Rockstar Games,Rockstar North,7.22044,15.86,9.06,0.06,5.33,1.42,2013-09-17,Action,X360,Microsoft,1.84,1.17
4,Call of Duty: Black Ops 3,PS4,Shooter,Activision,Treyarch,8.1,15.09,6.18,0.41,6.05,2.44,2015-11-06,Shooter,PS4,Sony,2.79,0.47
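
The `percent_total_sales` column is each row's sales divided by the total for its console. A quick way to sanity-check such a column is to confirm that the shares add up to roughly 100 within every group. This is a minimal sketch on hypothetical stand-in rows, not on the real data.

```python
import pandas as pd

# Hypothetical stand-in rows (illustrative values only)
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "console": ["PS3", "PS3", "X360", "X360"],
    "total_sales": [30.0, 10.0, 5.0, 15.0],
})

# Each row's sales divided by the total for its console, as a percentage
group_total = sample.groupby("console")["total_sales"].transform("sum")
sample["percent_total_sales"] = (sample["total_sales"] / group_total * 100).round(2)

# Within every console the shares should add up to roughly 100
check = sample.groupby("console")["percent_total_sales"].sum()
print(sample)
print(check)
```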


## 2. Identifying top-selling titles worldwide and their key factors

### 2.1 Top 10 selling titles worldwide

In [11]:
# Sort the DataFrame by `total_sales` in descending order
top_selling_titles = df.sort_values(by="total_sales", ascending=False)

# Select specific columns
top_selling_titles = top_selling_titles[['title', 'total_sales', 'genre', 'publisher', 'developer', 'critic_score', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']]

# Group by `title` and `genre`, summing the sales columns and averaging `critic_score` across consoles
top_selling_titles_grouped = top_selling_titles.groupby(['title', 'genre']).agg({
    'total_sales': 'sum',
    'na_sales': 'sum',
    'jp_sales': 'sum',
    'pal_sales': 'sum',
    'other_sales': 'sum',
    'critic_score': 'mean'}
).reset_index()
# Sort the grouped DataFrame by `total_sales` in descending order
top_selling_titles_grouped = top_selling_titles_grouped.sort_values(by="total_sales", ascending=False)

# Reset the index
top_10_selling_titles = top_selling_titles_grouped.reset_index(drop=True)

# Select the first 10 rows
top_10_selling_titles = top_10_selling_titles.head(10).sort_values('total_sales', ascending=False)
top_10_selling_titles

Unnamed: 0,title,genre,total_sales,na_sales,jp_sales,pal_sales,other_sales,critic_score
0,Grand Theft Auto V,Action,64.29,26.19,1.66,28.14,8.32,8.508176
1,Call of Duty: Black Ops,Shooter,30.99,17.65,0.59,9.45,3.31,8.2875
2,Call of Duty: Modern Warfare 3,Shooter,30.71,15.57,0.62,11.26,3.26,7.444088
3,Call of Duty: Black Ops II,Shooter,29.59,14.12,0.72,11.08,3.67,8.144088
4,Call of Duty: Ghosts,Shooter,28.8,15.06,0.49,9.6,3.65,7.745777
5,Call of Duty: Black Ops 3,Shooter,26.72,12.82,0.5,9.76,3.63,7.396352
6,Call of Duty: Modern Warfare 2,Shooter,25.02,13.54,0.46,8.08,2.95,8.276887
7,Grand Theft Auto IV,Action,22.53,11.6,0.58,7.64,2.72,9.10511
8,Minecraft,Misc,22.12,8.38,1.98,8.92,2.84,7.492264
9,Call of Duty: Advanced Warfare,Shooter,21.78,10.66,0.35,7.99,2.81,8.72


In [12]:
# Create a bar plot using plotly (genre is categorical, so only a discrete color sequence applies)
fig = px.bar(top_10_selling_titles, x='title', y='total_sales', title='Top 10 Selling Titles Worldwide',
             labels={'total_sales': 'Total sales (in millions)', 'genre': 'Genre', 'title': 'Title'},
             color='genre',
             barmode='stack',
             color_discrete_sequence=px.colors.qualitative.Set1)

# Set the figure size, order the bars by total sales, and show the chart
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

In [13]:
# Create a scatter plot using Plotly
fig = px.scatter(
    top_10_selling_titles,
    x='critic_score',
    y='total_sales',
    labels={'total_sales': 'Total sales (in millions)', 'genre': 'Genre', 'critic_score': 'Average Critic Score'},
    size='total_sales',
    color='genre',
    hover_name='title',
    text='title',
    title='Critic Score vs. Total Sales for Top 10 Selling Titles',
)

# Make the markers semi-transparent with a thin outline, then set the figure size
fig.update_traces(marker=dict(opacity=0.8, line=dict(width=1, color='DarkSlateGrey')))
fig.update_layout(width=800, height=650)
fig.show()

1. Overview of Top-Selling Video Games

- Total Sales Dominance: All ten entries belong to just three franchises (Grand Theft Auto, Call of Duty, and Minecraft).

Genre Breakdown:

- Shooter Games: Seven out of ten titles are shooter games, specifically from the Call of Duty series, highlighting this genre's popularity and commercial success.

- Action Games: The Grand Theft Auto series, classified as action, takes two spots, including the top position with Grand Theft Auto V, which sold 64.29 million copies.

- Miscellaneous: Minecraft, classified under “Miscellaneous,” represents a unique, genre-defining title with high sales.

2. Sales Insights by Genre

- Action and Shooter Games Dominate: Grand Theft Auto V, the highest-selling title, significantly outpaces the other games with over double the sales of the second-highest title. This suggests that open-world action games with expansive content, such as Grand Theft Auto, hold mass appeal.

- Shooter Games as a Key Driver of Sales: With seven entries from the Call of Duty series, shooters clearly have consistent commercial appeal. However, the lack of diversity in the shooter titles implies that most sales are captured by a single franchise rather than a broader range of shooter games.

- Minecraft’s Unique Position: Minecraft stands out as a game in a less common genre (“Miscellaneous”), with mechanics that encourage creativity and exploration. Its high sales highlight the demand for open-ended, sandbox-style games that allow players to create and explore at their own pace.

3. Analysis of Critic Scores and Sales Relationship

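- Average Critic Scores Analysis: The average critic scores range from approximately 7.4 to 9.1, with most games scoring around 8.0. Because missing scores were filled with the column mean (about 7.22) before averaging, titles with unscored console entries are pulled toward that value. The scores suggest that while critical reception may contribute to a game’s initial visibility, other factors likely drive long-term commercial success.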

- Notable Score Trends:
  - High Score, High Sales: Grand Theft Auto IV has the highest critic score (9.1), demonstrating alignment between high critical reception and strong sales performance.
  - Consistency in Call of Duty Scores: Although the Call of Duty games vary in critic scores, most titles scored in the 7.4 to 8.3 range, yet achieved high sales. This suggests that Call of Duty's franchise reputation and loyal player base are substantial sales drivers, even when individual titles receive moderate critical ratings.

4. Franchise Power and Brand Loyalty

- Franchise Strength: The presence of recurring franchises (Grand Theft Auto, Call of Duty) demonstrates the power of established brands in driving sales. This suggests that players may prioritize brand familiarity and gameplay style over individual game innovations or improvements.
- Community and Social Influence: The enduring popularity of franchises like Call of Duty and Minecraft may also be driven by strong player communities and online interactions, which encourage sustained engagement.
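
The relationship between critic score and sales discussed above can be quantified with a Pearson correlation. The sketch below uses the ten rows from the top-10 table in section 2.1 and pandas' built-in `Series.corr`. With only ten titles, and with scores that include mean-imputed values, the result is indicative at best and should not be read as evidence of a causal link.

```python
import pandas as pd

# Values copied from the top-10 table in section 2.1 (same row order)
top10 = pd.DataFrame({
    "total_sales": [64.29, 30.99, 30.71, 29.59, 28.8, 26.72, 25.02, 22.53, 22.12, 21.78],
    "critic_score": [8.508176, 8.2875, 7.444088, 8.144088, 7.745777,
                     7.396352, 8.276887, 9.10511, 7.492264, 8.72],
})

# Pearson correlation between average critic score and total sales
r_all = top10["critic_score"].corr(top10["total_sales"])

# Same measure without the first row (Grand Theft Auto V), an outlier in sales
rest = top10.iloc[1:]
r_without_gtav = rest["critic_score"].corr(rest["total_sales"])

print(f"Pearson r, all ten titles: {r_all:.2f}")
print(f"Pearson r, excluding GTA V: {r_without_gtav:.2f}")
```

Comparing the two coefficients shows how much a single outlier influences the result in a sample this small.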

### 2.2 Total Sales vs Genre

In [14]:
# Create a scatter plot using Plotly
fig = px.scatter(
    df[df['total_sales']>0.6],
    x='genre',
    y='total_sales',
    labels={'total_sales': 'Total sales (in millions)', 'genre': 'Genre'},
    size='total_sales',
    color='genre',
    hover_name='title',
    title='Total Sales vs. Genre',
)

# Make the markers semi-transparent with a thin outline, then set the figure size
fig.update_traces(marker=dict(opacity=0.8, line=dict(width=1, color='DarkSlateGrey')))
fig.update_layout(width=800, height=650)
fig.show()

1. Overview of Genre Sales Dominance

- The top-selling genres reveal a variety of popular gameplay experiences, ranging from fast-paced action and shooting games to role-playing, sports, and simulation. This diversity shows that different player demographics have distinct preferences, with each genre catering to a unique audience.

2. Genre Insights and Franchise Power

- Action: Dominated by Grand Theft Auto (GTA) titles, the action genre showcases the immense popularity of open-world experiences that allow for freedom, exploration, and complex narratives. The success of GTA also highlights the appeal of realistic urban settings and crime-driven storylines.

- Shooter: Primarily led by Call of Duty games, the shooter genre is highly competitive, with intense gameplay and online multiplayer features driving repeat engagement. The franchise's annual releases and updates sustain player interest and loyalty, solidifying its top position within the genre.

- Action-Adventure: Titles like Red Dead Redemption 2, Assassin’s Creed III, and Watch Dogs indicate that action-adventure games with intricate stories, rich historical or immersive environments, and freedom of choice are highly appealing. These titles often bridge the gap between action and role-playing, attracting a broader audience.

- Sports: Consistently popular sports franchises like FIFA, NFL, and NBA provide players with real-life team representation and simulation of popular sports. The annual release cycle with updated rosters and features ensures continued relevance and sales each year.

- Role-Playing: With iconic games like The Elder Scrolls and Fallout 4, role-playing games (RPGs) appeal to players looking for long, immersive story-driven experiences. The genre’s emphasis on character progression and exploration keeps players invested for extended periods, translating into significant sales.

- Simulation: Games like The Sims and Cooking Mama highlight the appeal of life simulation and sandbox-style gameplay that lets players experiment and create. The open-ended structure of these games fosters creativity and engagement, making them enduringly popular.

- Racing: Dominated by franchises like Need for Speed and Gran Turismo, racing games provide adrenaline-pumping, competitive gameplay that appeals to both casual and competitive gamers. These games often showcase high-quality graphics and realistic physics, attracting players interested in simulation and excitement.

- Music: With titles like Guitar Hero and Just Dance, the music genre thrives on interactivity, rhythm-based gameplay, and social appeal. These games have significant replay value due to their song libraries, multiplayer features, and physical engagement, especially for parties or group play.

- Miscellaneous: Minecraft stands alone as a genre-defining title that combines elements of creativity, exploration, and survival. Its unique blend of gameplay mechanics has made it one of the best-selling games worldwide, appealing across all age groups and fostering an immense online community.

- Fighting: Popular franchises like Tekken, Street Fighter, WWE, and Mortal Kombat highlight the enduring appeal of fighting games, especially among competitive gamers. The focus on skill, character variety, and tactical combat keeps players returning for competitive play.

3. Key Success Factors by Genre

- Player Retention: Many of these franchises are built around frequent updates, sequels, or annual releases, which keep player interest alive and drive repeat purchases.
- Multiplayer and Online Features: Genres such as shooters, sports, and fighting games benefit from online multiplayer modes, which enhance the longevity and social aspects of gameplay.
- Franchise Loyalty: Long-running franchises such as GTA, Call of Duty, FIFA, and The Sims have developed substantial player loyalty, with each new installment attracting returning players.
- Engagement Through Creativity: Minecraft and simulation games like The Sims allow players to create their own experiences, providing high replay value and personalization.
- Physical Interaction: Music games, particularly those that incorporate rhythm-based, physical interaction (e.g., Just Dance, Guitar Hero), attract a social audience and benefit from interactive gameplay.

4. Market Trends and Consumer Behavior

- Rise of Open-World and Sandbox Genres: Games that offer open-world environments, like GTA and Minecraft, appeal to players who enjoy exploration, creativity, and freedom of choice, suggesting a broader trend toward non-linear gameplay.

- Preference for Realism and Simulation: Racing, sports, and life simulation games are successful due to their focus on realism and simulation, as players are drawn to experiences that replicate real-world activities.

- Replayability and Content Updates: Games that frequently release new content (like Call of Duty or FIFA) encourage replayability, which boosts long-term sales.
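
The genre observations above are read off a scatter plot. A small aggregate table can back them with numbers. This is a minimal sketch on hypothetical stand-in rows; `genre_summary` and its column names are illustrative and not part of the notebook.

```python
import pandas as pd

# Hypothetical stand-in rows (illustrative values only)
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "genre": ["Action", "Action", "Shooter", "Sports", "Shooter"],
    "total_sales": [20.0, 16.0, 15.0, 6.0, 9.0],
})

# Total sales, number of entries, and best single entry per genre
genre_summary = (
    sample.groupby("genre")["total_sales"]
    .agg(total="sum", titles="count", best_entry="max")
    .sort_values("total", ascending=False)
)

# Share of the overall total held by each genre
genre_summary["share_pct"] = (genre_summary["total"] / genre_summary["total"].sum() * 100).round(1)
print(genre_summary)
```

Comparing `total` with `best_entry` helps separate genres carried by one blockbuster from genres with broad sales across many titles.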

### 2.3 Total Sales vs Publisher

In [15]:
# Create a scatter plot using Plotly
fig = px.scatter(
    df[df['total_sales']>5],
    x='publisher',
    y='total_sales',
    labels={'total_sales': 'Total sales (in millions)', 'publisher': 'Publisher'},
    size='total_sales',
    color='publisher',
    hover_name='title',
    title='Publisher vs. Total Sales',
)

# Make the markers semi-transparent with a thin outline, then set the figure size
fig.update_traces(marker=dict(opacity=0.8, line=dict(width=1, color='DarkSlateGrey')))
fig.update_layout(width=800, height=650)
fig.show()

1. Dominance of Major Publishers

- Market Share Concentration: The top 10 publishers dominate the market, highlighting a concentration of resources, intellectual property, and market influence among a few major players. These companies benefit from strong brand recognition, marketing budgets, and established distribution networks, allowing them to reach large audiences effectively.

- Investment in Franchise Development: Each of these publishers has built and sustained high-grossing franchises (e.g., Grand Theft Auto by Rockstar Games, Call of Duty by Activision), which drives consistent revenue. Long-term investment in popular franchises has provided these publishers with reliable revenue streams, reducing dependency on individual titles.

2. Publisher Analysis and Key Titles

- Rockstar Games: Known for Grand Theft Auto and Red Dead Redemption, Rockstar Games has created massive open-world experiences with high production values, complex narratives, and a focus on realism. These games appeal to a broad audience, and the publisher’s approach to quality over quantity ensures strong sales.
- Activision: As the publisher behind the Call of Duty franchise, Activision has capitalized on the shooter genre’s popularity and the franchise's loyal fan base. Call of Duty is known for its annual releases, online multiplayer modes, and competitive esports presence, keeping players engaged.
- EA Sports and Electronic Arts: EA Sports’ titles like FIFA and Madden NFL lead in sports gaming. The annual release of these games, with updated rosters and new features, ensures consistent demand. Electronic Arts’ focus on well-known franchises and multiplayer modes keeps players invested in its ecosystem.
- Microsoft Game Studios and Microsoft Studios: Microsoft’s two publishing entities produce high-quality games, including the Halo franchise and Forza Motorsport. Their strength in online multiplayer and platform integration with Xbox consoles contributes to strong sales.
- Bethesda Softworks: Known for RPGs like The Elder Scrolls and Fallout, Bethesda’s games are distinguished by vast open-world experiences and player-driven narratives, attracting RPG enthusiasts and promoting lengthy engagement with their titles.
- Ubisoft: With franchises like Assassin’s Creed, Ubisoft has a stronghold in the action-adventure genre, known for combining historical themes with expansive gameplay. Ubisoft’s investment in multiplayer modes and open-world design aligns with player preferences for immersive, repeatable experiences.
- Sony Computer Entertainment: As the publisher for many PlayStation exclusives, Sony has been instrumental in releasing high-quality, story-driven games like The Last of Us and God of War. Their exclusivity model encourages PlayStation sales and player loyalty.
- GT Interactive: This publisher, though less prominent today, was an early player in the video game industry, responsible for distributing iconic titles that established foundational market trends.

3. Revenue Drivers for Top Publishers

- Franchise Longevity: Many of these publishers’ top titles have established themselves as long-lasting franchises with loyal followings. For instance, Grand Theft Auto and Call of Duty enjoy recurring sales through sequels and spin-offs, ensuring sustained revenue.

- Multiplayer and Online Integration: Publishers like Activision, EA Sports, and Microsoft have capitalized on online multiplayer features, driving player engagement through social and competitive elements. These features often lead to additional revenue through in-game purchases and subscription services.

- High-Quality Game Development: Publishers like Rockstar Games and Sony Computer Entertainment focus on high-quality, narrative-rich games that receive critical acclaim, often resulting in longer shelf lives and higher lifetime sales. These games command a premium price due to their quality and immersive experience.

- Annualized Release Models: Publishers like EA Sports and Activision use annual release schedules, particularly for sports and shooter games. This strategy ensures a consistent revenue flow and allows the publishers to capitalize on recurring events, such as sports seasons.

4. Implications of Platform and Exclusive Content

- Platform Synergy: Microsoft and Sony have used game publishing to boost their console ecosystems, with exclusive titles for Xbox and PlayStation respectively. This strategy enhances brand loyalty and attracts players to their platforms, ensuring sustained hardware and software sales.

- Cross-Platform Publishing: Publishers like EA and Ubisoft distribute titles across multiple platforms, which broadens their audience reach and maximizes potential sales. This approach is advantageous for games that rely on large player bases, such as sports games and online shooters.
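
The publisher list above contains related labels that the dataset stores separately (EA Sports and Electronic Arts, Microsoft Game Studios and Microsoft Studios). If those should count as one publisher, the labels can be merged before aggregating. This is a minimal sketch on hypothetical stand-in rows; the alias map is an assumption for illustration, and which labels to merge is an analyst's choice.

```python
import pandas as pd

# Hypothetical stand-in rows (illustrative values only)
sample = pd.DataFrame({
    "publisher": ["EA Sports", "Electronic Arts", "Microsoft Game Studios",
                  "Microsoft Studios", "Activision"],
    "total_sales": [8.0, 6.0, 7.0, 5.0, 15.0],
})

# Assumed alias map: labels on the left are folded into the name on the right
publisher_aliases = {
    "EA Sports": "Electronic Arts",
    "Microsoft Game Studios": "Microsoft",
    "Microsoft Studios": "Microsoft",
}

# Replace known aliases, keep every other publisher label unchanged
sample["publisher_group"] = sample["publisher"].replace(publisher_aliases)

publisher_totals = (
    sample.groupby("publisher_group")["total_sales"].sum().sort_values(ascending=False)
)
print(publisher_totals)
```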

### 2.4 Total Sales vs Console

In [16]:
# Create a scatter plot using Plotly
fig = px.scatter(
    df[df['total_sales']>0.8],
    x='console',
    y='total_sales',
    labels={'total_sales': 'Total sales (in millions)', 'console': 'Console'},
    size='total_sales',
    color='console',
    hover_name='title',
    title='Console vs. Total Sales',
)

# Make the markers semi-transparent with a thin outline, then set the figure size
fig.update_traces(marker=dict(opacity=0.8, line=dict(width=1, color='DarkSlateGrey')))
fig.update_layout(width=800, height=650)
fig.show()

1. Market Dominance and Brand Loyalty

- Sony’s Strong Positioning: With five consoles (PS, PS2, PS3, PS4, and PSP) in the top 10, Sony clearly leads the console market. Each generation has built upon the success of its predecessor, fostering strong brand loyalty and attracting users who seek quality, exclusive titles, and reliability.
- Microsoft’s Presence with Xbox: The Xbox 360 and Xbox One (X360, XOne) have cemented Microsoft as a major player in the console space, known for their online gaming experience via Xbox Live and popular franchises like Halo and Gears of War.
- Nintendo’s Unique Appeal: The presence of the Wii and DS showcases Nintendo’s ability to attract diverse audiences with innovative gameplay mechanics (e.g., motion control for Wii, dual screens for DS). Nintendo’s consoles appeal to both casual and hardcore gamers, with games that are family-friendly and accessible.

2. Analysis of Top Consoles and Success Factors

- PlayStation 3 (PS3): Known for popular exclusives like The Last of Us, Uncharted, and God of War, PS3 maintained Sony’s legacy of providing high-quality, immersive games. Its Blu-ray functionality also made it a popular multimedia device, contributing to its high sales.

- PlayStation 4 (PS4): Building on the PS3’s success, the PS4 featured enhanced graphics, VR compatibility, and a focus on social gaming. Exclusive titles and strong third-party support helped it secure a large market share.
- PlayStation 2 (PS2): The PS2, still one of the best-selling consoles, benefited from an extensive game library and backward compatibility with PS1 games. Its affordability, versatility as a DVD player, and broad game catalog appealed to a wide audience.
- Xbox 360 (X360): Microsoft’s Xbox 360 is notable for its online capabilities and high-quality first-person shooters. Xbox Live offered a robust online multiplayer experience, making it popular for social and competitive gaming.
- Xbox One (XOne): Although the Xbox One had a slower start than the Xbox 360, its integration with Windows gaming, backward compatibility, and steady content updates through Xbox Game Pass improved its reception over time.
- PC: The versatility of the PC as a gaming platform has kept it popular. PC gaming allows for superior graphics and a wide range of control options, and it hosts an extensive library of games through platforms like Steam. The PC also appeals to a diverse audience, including enthusiasts who prioritize customization and performance.
- PlayStation Portable (PSP): Sony’s PSP was among the first portable consoles to deliver a near-console-quality experience on the go. Popular in regions like Japan, it allowed gamers to play PS-quality games on a portable device, appealing to travelers and younger audiences.
- Nintendo Wii: With its focus on motion-based controls and family-friendly games like Wii Sports and Mario Kart Wii, the Wii attracted a large, diverse audience, including non-gamers. Its appeal to both families and casual gamers made it a major commercial success.
- PlayStation (PS): The original PlayStation introduced many to gaming, offering rich, immersive experiences with titles like Final Fantasy and Gran Turismo. Sony’s entry into gaming set the stage for subsequent console success, establishing a strong base of loyal customers.
- Nintendo DS (DS): The DS brought innovation with its dual screens and touch-screen capabilities, popularizing handheld gaming. Games like Pokémon and Mario Kart DS appealed to both casual and dedicated players, making it a massive global success.

3. Trends in Console Features and Consumer Preferences

- Online and Social Gaming: Microsoft and Sony consoles’ focus on online gaming and community integration, such as Xbox Live and PlayStation Network, catered to the demand for social and competitive experiences, setting new standards for future console generations.

- Exclusive Titles and Content: The success of consoles like PS3, PS4, and Xbox 360 can be attributed to exclusive titles that drive console loyalty. Franchise titles often remain exclusive to one console, incentivizing players to buy specific systems.

- Versatility and Multimedia Capabilities: Consoles like the PS2 and PS3 saw success partly due to their roles as DVD and Blu-ray players, respectively, attracting users who sought multifunctional devices.

4. Portable vs. Console Gaming Trends

- The inclusion of the PSP and DS in the top 10 highlights the popularity of portable gaming, especially for users seeking flexibility and on-the-go gaming options. Portable consoles continue to attract a significant market segment, particularly with family-oriented and single-player games.

5. Innovative Control Mechanisms

- Nintendo’s Motion and Touch Control: The Wii and DS brought innovation in control mechanics, with motion controls for the Wii and touch capabilities for the DS, making gaming accessible to new audiences. The Wii’s motion controls appealed to families and non-traditional gamers, boosting its popularity as a console for group entertainment.

6. Business Insights and Strategic Implications

- Investment in Exclusive Content and Backward Compatibility: As seen with the success of PlayStation and Xbox consoles, investing in exclusive titles and offering backward compatibility can strengthen a console’s market position, ensuring a loyal customer base across generations.

- Leveraging Cross-Platform Opportunities: PC gaming’s strength in flexibility and customization suggests a growing opportunity for cross-platform games. Publishers can reach a wider audience by releasing titles for both consoles and PCs.

- Expansion into Portable and Hybrid Markets: Portable consoles like the PSP and DS indicate a steady demand for gaming that accommodates travel and portability. Hybrid systems like the Nintendo Switch further this trend, showing potential for innovation in form factor and user experience.

### 2.5 Hypothesis testing

In [17]:
# Subset used for the normality check: the 4,000 best-selling rows
data_for_testing = df.sort_values(by='total_sales', ascending=False).head(4000)

# Hypothesis testing
# H0: there is no association between total_sales and critic_score

# Check normality of total_sales and critic_score (Shapiro-Wilk)
total_sales_normality = stats.shapiro(data_for_testing['total_sales'])
critic_score_normality = stats.shapiro(data_for_testing['critic_score'])

# Choose the test based on normality
# Note: the correlation itself is computed on the full DataFrame (df), not on the subset
if total_sales_normality.pvalue > 0.05 and critic_score_normality.pvalue > 0.05:
    # Both are normally distributed, use Pearson correlation
    correlation_test = stats.pearsonr(df['total_sales'], df['critic_score'])
    test_type = "Pearson"
else:
    # Use Spearman correlation if the normality assumption is violated
    correlation_test = stats.spearmanr(df['total_sales'], df['critic_score'])
    test_type = "Spearman"

# Display the result (both result objects unpack to coefficient and p-value)
correlation_coefficient, p_value = correlation_test

print(f"{test_type} correlation coefficient: {correlation_coefficient}")
print(f"P-value: {p_value}")

if p_value < 0.05:
    print("Reject the null hypothesis: There is a statistically significant relationship between total_sales and critic_score.")
else:
    print("Fail to reject the null hypothesis: There is no statistically significant relationship between total_sales and critic_score.")


Spearman correlation coefficient: 0.015835460676173082
P-value: 6.155642313542332e-05
Reject the null hypothesis: There is a statistically significant relationship between total_sales and critic_score.


- Analysis Summary:
- Spearman Correlation Coefficient: 0.0158
- P-value: 6.16e-05

- Interpretation: The p-value is well below a typical significance level of 0.05, allowing us to reject the null hypothesis. This indicates a statistically significant relationship between total_sales and critic_score. Note that the correlation is computed on the full DataFrame (df), not only on the 4,000-title subset, which is used solely for the normality check.

- Detailed Interpretation:

1. Significance of the Result:

- The p-value (6.16e-05) means that, if there were no association between total_sales and critic_score, a correlation at least this strong would very rarely arise by chance. Statistically, this points to some degree of relationship between the two variables.

- With a sample this large, however, even a tiny correlation can reach statistical significance, so statistical significance does not imply practical significance.

2. Strength and Direction of the Correlation:

- The Spearman correlation coefficient of 0.0158 is positive but very close to zero, which means that, while statistically significant, the relationship between total_sales and critic_score is extremely weak.

- This low correlation implies that changes in critic_score are not strongly associated with changes in total_sales. In practical terms, critic_score shows little to no association with total_sales in this dataset.

3. Possible Interpretations of Weak Correlation:

- Independent Factors: The weak correlation could suggest that other variables (like marketing budget, game genre, or platform) might play a more prominent role in determining a game's sales, rather than its critic score alone.

- Market Characteristics: Video game sales may be influenced by a complex mix of factors beyond reviews, such as brand loyalty, the popularity of the gaming console, seasonal releases, or multiplayer features.

- Audience Segmentation: It’s also possible that while critic scores might appeal to a certain segment (such as dedicated gamers or critics), the broader market may not be as sensitive to critic reviews when purchasing games.

4. Limitations of the Analysis:

- Correlation, Not Causation: While a correlation is identified, this does not imply that critic scores cause sales changes. Other variables could be influencing both critic scores and sales, confounding the relationship.
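
To see why such a small coefficient still clears the 0.05 threshold, the sketch below recomputes a two-sided p-value from the coefficient and the sample size using the Student's t approximation commonly applied to Spearman's rho. The helper name `spearman_p_value` is hypothetical, and both sample sizes are assumptions: 64,016 is the row count noted when the dataset was loaded, and 4,000 is the size of the normality-check subset.

```python
import math
from scipy import stats

def spearman_p_value(rho: float, n: int) -> float:
    """Two-sided p-value for a correlation rho over n pairs (t approximation)."""
    dof = n - 2
    t_stat = rho * math.sqrt(dof / ((1.0 + rho) * (1.0 - rho)))
    return float(2.0 * stats.t.sf(abs(t_stat), dof))

rho = 0.015835460676173082               # coefficient reported above
p_full = spearman_p_value(rho, 64016)    # assumption: every row of df is used
p_subset = spearman_p_value(rho, 4000)   # assumption: only the 4,000-row subset

print(f"n=64016: p = {p_full:.3g}")
print(f"n=4000:  p = {p_subset:.3g}")
print(f"rho squared = {rho ** 2:.6f}")
```

Under these assumptions the same coefficient falls below 0.05 only at the larger sample size, which suggests the small p-value is driven mainly by the number of observations rather than by the strength of the association.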

### 2.6 Conclusions

The top 10 best-selling video games worldwide have achieved success due to a combination of factors, including the strength of their franchises, alignment with popular genres, support from major publishers, and critical acclaim. Across the dataset as a whole, however, the hypothesis test found a statistically significant but extremely weak correlation between critic scores and total sales (Spearman coefficient of about 0.016), so critic scores on their own explain very little of the variation in sales. The correlation also does not imply causation: a higher critic score does not necessarily cause increased sales, and both may be influenced by broader market dynamics such as game popularity, franchise strength, and consumer demand. The success of these games is also closely tied to their platform availability, engaging multiplayer features, and the long-term engagement strategies employed by their publishers. Together, these factors help explain why these games continue to dominate the global market.

## 3. Examining trends over time

### 3.1 Total Sales over time

In [18]:
# Convert 'release_date' to datetime format and extract the year
df['release_year'] = pd.to_datetime(df['release_date'], errors='coerce').dt.year

# Group by year and calculate total sales per year
sales_by_year = df.groupby('release_year').agg({'total_sales': 'sum'}).reset_index()

# Plot sales trend over time
fig = px.line(sales_by_year,
              x='release_year',
              y='total_sales',
              title='Total sales per year',
              labels={'release_year': 'Year', 'total_sales': 'Total sales (in millions)'},
              line_shape='linear',
              markers = True
              )

# Plot the line chart
fig.update_layout(width=800, height=650)
fig.show()

- Interpretation of Video Game Annual Total Sales Dataset

The dataset provides annual total sales figures for video games from 1971 to 2020. Although there is a noticeable downtrend in sales in recent years, a deeper examination reveals several key insights and considerations.

- Early Years (1970s–1980s): The early years of the dataset show relatively low sales, with a sharp increase starting in the early 1980s. The market experienced a growth spurt around 1982, with total sales reaching 28.99 million units, and sales kept climbing in the following decade, reaching 101.36 million units in 1996. This growth can be attributed to the emergence of more advanced gaming systems, a rise in consumer interest, and expanding technological capabilities.

- Growth Phase (1990s–2000s): The period from the 1990s to the mid-2000s saw rapid growth in total sales, which reached 169.61 million units in 1999 and 171.12 million units in 2000. The gaming industry matured with the launch of key consoles and blockbuster game franchises. Sales reached their highest point in 2008 with 538.11 million units, driven by the global adoption of consoles like the PlayStation 3, Xbox 360, and the Wii.

- Peak and Decline (2010–2020): After peaking in 2008, sales began a gradual decline, with a more marked drop from 2015 onward. By 2020, total sales were down to just 3.45 million units. Several factors could explain this trend, including market saturation, the rise of mobile gaming, digital downloads reducing physical game sales, and the shift to online and streaming platforms. The drop in sales post-2015 is particularly sharp, with 2019 showing only 2.55 million units in total sales.

- Missing Data Considerations: It’s important to note that the dataset contains several gaps in the early years (1971–1979), which may impact the accuracy of year-to-year comparisons. Missing data could distort our view of sales patterns, especially in the 1970s when the video game market was still emerging and data collection might have been inconsistent. These gaps could underestimate the early sales volume, leading to an inaccurate representation of the industry's initial growth.

- Overall Trend: Despite the fluctuations and missing data points, the general trend in the dataset suggests a peak in sales between 2008 and 2010, followed by a decline that starts gradually and steepens from 2015 onward. This trend could reflect the transition in the gaming market from physical to digital, with newer distribution channels like online stores and digital downloads potentially reducing the need for physical game sales. Additionally, increased competition from mobile gaming and other entertainment options could have contributed to the decline in sales from 2015 onward.
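
The peak year and the size of the later drop can be read directly from the aggregated frame. The sketch below is a minimal illustration that rebuilds a small table in the shape of `sales_by_year` from the yearly figures quoted above only (the real frame has one row per year) and expresses each year as a share of the peak. The names `sales_by_year_sample` and `pct_of_peak` are hypothetical.

```python
import pandas as pd

# Yearly totals quoted in the interpretation above (millions of units)
sales_by_year_sample = pd.DataFrame({
    'release_year': [1982, 1996, 1999, 2000, 2008, 2019, 2020],
    'total_sales': [28.99, 101.36, 169.61, 171.12, 538.11, 2.55, 3.45],
})

# Locate the row with the highest total sales
peak_row = sales_by_year_sample.loc[sales_by_year_sample['total_sales'].idxmax()]
peak_year = int(peak_row['release_year'])

# Express every year as a percentage of the peak year's sales
sales_by_year_sample['pct_of_peak'] = (
    sales_by_year_sample['total_sales'] / peak_row['total_sales'] * 100
).round(2)

print(f"Peak year: {peak_year}")
print(sales_by_year_sample)
```

Years that sit at a tiny fraction of the peak are worth checking for data coverage before they are read as a real market change.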

### 3.2 Titles released over time

In [19]:
# group by the count of different titles released per year
games_per_year = df.groupby('release_year').agg({
    'title': 'nunique'  # Distinct count of titles per year
}).reset_index()

# Rename the column for clarity
games_per_year.rename(columns={'title': 'distinct_game_count'}, inplace=True)

# Create the line chart
fig = px.line(
    games_per_year,
    x='release_year',
    y='distinct_game_count',
    title='Amount of titles released per year',
    labels={'release_year': 'Release Year', 'distinct_game_count': 'Count of unique titles'},
    markers=True
)

#Plot the line chart
fig.update_layout(width=800, height=650)
fig.show()

- This line chart shows the count of distinct video game releases per year, spanning from 1971 to 2024. Here's a concise analysis of the key trends:

- Early Years (1970s–1980s): The early years show a slow but steady increase in the number of video game releases. In the 1970s, the industry was still in its nascent stages, with only a few titles being released each year. The numbers started to rise significantly in the early 1980s, particularly after 1982, with a large jump in 1983 (260 games), indicating the growth of the gaming market.

- Rapid Growth (1990s–2000s): From the 1990s through the early 2000s, the industry saw explosive growth in the number of new game releases. The number of releases soared from around 700 in 1991 to over 1,000 by the mid-1990s, reaching a peak in 2009 with 3,520 distinct releases. This period corresponds with the rapid growth of the gaming industry, the expansion of platforms, the introduction of key consoles such as the PlayStation and Xbox, and the growth of PC gaming. The surge in game development reflects the increasing global popularity of video games.

- Stabilization and Decline (2010s–2020s): Starting around 2010, the number of releases began to stabilize but remained relatively high compared to earlier years. By 2015, the industry saw a slight dip in the number of new releases (984 games), with a more substantial decline starting in 2018. In 2020, there was a slight resurgence with 726 releases, potentially influenced by the COVID-19 pandemic and the increase in home entertainment consumption. However, this was followed by a continued decline, reaching only 8 releases in 2024.

- Missing Data Considerations: It's important to acknowledge that the dataset might be missing data in the early years (e.g., 1971–1979) and in recent years (e.g., 2024), which could distort the interpretation of trends. These gaps could mean that the numbers underrepresent the number of games released during those periods, particularly in the early years when the gaming industry was still evolving and data collection might not have been as robust.

- Overall Trend: The general trend shows explosive growth in the late 1990s and early 2000s, followed by a peak in 2009. Since then, the number of releases has gradually declined. The decreasing number of new releases could reflect several factors, including market saturation, the shift to digital games and downloadable content (DLC), increased reliance on established franchises, and a consolidation in the industry as fewer, larger game developers dominate the market.
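
One way to check the coverage concern raised above is to list the calendar years that have no rows at all. This is a minimal sketch that uses a small made-up series in place of `df['release_year']`; the helper name `missing_years` is hypothetical.

```python
import pandas as pd

def missing_years(years: pd.Series) -> list[int]:
    """Return calendar years absent between the first and last observed year."""
    observed = set(years.dropna().astype(int))
    if not observed:
        return []
    full_range = range(min(observed), max(observed) + 1)
    return [year for year in full_range if year not in observed]

# Illustrative values only, not taken from the dataset
sample_years = pd.Series([1971, 1973, 1975, 1977, 1978, 1979, 1980, None])
print(missing_years(sample_years))
```

A year that does appear but with very few rows would not be flagged by this check, so it complements the per-year counts rather than replacing them.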

### 3.3 Regional Sales over time

In [20]:
# Aggregate sales by year and region
sales_by_year_region = df.groupby(['release_year'])[['na_sales', 'jp_sales', 'pal_sales', 'other_sales']].sum().reset_index()

# Create a line chart for regional sales over time
fig = px.line(sales_by_year_region,
              x='release_year',
              y=['na_sales', 'jp_sales', 'pal_sales', 'other_sales'],
              title='Regional Sales Over Time',
              labels={'release_year': 'Year', 'value': 'Sales in Millions'},
              markers=True)

fig.update_layout(width=800, height=650)
fig.show()

- Interpretation of Regional Sales Data

- The line chart displaying regional video game sales reveals several important trends and insights:

- North America Dominates: The majority of video game sales consistently came from North America throughout the years. This is consistent with the fact that the U.S. and Canada have been major markets for the gaming industry, driven by high consumer demand, the presence of major gaming companies, and the widespread adoption of consoles and PC gaming. The North American market likely played a pivotal role in shaping the overall industry, contributing to a large portion of total global sales.

- Europe and Australia (PAL Region): Sales in Europe and Australia (PAL region) have consistently been the second largest contributor to global sales. This region showed steady growth over the years and has maintained a strong position in the gaming market. The popularity of major consoles and games in these regions reflects the broad adoption of gaming across Europe and Australia.

- Rest of the World: The "Rest of the World" category has seen moderate sales, trailing North America and Europe/Australia. While this region likely includes markets such as South America and parts of Asia, it has not reached the same level of sales as the primary regions. However, this trend may reflect emerging markets where gaming adoption was slower in the earlier years but has been growing in recent times.

- Japan's Unique Trend: Japan, which was historically a dominant force in the gaming industry, shows a distinct pattern. During the mid-90s, Japan's sales surpassed those of the "Rest of the World" category, reflecting the country's strong gaming culture, homegrown console manufacturers (like Sony and Nintendo), and a significant gaming development industry. However, after the mid-90s, Japanese sales began to decline, indicating a potential shift in consumer behavior, including the growing global appeal of Western gaming companies and the increasing dominance of North American and European markets. This trend could also reflect the rise of mobile gaming in Japan and changing consumer preferences.
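
Because total volume changes so much from year to year, the ranking of regions is easier to compare as a share of each year's sales. This is a minimal sketch, assuming a frame shaped like `sales_by_year_region`; the numbers are invented for illustration and `regional_share` is a hypothetical name.

```python
import pandas as pd

region_cols = ['na_sales', 'jp_sales', 'pal_sales', 'other_sales']

# Invented figures in the same layout as sales_by_year_region
sample = pd.DataFrame({
    'release_year': [1995, 2008],
    'na_sales': [40.0, 250.0],
    'jp_sales': [30.0, 50.0],
    'pal_sales': [20.0, 150.0],
    'other_sales': [10.0, 50.0],
})

# Divide each region by the row total to get percentage shares per year
row_totals = sample[region_cols].sum(axis=1)
regional_share = sample[region_cols].div(row_totals, axis=0).mul(100).round(1)
regional_share.insert(0, 'release_year', sample['release_year'])

print(regional_share)
```

The resulting frame can be plotted with the same `px.line` call as above by passing the share columns as `y`.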

### 3.4 Conclusions

- Sales Trends: Video game sales experienced significant growth from the 1990s through the late 2000s, driven by the rise of major consoles and expanding global markets. Sales peaked in 2008 and have since shown a downward trend, reflecting potential market saturation and industry shifts, such as the rise of digital and mobile gaming.

- Regional Insights: North America has consistently been the largest market for video games, followed by Europe and Australia. Japan, once a dominant player, saw a decline in sales after the mid-90s, possibly due to changing consumer preferences and the global shift towards Western gaming companies and new gaming platforms.

- Release Trends: The number of new game releases grew rapidly through the 1990s and 2000s, peaking in 2009. The number of releases has decreased in recent years, which could be attributed to industry consolidation, the dominance of established franchises, and the growing importance of downloadable content and digital platforms.

- Industry Maturity: The overall trends suggest that the video game industry has matured, with the rapid growth of the late 90s and early 2000s slowing down. The decline in new releases and sales in recent years might reflect evolving market dynamics, including mobile gaming, cloud gaming, and subscription-based services.

## 4. Discovering genre specializations across consoles

### 4.1 Genre specialization of the most popular video game consoles

In [21]:
# Select the 10 consoles with the highest total sales
top_consoles = df.groupby('console')['total_sales'].sum().nlargest(10).index
top_10_selling_consoles = df[df['console'].isin(top_consoles)]

# Group by 'console' and 'genre_top10' and sum each combination's share of console sales
genre_sales = top_10_selling_consoles.groupby(['console', 'genre_top10'])['percent_total_sales'].sum().reset_index()

# Create the stacked bar chart in Plotly
fig = px.bar(
    genre_sales,
    x='console',
    y='percent_total_sales',
    color='genre_top10',
    title="Genre specialization of the most popular video game consoles",
    labels={'percent_total_sales': 'Sales percentage (%)', 'console': 'Console', 'genre_top10': 'Genre'},
    barmode='stack',
    text='percent_total_sales'
)

# Format the bar labels as percentages and plot the chart
fig.update_traces(texttemplate='%{text:.2f}%', textposition='inside')
fig.update_layout(xaxis_title="Console", yaxis_title="Sales Percentage (%)", width=800, height=650)
fig.show()

- Action and Shooter Dominance: Across most consoles, Action and Shooter genres are prominent. For instance, on consoles like PS3, PS4, X360, and XOne, a high percentage of sales comes from Action (ranging from 14.21% to 22.76%) and Shooter (up to 33.57% for XOne) games. This highlights the significant appeal of action-packed, fast-paced gaming experiences on these platforms.

- Sports Genre Across Platforms: The Sports genre consistently appears as one of the most popular genres, with especially high percentages on consoles like PS2 (25.85%), PS3 (18.30%), and PS4 (20.18%). This is indicative of the strong presence of sports gaming franchises, such as FIFA, NBA, and Madden, which have broad fanbases and maintain high engagement on these platforms.

- Genre Variation by Console:

- DS shows a more balanced distribution, with Misc (14.91%) and Action (14.21%) genres also contributing significantly, suggesting a diversity of game types available on this handheld console.

- The Wii, known for its motion control and family-friendly appeal, stands out with a high concentration of Misc games (26.60%), alongside Sports (20.02%) and Action (13.85%). This reflects the Wii's emphasis on casual, social gaming experiences and active participation.

- PS2 and PSP also feature notable shares of Sports and Action games, but their respective genre distributions vary slightly, with PS2 favoring Sports and PSP showing a strong presence of Action and Sports games.

- Consoles with High Specialization: Consoles like XOne and PS4 are highly specialized in the Shooter genre, which accounts for 33.57% of sales on XOne and 26.85% on PS4. These consoles likely benefit from major shooter titles like the Call of Duty and Battlefield franchises, which are highly popular on these platforms.
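
The cell above relies on two columns, `genre_top10` and `percent_total_sales`, that are created outside this section. The sketch below shows one way such columns could be derived. It is an assumption about their meaning (genres outside the top sellers collapsed into "Other", and each row's sales expressed as a percentage of its console's total), not necessarily how the notebook built them. The toy data and the top-2 cutoff are for illustration only.

```python
import pandas as pd

toy = pd.DataFrame({
    'console': ['PS4', 'PS4', 'PS4', 'XOne', 'XOne', 'XOne'],
    'genre': ['Action', 'Shooter', 'Puzzle', 'Shooter', 'Action', 'Puzzle'],
    'total_sales': [50.0, 30.0, 20.0, 60.0, 30.0, 10.0],
})

# Keep the top-N genres by overall sales; collapse the rest into 'Other'
top_n = 2  # the column name in the notebook suggests 10
top_genres = toy.groupby('genre')['total_sales'].sum().nlargest(top_n).index
toy['genre_top10'] = toy['genre'].where(toy['genre'].isin(top_genres), 'Other')

# Each row's sales as a percentage of its console's total sales
console_totals = toy.groupby('console')['total_sales'].transform('sum')
toy['percent_total_sales'] = toy['total_sales'] / console_totals * 100

# Same aggregation as the cell above: one share per console and genre
genre_share = (
    toy.groupby(['console', 'genre_top10'])['percent_total_sales']
    .sum()
    .reset_index()
)
print(genre_share)
```

With this definition the stacked bars for every console add up to 100%, which is what makes the consoles comparable regardless of their absolute sales.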

### 4.2 Genre specialization per Manufacturer

In [22]:
# Sum each genre's share of its manufacturer's total sales
genre_sales = top_10_selling_consoles.groupby(['manufacturer', 'genre'])['percent_total_sales_manufacturer'].sum().reset_index()
genre_sales['percent_total_sales_manufacturer'] = genre_sales['percent_total_sales_manufacturer'].round(2)

# Create the stacked bar chart
fig = px.bar(
    genre_sales,
    x='manufacturer',
    y='percent_total_sales_manufacturer',
    color='genre',
    title="Genre specialization per Manufacturer",
    labels={'percent_total_sales_manufacturer': 'Percent of total Sales per Manufacturer', 'console': 'Console', 'genre': 'Genre', 'manufacturer': 'Manufacturer'},
    barmode='stack',
    text='percent_total_sales_manufacturer'
    )

# Plot the chart
fig.update_layout(width=800, height=650)
fig.show()

1. Microsoft:

- Shooter (26.95%) is the dominant genre, reflecting the success of major franchises like Halo and Gears of War, which are staple titles on Microsoft platforms.

- Action (18.29%) and Sports (17.8%) are also notable, indicating that Microsoft's platforms appeal to a broad audience with a mix of action-packed and sports-oriented experiences, such as Forza Motorsport (racing) and sports games like Madden NFL.

2. Nintendo:

- Misc (20.75%) holds the largest share, which suggests that Nintendo caters to a diverse range of game types, especially family-friendly and casual games.

- Action (13.97%) and Sports (13.42%) also play a significant role in Nintendo’s portfolio, while Simulation (13.1%) highlights the presence of life-simulation games such as Animal Crossing and Nintendogs.

- Nintendo’s strong Misc category emphasizes its broad appeal to non-traditional gaming audiences, focusing on entertainment for all ages.

3. Sony:

- Sports (21.27%) and Action (17.8%) are the leading genres for Sony, which aligns with the popularity on PlayStation of franchises like FIFA and NBA 2K (Sports) and Grand Theft Auto (Action).

- Shooter (15.22%) is also significant, suggesting that while Sony is known for sports and action games, it also caters to the shooter genre, with popular titles such as Killzone and Destiny.
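
The `manufacturer` and `percent_total_sales_manufacturer` columns used in this section are likewise defined elsewhere in the notebook. Below is a minimal sketch of how a manufacturer column could be attached, assuming a plain console-to-manufacturer lookup. The dictionary `console_to_manufacturer` is hypothetical, and since the section does not show how PC is treated, it is left unmapped here.

```python
import pandas as pd

# Hypothetical lookup covering the consoles discussed in this section
console_to_manufacturer = {
    'PS': 'Sony', 'PS2': 'Sony', 'PS3': 'Sony', 'PS4': 'Sony', 'PSP': 'Sony',
    'X360': 'Microsoft', 'XOne': 'Microsoft',
    'Wii': 'Nintendo', 'DS': 'Nintendo',
}

toy = pd.DataFrame({
    'console': ['PS2', 'PS3', 'X360', 'Wii', 'DS', 'PC'],
    'genre': ['Sports', 'Action', 'Shooter', 'Misc', 'Misc', 'Strategy'],
    'total_sales': [30.0, 70.0, 40.0, 25.0, 75.0, 10.0],
})

# Consoles missing from the lookup (PC here) get NaN as manufacturer
toy['manufacturer'] = toy['console'].map(console_to_manufacturer)

# Share of each manufacturer's total sales contributed by each row
manufacturer_totals = toy.groupby('manufacturer')['total_sales'].transform('sum')
toy['percent_total_sales_manufacturer'] = toy['total_sales'] / manufacturer_totals * 100

print(toy[['console', 'manufacturer', 'percent_total_sales_manufacturer']])
```

Whatever mapping is used, it is worth checking for unmapped consoles before plotting, because rows with a missing manufacturer are excluded from a `groupby` by default.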

### 4.3 Total Sales per Manufacturer

In [23]:
genre_sales = top_10_selling_consoles.groupby(['manufacturer', 'genre_top10'])['total_sales'].sum().reset_index()

# Create the bar chart
fig= px.bar(
    genre_sales,
    x='manufacturer',
    y='total_sales',
    color='genre_top10',
    title="Total sales per manufacturer",
    labels={'total_sales': 'Total sales (in millions)', 'manufacturer': 'Manufacturer', 'genre_top10': 'Genre'},
    barmode='stack',
    text='total_sales'
    )

# Plot the chart
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- Sony leads in total sales among the top 10 consoles, driven by the popularity and success of the PlayStation consoles across multiple generations. Its diversified game library and large fanbase contribute to its top position.

- Microsoft ranks second, benefiting from a strong foothold in the shooter and action game genres but still trailing Sony in overall sales. Despite this, Microsoft continues to capture a significant share of the gaming market with its Xbox consoles.

- Nintendo, while innovative and successful, especially with its portable consoles, lags behind both Sony and Microsoft in overall sales within this subset, which includes only the Wii and DS from Nintendo's lineup. Its focus on casual gaming and unique hardware sets it apart but results in lower total sales here compared to the other two manufacturers.

- These insights suggest that Sony’s dominance is rooted in its broad appeal across diverse gaming genres, while Microsoft and Nintendo each appeal to specific gaming demographics.

### 4.4 Conclusions on genre specializations and manufacturers

1. Sony's Genre Specialization:

- Diverse Appeal: Sony consistently dominates in total sales, and its genre specialization reflects a balance of genres to cater to a wide audience. It leads in Sports and Action, with strong representations in Shooter and Role-Playing genres as well. This broad genre presence has contributed to Sony’s success, particularly with exclusive titles like Gran Turismo, God of War, and The Last of Us, which attract a diverse group of gamers.

- Sports and Action Focus: Sony's consoles, particularly the PS2, PS3, and PS4, have a high concentration of sports games (e.g., FIFA, NBA 2K) and action games (e.g., Uncharted, The Last of Us), which align with global gaming trends and contribute to the PlayStation brand's success.

2. Microsoft's Genre Specialization:

- Shooter and Action-Centric: Microsoft’s consoles, especially the Xbox series, are heavily associated with Shooter and Action games, with iconic franchises like Halo and Gears of War. The Shooter genre makes up a significant portion of Microsoft’s portfolio, indicating its appeal to a more action-oriented and competitive gaming audience.

- Sports Games: Sports titles also play a major role in Microsoft's sales, especially with franchises like Madden NFL, alongside racing franchises like Forza Motorsport. However, despite these strong sales, Microsoft lags behind Sony in terms of overall genre diversity, relying more on its core gaming demographic focused on action and shooting games.

3. Nintendo's Genre Specialization:

- Casual and Family-Friendly Games: Nintendo’s largest share is in the Misc category, which often includes family-friendly titles with broad appeal. Its best-known franchises, such as Super Mario, Animal Crossing, and The Legend of Zelda, reinforce that family-friendly positioning. The Misc category also indicates Nintendo's focus on innovation and accessible gaming experiences, with consoles like the Wii and DS offering unique gameplay experiences for all age groups.

- Action and Sports: While Action and Sports genres are also prevalent, they are not as dominant as on Sony or Microsoft consoles. Nintendo’s focus is less on hardcore gaming experiences and more on inclusive, accessible gaming for all, particularly through its handheld consoles like the DS.

4. Console-Specific Trends:

- Handheld Consoles: Nintendo has a stronger presence in the Misc genre, helped by the success of the Nintendo DS, its handheld in the top 10. Handheld consoles appeal to a different segment of gamers, focusing more on casual and on-the-go gaming experiences.

- Home Consoles: Sony and Microsoft dominate in the Action, Shooter, and Sports genres on their home consoles, with Sony having a slight edge in the breadth of genre offerings. Microsoft has a more concentrated focus on shooters, while Sony offers a more diverse selection of genres, contributing to its higher sales figures.

5. Publisher Influence:

- Sony and Microsoft benefit from their strong first-party franchises, which heavily influence the genre specialization of their consoles. Sony’s broad portfolio includes a mix of action, RPG, and sports games, while Microsoft’s shooters and action games cater to a specific gaming demographic.

- Nintendo focuses more on unique gameplay experiences, offering innovative and family-friendly genres that differentiate it from Sony and Microsoft. While Nintendo’s genre specialization is more niche, it is still incredibly successful in its own right.

## 5. Regional analysis

### 5.1 Sales Overview across regions

In [24]:
# Calculate the total sales per region across all games
total_region_sales = df[['na_sales', 'jp_sales', 'pal_sales', 'other_sales']].sum()

# Create a pie chart showing the regional sales share
fig = px.pie(values=total_region_sales, names=total_region_sales.index, title='Regional Sales Share')
fig.update_layout(width=800, height=650)
fig.show()

1. North America’s Dominance: North America stands out as the clear leader in global video game sales, reflecting a highly developed gaming ecosystem with a broad consumer base, established gaming infrastructure, and high disposable incomes for entertainment.

2. Strong European and Australian Contribution: Europe and Australia also play an important role in the global gaming market, with significant sales figures reflecting diverse gaming preferences across different countries.

3. Japan’s Niche but Critical Influence: Japan’s contribution is modest compared to North America and Europe, but it remains influential given the country's deep history and cultural impact on the gaming industry.

4. Growth Potential in Other Regions: Markets outside of NA, PAL, and Japan show growth potential, especially in regions like Latin America and Asia, but are still developing in terms of total sales impact.

These findings suggest that while the North American market drives the majority of sales, other regions like Europe and Japan also remain key players, with "Other regions" showing potential for future growth in the gaming market.

### 5.2 North American Sales

#### 5.2.1 Top 10 Games in North America

In [25]:
# Aggregate regional sales by game title
regional_sales = df[['title', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('title').sum().reset_index()

# Sort by North American sales to get the top 10 games
regional_sales_sorted = regional_sales.sort_values(by='na_sales', ascending=False).head(10)

# Create a bar chart of North American sales
fig = px.bar(regional_sales_sorted,
             x='title',
             y=['na_sales'],
             title='Top 10 Games in North America',
             labels={'title': 'Game Name', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_layout(width=800, height=650)
fig.show()

- Top 10 Selling Games:

- Dominance of Rockstar Games and Call of Duty: The presence of Grand Theft Auto V (GTA V) and Grand Theft Auto IV in the top-selling games in North America highlights the significant influence of the GTA franchise, which has become a cultural phenomenon in the region. The Call of Duty franchise, with multiple titles appearing in the top 10, reinforces the popularity of first-person shooter games, especially those with strong multiplayer modes. Guitar Hero III: Legends of Rock, in the 9th spot, indicates the popularity of rhythm-based music games during the mid-2000s, further showing the diversity of gaming preferences within North America.

#### 5.2.2 Top 10 Genres in North America

In [26]:
# Aggregate regional sales by genre
regional_sales_genre = df[['genre', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('genre').sum().reset_index()

# Sort by North American sales to get the top 10 genres
regional_sales_genre_sorted = regional_sales_genre.sort_values(by='na_sales', ascending=False).head(10)

# Create a bar chart of North American sales
fig = px.bar(regional_sales_genre_sorted,
             x='genre',
             y=['na_sales'],
             title='Top 10 video game genres in North America',
             labels={'genre': 'Genre', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- Top Genres by Total Sales:

- Sports, Action, and Shooter Lead: The most popular genres in North America, by total sales, are Sports, Action, and Shooter, which makes sense given the region's strong interest in competitive sports and action-packed, immersive experiences.

- Sports games, especially those like FIFA, Madden NFL, and NBA 2K, consistently resonate with the North American audience, reflecting the region's passion for live sports.

- Action games, particularly open-world franchises like GTA, also show a strong hold, reflecting a preference for high-adrenaline, story-driven titles.

- Shooter games, including Call of Duty and Halo, continue to dominate in North America, driven by both the popularity of single-player campaigns and competitive online multiplayer modes.

- Misc, Racing, and Platform genres suggest a wide range of preferences, indicating that the North American market is diverse, with players enjoying a variety of gaming experiences from racing simulators to platformers.

#### 5.2.3 Top 10 Consoles in North America

In [27]:
# Aggregate total sales by console and region
regional_sales_console = df[['console', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('console').sum().reset_index()

# Sort by North American sales to get the top consoles
regional_sales_console_sorted = regional_sales_console.sort_values(by='na_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_console_sorted,
             x='console',
             y=['na_sales'],
             title='Top 10 Consoles in North America',
             labels={'console': 'Console', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- Top Consoles by Total Sales:

- Xbox 360 (X360) and PlayStation 2 (PS2) Lead the Pack: Games for the Xbox 360 and PS2 account for the highest sales in North America. Note that these figures measure game sales per platform, not hardware units. The X360's lead reflects its success in the late 2000s and early 2010s, driven by strong online services (Xbox Live) and exclusive titles. The PS2, despite belonging to the previous console generation, maintained impressive sales over the years thanks to its affordability, extensive game library, and backward compatibility with PlayStation 1 games.

- Other consoles such as PS3, Wii, and PS4 further emphasize the strength of Sony and Nintendo in the region. The PS3 continued the PlayStation brand’s success in North America, while the Wii captured a broader, more casual audience.

- DS and PS (PlayStation 1) also remain strong players, with DS catering to portable gaming and PS having established the PlayStation brand in the region.

#### 5.2.4 Top 10 Publishers in North America

In [28]:
# Aggregate regional sales by publisher (the 'console' column is not needed here)
regional_sales_publisher = df[['publisher', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('publisher').sum().reset_index()

# Sort by North American sales to get the top publishers
regional_sales_publisher_sorted = regional_sales_publisher.sort_values(by='na_sales', ascending=False).head(10)

# Create a bar chart
fig = px.bar(regional_sales_publisher_sorted,
             x='publisher',
             y=['na_sales'],
             title='Top 10 Publishers in North America',
             labels={'publisher': 'Publisher', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- Activision and Electronic Arts Lead: Activision and Electronic Arts (EA) dominate the North American market, with their extensive game libraries and annual blockbuster franchises. Activision is especially powerful due to its Call of Duty series, while EA brings in consistent sales through its FIFA, Madden NFL, and NBA Live franchises.

- EA Sports further emphasizes the significance of sports gaming, with the brand’s long-standing success in North America.

- Ubisoft, THQ, Sony Computer Entertainment, and Rockstar Games also contribute to the diverse mix of publishers that drive North American sales, with Rockstar Games notably leading the way in open-world action with Grand Theft Auto.

#### 5.2.5 Conclusions

- Sports and Shooter Games Drive Sales: The combination of sports and shooter genres points to a strong demand for competitive and action-packed gaming experiences in North America. Titles like FIFA and Call of Duty dominate due to their multiplayer appeal and annual updates.

- GTA Franchise is a Strong Performer: The presence of GTA V and GTA IV in the top-selling games indicates the enduring popularity of the Grand Theft Auto series, reflecting the success of open-world, narrative-driven games with expansive gameplay.

- Xbox and PlayStation Consoles Dominate: Xbox 360 and PS2 lead the sales figures, but newer consoles like the PS4 and Xbox One continue to perform strongly, underlining the preference for both PlayStation and Xbox in North America. Sales for newer consoles may appear lower at the moment due to their recent release dates and fewer years of data compared to older consoles.

- Activision and EA's Consistent Performance: Activision and EA continue to dominate the sales charts with their key franchises, particularly Call of Duty and sports games, cementing their position as the leading publishers in North America.

- This analysis provides a comprehensive view of the key factors influencing the North American gaming market, highlighting the dominance of sports and shooter games, the success of consoles like the Xbox 360 and PS2, and the continuing influence of major publishers like Activision and EA. Sales data for newer consoles might appear lower, but this can be attributed to the shorter amount of time since their release, with these consoles still ramping up their market penetration.
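
The charts in this section all follow the same aggregate, sort, and take-the-top-10 pattern. As a minimal sketch, that pattern could be wrapped in a small helper. The function name `top_n_by_region` and the toy data below are hypothetical and only illustrate the idea; they are not part of the original notebook.

```python
import pandas as pd

def top_n_by_region(df, group_col, region_col, n=10):
    """Sum one regional sales column per group and return the n largest groups."""
    return (
        df.groupby(group_col, as_index=False)[region_col]
          .sum()
          .sort_values(by=region_col, ascending=False)
          .head(n)
          .reset_index(drop=True)
    )

# Toy data standing in for the real dataset (values are made up for illustration)
toy = pd.DataFrame({
    'title': ['Game A', 'Game A', 'Game B', 'Game C'],
    'console': ['PS3', 'X360', 'PS3', 'Wii'],
    'na_sales': [2.0, 3.0, 4.0, 1.0],
    'jp_sales': [0.5, 0.1, 0.2, 2.0],
})

# Sales of the same title on different consoles are added together
top_games_na = top_n_by_region(toy, 'title', 'na_sales', n=2)
print(top_games_na)
```

The returned frame has one column for the group and one for the chosen region, so it can be passed to `px.bar` in the same way as the sorted frames in the cells above.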

### 5.3 Sales in Japan

#### 5.3.1 Top 10 Games in Japan

In [29]:
# Aggregate total sales by game and region
regional_sales = df[['title', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('title').sum().reset_index()

# Sort by total sales to get the top games
regional_sales_sorted = regional_sales.sort_values(by='jp_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_sorted,
             x='title',
             y=['jp_sales'],
             title='Top 10 Games in Japan',
             labels={'title': 'Game Name', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#EF553B')
fig.update_layout(width=800, height=650)
fig.show()

- The top 10 best-selling games in Japan highlight a clear preference for locally popular titles and genres. Games like Hot Shots Golf, Famista 89 - Kaimaku Han!!, RBI Baseball, and Dragon Quest XI indicate that sports and RPG (role-playing game) genres are highly appealing to Japanese consumers. The list also includes niche titles like Super Puyo Puyo and Tomodachi Collection: New Life, which are well-tailored to the tastes of Japanese gamers. Notably, the appearance of Minecraft and GTA V suggests a growing interest in globally popular titles, though not to the same extent as in Western markets.

#### 5.3.2 Top 10 Genres in Japan

In [30]:
# Aggregate total sales by genre and region
regional_sales_genre = df[['genre', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('genre').sum().reset_index()

# Sort by total sales to get the top genres
regional_sales_genre_sorted = regional_sales_genre.sort_values(by='jp_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_genre_sorted,
             x='genre',
             y=['jp_sales'],
             title='Top 10 video game genres in Japan',
             labels={'genre': 'Genre', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#EF553B')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- The dominance of role-playing games (RPGs) and sports games aligns with Japan’s longstanding gaming culture, where RPGs like Dragon Quest and Final Fantasy have historically been popular. RPGs in particular resonate due to their storytelling, complex character development, and immersive worlds, which appeal to the Japanese market. The presence of fighting, action, and adventure games further reflects Japan’s affinity for competitive and interactive genres that emphasize skill and story-driven gameplay.

#### 5.3.3 Top 10 Consoles in Japan

In [31]:
# Aggregate total sales by console and region
regional_sales_console = df[['console', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('console').sum().reset_index()

# Sort by Japanese sales to get the top consoles
regional_sales_console_sorted = regional_sales_console.sort_values(by='jp_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_console_sorted,
             x='console',
             y=['jp_sales'],
             title='Top 10 Consoles in Japan',
             labels={'console': 'Console', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#EF553B')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- Japanese gamers favor consoles from the PlayStation family (PS, PS2, PS3, PSP, PS4) and Nintendo's DS and 3DS, with legacy consoles like the SNES and NES still having a strong presence in total sales. This preference indicates a strong loyalty to Sony and Nintendo, both Japanese companies that have a deep understanding of the local market. While newer consoles like the PS4 are rising in popularity, their sales figures may appear lower due to the more limited data on recent releases.
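
One way to check whether newer consoles only look weaker because they cover fewer years is to divide each console's sales by the number of release years present in the data. The sketch below assumes the `release_date` column parses cleanly with `pd.to_datetime`. The toy values and the `years_covered` and `jp_sales_per_year` column names are illustrative assumptions, not results from the dataset.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up for illustration)
toy = pd.DataFrame({
    'console': ['PS2', 'PS2', 'PS2', 'PS4', 'PS4'],
    'jp_sales': [1.0, 2.0, 3.0, 1.5, 1.5],
    'release_date': ['2000-03-04', '2003-06-01', '2005-11-20',
                     '2014-02-22', '2015-09-10'],
})
toy['release_year'] = pd.to_datetime(toy['release_date']).dt.year

# Total sales plus the first and last release year seen for each console
per_console = toy.groupby('console').agg(
    jp_sales=('jp_sales', 'sum'),
    first_year=('release_year', 'min'),
    last_year=('release_year', 'max'),
)

# Number of calendar years the console's releases span in the data
per_console['years_covered'] = per_console['last_year'] - per_console['first_year'] + 1

# Average sales per covered year, a rough normalization for data coverage
per_console['jp_sales_per_year'] = per_console['jp_sales'] / per_console['years_covered']
print(per_console)
```

In this toy example the PS2 has higher total sales but the PS4 has higher sales per covered year, which is the kind of effect the caveat above describes. The measure is rough, since it ignores sales made in years after a title's release.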

#### 5.3.4 Top 10 Publishers in Japan

In [32]:
# Aggregate regional sales by publisher (the 'console' column is not needed here)
regional_sales_publisher = df[['publisher', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('publisher').sum().reset_index()

# Sort by Japanese sales to get the top publishers
regional_sales_publisher_sorted = regional_sales_publisher.sort_values(by='jp_sales', ascending=False).head(10)

# Create a bar chart
fig = px.bar(regional_sales_publisher_sorted,
             x='publisher',
             y=['jp_sales'],
             title='Top 10 Publishers in Japan',
             labels={'publisher': 'Publisher', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#EF553B')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- Konami, Nintendo, Sega, Capcom, and Sony Computer Entertainment lead as the top publishers in Japan, demonstrating the Japanese market’s loyalty to local developers and established brands. These companies produce games that cater well to regional preferences, such as RPGs and simulation games, which appeal to Japanese players’ tastes. Square Enix and Bandai Namco also make the top publishers list, reflecting the impact of popular RPG franchises such as Square Enix's Final Fantasy and Dragon Quest.

#### 5.3.5 Conclusions

- The Japanese gaming market is characterized by a strong preference for locally developed games, particularly in the RPG and sports genres. PlayStation and Nintendo consoles dominate, highlighting brand loyalty to these Japanese manufacturers. In addition, the leading publishers show Japan’s preference for familiar brands and locally relevant content. While global hits like Minecraft and GTA V appear in the top-sellers, Japanese gamers prioritize titles that align more closely with traditional gaming tastes in Japan, such as RPGs and strategy-driven games.

### 5.4 Sales in Europe and Australia

#### 5.4.1 Top 10 Games in Europe and Australia

In [33]:
# Aggregate total sales by game and region
regional_sales = df[['title', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('title').sum().reset_index()

# Sort by total sales to get the top games
regional_sales_sorted = regional_sales.sort_values(by='pal_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_sorted,
             x='title',
             y=['pal_sales'],
             title='Top 10 Games in Europe and Australia',
             labels={'title': 'Game Name', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#AB63FA')
fig.update_layout(width=800, height=650)
fig.show()

- In the PAL region (primarily Europe, Australia, and surrounding markets), sports and action games dominate, with Grand Theft Auto V leading as the most popular game, followed closely by several installments of the FIFA franchise and Call of Duty titles. This lineup highlights a strong preference for action-packed and competitive games: GTA V for its open-world action appeal and FIFA for its immersive sports experience, which resonates well in markets where soccer (football) holds high cultural significance. The repeated presence of FIFA games from consecutive years also suggests a consistent demand for updated sports titles that reflect recent real-world leagues and teams.

#### 5.4.2 Top 10 Genres in Europe and Australia

In [34]:
# Aggregate total sales by genre and region
regional_sales_genre = df[['genre', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('genre').sum().reset_index()

# Sort by total sales to get the top genres
regional_sales_genre_sorted = regional_sales_genre.sort_values(by='pal_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_genre_sorted,
             x='genre',
             y=['pal_sales'],
             title='Top 10 video game genres in Europe and Australia',
             labels={'genre': 'Genre', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#AB63FA')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- The top genres in the PAL region show Action and Sports leading, followed by Shooter, Racing, and Platform genres. This distribution reflects a diverse but competitive gaming culture that favors high-energy, skill-based experiences. The inclusion of Racing and Platform games also aligns with European markets, where racing franchises have historically been popular. Genres like Role-Playing and Simulation, while present, do not capture as large a market share, likely due to a preference for genres that allow for more competitive or action-oriented gameplay.
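
Because the regions differ greatly in absolute sales, genre comparisons across regions are easier to read as shares of each region's total. The following is a minimal sketch of that normalization, using made-up toy values and only two of the regional columns. The `genre_share` name is an assumption introduced for illustration.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up for illustration)
toy = pd.DataFrame({
    'genre': ['Sports', 'Shooter', 'Role-Playing', 'Sports'],
    'na_sales': [3.0, 5.0, 1.0, 1.0],
    'jp_sales': [1.0, 0.5, 3.0, 0.5],
})
region_cols = ['na_sales', 'jp_sales']

# Total sales per genre in each region
genre_totals = toy.groupby('genre')[region_cols].sum()

# Divide each column by its own total so every region sums to 100 percent
genre_share = genre_totals / genre_totals.sum() * 100
print(genre_share.round(1))
```

With shares, a genre that is small in absolute terms but large relative to its region's market becomes directly comparable with the same genre elsewhere.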

#### 5.4.3 Top 10 Consoles in Europe and Australia

In [35]:
# Aggregate total sales by console and region
regional_sales_console = df[['console', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('console').sum().reset_index()

# Sort by PAL sales to get the top consoles
regional_sales_console_sorted = regional_sales_console.sort_values(by='pal_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_console_sorted,
             x='console',
             y=['pal_sales'],
             title='Top 10 Consoles in Europe and Australia',
             labels={'console': 'Console', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#AB63FA')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- The PAL region demonstrates a strong preference for PlayStation consoles, with the PS3, PS2, and PS4 leading in sales, and X360 and PS also featuring prominently. The significant market share for PlayStation consoles reflects the brand's popularity and historical success in Europe and Australia. Nintendo Wii and DS also make the list, underscoring the impact of Nintendo’s family-friendly and interactive titles in this region. Lower sales for newer consoles like the PS4 and XONE may also reflect limited data on recent releases rather than a lack of interest.

#### 5.4.4 Top 10 Publishers in Europe and Australia

In [36]:
# Aggregate regional sales by publisher (the 'console' column is not needed here)
regional_sales_publisher = df[['publisher', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('publisher').sum().reset_index()

# Sort by PAL sales to get the top publishers
regional_sales_publisher_sorted = regional_sales_publisher.sort_values(by='pal_sales', ascending=False).head(10)

# Create a bar chart
fig = px.bar(regional_sales_publisher_sorted,
             x='publisher',
             y=['pal_sales'],
             title='Top 10 Publishers in Europe and Australia',
             labels={'publisher': 'Publisher', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#AB63FA')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- Top publishers in the PAL region include Electronic Arts, Activision, and Ubisoft, with EA Sports also prominent due to its role in producing FIFA. The success of FIFA titles in this region underlines EA’s strong brand identity and connection to European markets, where soccer is highly popular. Additionally, Sony Computer Entertainment and Rockstar Games perform well, likely driven by GTA and various exclusive titles. The presence of THQ, Sega, and Konami indicates a demand for diverse content, from sports to action and RPG genres, reflecting PAL’s diverse gaming audience.

#### 5.4.5 Conclusions

- The PAL region is characterized by a preference for sports and action games, with a notable focus on FIFA titles that appeal to soccer fans, and high sales of GTA V, reflecting interest in open-world action games. The region’s gamers favor PlayStation consoles but also show support for competitive titles on Xbox and Nintendo systems. Electronic Arts leads as the top publisher, benefitting significantly from its association with FIFA. This market is diverse but strongly oriented toward high-energy, action, and sports games that cater to local cultural interests.


### 5.5 Other Regions Sales

#### 5.5.1 Top 10 Games in Other Regions

In [37]:
# Aggregate total sales by game and region
regional_sales = df[['title', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('title').sum().reset_index()

# Sort by total sales to get the top games
regional_sales_sorted = regional_sales.sort_values(by='other_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_sorted,
             x='title',
             y=['other_sales'],
             title='Top 10 Games in Other Regions',
             labels={'title': 'Game Name', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#B6E880')
fig.update_layout(width=800, height=650)
fig.show()

- In other regions outside of North America, Japan, and the PAL region, Grand Theft Auto V emerges as the top-selling game, followed closely by various Call of Duty titles, including Black Ops II, Ghosts, and Black Ops 3. This lineup reflects a preference for action-oriented and competitive gaming experiences, particularly in franchises with a strong global following. Additionally, FIFA Soccer 08 stands out among these top titles, indicating a demand for sports games, particularly those that tap into the global appeal of soccer (football).

#### 5.5.2 Top 10 Genres in Other Regions

In [38]:
# Aggregate total sales by genre and region
regional_sales_genre = df[['genre', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('genre').sum().reset_index()

# Sort by total sales to get the top genres
regional_sales_genre_sorted = regional_sales_genre.sort_values(by='other_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_genre_sorted,
             x='genre',
             y=['other_sales'],
             title='Top 10 video game genres in Other Regions',
             labels={'genre': 'Genre', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#B6E880')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- In these regions, the most popular genres by sales include Sports, Action, and Shooter, followed by Misc, Racing, and Role-Playing. The high ranking of Sports and Shooter genres suggests that players in these regions favor competitive and high-energy experiences, whether in a sports setting or through action-packed gameplay. The presence of Racing and Role-Playing genres also suggests an appetite for diverse gaming experiences, though they trail behind more universally popular categories. The lower ranks occupied by Platform, Adventure, Fighting, and Simulation genres reflect more niche interests within these categories.

#### 5.5.3 Top 10 Consoles in Other Regions

In [39]:
# Aggregate total sales by console and region
regional_sales_console = df[['console', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('console').sum().reset_index()

# Sort by other-region sales to get the top consoles
regional_sales_console_sorted = regional_sales_console.sort_values(by='other_sales', ascending=False).head(10)

# Create a grouped bar chart
fig = px.bar(regional_sales_console_sorted,
             x='console',
             y=['other_sales'],
             title='Top 10 Consoles in Other Regions',
             labels={'console': 'Console', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#B6E880')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- The console preferences in these regions lean strongly towards PlayStation consoles, with the PS2, PS3, and PS4 leading in total sales, followed by Xbox 360 and Wii. This suggests a high level of brand loyalty to PlayStation, which has established itself as a dominant force in these markets. The presence of handheld consoles like the PSP and DS also indicates some demand for portable gaming, perhaps due to unique lifestyle factors in these regions that favor on-the-go entertainment.


#### 5.5.4 Top 10 Publishers in Other Regions

In [40]:
# Aggregate regional sales by publisher (the 'console' column is not needed here)
regional_sales_publisher = df[['publisher', 'na_sales', 'jp_sales', 'pal_sales', 'other_sales']].groupby('publisher').sum().reset_index()

# Sort by other-region sales to get the top publishers
regional_sales_publisher_sorted = regional_sales_publisher.sort_values(by='other_sales', ascending=False).head(10)

# Create a bar chart
fig = px.bar(regional_sales_publisher_sorted,
             x='publisher',
             y=['other_sales'],
             title='Top 10 Publishers in Other Regions',
             labels={'publisher': 'Publisher', 'value': 'Sales in Millions'},
             barmode='group')

fig.update_traces(marker_color='#B6E880')
fig.update_layout(width=800, height=650)
fig.update_xaxes(categoryorder='total descending')
fig.show()

- The most popular publishers in these regions are Activision and Electronic Arts, reflecting the success of the Call of Duty and FIFA franchises, respectively. Ubisoft and Sony Computer Entertainment also rank highly, with Ubisoft’s catalog of action and adventure games likely contributing to its popularity. THQ, Rockstar Games, Konami, SEGA, and Bethesda Softworks round out the list, demonstrating that players in these regions are interested in a variety of game genres and styles beyond just sports and shooters, including role-playing and action-adventure experiences.

#### 5.5.5 Conclusions

- These regions exhibit a strong preference for action-packed, competitive games, with a marked interest in franchises like Call of Duty and FIFA, which continue to perform well in sales. The top genres of Sports, Action, and Shooter underscore the popularity of competitive gameplay, while the success of PlayStation consoles suggests that the brand has significant traction in these areas. Leading publishers like Activision and Electronic Arts dominate, driven by their global appeal in competitive sports and action genres. Overall, these regions have a diverse but competitively inclined gaming culture, with strong loyalty to leading franchises and consoles.

## 6. General Conclusions from Video Game Sales Analysis


1. **Top-Selling Games Worldwide:** The analysis revealed a consistent global preference for high-impact franchises. Games like GTA V and the Call of Duty series dominate worldwide sales, indicating a strong, universal appeal for immersive, high-action titles with robust gameplay and multiplayer options. Sports games, particularly FIFA, also have a global presence, appealing to sports fans across regions. Games like Minecraft have also achieved remarkable worldwide success due to their unique, open-world gameplay and broad appeal across diverse age groups, making them standout titles in terms of sales and cultural impact.

2. **Genre Specializations across Consoles and Manufacturers:** Each console and manufacturer shows distinct genre specializations, aligning with their target audiences. PlayStation consoles are highly versatile but excel in sports and action genres, while Xbox platforms show a stronger preference for shooter games, largely due to the influence of the Call of Duty series. Nintendo, on the other hand, focuses on a broader variety with a strong share in miscellaneous (e.g., Minecraft), action, and simulation games, catering to a wider demographic range.

3. **Sales Trends Over Time:** Video game sales reached a peak in the late 2000s, followed by a gradual decline. This trend may be influenced by shifts in gaming preferences, the rise of digital distribution over physical sales, and limited data for newer games that have yet to fully saturate the market. Additionally, the drop in sales for certain years could indicate missing data, especially in recent years, as gaming transitions increasingly to digital platforms.

4. **Titles Released Over Time:** The number of new games released peaked in the late 2000s, suggesting a period of high growth and diversity in the gaming market. However, this trend declined in recent years, likely due to increased development costs, consolidation within the gaming industry, and the rise of quality-focused production over quantity. The increase in smaller, indie titles through digital platforms may not be fully reflected in this dataset.
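
The suspicion that recent-year declines partly reflect missing data can be checked by measuring, per release year, the share of rows with no recorded `total_sales`. The sketch below uses made-up toy rows. It assumes missing sales are stored as NaN after loading, which should be verified against the actual file.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up for illustration)
toy = pd.DataFrame({
    'release_date': ['2008-05-01', '2008-09-12', '2019-03-01',
                     '2019-10-25', '2019-11-15'],
    'total_sales': [1.2, 0.8, None, None, 0.3],
})
toy['release_year'] = pd.to_datetime(toy['release_date']).dt.year

# Fraction of titles per release year whose total_sales is missing
missing_share = toy['total_sales'].isna().groupby(toy['release_year']).mean()
print(missing_share)
```

A sharply rising missing share in later years would support reading the decline as a data-coverage effect rather than a purely market-driven one.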

**Regional Preferences:**

- North America favors high-action games, with the Call of Duty series and sports titles being particularly popular. The region also shows strong support for Xbox consoles and brands like Activision and EA.

- Japan shows distinct preferences with Hot Shots Golf and a strong role-playing game (RPG) presence. Popular consoles here include older PlayStation generations and Nintendo consoles, with local publishers like Konami and Square Enix leading in sales.

- PAL Regions tend to favor sports and action genres, with FIFA games among the top sellers. PlayStation consoles dominate this region, showing consistent popularity across generations.

- Other Regions similarly display strong action and shooter preferences, with GTA V and Call of Duty titles consistently ranking high. Sony, Activision, and EA lead the publisher rankings, indicating a cross-regional brand presence.

- Market Dominance by Console and Publisher: Sony leads in total sales, followed by Microsoft and Nintendo, reflecting their broad, established presence in key regions and appeal across multiple game genres. Activision, EA, and Ubisoft rank as the top publishers due to their popular franchises and ability to produce high-demand titles in action, shooter, and sports genres.

**Final Thoughts:**

This analysis highlights the interconnected factors driving game sales: genre specialization, platform compatibility, and publisher influence all play significant roles. While a correlation between critic scores and sales was observed, causation cannot be assumed, as sales depend on multiple complex variables including marketing, regional preferences, and game availability. Going forward, further data on digital sales and emerging console popularity could provide more insights into the evolving dynamics of the gaming market.
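
For reference, a correlation between `critic_score` and `total_sales` can be computed on rows where both values are present. This is a minimal sketch on made-up toy values, not a reproduction of the notebook's earlier result. It reports the Pearson coefficient alongside a rank-based (Spearman) coefficient, which is less sensitive to a few blockbuster titles.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up for illustration)
toy = pd.DataFrame({
    'critic_score': [9.4, 9.7, 7.0, 6.5, None, 8.0],
    'total_sales': [20.32, 19.39, 1.0, 0.5, 0.2, None],
})

# Keep only titles that have both a critic score and a sales figure
pairs = toy[['critic_score', 'total_sales']].dropna()

# Pearson correlation (linear association); Series.corr defaults to Pearson
pearson_r = pairs['critic_score'].corr(pairs['total_sales'])

# Spearman correlation, computed as the Pearson correlation of the ranks
spearman_r = pairs['critic_score'].rank().corr(pairs['total_sales'].rank())

print(f"Pairs used: {len(pairs)}")
print(f"Pearson r:  {pearson_r:.3f}")
print(f"Spearman r: {spearman_r:.3f}")
```

Either coefficient only describes association in the rows that have both values, so titles without a critic score are left out of the estimate.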