# Project 8: Video Game Sales Analysis - Grouping & Aggregation

### Context

Video game creation is essentially a software development process. An individual, a group, or a company creates a video game with the aim of making large profits. Generally, publishers such as EA Sports, Atari, and Rockstar Games fund the game development process. For publishers, it is very important to estimate the development cost of a video game and to allocate the right budget. If the budget is too small, the quality of the game is compromised; if it is too large, the profit margin shrinks. Most commercial games do not generate adequate profit.

A video game is an interactive visual story. A new game must offer novelty and be a product of innovation; otherwise, it may turn out repetitive and boring. Many independent groups of game developers abandon development and exit the market because they cannot find a publisher or cannot fund the development themselves, as the initial capital investment required is very large. Nevertheless, once companies become financially stable by making sufficient profits, they may expand to develop newer games or sequels to their initial ones, such as FIFA, Call of Duty, and Age of Empires.

The average development budget for a multiplatform (PC, PS, Xbox etc.) game is US \$18 to \$28 million, with high-profile games often exceeding US \$40 million.



---

### Problem Statement

Imagine that you work for one of the world's biggest tech giants as a data analyst. The company intends to venture into the video game development business by either creating its own video games and gaming platforms or by funding a group of individual game developers.

As part of market research, your CEO wants a business strategy that enables your company to enter the video game development business. However, to protect the company's financial investment in this project, it is important to know whether there are enough buyers and whether the number of buyers increases in the long run, so that the company can stay invested in the project.

Your CEO would also like to know which kinds of games are most popular in terms of units sold and which gaming platforms (PS4, Xbox, PC etc.) are most commonly used.

---

### Dataset Description

You are provided with a video games sales dataset. It consists of the following features:

1. `Rank` - Rank of a game based on the number of units sold. The best-selling game is ranked 1.

2. `Name` - The name of a video game.

3. `Platform` - The platform (PC, PS4, Xbox etc.) for which a game is released.

4. `Year` - The release year of a video game.

5. `Genre` - The genre of a video game.

6. `Publisher` - The publisher of a video game.

7. `NA_Sales` - The approximate total number of units (in millions) of a video game sold in North America.

8. `EU_Sales` - The approximate total number of units (in millions) of a video game sold in Europe.

9. `JP_Sales` - The approximate total number of units (in millions) of a video game sold in Japan.

10. `Other_Sales` - The approximate total number of units (in millions) of a video game sold in the rest of the world.

11. `Global_Sales` - The approximate total number of units (in millions) of a video game sold all over the world.

Here's a link to the dataset:

https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/video-games-sales/video-game-sales.csv

---

### Things To Do

- The `Year` and `Publisher` columns contain a few missing values. Treat them appropriately.

- Convert the values contained in the `Year` column into integer values.

- Find out:

  1. The trend of growth in the total number of units sold across the given regions and the world. Also create year-wise line plots for the total number of units sold across different regions and the world.
  
  2. The top 10 best-selling genres of video games with at least 100 million units sold globally. Also create genre-wise line plots for the total number of units sold across different regions and the world.

  3. The top 10 publishers of video games with at least 100 million units sold globally.
  
  4. The top 10 most commonly used gaming platforms with at least 100 million units sold globally.

---

#### 1. Import Modules & Load Data

In [None]:
# Import the modules required.
import pandas as pd
import numpy as np

In [None]:
# Load the dataset into a DataFrame and display it.
data = pd.read_csv("https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/video-games-sales/video-game-sales.csv")
data

Unnamed: 0,Rank,Name,Platform,Year,Genre,Publisher,NA_Sales,EU_Sales,JP_Sales,Other_Sales,Global_Sales
0,1,Wii Sports,Wii,2006.0,Sports,Nintendo,41.49,29.02,3.77,8.46,82.74
1,2,Super Mario Bros.,NES,1985.0,Platform,Nintendo,29.08,3.58,6.81,0.77,40.24
2,3,Mario Kart Wii,Wii,2008.0,Racing,Nintendo,15.85,12.88,3.79,3.31,35.82
3,4,Wii Sports Resort,Wii,2009.0,Sports,Nintendo,15.75,11.01,3.28,2.96,33.00
4,5,Pokemon Red/Pokemon Blue,GB,1996.0,Role-Playing,Nintendo,11.27,8.89,10.22,1.00,31.37
...,...,...,...,...,...,...,...,...,...,...,...
16593,16596,Woody Woodpecker in Crazy Castle 5,GBA,2002.0,Platform,Kemco,0.01,0.00,0.00,0.00,0.01
16594,16597,Men in Black II: Alien Escape,GC,2003.0,Shooter,Infogrames,0.01,0.00,0.00,0.00,0.01
16595,16598,SCORE International Baja 1000: The Official Game,PS2,2008.0,Racing,Activision,0.00,0.00,0.00,0.00,0.01
16596,16599,Know How 2,DS,2010.0,Puzzle,7G//AMES,0.00,0.01,0.00,0.00,0.01


In [None]:
# Get the dimensions (number of rows and columns) of the dataset.
data.shape

(16598, 11)

---

#### 2. Treat Null Values

In [None]:
# Check for the null values in all the columns.
data.isnull().sum()

Rank              0
Name              0
Platform          0
Year            271
Genre             0
Publisher        58
NA_Sales          0
EU_Sales          0
JP_Sales          0
Other_Sales       0
Global_Sales      0
dtype: int64

In [None]:
# Remove the rows/columns containing the null values.
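
Since only 271 `Year` values and 58 `Publisher` values are missing out of 16598 rows, dropping the affected rows is one reasonable treatment. The sketch below illustrates it on a toy DataFrame built from three rows of the preview above, with two values blanked out on purpose; the names `sample` and `cleaned` are illustrative and not part of the project.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset: three rows from the preview above,
# with one Year and one Publisher deliberately set to NaN for illustration.
sample = pd.DataFrame({
    "Name": ["Wii Sports", "Super Mario Bros.", "Mario Kart Wii"],
    "Year": [2006.0, np.nan, 2008.0],
    "Publisher": ["Nintendo", "Nintendo", np.nan],
    "Global_Sales": [82.74, 40.24, 35.82],
})

# Drop every row with a null in either column, then renumber the index.
cleaned = sample.dropna(subset=["Year", "Publisher"]).reset_index(drop=True)

# Confirm that no nulls remain and see how many rows survived.
print(cleaned.isnull().sum())
print(cleaned.shape)
```

On the real data the same call would be applied to `data`; re-running `data.isnull().sum()` afterwards is a quick way to confirm that both counts are zero.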


In [None]:
# Convert the data-type of the year values into integer values.
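
The years are loaded as floats (for example `2006.0`) because the column contains missing values. One way to convert them, assuming the nulls have already been treated, is `astype(int)`; pandas raises an error if a `NaN` is still present when casting to an integer type. The toy DataFrame `years` below is illustrative.

```python
import pandas as pd

# Toy stand-in: Year values stored as floats, as in the loaded dataset.
years = pd.DataFrame({
    "Name": ["Wii Sports", "Super Mario Bros.", "Mario Kart Wii"],
    "Year": [2006.0, 1985.0, 2008.0],
})

# Cast the float years to integers. This only works once no NaN remains.
years["Year"] = years["Year"].astype(int)

print(years["Year"].tolist())
print(years["Year"].dtype)
```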


---

#### 3. Yearly Total Units Sold

In [None]:
# Find out the total number of units sold yearly across different regions and the world.
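
A common way to get yearly totals is to group the rows by `Year` and sum the five sales columns. The sketch below does this on four rows taken from the preview above; `sample`, `sales_cols`, and `yearly_sales` are illustrative names.

```python
import pandas as pd

sales_cols = ["NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales", "Global_Sales"]

# Four rows from the preview above; two of them share the year 2008.
sample = pd.DataFrame({
    "Year": [2006, 2008, 2009, 2008],
    "NA_Sales": [41.49, 15.85, 15.75, 0.00],
    "EU_Sales": [29.02, 12.88, 11.01, 0.00],
    "JP_Sales": [3.77, 3.79, 3.28, 0.00],
    "Other_Sales": [8.46, 3.31, 2.96, 0.00],
    "Global_Sales": [82.74, 35.82, 33.00, 0.01],
})

# One row per year, each sales column summed over that year's games.
yearly_sales = sample.groupby("Year")[sales_cols].sum()
print(yearly_sales)
```

Selecting the sales columns explicitly keeps non-sales columns such as `Rank` out of the sum.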


In [None]:
# Create the line plots for the total number of units sold yearly across different regions and the world.
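
One possible way to draw the line plots is with `matplotlib`, which the import cell above does not load, so it is an assumption here. The sketch plots one line per sales column from a small, hand-built table of yearly totals; the `Agg` backend line is only needed when running the code as a script rather than in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy yearly totals (three rows from the preview above), indexed by year.
yearly_sales = pd.DataFrame(
    {
        "NA_Sales": [41.49, 15.85, 15.75],
        "EU_Sales": [29.02, 12.88, 11.01],
        "JP_Sales": [3.77, 3.79, 3.28],
        "Other_Sales": [8.46, 3.31, 2.96],
        "Global_Sales": [82.74, 35.82, 33.00],
    },
    index=pd.Index([2006, 2008, 2009], name="Year"),
)

# Draw one line per region on a shared axis so the trends can be compared.
fig, ax = plt.subplots(figsize=(12, 5))
for column in yearly_sales.columns:
    ax.plot(yearly_sales.index, yearly_sales[column], marker="o", label=column)

ax.set_xlabel("Year")
ax.set_ylabel("Units sold (millions)")
ax.set_title("Total units sold per year by region")
ax.legend()
ax.grid(True)
```

The same loop works for genre-wise totals if the table is indexed by `Genre` instead of `Year`.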


**Q:** In which year were the most games sold globally, and how many units were sold?

**A:** 

In [None]:
# In which year were the most games sold globally, and how many units were sold?
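
Once the yearly totals exist, `idxmax()` returns the index label (the year) of the largest value and `max()` returns the value itself. The Series below holds toy totals, so its result is not the answer to the question above; `yearly_global` is an illustrative name.

```python
import pandas as pd

# Toy yearly global totals, indexed by year (not the real dataset totals).
yearly_global = pd.Series(
    [82.74, 35.83, 33.00],
    index=pd.Index([2006, 2008, 2009], name="Year"),
    name="Global_Sales",
)

best_year = yearly_global.idxmax()   # index label of the largest total
best_total = yearly_global.max()     # the largest total itself

print(f"Year with the highest global sales: {best_year} ({best_total} million units)")
```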


---

#### 4. Genre-wise Total Units Sold

In [None]:
# Find out the genre-wise total number of units sold across different regions and the world.


In [None]:
# Create line plots for genre-wise total number of units sold across different regions and the world.


**Q:** What genre of video game is most popular in Japan in terms of the total number of units sold? Also, provide the total number of units sold in Japan for that genre.

**A:** 

In [None]:
# What genre of video game is most popular in Japan in terms of the total number of units sold?


In [None]:
# Genre-wise total number of units sold across different regions and the world in descending order.
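
To rank the genres, the grouped totals can be sorted by `Global_Sales` in descending order. The sketch below also applies the "at least 100 million units" condition, under the assumption that the threshold refers to each genre's summed global sales. It uses five rows from the preview above; `genre_sales` and `top_genres` are illustrative names.

```python
import pandas as pd

# Five rows from the preview above (only two sales columns kept for brevity).
sample = pd.DataFrame({
    "Genre": ["Sports", "Platform", "Racing", "Sports", "Role-Playing"],
    "JP_Sales": [3.77, 6.81, 3.79, 3.28, 10.22],
    "Global_Sales": [82.74, 40.24, 35.82, 33.00, 31.37],
})

# Sum per genre, then order the genres from best-selling to worst-selling.
genre_sales = (
    sample.groupby("Genre")[["JP_Sales", "Global_Sales"]]
    .sum()
    .sort_values("Global_Sales", ascending=False)
)

# Keep genres with at least 100 million units globally, at most 10 of them.
top_genres = genre_sales[genre_sales["Global_Sales"] >= 100].head(10)

print(genre_sales)
print(top_genres)
```

Sorting by `JP_Sales` instead of `Global_Sales` gives the ranking for Japan.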


**Q:** Which genre of video games sells the most units globally, and how many?

**A:** 

---

#### 5. Publisher-wise Total Units Sold

In [None]:
# Find out the publisher-wise total number of units sold across different regions and the world in descending order.


**Q:** Which video game publisher sells the most units globally, and how many?

**A:** 

---

#### 6. Platform-wise Total Units Sold

In [None]:
# Find out the platform-wise total number of units sold across different regions and the world in descending order.
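
The publisher-wise and platform-wise rankings follow the same group, sum, filter, and sort pattern, so one option is to wrap it in a small helper. `top_groups` below is a hypothetical function written for this sketch, not something the project provides, and it assumes the 100 million threshold applies to each group's summed global sales.

```python
import pandas as pd

SALES_COLS = ["NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales", "Global_Sales"]


def top_groups(df, by, min_global=100.0, n=10):
    """Return up to n groups with at least min_global million units, best first."""
    totals = df.groupby(by)[SALES_COLS].sum()
    qualified = totals[totals["Global_Sales"] >= min_global]
    return qualified.sort_values("Global_Sales", ascending=False).head(n)


# Six rows from the preview above.
sample = pd.DataFrame({
    "Platform": ["Wii", "NES", "Wii", "Wii", "GB", "PS2"],
    "Publisher": ["Nintendo"] * 5 + ["Activision"],
    "NA_Sales": [41.49, 29.08, 15.85, 15.75, 11.27, 0.00],
    "EU_Sales": [29.02, 3.58, 12.88, 11.01, 8.89, 0.00],
    "JP_Sales": [3.77, 6.81, 3.79, 3.28, 10.22, 0.00],
    "Other_Sales": [8.46, 0.77, 3.31, 2.96, 1.00, 0.00],
    "Global_Sales": [82.74, 40.24, 35.82, 33.00, 31.37, 0.01],
})

top_platforms = top_groups(sample, "Platform")
top_publishers = top_groups(sample, "Publisher")

print(top_platforms)
print(top_publishers)
```

Passing `"Genre"` as the `by` argument would reuse the same helper for the genre ranking.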


**Q:** For which gaming platform are the most units sold globally, and how many?

**A:** 

---