# Video Game Sales
##### Dataset: `vgsales`.
##### Author: Batool Malkawi.
##### Date: 14/12/2020

___
## Import packages
___

In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.read_csv('./vgsales.csv')

___
## 1. Which company is the most common video game publisher?
___

In [3]:
publishers = df['Publisher']
most_common_publisher = publishers.mode()
print(most_common_publisher[0])

Electronic Arts
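
`mode()` reports only the winning label. As a minimal sketch, `value_counts()` can also show how many rows stand behind that label. The small `sample` frame below is an illustrative assumption: it holds only the publishers of the first five rows of the Puzzle table in section 8, not the full dataset, so its result does not answer the question for `vgsales` as a whole.

```python
import pandas as pd

# Illustrative subset (assumption): publishers of the first five Puzzle rows in section 8.
sample = pd.DataFrame(
    {'Publisher': ['Nintendo', 'Nintendo', 'Atari', 'Nintendo', 'Nintendo']}
)

# value_counts() orders labels by frequency, most common first.
publisher_counts = sample['Publisher'].value_counts()
top_publisher = publisher_counts.index[0]
top_count = int(publisher_counts.iloc[0])
print(top_publisher, top_count)
```

On the real data, the same call on `df['Publisher']` would list every publisher with its number of titles, which also makes ties visible.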


___
## 2. What’s the most common platform?
___

In [4]:
platforms = df['Platform']
most_common_platform = platforms.mode()
print(most_common_platform[0])

DS


___
## 3. What about the most common genre?
___

In [5]:
genres = df['Genre']
most_common_genre = genres.mode()
print(most_common_genre[0])

Action


___
## 4. What are the top 20 highest grossing games?
___

In [6]:
highest_grossing_games = df[['Name','Global_Sales']]
sorted_highest_grossing_games = highest_grossing_games.sort_values('Global_Sales', ascending=False)
top_20_highest_grossing_games = sorted_highest_grossing_games.head(20)
print(top_20_highest_grossing_games)

                                            Name  Global_Sales
0                                     Wii Sports         82.74
1                              Super Mario Bros.         40.24
2                                 Mario Kart Wii         35.82
3                              Wii Sports Resort         33.00
4                       Pokemon Red/Pokemon Blue         31.37
5                                         Tetris         30.26
6                          New Super Mario Bros.         30.01
7                                       Wii Play         29.02
8                      New Super Mario Bros. Wii         28.62
9                                      Duck Hunt         28.31
10                                    Nintendogs         24.76
11                                 Mario Kart DS         23.42
12                   Pokemon Gold/Pokemon Silver         23.10
13                                       Wii Fit         22.72
14                                  Wii Fit Plus       

___
## 5. For North American video game sales, what’s the median?
___

In [7]:
na_video_game_sales = df['NA_Sales']
na_median = na_video_game_sales.median()
print("North America Median: ", na_median)

# Rows whose NA sales equal the median exactly
na_country_median = df[df['NA_Sales'] == na_median]
na_games_surrounding_median = len(na_country_median)

# Take a 10-row window around the midpoint of those rows
middle_point = round(na_games_surrounding_median / 2)
ten_games_surrounding_median = na_country_median[(middle_point - 6):(middle_point + 4)]

print("***************************************")
print("10 games surrounding the median sales: ")
print("***************************************")
for game in ten_games_surrounding_median['Name']:
    print(game)

North America Median:  0.08
***************************************
10 games surrounding the median sales: 
***************************************
Spider-Man: Edge of Time
Turok: Evolution
Deadpool
GT Advance 2: Rally Racing
A Witch's Tale
Nickelodeon Dance
Phantasy Star Collection
LEGO Knights' Kingdom
Family Game Night 4: The Game Show
NBA Jam 2002
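
The cell above picks its ten games from rows whose `NA_Sales` equals the median exactly. A different way to read "surrounding the median" is to sort by sales and take the rows on either side of the middle position. The sketch below shows that alternative; the helper `rows_around_median` and its parameter `k` are hypothetical names, and `sample` is an illustrative subset copied from the Puzzle table in section 8, not the full dataset.

```python
import pandas as pd

def rows_around_median(frame: pd.DataFrame, column: str, k: int) -> pd.DataFrame:
    """Return up to 2*k rows centred on the middle of `frame` sorted by `column`."""
    ordered = frame.sort_values(column, kind='stable').reset_index(drop=True)
    mid = len(ordered) // 2
    return ordered.iloc[max(mid - k, 0):mid + k]

# Illustrative subset (assumption): names and NA sales from the Puzzle table in section 8.
sample = pd.DataFrame({
    'Name': ['Tetris', 'Brain Age 2: More Training in Minutes a Day', 'Pac-Man',
             'Tetris', 'Dr. Mario', 'Professor Layton and the Curious Village',
             'Dr. Mario', 'Professor Layton and the Diabolical Box',
             'Professor Layton and the Unwound Future', 'Pac-Man Collection'],
    'NA_Sales': [23.2, 3.44, 7.28, 2.97, 2.18, 1.22, 2.62, 0.92, 0.65, 2.07],
})

window = rows_around_median(sample, 'NA_Sales', k=2)
print(window)
```

With `k=5` on the full frame this would return ten games, five on each side of the middle position.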


___
## 6. For the top-selling game of all time, how many standard deviations above/below the mean are its sales for North America?
___

*to be answered*
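
One possible approach, sketched under assumptions: find the row with the largest `Global_Sales`, then express its `NA_Sales` as a z-score, that is, its distance from the `NA_Sales` mean divided by the `NA_Sales` standard deviation. The helper name `na_sales_z_score` is hypothetical, and `sample` is only an illustrative subset taken from the Puzzle table in section 8, so the printed value is not the answer for the full dataset.

```python
import pandas as pd

def na_sales_z_score(frame: pd.DataFrame) -> float:
    """Z-score of the NA sales of the game with the highest global sales."""
    top_game = frame.loc[frame['Global_Sales'].idxmax()]
    na_mean = frame['NA_Sales'].mean()
    na_std = frame['NA_Sales'].std()  # pandas default: sample standard deviation (ddof=1)
    return float((top_game['NA_Sales'] - na_mean) / na_std)

# Illustrative subset (assumption): three rows copied from the Puzzle table in section 8.
sample = pd.DataFrame({
    'Name': ['Tetris', 'Pac-Man', 'Dr. Mario'],
    'NA_Sales': [23.2, 7.28, 2.18],
    'Global_Sales': [30.26, 7.81, 5.34],
})

z = na_sales_z_score(sample)
print("Standard deviations from the NA mean: ", z)
```

Calling `na_sales_z_score(df)` on the loaded dataset would give the figure the question asks for. A positive result means the game sold above the North American average, a negative one below it.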

___
## 7. The Nintendo Wii seems to have outdone itself with games. How does its average number of sales compare with all of the other platforms?
___

In [8]:
other_platforms_mean = df[df['Platform'] != 'Wii'].Global_Sales.mean()
wii_platform_mean = df[df['Platform'] == 'Wii'].Global_Sales.mean()
diff = abs(wii_platform_mean - other_platforms_mean)
print("Mean of other platforms: ", other_platforms_mean)
print("Mean of Wii platform: ", wii_platform_mean)
print("Difference: ", diff)

Mean of other platforms:  0.5233896418516336
Mean of Wii platform:  0.6994037735849057
Difference:  0.17601413173327207
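
The same comparison can be wrapped in a small function so that any platform can be checked against the rest. This is a minimal sketch: `platform_vs_rest` is a hypothetical helper name, and `sample` is an illustrative subset of four rows from the Puzzle table in section 8. That excerpt contains no Wii rows, so the demonstration uses `'DS'` instead.

```python
import pandas as pd

def platform_vs_rest(frame: pd.DataFrame, platform: str) -> tuple:
    """Return (mean global sales on `platform`, mean global sales on all other platforms)."""
    is_platform = frame['Platform'] == platform
    platform_mean = frame.loc[is_platform, 'Global_Sales'].mean()
    rest_mean = frame.loc[~is_platform, 'Global_Sales'].mean()
    return float(platform_mean), float(rest_mean)

# Illustrative subset (assumption): four rows copied from the Puzzle table in section 8.
sample = pd.DataFrame({
    'Platform': ['GB', 'DS', '2600', 'NES'],
    'Global_Sales': [30.26, 15.3, 7.81, 5.58],
})

ds_mean, rest_mean = platform_vs_rest(sample, 'DS')
print("Mean of DS platform: ", ds_mean)
print("Mean of other platforms: ", rest_mean)
```

On the loaded dataset, `platform_vs_rest(df, 'Wii')` should reproduce the two means printed above.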


___
## 8. Top 20 Puzzle games?
___


In [9]:

df[df['Genre'] == 'Puzzle'].sort_values('Global_Sales', ascending=False).head(20)

| | Rank | Name | Platform | Year | Genre | Publisher | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 6 | Tetris | GB | 1989.0 | Puzzle | Nintendo | 23.2 | 2.26 | 4.22 | 0.58 | 30.26 |
| 27 | 28 | Brain Age 2: More Training in Minutes a Day | DS | 2005.0 | Puzzle | Nintendo | 3.44 | 5.36 | 5.32 | 1.18 | 15.3 |
| 89 | 90 | Pac-Man | 2600 | 1982.0 | Puzzle | Atari | 7.28 | 0.45 | 0.0 | 0.08 | 7.81 |
| 155 | 156 | Tetris | NES | 1988.0 | Puzzle | Nintendo | 2.97 | 0.69 | 1.81 | 0.11 | 5.58 |
| 170 | 171 | Dr. Mario | GB | 1989.0 | Puzzle | Nintendo | 2.18 | 0.96 | 2.0 | 0.2 | 5.34 |
| 177 | 178 | Professor Layton and the Curious Village | DS | 2007.0 | Puzzle | Nintendo | 1.22 | 2.48 | 1.03 | 0.52 | 5.26 |
| 215 | 216 | Dr. Mario | NES | 1990.0 | Puzzle | Nintendo | 2.62 | 0.6 | 1.52 | 0.1 | 4.85 |
| 300 | 301 | Professor Layton and the Diabolical Box | DS | 2007.0 | Puzzle | Nintendo | 0.92 | 1.78 | 0.92 | 0.37 | 4.0 |
| 399 | 400 | Professor Layton and the Unwound Future | DS | 2008.0 | Puzzle | Nintendo | 0.65 | 1.61 | 0.82 | 0.28 | 3.36 |
| 488 | 489 | Pac-Man Collection | GBA | 2001.0 | Puzzle | Atari | 2.07 | 0.77 | 0.05 | 0.05 | 2.94 |


___
## 9. What is the most popular game in Japan?
___

In [10]:
most_popular_in_japan = df.sort_values('JP_Sales', ascending=False)
print("Most popular game in Japan is: ", most_popular_in_japan.iloc[0]['Name'])

Most popular game in Japan is:  Pokemon Red/Pokemon Blue


___
## 10. What is the least popular game in Japan?
___

In [11]:
# Several titles may share the minimum JP_Sales; iloc[0] then returns only one of the tied rows.
least_popular_in_japan = df.sort_values('JP_Sales', ascending=True)
print("Least popular game in Japan is: ", least_popular_in_japan.iloc[0]['Name'])
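
Sorting and taking the first row names a single game even when several share the lowest `JP_Sales`. As a minimal sketch, the rows can instead be filtered on the column minimum so that every tied title is returned. The helper `lowest_sellers` is a hypothetical name, and `sample` is an illustrative subset of three rows from the Puzzle table in section 8, not the full dataset.

```python
import pandas as pd

def lowest_sellers(frame: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return every row whose value in `column` equals the column minimum."""
    return frame[frame[column] == frame[column].min()]

# Illustrative subset (assumption): three rows copied from the Puzzle table in section 8.
sample = pd.DataFrame({
    'Name': ['Tetris', 'Pac-Man', 'Pac-Man Collection'],
    'JP_Sales': [4.22, 0.0, 0.05],
})

lowest = lowest_sellers(sample, 'JP_Sales')
print("Games with the lowest sales in Japan: ", lowest['Name'].tolist())
```

Calling `lowest_sellers(df, 'JP_Sales')` on the loaded dataset would show how many games are tied at the bottom.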