### Video Game Sales Analysis

Data set: vgsales

Timothy Viccari June 2, 2021

In [2]:
import pandas as pd

In [19]:
df = pd.read_csv('./vgsales.csv')
df.head()

Unnamed: 0,Rank,Name,Platform,Year,Genre,Publisher,NA_Sales,EU_Sales,JP_Sales,Other_Sales,Global_Sales
0,1,Wii Sports,Wii,2006.0,Sports,Nintendo,41.49,29.02,3.77,8.46,82.74
1,2,Super Mario Bros.,NES,1985.0,Platform,Nintendo,29.08,3.58,6.81,0.77,40.24
2,3,Mario Kart Wii,Wii,2008.0,Racing,Nintendo,15.85,12.88,3.79,3.31,35.82
3,4,Wii Sports Resort,Wii,2009.0,Sports,Nintendo,15.75,11.01,3.28,2.96,33.0
4,5,Pokemon Red/Pokemon Blue,GB,1996.0,Role-Playing,Nintendo,11.27,8.89,10.22,1.0,31.37


### Most Common Publisher

In [157]:
most_common_publisher = df['Publisher'].mode()[0]
most_common_publisher

'Electronic Arts'

### Most Common Platform

In [159]:
most_common_platform = df['Platform'].mode()[0]
most_common_platform

'DS'

### Most Common Genre

In [161]:
most_common_genre = df['Genre'].mode()[0]
most_common_genre

'Action'
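
The three cells above take `mode()[0]`, which silently picks the first value when several are tied for the highest count. A minimal sketch of how ties surface, using a hypothetical toy frame (the `toy` data is an assumption for illustration, not taken from `vgsales.csv`):

```python
import pandas as pd

# Hypothetical toy data, not taken from vgsales.csv
toy = pd.DataFrame({"Genre": ["Action", "Sports", "Action", "Sports", "Racing"]})

# mode() returns every value tied for the highest count, in sorted order,
# so [0] alone would hide the fact that "Sports" is tied with "Action"
modes = toy["Genre"].mode()
print(modes.tolist())

# value_counts() exposes the counts behind the mode
counts = toy["Genre"].value_counts()
print(counts.to_dict())
```

Checking `len(modes)` or inspecting `value_counts()` before indexing with `[0]` is one way to confirm the most common value is unique.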

### Top 20 Highest Grossing Games

In [162]:
top_twenty_highest_grossing_games = df[['Name','Global_Sales']].sort_values('Global_Sales', ascending =False).head(20).set_index("Global_Sales")
top_twenty_highest_grossing_games

Global_Sales,Name
82.74,Wii Sports
40.24,Super Mario Bros.
35.82,Mario Kart Wii
33.0,Wii Sports Resort
31.37,Pokemon Red/Pokemon Blue
30.26,Tetris
30.01,New Super Mario Bros.
29.02,Wii Play
28.62,New Super Mario Bros. Wii
28.31,Duck Hunt
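
The cell above sorts by `Global_Sales` and then makes the sales figure the index. A possible alternative, sketched here on hypothetical toy data, is `DataFrame.nlargest`, which selects the top rows directly and keeps both columns as ordinary columns:

```python
import pandas as pd

# Hypothetical toy data, not taken from vgsales.csv
toy = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game C", "Game D"],
    "Global_Sales": [1.5, 9.0, 4.2, 7.1],
})

# nlargest(n, column) returns the n rows with the largest values, descending
top_two = (
    toy.nlargest(2, "Global_Sales")[["Name", "Global_Sales"]]
       .reset_index(drop=True)
)
print(top_two)
```

Keeping `Name` as a column with a plain positional index means lookups such as `top_two.iloc[0]["Name"]` behave the same way as in the notebook's test cell.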


### Median Sales in North America (in Millions)

In [172]:
na_median_sales = df['NA_Sales'].median()
# Show games whose North American sales equal the median (0.08)
df[df['NA_Sales'] == na_median_sales].head(10)

Unnamed: 0,Rank,Name,Platform,Year,Genre,Publisher,NA_Sales,EU_Sales,JP_Sales,Other_Sales,Global_Sales
446,447,Dragon Warrior IV,NES,1990.0,Role-Playing,Enix Corporation,0.08,0.0,3.03,0.01,3.12
497,498,World Soccer Winning Eleven 7 International,PS2,2003.0,Sports,Konami Digital Entertainment,0.08,1.24,1.13,0.45,2.9
1617,1619,Farming Simulator 2015,PC,2014.0,Simulation,Focus Home Interactive,0.08,1.02,0.0,0.13,1.23
1926,1928,Pro Evolution Soccer 2008,X360,2007.0,Sports,Konami Digital Entertainment,0.08,0.9,0.04,0.05,1.07
2067,2069,Winning Eleven: Pro Evolution Soccer 2007 (All...,X360,2006.0,Sports,Konami Digital Entertainment,0.08,0.9,0.02,0.0,1.0
2373,2375,Phantasy Star Portable 2,PSP,2009.0,Role-Playing,Sega,0.08,0.11,0.62,0.06,0.88
2579,2581,The Sims 2: Castaway,PSP,2007.0,Simulation,Electronic Arts,0.08,0.46,0.0,0.25,0.8
3186,3188,SingStar Queen,PS2,2009.0,Misc,Sony Computer Entertainment,0.08,0.12,0.0,0.44,0.63
3503,3505,Top Spin 3,PS3,2008.0,Action,Take-Two Interactive,0.08,0.37,0.0,0.12,0.57
3703,3705,Sonic & All-Stars Racing Transformed,PS3,2012.0,Racing,Sega,0.08,0.33,0.01,0.11,0.54
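
Filtering on exact float equality works here because the sales figures are stored with two decimals, but it can miss rows when a value is the result of arithmetic. A minimal sketch, assuming hypothetical toy data, that matches the median with a tolerance via `numpy.isclose`:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from vgsales.csv
toy = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game C", "Game D", "Game E"],
    "NA_Sales": [0.01, 0.08, 0.08, 0.30, 1.20],
})

median_sales = toy["NA_Sales"].median()

# np.isclose compares with a small tolerance instead of exact equality
at_median = toy[np.isclose(toy["NA_Sales"], median_sales)]
print(at_median)
```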


### Standard Deviations Between the Top-Selling Game and the Mean (North America)

In [91]:
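# Avoid shadowing the built-in max(); select the column as a Series
na_max = df['NA_Sales'].max()
na_stdev = df['NA_Sales'].std()
na_mean = df['NA_Sales'].mean()
# z-score: distance of the maximum from the mean, in standard deviations
answer = round((na_max - na_mean) / na_stdev, 2)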

print(f'The top selling game in North America is {answer} standard deviations from the mean')

The top selling game in North America is 50.48 standard deviations from the mean
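
The figure above is a z-score, (max - mean) / standard deviation. One detail worth knowing is that pandas' `std()` defaults to the sample standard deviation (`ddof=1`). A small sketch on a hypothetical series shows how the choice of `ddof` changes the result:

```python
import pandas as pd

# Hypothetical toy values, not taken from vgsales.csv
sales = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0])

gap = sales.max() - sales.mean()

# pandas default: sample standard deviation (divides by n - 1)
z_sample = gap / sales.std()

# population standard deviation (divides by n)
z_population = gap / sales.std(ddof=0)

print(round(z_sample, 2), round(z_population, 2))
```

With thousands of rows, as in the full data set, the two versions should differ only marginally; the gap is visible here because the toy series has five values.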


In [139]:
# Mean global sales per game across the whole data set
mean_sales = df['Global_Sales'].mean()
wii = df[df['Platform'] == 'Wii']
# Mean global sales per Wii game
wii_avg = wii['Global_Sales'].mean()
difference = ((wii_avg - mean_sales)*1000000).round(0)
percent = ((wii_avg - mean_sales)/mean_sales*100).round(0)
print(f'Wii sales crushed it. The average global revenue per Wii game is ${difference} more than the average across all games, a {percent} percent increase')


Wii sales crushed it. The average global revenue per Wii game is $161963.0 more than the average across all games, a 30.0 percent increase
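
The same comparison can be made for every platform at once. A minimal sketch, assuming hypothetical toy data, that computes each platform's mean with `groupby` and expresses it as a percent difference from the overall mean:

```python
import pandas as pd

# Hypothetical toy data, not taken from vgsales.csv
toy = pd.DataFrame({
    "Platform": ["Wii", "Wii", "DS", "DS", "PS2"],
    "Global_Sales": [4.0, 2.0, 1.0, 1.0, 2.0],
})

# Mean over all games, regardless of platform
overall_mean = toy["Global_Sales"].mean()

# Mean per platform
platform_means = toy.groupby("Platform")["Global_Sales"].mean()

# Percent difference of each platform's mean from the overall mean
pct_vs_overall = ((platform_means - overall_mean) / overall_mean * 100).round(1)
print(pct_vs_overall.sort_values(ascending=False))
```

Note that the baseline is the mean over all games, not the mean of the per-platform means; the two differ when platforms have different numbers of titles.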


### Global Sales trends through the years

In [150]:
# Select only the numeric column to sum; 'Platform' is text and is not part of the yearly total
df[['Year', 'Global_Sales']].groupby("Year").sum()


Year,Global_Sales
1980.0,11.38
1981.0,35.77
1982.0,28.86
1983.0,16.79
1984.0,50.36
1985.0,53.94
1986.0,37.07
1987.0,21.74
1988.0,47.22
1989.0,73.45
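
The years above print as floats (`1980.0`), which in pandas usually means the column contains missing values, though the notebook does not confirm this. A sketch on hypothetical toy data that drops rows without a year and converts the rest to integers before summing:

```python
import pandas as pd

# Hypothetical toy data, not taken from vgsales.csv; one row has no year
toy = pd.DataFrame({
    "Year": [1980.0, 1980.0, 1981.0, None],
    "Global_Sales": [1.0, 2.5, 4.0, 0.5],
})

yearly = (
    toy.dropna(subset=["Year"])        # discard rows with a missing year
       .astype({"Year": int})          # 1980.0 -> 1980
       .groupby("Year")["Global_Sales"]
       .sum()
)
print(yearly)
```

Dropping rows is only one option; whether games with an unknown year should be excluded from a trend analysis is a judgment call.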


In [175]:
def test():

    def assert_equal(actual,expected):
        assert actual == expected, f"Expected {expected} but got {actual}"

    assert_equal(most_common_publisher, "Electronic Arts")
    assert_equal(most_common_platform, "DS")
    assert_equal(most_common_genre, "Action")
    assert_equal(top_twenty_highest_grossing_games.iloc[0].Name, 'Wii Sports')
    assert_equal(top_twenty_highest_grossing_games.iloc[19].Name, 'Brain Age: Train Your Brain in Minutes a Day')
    assert_equal(na_median_sales, .08)
    assert_equal(difference, 161963.0)
    assert_equal(percent, 30.0)

    print("Success!!!")

test()

Success!!!
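
The test above compares floats such as `na_median_sales` with `==`, which passes here but can fail after small rounding differences. A hedged sketch of a tolerance-based helper; `assert_close` is a hypothetical name, not part of the original notebook:

```python
import math

def assert_close(actual, expected, rel_tol=1e-9, abs_tol=1e-9):
    """Assert that two floats are equal within a tolerance (hypothetical helper)."""
    assert math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol), (
        f"Expected {expected} but got {actual}"
    )

# 0.1 + 0.2 is not exactly 0.3 in binary floating point, but it is close
assert_close(0.1 + 0.2, 0.3)
print("Success!!!")
```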
