### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). There also exists, a smaller, but notable proportion of female players (14%).

* Our peak age demographic falls between 20-24 (44.8%) with secondary groups falling between 15-19 (18.60%) and 25-29 (13.4%).  
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [2]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [3]:
num_players = str(purchase_data['SN'].nunique())
print(num_players + " total players")

576 total players


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [4]:
unique_items = purchase_data['Item ID'].nunique()
avg_price = purchase_data['Price'].mean()
total_purchases = len(purchase_data)
total_revenue = purchase_data['Price'].sum()
total_purchasing_analysis = [[unique_items, avg_price, total_purchases, total_revenue]]
summary_df = pd.DataFrame(total_purchasing_analysis, columns = ['Number of Unique Items', 'Average Price', 'Number of Purchases', 'Total Revenue'])
summary_df

Unnamed: 0,Number of Unique Items,Average Price,Number of Purchases,Total Revenue
0,183,3.050987,780,2379.77


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [5]:
gender_data = purchase_data.groupby('Gender')
m_players = gender_data.get_group('Male')
num_m_players = m_players['SN'].nunique()
f_players = gender_data.get_group('Female')
num_f_players = f_players['SN'].nunique()
total_players = purchase_data['SN'].nunique()
num_o_players = int(total_players) - int(num_m_players) - int(num_f_players)

pct_m_players = (float(num_m_players) / float(total_players)) * 100
pct_f_players = (float(num_f_players) / float(total_players)) * 100
pct_o_players = (float(num_o_players) / float(total_players)) * 100

print('Count of Male Players: ' + str(num_m_players))
print('Percentage of Male Players: ' + str(pct_m_players))
print('Count of Female Players: ' + str(num_f_players))
print('Percentage of Female Players: ' + str(pct_f_players))
print('Count of Other / Non - Disclosed Players: ' + str(num_o_players))
print('Percentage of Other / Non - Disclosed Players: ' + str(pct_o_players))

Count of Male Players: 484
Percentage of Male Players: 84.02777777777779
Count of Female Players: 81
Percentage of Female Players: 14.0625
Count of Other / Non - Disclosed Players: 11
Percentage of Other / Non - Disclosed Players: 1.9097222222222223



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [87]:
f_purchase_count = len(f_players)
m_purchase_count = len(m_players)
o_purchase_count = len(purchase_data) - f_purchase_count - m_purchase_count
print('Female purchase count: ' + str(f_purchase_count))
print('Male purchase count: ' + str(m_purchase_count))
print('Other / Non - Disclosed purchase count: ' + str(o_purchase_count))

print('----------------------------------------------')

f_avg_price = f_players['Price'].mean()
m_avg_price = m_players['Price'].mean()

o_players = gender_data.get_group('Other / Non-Disclosed')
o_avg_price = o_players['Price'].mean()

print('Female average purchase price: ' + str(f_avg_price))
print('Male average purchase price: ' + str(m_avg_price))
print('Other / Non - Disclosed average purchase price: ' + str(o_avg_price))

print('----------------------------------------------')

f_total_value = f_players['Price'].sum()
m_total_value = m_players['Price'].sum()
o_total_value = o_players['Price'].sum()

print('Female total purchase value: ' + str(f_total_value))
print('Male total purchase value: ' + str(m_total_value))
print('Other / Non - Disclosed total purchase value: ' + str(o_total_value))

print('----------------------------------------------')

f_hold = f_players.groupby('SN').sum()
f_avg_price_per = f_hold['Price'].mean()
m_hold = m_players.groupby('SN').sum()
m_avg_price_per = m_hold['Price'].mean()
o_hold = o_players.groupby('SN').sum()
o_avg_price_per = o_hold['Price'].mean()

print('Female average total purchase per person: ' + str(f_avg_price_per))
print('Male average total purchase per person: ' + str(m_avg_price_per))
print('Other / Non - Disclosed average total purchase per person: ' + str(o_avg_price_per))

Female purchase count: 113
Male purchase count: 652
Other / Non - Disclosed purchase count: 15
----------------------------------------------
Female average purchase price: 3.203008849557519
Male average purchase price: 3.0178527607361953
Other / Non - Disclosed average purchase price: 3.3460000000000005
----------------------------------------------
Female total purchase value: 361.94
Male total purchase value: 1967.64
Other / Non - Disclosed total purchase value: 50.19
----------------------------------------------
Female average total purchase per person: 4.468395061728394
Male average total purchase per person: 4.06537190082645
Other / Non - Disclosed average total purchase per person: 4.5627272727272725
----------------------------------------------
Age Range
<10       24
10-14     15
15-19    107
20-24    258
25-29     77
30-34     52
34-39     31
40+       12
Name: SN, dtype: int64
Age Range
<10      0.041667
10-14    0.026042
15-19    0.185764
20-24    0.447917
25-29    0.13368

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [107]:
bins = [0, 10, 14, 19, 24, 29, 34, 39, 120]
group_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34",
                "34-39", "40+"]
purchase_data['Age Range'] = pd.cut(purchase_data['Age'], bins, labels=group_labels)
binned_data = purchase_data.groupby('Age Range')
num_age = binned_data['SN'].nunique()
pct_age = (binned_data['SN'].nunique() / total_players)*100
age_figures = pd.concat([num_age, pct_age], axis=1)
age_figures.columns = ('Number of players', 'Percentage of players')
age_figures

Unnamed: 0_level_0,Number of players,Percentage of players
Age Range,Unnamed: 1_level_1,Unnamed: 2_level_1
<10,24,4.166667
10-14,15,2.604167
15-19,107,18.576389
20-24,258,44.791667
25-29,77,13.368056
30-34,52,9.027778
34-39,31,5.381944
40+,12,2.083333


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

Unnamed: 0,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Total Purchase per Person
10-14,28,$2.96,$82.78,$3.76
15-19,136,$3.04,$412.89,$3.86
20-24,365,$3.05,"$1,114.06",$4.32
25-29,101,$2.90,$293.00,$3.81
30-34,73,$2.93,$214.00,$4.12
35-39,41,$3.60,$147.67,$4.76
40+,13,$2.94,$38.24,$3.19
<10,23,$3.35,$77.13,$4.54


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [6]:
SN_grouped = purchase_data.groupby('SN')
spenders =  SN_grouped['Purchase ID'].nunique()
spenders.nlargest(5)

spender_one = SN_grouped.get_group('Lisosia93')
print('Average purchase price for Lisosia93: ' + str(spender_one['Price'].mean()))
print('Total purchase price for Lisosia93: ' + str(spender_one['Price'].sum()))
print('----------------------------------------------')
spender_two = SN_grouped.get_group('Idastidru52')
print('Average purchase price for Idastidru52: ' + str(spender_two['Price'].mean()))
print('Total purchase price for Idastidru52: ' + str(spender_two['Price'].sum()))
print('----------------------------------------------')
spender_three = SN_grouped.get_group('Iral74')
print('Average purchase price for Iral74: ' + str(spender_three['Price'].mean()))
print('Total purchase price for Iral74: ' + str(spender_three['Price'].sum()))
print('----------------------------------------------')
spender_four = SN_grouped.get_group('Aelin32')
print('Average purchase price for Aelin32: ' + str(spender_four['Price'].mean()))
print('Total purchase price for Aelin32: ' + str(spender_four['Price'].sum()))
print('----------------------------------------------')
spender_five = SN_grouped.get_group('Aina42')
print('Average purchase price for Aina42: ' + str(spender_five['Price'].mean()))
print('Total purchase price for Aina42: ' + str(spender_five['Price'].sum()))

Average purchase price for Lisosia93: 3.7920000000000003
Total purchase price for Lisosia93: 18.96
----------------------------------------------
Average purchase price for Idastidru52: 3.8625
Total purchase price for Idastidru52: 15.45
----------------------------------------------
Average purchase price for Iral74: 3.4049999999999994
Total purchase price for Iral74: 13.619999999999997
----------------------------------------------
Average purchase price for Aelin32: 2.9933333333333336
Total purchase price for Aelin32: 8.98
----------------------------------------------
Average purchase price for Aina42: 3.073333333333333
Total purchase price for Aina42: 9.219999999999999


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [9]:
item_grouped = purchase_data.groupby(['Item ID', 'Item Name', 'Price'])
pop_items = item_grouped['Purchase ID'].count()
tpv = item_grouped['Price'].sum()
table_names = pop_items.nlargest(5)
print(table_names)
total_prices = tpv.where(['Item ID' == int(178), int(82), int(108), int(145), int(19)])
total_prices

Unnamed: 0_level_0,Unnamed: 1_level_0,Purchase Count,Item Price,Total Purchase Value
Item ID,Item Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
145,Fiery Glass Crusader,9,$4.58,$41.22
108,"Extraction, Quickblade Of Trembling Hands",9,$3.53,$31.77
82,Nirvana,9,$4.90,$44.10
19,"Pursuit, Cudgel of Necromancy",8,$1.02,$8.16


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [10]:
tpv.nlargest(5)

Unnamed: 0_level_0,Unnamed: 1_level_0,Purchase Count,Item Price,Total Purchase Value
Item ID,Item Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
82,Nirvana,9,$4.90,$44.10
145,Fiery Glass Crusader,9,$4.58,$41.22
92,Final Critic,8,$4.88,$39.04
103,Singed Scalpel,8,$4.35,$34.80
