# Heroes Of Pymoli Data Analysis
---
* Of the 576 total players *(Fig 1)*, the vast majority are male (84.03%, *Fig 3*). The game creators would still do well to attract more female players, who spend more per transaction than males do (\$3.20 vs. \$3.02 on average, *Fig 4*).
---
* Ages 20-24 outspend every other age group by a wide margin (\$1,114.06 in total), but ages 35-39 spend the most per transaction (\$3.60), followed by players under 10 (\$3.35; who's giving these kids that much money??) *(Fig 6)*.
---
* Taking some time with the Most Popular Items *(Fig 8)*, we see that the least popular items are the least expensive, while the most popular items are near the top in terms of cost. Popularity has not yet peaked and fallen off as price rises, so we should create more higher-end items until we start to see a noticeable decline in popularity.
-----

In [2]:
# load library
import pandas as pd

In [3]:
# Pull in the file and add it to a dataframe
file = "Resources/purchase_data.csv"
hopdf = pd.read_csv(file)

## Fig 1. Player Count

In [4]:
totp = hopdf["SN"].nunique()
print(f'Total Players: {totp}')

Total Players: 576


## Fig 2. Purchasing Analysis (Total)

In [14]:
# Sets up the purchasing analysis total table
pat = pd.DataFrame({'Number of Unique Items': [hopdf["Item Name"].nunique()],
                    'Average Purchase Price': [hopdf["Price"].mean()],
                    'Total Number of Purchases': [hopdf["Purchase ID"].count()],
                    'Total Revenue': [hopdf["Price"].sum()]})
# reformats the currency values
pat['Average Purchase Price'] = pat['Average Purchase Price'].map("${:.2f}".format)
pat['Total Revenue'] = pat['Total Revenue'].map("${:,.2f}".format)
pat

| | Number of Unique Items | Average Purchase Price | Total Number of Purchases | Total Revenue |
|---|---|---|---|---|
| 0 | 179 | $3.05 | 780 | $2,379.77 |


## Fig 3. Gender Demographics

In [6]:
# groups the recorded genders
gdgrp = hopdf.groupby(['Gender'])

# sets unique counts of Total Players in this grouping
guni = gdgrp["SN"].nunique()

# sets up the gender analysis table
gat = pd.DataFrame({'Total Count':guni,'Percentage of Players': guni/totp})

# sorts by player count while the values are still numeric
gat = gat.sort_values(by=['Total Count'],ascending=False)

# reformats percentage
gat['Percentage of Players'] = gat['Percentage of Players'].map("{:.2%}".format)
gat

| Gender | Total Count | Percentage of Players |
|---|---|---|
| Male | 484 | 84.03% |
| Female | 81 | 14.06% |
| Other / Non-Disclosed | 11 | 1.91% |
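
The percentage column in this table holds formatted strings, so sorting on it compares text rather than numbers. The sketch below uses hypothetical counts (not taken from `purchase_data.csv`) to show how the two orderings can diverge, which is why sorting before formatting is the safer habit.

```python
import pandas as pd

# Hypothetical counts, chosen only to expose the pitfall
counts = pd.Series({"A": 70, "B": 21, "C": 9}, name="Total Count")
share = counts / counts.sum()

table = pd.DataFrame({"Total Count": counts, "Percentage of Players": share})

# Sort while the column is still numeric, then format it for display
table = table.sort_values(by="Percentage of Players", ascending=False)
table["Percentage of Players"] = table["Percentage of Players"].map("{:.2%}".format)

# Sorting the formatted strings compares them character by character,
# so "9.00%" lands ahead of "70.00%"
as_text = table.sort_values(by="Percentage of Players", ascending=False)

print(list(table.index))    # numeric order
print(list(as_text.index))  # text order
```

With the real data the text ordering happens to match the numeric one, but that depends on the particular values.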


## Fig 4. Purchasing Analysis (Gender)

In [7]:
# calculate Purchase Count
pc =  gdgrp['Purchase ID'].count()

# calculate Average Purchase Price
app = gdgrp['Price'].mean()

# calculate Total Purchase Value
tpv = gdgrp['Price'].sum()

# calculate Avg Total Purchase Per Person
atpp = tpv / guni

# wrap it all up
pag = pd.DataFrame({'Purchase Count':pc, 'Average Purchase Price':app,'Total Purchase Value':tpv,
                     'Avg Total Purchase per Person':atpp})

# make it look nice
pag['Average Purchase Price'] = pag['Average Purchase Price'].map("${:.2f}".format)
pag['Total Purchase Value'] = pag['Total Purchase Value'].map("${:,.2f}".format)
pag['Avg Total Purchase per Person'] = pag['Avg Total Purchase per Person'].map("${:.2f}".format)
pag

| Gender | Purchase Count | Average Purchase Price | Total Purchase Value | Avg Total Purchase per Person |
|---|---|---|---|---|
| Female | 113 | $3.20 | $361.94 | $4.47 |
| Male | 652 | $3.02 | $1,967.64 | $4.07 |
| Other / Non-Disclosed | 15 | $3.35 | $50.19 | $4.56 |
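
The same four measures can also be computed in a single pass with named aggregation. This is a minimal sketch rather than the notebook's own method: the rows are hypothetical, the column names mirror `purchase_data.csv`, and the output column names are illustrative.

```python
import pandas as pd

# Hypothetical rows using the same column names as purchase_data.csv
purchases = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "SN": ["p1", "p1", "p2", "p3"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Price": [2.00, 4.00, 1.00, 3.00],
})

# One groupby, one agg call: each keyword becomes an output column
summary = purchases.groupby("Gender").agg(
    purchase_count=("Purchase ID", "count"),
    average_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Total spend divided by unique players, as in the table above
summary["avg_total_per_person"] = summary["total_value"] / summary["players"]
print(summary)
```

Keeping the unique player count in the same frame makes it harder to divide by a count taken from a differently ordered grouping.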


## Fig 5. Age Demographics

In [19]:
# define bins and their names
bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# cut up the data into the bins
hopdf["Age Ranges"] = pd.cut(hopdf["Age"], bins, labels=group_names)

# group the ranges
agedf = hopdf.groupby("Age Ranges")

# count the unique ids
auni = agedf["SN"].nunique()

# add the math
agedemo = pd.DataFrame({'Total Count': auni, 'Percent of Players': (auni/totp)})

# make it look nice
agedemo['Percent of Players'] = agedemo['Percent of Players'].map("{:.2%}".format)
agedemo

| Age Ranges | Total Count | Percent of Players |
|---|---|---|
| <10 | 17 | 2.95% |
| 10-14 | 22 | 3.82% |
| 15-19 | 107 | 18.58% |
| 20-24 | 258 | 44.79% |
| 25-29 | 77 | 13.37% |
| 30-34 | 52 | 9.03% |
| 35-39 | 31 | 5.38% |
| 40+ | 12 | 2.08% |
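
The fractional bin edges (9.90, 14.90, ...) work because `pd.cut` closes each bin on the right by default and ages are whole numbers. An alternative that some readers may find easier to follow is integer edges with `right=False`. The sketch below uses hypothetical ages and checks that both approaches assign the same labels.

```python
import pandas as pd

ages = pd.Series([7, 9, 10, 14, 15, 39, 40, 45])  # hypothetical ages
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Left-closed bins: [0, 10), [10, 15), ... so an age of 10 falls in "10-14"
edges = [0, 10, 15, 20, 25, 30, 35, 40, float("inf")]
by_integer_edges = pd.cut(ages, bins=edges, labels=labels, right=False)

# The notebook's fractional edges with the default right-closed bins
fractional = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
by_fractional_edges = pd.cut(ages, bins=fractional, labels=labels)

print(by_integer_edges.tolist())
print(by_fractional_edges.tolist())
```

One difference to keep in mind: an age of 0 falls inside `[0, 10)` but outside the right-closed `(0, 9.9]` bin, where it would become `NaN`.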


## Fig 6. Purchasing Analysis (Age)

In [20]:
# calculate Purchase Count
agepc =  agedf['Purchase ID'].count()

# calculate Average Purchase Price
ageapp = agedf['Price'].mean()

# calculate Total Purchase Value
agetpv = agedf['Price'].sum()

# calculate Avg Total Purchase Per Person
ageatpp = agetpv / auni

# wrap it all up
paa = pd.DataFrame({'Purchase Count':agepc, 'Average Purchase Price':ageapp,'Total Purchase Value':agetpv,
                     'Avg Total Purchase per Person':ageatpp})

# make it look nice
paa['Average Purchase Price'] = paa['Average Purchase Price'].map("${:.2f}".format)
paa['Total Purchase Value'] = paa['Total Purchase Value'].map("${:,.2f}".format)
paa['Avg Total Purchase per Person'] = paa['Avg Total Purchase per Person'].map("${:.2f}".format)
paa

| Age Ranges | Purchase Count | Average Purchase Price | Total Purchase Value | Avg Total Purchase per Person |
|---|---|---|---|---|
| <10 | 23 | $3.35 | $77.13 | $4.54 |
| 10-14 | 28 | $2.96 | $82.78 | $3.76 |
| 15-19 | 136 | $3.04 | $412.89 | $3.86 |
| 20-24 | 365 | $3.05 | $1,114.06 | $4.32 |
| 25-29 | 101 | $2.90 | $293.00 | $3.81 |
| 30-34 | 73 | $2.93 | $214.00 | $4.12 |
| 35-39 | 41 | $3.60 | $147.67 | $4.76 |
| 40+ | 13 | $2.94 | $38.24 | $3.19 |


## Fig 7. Top Spenders

In [28]:
# group spenders
sndf = hopdf.groupby('SN')

# calculate Purchase Count
tspc = sndf['Purchase ID'].count()

# calculate Average Purchase Price
tsapp = sndf['Price'].mean()

# calculate Total Purchase Value
tstpv = sndf['Price'].sum()

# wrap it all up
tsaa = pd.DataFrame({'Purchase Count':tspc, 'Average Purchase Price':tsapp,'Total Purchase Value':tstpv})

# make it look nice
tsaa['Average Purchase Price'] = tsaa['Average Purchase Price'].map("${:.2f}".format)
tsaa['Total Purchase Value'] = tsaa['Total Purchase Value'].map("${:,.2f}".format)

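# rank players by number of purchases; the value columns are formatted strings
# at this point, so they can no longer be sorted numerically
tsaa = tsaa.sort_values(by=['Purchase Count'],ascending=False)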
tsaa.head()

| SN | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| Lisosia93 | 5 | $3.79 | $18.96 |
| Iral74 | 4 | $3.40 | $13.62 |
| Idastidru52 | 4 | $3.86 | $15.45 |
| Asur53 | 3 | $2.48 | $7.44 |
| Inguron55 | 3 | $3.70 | $11.11 |
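
This table ranks players by purchase count, and a ranking by total spend can come out in a different order (here, Idastidru52 spent more than Iral74 on the same number of purchases). A minimal sketch of both rankings, using hypothetical players and prices and `nlargest` on columns that are still numeric:

```python
import pandas as pd

# Hypothetical purchases; column names mirror purchase_data.csv
purchases = pd.DataFrame({
    "Purchase ID": list(range(6)),
    "SN": ["a", "a", "a", "b", "b", "c"],
    "Price": [1.00, 1.00, 1.00, 4.00, 4.50, 2.00],
})

per_player = purchases.groupby("SN").agg(
    purchase_count=("Purchase ID", "count"),
    total_value=("Price", "sum"),
)

# The two rankings answer different questions
most_purchases = per_player.nlargest(1, "purchase_count")
biggest_spender = per_player.nlargest(1, "total_value")

print(most_purchases)
print(biggest_spender)
```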


## Fig 8. Most Popular Items

In [68]:
# group items
mpdf = hopdf.groupby(['Item ID','Item Name'])

# calculate item count
mpic = mpdf['Purchase ID'].count()

#calculate total purchase value
mptpv = mpdf['Price'].sum()

# average price paid per purchase of the item (total value / purchase count)
ip = mptpv/mpic

# wrap it all up
mpia = pd.DataFrame({'Purchase Count':mpic,'Item Price':ip,'Total Purchase Value':mptpv})

# saving this for the next table
mppia = mpia.copy()
# make it look nice
mpia['Item Price'] = mpia['Item Price'].map("${:,.2f}".format)
mpia['Total Purchase Value'] = mpia['Total Purchase Value'].map("${:,.2f}".format)

mpia = mpia.sort_values(by=['Purchase Count'],ascending=False)
mpia.head()

| Item ID | Item Name | Purchase Count | Item Price | Total Purchase Value |
|---|---|---|---|---|
| 92 | Final Critic | 13 | $4.61 | $59.99 |
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | $4.23 | $50.76 |
| 145 | Fiery Glass Crusader | 9 | $4.58 | $41.22 |
| 132 | Persuasion | 9 | $3.22 | $28.99 |
| 108 | Extraction, Quickblade Of Trembling Hands | 9 | $3.53 | $31.77 |
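
Because the item price here is total value divided by purchase count, it is an average. If an item was sold at more than one price, the figure shown is a blend of those prices. The sketch below, on hypothetical items and prices, shows one way to flag such items before treating the average as the list price.

```python
import pandas as pd

# Hypothetical purchases; "Sword" was sold at two different prices
purchases = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield"],
    "Price": [4.00, 4.00, 5.00, 3.00, 3.00],
})

# Count the distinct prices recorded for each item
prices_per_item = purchases.groupby(["Item ID", "Item Name"])["Price"].nunique()

# Items with more than one distinct price have a blended average
mixed_price_items = prices_per_item[prices_per_item > 1]
print(mixed_price_items)
```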


## Fig 9. Most Profitable Items

In [70]:
# rank by total purchase value while the column is still numeric
mppia = mppia.sort_values(by=['Total Purchase Value'],ascending=False)

# make it look nice
mppia['Item Price'] = mppia['Item Price'].map("${:,.2f}".format)
mppia['Total Purchase Value'] = mppia['Total Purchase Value'].map("${:,.2f}".format)

mppia.head()

| Item ID | Item Name | Purchase Count | Item Price | Total Purchase Value |
|---|---|---|---|---|
| 92 | Final Critic | 13 | $4.61 | $59.99 |
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | $4.23 | $50.76 |
| 82 | Nirvana | 9 | $4.90 | $44.10 |
| 145 | Fiery Glass Crusader | 9 | $4.58 | $41.22 |
| 103 | Singed Scalpel | 8 | $4.35 | $34.80 |
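
Formatting in place turns the numeric columns into strings, which is why the earlier cell had to save a copy for this table. One possible alternative is to keep the working frame numeric and format only a display copy. The helper name `format_currency` below is hypothetical, and the two rows reuse the Final Critic and Nirvana values shown above.

```python
import pandas as pd

def format_currency(frame, columns):
    """Return a display copy with the given columns rendered as dollar strings."""
    shown = frame.copy()
    for column in columns:
        shown[column] = shown[column].map("${:,.2f}".format)
    return shown

# Two rows taken from the table above, still numeric
items = pd.DataFrame(
    {"Item Price": [4.9, 4.614615], "Total Purchase Value": [44.1, 59.99]},
    index=["Nirvana", "Final Critic"],
)

# Sort the numeric frame, then format only what gets displayed
ranked = items.sort_values(by="Total Purchase Value", ascending=False)
display_table = format_currency(ranked.head(), ["Item Price", "Total Purchase Value"])
print(display_table)
```

The `ranked` frame stays numeric, so it can be re-sorted or aggregated later without redoing the groupby.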
