### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [3]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)


## Player Count

* Display the total number of players


In [4]:
tplay = purchase_data.SN.nunique()
tplay

576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
uniitem = purchase_data["Item Name"].nunique()
uniitem

179

In [6]:
avgprice = purchase_data.Price.mean()
avgprice

3.050987179487176

In [7]:
purchsum = pd.DataFrame({"Total_Players":[tplay], "Unique_Items":[uniitem], "Average_Price": [avgprice]})
purchsum["Average_Price"] = purchsum["Average_Price"].map("${:.2f}".format)
purchsum.head()

| | Total_Players | Unique_Items | Average_Price |
|---|---|---|---|
| 0 | 576 | 179 | $3.05 |
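
The "etc." in the instructions above could also cover the number of purchases and the total revenue. The following is only a sketch: `sample` is a made-up stand-in for `purchase_data` with the same column names, and the two extra column names (`Number_of_Purchases`, `Total_Revenue`) are assumptions, not part of the original summary.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows, same column names)
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "Item Name": ["Sword", "Sword", "Shield", "Potion"],
    "Price": [2.00, 2.00, 4.50, 1.25],
})

# One-row summary; the last two columns are illustrative additions
summary = pd.DataFrame({
    "Unique_Items": [sample["Item Name"].nunique()],
    "Average_Price": [sample["Price"].mean()],
    "Number_of_Purchases": [len(sample)],
    "Total_Revenue": [sample["Price"].sum()],
})

# Format the money columns on a copy so the numeric values stay usable
summary_display = summary.copy()
for col in ["Average_Price", "Total_Revenue"]:
    summary_display[col] = summary_display[col].map("${:,.2f}".format)
summary_display
```

Formatting a copy keeps `summary` numeric, which matters if later cells reuse those values.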


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [56]:
#Groupby gender, unique count of SN
gen_gb = purchase_data.groupby(["Gender"])
gen_count = gen_gb.nunique()
gen_count.head()

| Gender | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| Female | 113 | 81 | 22 | 1 | 90 | 90 | 79 |
| Male | 652 | 484 | 39 | 1 | 178 | 178 | 144 |
| Other / Non-Disclosed | 15 | 11 | 8 | 1 | 13 | 13 | 12 |


In [60]:
#Counts by Gender
malecount = gen_count.loc["Male", "SN"]
femalecount = gen_count.loc["Female", "SN"]
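othercount = gen_count.loc["Other / Non-Disclosed", "SN"]
#Sanity check: the per-gender counts should add up to the total player count
malecount + femalecount + othercount

576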

In [61]:
#Calc percentages
maleperc = 100*malecount/tplay
femaleperc = 100*femalecount/tplay
otherperc = 100*othercount/tplay
#Sanity check: the three percentages should sum to 100
maleperc + femaleperc + otherperc

100.0

In [111]:
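#Summary table
gendersum = pd.DataFrame({"Gender": ["Male", "Female", "Other / Non-Disclosed", "Total"],
                          "Unique Players": [malecount, femalecount, othercount, tplay],
                          "% of Players": [maleperc, femaleperc, otherperc, 100.0]})
#Format every percentage the same way instead of mixing floats with the string "100%"
gendersum["% of Players"] = gendersum["% of Players"].map("{:.2f}%".format)
gendersum.head()

| | Gender | Unique Players | % of Players |
|---|---|---|---|
| 0 | Male | 484 | 84.03% |
| 1 | Female | 81 | 14.06% |
| 2 | Other / Non-Disclosed | 11 | 1.91% |
| 3 | Total | 576 | 100.00% |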



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [71]:
#total volume of purchases by gender
gen_purch = gen_gb.count()
malepur = gen_purch.loc["Male", "Purchase ID"]
femalepur = gen_purch.loc["Female", "Purchase ID"]
otherpur = gen_purch.loc["Other / Non-Disclosed", "Purchase ID"]
otherpur

15

In [98]:
#Purchases per player by gender
malepurplay = malepur / malecount
femalepurplay = femalepur / femalecount
otherpurplay = otherpur / othercount


In [100]:
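#avg purchase price by gender
#numeric_only=True skips text columns (SN, Item Name), which raise a TypeError in pandas 2.x
gen_avgp = gen_gb.mean(numeric_only=True)
maleavgp = gen_avgp.loc["Male", "Price"]
femaleavgp = gen_avgp.loc["Female", "Price"]
otheravgp = gen_avgp.loc["Other / Non-Disclosed", "Price"]
gen_avgp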

| Gender | Purchase ID | Age | Item ID | Price |
|---|---|---|---|---|
| Female | 379.380531 | 21.345133 | 85.477876 | 3.203009 |
| Male | 392.516871 | 22.917178 | 93.095092 | 3.017853 |
| Other / Non-Disclosed | 334.6 | 24.2 | 80.8 | 3.346 |


In [101]:
#total spent by gender
#numeric_only=True avoids concatenating the text columns when summing
gen_pr = gen_gb.sum(numeric_only=True)
malespent = gen_pr.loc["Male", "Price"]
femalespent = gen_pr.loc["Female", "Price"]
otherspent = gen_pr.loc["Other / Non-Disclosed", "Price"]

In [102]:
#avg spent per player by gender
maleavgspent = malespent / malecount
femaleavgspent = femalespent / femalecount
otheravgspent = otherspent / othercount


In [108]:
#New summary table
gender_aggs = pd.DataFrame({"Gender": ["Male", "Female", "Other / Non-Disclosed"],
                            "Total Purchases": [malepur, femalepur, otherpur],
                            "Purchases per Player": [malepurplay, femalepurplay, otherpurplay],
                            "Avg Purchase Price": [maleavgp, femaleavgp, otheravgp],
                            "Total Spent": [malespent, femalespent, otherspent],
                            "Avg Spent per Player": [maleavgspent, femaleavgspent, otheravgspent]})
#Cleaner formatting: two decimals for the ratio, dollars for the money columns
gender_aggs["Purchases per Player"] = gender_aggs["Purchases per Player"].map("{:.2f}".format)
gender_aggs["Avg Purchase Price"] = gender_aggs["Avg Purchase Price"].map("${:.2f}".format)
gender_aggs["Total Spent"] = gender_aggs["Total Spent"].map("${:.2f}".format)
gender_aggs["Avg Spent per Player"] = gender_aggs["Avg Spent per Player"].map("${:.2f}".format)
gender_aggs.head()

| | Gender | Total Purchases | Purchases per Player | Avg Purchase Price | Total Spent | Avg Spent per Player |
|---|---|---|---|---|---|---|
| 0 | Male | 652 | 1.35 | $3.02 | $1967.64 | $4.07 |
| 1 | Female | 113 | 1.40 | $3.20 | $361.94 | $4.47 |
| 2 | Other / Non-Disclosed | 15 | 1.36 | $3.35 | $50.19 | $4.56 |
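
The per-gender figures above were assembled one variable at a time. As an alternative, the same kind of table could be built in a single `groupby(...).agg(...)` call using named aggregation. This is a sketch under stated assumptions: `sample` is a made-up stand-in for `purchase_data`, and the snake_case column names are illustrative, not the ones used in the notebook.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows, same column names)
sample = pd.DataFrame({
    "SN": ["Ann", "Ann", "Bob", "Cyd", "Dee"],
    "Gender": ["Female", "Female", "Male", "Male", "Male"],
    "Price": [1.00, 2.50, 3.00, 4.00, 2.00],
})

# Named aggregation: output_column=(input_column, function)
by_gender = sample.groupby("Gender").agg(
    total_purchases=("Price", "count"),
    avg_purchase_price=("Price", "mean"),
    total_spent=("Price", "sum"),
    unique_players=("SN", "nunique"),
)

# Per-player figures divide by unique players, not by purchase rows
by_gender["purchases_per_player"] = by_gender["total_purchases"] / by_gender["unique_players"]
by_gender["avg_spent_per_player"] = by_gender["total_spent"] / by_gender["unique_players"]
by_gender
```

Because every gender is handled by the same expression, a new category in the data would appear in the table without any extra code.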


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
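
The steps above can be sketched as follows. This is a minimal illustration, not the notebook's own solution: `sample` is a made-up stand-in for `purchase_data`, and the bin edges and labels are assumptions chosen to produce five-year groups such as 15-19 and 20-24.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows, same column names)
sample = pd.DataFrame({
    "SN": ["Ann", "Ann", "Bob", "Cyd", "Dee", "Eli"],
    "Age": [8, 8, 17, 22, 23, 41],
    "Price": [1.00, 2.50, 3.00, 4.00, 2.00, 1.50],
})

# Assumed bin edges; pd.cut treats the right edge as inclusive by default
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Count each player once, even if they made several purchases
players = sample.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins=bins, labels=labels)

# sort_index() orders the rows by bin rather than by frequency
age_counts = players["Age Group"].value_counts().sort_index()
age_demo = pd.DataFrame({
    "Total Count": age_counts,
    "Percentage of Players": (100 * age_counts / len(players)).round(2),
})
age_demo
```

Dropping duplicate `SN` values first matters here: without it, players with several purchases would be counted more than once and the percentages would describe purchases rather than players.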


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
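
One possible way to carry out these steps is shown below. It is a sketch only: `sample` is a made-up stand-in for `purchase_data`, the bin edges and labels are assumptions, and the output column names are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows, same column names)
sample = pd.DataFrame({
    "SN": ["Ann", "Ann", "Bob", "Cyd", "Dee", "Eli"],
    "Age": [8, 8, 17, 22, 23, 41],
    "Price": [1.00, 2.50, 3.00, 4.00, 2.00, 1.50],
})

# Assumed bin edges and labels (right edge inclusive)
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Bin every purchase row by the buyer's age
sample["Age Group"] = pd.cut(sample["Age"], bins=bins, labels=labels)

# observed=True drops age groups that have no purchases
age_purch = sample.groupby("Age Group", observed=True).agg(
    purchase_count=("Price", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)
age_purch["avg_total_per_person"] = age_purch["total_value"] / age_purch["players"]

# Cleaner formatting on a copy, so age_purch stays numeric
age_purch_display = age_purch.copy()
for col in ["avg_price", "total_value", "avg_total_per_person"]:
    age_purch_display[col] = age_purch_display[col].map("${:,.2f}".format)
age_purch_display
```

Unlike the demographics table, this one keeps every purchase row, because purchase counts and totals are per transaction; only the per-person average divides by unique players.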

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
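
A minimal sketch of the top-spenders table, assuming a made-up `sample` frame in place of `purchase_data`; the output column names are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows, same column names)
sample = pd.DataFrame({
    "SN": ["Ann", "Ann", "Bob", "Cyd", "Cyd", "Cyd"],
    "Price": [1.00, 2.50, 3.00, 4.00, 2.00, 1.50],
})

# One row per player: purchase count, average price, total purchase value
spenders = sample.groupby("SN")["Price"].agg(
    purchase_count="count",
    avg_price="mean",
    total_value="sum",
)

# Sort while the values are still numeric, then keep the top five
top = spenders.sort_values("total_value", ascending=False).head(5)

# Apply dollar formatting last, on a copy
top_display = top.copy()
for col in ["avg_price", "total_value"]:
    top_display[col] = top_display[col].map("${:,.2f}".format)
top_display
```

Sorting before formatting is deliberate: once the column holds strings such as `"$10.00"`, sorting would compare text and could rank `"$9.00"` above it.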



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
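
The steps above might look like the following. This is a sketch: `sample` and its item names are made up, the output column names are illustrative, and taking the `"first"` price per group assumes each item is always sold at the same price.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows, same column names)
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Potion"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

# Keep only the columns the item analysis needs
items = sample[["Item ID", "Item Name", "Price"]]

# Assumption: one fixed price per item, so "first" recovers the item price
item_summary = items.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "first"),
    total_value=("Price", "sum"),
)

# Most popular = most purchases
popular = item_summary.sort_values("purchase_count", ascending=False)
popular.head()
```

If an item's price varies between purchases in the real data, `"mean"` would be a safer choice than `"first"` for the `item_price` column.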



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
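
Re-sorting the item table by total purchase value could be sketched like this. The `item_summary` frame below is a made-up example with illustrative column names, standing in for whatever table the previous section produces.

```python
import pandas as pd

# Hypothetical item summary (made-up values), indexed by Item ID and Item Name
item_summary = pd.DataFrame(
    {
        "purchase_count": [3, 2, 1],
        "item_price": [2.00, 4.50, 1.25],
        "total_value": [6.00, 9.00, 1.25],
    },
    index=pd.MultiIndex.from_tuples(
        [(1, "Sword"), (2, "Shield"), (3, "Potion")],
        names=["Item ID", "Item Name"],
    ),
)

# Most profitable = highest total purchase value
profitable = item_summary.sort_values("total_value", ascending=False)

# Format the money columns on a copy, after sorting
profitable_display = profitable.copy()
for col in ["item_price", "total_value"]:
    profitable_display[col] = profitable_display[col].map("${:,.2f}".format)
profitable_display.head()
```

In this made-up data the most purchased item (Sword) is not the most profitable one (Shield), which is why the two rankings are worth reporting separately.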

