### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).

-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [2]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [3]:
groupeddata = purchase_data.groupby(["SN"])
playercount = len(groupeddata)
print(f"There are {playercount} unique players.")

There are 576 unique players.


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [4]:
# Summary metrics across all purchases
uniqueitems = purchase_data["Item ID"].nunique()
averageprice = purchase_data["Price"].mean()
numpurchases = purchase_data["Purchase ID"].nunique()
totalrevenue = purchase_data["Price"].sum()
PurchaseSum = pd.DataFrame({
    "Total Unique Items" :[uniqueitems],
    "Average Purchase Price" :[round(averageprice,2)],
    "Total Number of Purchases" :[numpurchases],
    "Total Revenue" :[totalrevenue]
})
PurchaseSum


|   | Total Unique Items | Average Purchase Price | Total Number of Purchases | Total Revenue |
|---|---|---|---|---|
| 0 | 183 | 3.05 | 780 | 2379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [5]:
genderdata = groupeddata['Gender'].max().value_counts()

malecount = genderdata['Male']
malepercent = (malecount/playercount)*100

femcount = genderdata['Female']
fempercent = (femcount/playercount)*100

othcount = genderdata['Other / Non-Disclosed']
othpercent = (othcount/playercount)*100

genderdf = pd.DataFrame({
    "Gender":['Male','Female','Other/Non-Disclosed'],
    "Max" :[malecount,femcount,othcount],
    "Percent" :[malepercent,fempercent,othpercent],
})

genderdf = genderdf.set_index('Gender')
genderdf = genderdf.style.format({'Percent': '{:,.2f}%'})
genderdf

| Gender | Max | Percent |
|---|---|---|
| Male | 484 | 84.03% |
| Female | 81 | 14.06% |
| Other/Non-Disclosed | 11 | 1.91% |
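
The per-gender counts and percentages can also be produced in one pass with `value_counts`. The following is a minimal sketch on a small made-up data frame (the rows and the names `purchases`, `players`, and `gender_df` are illustrative assumptions, not part of the notebook); it reuses the `SN` and `Gender` column names from `purchase_data`.

```python
import pandas as pd

# Hypothetical purchase rows; only the column names match purchase_data
purchases = pd.DataFrame({
    "SN": ["Ann", "Bob", "Bob", "Cy", "Dee"],
    "Gender": ["Female", "Male", "Male", "Male", "Male"],
})

# Keep one row per player so repeat buyers are counted once
players = purchases.drop_duplicates("SN")

# Absolute counts and shares (as percentages) per gender
counts = players["Gender"].value_counts()
percents = players["Gender"].value_counts(normalize=True) * 100

gender_df = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percents.round(2),
})
print(gender_df)
```

Because the percentages come from `normalize=True`, no per-gender variables or a separate player count are needed.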



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [6]:
maleonlydf = purchase_data[purchase_data['Gender']=='Male']
femonlydf = purchase_data[purchase_data['Gender']=='Female']
otheronlydf = purchase_data[purchase_data['Gender']=='Other / Non-Disclosed']

#male demographics
malepurchasecount = maleonlydf['Purchase ID'].count()
maleavgprice = maleonlydf['Price'].mean()
maletotalrev = maleonlydf['Price'].sum()
maleperperson = (maletotalrev/malecount)

#female demographics
femepurchasecount = femonlydf['Purchase ID'].count()
femavgprice = femonlydf['Price'].mean()
femtotalrev = femonlydf['Price'].sum()
femperperson = (femtotalrev/femcount)

#other demographics
otherpurchasecount = otheronlydf['Purchase ID'].count()
otheravgprice = otheronlydf['Price'].mean()
othertotalrev = otheronlydf['Price'].sum()
otherperperson = (othertotalrev/othcount)


genderdemodf = pd.DataFrame({
    "Gender":['Male','Female','Other/Non-Disclosed'],
    "Total Purchases":[malepurchasecount,femepurchasecount,otherpurchasecount],
    "Average Price":[maleavgprice,femavgprice,otheravgprice],
    "Total Revenue":[maletotalrev,femtotalrev,othertotalrev],
    "Avg Price/Gender":[maleperperson,femperperson,otherperperson]

 })
genderdemodf = genderdemodf.set_index('Gender')
genderdemodf = genderdemodf.style.format({'Average Price': '${:,.2f}','Total Revenue': '${:,.2f}','Avg Price/Gender': '${:,.2f}'})
genderdemodf


| Gender | Total Purchases | Average Price | Total Revenue | Avg Price/Gender |
|---|---|---|---|---|
| Male | 652 | $3.02 | $1,967.64 | $4.07 |
| Female | 113 | $3.20 | $361.94 | $4.47 |
| Other/Non-Disclosed | 15 | $3.35 | $50.19 | $4.56 |
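
The three filtered data frames can be replaced by a single `groupby` on `Gender`. This is a sketch under stated assumptions: the sample rows and the names `purchases`, `by_gender`, and `gender_summary` are hypothetical, and the last column is labelled as a per-person total because it divides revenue by the number of unique players rather than by the number of purchases.

```python
import pandas as pd

# Hypothetical purchase rows with the same column names as purchase_data
purchases = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "SN": ["Ann", "Bob", "Bob", "Cy"],
    "Gender": ["Female", "Male", "Male", "Male"],
    "Price": [2.00, 1.00, 3.00, 4.00],
})

by_gender = purchases.groupby("Gender")

gender_summary = pd.DataFrame({
    "Total Purchases": by_gender["Purchase ID"].count(),
    "Average Price": by_gender["Price"].mean(),
    "Total Revenue": by_gender["Price"].sum(),
})

# Revenue divided by unique players in each gender, not by purchases
gender_summary["Avg Total per Person"] = (
    gender_summary["Total Revenue"] / by_gender["SN"].nunique()
)
print(gender_summary)
```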


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [7]:
# One row per player (SN) with that player's age
datatobin = purchase_data[['SN', 'Age']].drop_duplicates('SN', keep='first')

# Bin edges: 0, 10, 15, ..., 40, 200
# pd.cut closes each interval on the right by default, so the '<10' bin is (0, 10]
# and includes age 10; pass right=False to get [10, 15)-style bins instead
binrange = [10 + (5*i) for i in range(7)]
bins = [0]+binrange+[200]
labels = ['<10','10-14','15-19','20-24','25-29','30-34','35-39','+40']
whatever = pd.cut(datatobin['Age'], bins = bins, labels = labels)

# Attach the bins to a copy so purchase_data itself is not modified;
# rows for a player's repeat purchases get NaN and drop out of the counts
binnedpd = purchase_data.copy()
binnedpd['whatever'] = whatever

# Count players per age bin and convert to a percentage of all players
groupedbin = binnedpd.groupby(["whatever"])
binnedcount = groupedbin['SN'].count()
agepercent = (binnedcount / playercount)*100

agedf1 = pd.DataFrame({"Total Count":binnedcount,
                      "Percentage of Players":agepercent})
agedf1["Percentage of Players"] = agedf1["Percentage of Players"].map("{:.2f}%".format)
agedf1


| whatever | Total Count | Percentage of Players |
|---|---|---|
| <10 | 24 | 4.17% |
| 10-14 | 41 | 7.12% |
| 15-19 | 150 | 26.04% |
| 20-24 | 232 | 40.28% |
| 25-29 | 59 | 10.24% |
| 30-34 | 37 | 6.42% |
| 35-39 | 26 | 4.51% |
| +40 | 7 | 1.22% |
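
How `pd.cut` treats values that sit exactly on a bin edge affects which label a player receives. By default the intervals are closed on the right, so with the edges used here an age of exactly 10 is labelled `<10`; passing `right=False` closes them on the left instead. The sketch below compares the two on a handful of made-up ages (the `ages` series and the comparison frame are illustrative assumptions; the edges and labels are the ones from the cell above).

```python
import pandas as pd

# Hypothetical ages chosen to sit on or next to bin edges
ages = pd.Series([9, 10, 14, 15, 39, 40])

bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]
labels = ['<10', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '+40']

# Default (right=True): intervals are (0, 10], (10, 15], ...
right_closed = pd.cut(ages, bins=bins, labels=labels)

# right=False: intervals are [0, 10), [10, 15), ...
left_closed = pd.cut(ages, bins=bins, labels=labels, right=False)

comparison = pd.DataFrame({
    'Age': ages,
    'right=True': right_closed,
    'right=False': left_closed,
})
print(comparison)
```

With `right=False`, ages 10, 15, and 40 fall in `10-14`, `15-19`, and `+40`, which is how the labels read.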


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [11]:
# Bin every purchase row by the buyer's age (same edges and labels as the Age Demographics cell)
binrange = [10 + (5*i) for i in range(7)]
bins = [0]+binrange+[200]
labels = ['<10','10-14','15-19','20-24','25-29','30-34','35-39','+40']
purchase_data['PurchaseAnalysis'] = pd.cut(purchase_data['Age'], bins = bins, labels = labels)

# One data frame per age group

lessthan10df = purchase_data[purchase_data['PurchaseAnalysis']=='<10']
tentofourteendf = purchase_data[purchase_data['PurchaseAnalysis']=='10-14']
fifteentonineteendf = purchase_data[purchase_data['PurchaseAnalysis']=='15-19']
twentytotwentyfourdf = purchase_data[purchase_data['PurchaseAnalysis']=='20-24']
twentyfivetotwentyninedf = purchase_data[purchase_data['PurchaseAnalysis']=='25-29']
thirtytothirtyfourdf = purchase_data[purchase_data['PurchaseAnalysis']=='30-34']
thirtyfivetothirtyninedf = purchase_data[purchase_data['PurchaseAnalysis']=='35-39']
overfortydf = purchase_data[purchase_data['PurchaseAnalysis']=='+40']


# otherpurchasecount = otheronlydf['Purchase ID'].count()
# otheravgprice = otheronlydf['Price'].mean()
# othertotalrev = otheronlydf['Price'].sum()
# otherperperson = (othertotalrev/othcount)
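
The per-age-group metrics listed above (purchase count, average price, total value, per-person total) can be computed without one data frame per bin by grouping on the binned column. This is a minimal sketch on made-up rows, not the notebook's result: `purchases`, `Age Group`, and `age_summary` are hypothetical names, and the bins use the default right-closed edges as in the cells above.

```python
import pandas as pd

# Hypothetical purchase rows with the same column names as purchase_data
purchases = pd.DataFrame({
    'Purchase ID': [0, 1, 2, 3, 4],
    'SN': ['Ann', 'Ann', 'Bob', 'Cy', 'Dee'],
    'Age': [22, 22, 23, 17, 41],
    'Price': [1.00, 3.00, 2.00, 4.00, 5.00],
})

bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]
labels = ['<10', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '+40']
purchases['Age Group'] = pd.cut(purchases['Age'], bins=bins, labels=labels)

# observed=True keeps only the age groups that actually occur in the data
grouped = purchases.groupby('Age Group', observed=True)

age_summary = pd.DataFrame({
    'Purchase Count': grouped['Purchase ID'].count(),
    'Average Purchase Price': grouped['Price'].mean(),
    'Total Purchase Value': grouped['Price'].sum(),
})

# Divide by unique players, not purchases, to get the per-person total
age_summary['Avg Total Purchase per Person'] = (
    age_summary['Total Purchase Value'] / grouped['SN'].nunique()
)
print(age_summary)
```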


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
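
One possible way to carry out the steps above is to group by `SN`, aggregate `Price`, and sort by the total. The sketch below runs on a small made-up data frame; the rows and the names `purchases`, `top_spenders`, and `formatted` are assumptions for illustration, and the column titles are a guess at the intended table.

```python
import pandas as pd

# Hypothetical purchase rows; only the column names match purchase_data
purchases = pd.DataFrame({
    'SN': ['Ann', 'Bob', 'Ann', 'Cy', 'Bob', 'Ann'],
    'Price': [1.50, 4.00, 2.50, 3.00, 1.00, 2.00],
})

# Count, mean, and sum of Price per player, sorted by total spent
top_spenders = (
    purchases.groupby('SN')['Price']
    .agg(['count', 'mean', 'sum'])
    .rename(columns={
        'count': 'Purchase Count',
        'mean': 'Average Purchase Price',
        'sum': 'Total Purchase Value',
    })
    .sort_values('Total Purchase Value', ascending=False)
)

# Format a copy for display so the numeric frame stays usable for further math
formatted = top_spenders.copy()
for col in ['Average Purchase Price', 'Total Purchase Value']:
    formatted[col] = formatted[col].map('${:,.2f}'.format)

print(formatted.head())
```

Sorting before formatting matters: once the values are strings, `sort_values` would order them alphabetically rather than numerically.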



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
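
The steps above can be sketched as a two-key `groupby` followed by a sort on purchase count. The data here is made up, and the sketch assumes each item has a single price, so the first price seen in a group stands in for the item price; `items`, `item_summary`, and `most_popular` are hypothetical names.

```python
import pandas as pd

# Hypothetical purchase rows; only the column names match purchase_data
items = pd.DataFrame({
    'Item ID': [1, 2, 1, 3, 1, 2],
    'Item Name': ['Sword', 'Bow', 'Sword', 'Axe', 'Sword', 'Bow'],
    'Price': [2.00, 3.50, 2.00, 4.00, 2.00, 3.50],
})

# Purchase count, item price, and total value per (Item ID, Item Name)
# Assumption: an item's price never changes, so 'first' is its price
item_summary = (
    items.groupby(['Item ID', 'Item Name'])['Price']
    .agg(['count', 'first', 'sum'])
    .rename(columns={
        'count': 'Purchase Count',
        'first': 'Item Price',
        'sum': 'Total Purchase Value',
    })
)

most_popular = item_summary.sort_values('Purchase Count', ascending=False)
print(most_popular.head())
```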



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
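
Re-sorting an item summary by total purchase value gives the most profitable items. The sketch below builds a small stand-in summary by hand (the values and the names `item_summary`, `most_profitable`, and `display_df` are made up for illustration) and then sorts and formats it.

```python
import pandas as pd

# Hypothetical item summary indexed by (Item ID, Item Name)
item_summary = pd.DataFrame(
    {
        'Purchase Count': [3, 2, 1],
        'Item Price': [2.00, 3.50, 4.00],
        'Total Purchase Value': [6.00, 7.00, 4.00],
    },
    index=pd.MultiIndex.from_tuples(
        [(1, 'Sword'), (2, 'Bow'), (3, 'Axe')],
        names=['Item ID', 'Item Name'],
    ),
)

# Highest total purchase value first
most_profitable = item_summary.sort_values('Total Purchase Value', ascending=False)

# Format a copy as currency for display only
display_df = most_profitable.copy()
for col in ['Item Price', 'Total Purchase Value']:
    display_df[col] = display_df[col].map('${:,.2f}'.format)

print(display_df.head())
```

In this toy data the most purchased item (Sword) is not the most profitable one (Bow), which is why the two rankings are worth reporting separately.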

