### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic falls between 20-24 (40%), with secondary groups falling between 15-19 (26%) and 25-29 (10%).

-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [242]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [166]:
# Count unique players by screen name (SN)
player_group_cnt = purchase_data['SN'].nunique()
player_countdf = pd.DataFrame({"Player Count": [player_group_cnt]})
player_countdf

Unnamed: 0,Player Count
0,576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [243]:
# Summary statistics across all purchases
summary_df = pd.DataFrame({'Num of Unique Items':[purchase_data['Item ID'].nunique()],
                           'Avg Price':[purchase_data['Price'].mean()],
                           'Num of Purchases': [purchase_data['Purchase ID'].count()], 
                           'Total Revenue':[purchase_data['Price'].sum()]
                          })
summary_df 

Unnamed: 0,Num of Unique Items,Avg Price,Num of Purchases,Total Revenue
0,183,3.050987,780,2379.77
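
The optional formatting step could look like the sketch below. It rebuilds the summary row by hand from the values shown above, and the `formatted` name is an assumption for illustration. Formatting converts numbers to strings, so it is applied to a copy that is used only for display.

```python
import pandas as pd

# Summary row rebuilt from the values shown in the table above
summary = pd.DataFrame({
    "Num of Unique Items": [183],
    "Avg Price": [3.050987],
    "Num of Purchases": [780],
    "Total Revenue": [2379.77],
})

# Format on a copy: the formatted columns become strings
# and can no longer be used in arithmetic
formatted = summary.copy()
for col in ["Avg Price", "Total Revenue"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)

print(formatted)
```

Keeping the numeric `summary` untouched means later cells can still compute with it.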


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [248]:
# Count unique players (screen names) per gender so repeat buyers count once
Gender_Demo_df = purchase_data.groupby('Gender')['SN'].nunique().to_frame('Cnt')

# Total number of unique players (576 in this dataset)
total = purchase_data['SN'].nunique()

# Share of players in each gender group, rounded to two decimals
Gender_Demo_df["Percentage of Players"] = (Gender_Demo_df["Cnt"] / total).round(2)
Gender_Demo_df

Gender,Cnt,Percentage of Players
Female,81,0.14
Male,484,0.84
Other / Non-Disclosed,11,0.02
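
As an alternative sketch of the same count-and-percentage calculation, the example below deduplicates players first and then uses `value_counts`. The data and the names `purchases`, `players`, and `demo` are hypothetical and exist only for illustration.

```python
import pandas as pd

# Toy purchase log (hypothetical data): "ann" bought twice but is one player
purchases = pd.DataFrame({
    "SN": ["ann", "ann", "bob", "cat", "dan"],
    "Gender": ["Female", "Female", "Male", "Male", "Male"],
})

# Keep one row per player so repeat buyers are not counted more than once
players = purchases.drop_duplicates(subset="SN")

# Absolute counts and shares per gender
counts = players["Gender"].value_counts()
shares = players["Gender"].value_counts(normalize=True).round(2)

demo = pd.DataFrame({"Cnt": counts, "Percentage of Players": shares})
print(demo)
```

Note that `value_counts` sorts by frequency, so the row order may differ from a `groupby` result, which sorts by group label.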



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [249]:
TotalPurchase = purchase_data.groupby(['Gender'])['Price'].agg('sum')
AvgPurchasePrice = purchase_data.groupby(['Gender'])['Price'].agg('mean')
PurchaseCnt = purchase_data.groupby(['Gender'])['Price'].agg('count')
# Unique players per gender, needed for the per-person average
PlayerCnt = purchase_data.groupby(['Gender'])['SN'].nunique()

ByGenderSummardf =  pd.DataFrame(PurchaseCnt)
ByGenderSummardf["Average Purchase Price"] = AvgPurchasePrice.round(2)
ByGenderSummardf["Total Purchase Value"] = TotalPurchase.round(2)
# Divide by players, not purchases; dividing by purchases would repeat the average price
ByGenderSummardf["Avg Total Purchase per Person"] = (TotalPurchase / PlayerCnt).round(2)
ByGenderSummardf = ByGenderSummardf.rename(columns={"Price": "Purchase Count"})
ByGenderSummardf

Gender,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Total Purchase per Person
Female,113,3.2,361.94,4.47
Male,652,3.02,1967.64,4.07
Other / Non-Disclosed,15,3.35,50.19,4.56


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [250]:
# Age bins. pd.cut uses right-closed intervals by default,
# so for example ages in (10, 15] receive the label "10-14"
bins=[0,10,15,20,25,30,35,40,50]
group_labels =["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
purchase_data["Age Bin"] = pd.cut(purchase_data["Age"], bins, labels=group_labels)

# Total number of unique players (576 in this dataset)
totalPlayer = purchase_data['SN'].nunique()

# Count unique players in each age bin so repeat buyers count once
age_group_df = purchase_data.groupby('Age Bin', observed=False)['SN'].nunique().to_frame('Total Count')

# Share of players in each age bin, rounded to two decimals
PercentageOfPlayers = age_group_df['Total Count']/totalPlayer
age_group_df["Percentage of Players"] = PercentageOfPlayers.round(2)
age_group_df

Age Bin,Total Count,Percentage of Players
<10,24,0.04
10-14,41,0.07
15-19,150,0.26
20-24,232,0.4
25-29,59,0.1
30-34,37,0.06
35-39,26,0.05
40+,7,0.01
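
The cell above calls `pd.cut` with its default `right=True`, which makes each interval closed on the right. The sketch below, using hypothetical ages and an arbitrary upper edge of 200, shows how boundary ages are labeled under the default and under `right=False`. It is meant as an illustration of the option, not as the required binning for this analysis.

```python
import pandas as pd

ages = pd.Series([9, 10, 14, 15, 24, 25, 40])  # hypothetical ages

bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]  # upper edge 200 is an arbitrary cap
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Default (right=True): intervals are (0, 10], (10, 15], ...
# so age 10 is labeled "<10"
right_closed = pd.cut(ages, bins, labels=labels)

# right=False: intervals are [0, 10), [10, 15), ...
# so age 10 is labeled "10-14"
left_closed = pd.cut(ages, bins, labels=labels, right=False)

print(pd.DataFrame({"age": ages,
                    "right_closed": right_closed,
                    "left_closed": left_closed}))
```

With labels such as "10-14", the left-closed variant is the one whose intervals match the label text.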


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [251]:
AgeBinTotalPurchase = purchase_data.groupby(['Age Bin'])['Price'].agg('sum')
AgeBinAvgPurchasePrice = purchase_data.groupby(['Age Bin'])['Price'].agg('mean')
AgeBinPurchaseCnt = purchase_data.groupby(['Age Bin'])['Price'].agg('count')
# Unique players per age bin, needed for the per-person average
AgeBinPlayerCnt = purchase_data.groupby(['Age Bin'])['SN'].nunique()

ByAgeBinSummarydf =  pd.DataFrame(AgeBinPurchaseCnt)
ByAgeBinSummarydf["Average Purchase Price"] = AgeBinAvgPurchasePrice.round(2)
ByAgeBinSummarydf["Total Purchase Value"] = AgeBinTotalPurchase.round(2)
# Divide by players, not purchases; dividing by purchases would repeat the average price
ByAgeBinSummarydf["Avg Total Purchase per Person"] = (AgeBinTotalPurchase / AgeBinPlayerCnt).round(2)
ByAgeBinSummarydf = ByAgeBinSummarydf.rename(columns={"Price": "Purchase Count"})
ByAgeBinSummarydf

Age Bin,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Total Purchase per Person
<10,32,3.4,108.96,4.54
10-14,54,2.9,156.6,3.82
15-19,200,3.11,621.56,4.14
20-24,325,3.02,981.64,4.23
25-29,77,2.88,221.42,3.75
30-34,52,2.99,155.71,4.21
35-39,33,3.4,112.35,4.32
40+,7,3.08,21.53,3.08


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [252]:
#Group by name
NameTotalPurchase = purchase_data.groupby(['SN'])['Price'].agg('sum')
NameAvgPurchasePrice = purchase_data.groupby(['SN'])['Price'].agg('mean')
NamePurchaseCnt = purchase_data.groupby(['SN'])['Price'].agg('count')

#cast to a dataframe and get the top 5
NameSummarydf =  pd.DataFrame(NamePurchaseCnt)
NameSummarydf["Average Purchase Price"] = NameAvgPurchasePrice.round(2)
NameSummarydf["Total Purchase Value"] = NameTotalPurchase.round(2)
NameSummarydf = NameSummarydf.rename(columns={"Price": "Purchase Count"})
Newdf = NameSummarydf.sort_values("Total Purchase Value", ascending=False)
Newdf.head(5)


SN,Purchase Count,Average Purchase Price,Total Purchase Value
Lisosia93,5,3.79,18.96
Idastidru52,4,3.86,15.45
Chamjask73,3,4.61,13.83
Iral74,4,3.4,13.62
Iskadarya95,3,4.37,13.1
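
The three separate `groupby` calls in the cell above can also be collapsed into one pass with named aggregation. The following is a minimal sketch on hypothetical data; `purchases` and `spenders` are illustrative names.

```python
import pandas as pd

# Toy purchase log (hypothetical data)
purchases = pd.DataFrame({
    "SN": ["ann", "ann", "bob", "cat", "cat", "cat"],
    "Price": [1.00, 3.00, 2.50, 1.00, 1.00, 1.50],
})

# Named aggregation: each keyword becomes an output column.
# The dict is unpacked because the column names contain spaces.
spenders = (
    purchases.groupby("SN")["Price"]
    .agg(**{
        "Purchase Count": "count",
        "Average Purchase Price": "mean",
        "Total Purchase Value": "sum",
    })
    .round(2)
    .sort_values("Total Purchase Value", ascending=False)
)

print(spenders.head(5))
```

This groups the data once and avoids the rename step, at the cost of a slightly denser expression.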


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [253]:
ItemTotalPurchase = purchase_data.groupby(['Item ID', 'Item Name', 'Price'])['Price'].agg('sum')
ItemPurchaseCnt = purchase_data.groupby(['Item ID', 'Item Name', 'Price'])['Price'].agg('count')

ByItemSummarydf =  pd.DataFrame(ItemPurchaseCnt)
ByItemSummarydf["Total Purchase Value"] = ItemTotalPurchase.round(2)
ByItemSummarydf = ByItemSummarydf.rename(columns={"Price": "Purchase Count"})
Newitemdf = ByItemSummarydf.sort_values("Purchase Count", ascending=False)
Newitemdf.head(5)

Item ID,Item Name,Price,Purchase Count,Total Purchase Value
178,"Oathbreaker, Last Hope of the Breaking Storm",4.23,12,50.76
145,Fiery Glass Crusader,4.58,9,41.22
108,"Extraction, Quickblade Of Trembling Hands",3.53,9,31.77
82,Nirvana,4.9,9,44.1
19,"Pursuit, Cudgel of Necromancy",1.02,8,8.16


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [254]:
mostProfitable = ByItemSummarydf.sort_values("Total Purchase Value", ascending=False)
mostProfitable.head(5)

Item ID,Item Name,Price,Purchase Count,Total Purchase Value
178,"Oathbreaker, Last Hope of the Breaking Storm",4.23,12,50.76
82,Nirvana,4.9,9,44.1
145,Fiery Glass Crusader,4.58,9,41.22
92,Final Critic,4.88,8,39.04
103,Singed Scalpel,4.35,8,34.8
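
When only the top few rows are needed, `DataFrame.nlargest` is a possible shortcut for sorting and then taking `head`. The sketch below uses a small hypothetical item summary; the item labels and values are made up for illustration.

```python
import pandas as pd

# Toy item summary (hypothetical data)
items = pd.DataFrame(
    {
        "Purchase Count": [12, 9, 9, 8],
        "Total Purchase Value": [50.0, 41.0, 44.0, 39.0],
    },
    index=pd.Index(["A", "B", "C", "D"], name="Item Name"),
)

# Top 2 rows by each column; ties keep the first occurrence by default
most_popular = items.nlargest(2, "Purchase Count")
most_profitable = items.nlargest(2, "Total Purchase Value")

print(most_popular)
print(most_profitable)
```

The same summary table feeds both rankings, so the popularity and profitability views stay consistent with each other.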


In [None]:
#Observations
#1 84% of the 576 players identify as male.
#2 Most purchases come from players aged 15 to 24.
#3 Out of 183 total items (by item ID), the most profitable item is also the most purchased.
    #It is item #178, Oathbreaker, Last Hope of the Breaking Storm