### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), followed by the 15-19 (18.6%) and 25-29 (13.4%) groups.

-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [2]:
purchase_data.head()

,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [3]:
total_players = len(purchase_data['SN'].unique())
df = pd.DataFrame([{'Total Players': total_players}])

In [4]:
df

,Total Players
0,576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
unique_items = purchase_data['Item ID'].unique()
unique_items = len(unique_items)
avg_price = purchase_data['Price'].mean()
purchase = purchase_data['Purchase ID'].count()
revenue = purchase_data['Price'].sum()
df = pd.DataFrame([{'Number of Unique Items': unique_items, 'Average Price': avg_price, 
                   'Number of Purchases': purchase, 'Total Revenue': revenue}])
# Format the currency columns for display
df['Average Price'] = df['Average Price'].map("${:.2f}".format)
df['Total Revenue'] = df['Total Revenue'].map("${:,.2f}".format)
df

,Average Price,Number of Purchases,Number of Unique Items,Total Revenue
0,$3.05,780,183,"$2,379.77"


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [14]:
# Count each player once: drop repeat purchases by the same screen name (SN)
players = purchase_data.drop_duplicates(subset='SN')
total = players['Gender'].count()
male = 0
female = 0
other = 0
for gender in players['Gender']:
    if gender == 'Male':
        male = male + 1
    elif gender == 'Female':
        female = female + 1
    else:
        other = other + 1
# Express each share as a percentage (0-100) so the "%" label is correct
perMale = male / total * 100
perFemale = female / total * 100
perOther = other / total * 100
gender = [{'Gender': 'Male', 'Total Count': male, 'Percentage of Players': perMale},
          {'Gender': 'Female', 'Total Count': female, 'Percentage of Players': perFemale},
          {'Gender': 'Other/Non-Disclosed', 'Total Count': other, 'Percentage of Players': perOther}]
df = pd.DataFrame(gender)
df['Percentage of Players'] = df['Percentage of Players'].map("{:.2f}%".format)
df = df.set_index('Gender')
df

Gender,Percentage of Players,Total Count
Male,84.03%,484
Female,14.06%,81
Other/Non-Disclosed,1.91%,11
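
The same per-player gender breakdown can be sketched more compactly with `drop_duplicates` and `value_counts`. This is only an illustration: the `sample` frame below holds made-up rows in the same shape as `purchase_data`, and the names `players` and `gender_summary` are hypothetical.

```python
import pandas as pd

# Made-up rows shaped like purchase_data (not the real file)
sample = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dev4"],
    "Gender": ["Female", "Female", "Male", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not counted twice
players = sample.drop_duplicates(subset="SN")

# Players per gender, and each gender's share of all players
counts = players["Gender"].value_counts()
percentages = counts / counts.sum() * 100

gender_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percentages.map("{:.2f}%".format),
})
print(gender_summary)
```

Multiplying by 100 before applying the `"{:.2f}%"` format keeps the printed value a true percentage rather than a fraction with a percent sign.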



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [55]:
# Keep one row per player (the first purchase for each screen name)
dft = purchase_data.drop_duplicates(subset='SN')
dft = dft.reset_index(drop=False)
dft.head(10)

,index,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,4,Iskosia90,23,Male,131,Fury,1.44
5,5,5,Yalae81,22,Male,81,Dreamkiss,3.61
6,6,6,Itheria73,36,Male,169,"Interrogator, Blood Blade of the Queen",2.18
7,7,7,Iskjaskst81,20,Male,162,Abyssal Shard,2.67
8,8,8,Undjask33,22,Male,21,Souleater,1.10
9,9,9,Chanosian48,35,Other / Non-Disclosed,136,Ghastly Adamantite Protector,3.58


In [33]:
# Tally purchase counts and revenue per gender across all purchases
mCount = 0
fCount = 0
oCount = 0
mTotal = 0
fTotal = 0
oTotal = 0
for gender, price in zip(purchase_data['Gender'], purchase_data['Price']):
    if gender == 'Male':
        mCount = mCount + 1
        mTotal = mTotal + price
    elif gender == 'Female':
        fCount = fCount + 1
        fTotal = fTotal + price
    else:
        oCount = oCount + 1
        oTotal = oTotal + price
# Average price per purchase
mAvg = mTotal / mCount
fAvg = fTotal / fCount
oAvg = oTotal / oCount

# Count unique players per gender (dft holds one row per screen name)
mPCount = 0
fPCount = 0
oPCount = 0
for gender in dft['Gender']:
    if gender == 'Male':
        mPCount = mPCount + 1
    elif gender == 'Female':
        fPCount = fPCount + 1
    else:
        oPCount = oPCount + 1
# Average total spend per unique player
mPPAvg = mTotal / mPCount
fPPAvg = fTotal / fPCount
oPPAvg = oTotal / oPCount

print(mCount, fCount, oCount)
print(mTotal, fTotal, oTotal)
print(mAvg, fAvg, oAvg)
print(mPPAvg, fPPAvg, oPPAvg)

652 113 15
1967.6399999999994 361.93999999999966 50.190000000000005
3.0178527607361953 3.203008849557519 3.3460000000000005
4.065371900826445 4.4683950617283905 4.562727272727273


In [46]:
gender = [{'': 'Male', 'Purchase Count': mCount, 'Average Purchase Price': mAvg, 'Total Purchase Value': mTotal, 'Avg Purchase Per Person': mPPAvg},
          {'': 'Female', 'Purchase Count': fCount, 'Average Purchase Price': fAvg, 'Total Purchase Value': fTotal, 'Avg Purchase Per Person': fPPAvg},
          {'': 'Other/Non-Disclosed', 'Purchase Count': oCount, 'Average Purchase Price': oAvg, 'Total Purchase Value': oTotal, 'Avg Purchase Per Person': oPPAvg}]
df = pd.DataFrame(gender)
df['Average Purchase Price'] = df['Average Purchase Price'].map("${:.2f}".format)
df['Total Purchase Value'] = df['Total Purchase Value'].map("${:,.2f}".format)
df['Avg Purchase Per Person'] = df['Avg Purchase Per Person'].map("${:.2f}".format)
df = df.set_index('')
df = df[['Purchase Count', 'Average Purchase Price', 'Total Purchase Value', 'Avg Purchase Per Person']]
df

,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Purchase Per Person
Male,652,$3.02,"$1,967.64",$4.07
Female,113,$3.20,$361.94,$4.47
Other/Non-Disclosed,15,$3.35,$50.19,$4.56
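
As an alternative to the manual loops, the per-gender figures could be computed with `groupby`. The following is a minimal sketch on made-up rows (the `sample` frame and the `gender_purchases` name are assumptions, not part of the notebook).

```python
import pandas as pd

# Made-up rows shaped like purchase_data (not the real file)
sample = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dev4"],
    "Gender": ["Female", "Female", "Male", "Male", "Other / Non-Disclosed"],
    "Price": [2.00, 4.00, 3.00, 5.00, 1.00],
})

by_gender = sample.groupby("Gender")

gender_purchases = pd.DataFrame({
    "Purchase Count": by_gender["Price"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})

# Divide revenue by the number of unique players, not by the number of purchases
gender_purchases["Avg Purchase Per Person"] = (
    gender_purchases["Total Purchase Value"] / by_gender["SN"].nunique()
)
print(gender_purchases)
```

Using `nunique()` on `SN` plays the same role as the deduplicated `dft` frame: a player with several purchases contributes to the revenue total several times but to the player count only once.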


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [89]:
bins = [0, 9, 14, 19, 24, 29, 34, 39, 130]
bin_names = ['<10', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '40+']
dftbins = pd.cut(dft['Age'], bins, labels=bin_names)
total = dftbins.count()
bin1 = 0
bin2 = 0
bin3 = 0
bin4 = 0
bin5 = 0
bin6 = 0
bin7 = 0
bin8 = 0  
for x in dftbins:
    if x == '<10':
        bin1 = bin1 + 1
    elif x == '10-14':
        bin2 = bin2 + 1
    elif x == '15-19':
        bin3 = bin3 + 1
    elif x == '20-24':
        bin4 = bin4 + 1
    elif x == '25-29':
        bin5 = bin5 + 1
    elif x == '30-34':
        bin6 = bin6 + 1
    elif x == '35-39':
        bin7 = bin7 + 1
    elif x == '40+':
        bin8 = bin8 + 1
print(bin1, bin2, bin3, bin4, bin5, bin6, bin7, bin8) 
binCount = [{'': '<10', 'Total Count': bin1, 'Percentage of Players': bin1/total*100},
           {'': '10-14', 'Total Count': bin2, 'Percentage of Players': bin2/total*100},
           {'': '15-19', 'Total Count': bin3, 'Percentage of Players': bin3/total*100},
           {'': '20-24', 'Total Count': bin4, 'Percentage of Players': bin4/total*100},
           {'': '25-29', 'Total Count': bin5, 'Percentage of Players': bin5/total*100},
           {'': '30-34', 'Total Count': bin6, 'Percentage of Players': bin6/total*100},
           {'': '35-39', 'Total Count': bin7, 'Percentage of Players': bin7/total*100},
           {'': '40+', 'Total Count': bin8, 'Percentage of Players': bin8/total*100}]
df = pd.DataFrame(binCount)
df = df.set_index('')
df['Percentage of Players'] = df['Percentage of Players'].map("{:.2f}%".format)
df

17 22 107 258 77 52 31 12


,Percentage of Players,Total Count
<10,2.95%,17
10-14,3.82%,22
15-19,18.58%,107
20-24,44.79%,258
25-29,13.37%,77
30-34,9.03%,52
35-39,5.38%,31
40+,2.08%,12
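
The binning and counting steps can also be sketched without the explicit counters, by letting `value_counts` tally the categories that `pd.cut` produces. The `players` frame here is made-up data with one row per player, and `age_summary` is a hypothetical name; the bins and labels are the ones used above.

```python
import pandas as pd

# Made-up player rows, one per player (not the real file)
players = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Cat3", "Dev4", "Eve5"],
    "Age": [8, 17, 21, 23, 45],
})

bins = [0, 9, 14, 19, 24, 29, 34, 39, 130]
bin_names = ['<10', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '40+']

# Each bin is right-inclusive by default, e.g. (19, 24] is labelled '20-24'
age_group = pd.cut(players["Age"], bins, labels=bin_names)

# sort=False avoids reordering the groups by frequency
counts = age_group.value_counts(sort=False)

age_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})
print(age_summary)
```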


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
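
One possible way to carry out these steps is sketched below on made-up rows. The `sample` frame, the `Age Group` column, and the `age_purchases` name are assumptions for illustration; the bins match the ones used in the Age Demographics section.

```python
import pandas as pd

# Made-up rows shaped like purchase_data (not the real file)
sample = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dev4"],
    "Age": [22, 22, 17, 23, 45],
    "Price": [2.00, 4.00, 3.00, 6.00, 1.00],
})

bins = [0, 9, 14, 19, 24, 29, 34, 39, 130]
bin_names = ['<10', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39', '40+']

# Label every purchase with the buyer's age group
sample["Age Group"] = pd.cut(sample["Age"], bins, labels=bin_names)

# observed=True skips age groups that have no purchases
by_age = sample.groupby("Age Group", observed=True)

age_purchases = pd.DataFrame({
    "Purchase Count": by_age["Price"].count(),
    "Average Purchase Price": by_age["Price"].mean(),
    "Total Purchase Value": by_age["Price"].sum(),
})

# Per-person average divides revenue by unique players in the group
age_purchases["Avg Purchase Per Person"] = (
    age_purchases["Total Purchase Value"] / by_age["SN"].nunique()
)
print(age_purchases)
```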

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
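
A minimal sketch of these steps, assuming the spender is identified by the `SN` column. The rows in `sample` are made up, and `top_spenders` is a hypothetical name.

```python
import pandas as pd

# Made-up rows shaped like purchase_data (not the real file)
sample = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Cat3"],
    "Price": [2.00, 4.00, 3.00, 5.00, 4.50],
})

top_spenders = (
    sample.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    # Biggest spenders first
    .sort_values("Total Purchase Value", ascending=False)
)

# Preview only the first few rows
print(top_spenders.head())
```

Sorting before any currency formatting matters: once the values are mapped to strings such as `"$9.50"`, they sort alphabetically rather than numerically.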



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
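
These steps might look like the following sketch. The item names and prices in `sample` are invented for illustration, and `item_summary` and `most_popular` are hypothetical names.

```python
import pandas as pd

# Made-up rows shaped like purchase_data (not the real file)
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 3, 3],
    "Item Name": ["Sword A", "Sword A", "Sword A", "Shield B", "Bow C", "Bow C"],
    "Price": [2.00, 2.00, 2.00, 5.00, 3.50, 3.50],
})

# Retrieve only the item columns
items = sample[["Item ID", "Item Name", "Price"]]

item_summary = (
    items.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

# Most frequently purchased items first
most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```

Taking the mean of `Price` as the item price assumes each item sells at a single price; if an item's price varies between purchases, the mean is only an average.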



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
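
A self-contained sketch of this step follows. It rebuilds a small item summary from invented rows so that it runs on its own; `item_summary`, `most_profitable`, and `formatted` are hypothetical names.

```python
import pandas as pd

# Made-up rows shaped like purchase_data (not the real file)
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 3, 3],
    "Item Name": ["Sword A", "Sword A", "Sword A", "Shield B", "Bow C", "Bow C"],
    "Price": [2.00, 2.00, 2.00, 5.00, 3.50, 3.50],
})

item_summary = (
    sample.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

# Sort on the numeric values first, then format a copy for display
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)
formatted = most_profitable.copy()
formatted["Item Price"] = formatted["Item Price"].map("${:.2f}".format)
formatted["Total Purchase Value"] = formatted["Total Purchase Value"].map("${:,.2f}".format)
print(formatted.head())
```

In this made-up data the most profitable item (Bow C) is not the most purchased one (Sword A), which is why the two rankings are reported separately.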

