### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), followed by the 15-19 (18.6%) and 25-29 (13.4%) groups.

-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [7]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [75]:
## Player Count 


players = len(purchase_data["SN"].unique())
tot_player = pd.DataFrame({
    "Total Players": [players]
        })
tot_player

Unnamed: 0,Total Players
0,576


In [77]:
totitem_purch = purchase_data["Item ID"].count()
totitem_purch

780

In [78]:
## Purchasing Analysis (Total)
totval = purchase_data["Price"].sum()
totval

2379.77

In [79]:
#Run basic calculations to obtain number of unique items, average price, etc.
uniqueitem = len(purchase_data["Item ID"].unique())
uniqueitem

183

In [80]:
avg_price = purchase_data["Price"].mean()
avg_price

3.050987179487176

In [81]:
# Create a summary data frame to hold the results
# Optional: give the displayed data cleaner formatting
# Display the summary data frame

summary = pd.DataFrame({"Number of Unique Items":[uniqueitem],
                        "Average Price" : [round(avg_price,2)],
                        "Total Purchases":[totitem_purch],
                        "Total Revenue": [totval] 
                        })
summary

Unnamed: 0,Number of Unique Items,Average Price,Total Purchases,Total Revenue
0,183,3.05,780,2379.77
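
The "cleaner formatting" step is optional. One possible approach, sketched here with the totals from the summary table hard-coded for illustration, is to map the currency columns through a format string. The name `summary_fmt` is hypothetical and not part of the original notebook.

```python
import pandas as pd

# Values copied from the summary table above (hard-coded for illustration only)
summary_fmt = pd.DataFrame({
    "Number of Unique Items": [183],
    "Average Price": [3.05],
    "Total Purchases": [780],
    "Total Revenue": [2379.77],
})

# Render the currency columns as strings such as "$3.05"
for col in ["Average Price", "Total Revenue"]:
    summary_fmt[col] = summary_fmt[col].map("${:,.2f}".format)

print(summary_fmt)
```

Note that the formatted columns become strings, so this is best done as the last step, after all arithmetic is finished.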


In [82]:
#Count of total players
totplayers = purchase_data["SN"].nunique()
totplayers

576

In [83]:
#Count of total male 
male_count = purchase_data[purchase_data["Gender"] == "Male"]["SN"].nunique()
male_count

484

In [84]:
#count of total female
female_count = purchase_data[purchase_data["Gender"] == "Female"]["SN"].nunique()
female_count

81

In [85]:
#Count of other players
other_count = totplayers - male_count - female_count
other_count

11

In [86]:
# Percentage count of male players
male_percent = round((male_count/totplayers * 100), 2)
male_percent

84.03

In [87]:
#percentage count of female players 
fem_percent = round((female_count/totplayers *100),2)
fem_percent

14.06

In [88]:
# percentage count of other players 
oth_percent = round((other_count/totplayers * 100),2)
oth_percent

1.91

In [89]:
gen_demo = pd.DataFrame({
            "Gender" : ["Male","Female","Other/Non-Disclosed"],
            "Total Count" : [male_count,female_count,other_count],
            "Percentage of Players" : [male_percent,fem_percent,oth_percent]
            })
gen_demo

Unnamed: 0,Gender,Total Count,Percentage of Players
0,Male,484,84.03
1,Female,81,14.06
2,Other/Non-Disclosed,11,1.91
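
The per-gender counts and percentages above are computed one variable at a time. As a rough alternative, the same table can probably be built in one pass by de-duplicating on `SN` and calling `value_counts`. The `sample` frame below is made-up data in the same shape as `purchase_data`, and `players_only` and `gender_demo` are hypothetical names.

```python
import pandas as pd

# Tiny made-up sample in the same shape as purchase_data (illustration only)
sample = pd.DataFrame({
    "SN": ["a1", "a1", "b2", "c3", "d4", "d4"],
    "Gender": ["Male", "Male", "Female", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not counted twice
players_only = sample.drop_duplicates(subset="SN")

# Count players per gender and express each count as a share of all players
counts = players_only["Gender"].value_counts()
gender_demo = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})
print(gender_demo)
```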



## Purchasing Analysis (Gender)

In [90]:
#Purchasing Analysis (Gender)
#Run basic calculations to obtain purchase count, avg. purchase price, 
#avg. purchase total per person etc. by gender
#Create a summary data frame to hold the results
#Optional: give the displayed data cleaner formatting
#Display the summary data frame

In [91]:
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [92]:
tot_purch = purchase_data["Gender"].count()
tot_purch

780

In [93]:
male_purch = purchase_data[purchase_data["Gender"]=="Male"]["Price"].count()
male_purch

652

In [94]:
fem_purch = purchase_data[purchase_data["Gender"]=="Female"]["Price"].count()
fem_purch

113

In [95]:
other_purch = tot_purch - male_purch - fem_purch
other_purch

15

In [96]:
male_purch_avg = round(purchase_data[purchase_data["Gender"]== "Male"]["Price"].mean(),2)
male_purch_avg

3.02

In [97]:
fem_purch_avg = round(purchase_data[purchase_data["Gender"]=="Female"]["Price"].mean(),2)
fem_purch_avg

3.2

In [98]:
other_purch_avg = round(purchase_data[purchase_data["Gender"] == "Other / Non-Disclosed"]["Price"].mean(),2)
other_purch_avg

3.35

In [109]:
male_tot_price = purchase_data[purchase_data["Gender"] == "Male"]["Price"].sum()
male_tot_price

1967.64

In [110]:
fem_tot_price = purchase_data[purchase_data["Gender"] == "Female"]["Price"].sum()
fem_tot_price

361.94

In [111]:
other_tot_price = purchase_data[purchase_data["Gender"] == "Other / Non-Disclosed"]["Price"].sum()
other_tot_price

50.19

In [114]:
avg_male_purch = round((male_tot_price / male_count),2)
avg_male_purch

4.07

In [115]:
avg_fem_purch = round((fem_tot_price/female_count),2)
avg_fem_purch

4.47

In [116]:
avg_other_purch = round((other_tot_price/other_count),2)
avg_other_purch

4.56

In [117]:
#Purchasing analysis
purch_analysis = pd.DataFrame({
    "Gender" : ["Male","Female","Other/Non-Disclosed"],
    "Purchase Count": [male_purch, fem_purch, other_purch],
    "Average Purchase Price" : [male_purch_avg, fem_purch_avg, other_purch_avg],
    "Total Purchase Value" : [male_tot_price, fem_tot_price, other_tot_price],
    "Avg Total Purchase per Person" : [avg_male_purch, avg_fem_purch, avg_other_purch]
})
purch_analysis

Unnamed: 0,Gender,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Total Purchase per Person
0,Male,652,3.02,1967.64,4.07
1,Female,113,3.2,361.94,4.47
2,Other/Non-Disclosed,15,3.35,50.19,4.56
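
The gender purchasing figures above can also be derived with a single `groupby`, which avoids repeating the filter for each gender. This is only a sketch on made-up data; `sample`, `by_gender`, and `gender_purchases` are hypothetical names, and the column labels simply mirror the table above.

```python
import pandas as pd

# Made-up purchases in the same shape as purchase_data (illustration only)
sample = pd.DataFrame({
    "SN": ["a1", "a1", "b2", "c3"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [1.00, 3.00, 2.50, 4.00],
})

by_gender = sample.groupby("Gender")

# Purchase count, average price, and total value per gender
gender_purchases = by_gender["Price"].agg(["count", "mean", "sum"])
gender_purchases.columns = ["Purchase Count", "Average Purchase Price",
                            "Total Purchase Value"]

# Divide total spend by the number of distinct players, not by purchases
gender_purchases["Avg Total Purchase per Person"] = (
    gender_purchases["Total Purchase Value"] / by_gender["SN"].nunique()
)
print(gender_purchases.round(2))
```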


In [122]:
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [123]:
lessten = purchase_data[purchase_data['Age']<10]
ten = lessten['SN'].nunique()
p_ten = round(ten / players * 100,2)
p_ten


2.95

In [124]:
tentoforteen = purchase_data[(purchase_data['Age'] >=10) & (purchase_data['Age'] <= 14)]
less15 = tentoforteen['SN'].nunique()
p_less15 = round(less15 / players * 100,2)
p_less15

3.82

In [125]:
nineteen = purchase_data[(purchase_data['Age'] >= 15) & (purchase_data['Age'] <= 19)]
less19 = nineteen['SN'].nunique()
p_less19 = round(less19 / players * 100,2)
p_less19

18.58

In [126]:
twentytofour = purchase_data[(purchase_data['Age'] >= 20) & (purchase_data['Age'] <= 24)]
less24 = twentytofour['SN'].nunique()
p_less24 = round(less24 / players * 100,2)
p_less24

44.79

In [127]:
twentyfivetonine = purchase_data[(purchase_data['Age'] >= 25) & (purchase_data['Age'] <= 29)]
less29 = twentyfivetonine['SN'].nunique()
p_less29 = round(less29/players * 100,2)
p_less29

13.37

In [128]:
thirtytofour= purchase_data[(purchase_data['Age'] >= 30) & (purchase_data['Age'] <= 34)]
less34 = thirtytofour['SN'].nunique()
p_less34 = round(less34/players * 100,2)
p_less34

9.03

In [129]:
thirtyfive = purchase_data[(purchase_data['Age'] >= 35) & (purchase_data['Age'] <= 39)]
less39 = thirtyfive['SN'].nunique()
p_less39 = round(less39/players * 100,2)
p_less39

5.38

In [130]:
fourty = purchase_data[(purchase_data['Age'] >= 40)]
plus40 = fourty['SN'].nunique()
p_plus40 = round(plus40/players * 100,2)
p_plus40

2.08

## Age Demographics

In [131]:
age_demographics = pd.DataFrame({
    "Age" : ["<10","10-14","15-19","20-24","25-29","30-34","35-39","40+"],
    "Total Count" : [ten,less15,less19,less24,less29,less34,less39,plus40],
    "Percentage of Players" : [p_ten,p_less15,p_less19,p_less24,p_less29,p_less34,p_less39,p_plus40]
                               })
age_demographics

Unnamed: 0,Age,Total Count,Percentage of Players
0,<10,17,2.95
1,10-14,22,3.82
2,15-19,107,18.58
3,20-24,258,44.79
4,25-29,77,13.37
5,30-34,52,9.03
6,35-39,31,5.38
7,40+,12,2.08


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
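
The binning step can be sketched with `pd.cut`, which assigns each row to a labelled age group so that one `groupby` produces every row of the table. The data below is made up, and the bin edges are an assumption chosen to match the `<10`, `10-14`, ..., `40+` groups for whole-number ages.

```python
import pandas as pd

# Made-up purchases in the same shape as purchase_data (illustration only)
sample = pd.DataFrame({
    "SN": ["a1", "b2", "b2", "c3", "d4"],
    "Age": [9, 10, 10, 24, 40],
    "Price": [1.00, 2.00, 4.00, 3.00, 5.00],
})

# pd.cut bins are right-inclusive by default: (-inf, 9], (9, 14], ..., (39, inf)
bins = [float("-inf"), 9, 14, 19, 24, 29, 34, 39, float("inf")]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample["Age Group"] = pd.cut(sample["Age"], bins=bins, labels=labels)

# observed=True keeps only the age groups that actually occur in the data
by_age = sample.groupby("Age Group", observed=True)
age_summary = pd.DataFrame({
    "Purchase Count": by_age["Price"].count(),
    "Average Purchase Price": by_age["Price"].mean(),
    "Total Purchase Value": by_age["Price"].sum(),
    "Average Total Purchase Per Person": by_age["Price"].sum() / by_age["SN"].nunique(),
})
print(age_summary)
```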

In [132]:
age_analysis = pd.DataFrame({
"Age" : ["<10","10-14","15-19","20-24","25-29","30-34","35-39","40+"],
"Purchase Count" : [lessten['Price'].count(),tentoforteen['Price'].count(),nineteen['Price'].count(),
                        twentytofour['Price'].count(),twentyfivetonine['Price'].count(),
                        thirtytofour['Price'].count(),thirtyfive['Price'].count(),fourty['Price'].count()],
"Average Purchase Price" : [lessten['Price'].mean(),tentoforteen['Price'].mean(),nineteen['Price'].mean(),
                        twentytofour['Price'].mean(),twentyfivetonine['Price'].mean(),
                        thirtytofour['Price'].mean(),thirtyfive['Price'].mean(),fourty['Price'].mean()],
"Total Purchase Value" : [lessten['Price'].sum(),tentoforteen['Price'].sum(),nineteen['Price'].sum(),
                        twentytofour['Price'].sum(),twentyfivetonine['Price'].sum(),
                        thirtytofour['Price'].sum(),thirtyfive['Price'].sum(),fourty['Price'].sum()],
"Average Total Purchase Per Person" : [lessten['Price'].sum()/lessten['SN'].nunique(),
                                    tentoforteen['Price'].sum()/tentoforteen['SN'].nunique(), 
                                    nineteen['Price'].sum()/nineteen['SN'].nunique(),
                                    twentytofour['Price'].sum()/twentytofour['SN'].nunique(),   
                                twentyfivetonine['Price'].sum()/twentyfivetonine['SN'].nunique(),
                                thirtytofour['Price'].sum()/thirtytofour['SN'].nunique(),      
                                thirtyfive['Price'].sum()/thirtyfive['SN'].nunique(),      
                               fourty['Price'].sum()/fourty['SN'].nunique()] 

})
age_analysis.style.format({"Average Purchase Price":"${:.2f}","Total Purchase Value":"${:.2f}","Average Total Purchase Per Person":"${:.2f}"})

Unnamed: 0,Age,Purchase Count,Average Purchase Price,Total Purchase Value,Average Total Purchase Per Person
0,<10,23,$3.35,$77.13,$4.54
1,10-14,28,$2.96,$82.78,$3.76
2,15-19,136,$3.04,$412.89,$3.86
3,20-24,365,$3.05,$1114.06,$4.32
4,25-29,101,$2.90,$293.00,$3.81
5,30-34,73,$2.93,$214.00,$4.12
6,35-39,41,$3.60,$147.67,$4.76
7,40+,13,$2.94,$38.24,$3.19


## Top Spenders

* Run basic calculations to obtain the results in the table below

* Create a summary data frame to hold the results

* Sort the total purchase value column in descending order

* Optional: give the displayed data cleaner formatting

* Display a preview of the summary data frame

In [133]:
top_spenders = purchase_data[["SN","Price"]].groupby("SN")
top_spenders_df = ((top_spenders.count()).merge(round(top_spenders.mean(),2), on = "SN")).merge(top_spenders.sum(), on='SN')

top_spenders_df = top_spenders_df.rename(columns={"Price_x":"Purchase Count", "Price_y":"Average Purchase Price", "Price":"Total Purchase Value"})
top_spenders_result_df = top_spenders_df.sort_values("Total Purchase Value", ascending=False).head()

top_spenders_result_df

Unnamed: 0_level_0,Purchase Count,Average Purchase Price,Total Purchase Value
SN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Lisosia93,5,3.79,18.96
Idastidru52,4,3.86,15.45
Chamjask73,3,4.61,13.83
Iral74,4,3.4,13.62
Iskadarya95,3,4.37,13.1
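
The two chained merges above work, but the same three columns can likely be produced in one step with named aggregation on the grouped `Price` column. The sketch below uses made-up data; `sample` and `spender_summary` are hypothetical names.

```python
import pandas as pd

# Made-up purchases in the same shape as purchase_data (illustration only)
sample = pd.DataFrame({
    "SN": ["a1", "a1", "b2", "c3", "c3", "c3"],
    "Price": [1.00, 3.00, 5.00, 1.00, 1.00, 1.50],
})

# Named aggregation: each keyword becomes an output column.
# The dict is unpacked because the column names contain spaces.
spender_summary = (
    sample.groupby("SN")["Price"]
    .agg(**{
        "Purchase Count": "count",
        "Average Purchase Price": "mean",
        "Total Purchase Value": "sum",
    })
    .sort_values("Total Purchase Value", ascending=False)
)
print(spender_summary.head().round(2))
```

Because the columns are named at aggregation time, no `rename` of the `Price_x` and `Price_y` merge suffixes is needed.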


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns

* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value

* Create a summary data frame to hold the results

* Sort the purchase count column in descending order

* Optional: give the displayed data cleaner formatting

* Display a preview of the summary data frame

In [137]:
# Group by item and compute purchase count, item price, and total purchase value
most_popular = purchase_data[["Item ID","Item Name","Price"]].groupby(["Item ID","Item Name"])
most_pop_df = most_popular["Price"].agg(["count","mean","sum"])

most_pop_df = most_pop_df.rename(columns={"count":"Purchase Count",
                                          "mean":"Item Price",
                                          "sum":"Total Purchase Value"})

# Sort by purchase count, most purchased items first
most_pop_result_df = most_pop_df.sort_values("Purchase Count", ascending=False).head()
most_pop_result_df.style.format({"Item Price":"${:.2f}","Total Purchase Value":"${:.2f}"})

Unnamed: 0_level_0,Unnamed: 1_level_0,Purchase Count,Item Price,Total Purchase Value
Item ID,Item Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
145,Fiery Glass Crusader,9,$4.58,$41.22
108,"Extraction, Quickblade Of Trembling Hands",9,$3.53,$31.77
82,Nirvana,9,$4.90,$44.10
19,"Pursuit, Cudgel of Necromancy",8,$1.02,$8.16


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
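
No code cell accompanies this section, so the following is only a sketch of how the steps above might look. It builds a per-item table from made-up data and sorts it by total purchase value. The names `sample`, `items`, and `most_profitable` are hypothetical, and taking the `first` price per item assumes each item sells at a single price.

```python
import pandas as pd

# Made-up purchases in the same shape as purchase_data (illustration only)
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Axe", "Axe", "Bow"],
    "Price": [1.00, 1.00, 1.00, 4.00, 4.00, 2.50],
})

# One row per item: purchase count, unit price, and total revenue
items = sample.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "first", "sum"])
items.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

# Most profitable first; this order can differ from the most-popular order
most_profitable = items.sort_values("Total Purchase Value", ascending=False)
print(most_profitable.head())
```

In this sample the most purchased item (Sword, 3 purchases) is not the most profitable one (Axe), which is why the two tables are sorted on different columns.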



Unnamed: 0_level_0,Unnamed: 1_level_0,Purchase Count,Item Price,Total Purchase Value
Item ID,Item Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
82,Nirvana,9,$4.90,$44.10
145,Fiery Glass Crusader,9,$4.58,$41.22
92,Final Critic,8,$4.88,$39.04
103,Singed Scalpel,8,$4.35,$34.80
