# Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

# Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
df = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [2]:
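# Each row is one purchase, so this counts purchase records.
# A player who bought more than once is counted once per purchase;
# df["SN"].nunique() would count distinct players instead.
player_sum = df["Gender"].count()

player_sum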

780
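The difference between counting rows and counting players can be sketched on a toy purchase log. The screen names and prices below are made up for illustration, and the sketch assumes, as in the notebook, that `SN` identifies a player.

```python
import pandas as pd

# Hypothetical purchase log: one row per purchase, so a player who
# buys twice appears twice.
purchases = pd.DataFrame({
    "SN": ["PlayerA", "PlayerA", "PlayerB", "PlayerC"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [3.53, 1.56, 4.88, 3.27],
})

row_count = purchases["Gender"].count()    # number of purchase rows
player_count = purchases["SN"].nunique()   # number of distinct players

print(row_count, player_count)
```

If the real data contains repeat buyers, the two numbers will differ in the same way.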

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [3]:
# Number of unique items, average price, number of purchases, total revenue
mean = df["Price"].mean()
unique = df["Item ID"].unique()
count = len(unique)
purchases = df["Purchase ID"].count()
revenue = df["Price"].sum()

# Summary data frame
summary_df = pd.DataFrame({"Number of Unique Items": [count],
                           "Average Price": [mean],
                           "Number of Purchases": [purchases],
                           "Total Revenue": [revenue]})

# Format the currency columns
summary_df["Average Price"] = summary_df["Average Price"].astype(float).map("${:,.2f}".format)
summary_df["Total Revenue"] = summary_df["Total Revenue"].astype(float).map("${:,.2f}".format)

summary_df

| | Number of Unique Items | Average Price | Number of Purchases | Total Revenue |
|---|---|---|---|---|
| 0 | 183 | $3.05 | 780 | $2,379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [4]:
# Filter the purchase rows by gender
men = df.loc[df["Gender"] == "Male"]
women = df.loc[df["Gender"] == "Female"]
other = df.loc[df["Gender"] == "Other / Non-Disclosed"]

# Row counts per gender. These count purchases, not distinct players.
men_count = men["Gender"].count()
women_count = women["Gender"].count()
other_count = other["Gender"].count()

# Share of all rows, as a percentage
men_percent = men_count / player_sum * 100
women_percent = women_count / player_sum * 100
other_percent = other_count / player_sum * 100

gender_df = pd.DataFrame({"Gender": ['Male', 'Female', 'Other/Non-Disclosed'],
                          "Percentage of Players": [men_percent, women_percent, other_percent],
                          "Total Count": [men_count, women_count, other_count]})

gender_df["Percentage of Players"] = gender_df["Percentage of Players"].astype(float).map("{:,.2f}%".format)
gender_df

| | Gender | Percentage of Players | Total Count |
|---|---|---|---|
| 0 | Male | 83.59% | 652 |
| 1 | Female | 14.49% | 113 |
| 2 | Other/Non-Disclosed | 1.92% | 15 |
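A per-player version of this breakdown could drop repeat purchases before counting. The following is a minimal sketch on made-up data, assuming `SN` uniquely identifies a player and that a player's gender is the same on every row.

```python
import pandas as pd

# Hypothetical purchase log; A1 and D4 each bought twice.
purchases = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4", "D4"],
    "Gender": ["Male", "Male", "Female", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not counted twice.
players = purchases.drop_duplicates(subset="SN")

gender_counts = players["Gender"].value_counts()
gender_percent = players["Gender"].value_counts(normalize=True) * 100

gender_summary = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_percent.round(2),
})
print(gender_summary)
```

`value_counts(normalize=True)` returns fractions, so multiplying by 100 gives percentages without dividing by a separately computed total.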



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [35]:
# Average purchase price per gender
men_app = men["Price"].mean()
women_app = women["Price"].mean()
other_app = other["Price"].mean()

# Total purchase value per gender
men_total = men["Price"].sum()
women_total = women["Price"].sum()
other_total = other["Price"].sum()

gender_analysis = pd.DataFrame({"Gender": ['Male', 'Female', 'Other/Non-Disclosed'],
                                "Purchase Count": [men_count, women_count, other_count],
                                "Average Purchase Price": [men_app, women_app, other_app],
                                "Total Purchase Value": [men_total, women_total, other_total]})

# TODO: the average purchase total per person is not calculated yet.
# It needs the number of unique players (SN) per gender.

# Format the currency columns
gender_analysis["Average Purchase Price"] = gender_analysis["Average Purchase Price"].astype(float).map("${:,.2f}".format)
gender_analysis["Total Purchase Value"] = gender_analysis["Total Purchase Value"].astype(float).map("${:,.2f}".format)

gender_analysis

| | Gender | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|---|
| 0 | Male | 652 | $3.02 | $1,967.64 |
| 1 | Female | 113 | $3.20 | $361.94 |
| 2 | Other/Non-Disclosed | 15 | $3.35 | $50.19 |
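The average purchase total per person is the group's total spend divided by its number of distinct players. One possible way to compute it is sketched below on made-up data; the column names follow the notebook, while the values and screen names are hypothetical.

```python
import pandas as pd

# Hypothetical purchase log: A1 and C3 each made two purchases.
purchases = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "C3"],
    "Gender": ["Male", "Male", "Female", "Male", "Male"],
    "Price": [2.0, 4.0, 3.0, 1.0, 5.0],
})

by_gender = purchases.groupby("Gender")

total_value = by_gender["Price"].sum()       # total spend per gender
unique_players = by_gender["SN"].nunique()   # distinct players per gender

# Total spend divided by distinct players, not by purchase rows.
avg_total_per_person = total_value / unique_players
print(avg_total_per_person)
```

Grouping on a different column, such as an age-range column, would give the same metric for that grouping.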


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [108]:
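# Bin edges and labels for the age groups.
# Note: pd.cut uses right-closed intervals by default, so these edges give
# (0, 10], (10, 15], (15, 19], (19, 25], (25, 30], ... and an age of exactly
# 10, 15, 25, 30, 35 or 40 is placed in the group below the one its label suggests.
ages = [0, 10, 15, 19, 25, 30, 35, 40, 100]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
df["Age Range"] = pd.cut(df["Age"], ages, labels=age_labels)

age_group = df.groupby("Age Range")

# Rows (purchases) per age group and their share of all rows
age_count = age_group["SN"].count()
sumtotal = df["SN"].count()
percentage = (age_count / sumtotal) * 100

age_df = pd.DataFrame({"Total Count": age_count,
                       "Percentage of Players": percentage})
age_df["Percentage of Players"] = age_df["Percentage of Players"].map("{:.2f}%".format)
age_df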


| Age Range | Total Count | Percentage of Players |
|---|---|---|
| <10 | 32 | 4.10% |
| 10-14 | 54 | 6.92% |
| 15-19 | 101 | 12.95% |
| 20-24 | 424 | 54.36% |
| 25-29 | 77 | 9.87% |
| 30-34 | 52 | 6.67% |
| 35-39 | 33 | 4.23% |
| 40+ | 7 | 0.90% |
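How `pd.cut` treats the bin edges decides which group a boundary age lands in. The sketch below compares the notebook's edges under the default `right=True` with left-closed bins (`right=False`), using a handful of made-up ages. The alternative edge list and the upper bound of 200 are assumptions chosen for illustration.

```python
import pandas as pd

ages = pd.Series([9, 10, 14, 15, 19, 20, 24, 25, 40, 45])
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Edges as used in the notebook, with the default right-closed intervals:
# (0, 10], (10, 15], (15, 19], (19, 25], ...
default_bins = pd.cut(ages, [0, 10, 15, 19, 25, 30, 35, 40, 100], labels=labels)

# Left-closed intervals [0, 10), [10, 15), [15, 20), ... line up with the labels.
left_closed = pd.cut(ages, [0, 10, 15, 20, 25, 30, 35, 40, 200],
                     labels=labels, right=False)

comparison = pd.DataFrame({
    "Age": ages,
    "Default": default_bins.astype(str),
    "Left-closed": left_closed.astype(str),
})
print(comparison)
```

In this toy run the ages 10, 15, 25 and 40 are the ones whose group changes between the two columns.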



## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [112]:
# Bin the purchases by age (same edges and labels as in Age Demographics)
ages = [0, 10, 15, 19, 25, 30, 35, 40, 100]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
df["Age Range"] = pd.cut(df["Age"], ages, labels=age_labels)

age_group = df.groupby("Age Range")

purchase_count = age_group["Price"].count()
avg_purchase_price_bin = age_group["Price"].mean()
total_bin = age_group["Price"].sum()

age_bin_breakout = pd.DataFrame({"Purchase Count": purchase_count,
                                 "Average Purchase Price": avg_purchase_price_bin,
                                 "Total Purchase Value": total_bin})

# Format the currency columns
age_bin_breakout["Average Purchase Price"] = age_bin_breakout["Average Purchase Price"].astype(float).map("${:,.2f}".format)
age_bin_breakout["Total Purchase Value"] = age_bin_breakout["Total Purchase Value"].astype(float).map("${:,.2f}".format)

# TODO: the average purchase total per person is not calculated yet.
# It needs the number of unique players (SN) per age group.

age_bin_breakout



| Age Range | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| <10 | 32 | $3.40 | $108.96 |
| 10-14 | 54 | $2.90 | $156.60 |
| 15-19 | 101 | $3.04 | $307.24 |
| 20-24 | 424 | $3.06 | $1,295.96 |
| 25-29 | 77 | $2.88 | $221.42 |
| 30-34 | 52 | $2.99 | $155.71 |
| 35-39 | 33 | $3.40 | $112.35 |
| 40+ | 7 | $3.08 | $21.53 |


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [119]:
sn_df = df.groupby(["SN"])

# Purchase count, average price and total spend per player
sn_count = sn_df["Purchase ID"].count()
sn_app = sn_df["Price"].mean()
sn_total = sn_df["Price"].sum()

sn_summary = pd.DataFrame({"Purchase Count": sn_count,
                           "Average Purchase Price": sn_app,
                           "Total Purchase Value": sn_total})

sn_summary["Average Purchase Price"] = sn_summary["Average Purchase Price"].astype(float).map("${:,.2f}".format)
sn_summary["Total Purchase Value"] = sn_summary["Total Purchase Value"].astype(float).map("${:,.2f}".format)

# Caution: the formatting above turns the column into strings, so this sort
# is lexicographic ("$9.50" ranks above "$10.00"), not numeric.
# Sorting before formatting would rank players by the amount spent.
sn_summary = sn_summary.sort_values("Total Purchase Value", ascending=False)

sn_summary.head()

| SN | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| Haillyrgue51 | 3 | $3.17 | $9.50 |
| Phistym51 | 2 | $4.75 | $9.50 |
| Lamil79 | 2 | $4.64 | $9.29 |
| Aina42 | 3 | $3.07 | $9.22 |
| Saesrideu94 | 2 | $4.59 | $9.18 |
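The order of sorting and formatting matters for a table like this one, because a currency-formatted column holds strings. The sketch below uses three made-up totals to show how the ranking changes. The player labels and amounts are hypothetical.

```python
import pandas as pd

# Hypothetical totals per player.
spend = pd.DataFrame(
    {"Total Purchase Value": [9.50, 13.10, 18.20]},
    index=pd.Index(["P1", "P2", "P3"], name="SN"),
)

# Format first, then sort: the values are strings, so "$9.50" sorts
# above "$18.20" because "9" comes after "1".
as_text = spend["Total Purchase Value"].map("${:,.2f}".format)
text_order = as_text.sort_values(ascending=False).index.tolist()

# Sort first, then format: the ranking follows the numeric values.
ranked = spend.sort_values("Total Purchase Value", ascending=False)
ranked["Total Purchase Value"] = ranked["Total Purchase Value"].map("${:,.2f}".format)

print(text_order)
print(ranked)
```

Another option is to leave the columns numeric and apply the currency format only for display, for example through `DataFrame.style.format`.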


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



| Item ID | Item Name | Purchase Count | Item Price | Total Purchase Value |
|---|---|---|---|---|
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | $4.23 | $50.76 |
| 145 | Fiery Glass Crusader | 9 | $4.58 | $41.22 |
| 108 | Extraction, Quickblade Of Trembling Hands | 9 | $3.53 | $31.77 |
| 82 | Nirvana | 9 | $4.90 | $44.10 |
| 19 | Pursuit, Cudgel of Necromancy | 8 | $1.02 | $8.16 |
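The notebook does not show the cell that produced this table, so the following is only a sketch of how the listed steps might be carried out, not the original code. It uses made-up items and assumes the columns are named `Item ID`, `Item Name` and `Price`, and that each item is sold at a single price.

```python
import pandas as pd

# Hypothetical purchase log.
purchases = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Bow"],
    "Price": [2.0, 2.0, 2.0, 4.5, 4.5, 3.0],
})

# One row per item: purchase count, item price, total purchase value.
items = (
    purchases.groupby(["Item ID", "Item Name"])["Price"]
    .agg(purchase_count="count", item_price="first", total_value="sum")
    .rename(columns={
        "purchase_count": "Purchase Count",
        "item_price": "Item Price",
        "total_value": "Total Purchase Value",
    })
)

# Sort while the columns are still numeric; format afterwards if needed.
most_popular = items.sort_values("Purchase Count", ascending=False)
most_profitable = items.sort_values("Total Purchase Value", ascending=False)

print(most_popular.head())
print(most_profitable.head())
```

Sorting the same frame by a different column gives either ranking, so the grouped summary only has to be built once.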


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



| Item ID | Item Name | Purchase Count | Item Price | Total Purchase Value |
|---|---|---|---|---|
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | $4.23 | $50.76 |
| 82 | Nirvana | 9 | $4.90 | $44.10 |
| 145 | Fiery Glass Crusader | 9 | $4.58 | $41.22 |
| 92 | Final Critic | 8 | $4.88 | $39.04 |
| 103 | Singed Scalpel | 8 | $4.35 | $34.80 |
