### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [27]:
# Count distinct SN values so repeat buyers are counted once
players = purchase_data["SN"].nunique()
players


576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [3]:
unique_item = len(purchase_data["Item ID"].unique())

price_average = purchase_data["Price"].mean()

purchase_num = purchase_data["Purchase ID"].count()

total_rev = purchase_data["Price"].sum()

# One-row summary; each value is wrapped in a list so every column has length 1
Purchase_summary = pd.DataFrame({
    "Number of Unique Items": [unique_item],
    "Average Price": [price_average],
    "Number of Purchases": [purchase_num],
    "Total Revenue": [total_rev]
})
Purchase_summary["Average Price"] = Purchase_summary["Average Price"].astype(float).map("${0:,.2f}".format)
Purchase_summary["Total Revenue"] = Purchase_summary["Total Revenue"].astype(float).map("${0:,.2f}".format)
Purchase_summary

| | Number of Unique Items | Average Price | Number of Purchases | Total Revenue |
|---|---|---|---|---|
| 0 | 179 | $3.05 | 780 | $2,379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [4]:
#Gender dataframes
males_df = purchase_data.loc[purchase_data["Gender"] == "Male"]
females_df = purchase_data.loc[purchase_data["Gender"] == "Female"]
others_df = purchase_data.loc[purchase_data["Gender"] == "Other / Non-Disclosed"]
#Gender totals (these count purchase rows, so a player with several purchases is counted once per purchase)
gender_total = purchase_data["Gender"].count()
males_total = males_df.Gender.count()
females_total = females_df.Gender.count()
others_total = others_df.Gender.count()
#Players by gender percent
males_pct = males_total/gender_total
female_pct = females_total/gender_total
other_pct = others_total/gender_total

Gender_summary = pd.DataFrame({
     "Total Count": [males_total,females_total,others_total],
     "Percentage of Players": [males_pct,female_pct,other_pct]
})
Gender_summary.index = ["Male","Female","Other/Non-Disclosed"]
Gender_summary["Percentage of Players"] = Gender_summary["Percentage of Players"].astype(float).map("{:.2%}".format)
Gender_summary

| | Total Count | Percentage of Players |
|---|---|---|
| Male | 652 | 83.59% |
| Female | 113 | 14.49% |
| Other/Non-Disclosed | 15 | 1.92% |
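
The counts above add up to 780, the number of purchase rows, rather than 576, the number of distinct players. A minimal sketch of a per-player breakdown follows, assuming each player is identified by `SN`. The frame `sample` and its values are made up for illustration and are not taken from `purchase_data.csv`.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data
sample = pd.DataFrame({
    "SN": ["Ana", "Ana", "Bo", "Cy", "Cy", "Dee"],
    "Gender": ["Female", "Female", "Male", "Male", "Male", "Other / Non-Disclosed"],
    "Price": [1.00, 2.00, 3.00, 4.00, 5.00, 6.00],
})

# Keep one row per player so repeat buyers are counted once
unique_players = sample.drop_duplicates(subset="SN")

gender_counts = unique_players["Gender"].value_counts()
gender_summary = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_counts / gender_counts.sum(),
})
print(gender_summary)
```

Applied to the real data, the same steps would use `purchase_data` in place of `sample`; the resulting counts would likely differ from the purchase-based table.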



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [5]:
#Avg purchase price
males_avg_p = males_df.Price.mean()
females_avg_p = females_df.Price.mean()
others_avg_p = others_df.Price.mean()
#Total purchase price
males_total_p = males_df.Price.sum()
females_total_p = females_df.Price.sum()
others_total_p = others_df.Price.sum()
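#Avg Total Purchase per Person
#NOTE: the *_total variables count purchase rows, not unique players, so these values equal the average purchase price;
#dividing by the number of distinct SN values per gender would give a true per-person figure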
males_avg_total_p = males_total_p/males_total
females_avg_total_p = females_total_p/females_total
others_avg_total_p = others_total_p/others_total

Gender_purchase_summary = pd.DataFrame({
    "Purchase Count": [males_total,females_total,others_total],
    "Average Purchase Price": [males_avg_p,females_avg_p,others_avg_p],
    "Total Purchase Value": [males_total_p,females_total_p,others_total_p],
    "Avg Total Purchase per Person": [males_avg_total_p,females_avg_total_p,others_avg_total_p]
})
Gender_purchase_summary.index = ["Male","Female","Other/Non-Disclosed"]
Gender_purchase_summary["Average Purchase Price"] = Gender_purchase_summary["Average Purchase Price"].astype(float).map("${0:,.2f}".format)
Gender_purchase_summary["Total Purchase Value"] = Gender_purchase_summary["Total Purchase Value"].astype(float).map("${0:,.2f}".format)
Gender_purchase_summary["Avg Total Purchase per Person"] = Gender_purchase_summary["Avg Total Purchase per Person"].astype(float).map("${0:,.2f}".format)
Gender_purchase_summary


| | Purchase Count | Average Purchase Price | Total Purchase Value | Avg Total Purchase per Person |
|---|---|---|---|---|
| Male | 652 | $3.02 | $1,967.64 | $3.02 |
| Female | 113 | $3.20 | $361.94 | $3.20 |
| Other/Non-Disclosed | 15 | $3.35 | $50.19 | $3.35 |
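
In the table above, "Avg Total Purchase per Person" matches "Average Purchase Price" in every row because the total is divided by the purchase count. One way to get a per-person figure is to divide by the number of distinct `SN` values in each group. The sketch below assumes that approach and uses a made-up frame `sample`, not the real purchase data.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data
sample = pd.DataFrame({
    "SN": ["Ana", "Ana", "Bo", "Cy", "Cy", "Dee"],
    "Gender": ["Female", "Female", "Male", "Male", "Male", "Other / Non-Disclosed"],
    "Price": [1.00, 2.00, 3.00, 4.00, 5.00, 6.00],
})

by_gender = sample.groupby("Gender")
purchase_summary = pd.DataFrame({
    "Purchase Count": by_gender["Price"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})

# Divide by distinct players per gender, not by purchase rows
purchase_summary["Avg Total Purchase per Person"] = (
    purchase_summary["Total Purchase Value"] / by_gender["SN"].nunique()
)
print(purchase_summary)
```

Here "Male" has three purchases from two players, so the per-person value (6.00) differs from the average purchase price (4.00).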


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


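In [28]:
# Sanity check; relies on Age_bin and Age_labels defined in cell In [21] below
purchase_data["Age"].value_counts()
binned_ages = pd.cut(purchase_data["Age"], Age_bin, labels=Age_labels, include_lowest=True)
# count() skips NaN, so a result below the 780 purchase rows means some ages fall outside the bins
binned_ages.count()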

773

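In [21]:
# pd.cut intervals are right-closed by default, so label "15" covers (10, 15];
# include_lowest=True makes the first bin [0, 10]
Age_bin = [0,10,15,20,25,30,35,40]
Age_labels = ["<10","15","20","25","30","35","40+"]
# Ages above 40 fall outside the last edge and come back as NaN
Age_bins_df = pd.cut(purchase_data["Age"], Age_bin, labels=Age_labels, include_lowest=True)

Age_bins_df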

| | Total Count | Percentage of Players |
|---|---|---|
| <10 | | |
| 15 | | |
| 20 | | |
| 25 | | |
| 30 | | |
| 35 | | |
| 40+ | | |
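
The bin edges above stop at 40 and use right-closed intervals, so they do not line up with groups such as 15-19 or 20-24. The sketch below shows one possible way to complete the steps listed for this section, assuming left-closed bins with an open-ended last group. The names `age_edges`, `age_labels`, and `sample` are illustrative, and the data is made up.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data
sample = pd.DataFrame({
    "SN": ["Ana", "Ana", "Bo", "Cy", "Dee", "Eli"],
    "Age": [7, 7, 15, 22, 24, 45],
})

# Left-closed bins: [0, 10), [10, 15), ..., [40, inf)
age_edges = [0, 10, 15, 20, 25, 30, 35, 40, float("inf")]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# One row per player, then assign each player to an age group
players = sample.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(
    players["Age"], bins=age_edges, labels=age_labels, right=False
)

# Counts per group in label order; empty groups appear with a count of 0
age_counts = players["Age Group"].value_counts().sort_index()
age_summary = pd.DataFrame({
    "Total Count": age_counts,
    "Percentage of Players": (age_counts / age_counts.sum() * 100).round(2),
})
print(age_summary)
```

With `right=False`, an age of exactly 15 lands in "15-19" rather than in the group below it, and the `float("inf")` edge keeps ages above 40 from being dropped.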


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
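
The steps above could be sketched as follows. This is a minimal illustration rather than the notebook's own solution: it assumes left-closed age bins with an open-ended last group, and `sample`, `age_edges`, and `age_labels` are made-up names and values.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data
sample = pd.DataFrame({
    "SN": ["Ana", "Ana", "Bo", "Cy", "Dee", "Eli"],
    "Age": [7, 7, 15, 22, 24, 45],
    "Price": [1.00, 2.00, 3.00, 4.00, 5.00, 6.00],
})

age_edges = [0, 10, 15, 20, 25, 30, 35, 40, float("inf")]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample["Age Group"] = pd.cut(
    sample["Age"], bins=age_edges, labels=age_labels, right=False
)

# observed=True keeps only age groups that actually occur in the data
by_age = sample.groupby("Age Group", observed=True)
age_purchases = pd.DataFrame({
    "Purchase Count": by_age["Price"].count(),
    "Average Purchase Price": by_age["Price"].mean(),
    "Total Purchase Value": by_age["Price"].sum(),
})

# Per-person average divides by distinct players in each group
age_purchases["Avg Total Purchase per Person"] = (
    age_purchases["Total Purchase Value"] / by_age["SN"].nunique()
)
print(age_purchases)
```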

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
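
A minimal sketch of the steps above, assuming spenders are identified by `SN`. The frame `sample` is made up for illustration; with the real data, `purchase_data` would take its place.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data
sample = pd.DataFrame({
    "SN": ["Ana", "Ana", "Bo", "Cy", "Cy", "Cy"],
    "Price": [4.00, 3.00, 5.00, 1.00, 1.50, 2.00],
})

# Aggregate per player, then sort by total spend, highest first
top_spenders = (
    sample.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

# Format a copy for display so the numeric frame stays sortable
formatted = top_spenders.copy()
for col in ["Average Purchase Price", "Total Purchase Value"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)
print(formatted.head())
```

Sorting before formatting matters: once the values are strings such as "$7.00", sorting would compare text rather than amounts.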



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
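
The steps above could look like the sketch below. It assumes each item has a single price, so the mean of `Price` within a group recovers the item price. The frame `sample` and its item names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.00],
})

# Retrieve only the item columns, then aggregate per item
items = sample[["Item ID", "Item Name", "Price"]]
item_summary = (
    items.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

# Most purchased items first
most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```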



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
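
A minimal sketch of this step, assuming a per-item summary indexed by `Item ID` and `Item Name`. The summary is rebuilt here from a made-up frame `sample` so the block runs on its own.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.00],
})

# Per-item purchase count and total purchase value
item_summary = (
    sample.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "sum"])
    .rename(columns={"count": "Purchase Count", "sum": "Total Purchase Value"})
)

# Highest-grossing items first
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Format a copy for display only
preview = most_profitable.copy()
preview["Total Purchase Value"] = preview["Total Purchase Value"].map("${:,.2f}".format)
print(preview.head())
```

In this sample the most purchased item ("Sword", three purchases) is not the most profitable one ("Shield", $9.00 in total), which is why the two rankings are reported separately.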

