### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is the 20-24 group (44.8%), followed by the 15-19 (18.6%) and 25-29 (13.4%) groups.
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

pd.set_option("display.max_rows",999)

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head()



| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 |


* Display the total number of players


In [6]:
player_cnt = purchase_data["SN"].nunique()
player_cnt

576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [12]:
#Number of unique items
unique_num = purchase_data["Item ID"].nunique()

#Average price of items
avg_price = round(purchase_data["Price"].mean(),2)

#Total number of items purchased
total_num_purch = purchase_data["Purchase ID"].nunique()

#Total revenue from items sold
total_rev = purchase_data["Price"].sum()

#Price of the most expensive item
hi_cost = round(purchase_data["Price"].max(),2)

#The most expensive item
hi_item = purchase_data.loc[purchase_data["Price"] == hi_cost, "Item Name"].iloc[0]

#Price of the cheapest item
lo_cost = round(purchase_data["Price"].min(),2)

#The cheapest item
lo_item = purchase_data.loc[purchase_data["Price"] == lo_cost, "Item Name"].iloc[0]

summary_df = pd.DataFrame({"Number of Unique Items":[unique_num],
                           "Average Cost of Items":[avg_price],
                           "Total Number of Purchases":[total_num_purch],
                           "Total Revenue":[total_rev],
                           "Most Expensive Item":[hi_item],
                           "Price of High Item":[hi_cost],
                           "Lowest Purchase":[lo_item],
                           "Price of Low Item":[lo_cost]
                         })
summary_df

| | Number of Unique Items | Average Cost of Items | Total Number of Purchases | Total Revenue | Most Expensive Item | Price of High Item | Lowest Purchase | Price of Low Item |
|---|---|---|---|---|---|---|---|---|
| 0 | 183 | 3.05 | 780 | 2379.77 | Stormfury Mace | 4.99 | Whistling Mithril Warblade | 1.0 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [13]:
#calculates the number of unique players in each gender
gender_cnt = purchase_data.groupby("Gender")["SN"].nunique()
gender_cnt_df = gender_cnt.to_frame("Total Count").reset_index()

#calculates the percentage of genders
gender_percent = round(((gender_cnt)/player_cnt)*100,2)
gender_percent_df = gender_percent.to_frame("Percentage of Players").reset_index()

#merge the two dataframes to display a summary. 
gender_demographics_df = gender_cnt_df.merge(gender_percent_df).set_index("Gender")
gender_demographics_df

| Gender | Total Count | Percentage of Players |
|---|---|---|
| Female | 81 | 14.06 |
| Male | 484 | 84.03 |
| Other / Non-Disclosed | 11 | 1.91 |



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [17]:
#   * Purchase Count
gender_purchase = purchase_data.groupby("Gender")["Price"].count()
g_purch_df = gender_purchase.to_frame("Purchases").reset_index()

#   * Average Purchase Price
gender_purchase_avg = round(purchase_data.groupby("Gender")["Price"].mean(),2)
g_purch_avg_df = gender_purchase_avg.to_frame("Purchase Average").reset_index()

#   * Total Purchase Value
gender_total_value = purchase_data.groupby("Gender")["Price"].sum()
g_tot_val_df = gender_total_value.to_frame("Total Purchase Value").reset_index()

#   * Average Purchase Total per Person by Gender
cust_avg_total = round(gender_total_value/gender_cnt,2)
c_avg_tot_df = cust_avg_total.to_frame("Average Total Purchase per Person").reset_index()

#Merge the dataframes and set index to Gender. 
purch_gend_analysis_df = g_purch_df.merge(g_purch_avg_df).merge(g_tot_val_df).merge(c_avg_tot_df).set_index("Gender")
purch_gend_analysis_df

| Gender | Purchases | Purchase Average | Total Purchase Value | Average Total Purchase per Person |
|---|---|---|---|---|
| Female | 113 | 3.2 | 361.94 | 4.47 |
| Male | 652 | 3.02 | 1967.64 | 4.07 |
| Other / Non-Disclosed | 15 | 3.35 | 50.19 | 4.56 |


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [18]:
#creates the age brackets
age_bins = [0,9,14,19,24,29,34,39,100]
age_labels = ["<10","10-14","15-19","20-24","25-29","30-34","35-39","40+"]

#assigns each purchase row to an age bracket based on its Age value
purchase_data["Age Group"] = pd.cut(purchase_data["Age"], age_bins, labels=age_labels)

#groups the unique SNs by age and counts them.  Then it converts the series to a dataframe
age_breakdown = purchase_data.groupby("Age Group")["SN"].nunique()
age_breakdown_df = age_breakdown.to_frame("Total Count").reset_index()

#Calculates the percentage of players in each age group and rounds to 2 decimals. Then converts the series into a dataframe
age_percentage = round((age_breakdown/player_cnt)*100,2)
age_percentage_df = age_percentage.to_frame("Percentage of Players").reset_index()

#Merges the two dataframes and sets the age group column as the new index.
age_demographic_df = age_breakdown_df.merge(age_percentage_df).set_index("Age Group")
age_demographic_df





| Age Group | Total Count | Percentage of Players |
|---|---|---|
| <10 | 17 | 2.95 |
| 10-14 | 22 | 3.82 |
| 15-19 | 107 | 18.58 |
| 20-24 | 258 | 44.79 |
| 25-29 | 77 | 13.37 |
| 30-34 | 52 | 9.03 |
| 35-39 | 31 | 5.38 |
| 40+ | 12 | 2.08 |


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [19]:
#Count the purchases made in each age group. Then convert to DF
purchase_cnt = purchase_data.groupby("Age Group")["Age"].count()
purchase_cnt_df = purchase_cnt.to_frame("Purchase Count").reset_index()

#calculates the average purchase price per age group.  Then convert to DF
avg_purch_price = round(purchase_data.groupby("Age Group")["Price"].mean(),2)
avg_purch_price_df = avg_purch_price.to_frame("Average Purchase Price").reset_index()

#Get a total purchase value for each age group. Then convert to DF
total_purchase = round(purchase_data.groupby("Age Group")["Price"].sum(),2)
total_purchase_df = total_purchase.to_frame("Total Purchase Value").reset_index()

#Find the average total purchase value per person. Then convert to DF
avg_person_purch = round(total_purchase/age_breakdown,2).map("${:.2f}".format)
avg_person_purch_df = avg_person_purch.to_frame("Avg Total Purchase per Person").reset_index()

#merge all the dataframes together. Set age group as the index. 
purchase_analysis_df = purchase_cnt_df.merge(avg_purch_price_df).merge(total_purchase_df).merge(avg_person_purch_df).set_index("Age Group")
purchase_analysis_df

| Age Group | Purchase Count | Average Purchase Price | Total Purchase Value | Avg Total Purchase per Person |
|---|---|---|---|---|
| <10 | 23 | 3.35 | 77.13 | $4.54 |
| 10-14 | 28 | 2.96 | 82.78 | $3.76 |
| 15-19 | 136 | 3.04 | 412.89 | $3.86 |
| 20-24 | 365 | 3.05 | 1114.06 | $4.32 |
| 25-29 | 101 | 2.9 | 293.0 | $3.81 |
| 30-34 | 73 | 2.93 | 214.0 | $4.12 |
| 35-39 | 41 | 3.6 | 147.67 | $4.76 |
| 40+ | 13 | 2.94 | 38.24 | $3.19 |


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
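
The steps above could be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual solution: the player names and prices in the sample frame are hypothetical stand-ins for `Resources/purchase_data.csv`, and the column labels are assumptions modelled on the earlier tables.

```python
import pandas as pd

# Hypothetical sample rows standing in for Resources/purchase_data.csv
purchase_data = pd.DataFrame({
    "SN": ["PlayerA", "PlayerB", "PlayerA", "PlayerC", "PlayerA", "PlayerC"],
    "Price": [3.53, 1.56, 4.88, 3.27, 1.44, 2.00],
})

# One row per player: number of purchases, mean price, and total spent,
# sorted so the biggest spenders come first
top_spenders_df = (
    purchase_data.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

# Format currency on a copy so the numeric columns stay usable for further math
top_spenders_display = top_spenders_df.copy()
for col in ["Average Purchase Price", "Total Purchase Value"]:
    top_spenders_display[col] = top_spenders_display[col].map("${:.2f}".format)

print(top_spenders_display.head())
```

Sorting before formatting matters here: once the totals become strings such as `$9.85`, sorting would compare them as text rather than as numbers.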



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
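
One possible way to follow the steps above is sketched below. The item names and prices are hypothetical sample data, not rows from the real file, and taking the mean as the "Item Price" assumes each item is always sold at a single price.

```python
import pandas as pd

# Hypothetical sample rows; the real notebook reads Resources/purchase_data.csv
purchase_data = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Axe", "Axe", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

# Keep only the item columns, then aggregate per (Item ID, Item Name) pair
item_summary_df = (
    purchase_data[["Item ID", "Item Name", "Price"]]
    .groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",  # assumes one price per item
        "sum": "Total Purchase Value",
    })
)

# Most frequently purchased items first
most_popular_df = item_summary_df.sort_values("Purchase Count", ascending=False)
print(most_popular_df.head())
```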



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
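
A rough sketch of these steps follows. To keep it runnable on its own, it rebuilds a small item summary from hypothetical sample data instead of reusing a table from an earlier cell; the item names, prices, and column labels are all assumptions.

```python
import pandas as pd

# Hypothetical sample rows; the real notebook reads Resources/purchase_data.csv
purchase_data = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Axe", "Axe", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

# Per-item purchase count, price, and total revenue
item_summary_df = (
    purchase_data.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",  # assumes one price per item
        "sum": "Total Purchase Value",
    })
)

# Highest-revenue items first; sort while the values are still numeric
most_profitable_df = item_summary_df.sort_values("Total Purchase Value", ascending=False)

# Apply currency formatting to a copy for display only
most_profitable_display = most_profitable_df.copy()
for col in ["Item Price", "Total Purchase Value"]:
    most_profitable_display[col] = most_profitable_display[col].map("${:.2f}".format)

print(most_profitable_display.head())
```

In this sample the most popular item (Sword, three purchases) is not the most profitable one (Axe, two purchases at a higher price), which is why the two rankings are reported separately.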

