### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* The peak age demographic is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [3]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [4]:
total_players = purchase_data['SN'].nunique()
total_players

576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
# Number of Unique Items
unique_item_ct = purchase_data['Item ID'].nunique()
unique_item_ct

183

In [6]:
# Average Item Price: mean of each item's mean price, so every item counts once
avg_price = round(purchase_data.groupby('Item Name')['Price'].mean().mean(), 2)
avg_price

3.04
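
The cell above averages the per-item means, which is not the same as averaging over all purchases. Here is a minimal sketch of the difference, using a hypothetical toy table (`toy` and its values are made up for illustration and are not taken from `purchase_data.csv`):

```python
import pandas as pd

# Hypothetical toy purchases (not the real purchase_data.csv).
toy = pd.DataFrame({
    "Item Name": ["Fury", "Fury", "Fury", "Nirvana"],
    "Price": [1.00, 1.00, 1.00, 5.00],
})

# Mean over purchases: every row counts once.
avg_purchase_price = toy["Price"].mean()

# Mean of per-item means: every item counts once, however often it sold.
avg_item_price = toy.groupby("Item Name")["Price"].mean().mean()

print(avg_purchase_price)  # 2.0
print(avg_item_price)      # 3.0
```

For comparison, the totals reported in this notebook give 2379.77 / 780 ≈ 3.05 as the mean price per purchase, slightly above the 3.04 per-item figure.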

In [7]:
# Total Number of Purchases
purchase_ct = purchase_data['Purchase ID'].count()
purchase_ct

780

In [8]:
# Total Revenue
sales_revenue = purchase_data['Price'].sum()
sales_revenue

2379.77

In [9]:
# Summarize Purchase Data into DataFrame
summary = pd.DataFrame(data={'Number of Unique Items':[unique_item_ct],
                             'Average Price':[avg_price],
                             'Number of Purchases':[purchase_ct],
                             'Total Revenue':[sales_revenue]})
summary

Unnamed: 0,Number of Unique Items,Average Price,Number of Purchases,Total Revenue
0,183,3.04,780,2379.77


In [10]:
# Format Output: Total Revenue as Currency
summary['Total Revenue'] = summary['Total Revenue'].map("${:,.2f}".format)
summary

Unnamed: 0,Number of Unique Items,Average Price,Number of Purchases,Total Revenue
0,183,3.04,780,"$2,379.77"
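
One possible refinement is to format the average price as currency too, and to do it on a copy so the numeric columns stay usable. This is only a sketch: `display_summary` is a hypothetical name, and the values are copied from the summary table above.

```python
import pandas as pd

# Values copied from the summary table shown in this notebook.
summary = pd.DataFrame({
    "Number of Unique Items": [183],
    "Average Price": [3.04],
    "Number of Purchases": [780],
    "Total Revenue": [2379.77],
})

# Format a copy: the original keeps its floats for any later arithmetic.
display_summary = summary.copy()
for col in ["Average Price", "Total Revenue"]:
    display_summary[col] = display_summary[col].map("${:,.2f}".format)

print(display_summary)
```

Formatting in place turns the column into strings, so sums or sorts on it afterwards would no longer behave numerically.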


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [11]:
# Male Player Count
male_ct = purchase_data[['SN','Gender']].groupby('Gender')['SN'].nunique().loc['Male']
male_ct

484

In [12]:
# Male Player Percentage
male_pct = round((male_ct/total_players)*100,2)
male_pct

84.03

In [13]:
# Female Player Count
female_ct = purchase_data[['SN','Gender']].groupby('Gender')['SN'].nunique().loc['Female']
female_ct

81

In [14]:
# Female Player Percentage
female_pct = round((female_ct/total_players)*100,2)
female_pct

14.06

In [15]:
# Other / Non-Disclosed Player Count
other_ct = purchase_data[['SN','Gender']].groupby('Gender')['SN'].nunique().loc['Other / Non-Disclosed']
other_ct

11

In [16]:
# Other / Non-Disclosed Player Percentage
other_pct = round((other_ct/total_players)*100,2)
other_pct

1.91
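
The six values above could be gathered into a single table. The following is a sketch that starts from the counts already computed in this section; the names `gender_counts` and `gender_summary` and the column labels are assumptions chosen for illustration.

```python
import pandas as pd

# Unique-player counts per gender, as computed in the cells above.
gender_counts = pd.Series(
    {"Male": 484, "Female": 81, "Other / Non-Disclosed": 11},
    name="Total Count",
)

gender_summary = gender_counts.to_frame()

# Share of all players, as a percentage rounded to two decimals.
gender_summary["Percentage of Players"] = (
    gender_summary["Total Count"] / gender_summary["Total Count"].sum() * 100
).round(2)

print(gender_summary)
```

The three counts sum to 576, which matches `total_players`, so the percentages add up to 100.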


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [17]:
# By Gender: Purchase Count
gen_purchase_count = purchase_data.groupby('Gender')['SN'].count()
gen_purchase_count.reset_index().rename(columns={'SN':'Counts'})


Unnamed: 0,Gender,Counts
0,Female,113
1,Male,652
2,Other / Non-Disclosed,15


In [18]:
# By Gender: Avg Purchase Price
gen_avg_price = round(purchase_data.groupby('Gender')['Price'].mean(),2)
gen_avg_price.reset_index().rename(columns={'Price':'Avg Item Price'})

Unnamed: 0,Gender,Avg Item Price
0,Female,3.2
1,Male,3.02
2,Other / Non-Disclosed,3.35


In [19]:
# By Gender: Total Purchase Value
gen_total_purchase_value = round(purchase_data.groupby('Gender')['Price'].sum(),2)
gen_total_purchase_value.reset_index().rename(columns={'Price':'Total Purchase Value'})

Unnamed: 0,Gender,Total Purchase Value
0,Female,361.94
1,Male,1967.64
2,Other / Non-Disclosed,50.19


In [20]:
# By Gender: Avg Purchase Price per Player (mean price of each player's purchases)
gen_each_avg_purchase = round(purchase_data.groupby(['Gender','SN'])['Price'].mean(),2)
gen_each_avg_purchase.reset_index().rename(columns={'Price':'Avg. Price'}).head()

Unnamed: 0,Gender,SN,Avg. Price
0,Female,Adastirin33,4.48
1,Female,Aerithllora36,4.32
2,Female,Aethedru70,3.54
3,Female,Aidain51,3.45
4,Female,Aiduesu86,4.48
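
The cell above gives each player's mean purchase price. The average purchase total per person by gender can instead be derived by dividing each gender's total purchase value by its number of unique players. Here is a sketch using the figures already reported in this notebook; `gender_stats` and its column labels are hypothetical names.

```python
import pandas as pd

# Figures copied from the Gender Demographics and Purchasing Analysis (Gender) outputs.
gender_stats = pd.DataFrame(
    {
        "Purchase Count": [113, 652, 15],
        "Total Purchase Value": [361.94, 1967.64, 50.19],
        "Unique Players": [81, 484, 11],
    },
    index=pd.Index(["Female", "Male", "Other / Non-Disclosed"], name="Gender"),
)

# Total spent by the group divided by the number of distinct players in it.
gender_stats["Avg Total per Person"] = (
    gender_stats["Total Purchase Value"] / gender_stats["Unique Players"]
).round(2)

print(gender_stats)
```

Under these figures the per-person averages come out to 4.47 (Female), 4.07 (Male), and 4.56 (Other / Non-Disclosed).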


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [22]:
# By Age: Establish Bins For Ages (pd.cut bins are right-inclusive, so age 20 falls in Teen)
ages = [0,12,20,35,50,65,150]
categories = 'Child Teen Young_Adult Adult Mature Elderly'.split()
purchase_data['Age Group'] = pd.cut(purchase_data['Age'], ages, labels=categories)
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price,Age Group
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,Teen
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56,Adult
2,2,Ithergue48,24,Male,92,Final Critic,4.88,Young_Adult
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27,Young_Adult
4,4,Iskosia90,23,Male,131,Fury,1.44,Young_Adult


In [84]:
# By Age: Total Revenue per Age Group
age_data = purchase_data.groupby('Age Group')['Price'].sum()
age_summary = age_data.reset_index().rename(columns={'Price':'Revenue'})
age_summary

Unnamed: 0,Age Group,Revenue
0,Child,143.55
1,Teen,743.57
2,Young_Adult,1358.77
3,Adult,133.88
4,Mature,0.0
5,Elderly,0.0


In [73]:
# By Age: Each Group's Share of Total Revenue (%)
age_summary['%Total'] = round(age_summary['Revenue']/age_summary['Revenue'].sum(), 2)*100
age_summary

Unnamed: 0,Age Group,Revenue,%Total
0,Child,143.55,6.0
1,Teen,743.57,31.0
2,Young_Adult,1358.77,57.0
3,Adult,133.88,6.0
4,Mature,0.0,0.0
5,Elderly,0.0,0.0
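
The tables above break down revenue by age group. The player counts and percentages that the instructions ask for could be sketched as follows. The toy data and the names `toy`, `players`, and `age_demo` are hypothetical; the bins and labels are the ones defined in this section.

```python
import pandas as pd

# Hypothetical toy purchases: player "a" bought twice.
toy = pd.DataFrame({
    "SN": ["a", "a", "b", "c", "d", "e"],
    "Age": [20, 20, 24, 40, 10, 22],
})

ages = [0, 12, 20, 35, 50, 65, 150]
categories = "Child Teen Young_Adult Adult Mature Elderly".split()

# Keep one row per player so repeat buyers are not counted twice.
players = toy.drop_duplicates("SN").copy()
players["Age Group"] = pd.cut(players["Age"], ages, labels=categories)

# observed=False keeps empty age groups in the result with a count of 0.
age_demo = (
    players.groupby("Age Group", observed=False)
    .size()
    .to_frame(name="Total Count")
)
age_demo["Percentage of Players"] = (
    age_demo["Total Count"] / len(players) * 100
).round(2)

print(age_demo)
```

Dropping duplicate `SN` values first is the key step: grouping the raw purchase rows would count purchases rather than players.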


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [61]:
# By Age: Total Revenue per Age Group
age_data = purchase_data.groupby('Age Group')['Price'].sum()
age_data = age_data.reset_index().rename(columns={'Price':'Revenue'})
age_data.head()

Unnamed: 0,Age Group,Revenue
0,Child,143.55
1,Teen,743.57
2,Young_Adult,1358.77
3,Adult,133.88
4,Mature,0.0


In [70]:
# By Age: Each Group's Share of Total Revenue (%)
age_data['Pct Total'] = round(age_data['Revenue']/age_data['Revenue'].sum(), 2)*100
age_data.head()

Unnamed: 0,Age Group,Revenue,Pct Total
0,Child,143.55,6.0
1,Teen,743.57,31.0
2,Young_Adult,1358.77,57.0
3,Adult,133.88,6.0
4,Mature,0.0,0.0
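
The remaining metrics listed for this section (purchase count, average purchase price, average total per person) could be computed in one pass with named aggregation. This is a sketch on hypothetical toy data; `toy`, `age_purchases`, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical toy purchases: player "a" bought twice.
toy = pd.DataFrame({
    "SN": ["a", "a", "b", "c"],
    "Age": [20, 20, 24, 40],
    "Price": [1.0, 3.0, 2.0, 4.0],
})

ages = [0, 12, 20, 35, 50, 65, 150]
categories = "Child Teen Young_Adult Adult Mature Elderly".split()
toy["Age Group"] = pd.cut(toy["Age"], ages, labels=categories)

# observed=True drops age groups that have no purchases.
age_purchases = toy.groupby("Age Group", observed=True).agg(
    purchase_count=("Price", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Per-person average: group total divided by distinct players in the group.
age_purchases["avg_total_per_person"] = (
    age_purchases["total_value"] / age_purchases["players"]
)

print(age_purchases)
```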


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [74]:
# Total purchase value per player, highest first
(purchase_data.groupby('SN')['Price']
    .sum()
    .sort_values(ascending=False)
    .reset_index()
    .rename(columns={'Price':'Total Purchase Value'})
    .head())

Unnamed: 0,SN,Total Purchase Value
0,Lisosia93,18.96
1,Idastidru52,15.45
2,Chamjask73,13.83
3,Iral74,13.62
4,Iskadarya95,13.1
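
The table above shows only the total per player. A fuller summary with purchase count and average purchase price might look like the sketch below, which uses hypothetical toy data; `toy` and `spenders` are made-up names.

```python
import pandas as pd

# Hypothetical toy purchases.
toy = pd.DataFrame({
    "SN": ["a", "a", "b", "c", "c", "c"],
    "Price": [1.0, 3.0, 5.0, 1.0, 1.0, 1.5],
})

spenders = (
    toy.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

print(spenders.head())
```

Sorting by total rather than by count matters here: player `c` buys most often in the toy data but ranks last by value.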


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [23]:
# By Count: Most Popular Items
#purchase_data.groupby(['Item Name', 'Item ID', 'Price'])['Price'].count().sort_values(ascending=False).head()
purchase_data.groupby(['Item Name', 'Item ID'])['Price'].value_counts().sort_values(ascending=False).head()

Item Name                                     Item ID  Price
Oathbreaker, Last Hope of the Breaking Storm  178      4.23     12
Extraction, Quickblade Of Trembling Hands     108      3.53      9
Nirvana                                       82       4.90      9
Fiery Glass Crusader                          145      4.58      9
Shadow Strike, Glory of Ending Hope           37       3.16      8
Name: Price, dtype: int64

In [24]:
# Same counts as above, converted to a DataFrame with a named column
example = purchase_data.groupby(['Item Name', 'Item ID'])['Price'].value_counts().sort_values(ascending=False).to_frame().head()
example.columns = ['Counts']
example = example.reset_index()
example

Unnamed: 0,Item Name,Item ID,Price,Counts
0,"Oathbreaker, Last Hope of the Breaking Storm",178,4.23,12
1,"Extraction, Quickblade Of Trembling Hands",108,3.53,9
2,Nirvana,82,4.9,9
3,Fiery Glass Crusader,145,4.58,9
4,"Shadow Strike, Glory of Ending Hope",37,3.16,8
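
An alternative to `value_counts()` is to group by item and aggregate purchase count, item price, and total purchase value together, as the instructions describe. This is a sketch only: the purchase rows in `toy` are hypothetical, and taking the first price per group assumes each item has a single price.

```python
import pandas as pd

# Hypothetical toy purchases (the row counts are made up).
toy = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Fury"] * 3 + ["Nirvana"] * 2 + ["Blindscythe"],
    "Price": [1.44] * 3 + [4.90] * 2 + [3.27],
})

items = (
    toy.groupby(["Item ID", "Item Name"])["Price"]
    .agg(purchase_count="count", item_price="first", total_value="sum")
    .sort_values("purchase_count", ascending=False)
)

print(items.head())
```

Keeping all three columns in one frame means the same table can later be re-sorted by `total_value` without recomputing anything.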


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [42]:
# Total revenue per item, highest first
profit_data = purchase_data.groupby(['Item Name', 'Item ID'])['Price'].sum().sort_values(ascending=False)
profit_data.reset_index().rename(columns={'Price':'Total Rev'}).head()

Unnamed: 0,Item Name,Item ID,Total Rev
0,"Oathbreaker, Last Hope of the Breaking Storm",178,50.76
1,Nirvana,82,44.1
2,Fiery Glass Crusader,145,41.22
3,Final Critic,92,39.04
4,Singed Scalpel,103,34.8
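
As a quick consistency check, total revenue for an item sold at a single price should equal price times purchase count. The sketch below applies that to the three items that appear in both the popularity and profitability previews in this notebook; `check` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Rows copied from the Most Popular Items and Most Profitable Items outputs.
check = pd.DataFrame({
    "Item Name": [
        "Oathbreaker, Last Hope of the Breaking Storm",
        "Nirvana",
        "Fiery Glass Crusader",
    ],
    "Price": [4.23, 4.90, 4.58],
    "Counts": [12, 9, 9],
    "Total Rev": [50.76, 44.10, 41.22],
})

# Price x count, compared with the reported total (tolerant of float rounding).
check["Expected Rev"] = (check["Price"] * check["Counts"]).round(2)
check["Matches"] = np.isclose(check["Expected Rev"], check["Total Rev"])

print(check[["Item Name", "Expected Rev", "Total Rev", "Matches"]])
```

All three rows agree. The check would not hold for an item that was sold at more than one price.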
