### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).  
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
#purchase_data.head(3)

## Player Count

* Display the total number of players


In [2]:
total_count = purchase_data["SN"].unique()
pd.DataFrame({"Total Players":[len(total_count)]})

| | Total Players |
|---|---|
| 0 | 576 |


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [3]:
#Calculations: number of unique items, average price, purchases, total revenue
uniq_items = len(purchase_data["Item ID"].unique())
average_price = purchase_data["Price"].mean()
number_of_purchases = purchase_data["Purchase ID"].count()
total_revenue = purchase_data["Price"].sum()

#The summary data frame
df = pd.DataFrame({"Number of Unique Items":[uniq_items],
             "Average Price":[average_price],
             "Number of Purchases":[number_of_purchases],
             "Total Revenue":[total_revenue]})
#Format
df['Average Price'] = df['Average Price'].map('${:,.2f}'.format)
df['Total Revenue'] = df['Total Revenue'].map('${:,.2f}'.format)
df

| | Number of Unique Items | Average Price | Number of Purchases | Total Revenue |
|---|---|---|---|---|
| 0 | 183 | $3.05 | 780 | $2,379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [4]:
#Gender count (one row per unique player)
gender_sn = purchase_data.loc[:, ["Gender", "SN"]]
gender_sn = gender_sn.drop_duplicates()
count = gender_sn["Gender"].value_counts()
#Percentage of unique players, not of purchases
perc = count/count.sum()*100
#DataFrame
gender_dataframe = pd.DataFrame({"Total Count": count,
                                "Percentage of Players": perc})
#Format
gender_dataframe['Percentage of Players'] = gender_dataframe['Percentage of Players'].map('{:,.2f}'.format)
gender_dataframe

| Gender | Total Count | Percentage of Players |
|---|---|---|
| Male | 484 | 84.03 |
| Female | 81 | 14.06 |
| Other / Non-Disclosed | 11 | 1.91 |
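
The count-and-percentage step can also be sketched with `value_counts(normalize=True)`, which returns shares directly. This is a minimal sketch on a hypothetical toy frame (the player names and prices are invented for illustration, not taken from `purchase_data.csv`). It assumes each player should be counted once regardless of how many purchases they made.

```python
import pandas as pd

# Hypothetical toy purchases: three players, one of whom bought twice
toy = pd.DataFrame({
    "SN": ["Ana1", "Ana1", "Bo22", "Cy33"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Price": [1.00, 2.00, 3.00, 4.00],
})

# Keep one row per player so repeat buyers are not counted twice
players = toy.drop_duplicates(subset="SN")

# Absolute counts and percentage shares over unique players
gender_counts = players["Gender"].value_counts()
gender_share = players["Gender"].value_counts(normalize=True) * 100

gender_summary = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_share.round(2),
})
print(gender_summary)
```

Dividing by the number of purchase rows instead of the number of unique players would weight frequent buyers more heavily, so the denominator is worth checking.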



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [7]:
#Calculation (select "Price" before aggregating so text columns are skipped)
gender_prices = purchase_data.groupby("Gender")["Price"]
total_purchase = gender_prices.sum()
average_purchase_price = gender_prices.mean()
purchase_count = gender_prices.count()
#"count" is the unique-player count per gender from the Gender Demographics cell
average_per_person = total_purchase/count
#DataFrame
purchasing_dataframe = pd.DataFrame({"Purchase Count": purchase_count,
                                    "Average Purchase Price": average_purchase_price,
                                    "Total Purchase Value": total_purchase,
                                    "Avg Total Purchase per Person":average_per_person})
#Format
purchasing_dataframe['Average Purchase Price'] = purchasing_dataframe['Average Purchase Price'].map('${:,.2f}'.format)
purchasing_dataframe['Total Purchase Value'] = purchasing_dataframe['Total Purchase Value'].map('${:,.2f}'.format)
purchasing_dataframe['Avg Total Purchase per Person'] = purchasing_dataframe['Avg Total Purchase per Person'].map('${:,.2f}'.format)
purchasing_dataframe

| Gender | Purchase Count | Average Purchase Price | Total Purchase Value | Avg Total Purchase per Person |
|---|---|---|---|---|
| Female | 113 | $3.20 | $361.94 | $4.47 |
| Male | 652 | $3.02 | $1,967.64 | $4.07 |
| Other / Non-Disclosed | 15 | $3.35 | $50.19 | $4.56 |
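
As an alternative to building each Series separately, the per-gender figures can be sketched with a single named aggregation. The toy data and the snake_case column names below are assumptions made for illustration. The per-person figure divides total spend by the number of unique players in each group.

```python
import pandas as pd

# Hypothetical toy purchases (not from purchase_data.csv)
toy = pd.DataFrame({
    "SN": ["Ana1", "Ana1", "Bo22", "Cy33"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Price": [1.00, 2.00, 3.00, 4.00],
})

# One pass over the groups: each keyword is output_column=(source_column, function)
by_gender = toy.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    average_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Average total spend per unique player in each group
by_gender["avg_total_per_person"] = by_gender["total_value"] / by_gender["players"]
print(by_gender)
```

Keeping the unique-player count in the same frame avoids depending on a variable defined in an earlier cell.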


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [8]:
#Create Bins and Groups
bins = [0, 9.9, 14.9, 19.9, 24.9, 29.9, 34.9, 39.9, 9999999]
group_names = ["<10","10-14","15-19","20-24","25-29","30-34","35-39","40+"]
#One row per unique player
players_age = purchase_data.loc[:, ["Gender", "SN", "Age"]]
players_age = players_age.drop_duplicates()
players_count = len(players_age)

In [9]:
players_age["Age Range"] = pd.cut(players_age["Age"], bins, labels=group_names)
#Total Counts
players_age_count = players_age["Age Range"].value_counts()
#Percentage
percentage_players = players_age_count/players_count*100
#DataFrame
age_demographics_dataframe = pd.DataFrame({"Total Count": players_age_count,
                                            "Percentage of Players": percentage_players})
#Format
age_demographics_dataframe['Percentage of Players'] = age_demographics_dataframe['Percentage of Players'].map('{:,.2f}'.format)
age_demographics_dataframe.sort_index()

| Age Range | Total Count | Percentage of Players |
|---|---|---|
| <10 | 17 | 2.95 |
| 10-14 | 22 | 3.82 |
| 15-19 | 107 | 18.58 |
| 20-24 | 258 | 44.79 |
| 25-29 | 77 | 13.37 |
| 30-34 | 52 | 9.03 |
| 35-39 | 31 | 5.38 |
| 40+ | 12 | 2.08 |
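
The binning step relies on how `pd.cut` treats bin edges. By default intervals are closed on the right, which is why fractional edges such as 9.9 are used to keep age 10 out of the `<10` group. A possible alternative, sketched here on a hypothetical list of ages, is to pass `right=False` so that whole-number edges can be used directly.

```python
import pandas as pd

# Hypothetical ages chosen to sit on the bin boundaries
ages = pd.Series([7, 9, 10, 14, 15, 24, 39, 40, 45])

# Left-closed bins: [0, 10), [10, 15), ..., [40, inf)
edges = [0, 10, 15, 20, 25, 30, 35, 40, float("inf")]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

age_range = pd.cut(ages, bins=edges, labels=labels, right=False)
print(pd.DataFrame({"Age": ages, "Age Range": age_range}))
```

With left-closed bins, an age of exactly 10 lands in `10-14` and an age of exactly 40 lands in `40+`.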


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [10]:
#Add price to Players age data (keep every purchase; repeat purchases are valid rows)
players_age_price = purchase_data.loc[:, ["Gender", "SN", "Age","Price"]].copy()
players_age_price["Age Range"] = pd.cut(players_age_price["Age"], bins, labels=group_names)
group_players = players_age_price.groupby("Age Range", observed=False)
#Calculation
group_players_count = group_players["Price"].count()
group_players_mean = group_players["Price"].mean()
group_players_sum = group_players["Price"].sum()
#players_age_count is the unique-player count per age range from the Age Demographics cell
group_players_per_person = group_players_sum / players_age_count
#DataFrame
age_purchasing_dataframe = pd.DataFrame({"Purchase Count": group_players_count,
                                         "Average Purchase Price": group_players_mean,
                                         "Total Purchase Value":group_players_sum,
                                         "Avg Total Purchase per Person":group_players_per_person})
#Format
age_purchasing_dataframe['Average Purchase Price'] = age_purchasing_dataframe['Average Purchase Price'].map('${:,.2f}'.format)
age_purchasing_dataframe['Total Purchase Value'] = age_purchasing_dataframe['Total Purchase Value'].map('${:,.2f}'.format)
age_purchasing_dataframe['Avg Total Purchase per Person'] = age_purchasing_dataframe['Avg Total Purchase per Person'].map('${:,.2f}'.format)

age_purchasing_dataframe

| Age Range | Purchase Count | Average Purchase Price | Total Purchase Value | Avg Total Purchase per Person |
|---|---|---|---|---|
| 10-14 | 28 | $2.96 | $82.78 | $3.76 |
| 15-19 | 136 | $3.04 | $412.89 | $3.86 |
| 20-24 | 365 | $3.05 | $1,114.06 | $4.32 |
| 25-29 | 101 | $2.90 | $293.00 | $3.81 |
| 30-34 | 73 | $2.93 | $214.00 | $4.12 |
| 35-39 | 41 | $3.60 | $147.67 | $4.76 |
| 40+ | 13 | $2.94 | $38.24 | $3.19 |
| <10 | 23 | $3.35 | $77.13 | $4.54 |


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [11]:
#Calculation
spender_prices = purchase_data.groupby("SN")["Price"]
total_purchase_value = spender_prices.sum()
average_purchase = spender_prices.mean()
purchase_count = spender_prices.count()

#DataFrame
top_spenders_dataframe = pd.DataFrame({"Purchase Count": purchase_count,
                                       "Average Purchase Price": average_purchase,
                                       "Total Purchase Value": total_purchase_value})
#Sort on the numeric totals before formatting them as strings
top_spenders_dataframe = top_spenders_dataframe.sort_values("Total Purchase Value", ascending=False)
#Format
top_spenders_dataframe['Average Purchase Price'] = top_spenders_dataframe['Average Purchase Price'].map('${:,.2f}'.format)
top_spenders_dataframe['Total Purchase Value'] = top_spenders_dataframe['Total Purchase Value'].map('${:,.2f}'.format)

top_spenders_dataframe.head(5)


| SN | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| Lisosia93 | 5 | $3.79 | $18.96 |
| Idastidru52 | 4 | $3.86 | $15.45 |
| Chamjask73 | 3 | $4.61 | $13.83 |
| Iral74 | 4 | $3.40 | $13.62 |
| Iskadarya95 | 3 | $4.37 | $13.10 |


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [12]:
#Group by Item ID and Item Name
popular_count_purchase = purchase_data.groupby(["Item ID","Item Name"])["Price"].count()
popular_count_purchase.sort_values(ascending=False).head(5)



Item ID  Item Name                                   
178      Oathbreaker, Last Hope of the Breaking Storm    12
82       Nirvana                                          9
108      Extraction, Quickblade Of Trembling Hands        9
145      Fiery Glass Crusader                             9
92       Final Critic                                     8
Name: Price, dtype: int64
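
The steps listed for this section also call for item price and total purchase value alongside the purchase count. A minimal sketch of the full summary follows, on hypothetical toy data. The item names are invented, and taking the mean of `Price` as the item price assumes each item sells at a single price.

```python
import pandas as pd

# Hypothetical toy purchases (not from purchase_data.csv)
toy = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Axe", "Axe", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

# Group by both keys and compute the three figures in one pass
items = toy.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "mean"),   # assumes one price per item
    total_value=("Price", "sum"),
)

# Rename to display labels, then sort by popularity
most_popular = items.rename(columns={
    "purchase_count": "Purchase Count",
    "item_price": "Item Price",
    "total_value": "Total Purchase Value",
}).sort_values("Purchase Count", ascending=False)

print(most_popular.head(5))
```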

## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
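
One way these steps might look is sketched below on hypothetical toy data (invented item names and prices). The sort runs on the numeric totals first, because currency-formatted strings sort lexicographically and would place "$9.00" above "$10.00".

```python
import pandas as pd

# Hypothetical toy purchases (not from purchase_data.csv)
toy = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Axe", "Axe", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

items = toy.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "mean"),   # assumes one price per item
    total_value=("Price", "sum"),
)

# Sort while the totals are still numbers
most_profitable = items.sort_values("total_value", ascending=False)

# Format a copy for display so the numeric frame stays usable
preview = most_profitable.head(5).copy()
preview["item_price"] = preview["item_price"].map("${:,.2f}".format)
preview["total_value"] = preview["total_value"].map("${:,.2f}".format)
print(preview)
```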

