### Heroes Of Pymoli Data Analysis
Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

The peak age demographic is 20-24 (44.8%), followed by the secondary groups 15-19 (18.6%) and 25-29 (13.4%).



In [86]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data['Age'].max()

45

### Player Count

In [87]:
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [89]:
purchase_data_nodups = purchase_data.drop_duplicates(subset='SN')
purchase_data_nodups

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44
5,5,Yalae81,22,Male,81,Dreamkiss,3.61
6,6,Itheria73,36,Male,169,"Interrogator, Blood Blade of the Queen",2.18
7,7,Iskjaskst81,20,Male,162,Abyssal Shard,2.67
8,8,Undjask33,22,Male,21,Souleater,1.10
9,9,Chanosian48,35,Other / Non-Disclosed,136,Ghastly Adamantite Protector,3.58


### Purchasing Analysis (Total)

In [115]:
number_of_players = purchase_data['SN'].nunique()

number_of_unique_items = purchase_data['Item ID'].nunique()

average_price = round(purchase_data['Price'].mean(),2)

number_of_purchases = purchase_data.shape[0]

total_revenue = purchase_data['Price'].sum()

purchasing_analysis_df = pd.DataFrame({'Number of Unique Items': [number_of_unique_items],
              'Average Price': [average_price], 
              'Number of Purchases': [number_of_purchases], 
              'Total Revenue': [total_revenue]})
purchasing_analysis_df 

576


Unnamed: 0,Number of Unique Items,Average Price,Number of Purchases,Total Revenue
0,183,3.05,780,2379.77


### Gender Demographics

In [91]:
gender_players = purchase_data.groupby('Gender')
gender_players_df = gender_players.nunique()

# Percentage of Male Players (select by label rather than by position)
number_of_male_players = gender_players_df.loc['Male','SN']
percentage_of_male_players = round((number_of_male_players/number_of_players)*100,2)
#print(number_of_male_players)
#print(percentage_of_male_players)

# Percentage of Female Players (select by label rather than by position)
number_of_female_players = gender_players_df.loc['Female','SN']
percentage_of_female_players = round((number_of_female_players/number_of_players)*100,2)
#print(number_of_female_players)
#print(percentage_of_female_players)

# Percentage of Other Players
number_of_other_players = number_of_players - (number_of_male_players + number_of_female_players)
percentage_of_other_players = round(100 - (percentage_of_female_players+percentage_of_male_players),2)
#print(number_of_other_players)
#print(percentage_of_other_players)


data1 = {'Total Count': [number_of_male_players,number_of_female_players,number_of_other_players], 
        'Percentage of Players': [percentage_of_male_players,percentage_of_female_players,percentage_of_other_players]}

player_gender_df = pd.DataFrame(data1,index=['Male','Female','Other'])
player_gender_df


Unnamed: 0,Total Count,Percentage of Players
Male,484,84.03
Female,81,14.06
Other,11,1.91
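
The same counts can also be obtained without row and column lookups. The following is a minimal sketch, assuming a small made-up frame (`toy`) in place of `purchase_data`; it deduplicates on `SN` and lets `value_counts()` do the tallying. The names `toy`, `players`, and `gender_summary` are illustrative and do not appear in the notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for purchase_data (values are made up)
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C", "D"],
    "Gender": ["Male", "Male", "Female", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat purchasers are counted once
players = toy.drop_duplicates(subset="SN")

# Tally players per gender and express each tally as a share of all players
counts = players["Gender"].value_counts()
gender_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})
print(gender_summary)
```

Because every gender category is counted directly, the "Other" row does not have to be derived by subtraction.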


### Purchasing Analysis (Gender)

In [290]:
groupby_df = purchase_data.groupby('Gender')
groupby_nodups_df = purchase_data_nodups.groupby('Gender')

# Number of male/female/other purchasers (no-duplicates)
male_nodups_count = groupby_nodups_df.get_group('Male').shape[0]
female_nodups_count = groupby_nodups_df.get_group('Female').shape[0]
other_nodups_count = groupby_nodups_df.get_group('Other / Non-Disclosed').shape[0]

# Creating Sub-DataFrames
male_df = groupby_df.get_group('Male')
female_df = groupby_df.get_group('Female')
other_df = groupby_df.get_group('Other / Non-Disclosed')

# Purchase Counts
male_purchase_count = male_df['Gender'].count()
female_purchase_count = female_df['Gender'].count()
other_purchase_count = other_df['Gender'].count()

# Average Purchase Price
male_avg_purchase = round(male_df['Price'].mean(),2)
female_avg_purchase = round(female_df['Price'].mean(),2)
other_avg_purchase = round(other_df['Price'].mean(),2)

# Total Purchase Price
male_total_purchase = male_df['Price'].sum()
female_total_purchase = female_df['Price'].sum()
other_total_purchase = other_df['Price'].sum()

# Average Total Purchase per Person (total spend divided by unique purchasers)
male_avg_total = round(male_total_purchase/male_nodups_count,2)

# Create Purchase Analysis (Gender) DataFrame
data2 = {'Purchase Count': [female_purchase_count,male_purchase_count,other_purchase_count],
        'Average Purchase Price': [female_avg_purchase,male_avg_purchase,other_avg_purchase],
        'Total Purchase Value': [female_total_purchase,male_total_purchase,other_total_purchase],
        'Average Total Purchase per Person': [round(female_total_purchase/female_nodups_count,2),round(male_total_purchase/male_nodups_count,2),round(other_total_purchase/other_nodups_count,2)]}

gender_purchase_df = pd.DataFrame(data2, index=['Female','Male','Other'])
gender_purchase_df.index.name = 'Gender'
gender_purchase_df


Gender,Purchase Count,Average Purchase Price,Total Purchase Value,Average Total Purchase per Person
Female,113,3.2,361.94,4.47
Male,652,3.02,1967.64,4.07
Other,15,3.35,50.19,4.56
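
The per-gender figures can also be computed in a single pass with named aggregation. This is a sketch under stated assumptions, not the notebook's method: `toy` is made-up data, and the column names `purchase_count`, `avg_price`, `total_value`, `players`, and `avg_total_per_person` are chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for purchase_data (values are made up)
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [1.00, 3.00, 2.00, 4.00],
})

# One groupby yields purchase count, mean price, total value and unique players
by_gender = toy.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Per-person average divides total spend by unique players, not by purchases
by_gender["avg_total_per_person"] = by_gender["total_value"] / by_gender["players"]
by_gender = by_gender.round(2)
print(by_gender)
```

Counting unique `SN` values inside the aggregation removes the need for a separate de-duplicated frame.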


### Age Demographics
 
* Establish bins for ages
* Categorize the existing players using the age bins. Hint: use pd.cut()
* Calculate the numbers and percentages by age group
* Create a summary data frame to hold the results
* Optional: round the percentage column to two decimal points
* Display Age Demographics Table

In [153]:
# Create buckets for demographic data
bins = [purchase_data_nodups['Age'].min(),10,15,20,25,30,35,40,purchase_data_nodups['Age'].max()+1]

# Make age demographic data frame
age_df = pd.DataFrame()

# Count the number of players from different age ranges
age_df['Total Count'] = pd.cut(purchase_data_nodups['Age'],bins,right=False,include_lowest=True).value_counts()
age_df_sorted = age_df.sort_index()

# Calculate percentage of each age bracket 
age_df_sorted['Percentage of Players'] = round((age_df_sorted['Total Count'] / number_of_players) *100,2)
age_df_sorted.index.name = 'Age Group'
age_df_sorted

Age Group,Total Count,Percentage of Players
"[7, 10)",17,2.95
"[10, 15)",22,3.82
"[15, 20)",107,18.58
"[20, 25)",258,44.79
"[25, 30)",77,13.37
"[30, 35)",52,9.03
"[35, 40)",31,5.38
"[40, 46)",12,2.08

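Interval labels such as `[20, 25)` can be replaced with labels that match the ranges quoted in the summary (for example 20-24). The sketch below assumes made-up ages and hypothetical bin edges and label strings; only `pd.cut()` with `right=False` is taken from the cell above.

```python
import pandas as pd

# Hypothetical ages (made up); the notebook uses purchase_data_nodups['Age']
ages = pd.Series([7, 9, 10, 14, 15, 19, 20, 24, 40, 45])

# Left-closed bins: [0, 10), [10, 15), ..., [40, 200)
bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# right=False makes each bin include its lower edge and exclude its upper edge
groups = pd.cut(ages, bins=bins, labels=labels, right=False)

# sort_index() orders the counts by bin rather than by frequency
counts = groups.value_counts().sort_index()
print(counts)
```

Using fixed outer edges (0 and 200 here, both assumptions) avoids recomputing the bins from the minimum and maximum age of each frame.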

### Purchasing Analysis (Age)
 
* Bin the purchase_data data frame by age
* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below
* Create a summary data frame to hold the results
* Optional: give the displayed data cleaner formatting
* Display the summary data frame

In [238]:
# Creating age groups for purchase data
bins = [purchase_data['Age'].min(),10,15,20,25,30,35,40,purchase_data['Age'].max()+1]

# Make dataframe
age_purchase_df = pd.DataFrame()

# Count number of purchases made by each age group, and sort by age group 
age_purchase_df['Purchase Count'] = pd.cut(purchase_data['Age'],bins,right=False,include_lowest=True).value_counts()
age_purchase_df = age_purchase_df.sort_index()

# Make new column with corresponding age bins, then group by bins
# (observed=False keeps every bin, so the sums line up with the sorted index above)
purchase_data['age_group'] = pd.cut(purchase_data['Age'], bins, right=False, include_lowest=True)
grouped_purchase_data = purchase_data.groupby('age_group', observed=False)

# Calculate Total Purchases made by each age bracket
age_purchase_df['Total Purchases'] = grouped_purchase_data['Price'].sum().values

# Calculate Average Purchase made by each age bracket
age_purchase_df['Average Purchase Price']= round(age_purchase_df['Total Purchases']/age_purchase_df['Purchase Count'],2)
age_purchase_df

Unnamed: 0,Purchase Count,Total Purchases,Average Purchase Price
"[7, 10)",23,77.13,3.35
"[10, 15)",28,82.78,2.96
"[15, 20)",136,412.89,3.04
"[20, 25)",365,1114.06,3.05
"[25, 30)",101,293.0,2.9
"[30, 35)",73,214.0,2.93
"[35, 40)",41,147.67,3.6
"[40, 46)",13,38.24,2.94

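The table above omits the average total purchase per person mentioned in the task list. One possible way to add it is sketched below, assuming made-up data, hypothetical bin edges and labels, and illustrative column names; it is not the notebook's own calculation.

```python
import pandas as pd

# Hypothetical toy data standing in for purchase_data (values are made up)
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Age": [12, 12, 22, 23],
    "Price": [1.00, 2.00, 3.00, 5.00],
})

# Assign each purchase to a left-closed age bin
toy["age_group"] = pd.cut(
    toy["Age"], bins=[10, 15, 20, 25],
    labels=["10-14", "15-19", "20-24"], right=False,
)

# observed=True drops bins with no purchases (here 15-19)
by_age = toy.groupby("age_group", observed=True).agg(
    purchase_count=("Price", "count"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Total spend per unique player in each age group
by_age["avg_total_per_person"] = (by_age["total_value"] / by_age["players"]).round(2)
print(by_age)
```

Dropping empty bins avoids a division by zero players; keeping them would require handling the resulting `NaN` values.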

### Top Spenders
 
* Run basic calculations to obtain the results in the table below
* Create a summary data frame to hold the results
* Sort the total purchase value column in descending order
* Optional: give the displayed data cleaner formatting
* Display a preview of the summary data frame

In [313]:
top_spenders = purchase_data.groupby(['SN']).Price.sum().to_frame()
top_spenders = top_spenders.sort_values(by='Price',ascending=False).head()
top_spenders

SN,Price
Lisosia93,18.96
Idastidru52,15.45
Chamjask73,13.83
Iral74,13.62
Iskadarya95,13.1
