### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [2]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# Raw data file
file_to_load = "Resources/purchase_data.csv"

# Read purchasing file and store into pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head()

,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [3]:
# Count distinct screen names so repeat buyers are counted once
player_count = purchase_data['SN'].nunique()
display_player_count = pd.DataFrame({'Number of players': [player_count]})
display_player_count

,Number of players
0,576
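
The player count can be checked on a tiny data set. The sketch below uses made-up rows (the screen names and prices are illustrative, not taken from `purchase_data.csv`) and assumes only that each row is one purchase and that `SN` identifies the player.

```python
import pandas as pd

# Made-up purchase rows; "SN" is the player's screen name, as in the notebook.
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Bob2"],
    "Price": [1.50, 2.00, 3.25, 4.00, 1.00],
})

# nunique() counts distinct screen names, so repeat buyers are counted once.
player_count = purchases["SN"].nunique()

# Wrap the scalar in a one-row summary frame for display.
summary = pd.DataFrame({"Number of players": [player_count]})
print(summary)
```

Five purchase rows collapse to three players here, which is why the player count is lower than the purchase count.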


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [4]:
# Basic calculations over all purchases
unique_count = purchase_data['Item ID'].nunique()
average_price = purchase_data['Price'].mean()
number_of_purchases = purchase_data['Purchase ID'].count()
total_revenue = purchase_data['Price'].sum()

# One-row summary data frame
purchase_analysis = pd.DataFrame([{
    'Number of Unique items': unique_count,
    'Average Price': round(average_price, 2),
    'Number of Purchases': number_of_purchases,
    'Total Revenue': total_revenue,
}])
purchase_analysis

,Average Price,Number of Purchases,Number of Unique items,Total Revenue
0,3.05,780,183,2379.77
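
For the optional cleaner formatting, one possible approach is to format the money columns as strings on a copy of the summary, so the numeric frame stays available for further calculations. The values below are made up for illustration.

```python
import pandas as pd

# Made-up summary values, standing in for the real summary frame.
summary = pd.DataFrame([{
    "Number of Unique Items": 3,
    "Average Price": 2.3456,
    "Number of Purchases": 5,
    "Total Revenue": 1234.5,
}])

# Format on a copy: the formatted columns become strings, not numbers.
formatted = summary.copy()
for column in ["Average Price", "Total Revenue"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted)
```

Formatting is best left as the last step, since string columns can no longer be summed or averaged.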


## Gender Demographics

* Run basic calculations to obtain the count and percentage of players by gender


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
gender_demo = pd.DataFrame()

# value_counts() counts purchase rows (780 in total), so a player who bought
# several items is counted once per purchase
gender_demo['Total count'] = purchase_data['Gender'].value_counts()
total_players = gender_demo['Total count'].sum()
gender_demo['percentage of players'] = round((gender_demo["Total count"]/total_players) * 100, 2)
gender_demo

,Total count,percentage of players
Male,652,83.59
Female,113,14.49
Other / Non-Disclosed,15,1.92
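
The totals above add up to 780, the number of purchases, rather than 576, the number of players. If a per-player breakdown is wanted, one option is to drop repeat rows for the same screen name before counting. This is a sketch on made-up rows, and it assumes each player reports a single gender.

```python
import pandas as pd

# Made-up purchase rows: Bob2 buys three times.
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Bob2", "Cy3", "Bob2", "Dee4"],
    "Gender": ["Female", "Male", "Male", "Male", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not over-counted.
players = purchases.drop_duplicates(subset="SN")

counts = players["Gender"].value_counts()
gender_demo = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})
print(gender_demo)
```

Counting rows directly would report four male entries out of six; counting players reports two out of four.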



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, etc. by gender


* For normalized purchasing, divide total purchase value by purchase count, by gender


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [6]:
gender_purchase = pd.DataFrame()
gender_purchase['purchase count'] = gender_demo['Total count']

# Split the purchases by gender
male_data = purchase_data.loc[purchase_data['Gender'].isin(['Male'])]
female_data = purchase_data.loc[purchase_data['Gender'].isin(['Female'])]
other_data = purchase_data.loc[purchase_data['Gender'].isin(['Other / Non-Disclosed'])]

# Average price per purchase
average_male_spending = round(male_data['Price'].mean(), 2)
average_female_spending = round(female_data['Price'].mean(), 2)
average_other_spending = round(other_data['Price'].mean(), 2)

# Total purchase value
total_male = round(male_data['Price'].sum(), 2)
total_female = round(female_data['Price'].sum(), 2)
total_other = round(other_data['Price'].sum(), 2)

# Normalized totals: total value divided by purchase count,
# which is the same quantity as the average price above
normal_male = round(male_data['Price'].sum()/ male_data['Purchase ID'].count(), 2)
normal_female = round(female_data['Price'].sum()/ female_data['Purchase ID'].count(), 2)
normal_other = round(other_data['Price'].sum()/ other_data['Purchase ID'].count(), 2)

# The list order must match the index order of gender_demo
# (Male, Female, Other / Non-Disclosed)
gender_purchase['Normalized Totals'] = [normal_male, normal_female, normal_other]
gender_purchase['Total Purchase value'] = [total_male, total_female, total_other]
gender_purchase['Average purchase price'] = [average_male_spending, average_female_spending, average_other_spending]
gender_purchase.head()

,purchase count,Normalized Totals,Total Purchase value,Average purchase price
Male,652,3.02,1967.64,3.02
Female,113,3.2,361.94,3.2
Other / Non-Disclosed,15,3.35,50.19,3.35
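
The same per-gender figures can be computed in one pass with `groupby`, which avoids relying on the order of hand-built lists. The sketch below uses made-up rows. The "Avg Total per Person" column is an assumption about what a per-player normalization could look like (total value divided by distinct players); it is not what the cell above computes.

```python
import pandas as pd

# Made-up purchase rows.
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Bob2", "Cy3", "Ann1"],
    "Gender": ["Female", "Male", "Male", "Male", "Female"],
    "Price": [2.00, 1.00, 3.00, 5.00, 4.00],
})

grouped = purchases.groupby("Gender")

# Every column is indexed by gender, so values align by label, not by position.
by_gender = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Average Purchase Price": grouped["Price"].mean().round(2),
    "Total Purchase Value": grouped["Price"].sum(),
    # Hypothetical per-player normalization: total value / distinct players.
    "Avg Total per Person": (grouped["Price"].sum() / grouped["SN"].nunique()).round(2),
})
print(by_gender)
```

Dividing by purchase count reproduces the average price, while dividing by distinct players gives a different number whenever someone buys more than once.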


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [7]:
# Establish bins for ages
age_bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Label every purchase row with its age group
purchase_data['age range'] = pd.cut(purchase_data.Age, age_bins, labels=group_names)

# Attach the per-group row count to each purchase row
age_range = purchase_data.set_index('age range')
age_range['Total count'] = purchase_data['age range'].value_counts()

# unique() keeps first-appearance order, so counts and labels line up
# only as long as no two age groups share the same count
df1 = age_range['Total count'].unique()
df2 = purchase_data['age range'].unique()
construct = pd.DataFrame()
construct['Age Range'] = df2
construct['Total Count'] = df1
almost_there = construct.set_index('Age Range')
# total_players comes from the gender demographics cell
almost_there['percentage of players'] = round((almost_there["Total Count"]/total_players) * 100, 2)
almost_there

Age Range,Total Count,percentage of players
20-24,365,46.79
40+,13,1.67
35-39,41,5.26
30-34,73,9.36
25-29,101,12.95
10-14,28,3.59
<10,23,2.95
15-19,136,17.44
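
A more direct way to build this table is to count the binned column itself and sort by the bin order. The sketch below uses made-up ages. It swaps the fractional edges for integer edges with `right=False`, which gives the same groups for whole-number ages (apart from age 0, which this version places in `<10`).

```python
import pandas as pd

# Made-up ages, one per player.
ages = pd.Series([7, 10, 14, 15, 22, 24, 25, 40, 45], name="Age")

# Left-closed bins: [0, 10), [10, 15), ..., [40, 200).
age_bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
age_group = pd.cut(ages, bins=age_bins, labels=group_names, right=False)

# value_counts() keeps every category; sort_index() restores the bin order.
counts = age_group.value_counts().sort_index()

age_demo = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})
print(age_demo)
```

Because labels and counts come from the same Series, two groups with equal counts cannot be merged or misaligned, and the rows appear in age order.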


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, etc. in the table below


* Calculate Normalized Purchasing


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [8]:
# The groupby results are indexed by age range, so they align with almost_there by label
almost_there['Total Purchase Value'] = purchase_data.groupby('age range').Price.sum()
almost_there['Average Purchase Price'] = round(purchase_data.groupby('age range').Price.mean(), 2)
# Total value divided by purchase count, which is the same quantity as the average price
almost_there['Normalized Totals'] = round(almost_there['Total Purchase Value']/ almost_there['Total Count'], 2)
age_purchases = almost_there[['Total Count', 'Total Purchase Value', 'Average Purchase Price', 'Normalized Totals']]
age_purchases

Age Range,Total Count,Total Purchase Value,Average Purchase Price,Normalized Totals
20-24,365,1114.06,3.05,3.05
40+,13,38.24,2.94,2.94
35-39,41,147.67,3.6,3.6
30-34,73,214.0,2.93,2.93
25-29,101,293.0,2.9,2.9
10-14,28,82.78,2.96,2.96
<10,23,77.13,3.35,3.35
15-19,136,412.89,3.04,3.04
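
The binning and the per-group calculations can also be combined in a single `groupby(...).agg(...)` call. This is a sketch on made-up rows; the integer bin edges with `right=False` and the `Age Group` column name are assumptions chosen for the example.

```python
import pandas as pd

# Made-up purchase rows.
purchases = pd.DataFrame({
    "Age": [8, 21, 23, 23, 37, 41],
    "Price": [1.00, 2.00, 4.00, 3.00, 5.00, 2.50],
})

age_bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
purchases["Age Group"] = pd.cut(
    purchases["Age"], bins=age_bins, labels=group_names, right=False
)

# observed=True keeps only the age groups that actually occur in the data.
by_age = (
    purchases.groupby("Age Group", observed=True)["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
)
by_age["Average Purchase Price"] = by_age["Average Purchase Price"].round(2)
print(by_age)
```

The result is ordered by age group rather than by first appearance in the data.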


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [9]:
sn_data = pd.DataFrame()
sn_data['Total Purchase Value'] = purchase_data.groupby('SN').Price.sum()
sn_data['Average Purchase Value'] = round(purchase_data.groupby('SN').Price.mean(), 2)
# count() tallies every purchase; nunique() would merge purchases that share a price
sn_data['Purchase Count'] = purchase_data.groupby('SN').Price.count()
sn_data.sort_values('Total Purchase Value', ascending=False).head()


SN,Total Purchase Value,Average Purchase Value,Purchase Count
Lisosia93,18.96,3.79,5
Idastidru52,15.45,3.86,4
Chamjask73,13.83,4.61,3
Iral74,13.62,3.4,4
Iskadarya95,13.1,4.37,3
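
When counting purchases per player, `count()` and `nunique()` answer different questions: the first tallies purchase rows, the second tallies distinct prices. The made-up rows below show the two diverging when a player pays the same price more than once.

```python
import pandas as pd

# Made-up purchase rows: Ann1 and Cy3 each pay the same price twice.
purchases = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cy3", "Cy3", "Cy3"],
    "Price": [2.50, 2.50, 4.00, 1.00, 1.00, 3.50],
})

grouped = purchases.groupby("SN")["Price"]
spenders = pd.DataFrame({
    "Purchase Count": grouped.count(),     # number of purchase rows
    "Distinct Prices": grouped.nunique(),  # number of different prices paid
    "Total Purchase Value": grouped.sum(),
})

# Highest spenders first.
top = spenders.sort_values("Total Purchase Value", ascending=False)
print(top.head())
```

For Cy3 the purchase count is 3 while the number of distinct prices is only 2, so using the latter as a purchase count would under-report.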


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [60]:
popular_items = pd.DataFrame()
popular_items['Total Purchase Value'] = purchase_data.groupby(['Item ID', 'Item Name']).Price.sum()
popular_items['Purchase Count'] = purchase_data.groupby(['Item ID', 'Item Name']).Price.count()
# Item price recovered as total value divided by purchase count
popular_items['Price'] = popular_items['Total Purchase Value']/popular_items['Purchase Count']
# Note: this sorts by total purchase value rather than purchase count,
# so the preview doubles as the Most Profitable Items table
popular_items.sort_values('Total Purchase Value', ascending=False).head()

Item ID,Item Name,Total Purchase Value,Purchase Count,Price
178,"Oathbreaker, Last Hope of the Breaking Storm",50.76,12,4.23
82,Nirvana,44.1,9,4.9
145,Fiery Glass Crusader,41.22,9,4.58
92,Final Critic,39.04,8,4.88
103,Singed Scalpel,34.8,8,4.35
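
Popularity and profitability are two different sort keys over the same per-item summary. The sketch below uses made-up items to show that the rankings can disagree: a cheap item bought often leads on purchase count, while a pricier item leads on total value.

```python
import pandas as pd

# Made-up purchase rows.
purchases = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Dagger", "Dagger", "Dagger", "Axe", "Axe", "Staff"],
    "Price": [1.00, 1.00, 1.00, 4.00, 4.00, 2.50],
})

grouped = purchases.groupby(["Item ID", "Item Name"])["Price"]
items = pd.DataFrame({
    "Purchase Count": grouped.count(),
    "Item Price": grouped.mean(),
    "Total Purchase Value": grouped.sum(),
})

# Same table, two orderings.
most_popular = items.sort_values("Purchase Count", ascending=False)
most_profitable = items.sort_values("Total Purchase Value", ascending=False)

print(most_popular.head())
print(most_profitable.head())
```

Here the Dagger tops the popularity list with three purchases, while the Axe tops the profitability list with a total of 8.00.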


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [61]:
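print('See the table above: it is already sorted by total purchase value in descending order.')

See the table above: it is already sorted by total purchase value in descending order.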
