### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is the 21 to 25 group (200 of 576 players, 34.72%), with secondary groups at 16 to 20 (169 players, 29.34%) and 11 to 15 (59 players, 10.24%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [2]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players
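
As a minimal sketch of this step, the snippet below counts players on a small made-up frame (the screen names and ages are hypothetical, not taken from `purchase_data.csv`). It illustrates why the count is taken over unique `SN` values rather than rows: a player who buys twice appears in two rows.

```python
import pandas as pd

# Hypothetical toy data: "Ana1" makes two purchases, so she appears twice.
toy = pd.DataFrame({
    "SN": ["Ana1", "Bo22", "Ana1", "Cy93"],
    "Age": [20, 31, 20, 17],
})

row_count = len(toy)                 # counts purchases, not players
player_count = toy["SN"].nunique()   # counts distinct screen names

# One row per player, useful later for demographic breakdowns.
unique_players = toy.drop_duplicates(subset=["SN"])

print(row_count, player_count, len(unique_players))
```

Keeping a de-duplicated frame alongside the full purchase log lets demographic tables count each player once while purchase tables still see every transaction.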


In [3]:
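# Count distinct screen names; a player with several purchases appears in several rows
total_players=purchase_data['SN'].nunique()
# One row per player (explicit copy), used for the demographic tables below
players=purchase_data.drop_duplicates(subset=['SN']).copy()
purchase_data.head()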

| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 |


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
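
The optional formatting step can be sketched as follows. The values and column names here are hypothetical and only illustrate one possible approach: format a copy of the frame for display so the numeric original stays available for further calculations.

```python
import pandas as pd

# Hypothetical summary values, chosen only to illustrate the formatting.
summary = pd.DataFrame({
    "Average Price": [3.050987],
    "Total Revenue": [2379.77],
})

# Keep the numeric frame for further math; format a copy for display.
display_df = summary.copy()
for col in ["Average Price", "Total Revenue"]:
    display_df[col] = display_df[col].map("${:,.2f}".format)

print(display_df)
```

Formatting converts the columns to strings, so any sorting or arithmetic should happen before this step.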


In [4]:
unique_items=purchase_data['Item ID'].nunique()
average_price=purchase_data['Price'].mean()
total_purchases=purchase_data['Purchase ID'].count()
total_revenue=purchase_data['Price'].sum()
summary_df=pd.DataFrame({'Total Players':[total_players],'Unique Items':[unique_items],
                         'Average Price $':[round(average_price,2)],'Total Purchases':[total_purchases],
                         'Total Revenue $':[total_revenue]})
summary_df

| | Total Players | Unique Items | Average Price $ | Total Purchases | Total Revenue $ |
|---|---|---|---|---|---|
| 0 | 576 | 183 | 3.05 | 780 | 2379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed
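
One way to get counts and percentages together is `value_counts()` on a one-row-per-player frame. This is a sketch on made-up data; the frame and the output column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical frame with one row per player.
toy_players = pd.DataFrame({
    "SN": ["a", "b", "c", "d"],
    "Gender": ["Male", "Male", "Female", "Male"],
})

counts = toy_players["Gender"].value_counts()
percentages = (counts / counts.sum() * 100).round(2)

demo = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percentages,
})
print(demo)
```

The percentages are computed against the number of unique players, so the frame must be de-duplicated by `SN` first.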




In [5]:
gender_df=players[['SN','Gender']].groupby('Gender').count()
gender_df['Percentages']=round(gender_df['SN']/total_players*100,2)
gender_df

| Gender | SN | Percentages |
|---|---|---|
| Female | 81 | 14.06 |
| Male | 484 | 84.03 |
| Other / Non-Disclosed | 11 | 1.91 |



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
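
The per-gender calculations can also be expressed as a single grouped aggregation. The sketch below uses hypothetical data and snake_case column names chosen for illustration. It shows the key point of the per-person average: revenue is divided by distinct players, not by purchases.

```python
import pandas as pd

# Hypothetical purchases: player "a" buys twice.
toy = pd.DataFrame({
    "SN": ["a", "a", "b", "c"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [1.00, 3.00, 2.50, 4.00],
})

by_gender = toy.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    average_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Per-person average divides revenue by distinct players, not by purchases.
by_gender["average_per_person"] = by_gender["total_value"] / by_gender["players"]
print(by_gender)
```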

In [6]:
gender_df2=purchase_data.groupby('Gender')
purchase_count_gender=gender_df2['Purchase ID'].count()
average_price_gender=gender_df2['Price'].mean()
total_purchase_gender=gender_df2['Price'].sum()
average_per_gender=total_purchase_gender/gender_df['SN']
gender_df['Purchase Count']=purchase_count_gender
gender_df['Average Price per Purchase']=round(average_price_gender,2)
gender_df['Total Revenue']=total_purchase_gender
gender_df['Average Revenue Per Person']=round(average_per_gender,2)
gender_df

| Gender | SN | Percentages | Purchase Count | Average Price per Purchase | Total Revenue | Average Revenue Per Person |
|---|---|---|---|---|---|---|
| Female | 81 | 14.06 | 113 | 3.2 | 361.94 | 4.47 |
| Male | 484 | 84.03 | 652 | 3.02 | 1967.64 | 4.07 |
| Other / Non-Disclosed | 11 | 1.91 | 15 | 3.35 | 50.19 | 4.56 |


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
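
When choosing bin edges and labels, it helps to check which side of each interval `pd.cut()` treats as closed. The sketch below uses hypothetical ages that sit exactly on the bin edges. By default the intervals are right-closed, so an age of 11 lands in the first bin. Passing `right=False` makes them left-closed, which matches labels of the form "6 to 10".

```python
import pandas as pd

ages = pd.Series([10, 11, 15, 16])   # hypothetical ages on the bin edges
edges = [6, 11, 16, 21]
labels = ["6 to 10", "11 to 15", "16 to 20"]

# Default (right=True): bins are (6, 11], (11, 16], (16, 21]
default_groups = pd.cut(ages, edges, labels=labels)

# right=False: bins are [6, 11), [11, 16), [16, 21), matching the labels
left_closed_groups = pd.cut(ages, edges, labels=labels, right=False)

print(pd.DataFrame({
    "age": ages,
    "default": default_groups,
    "right=False": left_closed_groups,
}))
```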


In [7]:
# pd.cut uses right-closed intervals by default, so these edges define (6, 11], (11, 16], ...
bins=[6,11,16,21,26,31,36,41,46]
groups=['6 to 10','11 to 15','16 to 20','21 to 25','26 to 30','31 to 35','36 to 40','41 to 45']
# assign() returns a new frame, which avoids a SettingWithCopyWarning on the de-duplicated frame
players=players.assign(**{'Age Groups':pd.cut(players['Age'],bins,labels=groups)})
age_df=players[['SN','Age Groups']].groupby('Age Groups').count()
# Share of all players in each age group, as a percentage
age_df['Percentages']=round(age_df['SN']/total_players*100,2)
age_df

| Age Groups | SN | Percentages |
|---|---|---|
| 6 to 10 | 30 | 5.21 |
| 11 to 15 | 59 | 10.24 |
| 16 to 20 | 169 | 29.34 |
| 21 to 25 | 200 | 34.72 |
| 26 to 30 | 53 | 9.2 |
| 31 to 35 | 37 | 6.42 |
| 36 to 40 | 23 | 3.99 |
| 41 to 45 | 5 | 0.87 |


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [8]:
bins=[6,11,16,21,26,31,36,41,46]
groups=['6 to 10','11 to 15','16 to 20','21 to 25','26 to 30','31 to 35','36 to 40','41 to 45']
purchase_data['Age Groups']=pd.cut(purchase_data['Age'],bins,labels=groups)
age_df2=purchase_data.groupby('Age Groups')
purchase_count_age=age_df2['Purchase ID'].count()
average_price_age=age_df2['Price'].mean()
total_purchase_age=age_df2['Price'].sum()
average_per_age=total_purchase_age/age_df['SN']
age_df['Purchase Count']=purchase_count_age
age_df['Average Price per Purchase']=round(average_price_age,2)
age_df['Total Revenue']=total_purchase_age
age_df['Average Revenue Per Person']=round(average_per_age,2)
age_df

| Age Groups | SN | Percentages | Purchase Count | Average Price per Purchase | Total Revenue | Average Revenue Per Person |
|---|---|---|---|---|---|---|
| 6 to 10 | 30 | 5.21 | 39 | 3.28 | 127.75 | 4.26 |
| 11 to 15 | 59 | 10.24 | 77 | 2.97 | 228.37 | 3.87 |
| 16 to 20 | 169 | 29.34 | 232 | 3.07 | 711.74 | 4.21 |
| 21 to 25 | 200 | 34.72 | 277 | 3.04 | 841.09 | 4.21 |
| 26 to 30 | 53 | 9.2 | 70 | 2.93 | 205.28 | 3.87 |
| 31 to 35 | 37 | 6.42 | 50 | 2.89 | 144.32 | 3.9 |
| 36 to 40 | 23 | 3.99 | 30 | 3.54 | 106.23 | 4.62 |
| 41 to 45 | 5 | 0.87 | 5 | 3.0 | 14.99 | 3.0 |


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
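
The steps above can be sketched as one chained expression. The data is hypothetical and the output column names are assumptions for illustration. The chain groups by `SN`, aggregates `Price` three ways, then sorts by total spend.

```python
import pandas as pd

# Hypothetical purchases by three players.
toy = pd.DataFrame({
    "SN": ["a", "b", "a", "c", "b", "a"],
    "Price": [1.00, 4.00, 2.00, 3.50, 4.50, 1.50],
})

spenders = (
    toy.groupby("SN")["Price"]
       .agg(["count", "mean", "sum"])
       .rename(columns={"count": "Purchase Count",
                        "mean": "Average Purchase Price",
                        "sum": "Total Purchase Value"})
       .sort_values("Total Purchase Value", ascending=False)
)
print(spenders.head())
```

Sorting happens while the columns are still numeric. Currency formatting, if wanted, should be applied afterwards.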



In [9]:
players_group=purchase_data.groupby('SN')
purchase_total_player=players_group['Price'].sum()
purchase_per_player=players_group['Item ID'].count()
top_spender=pd.DataFrame({'Purchase Count':purchase_per_player,
                          'Purchase Total':purchase_total_player})
top_spender.sort_values(by='Purchase Total',ascending=False).head()

| SN | Purchase Count | Purchase Total |
|---|---|---|
| Lisosia93 | 5 | 18.96 |
| Idastidru52 | 4 | 15.45 |
| Chamjask73 | 3 | 13.83 |
| Iral74 | 4 | 13.62 |
| Iskadarya95 | 3 | 13.1 |


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
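
Grouping by both `Item ID` and `Item Name` keeps each name attached to its own ID. Building the summary from a raw `Item Name` column does not, because that column is indexed by purchase row and pandas would align it against Item IDs by index value. A minimal sketch on made-up items (the IDs, names and prices are hypothetical):

```python
import pandas as pd

# Hypothetical purchases; IDs, names and prices are made up.
toy = pd.DataFrame({
    "Item ID": [7, 3, 7, 7, 3, 9],
    "Item Name": ["Axe", "Bow", "Axe", "Axe", "Bow", "Cap"],
    "Price": [2.00, 5.00, 2.00, 2.00, 5.00, 1.25],
})

items = (
    toy.groupby(["Item ID", "Item Name"])["Price"]
       .agg(["count", "first", "sum"])
       .rename(columns={"count": "Purchase Count",
                        "first": "Item Price",
                        "sum": "Total Purchase Value"})
)

most_popular = items.sort_values("Purchase Count", ascending=False)
most_profitable = items.sort_values("Total Purchase Value", ascending=False)

print(most_popular.head())
print(most_profitable.head())
```

The same grouped frame serves both rankings, so only the sort column changes.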



In [10]:
items_group=purchase_data.groupby('Item ID')
item_total=items_group['Price'].sum()
item_count=items_group['Item ID'].count()
# Both series are indexed by Item ID, so they align row for row.
# purchase_data['Item Name'] is indexed by purchase row, not Item ID, so it is left out
# here: pandas would align it by index value and attach the wrong names.
top_item=pd.DataFrame({'Purchase Count':item_count,
                       'Purchase Total':item_total})
top_item.sort_values(by='Purchase Count',ascending=False).head()

| Item ID | Purchase Count | Purchase Total |
|---|---|---|
| 178 | 12 | 50.76 |
| 145 | 9 | 41.22 |
| 108 | 9 | 31.77 |
| 82 | 9 | 44.1 |
| 19 | 8 | 8.16 |


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [11]:
top_item.sort_values(by='Purchase Total',ascending=False).head()

| Item ID | Purchase Count | Purchase Total |
|---|---|---|
| 178 | 12 | 50.76 |
| 82 | 9 | 44.1 |
| 145 | 9 | 41.22 |
| 92 | 8 | 39.04 |
| 103 | 8 | 34.8 |
