### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* The peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data_df = pd.read_csv(file_to_load)

In [2]:
# What are the columns in the data file?
purchase_data_df.columns

Index(['Purchase ID', 'SN', 'Age', 'Gender', 'Item ID', 'Item Name', 'Price'], dtype='object')

## Player Count

* Display the total number of players


In [3]:
playerCountTotal = purchase_data_df['SN'].nunique()
playerCountTotal

576

In [4]:
# Display the unique-player count (from nunique, so duplicates are excluded) in a DataFrame
playerCountTotal_df = pd.DataFrame({'Total Players' : [playerCountTotal]})
playerCountTotal_df

Unnamed: 0,Total Players
0,576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
#Unique Items
itemsTotal = purchase_data_df['Item ID'].nunique()
itemsTotal

179

In [6]:
#Average Price
itemsAvgPrice = purchase_data_df['Price'].mean()
itemsAvgPrice

3.0509871794871795

In [7]:
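#Number Purchases (same expression the summary table below uses)
purchaseCount = purchase_data_df['Purchase ID'].nunique()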


In [8]:
purchase_data_df['Mean'] = purchase_data_df['Price'].mean()
purchase_data_df.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price,Mean
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,3.050987
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56,3.050987
2,2,Ithergue48,24,Male,92,Final Critic,4.88,3.050987
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27,3.050987
4,4,Iskosia90,23,Male,131,Fury,1.44,3.050987


In [9]:
#Summary Table
# Currency formatting is not applied yet: an earlier attempt on the 'Mean' column,
# .astype(float).map("${:,.0f}".format), did not work, so values are left numeric.
purchase_summary_df = pd.DataFrame({'Total Players' : [playerCountTotal],
                                    'Average Price' : [purchase_data_df['Price'].mean()],
                                    'Number Purchases' : [purchase_data_df['Purchase ID'].nunique()],
                                    'Total Revenue' : [purchase_data_df['Price'].sum()]})
purchase_summary_df

Unnamed: 0,Total Players,Average Price,Number Purchases,Total Revenue
0,576,3.050987,780,2379.77
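
The currency formatting that the summary cell left unfinished can be sketched as follows. This is one possible approach, not necessarily the intended one: it maps a format string over each money column of a copy, so the numeric originals stay usable. The values are copied from the table above so the snippet runs without the CSV; `formatted_summary_df` is a name introduced here for illustration.

```python
import pandas as pd

# Values copied from the summary table above, so this runs without the CSV.
purchase_summary_df = pd.DataFrame({
    "Total Players": [576],
    "Average Price": [3.050987],
    "Number Purchases": [780],
    "Total Revenue": [2379.77],
})

# Format a copy: after formatting, the columns hold strings, not numbers.
formatted_summary_df = purchase_summary_df.copy()
for column in ["Average Price", "Total Revenue"]:
    formatted_summary_df[column] = formatted_summary_df[column].map("${:,.2f}".format)

print(formatted_summary_df)
```

Because the formatted columns are strings, any further arithmetic should use the unformatted `purchase_summary_df`.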


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [10]:
# Check for nulls: count() returns the number of non-null values in each column
purchase_data_df.count()

Purchase ID    780
SN             780
Age            780
Gender         780
Item ID        780
Item Name      780
Price          780
Mean           780
dtype: int64

In [11]:
# Drop rows in which every value is missing (none here: all columns have 780 values)
cleaned_purchase_data_df = purchase_data_df.dropna(how='all')
cleaned_purchase_data_df.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price,Mean
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,3.050987
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56,3.050987
2,2,Ithergue48,24,Male,92,Final Critic,4.88,3.050987
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27,3.050987
4,4,Iskosia90,23,Male,131,Fury,1.44,3.050987


In [12]:
# Keep only the SN and Gender columns; duplicates are dropped in the next cell so each player is counted once
gender_purchase_data_df = cleaned_purchase_data_df.loc[:, ['SN','Gender']]
gender_purchase_data_df.head()

Unnamed: 0,SN,Gender
0,Lisim78,Male
1,Lisovynya38,Male
2,Ithergue48,Male
3,Chamassasya86,Male
4,Iskosia90,Male


In [25]:
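# Assign the result instead of using inplace=True on a slice, which can trigger SettingWithCopyWarning
gender_purchase_data_df = gender_purchase_data_df.drop_duplicates()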
genderSummary_df = pd.DataFrame(gender_purchase_data_df['Gender'].value_counts())
genderSummary_df

Unnamed: 0,Gender
Male,484
Female,81
Other / Non-Disclosed,11


In [26]:
# rename() returns a new DataFrame; the result is not assigned here, so the column is still named 'Gender' below
genderSummary_df.rename(columns={'Gender': 'Total Count'})
genderSummary_df

Unnamed: 0,Gender
Male,484
Female,81
Other / Non-Disclosed,11


In [27]:
# Share of all players in each gender group, as a percentage
genderSummary_df['Percentage'] = genderSummary_df/genderSummary_df.sum()*100
genderSummary_df




Unnamed: 0,Gender,Percentage
Male,484,84.027778
Female,81,14.0625
Other / Non-Disclosed,11,1.909722
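
The rename step above has no effect because `rename()` returns a new DataFrame rather than modifying the original. One way to avoid the rename altogether is to name the columns when the summary is built. The sketch below assumes a small hypothetical sample (the repeat purchase by `Lisim78` is invented for illustration), and `gender_summary` is an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data_df: one row per purchase,
# so a player who bought twice appears twice.
purchases = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Lisassa64", "Chanosian48", "Iskosia90"],
    "Gender": ["Male", "Male", "Female", "Other / Non-Disclosed", "Male"],
})

# Keep one row per player before counting.
players = purchases[["SN", "Gender"]].drop_duplicates()

# Name the columns at construction time instead of renaming afterwards.
counts = players["Gender"].value_counts()
gender_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage": (counts / counts.sum() * 100).round(2),
})

print(gender_summary)
```

If a rename is still preferred, assigning the result back (`df = df.rename(columns=...)`) makes it take effect.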



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [31]:
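# Group purchases by gender; head() on a GroupBy returns the first five rows of each group, not a summary
genderPurchaseSummary = purchase_data_df.groupby('Gender')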
genderPurchaseSummary.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price,Mean
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,3.050987
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56,3.050987
2,2,Ithergue48,24,Male,92,Final Critic,4.88,3.050987
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27,3.050987
4,4,Iskosia90,23,Male,131,Fury,1.44,3.050987
9,9,Chanosian48,35,Other / Non-Disclosed,136,Ghastly Adamantite Protector,3.58,3.050987
15,15,Lisassa64,21,Female,98,"Deadline, Voice Of Subtlety",2.89,3.050987
18,18,Reunasu60,22,Female,82,Nirvana,4.9,3.050987
22,22,Siarithria38,38,Other / Non-Disclosed,24,Warped Fetish,3.81,3.050987
38,38,Reulae52,10,Female,116,Renewed Skeletal Katana,4.18,3.050987
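
The cell above only groups the rows; the calculations listed for this section are not yet performed. A minimal sketch of how they might be done is shown below, using a hypothetical sample with invented player names and prices. The column labels are assumptions based on the bullet list, not a confirmed target layout.

```python
import pandas as pd

# Hypothetical sample with the same column names as purchase_data_df.
purchases = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4],
    "SN": ["PlayerA", "PlayerA", "PlayerB", "PlayerC", "PlayerD"],
    "Gender": ["Male", "Male", "Male", "Female", "Female"],
    "Price": [1.00, 3.00, 2.00, 4.00, 2.00],
})

by_gender = purchases.groupby("Gender")

gender_purchase_summary = pd.DataFrame({
    "Purchase Count": by_gender["Purchase ID"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
    "Unique Players": by_gender["SN"].nunique(),
})

# Per-person average divides by unique players, not by number of purchases.
gender_purchase_summary["Avg Total Purchase per Person"] = (
    gender_purchase_summary["Total Purchase Value"]
    / gender_purchase_summary["Unique Players"]
)

print(gender_purchase_summary)
```

Dividing by unique players matters because a player with several purchases would otherwise be counted once per purchase.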


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
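
The steps above could be sketched as follows. The ages come from rows shown earlier in this notebook; the bin edges and labels are assumptions (five-year groups, consistent with the 15-19, 20-24, and 25-29 groups mentioned in the summary at the top), and `age_summary` is an illustrative name.

```python
import pandas as pd

# Ages from rows shown earlier; assumed to be one row per player already.
players = pd.DataFrame({
    "SN": ["Lisim78", "Lisovynya38", "Ithergue48", "Chamassasya86", "Iskosia90", "Reulae52"],
    "Age": [20, 40, 24, 24, 23, 10],
})

# Assumed bin edges. pd.cut uses right-closed intervals by default,
# so the interval (19, 24] holds ages 20 through 24.
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
players["Age Group"] = pd.cut(players["Age"], bins=age_bins, labels=age_labels)

# sort=False keeps the groups in bin order rather than by frequency.
counts = players["Age Group"].value_counts(sort=False)
age_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})

print(age_summary)
```

On the full data set, duplicates on `SN` would need to be dropped first so that each player is counted once.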


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
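
One way to carry out these steps is to bin the purchases with `pd.cut` and then group on the resulting column. The sample below is hypothetical (invented players, ages, and prices), and the bin edges are the same assumed five-year groups as in the age demographics.

```python
import pandas as pd

# Hypothetical purchases; PlayerA bought twice.
purchases = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "SN": ["PlayerA", "PlayerA", "PlayerB", "PlayerC"],
    "Age": [22, 22, 24, 17],
    "Price": [1.00, 3.00, 2.00, 4.00],
})

# Assumed bins, right-closed: (19, 24] holds ages 20 through 24.
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
purchases["Age Group"] = pd.cut(purchases["Age"], bins=age_bins, labels=age_labels)

# observed=True skips age groups that have no purchases in the sample.
by_age = purchases.groupby("Age Group", observed=True)

age_purchase_summary = pd.DataFrame({
    "Purchase Count": by_age["Purchase ID"].count(),
    "Average Purchase Price": by_age["Price"].mean(),
    "Total Purchase Value": by_age["Price"].sum(),
})
age_purchase_summary["Avg Total Purchase per Person"] = (
    age_purchase_summary["Total Purchase Value"] / by_age["SN"].nunique()
)

print(age_purchase_summary)
```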

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
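
A possible sketch of the top-spenders table: group by screen name, aggregate the price column, and sort by total value. The player names and prices are hypothetical, and the column labels are assumptions based on the bullet list.

```python
import pandas as pd

# Hypothetical purchases for three players.
purchases = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4],
    "SN": ["PlayerA", "PlayerB", "PlayerA", "PlayerC", "PlayerB"],
    "Price": [1.50, 4.00, 2.50, 3.00, 4.50],
})

by_player = purchases.groupby("SN")["Price"]

top_spenders = pd.DataFrame({
    "Purchase Count": by_player.count(),
    "Average Purchase Price": by_player.mean(),
    "Total Purchase Value": by_player.sum(),
}).sort_values("Total Purchase Value", ascending=False)

# head() gives the preview; on this tiny sample it shows every row.
print(top_spenders.head())
```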



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
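
The steps above might look like the sketch below. The item IDs, names, and prices are taken from rows shown earlier, but the repeat purchases are invented for illustration. The sketch also assumes each item sells at a single price, which is why `first()` is used for the item price.

```python
import pandas as pd

# Item details from rows shown earlier; the repeat purchases are hypothetical.
purchases = pd.DataFrame({
    "Item ID": [108, 143, 108, 92, 108, 92],
    "Item Name": [
        "Extraction, Quickblade Of Trembling Hands", "Frenzied Scimitar",
        "Extraction, Quickblade Of Trembling Hands", "Final Critic",
        "Extraction, Quickblade Of Trembling Hands", "Final Critic",
    ],
    "Price": [3.53, 1.56, 3.53, 4.88, 3.53, 4.88],
})

items = purchases[["Item ID", "Item Name", "Price"]]
by_item = items.groupby(["Item ID", "Item Name"])["Price"]

item_summary = pd.DataFrame({
    "Purchase Count": by_item.count(),
    "Item Price": by_item.first(),  # assumes one price per item
    "Total Purchase Value": by_item.sum(),
})

most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```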



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
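
As a sketch, the item summary can be re-sorted by total value and then formatted for display. The table below is built directly with hypothetical purchase counts (item names and prices come from rows shown earlier) so that the most popular item and the most profitable item differ.

```python
import pandas as pd

# Hypothetical item summary, indexed by (Item ID, Item Name).
item_summary = pd.DataFrame(
    {
        "Purchase Count": [3, 2, 1],
        "Item Price": [1.56, 4.88, 1.44],
        "Total Purchase Value": [4.68, 9.76, 1.44],
    },
    index=pd.MultiIndex.from_tuples(
        [(143, "Frenzied Scimitar"), (92, "Final Critic"), (131, "Fury")],
        names=["Item ID", "Item Name"],
    ),
)

# Sort while the values are still numeric.
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Format a copy afterwards; formatted strings would sort lexicographically.
formatted = most_profitable.copy()
for column in ["Item Price", "Total Purchase Value"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted.head())
```

Sorting before formatting is the key design choice here: once the values are strings such as `"$9.76"`, sorting no longer reflects their numeric order.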

