### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np
import os

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [2]:
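# Work on a copy so purchase_data itself stays untouched
heroes_df = purchase_data.copy()
# 'Purchase ID' is kept as a regular column (set_index() returns a new frame
# and would only take effect if its result were assigned back)
pd.set_option('display.max_columns', 7)
pd.set_option('display.max_rows', 780)
# Replace spaces in column names with underscores so attribute access works
heroes_df = heroes_df.rename(columns={"Purchase ID":"Purchase_ID", "Item ID":"Item_ID", "Item Name":"Item_Name"})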

In [3]:
heroes_df.dtypes

Purchase_ID      int64
SN              object
Age              int64
Gender          object
Item_ID          int64
Item_Name       object
Price          float64
dtype: object

In [12]:
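# astype() returns a new frame, so heroes_df is unchanged by this preview.
# Casting Price to int64 would truncate the cents, so it is not assigned back.
heroes_df.astype({'Item_ID': 'object', 'Price':'int64'}).dtypes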

Purchase_ID     int64
SN             object
Age             int64
Gender         object
Item_ID        object
Item_Name      object
Price           int64
dtype: object

In [57]:
#total number of purchase rows (one row per purchase, so repeat buyers are counted more than once)
heroes_count = heroes_df['Purchase_ID'].count()
print(heroes_count)


780
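
The count above is the number of purchase rows. If a player can buy more than once, the number of distinct players is better taken from the unique `SN` values. A minimal sketch, assuming `SN` identifies a player; `sample_df` and its rows are made up for illustration and are not taken from `purchase_data.csv`:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c"],
    "Price": [2.00, 4.00, 3.00, 1.00],
})

# Rows in the frame: one per purchase
purchase_rows = len(sample_df)

# Distinct screen names: one per player
unique_players = sample_df["SN"].nunique()

print(purchase_rows, unique_players)
```

On the real data the same idea would be `heroes_df['SN'].nunique()`.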


## Purchasing Analysis (Total)

In [60]:
#average price
heroes_avg = heroes_df['Price'].mean()
heroes_avg

3.050987179487176

In [61]:
#number of unique items
heroes_unique = heroes_df.Item_ID.nunique()
heroes_unique

179

In [62]:
#sum of total sales
heroes_sum = heroes_df['Price'].sum()
heroes_sum

2379.77

In [63]:
heroes_df.describe()

| | Purchase_ID | Age | Item_ID | Price |
|---|---|---|---|---|
| count | 780.0 | 780.0 | 780.0 | 780.0 |
| mean | 389.5 | 22.714103 | 91.755128 | 3.050987 |
| std | 225.310896 | 6.659444 | 52.697702 | 1.169549 |
| min | 0.0 | 7.0 | 0.0 | 1.0 |
| 25% | 194.75 | 20.0 | 47.75 | 1.98 |
| 50% | 389.5 | 22.0 | 92.0 | 3.15 |
| 75% | 584.25 | 25.0 | 138.0 | 4.08 |
| max | 779.0 | 45.0 | 183.0 | 4.99 |


* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [66]:
# Note: Total_Players holds heroes_count, which is the number of purchase rows
heroes_df_summary = pd.DataFrame({'Unique_items':[heroes_unique], 'Avg_Price':[heroes_avg],'Total_Players':[heroes_count], 'Total_Sales':[heroes_sum]})

In [67]:
heroes_df_summary

| | Unique_items | Avg_Price | Total_Players | Total_Sales |
|---|---|---|---|---|
| 0 | 179 | 3.050987 | 780 | 2379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [43]:
#count of purchase rows per gender (a player with several purchases is counted once per purchase)
my_stat = ['count']
gender_demo = heroes_df.groupby(['Gender'], as_index=False)[['SN']].agg(my_stat)
gender_demo

| Gender | SN count |
|---|---|
| Female | 113 |
| Male | 652 |
| Other / Non-Disclosed | 15 |


In [72]:
#share of purchase rows per gender, as a fraction (multiply by 100 for a percentage)
percentage = (gender_demo/heroes_count)
percentage

| Gender | SN count |
|---|---|
| Female | 0.144872 |
| Male | 0.835897 |
| Other / Non-Disclosed | 0.019231 |
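
The figures above are based on purchase rows. To describe players rather than purchases, one option is to drop duplicate `SN` values first and then count by gender. A minimal sketch under that assumption; `sample_df`, its rows, and the column names `Total_Count` and `Percentage` are hypothetical:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c", "player_d"],
    "Gender": ["Male", "Male", "Female", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not counted twice
players = sample_df.drop_duplicates(subset="SN")

# Count players per gender and convert the counts to percentages
gender_counts = players["Gender"].value_counts()
gender_pct = (gender_counts / gender_counts.sum() * 100).round(2)

gender_summary = pd.DataFrame({
    "Total_Count": gender_counts,
    "Percentage": gender_pct,
})
print(gender_summary)
```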



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [73]:
#purchase count, average purchase price, and total purchase value by gender
my_stat = ['count', 'mean', 'sum']
gender_stats = heroes_df.groupby(['Gender'], as_index=False)[['Price']].agg(my_stat)
gender_stats

| Gender | Price count | Price mean | Price sum |
|---|---|---|---|
| Female | 113 | 3.203009 | 361.94 |
| Male | 652 | 3.017853 | 1967.64 |
| Other / Non-Disclosed | 15 | 3.346 | 50.19 |
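
The table above does not yet include the average purchase total per person. One way to obtain it is to divide each gender's total purchase value by its number of distinct players. A minimal sketch, assuming `SN` identifies a player; `sample_df`, its rows, and the summary column names are hypothetical:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [2.00, 4.00, 3.00, 3.00],
})

grouped = sample_df.groupby("Gender")

# Named aggregation gives flat, readable column names
gender_summary = grouped["Price"].agg(
    Purchase_Count="count",
    Avg_Price="mean",
    Total_Value="sum",
)

# Total spent per gender divided by distinct players of that gender
gender_summary["Avg_Total_per_Person"] = (
    gender_summary["Total_Value"] / grouped["SN"].nunique()
)
print(gender_summary)
```

Named aggregation also avoids the two-level column header produced by passing a list of statistics to `agg`.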


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
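
The steps above can be sketched with `pd.cut()`. The bin edges, labels, `sample_df`, and the summary column names below are assumptions chosen for illustration; the real notebook may use different age groups:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c", "player_d"],
    "Age": [8, 8, 22, 24, 41],
})

# Assumed bins: pd.cut uses right-closed intervals, e.g. (19, 24] -> "20-24"
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# One row per player, then assign each player to an age group
players = sample_df.drop_duplicates(subset="SN").copy()
players["Age_Group"] = pd.cut(players["Age"], bins=age_bins, labels=age_labels)

# Counts and percentages per age group, in bin order
age_counts = players["Age_Group"].value_counts().sort_index()
age_demo = pd.DataFrame({
    "Total_Count": age_counts,
    "Percentage": (age_counts / len(players) * 100).round(2),
})
print(age_demo)
```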


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
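
A possible sketch of the age-based purchasing analysis: bin every purchase row by age, then aggregate `Price` per bin. The bins, labels, `sample_df`, and summary column names are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c"],
    "Age": [8, 8, 22, 24],
    "Price": [2.00, 4.00, 3.00, 1.00],
})

# Assumed bins and labels (right-closed intervals)
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample_df["Age_Group"] = pd.cut(sample_df["Age"], bins=age_bins, labels=age_labels)

# observed=True keeps only the age groups that actually occur in the data
grouped = sample_df.groupby("Age_Group", observed=True)

age_summary = grouped["Price"].agg(
    Purchase_Count="count",
    Avg_Price="mean",
    Total_Value="sum",
)

# Total spent per age group divided by distinct players in that group
age_summary["Avg_Total_per_Person"] = (
    age_summary["Total_Value"] / grouped["SN"].nunique()
)
print(age_summary)
```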

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
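
One way to approach the top-spenders table is to group by `SN`, aggregate `Price`, and sort by the total. A minimal sketch; `sample_df`, its rows, and the summary column names are hypothetical:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c", "player_c", "player_c"],
    "Price": [2.00, 4.00, 3.00, 1.00, 1.50, 1.00],
})

# Per-player purchase count, average price, and total spent
spender_summary = sample_df.groupby("SN")["Price"].agg(
    Purchase_Count="count",
    Avg_Price="mean",
    Total_Value="sum",
)

# Highest total first, then preview the top rows
top_spenders = spender_summary.sort_values("Total_Value", ascending=False).head()
print(top_spenders)
```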



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, average item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
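
The item-popularity steps can be sketched as a group-by on the two item columns followed by a sort on purchase count. The column names follow the renamed frame (`Item_ID`, `Item_Name`); `sample_df`, its rows, and the summary column names are hypothetical:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "Item_ID": [1, 1, 1, 2, 2, 3],
    "Item_Name": ["item_x", "item_x", "item_x", "item_y", "item_y", "item_z"],
    "Price": [2.00, 2.00, 2.00, 5.00, 5.00, 1.00],
})

# Retrieve only the item columns, then aggregate per item
items = sample_df[["Item_ID", "Item_Name", "Price"]]
item_summary = items.groupby(["Item_ID", "Item_Name"])["Price"].agg(
    Purchase_Count="count",
    Item_Price="mean",
    Total_Value="sum",
)

# Most frequently purchased items first
most_popular = item_summary.sort_values("Purchase_Count", ascending=False).head()
print(most_popular)
```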



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
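
A sketch of the profitability view: the same per-item summary, sorted by total purchase value instead of purchase count, with currency formatting applied to a display copy only. `sample_df`, its rows, and the summary column names are hypothetical:

```python
import pandas as pd

# Hypothetical sample rows (illustration only, not the real data set)
sample_df = pd.DataFrame({
    "Item_ID": [1, 1, 1, 2, 2, 3],
    "Item_Name": ["item_x", "item_x", "item_x", "item_y", "item_y", "item_z"],
    "Price": [2.00, 2.00, 2.00, 5.00, 5.00, 1.00],
})

item_summary = sample_df.groupby(["Item_ID", "Item_Name"])["Price"].agg(
    Purchase_Count="count",
    Item_Price="mean",
    Total_Value="sum",
)

# Highest total purchase value first
most_profitable = item_summary.sort_values("Total_Value", ascending=False).head()

# Format a copy for display so the numeric columns stay usable for math
formatted = most_profitable.copy()
for col in ["Item_Price", "Total_Value"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)
print(formatted)
```

Formatting turns the values into strings, which is why it is applied last and to a copy.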

