### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [462]:
# Dependencies and Setup
import pandas as pd

# Path to the purchase data CSV; read it into a DataFrame
# (named input_file so the built-in input() is not shadowed)
input_file = "Resources/purchase_data.csv"

purchase_data = pd.read_csv(input_file)
purchase_data.dtypes

Purchase ID      int64
SN              object
Age              int64
Gender          object
Item ID          int64
Item Name       object
Price          float64
dtype: object

## Player Count

* Display the total number of players


In [449]:
# Count distinct screen names (SN) so that players with repeat purchases are counted once
player_count = purchase_data["SN"].nunique()

# Create total_players_df with a single "Total Players" column holding the count above
total_players_df = pd.DataFrame({"Total Players": [player_count]})
# Display the DataFrame with the index hidden
# (Styler.hide_index() was removed in pandas 2.0; use .hide(axis="index") there)
total_players_df.style.hide_index()

Total Players
576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [450]:
# Number of unique items
unique_items = purchase_data["Item Name"].nunique()

# Average price per purchase
average_price = purchase_data["Price"].mean()

# Average age across all purchase records
average_age = purchase_data["Age"].mean()

# Most popular item(s); mode() returns a Series because ties are possible
most_popular = purchase_data["Item Name"].mode()

# Table to store the above info; the scalar values are repeated for each row of most_popular
analysis_df = pd.DataFrame({"Unique Items": unique_items, "Average Price": average_price,
                            "Average Age": average_age, "Most Popular Item": most_popular})
# Set float display to 2 decimal places
# (this option affects plain DataFrame output; the Styler below does not use it)
pd.options.display.float_format = "{:,.2f}".format
analysis_df.style.hide_index()

Unique Items,Average Price,Average Age,Most Popular Item
179,3.05098718,22.71410256,Final Critic


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [498]:
# Count and percentage of male players (unique screen names)
males = purchase_data.loc[purchase_data["Gender"] == "Male", :]
male_count = males["SN"].nunique()
male_percentage = "{:.2f}%".format(male_count / player_count * 100)

# Count and percentage of female players
females = purchase_data.loc[purchase_data["Gender"] == "Female", :]
female_count = females["SN"].nunique()
female_percentage = "{:.2f}%".format(female_count / player_count * 100)

# Count and percentage of Other / Non-Disclosed players
other = purchase_data.loc[purchase_data["Gender"] == "Other / Non-Disclosed", :]
other_count = other["SN"].nunique()
other_percentage = "{:.2f}%".format(other_count / player_count * 100)

# Summary table with one row per gender
gender_demographic = pd.DataFrame({"Gender": ["Male", "Female", "Other / Non-Disclosed"],
                                   "Total of Gender": [male_count, female_count, other_count],
                                   "Percentage": [male_percentage, female_percentage, other_percentage]})

gender_demographic.style.hide_index()

Gender,Total of Gender,Percentage
Male,484,84.03%
Female,81,14.06%
Other / Non-Disclosed,11,1.91%
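
The three per-gender blocks above repeat the same filter, count, and format steps. A more compact variant is sketched below, assuming one row per player is obtained with `drop_duplicates` and the shares come from `value_counts(normalize=True)`. The `sample` rows are made up for illustration, and the column labels are a choice rather than something the assignment prescribes.

```python
import pandas as pd

# Made-up rows with the same columns as purchase_data.csv (illustration only)
sample = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4", "Bob2"],
    "Age": [8, 17, 8, 23, 41, 17],
    "Gender": ["Female", "Male", "Female", "Male", "Other / Non-Disclosed", "Male"],
    "Item ID": [1, 2, 2, 1, 3, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Sword", "Bow", "Shield"],
    "Price": [2.00, 3.00, 3.00, 2.00, 4.50, 3.00],
})

# Keep one row per player so repeat purchasers are counted once
players = sample.drop_duplicates(subset="SN")

# Player count and percentage share per gender
counts = players["Gender"].value_counts()
percentages = players["Gender"].value_counts(normalize=True) * 100

gender_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percentages.map("{:.2f}%".format),
})
print(gender_demographics)
```

This relies on each screen name mapping to a single gender; if a player appeared under two genders, `drop_duplicates` would keep only the first row.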



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [519]:
# Purchase counts for Male, Female, and Other / Non-Disclosed
male_purchase = purchase_data.loc[purchase_data["Gender"] == "Male"]
purchase_count_male = len(male_purchase)

female_purchase = purchase_data.loc[purchase_data["Gender"] == "Female"]
purchase_count_female = len(female_purchase)

other_purchase = purchase_data.loc[purchase_data["Gender"] == "Other / Non-Disclosed"]
purchase_count_other = len(other_purchase)

# Total purchase value per gender
male_total = male_purchase["Price"].sum()
female_total = female_purchase["Price"].sum()
other_total = other_purchase["Price"].sum()

# Average purchase price per gender (total spent / number of purchases)
male_avg_purchase = male_total / purchase_count_male
female_avg_purchase = female_total / purchase_count_female
other_avg_purchase = other_total / purchase_count_other

# Average total purchase per person ("pp") per gender,
# using the unique player counts from the Gender Demographics cell
male_purchase_pp = male_total / male_count
female_purchase_pp = female_total / female_count
other_purchase_pp = other_total / other_count

# Build the summary DataFrame; sorting and formatting follow
gender_purchase_analysis = pd.DataFrame({"Gender": ["Male", "Female", "Other / Non-Disclosed"],
                                    "Purchase Count": [purchase_count_male, purchase_count_female, purchase_count_other],
                                    "Average Purchase Price": [male_avg_purchase, female_avg_purchase, other_avg_purchase],
                                    "Total Purchase Value": [male_total, female_total, other_total],
                                    "Avg Purchase per Person": [male_purchase_pp, female_purchase_pp, other_purchase_pp]})
# Sort by average purchase per person, highest first
gender_purchase_analysis = gender_purchase_analysis.sort_values("Avg Purchase per Person", ascending=False)
gender_purchase_analysis.style.format({"Average Purchase Price": "${:.2f}", "Total Purchase Value": "${:.2f}", "Avg Purchase per Person": "${:.2f}"})

,Gender,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Purchase per Person
2,Other / Non-Disclosed,15,$3.35,$50.19,$4.56
1,Female,113,$3.20,$361.94,$4.47
0,Male,652,$3.02,$1967.64,$4.07
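
The same per-gender figures can also be produced with a single `groupby`, which avoids repeating the filter for each gender. The sketch below is one possible variant, not the cell that generated the table above; the `sample` rows are made up for illustration, and the column names simply mirror the table.

```python
import pandas as pd

# Made-up rows with the same columns as purchase_data.csv (illustration only)
sample = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4", "Bob2"],
    "Age": [8, 17, 8, 23, 41, 17],
    "Gender": ["Female", "Male", "Female", "Male", "Other / Non-Disclosed", "Male"],
    "Item ID": [1, 2, 2, 1, 3, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Sword", "Bow", "Shield"],
    "Price": [2.00, 3.00, 3.00, 2.00, 4.50, 3.00],
})

by_gender = sample.groupby("Gender")

# One aggregate per column, all indexed by gender
summary = pd.DataFrame({
    "Purchase Count": by_gender["Price"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})

# Divide total spend by the number of unique players of each gender
summary["Avg Purchase per Person"] = (
    summary["Total Purchase Value"] / by_gender["SN"].nunique()
)

summary = summary.sort_values("Avg Purchase per Person", ascending=False)
print(summary)
```

Because every column is indexed by `Gender`, pandas aligns the division row by row, so no manual bookkeeping of per-gender variables is needed.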



## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


Gender,mean,sum
Female,3.2,361.94
Male,3.02,1967.64
Other / Non-Disclosed,3.35,50.19
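
No code cell for the age-demographics steps appears in this section, so the following is only a sketch of how they might be carried out with `pd.cut()`. The bin edges, the labels, and the `sample` rows are assumptions made for illustration; the assignment does not fix them here.

```python
import pandas as pd

# Made-up rows with the same columns as purchase_data.csv (illustration only)
sample = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4", "Bob2"],
    "Age": [8, 17, 8, 23, 41, 17],
    "Gender": ["Female", "Male", "Female", "Male", "Other / Non-Disclosed", "Male"],
    "Item ID": [1, 2, 2, 1, 3, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Sword", "Bow", "Shield"],
    "Price": [2.00, 3.00, 3.00, 2.00, 4.50, 3.00],
})

# Hypothetical bin edges and labels; each bin includes its right edge
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Keep one row per player so repeat purchasers are counted once
players = sample.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins=bins, labels=labels)

# sort=False keeps the age groups in bin order instead of ordering by count
counts = players["Age Group"].value_counts(sort=False)

age_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / len(players) * 100).round(2),
})
print(age_demographics)
```

Deduplicating on `SN` before binning matters here: counting raw purchase rows would overstate age groups whose players buy more often.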


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
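
A possible way to carry out the steps listed above is sketched here, binning the purchase rows by age and then aggregating per bin. The bin edges, labels, summary column names, and the `sample` rows are all assumptions made for illustration.

```python
import pandas as pd

# Made-up rows with the same columns as purchase_data.csv (illustration only)
sample = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4", "Bob2"],
    "Age": [8, 17, 8, 23, 41, 17],
    "Gender": ["Female", "Male", "Female", "Male", "Other / Non-Disclosed", "Male"],
    "Item ID": [1, 2, 2, 1, 3, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Sword", "Bow", "Shield"],
    "Price": [2.00, 3.00, 3.00, 2.00, 4.50, 3.00],
})

# Hypothetical bin edges and labels; each bin includes its right edge
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample["Age Group"] = pd.cut(sample["Age"], bins=bins, labels=labels)

# observed=True drops age groups that contain no purchases
by_age = sample.groupby("Age Group", observed=True)

age_summary = by_age["Price"].agg(["count", "mean", "sum"]).rename(columns={
    "count": "Purchase Count",
    "mean": "Average Purchase Price",
    "sum": "Total Purchase Value",
})

# Total spend divided by the number of unique players in each age group
age_summary["Avg Total Purchase per Person"] = (
    age_summary["Total Purchase Value"] / by_age["SN"].nunique()
)
print(age_summary)
```

Unlike the demographics count, this table is built from every purchase row, so no deduplication is applied before grouping.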

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
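
The top-spenders steps above might be sketched as a `groupby` on the screen name followed by a descending sort. The `sample` rows and the summary column names are made up for illustration.

```python
import pandas as pd

# Made-up rows with the same columns as purchase_data.csv (illustration only)
sample = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4", "Bob2"],
    "Age": [8, 17, 8, 23, 41, 17],
    "Gender": ["Female", "Male", "Female", "Male", "Other / Non-Disclosed", "Male"],
    "Item ID": [1, 2, 2, 1, 3, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Sword", "Bow", "Shield"],
    "Price": [2.00, 3.00, 3.00, 2.00, 4.50, 3.00],
})

# Aggregate each player's purchases
top_spenders = sample.groupby("SN")["Price"].agg(["count", "mean", "sum"]).rename(columns={
    "count": "Purchase Count",
    "mean": "Average Purchase Price",
    "sum": "Total Purchase Value",
})

# Highest total spend first
top_spenders = top_spenders.sort_values("Total Purchase Value", ascending=False)

# Preview the first rows
print(top_spenders.head())
```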



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, average item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
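
One way the item-popularity steps above could look is sketched below. Note that the loaded data names the price column `Price`, so the sketch renames the per-item mean to "Item Price"; that label, the other column names, and the `sample` rows are assumptions for illustration.

```python
import pandas as pd

# Made-up rows with the same columns as purchase_data.csv (illustration only)
sample = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4", "Bob2"],
    "Age": [8, 17, 8, 23, 41, 17],
    "Gender": ["Female", "Male", "Female", "Male", "Other / Non-Disclosed", "Male"],
    "Item ID": [1, 2, 2, 1, 3, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Sword", "Bow", "Shield"],
    "Price": [2.00, 3.00, 3.00, 2.00, 4.50, 3.00],
})

# Keep only the item-related columns
items = sample[["Item ID", "Item Name", "Price"]]

# Group by both ID and name so the name appears in the result's index
popular = items.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "mean", "sum"]).rename(columns={
    "count": "Purchase Count",
    "mean": "Item Price",
    "sum": "Total Purchase Value",
})

# Most frequently purchased items first
popular = popular.sort_values("Purchase Count", ascending=False)
print(popular.head())
```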



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
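
The profitability view is the same per-item summary sorted by a different column. The sketch below rebuilds the summary so that it runs on its own, then sorts by total purchase value and formats a copy for display. The `sample` rows and column names are made up for illustration.

```python
import pandas as pd

# Made-up rows with the same columns as purchase_data.csv (illustration only)
sample = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4", "Bob2"],
    "Age": [8, 17, 8, 23, 41, 17],
    "Gender": ["Female", "Male", "Female", "Male", "Other / Non-Disclosed", "Male"],
    "Item ID": [1, 2, 2, 1, 3, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Sword", "Bow", "Shield"],
    "Price": [2.00, 3.00, 3.00, 2.00, 4.50, 3.00],
})

# Per-item summary (purchase count, price, total value)
item_summary = sample.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "mean", "sum"]).rename(columns={
    "count": "Purchase Count",
    "mean": "Item Price",
    "sum": "Total Purchase Value",
})

# Items that brought in the most revenue first
profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Format a copy for display so the numeric table stays usable for further math
display_table = profitable.copy()
for col in ["Item Price", "Total Purchase Value"]:
    display_table[col] = display_table[col].map("${:,.2f}".format)
print(display_table.head())
```

Sorting before formatting is deliberate: once the values are strings such as `$9.00`, a sort would compare them as text rather than as numbers.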

