### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic falls between 20-24 (44.8%), with secondary groups falling between 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [None]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

In [2]:
purchase_data.head()

## Player Count

* Display the total number of players


In [None]:
# Count unique screen names (SN): a player can appear in several purchase rows
num_total_players = purchase_data["SN"].nunique()
num_total_players

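Counting rows instead of unique screen names would overstate the player count, because a player who buys several items appears once per purchase. A minimal sketch on made-up data (the screen names and prices below are hypothetical, not taken from `purchase_data.csv`):

```python
import pandas as pd

# Hypothetical toy purchases: "Lisa12" bought twice
toy = pd.DataFrame({"SN": ["Lisa12", "Aeda94", "Lisa12"],
                    "Price": [1.99, 3.50, 2.25]})

num_rows = len(toy)                # counts purchases, one per row
num_players = toy["SN"].nunique()  # counts distinct players
print(num_rows, num_players)
```

The two numbers differ whenever at least one player has more than one purchase.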
## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [None]:
# Each distinct Item ID counts once, however many times it was purchased
num_of_unique_items = purchase_data["Item ID"].nunique()
# One row per purchase
num_purchases = len(purchase_data)
total_revenue = purchase_data["Price"].sum()
average_price = total_revenue / num_purchases
df_summary = pd.DataFrame({'Number of Unique Items':[num_of_unique_items],
                           'Average Price':[average_price],
                           'Number of Purchases':[num_purchases],
                           'Total Revenue':[total_revenue]})
# Format the currency columns for display (this converts them to strings)
df_summary['Average Price'] = df_summary['Average Price'].map("${:,.2f}".format)
df_summary['Total Revenue'] = df_summary['Total Revenue'].map("${:,.2f}".format)
df_summary

## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [None]:
# Count unique players (SN) per gender, so repeat buyers are counted once
player_demographics = purchase_data.groupby("Gender")["SN"].nunique().to_frame("Total Count")
player_demographics["Percentage of Players"] = 100*player_demographics["Total Count"]/num_total_players
player_demographics["Percentage of Players"] = player_demographics["Percentage of Players"].map("{:.2f}%".format)
player_demographics

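As an alternative sketch of the count-and-percentage step, `value_counts(normalize=True)` returns the shares directly once the data has been reduced to one row per player. The toy frame below is hypothetical and only stands in for `purchase_data`:

```python
import pandas as pd

# Hypothetical toy purchases; player "A" appears twice but is one player
toy = pd.DataFrame({"SN": ["A", "A", "B", "C", "D"],
                    "Gender": ["Male", "Male", "Female", "Male", "Male"]})

# Keep one row per player before counting
players = toy.drop_duplicates("SN")
counts = players["Gender"].value_counts()
shares = players["Gender"].value_counts(normalize=True) * 100

gender_table = pd.DataFrame({"Total Count": counts,
                             "Percentage of Players": shares.round(2)})
print(gender_table)
```

Keeping the percentage numeric until the final display step leaves the column usable for further arithmetic.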

## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person, etc., by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:
purchase_data_by_gender = purchase_data.groupby("Gender")
purchase_count = purchase_data_by_gender["Purchase ID"].count()
average_purchase_price = purchase_data_by_gender["Price"].mean()
total_purchase_value = purchase_data_by_gender["Price"].sum()
# Divide by unique players per gender (not by purchases) to get spend per person
average_total_purchase = total_purchase_value/player_demographics["Total Count"]
df_gender_summary = pd.DataFrame({'Purchase Count':purchase_count,
                                  'Average Purchase Price':average_purchase_price,
                                  'Total Purchase Value':total_purchase_value,
                                  'Avg Total Purchase per Person':average_total_purchase})
# Format the currency columns for display (this converts them to strings)
df_gender_summary['Average Purchase Price'] = df_gender_summary['Average Purchase Price'].map("${:,.2f}".format)
df_gender_summary['Total Purchase Value'] = df_gender_summary['Total Purchase Value'].map("${:,.2f}".format)
df_gender_summary['Avg Total Purchase per Person'] = df_gender_summary['Avg Total Purchase per Person'].map("${:,.2f}".format)
df_gender_summary

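The same per-gender figures can also be computed in a single pass with named aggregation. This is a sketch on hypothetical toy data, and the output column names are made up for the illustration rather than taken from the notebook:

```python
import pandas as pd

# Hypothetical toy purchases: player "A" bought twice
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [1.00, 3.00, 2.50, 4.00],
})

# One groupby, several aggregations; each keyword names an output column
summary = toy.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    average_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
    unique_players=("SN", "nunique"),
)

# Spend per person divides by players, not by purchases
summary["avg_total_per_person"] = (
    summary["total_purchase_value"] / summary["unique_players"]
)
print(summary)
```

In this toy data the male group has three purchases but only two players, so the average purchase price and the average total per person differ.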
## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [None]:
# Bin edges are right-inclusive: (0, 9], (9, 14], ..., (39, 100]
bins = [0,9,14,19,24,29,34,39,100]
bin_labels = ["<10","10-14","15-19","20-24","25-29","30-34","35-39","40+"]
# One row per player, so repeat buyers are not counted twice;
# .copy() avoids a SettingWithCopyWarning when the new column is added
players_df = purchase_data[["SN","Age"]].drop_duplicates("SN").copy()
players_df["Age Group"] = pd.cut(players_df["Age"],bins,labels=bin_labels)
# observed=False keeps empty age groups in the table
players_age_group = players_df.groupby("Age Group",observed=False)[["Age"]].count()
players_age_group = players_age_group.rename(columns={"Age":"Total Count"})
players_age_group["Percentage of Players"] = 100*players_age_group["Total Count"]/num_total_players
players_age_group["Percentage of Players"] = players_age_group["Percentage of Players"].map("{:.2f}%".format)
players_age_group

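To see how `pd.cut` assigns ages that sit exactly on a bin edge, here is a small sketch using the same bins and labels as above. The ages themselves are hypothetical values chosen to land on the boundaries:

```python
import pandas as pd

bins = [0, 9, 14, 19, 24, 29, 34, 39, 100]
bin_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Hypothetical ages chosen to sit on the bin edges
ages = pd.Series([9, 10, 24, 25, 40])

# By default each bin includes its right edge and excludes its left edge
groups = pd.cut(ages, bins, labels=bin_labels)
print(groups.tolist())
```

Because the intervals are right-inclusive, an age of 9 lands in `<10` and an age of 10 in `10-14`. With these edges, an age of 0 or an age above 100 would fall outside every bin and come back as `NaN`.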
## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person, etc., in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:
# Label every purchase row with the buyer's age group
purchase_data["Age Group"] = pd.cut(purchase_data["Age"],bins,labels=bin_labels)
# observed=False keeps empty age groups in the table
age_groupby = purchase_data.groupby("Age Group",observed=False)["Price"]
age_purchase_count = age_groupby.count()
age_avg_purchase_price = age_groupby.mean()
age_total_purchase_value = age_groupby.sum()
# Divide by unique players per age group to get spend per person
age_total_purchase_person = age_total_purchase_value/players_age_group["Total Count"]
age_group_purchase_df = pd.DataFrame({"Purchase Count":age_purchase_count,
                                      "Average Purchase Price":age_avg_purchase_price.map("${:,.2f}".format),
                                      "Total Purchase Value":age_total_purchase_value.map("${:,.2f}".format),
                                      "Avg Total Purchase per Person":age_total_purchase_person.map("${:,.2f}".format)})
age_group_purchase_df

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [None]:
sn_groupby = purchase_data.groupby("SN")
sn_purchase_count = sn_groupby["Purchase ID"].count()
sn_avg_purchase_price = sn_groupby["Price"].mean()
sn_total_purchase_value = sn_groupby["Price"].sum()
spenders_df = pd.DataFrame({"Purchase Count":sn_purchase_count,
                            "Average Purchase Price":sn_avg_purchase_price,
                            "Total Purchase Value":sn_total_purchase_value})
# Sort while the values are still numeric; .copy() avoids a SettingWithCopyWarning below
top_five_spenders = spenders_df.sort_values("Total Purchase Value",ascending=False).head(5).copy()
top_five_spenders["Average Purchase Price"] = top_five_spenders["Average Purchase Price"].map("${:,.2f}".format)
top_five_spenders["Total Purchase Value"] = top_five_spenders["Total Purchase Value"].map("${:,.2f}".format)
top_five_spenders

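The order of the sorting and formatting steps matters: once a column has been formatted as currency it holds strings, and strings sort character by character rather than by value. A minimal sketch with hypothetical totals:

```python
import pandas as pd

# Hypothetical totals for three players
toy = pd.DataFrame({"Total Purchase Value": [9.50, 10.00, 100.25]},
                   index=["A", "B", "C"])

# Sort while the column is still numeric
numeric_order = toy.sort_values(
    "Total Purchase Value", ascending=False
).index.tolist()

# Formatting first turns the values into strings, which sort lexicographically
formatted = toy.copy()
formatted["Total Purchase Value"] = formatted["Total Purchase Value"].map("${:,.2f}".format)
string_order = formatted.sort_values(
    "Total Purchase Value", ascending=False
).index.tolist()

print(numeric_order)
print(string_order)
```

Sorting the formatted column puts `$9.50` ahead of `$100.25`, which is why the sort comes before the formatting.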
## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [None]:
items = purchase_data[["Item ID","Item Name","Price"]]
items_groupby = items.groupby(["Item ID","Item Name"])
# Count purchases on a non-key column; "Item ID" is one of the grouping keys
item_purchase_count = items_groupby["Price"].count()
item_price = items_groupby["Price"].mean()
item_total_purchase_value = items_groupby["Price"].sum()
items_df = pd.DataFrame({"Purchase Count":item_purchase_count,
                         "Item Price":item_price,
                         "Total Purchase Value":item_total_purchase_value})
# Sort while the values are still numeric; .copy() avoids a SettingWithCopyWarning below
five_popular_items = items_df.sort_values("Purchase Count",ascending=False).head(5).copy()
five_popular_items["Item Price"] = five_popular_items["Item Price"].map("${:,.2f}".format)
five_popular_items["Total Purchase Value"] = five_popular_items["Total Purchase Value"].map("${:,.2f}".format)
five_popular_items

## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [None]:
# Reuse the numeric items_df; .copy() avoids a SettingWithCopyWarning below
five_profitable_items = items_df.sort_values("Total Purchase Value",ascending=False).head(5).copy()
five_profitable_items["Item Price"] = five_profitable_items["Item Price"].map("${:,.2f}".format)
five_profitable_items["Total Purchase Value"] = five_profitable_items["Total Purchase Value"].map("${:,.2f}".format)
five_profitable_items
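
Both item rankings come from the same grouped table and differ only in the column used for ranking. As a sketch of that idea on hypothetical items, `nlargest` can stand in for `sort_values(...).head(...)`. The item names, prices, and output column names here are invented for the illustration:

```python
import pandas as pd

# Hypothetical toy purchases
toy = pd.DataFrame({
    "Item ID": [1, 1, 2, 3, 3, 3],
    "Item Name": ["Sword", "Sword", "Shield", "Bow", "Bow", "Bow"],
    "Price": [4.00, 4.00, 2.50, 1.00, 1.00, 1.00],
})

# One row per (Item ID, Item Name) pair
items = toy.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
)

# Same table, two rankings
most_popular = items.nlargest(2, "purchase_count")
most_profitable = items.nlargest(2, "total_purchase_value")
print(most_popular)
print(most_profitable)
```

In this toy data the cheap item is bought most often, while the expensive item brings in the most revenue, so the two rankings disagree.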