### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).

-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [None]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

# Display
purchase_data

## Player Count

* Display the total number of players


In [None]:
# Total amount of Players
total_players = len((purchase_data["SN"]).unique())
total_players_df = pd.DataFrame({"Total Players": [total_players]})

# Display
total_players_df
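
A player who buys more than once appears on several rows, which is why the count above is taken over unique `SN` values rather than over rows. A minimal sketch of the difference, using a made-up three-row frame (the names and prices are illustrative and not taken from `purchase_data.csv`):

```python
import pandas as pd

# Hypothetical toy data: "Lisa" bought twice, "Ilan" once.
toy = pd.DataFrame({
    "SN": ["Lisa", "Ilan", "Lisa"],
    "Price": [3.53, 1.56, 4.88],
})

row_count = len(toy)                # counts purchases (rows)
player_count = toy["SN"].nunique()  # counts distinct players

print(row_count, player_count)
```

Note that `nunique()` ignores missing values by default, whereas `len(series.unique())` counts `NaN` as a value, so the two can differ if `SN` has gaps.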

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [None]:
# Number of unique items based on Item ID
unique_items = len((purchase_data["Item ID"]).unique())

# Average Price
average_price = purchase_data["Price"].mean()

# Number of purchases based on Purchase ID
num_of_purchases = len((purchase_data["Purchase ID"]).unique())

# Total Revenue summed by Price column
total_revenue = purchase_data["Price"].sum()

# Create DataFrame
purchase_summary_df = pd.DataFrame({"Number of Unique Items": [unique_items], "Average Price": [average_price], 
                                     "Number of Purchases": [num_of_purchases], "Total Revenue": [total_revenue]})

# Format DataFrame
purchase_summary_df.style.format({'Average Price': "${:,.2f}",
                                'Total Revenue': "${:,.2f}"})
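
The strings passed to `style.format` above are ordinary Python format specifications, so their effect can be checked without pandas. A small sketch with hypothetical numbers standing in for the average price and total revenue:

```python
# Hypothetical values; the real ones come from purchase_data.
average_price = 3.050987
total_revenue = 2379.77

# "," adds thousands separators, ".2f" fixes two decimal places.
avg_text = "${:,.2f}".format(average_price)
total_text = "${:,.2f}".format(total_revenue)

print(avg_text)    # $3.05
print(total_text)  # $2,379.77
```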

## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [None]:
# Group by Gender
gender_counts_df = purchase_data.groupby("Gender")

# Total Players by Gender
gender_totals_df = pd.DataFrame({'Total Count': gender_counts_df["SN"].nunique()})

# Sort Descending
gender_totals_df = gender_totals_df.sort_values(by=['Total Count'], ascending=False)

# Sum Total Players
player_totals = gender_totals_df['Total Count'].sum()

# Percentage of Players by Gender
gender_totals_df['Percentage of Players'] = (gender_totals_df['Total Count'] / player_totals)

# Formatting
gender_totals_df['Percentage of Players'] = gender_totals_df['Percentage of Players'].astype(float).map("{:.2%}".format)

# Display
gender_totals_df
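
An alternative way to reach the same kind of table is to reduce the purchases to one row per player first and then count genders. The sketch below assumes each `SN` is associated with a single gender, and uses made-up rows rather than the real data:

```python
import pandas as pd

# Hypothetical toy data: one row per purchase, so "Lisa" appears twice.
toy = pd.DataFrame({
    "SN": ["Lisa", "Ilan", "Lisa", "Chamassasya"],
    "Gender": ["Male", "Male", "Male", "Female"],
})

# Keep one row per player, then count players per gender.
players = toy.drop_duplicates(subset="SN")
counts = players["Gender"].value_counts()
shares = players["Gender"].value_counts(normalize=True)

summary = pd.DataFrame({"Total Count": counts, "Percentage of Players": shares})
print(summary)
```

`value_counts` already sorts in descending order, so no separate sort step is needed in this variant.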


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:
# Total Purchases
purchase_count = gender_counts_df['Purchase ID'].count()

# Average Purchase Price
avg_purchase = gender_counts_df['Price'].mean()

# Total Purchase Amount
total_purchase = gender_counts_df['Price'].sum()

# Totals by gender
gen_totals = gender_counts_df["SN"].nunique()

# Average Total Purchase per Person
purchase_person = total_purchase / gen_totals

# Create DataFrame
purch_analysis_df = pd.DataFrame({'Purchase Count': purchase_count, 'Average Purchase Price': avg_purchase, 
                               'Total Purchase Value': total_purchase, 'Avg Total Purchase per Person': purchase_person})

# Format DataFrame
purch_analysis_df['Average Purchase Price'] = purch_analysis_df['Average Purchase Price'].astype(float).map("${:.2f}".format)
purch_analysis_df['Total Purchase Value'] = purch_analysis_df['Total Purchase Value'].astype(float).map("${:.2f}".format)
purch_analysis_df['Avg Total Purchase per Person'] = purch_analysis_df['Avg Total Purchase per Person'].astype(float).map("${:.2f}".format)
purch_analysis_df.index.name = ''

# Display DataFrame
purch_analysis_df
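
The four per-gender statistics can also be computed in one pass with named aggregation. This is a sketch of that alternative on made-up rows, not the approach used in the cell above; the output column names are hypothetical:

```python
import pandas as pd

# Hypothetical toy data with the same column names as purchase_data.
toy = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "SN": ["Lisa", "Ilan", "Lisa", "Chamassasya"],
    "Gender": ["Male", "Male", "Male", "Female"],
    "Price": [2.00, 4.00, 3.00, 5.00],
})

# Each keyword is an output column: (source column, aggregation).
summary = toy.groupby("Gender").agg(
    purchase_count=("Purchase ID", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Spend per person divides by distinct players, not by purchases.
summary["avg_total_per_person"] = summary["total_value"] / summary["players"]
print(summary)
```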

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [None]:
# Create Bins
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]

# Set Group Names
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Categorize players into age bins
purchase_data["Age Range"] = pd.cut(purchase_data['Age'], bins, labels=group_names, include_lowest=True)
age_df = purchase_data.groupby("Age Range")

# Count Unique SN by age
age_count = age_df["SN"].nunique()

# Obtain Total Players
age_totals = age_count.sum()

# Create DataFrame
age_demo_df = pd.DataFrame({'Total Count': age_count, 'Percentage of Players': (age_count/age_totals)})

# Format DataFrame
age_demo_df['Percentage of Players'] = age_demo_df['Percentage of Players'].astype(float).map("{:.2%}".format)

# Display DataFrame
age_demo_df
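
`pd.cut` treats each bin as open on the left and closed on the right, so an age equal to a bin edge falls into the lower group. The sketch below reuses the bins and labels from the cell above with a few hypothetical ages chosen to sit on or next to the edges:

```python
import pandas as pd

bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Hypothetical ages on or beside the bin edges.
ages = pd.Series([9, 10, 14, 15, 39, 40])

labels = pd.cut(ages, bins, labels=group_names, include_lowest=True)
print(labels.tolist())
# ['<10', '10-14', '10-14', '15-19', '35-39', '40+']
```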

## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:
# Count number of purchases
purchase_count_age = age_df['Purchase ID'].count()

# Average Purchase Price
avg_purchase_age = age_df['Price'].mean()

# Sum Purchase Price
total_purchase_age = age_df['Price'].sum()

# Avg Total Purchase per Person
purchase_age = total_purchase_age / age_count

# Create DataFrame
age_analysis_df = pd.DataFrame({'Purchase Count': purchase_count_age, 'Average Purchase Price': avg_purchase_age, 
                               'Total Purchase Value': total_purchase_age, 'Avg Total Purchase per Person': purchase_age})

# Format DataFrame
age_analysis_df['Average Purchase Price'] = age_analysis_df['Average Purchase Price'].astype(float).map("${:.2f}".format)
age_analysis_df['Total Purchase Value'] = age_analysis_df['Total Purchase Value'].astype(float).map("${:.2f}".format)
age_analysis_df['Avg Total Purchase per Person'] = age_analysis_df['Avg Total Purchase per Person'].astype(float).map("${:.2f}".format)
age_analysis_df.index.name = ''

# Display DataFrame
age_analysis_df
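
Because `Age Range` is a categorical column, grouping by it can either keep or drop age groups that contain no purchases, depending on the `observed` argument. The default for `observed` depends on the pandas version, so the sketch below passes it explicitly; the data and the three-bin layout are made up for illustration:

```python
import pandas as pd

# Hypothetical toy data: nobody falls in the "<20" group.
toy = pd.DataFrame({"Age": [21, 22, 37], "Price": [2.00, 4.00, 3.00]})
toy["Age Range"] = pd.cut(toy["Age"], [0, 19, 29, 39], labels=["<20", "20-29", "30-39"])

# observed=False keeps categories with no rows; observed=True drops them.
all_groups = toy.groupby("Age Range", observed=False)["Price"].count()
seen_groups = toy.groupby("Age Range", observed=True)["Price"].count()

print(all_groups)
print(seen_groups)
```

Keeping empty groups makes the table show every age range, at the cost of a `NaN` average for any range with no purchases.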

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [None]:
# Group purchase_data by "SN"
sn = purchase_data.groupby("SN")

# Count number of purchases
purchase_count_sn = sn['Purchase ID'].count()

# Average Purchase Price
average_purchase_sn = sn['Price'].mean()

# Sum Purchase Price
total_purchase_sn = sn['Price'].sum()

# Create DataFrame
spenders_df = pd.DataFrame({'Purchase Count': purchase_count_sn,
                            'Average Purchase Price': average_purchase_sn,
                            'Total Purchase Value': total_purchase_sn})

# Sort DataFrame
spenders_df = spenders_df.sort_values(["Total Purchase Value"], ascending=False)

# Format DataFrame
spenders_df['Average Purchase Price'] = spenders_df['Average Purchase Price'].astype(float).map("${:.2f}".format)
spenders_df['Total Purchase Value'] = spenders_df['Total Purchase Value'].astype(float).map("${:.2f}".format)
spenders_df.index.name = 'SN'

# Display DataFrame
spenders_df.head()
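
The cell above sorts before formatting, and the order of those two steps matters: once the totals are converted to strings such as `"$9.50"`, sorting compares them character by character rather than numerically. A short sketch with hypothetical totals:

```python
import pandas as pd

# Hypothetical per-player totals.
totals = pd.Series({"Lisa": 9.50, "Ilan": 10.00, "Chamassasya": 100.25})

# Sorting the numbers gives the intended ranking.
numeric_order = totals.sort_values(ascending=False).index.tolist()

# Sorting the formatted strings ranks "$9.50" above "$100.25".
formatted = totals.map("${:.2f}".format)
string_order = formatted.sort_values(ascending=False).index.tolist()

print(numeric_order)  # ['Chamassasya', 'Ilan', 'Lisa']
print(string_order)   # ['Lisa', 'Chamassasya', 'Ilan']
```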

## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [None]:
# Retrieve Item ID, Name, Price from purchase_data
pop_df = purchase_data.loc[:, ["Item ID", "Item Name", "Price"]]

# Groupby
pop_items = pop_df.groupby(['Item ID', 'Item Name'])

# Purchase Count
purchase_count_pop = pop_items['Price'].count()

# Total Purchase Value
purchase_count_price = pop_items['Price'].sum()

# Create DataFrame
most_pop_df = pd.DataFrame({'Purchase Count': purchase_count_pop, 
                            'Total Purchase Value': purchase_count_price})

# Item Price
most_pop_df['Item Price'] = most_pop_df['Total Purchase Value'] / most_pop_df['Purchase Count']

# Organize DataFrame
most_pop_df = most_pop_df[['Purchase Count', 'Item Price', 'Total Purchase Value']]

# Keep an unformatted copy for the Most Profitable Items step
copy_most_pop_df = most_pop_df.copy()

# Sort Values
most_pop_df = most_pop_df.sort_values(['Purchase Count'], ascending=False)

# Format DataFrame
most_pop_df['Item Price'] = most_pop_df['Item Price'].astype(float).map("${:.2f}".format)
most_pop_df['Total Purchase Value'] = most_pop_df['Total Purchase Value'].astype(float).map("${:.2f}".format)

# Display DataFrame
most_pop_df.head()
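
The unformatted table is reused in the next section, so it is worth being clear about the difference between a second name for a DataFrame and an independent copy of it. In the cell above the formatting is applied to the new object returned by `sort_values`, so the saved table stays numeric either way, but the distinction matters whenever a table is modified in place. A sketch with hypothetical item labels and prices:

```python
import pandas as pd

# Hypothetical items and prices.
table = pd.DataFrame({"Item Price": [4.61, 4.23]}, index=["Item A", "Item B"])

alias = table            # second name for the same object
snapshot = table.copy()  # independent copy of the data

# Formatting in place turns the column into strings.
table["Item Price"] = table["Item Price"].map("${:.2f}".format)

print(alias is table)                   # True
print(alias["Item Price"].tolist())     # ['$4.61', '$4.23']
print(snapshot["Item Price"].tolist())  # [4.61, 4.23]
```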

## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [None]:
# Create new DataFrame and sort by Total Purchase Value
most_profit_df = copy_most_pop_df.sort_values(['Total Purchase Value'], ascending=False)

# Formatting
most_profit_df['Item Price'] = most_profit_df['Item Price'].astype(float).map("${:.2f}".format)
most_profit_df['Total Purchase Value'] = most_profit_df['Total Purchase Value'].astype(float).map("${:.2f}".format)

# Display Preview
most_profit_df.head()
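
When only the top few rows are needed, `DataFrame.nlargest` is a possible shorthand for sorting in descending order and taking the head. The sketch below uses made-up items and checks that both routes give the same rows:

```python
import pandas as pd

# Hypothetical item totals.
items = pd.DataFrame(
    {"Purchase Count": [12, 9, 13], "Total Purchase Value": [50.76, 44.10, 59.99]},
    index=["Item A", "Item B", "Item C"],
)

# Two rows with the highest total purchase value.
top_two = items.nlargest(2, "Total Purchase Value")

# Equivalent sort-then-head form, as used in the cell above.
sorted_top_two = items.sort_values("Total Purchase Value", ascending=False).head(2)

print(top_two)
print(top_two.equals(sorted_top_two))
```

As with sorting, this has to run on the numeric column, before any currency formatting is applied.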