### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# Raw data file
file_to_load = "purchase_data.csv"

# Read purchasing file and store into pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [2]:

# solve for player count: keep one row per unique (Gender, SN, Age) combination
player_demographics = purchase_data.loc[:, ["Gender", "SN", "Age"]].drop_duplicates()
# count() returns one value per column; take the first by position
num_player = player_demographics.count().iloc[0]



In [9]:

# display total number of players
#num_player = pd.DataFrame({"Total Players": num_player})

num_player


576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [None]:

average_item_price = purchase_data["Price"].mean()
total_purchase_value = purchase_data["Price"].sum()
purchase_count = purchase_data["Price"].count()
item_count = purchase_data["Item ID"].nunique()

# display results as a one-row summary (a list holding one dict per row)
pd.DataFrame([{
    "Number of Unique Items": item_count,
    # ...
}])
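
As a minimal sketch of how the summary frame could be completed and formatted: the toy data below is made up, and every column label other than "Number of Unique Items" is an assumption rather than a name taken from the notebook.

```python
import pandas as pd

# Toy stand-in for purchase_data.csv (values are made up for illustration)
purchase_data = pd.DataFrame({
    "Item ID": [1, 2, 2, 3],
    "Price": [1.00, 2.50, 2.50, 4.00],
})

item_count = purchase_data["Item ID"].nunique()
average_item_price = purchase_data["Price"].mean()
purchase_count = purchase_data["Price"].count()
total_purchase_value = purchase_data["Price"].sum()

# One-row summary; labels other than "Number of Unique Items" are assumed
purchasing_summary = pd.DataFrame([{
    "Number of Unique Items": item_count,
    "Average Price": average_item_price,
    "Number of Purchases": purchase_count,
    "Total Revenue": total_purchase_value,
}])

# Cleaner display formatting for the currency columns only
for column in ["Average Price", "Total Revenue"]:
    purchasing_summary[column] = purchasing_summary[column].map("${:,.2f}".format)

purchasing_summary
```

Formatting converts the numbers to strings, so it is best left as the last step, after all calculations are done.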

## Gender Demographics

* Run basic calculations to obtain the count and percentage of players by gender


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [None]:
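# count players per gender (player_demographics holds one row per unique player)
gender_totals = player_demographics["Gender"].value_counts()
gender_percentage = gender_totals / num_player * 100

gender_demographics = pd.DataFrame({
    "Total Count": gender_totals,
    "Percentage of Players": gender_percentage
})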


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, etc. by gender


* For normalized purchasing, divide total purchase value by the number of unique players, by gender


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:
# select "Price" before aggregating so non-numeric columns are never summed or averaged
gender_purchase_total = purchase_data.groupby("Gender")["Price"].sum().rename("Total Purchase Value")
gender_average = purchase_data.groupby("Gender")["Price"].mean().rename("Average Purchase Price")
gender_counts = purchase_data.groupby("Gender")["Price"].count().rename("Purchase Count")

In [None]:
# calculate normalized purchasing: total purchase value divided by the per-gender "Total Count"
normalized_total = gender_purchase_total / gender_demographics["Total Count"]


gender_data = pd.DataFrame({
    "Purchase Count": gender_counts,
    "Average Purchase Price": gender_average,
    "Total Purchase Value": gender_purchase_total,
    "Normalized Total": normalized_total
})

In [None]:
# format currency columns as dollars; Purchase Count is a plain count, so it only gets a thousands separator
gender_data["Average Purchase Price"] = gender_data["Average Purchase Price"].map("${:,.2f}".format)
gender_data["Total Purchase Value"] = gender_data["Total Purchase Value"].map("${:,.2f}".format)
gender_data["Purchase Count"] = gender_data["Purchase Count"].map("{:,}".format)
gender_data["Normalized Total"] = gender_data["Normalized Total"].map("${:,.2f}".format)

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal places


* Display Age Demographics Table


In [None]:
# Establish bins for ages
age_bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]


In [None]:
# categorize existing players using the age bins
player_demographics["Age Ranges"] = pd.cut(player_demographics["Age"], age_bins, labels=group_names)


In [None]:

# calculate the numbers and percentage by age group
age_demographic_totals = player_demographics["Age Ranges"].value_counts()
age_demographic_percent = age_demographic_totals / num_player * 100

In [None]:

# display results
age_demographic = pd.DataFrame({
    "Total Count": age_demographic_totals, 
    "Percentage of Players": age_demographic_percent,
})
age_demographic = age_demographic.round(2)
age_demographic.sort_index()

## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, etc. in the table below


* Calculate Normalized Purchasing


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
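
The steps above could be sketched roughly as follows. The toy data is made up, the bin edges and labels are the ones from the Age Demographics section, and "normalized" is assumed here to mean total purchase value per unique player in each age group.

```python
import pandas as pd

# Toy stand-in for purchase_data.csv (values are made up for illustration)
purchase_data = pd.DataFrame({
    "SN": ["Ann", "Ann", "Bob", "Cy", "Dee"],
    "Age": [8, 8, 16, 22, 22],
    "Price": [1.00, 3.00, 2.00, 4.00, 6.00],
})

# Same bin edges and labels as the Age Demographics section
age_bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Bin every purchase row by the buyer's age (work on a copy to leave the raw data untouched)
binned = purchase_data.copy()
binned["Age Ranges"] = pd.cut(binned["Age"], age_bins, labels=group_names)

# observed=True keeps only the age groups that actually occur in the data
grouped = binned.groupby("Age Ranges", observed=True)

age_data = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Average Purchase Price": grouped["Price"].mean(),
    "Total Purchase Value": grouped["Price"].sum(),
})

# Assumed definition: total purchase value per unique player in the group
age_data["Normalized Total"] = age_data["Total Purchase Value"] / grouped["SN"].nunique()

age_data
```

Counting unique `SN` values rather than rows matters because one player can make several purchases; in the toy data, Ann's two purchases count as a single player in the "<10" group.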

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
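
One possible shape for this analysis, as a sketch on made-up data: group by the `SN` column, aggregate `Price`, and sort before formatting. The output column labels are assumptions modeled on the gender table.

```python
import pandas as pd

# Toy stand-in for purchase_data.csv (values are made up for illustration)
purchase_data = pd.DataFrame({
    "SN": ["Ann", "Ann", "Bob", "Cy", "Cy", "Cy"],
    "Price": [1.00, 3.00, 5.00, 1.00, 1.00, 1.50],
})

# Aggregate each player's purchases, then sort by total spend (still numeric here)
top_spenders = (
    purchase_data.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

# Format only a copy of the preview so the numeric frame stays usable
preview = top_spenders.head().copy()
for column in ["Average Purchase Price", "Total Purchase Value"]:
    preview[column] = preview[column].map("${:,.2f}".format)

preview
```

Sorting has to happen before the currency formatting: once the values are strings, "$10.00" would sort before "$9.00".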



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
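
A minimal sketch of these steps on made-up data. It assumes the price column is named `Price`, as in the earlier cells, and that each item sells at a single price, so the group mean recovers the item price.

```python
import pandas as pd

# Toy stand-in for purchase_data.csv (values are made up for illustration)
purchase_data = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Bow"],
    "Price": [2.00, 2.00, 2.00, 5.00, 5.00, 4.00],
})

# Retrieve only the columns needed for the item analysis
items = purchase_data[["Item ID", "Item Name", "Price"]]

# One row per item: purchase count, item price, and total purchase value
item_summary = (
    items.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

# Most popular items first
most_popular = item_summary.sort_values("Purchase Count", ascending=False)
most_popular.head()
```

Grouping by both `Item ID` and `Item Name` keeps the name visible in the index while still separating items that happen to share a name.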



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
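
As a sketch, the re-sort could look like the following. The frame below is a made-up per-item summary shaped like the table described under Most Popular Items; its column labels are assumptions.

```python
import pandas as pd

# Toy per-item summary (made-up values, assumed column labels)
item_summary = pd.DataFrame(
    {
        "Purchase Count": [3, 2, 1],
        "Item Price": [2.00, 5.00, 4.00],
        "Total Purchase Value": [6.00, 10.00, 4.00],
    },
    index=pd.MultiIndex.from_tuples(
        [(1, "Sword"), (2, "Shield"), (3, "Bow")],
        names=["Item ID", "Item Name"],
    ),
)

# Sort on the numeric column first; formatting turns the values into strings
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Format a copy of the preview for display
preview = most_profitable.head().copy()
for column in ["Item Price", "Total Purchase Value"]:
    preview[column] = preview[column].map("${:,.2f}".format)

preview
```

In this toy data the most popular item (Sword, three purchases) is not the most profitable one (Shield, higher price), which is the distinction this table is meant to surface.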

