### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [34]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [35]:
# Count distinct screen names so that repeat buyers are counted once
total_players = purchase_data["SN"].nunique()

player_count = pd.DataFrame({"Total Players": [total_players]})
player_count




| | Total Players |
|---|---|
| 0 | 576 |


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [36]:
Number_of_Unique_Items = purchase_data["Item Name"].nunique()

# Format the average price as a currency string for display
Average_Price = purchase_data["Price"].mean()
Average_Price = "${:.2f}".format(Average_Price)

Number_of_Purchases = purchase_data["Item Name"].count()

Total_Revenue = purchase_data["Price"].sum()

items_dicts = [{
    "Number of Unique Items": Number_of_Unique_Items,
    "Average Price": Average_Price,
    "Number of Purchases": Number_of_Purchases,
    "Total Revenue": Total_Revenue,
}]
items_count = pd.DataFrame(items_dicts)
items_count



| | Average Price | Number of Purchases | Number of Unique Items | Total Revenue |
|---|---|---|---|---|
| 0 | $3.05 | 780 | 179 | 2379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [37]:
# Note: these counts are per purchase row (they sum to the 780 purchases),
# not per unique player; a player with several purchases is counted several times
Total_count = purchase_data["Gender"].count()
Gender_count = purchase_data["Gender"].value_counts()
Percentage_of_Players = Gender_count / Total_count
Percentage_of_Players = Percentage_of_Players.map(lambda n: '{:,.2%}'.format(n))

gender_dicts = {"Total Count": Gender_count, "Percentage of Players": Percentage_of_Players}
gender_demographics = pd.DataFrame(gender_dicts)
gender_demographics

| | Total Count | Percentage of Players |
|---|---|---|
| Male | 652 | 83.59% |
| Female | 113 | 14.49% |
| Other / Non-Disclosed | 15 | 1.92% |
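
The counts above are taken over purchase rows, so they add up to the number of purchases rather than the number of players. A minimal sketch of a player-level count follows. It assumes a small made-up frame, `toy_purchases`, in place of `purchase_data` (the screen names and rows are illustrative, not taken from the real file), and it keeps one row per `SN` before counting.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data: one row per purchase
toy_purchases = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Bob2", "Dee4"],
    "Gender": ["Female", "Female", "Male", "Male", "Male", "Other / Non-Disclosed"],
})

# Keep one row per screen name so each player is counted once
players = toy_purchases.drop_duplicates(subset="SN")

gender_counts = players["Gender"].value_counts()
gender_share = gender_counts / len(players)

toy_gender_demographics = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_share.map("{:.2%}".format),
})
print(toy_gender_demographics)
```

On the real data the same two steps would be applied to `purchase_data`; whether the resulting shares differ much from the row-level ones depends on how evenly repeat purchases are spread across genders.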



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [38]:
purchase_dicts = {"Gender":purchase_data["Gender"],"Price":purchase_data["Price"]}
purchase_analysis = pd.DataFrame(purchase_dicts)
grouped_purchase_analysis = purchase_analysis.groupby('Gender')['Price'].agg(['count', 'mean', 'sum'])
grouped_purchase_analysis



| Gender | count | mean | sum |
|---|---|---|---|
| Female | 113 | 3.203009 | 361.94 |
| Male | 652 | 3.017853 | 1967.64 |
| Other / Non-Disclosed | 15 | 3.346000 | 50.19 |
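
The instructions also ask for the average purchase total per person, which the table above does not include. One way to get it, sketched here on made-up data, is to divide each group's total by its number of distinct screen names. The frame `toy_purchases` and the output column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data: one row per purchase
toy_purchases = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Bob2", "Dee4"],
    "Gender": ["Female", "Female", "Male", "Male", "Male", "Other / Non-Disclosed"],
    "Price": [1.00, 3.00, 2.00, 4.00, 3.00, 5.00],
})

# Named aggregation: each output column is (source column, aggregation)
by_gender = toy_purchases.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Total spent by the group divided by the number of distinct players in it
by_gender["avg_total_per_person"] = by_gender["total_value"] / by_gender["players"]
print(by_gender)
```

The per-person figure is larger than the average price whenever players in a group buy more than one item, which is why the two columns are worth showing side by side.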


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [39]:
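# Note: pd.cut uses right-closed intervals by default, so these edges put age 10 in "<10",
# age 15 in "10-14", and so on; the edges in the Purchasing Analysis (Age) cell
# ([0, 9, 14, ...]) line up with the labels. Counts here are per purchase row, not per player.
bins = [0, 10, 15, 20, 25, 30, 35, 40, 50]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
Age_count = purchase_data["Age"].count()
age_demographics = pd.DataFrame()

age_demographics["Total Count"] = pd.cut(purchase_data["Age"], bins, labels=labels).value_counts()
age_demographics["Percentage_of_Players"] = age_demographics["Total Count"] / Age_count
age_demographics["Percentage_of_Players"] = age_demographics["Percentage_of_Players"].map(lambda n: '{:,.2%}'.format(n))

age_demographics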


| | Total Count | Percentage_of_Players |
|---|---|---|
| 20-24 | 325 | 41.67% |
| 15-19 | 200 | 25.64% |
| 25-29 | 77 | 9.87% |
| 10-14 | 54 | 6.92% |
| 30-34 | 52 | 6.67% |
| 35-39 | 33 | 4.23% |
| <10 | 32 | 4.10% |
| 40+ | 7 | 0.90% |
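
How `pd.cut` treats the bin edges decides which group a boundary age lands in. The sketch below uses a handful of made-up ages (an assumption, not the real data) to compare the default right-closed intervals with `right=False` for the edges `[0, 10, 15, ...]`.

```python
import pandas as pd

# Hypothetical ages chosen to sit on or next to the bin edges
ages = pd.Series([9, 10, 14, 15, 20, 24, 25, 40, 45])
edges = [0, 10, 15, 20, 25, 30, 35, 40, 50]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Default: intervals are closed on the right, i.e. (0, 10], (10, 15], ...
right_closed = pd.cut(ages, edges, labels=labels)

# right=False: intervals are closed on the left, i.e. [0, 10), [10, 15), ...
left_closed = pd.cut(ages, edges, labels=labels, right=False)

comparison = pd.DataFrame({
    "age": ages,
    "right=True": right_closed,
    "right=False": left_closed,
})
print(comparison)
```

With these edges only the `right=False` column agrees with the labels (age 10 in "10-14", age 40 in "40+"). Edges such as `[0, 9, 14, ...]` achieve the same grouping for whole-number ages under the default setting.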


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [40]:
bins = [0, 9, 14, 19, 24, 29, 34, 39, 45]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

purchase_dicts = {"Age": purchase_data["Age"], "SN": purchase_data["SN"], "Price": purchase_data["Price"]}
purchase_analysis = pd.DataFrame(purchase_dicts)
purchase_analysis["Age Groups"] = pd.cut(purchase_analysis["Age"], bins, labels=labels)
grouped_purchase_analysis = purchase_analysis.groupby('Age Groups')['Price'].agg(['count', 'mean', 'sum'])
grouped_purchase_analysis["mean"] = grouped_purchase_analysis["mean"].map(lambda n: '{:,.2f}'.format(n))

grouped_purchase_analysis

| Age Groups | count | mean | sum |
|---|---|---|---|
| <10 | 23 | 3.35 | 77.13 |
| 10-14 | 28 | 2.96 | 82.78 |
| 15-19 | 136 | 3.04 | 412.89 |
| 20-24 | 365 | 3.05 | 1114.06 |
| 25-29 | 101 | 2.90 | 293.00 |
| 30-34 | 73 | 2.93 | 214.00 |
| 35-39 | 41 | 3.60 | 147.67 |
| 40+ | 13 | 2.94 | 38.24 |


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [41]:
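top_spenders_dicts = {"SN": purchase_data["SN"], "Price": purchase_data["Price"]}
top_spenders_analysis = pd.DataFrame(top_spenders_dicts)

# Group the per-purchase frame built above (items_count is a one-row summary with no "SN" column)
grouped_top_spenders = top_spenders_analysis.groupby('SN')['Price'].agg(['count', 'mean', 'sum'])
# Sort on the numeric total before formatting any column for display
grouped_top_spenders = grouped_top_spenders.sort_values("sum", ascending=False)
grouped_top_spenders["mean"] = grouped_top_spenders["mean"].map("${:,.2f}".format)
grouped_top_spenders.head(10)
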
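
The order of the "sort" and "cleaner formatting" steps matters, because a column formatted as currency holds strings, and strings sort character by character. A small sketch with made-up totals (the names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical per-player totals
totals = pd.Series(
    {"Ann1": 9.50, "Bob2": 10.25, "Cat3": 100.00},
    name="Total Purchase Value",
)

# Formatting first produces strings, which then sort lexicographically
formatted_then_sorted = totals.map("${:,.2f}".format).sort_values(ascending=False)

# Sorting first keeps the numeric order; formatting is applied only for display
sorted_then_formatted = totals.sort_values(ascending=False).map("${:,.2f}".format)

print(formatted_then_sorted)
print(sorted_then_formatted)
```

In the first result "$9.50" ranks above "$100.00" because the character "9" sorts after "1"; the second result keeps the largest spender on top.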

## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [42]:
# Keep Item ID alongside the name and price so that both grouping keys exist
most_popular_items_analysis = purchase_data[["Item ID", "Item Name", "Price"]]

# Group on both keys by passing them as a list
grouped_most_popular_items = most_popular_items_analysis.groupby(["Item ID", "Item Name"])["Price"].agg(['count', 'mean', 'sum'])
# Rank by purchase count before formatting the price for display
grouped_most_popular_items = grouped_most_popular_items.sort_values("count", ascending=False)
grouped_most_popular_items["mean"] = grouped_most_popular_items["mean"].map("${:,.2f}".format)

grouped_most_popular_items.head(10)

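
Grouping on two columns takes a list of keys and yields one row per (Item ID, Item Name) pair. The sketch below shows the shape of the result on a tiny made-up frame; the item names and prices are assumptions, not values from the real file.

```python
import pandas as pd

# Hypothetical purchases: one row per item bought
toy_items = pd.DataFrame({
    "Item ID": [1, 1, 2, 3, 3, 3],
    "Item Name": ["Sword", "Sword", "Shield", "Bow", "Bow", "Bow"],
    "Price": [2.50, 2.50, 4.00, 1.00, 1.00, 1.00],
})

# count = purchase count, mean = item price, sum = total purchase value
item_summary = (
    toy_items.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .sort_values("count", ascending=False)
)
print(item_summary)
```

The result is indexed by the key pair, so a single item is looked up with a tuple, for example `item_summary.loc[(1, "Sword")]`.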

## Most Profitable Items

In [1]:
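# purchase_data must be loaded first (rerun the setup cell after a kernel restart)
item_totals = purchase_data.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "mean", "sum"])
item_totals.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

# Sort on the numeric totals before converting anything to display strings
profitable_items = item_totals.sort_values("Total Purchase Value", ascending=False)

profitable_items["Item Price"] = profitable_items["Item Price"].map("${:,.2f}".format)
profitable_items["Purchase Count"] = profitable_items["Purchase Count"].map("{:,}".format)
profitable_items["Total Purchase Value"] = profitable_items["Total Purchase Value"].map("${:,.2f}".format)
profitable_items = profitable_items.loc[:, ["Purchase Count", "Item Price", "Total Purchase Value"]]

profitable_items.head(10)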