### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* The peak age group is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [177]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# Raw data file
file_to_load = "Resources/purchase_data.csv"

# Read purchasing file and store into pandas data frame
purchase_data_df = pd.read_csv(file_to_load)

purchase_data_df.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [178]:
purchase_data_df.dtypes

Purchase ID      int64
SN              object
Age              int64
Gender          object
Item ID          int64
Item Name       object
Price          float64
dtype: object

## Player Count

* Display the total number of players


In [179]:
# Each row is one purchase, so count distinct screen names to get the number of players
total_players = purchase_data_df["SN"].nunique()
total_players

576
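Why distinct screen names rather than rows: a player who buys twice appears on two rows. A minimal sketch on a made-up four-row frame (the `toy_df` rows and names are illustrative assumptions, not taken from `purchase_data.csv`):

```python
import pandas as pd

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3"],
    "Price": [1.00, 2.00, 3.00, 4.00],
})

# Row count = purchases; distinct SN count = players
total_purchases = len(toy_df)
total_players = toy_df["SN"].nunique()

player_count_df = pd.DataFrame({"Total Players": [total_players]})
print(player_count_df)
```

Here `Ann1` bought twice, so the frame holds four purchases but only three players.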

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [180]:
# Count distinct item names, then compute the average price, purchase count, and total revenue
unique_items = purchase_data_df["Item Name"].nunique()
average_price = purchase_data_df["Price"].mean()
total_purchases_df = len(purchase_data_df)
total_revenue_df = purchase_data_df["Price"].sum()
total_revenue_df.round(2)


2379.77

In [181]:
Purchasing_summary = pd.DataFrame({"Total Unique Items": [unique_items],"Average Price": [average_price],
                              "Total Purchases": [total_purchases_df], "Total Revenue": [total_revenue_df]})
Purchasing_summary

Unnamed: 0,Total Unique Items,Average Price,Total Purchases,Total Revenue
0,179,3.050987,780,2379.77
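The "cleaner formatting" step can be sketched as follows. This is one possible approach, not the only one: it keeps the numeric summary untouched and formats a copy for display. The toy rows are made up, and counting unique items by `Item ID` instead of `Item Name` is an assumption made for the sketch (the two can differ if several IDs share a name).

```python
import pandas as pd

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "Item ID": [1, 2, 1, 3],
    "Price": [1.00, 2.00, 1.00, 4.00],
})

# Numeric summary: keep this version for any further calculations
summary_df = pd.DataFrame({
    "Total Unique Items": [toy_df["Item ID"].nunique()],
    "Average Price": [toy_df["Price"].mean()],
    "Total Purchases": [len(toy_df)],
    "Total Revenue": [toy_df["Price"].sum()],
})

# Display copy: currency columns become strings, so format last
formatted_df = summary_df.copy()
for col in ["Average Price", "Total Revenue"]:
    formatted_df[col] = formatted_df[col].map("${:,.2f}".format)

print(formatted_df)
```

Formatting a copy avoids accidentally doing arithmetic or sorting on the string columns later.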


## Gender Demographics

In [182]:
# Count distinct players (screen names) per gender
gender_count = purchase_data_df.groupby("Gender")["SN"].nunique()
gender_count

Gender
Female                    81
Male                     484
Other / Non-Disclosed     11
Name: SN, dtype: int64

In [183]:
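# Divide by the computed player count rather than a hard-coded 576
gender_percentage_df = (gender_count / purchase_data_df["SN"].nunique()) * 100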
gender_percentage_df.round()


Gender
Female                   14.0
Male                     84.0
Other / Non-Disclosed     2.0
Name: SN, dtype: float64

In [184]:
gender_summary_df = pd.DataFrame({"Gender of Players":gender_count, "Percent of Total Players":gender_percentage_df})
gender_summary_df.style.format({"Percent of Total Players": "{:.2f}%"})

Unnamed: 0_level_0,Gender of Players,Percent of Total Players
Gender,Unnamed: 1_level_1,Unnamed: 2_level_1
Female,81,14.06%
Male,484,84.03%
Other / Non-Disclosed,11,1.91%
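An alternative way to get the same kind of table is to drop repeat purchases first and then count. A minimal sketch, assuming each screen name maps to a single gender; the toy rows are made up:

```python
import pandas as pd

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4"],
    "Gender": ["Female", "Male", "Female", "Male", "Male"],
})

# One row per player, so repeat buyers are not double counted
players = toy_df.drop_duplicates(subset="SN")

counts = players["Gender"].value_counts()
percents = players["Gender"].value_counts(normalize=True) * 100

demo_df = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percents,
})
print(demo_df)
```

`value_counts(normalize=True)` returns fractions that sum to 1, so no player total needs to be typed in by hand.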


* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [185]:
# Sanity check: every row has its own Purchase ID, so this matches len(purchase_data_df)
total_purchases = purchase_data_df["Purchase ID"].nunique()
total_purchases

780


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [186]:
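# Number of distinct items purchased by gender.
# Note: nunique() counts unique item names; .count() would give the total number of purchases.
purchases_gen_df = purchase_data_df.groupby("Gender")["Item Name"].nunique()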

purchases_gen_df.head()

Gender
Female                    90
Male                     178
Other / Non-Disclosed     13
Name: Item Name, dtype: int64

In [187]:
#average price of item by gender
gender_average_df = purchase_data_df.groupby("Gender")["Price"].mean()
gender_average_df.round(2)

Gender
Female                   3.20
Male                     3.02
Other / Non-Disclosed    3.35
Name: Price, dtype: float64

In [188]:
#total purchase value by gender
gender_total_df = purchase_data_df.groupby("Gender")["Price"].sum()
gender_total_df

Gender
Female                    361.94
Male                     1967.64
Other / Non-Disclosed      50.19
Name: Price, dtype: float64

In [202]:
gender_df = pd.DataFrame({"Purchases": purchases_gen_df, "Average Price": gender_average_df, "Total Purchase Value":gender_total_df})
gender_df["Average Price"]=gender_df["Average Price"].map("${:.2f}".format)
gender_df["Total Purchase Value"]=gender_df["Total Purchase Value"].map("${:.2f}".format)
gender_df

Unnamed: 0_level_0,Purchases,Average Price,Total Purchase Value
Gender,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Female,90,$3.20,$361.94
Male,178,$3.02,$1967.64
Other / Non-Disclosed,13,$3.35,$50.19
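The steps for this section also mention an average purchase total per person, which the table above does not include. A hedged sketch of how it could be computed with named aggregation, on made-up rows; the output column names (`purchase_count`, `avg_total_per_person`, and so on) are illustrative choices:

```python
import pandas as pd

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3"],
    "Gender": ["Female", "Male", "Female", "Male"],
    "Price": [1.00, 2.00, 3.00, 4.00],
})

by_gender = toy_df.groupby("Gender").agg(
    purchase_count=("Price", "count"),   # rows, i.e. purchases
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),           # distinct buyers
)

# Per-person average divides by players, not by purchases
by_gender["avg_total_per_person"] = by_gender["total_value"] / by_gender["players"]
print(by_gender)
```

In the toy data the one female player made two purchases, so her per-person total (4.00) is twice her average purchase price (2.00).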


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [190]:
# Establish bins for ages
bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
age_ranges = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
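The cell above only defines the bins; the counts and percentages by age group listed in the steps could be sketched as below. This assumes players are deduplicated by `SN` first, so that percentages describe players rather than purchases. The toy rows are made up; `bins` and `age_ranges` are copied from the cell above.

```python
import pandas as pd

bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
age_ranges = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4"],
    "Age": [22, 17, 22, 24, 41],
})

# One row per player, then assign each player to an age bin
players = toy_df.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins, labels=age_ranges)

# sort_index() orders the groups by bin, not by frequency
age_counts = players["Age Group"].value_counts().sort_index()
age_percents = (age_counts / len(players) * 100).round(2)

age_demo_df = pd.DataFrame({
    "Total Count": age_counts,
    "Percentage of Players": age_percents,
})
print(age_demo_df)
```

Because `pd.cut` returns a categorical column, empty age groups still appear in the table with a count of 0.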


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [191]:
purchase_data_df["Players Age"] = pd.cut(purchase_data_df["Age"], bins, labels=age_ranges)
purchase_data_df.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price,Players Age
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,20-24
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56,40+
2,2,Ithergue48,24,Male,92,Final Critic,4.88,20-24
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27,20-24
4,4,Iskosia90,23,Male,131,Fury,1.44,20-24


In [192]:
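# Number of distinct items purchased per age group.
# Note: nunique() counts unique item names; .count() would give the total number of purchases.
Players_age_count = purchase_data_df.groupby("Players Age")["Item Name"].nunique()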


In [193]:
average_purchase_age_df = purchase_data_df.groupby("Players Age")["Price"].mean()
average_purchase_age_df.round(2)

Players Age
<10      3.35
10-14    2.96
15-19    3.04
20-24    3.05
25-29    2.90
30-34    2.93
35-39    3.60
40+      2.94
Name: Price, dtype: float64

In [203]:
Age_Ranges_df = pd.DataFrame({"Purchase Count":Players_age_count,
                              "Average Purchase Price":average_purchase_age_df})
Age_Ranges_df["Average Purchase Price"]=Age_Ranges_df["Average Purchase Price"].map("${:.2f}".format)
Age_Ranges_df

Unnamed: 0_level_0,Purchase Count,Average Purchase Price
Players Age,Unnamed: 1_level_1,Unnamed: 2_level_1
<10,21,$3.35
10-14,24,$2.96
15-19,96,$3.04
20-24,163,$3.05
25-29,78,$2.90
30-34,60,$2.93
35-39,37,$3.60
40+,13,$2.94
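To illustrate the difference between counting purchases and counting distinct items per age group, here is a small sketch on made-up rows. `observed=True` is passed so that only age groups present in the toy data are returned; the output column names are illustrative.

```python
import pandas as pd

bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
age_ranges = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3"],
    "Age": [22, 17, 22, 24],
    "Item Name": ["Fury", "Wolf", "Fury", "Wolf"],
    "Price": [1.00, 2.00, 1.00, 2.00],
})
toy_df["Players Age"] = pd.cut(toy_df["Age"], bins, labels=age_ranges)

by_age = toy_df.groupby("Players Age", observed=True).agg(
    purchase_count=("Item Name", "count"),   # every purchase row
    unique_items=("Item Name", "nunique"),   # distinct item names only
    total_value=("Price", "sum"),
)
print(by_age)
```

In the 20-24 group there are three purchases but only two distinct items, because `Fury` was bought twice.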


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [195]:
top_spenders = purchase_data_df.groupby("SN")
purchase_count = top_spenders["Age"].count()
avg_purchase = top_spenders["Price"].mean()
total_purchase_value = top_spenders["Price"].sum()

top_spenders_df = pd.DataFrame({"Purchase Count": purchase_count, 
                               "Average Purchase Price": avg_purchase,
                               "Total Value": total_purchase_value})
top_spenders_df = top_spenders_df.sort_values("Total Value", ascending = False)
top_spenders_df["Average Purchase Price"] = top_spenders_df["Average Purchase Price"].map("${:.2f}".format)
top_spenders_df["Total Value"] = top_spenders_df["Total Value"].map("${:.2f}".format)

top_spenders_df

Unnamed: 0_level_0,Purchase Count,Average Purchase Price,Total Value
SN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Lisosia93,5,$3.79,$18.96
Idastidru52,4,$3.86,$15.45
Chamjask73,3,$4.61,$13.83
Iral74,4,$3.40,$13.62
Iskadarya95,3,$4.37,$13.10
Ilarin91,3,$4.23,$12.70
Ialallo29,3,$3.95,$11.84
Tyidaim51,3,$3.94,$11.83
Lassilsala30,3,$3.84,$11.51
Chadolyla44,3,$3.82,$11.46
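A compact variant of the same calculation, shown on made-up rows: `DataFrame.nlargest` picks the top rows by a numeric column in one call, which is one way to get a preview without sorting the whole frame. The output column names are illustrative.

```python
import pandas as pd

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3"],
    "Price": [1.00, 2.00, 3.00, 5.00],
})

spenders = toy_df.groupby("SN").agg(
    purchase_count=("Price", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
)

# Top 2 spenders by total value, highest first
top_two = spenders.nlargest(2, "total_value")
print(top_two)
```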


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [196]:
items_group = purchase_data_df.groupby(["Item ID", "Item Name"]) 
items_purch_count = items_group["Price"].count()
purchase_value = (items_group["Price"].sum())
# Per-item price = total value / number of purchases (fixes the misspelled variable name)
item_price = purchase_value / items_purch_count
most_popular_items_df = pd.DataFrame({"Purchase Count": items_purch_count,
                                   "Price": item_price,
                                   "Total Purchase Value": purchase_value})

most_popular_items_df["Total Purchase Value"] =most_popular_items_df["Total Purchase Value"].map("${:.2f}".format)
most_popular_items_df["Price"] =most_popular_items_df["Price"].map("${:.2f}".format)

most_popular_items_df.sort_values("Purchase Count", ascending = False)






Unnamed: 0_level_0,Unnamed: 1_level_0,Purchase Count,Price,Total Purchase Value
Item ID,Item Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
145,Fiery Glass Crusader,9,$4.58,$41.22
108,"Extraction, Quickblade Of Trembling Hands",9,$3.53,$31.77
82,Nirvana,9,$4.90,$44.10
19,"Pursuit, Cudgel of Necromancy",8,$1.02,$8.16
103,Singed Scalpel,8,$4.35,$34.80
75,Brutality Ivory Warmace,8,$2.42,$19.36
72,Winter's Bite,8,$3.77,$30.16
60,Wolf,8,$3.54,$28.32
59,"Lightning, Etcher of the King",8,$4.23,$33.84


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [197]:
profitable_items_df = pd.DataFrame({"Purchase Count": items_purch_count,
                                   "Price": item_price,
                                   "Total Purchase Value": purchase_value})
profitable_items_df = profitable_items_df.sort_values("Total Purchase Value", ascending = False)
profitable_items_df["Total Purchase Value"] =profitable_items_df["Total Purchase Value"].map("${:.2f}".format)
profitable_items_df["Price"] =profitable_items_df["Price"].map("${:.2f}".format)

profitable_items_df.iloc[0:5,:]

Unnamed: 0_level_0,Unnamed: 1_level_0,Purchase Count,Price,Total Purchase Value
Item ID,Item Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
82,Nirvana,9,$4.90,$44.10
145,Fiery Glass Crusader,9,$4.58,$41.22
92,Final Critic,8,$4.88,$39.04
103,Singed Scalpel,8,$4.35,$34.80
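The order of operations matters here: the table has to be sorted while "Total Purchase Value" is still numeric, because currency strings sort character by character. A minimal sketch on made-up rows (output column names are illustrative):

```python
import pandas as pd

# Toy purchase log (made-up rows, not the real data set)
toy_df = pd.DataFrame({
    "Item ID": [1, 2, 1, 3, 2, 2],
    "Item Name": ["Fury", "Wolf", "Fury", "Nirvana", "Wolf", "Wolf"],
    "Price": [1.50, 4.00, 1.50, 9.00, 4.00, 4.00],
})

items = toy_df.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    price=("Price", "mean"),
    total_value=("Price", "sum"),
)

# Sort on the numeric column first, then format a copy for display
top_items = items.sort_values("total_value", ascending=False).head(2)
display_df = top_items.copy()
display_df["total_value"] = display_df["total_value"].map("${:.2f}".format)
print(display_df)

# Sorting the formatted strings instead would rank "$9.00" above "$12.00"
wrong_first = sorted(display_df["total_value"], reverse=True)[0]
```

With numeric sorting, `Wolf` (12.00 in total) correctly ranks ahead of `Nirvana` (9.00).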
