### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), followed by the 15-19 (18.6%) and 25-29 (13.4%) groups.
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# Raw data file
file_to_load = "purchase_data.csv"

# Read purchasing file and store into pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [2]:

# Solve for player count: keep one row per unique (Gender, SN, Age) combination
player_demographics = purchase_data.loc[:, ["Gender", "SN", "Age"]].drop_duplicates()
# .iloc[0] selects the first column's non-null count by position
num_player = player_demographics.count().iloc[0]



In [3]:

# display total number of players
#num_player = pd.DataFrame({"Total Players": num_player})

num_player


576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [4]:

average_item_price = purchase_data["Price"].mean()
total_purchase_value = purchase_data["Price"].sum()
purchase_count = purchase_data["Price"].count()
item_count = len(purchase_data["Item ID"].unique())


print("Number of Unique Items:", item_count)
print("Average Purchase Price: %.2f" % average_item_price)
print("Total Number of Purchases:", purchase_count)
print("Total Revenue: %.2f" % total_purchase_value)

Number of Unique Items: 183
Average Purchase Price: 3.05
Total Number of Purchases: 780
Total Revenue: 2379.77
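
The instructions for this section ask for a summary data frame rather than printed lines. A minimal sketch of one way to build it is shown here, assuming a tiny made-up `sample` frame in place of `purchase_data.csv`; the column labels and the `$` formatting are illustrative choices, not part of the original notebook.

```python
import pandas as pd

# Tiny made-up sample standing in for purchase_data.csv (illustrative values only)
sample = pd.DataFrame({
    "SN": ["A1", "B2", "A1", "C3"],
    "Item ID": [10, 11, 10, 12],
    "Price": [1.50, 2.50, 1.50, 4.50],
})

# One-row summary frame holding the four headline figures
summary = pd.DataFrame({
    "Number of Unique Items": [sample["Item ID"].nunique()],
    "Average Purchase Price": [sample["Price"].mean()],
    "Total Number of Purchases": [len(sample)],
    "Total Revenue": [sample["Price"].sum()],
})

# Format currency columns as strings on a copy, so the numeric frame stays usable
formatted = summary.copy()
for col in ["Average Purchase Price", "Total Revenue"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)

print(formatted)
```

Formatting on a copy keeps `summary` numeric, which matters if later cells need to do arithmetic on the same values.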


## Gender Demographics

* Run basic calculations to obtain the count and percentage of players by gender


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
purchase_data.head()

| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 |


In [6]:
# Count unique players overall, then per gender
total_n = purchase_data["SN"].unique().shape[0]

for gender, data in purchase_data.groupby("Gender"):
    n = data["SN"].unique().shape[0]
    print("{}: {} {:.1f}%".format(gender, n, n * 100 / total_n))
    

Female: 81 14.1%
Male: 484 84.0%
Other / Non-Disclosed: 11 1.9%
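
The same counts and percentages can be collected into a summary data frame, as the instructions suggest. This is a sketch under stated assumptions: `sample` is a made-up frame, and the column labels `Total Count` and `Percentage of Players` are illustrative names rather than ones taken from the notebook.

```python
import pandas as pd

# Made-up purchases; player A1 appears twice
sample = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4"],
    "Gender": ["Male", "Male", "Female", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are counted once
players = sample.drop_duplicates(subset="SN")
counts = players["Gender"].value_counts()

gender_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})

print(gender_summary)
```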



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, etc. by gender


* For normalized purchasing, divide total purchase value by purchase count, by gender


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [7]:
for gender,data in purchase_data.groupby("Gender"):
    average_item_price = data["Price"].mean()
    total_purchase_value = data["Price"].sum()
    purchase_count = data["Price"].count()
    item_count = len(data["Item ID"].unique())

    print(gender)
    print("Number of Unique Items:", item_count)
    print("Average Purchase Price: %.2f" % average_item_price)
    print("Total Number of Purchases:", purchase_count)
    print("Total Revenue: %.2f" % total_purchase_value)


Female
Number of Unique Items: 90
Average Purchase Price: 3.20
Total Number of Purchases: 113
Total Revenue: 361.94
Male
Number of Unique Items: 182
Average Purchase Price: 3.02
Total Number of Purchases: 652
Total Revenue: 1967.64
Other / Non-Disclosed
Number of Unique Items: 13
Average Purchase Price: 3.35
Total Number of Purchases: 15
Total Revenue: 50.19
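
The per-gender loop can also be written as a single `groupby(...).agg(...)` call that returns a summary data frame. The sketch below uses made-up data and hypothetical column names. It computes the "normalized" figure as the instructions describe it (total value divided by purchase count, which equals the mean price), and adds spend per unique player as an alternative reading; that second column is an assumption, not something the notebook defines.

```python
import pandas as pd

# Made-up purchases for illustration only
sample = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "C3"],
    "Gender": ["Male", "Male", "Female", "Male", "Male"],
    "Price": [1.0, 3.0, 2.0, 4.0, 2.0],
})

# Named aggregation: each keyword becomes an output column
by_gender = sample.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    avg_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Normalized as stated in the instructions: total value / purchase count
by_gender["normalized_total"] = (
    by_gender["total_purchase_value"] / by_gender["purchase_count"]
)
# Alternative (assumed) reading: total value per unique player
by_gender["total_per_player"] = (
    by_gender["total_purchase_value"] / by_gender["players"]
)

print(by_gender.round(2))
```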


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [8]:
# Establish bins for ages
age_bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]


In [9]:
# categorize existing players using the age bins
purchase_data["Age Ranges"] = pd.cut(purchase_data["Age"], age_bins, labels=group_names)
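
The fractional edges such as `9.90` work because ages are whole numbers, but the intent is easier to read with integer edges and `right=False`, which makes every bin closed on the left and open on the right. A small sketch, assuming made-up ages and an arbitrary upper bound of 200:

```python
import pandas as pd

# Made-up ages chosen to sit on the bin boundaries
ages = pd.Series([7, 10, 14, 15, 24, 25, 39, 40, 45])

# With right=False each bin is [left, right), so age 10 lands in "10-14"
edges = [0, 10, 15, 20, 25, 30, 35, 40, 200]  # 200 is an arbitrary upper bound
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

groups = pd.cut(ages, bins=edges, labels=labels, right=False)
print(groups.tolist())
```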


In [10]:
purchase_data.head()

| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price | Age Ranges |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 | 20-24 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 | 40+ |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 | 20-24 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 | 20-24 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 | 20-24 |
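
The instructions also ask for player counts and percentages per age group. One possible way to get them is to de-duplicate by player before binning, so that repeat buyers are not counted twice. The data, the coarser three-bin split, and the column labels below are all hypothetical.

```python
import pandas as pd

# Made-up purchases; player A1 bought twice
sample = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4"],
    "Age": [22, 22, 17, 23, 41],
})

bins = [0, 20, 25, 200]           # hypothetical, coarser than the notebook's bins
labels = ["<20", "20-24", "25+"]

# One row per player, then assign each player to an age group
players = sample.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins=bins, labels=labels, right=False)

# sort=False keeps the groups in bin order rather than by frequency
counts = players["Age Group"].value_counts(sort=False)
age_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / len(players) * 100).round(2),
})

print(age_summary)
```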


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, etc. in the table below


* Calculate Normalized Purchasing


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [13]:
for age,data in purchase_data.groupby("Age Ranges"):
    average_item_price = data["Price"].mean()
    total_purchase_value = data["Price"].sum()
    purchase_count = data["Price"].count()
    item_count = len(data["Item ID"].unique())

    print(age)
    print("Number of Unique Items:", item_count)
    print("Average Purchase Price: %.2f" % average_item_price)
    print("Total Number of Purchases:", purchase_count)
    print("Total Revenue: %.2f" % total_purchase_value)

<10
Number of Unique Items: 22
Average Purchase Price: 3.35
Total Number of Purchases: 23
Total Revenue: 77.13
10-14
Number of Unique Items: 24
Average Purchase Price: 2.96
Total Number of Purchases: 28
Total Revenue: 82.78
15-19
Number of Unique Items: 96
Average Purchase Price: 3.04
Total Number of Purchases: 136
Total Revenue: 412.89
20-24
Number of Unique Items: 166
Average Purchase Price: 3.05
Total Number of Purchases: 365
Total Revenue: 1114.06
25-29
Number of Unique Items: 78
Average Purchase Price: 2.90
Total Number of Purchases: 101
Total Revenue: 293.00
30-34
Number of Unique Items: 60
Average Purchase Price: 2.93
Total Number of Purchases: 73
Total Revenue: 214.00
35-39
Number of Unique Items: 37
Average Purchase Price: 3.60
Total Number of Purchases: 41
Total Revenue: 147.67
40+
Number of Unique Items: 13
Average Purchase Price: 2.94
Total Number of Purchases: 13
Total Revenue: 38.24
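
Because `Age Ranges` is a categorical column, `groupby` can either keep or drop bins that contain no purchases, and depending on the pandas version it may warn when the `observed` argument is left unset. The sketch below passes `observed=False` explicitly so every labelled bin appears in the summary; the data and the three-bin split are made up for illustration.

```python
import pandas as pd

# Made-up purchases; no buyer falls in the "25+" group
sample = pd.DataFrame({
    "Age": [12, 18, 22, 23],
    "Price": [1.0, 2.0, 3.0, 5.0],
})
sample["Age Ranges"] = pd.cut(
    sample["Age"], bins=[0, 20, 25, 200], labels=["<20", "20-24", "25+"], right=False
)

# observed=False keeps every labelled bin, even ones with no purchases
by_age = sample.groupby("Age Ranges", observed=False)["Price"].agg(
    ["count", "mean", "sum"]
)

print(by_age.round(2))
```

Empty bins show a count of 0 and a missing mean, which makes gaps in the data visible instead of silently dropping the row.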


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [15]:
purchase_data.head()

| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price | Age Ranges |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 | 20-24 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 | 40+ |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 | 20-24 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 | 20-24 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 | 20-24 |


In [16]:
price_summary = purchase_data.groupby("SN")["Price"].sum()
price_summary.sort_values(ascending=False).head()

SN
Lisosia93      18.96
Idastidru52    15.45
Chamjask73     13.83
Iral74         13.62
Iskadarya95    13.10
Name: Price, dtype: float64
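
The output above lists only each player's total spend. To get the fuller summary the instructions describe (purchase count, average price, and total value per player), one option is to aggregate all three at once and then sort. This is a sketch on made-up data; the column labels are illustrative.

```python
import pandas as pd

# Made-up purchases for three hypothetical players
sample = pd.DataFrame({
    "SN": ["A1", "B2", "A1", "C3", "B2", "A1"],
    "Price": [1.0, 4.0, 2.0, 3.5, 4.0, 3.0],
})

top_spenders = (
    sample.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    # Highest total spend first
    .sort_values("Total Purchase Value", ascending=False)
)

print(top_spenders.head())
```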

## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [17]:
most_popular = purchase_data.groupby(["Item ID","Item Name"])["Price"]
item_count = most_popular.count()
item_count.sort_values(ascending=False).head()

Item ID  Item Name                                   
178      Oathbreaker, Last Hope of the Breaking Storm    12
82       Nirvana                                          9
108      Extraction, Quickblade Of Trembling Hands        9
145      Fiery Glass Crusader                             9
92       Final Critic                                     8
Name: Price, dtype: int64

In [18]:
item_price = most_popular.mean()
item_price.sort_values(ascending=False).head()

Item ID  Item Name                        
63       Stormfury Mace                       4.99
139      Mercy, Katana of Dismay              4.94
147      Hellreaver, Heirloom of Inception    4.93
173      Stormfury Longsword                  4.93
128      Blazeguard, Reach of Eternity        4.91
Name: Price, dtype: float64

In [19]:
total_purchase_value = most_popular.sum()
total_purchase_value.sort_values(ascending=False).head()

Item ID  Item Name                                   
178      Oathbreaker, Last Hope of the Breaking Storm    50.76
82       Nirvana                                         44.10
145      Fiery Glass Crusader                            41.22
92       Final Critic                                    39.04
103      Singed Scalpel                                  34.80
Name: Price, dtype: float64

In [20]:
item_df = pd.concat([item_count,item_price,total_purchase_value],axis=1)
item_df.columns = ["Count","Price", "Total Purchase"]
item_df.sort_values("Count",ascending=False).head()


| Item ID | Item Name | Count | Price | Total Purchase |
|---|---|---|---|---|
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | 4.23 | 50.76 |
| 145 | Fiery Glass Crusader | 9 | 4.58 | 41.22 |
| 108 | Extraction, Quickblade Of Trembling Hands | 9 | 3.53 | 31.77 |
| 82 | Nirvana | 9 | 4.90 | 44.10 |
| 19 | Pursuit, Cudgel of Necromancy | 8 | 1.02 | 8.16 |


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [21]:
item_df.sort_values("Total Purchase",ascending=False).head()

| Item ID | Item Name | Count | Price | Total Purchase |
|---|---|---|---|---|
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | 4.23 | 50.76 |
| 82 | Nirvana | 9 | 4.90 | 44.10 |
| 145 | Fiery Glass Crusader | 9 | 4.58 | 41.22 |
| 92 | Final Critic | 8 | 4.88 | 39.04 |
| 103 | Singed Scalpel | 8 | 4.35 | 34.80 |
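
The item summary built from three separate aggregations and `pd.concat` can also be produced in one named-aggregation call, after which the "most popular" and "most profitable" views are just two different sorts of the same frame. A sketch with made-up items and prices, reusing the `Count`, `Price`, and `Total Purchase` labels from the notebook:

```python
import pandas as pd

# Made-up items: a cheap item bought often, an expensive item bought once
sample = pd.DataFrame({
    "Item ID": [1, 1, 2, 3, 3, 3],
    "Item Name": ["Sword", "Sword", "Axe", "Bow", "Bow", "Bow"],
    "Price": [4.0, 4.0, 9.0, 1.0, 1.0, 1.0],
})

# Dict unpacking allows an output column name that contains a space
item_summary = sample.groupby(["Item ID", "Item Name"]).agg(**{
    "Count": ("Price", "count"),
    "Price": ("Price", "mean"),
    "Total Purchase": ("Price", "sum"),
})

most_popular = item_summary.sort_values("Count", ascending=False)
most_profitable = item_summary.sort_values("Total Purchase", ascending=False)

print(most_popular.head())
print(most_profitable.head())
```

In this toy data the most purchased item and the highest grossing item differ, which is why the two rankings are worth reporting separately.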
