### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head(10) 

,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44
5,5,Yalae81,22,Male,81,Dreamkiss,3.61
6,6,Itheria73,36,Male,169,"Interrogator, Blood Blade of the Queen",2.18
7,7,Iskjaskst81,20,Male,162,Abyssal Shard,2.67
8,8,Undjask33,22,Male,21,Souleater,1.1
9,9,Chanosian48,35,Other / Non-Disclosed,136,Ghastly Adamantite Protector,3.58


## Player Count

* Display the total number of players


In [44]:
total_players = purchase_data["SN"].nunique()
total_players

576

## Purchasing Analysis (Total)

In [3]:
unique_items = purchase_data["Item ID"].nunique()
unique_items

183

In [4]:
avg_price = purchase_data["Price"].mean()
avg_price

3.050987179487176

In [5]:
total_purchase = purchase_data["Purchase ID"].nunique()
total_purchase

780

In [47]:
total_revenue = purchase_data["Price"].sum()
total_revenue

2379.77
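
The four totals above can also be collected into a single one-row summary frame. The following is a minimal sketch rather than the notebook's own method: the function name `purchasing_summary` and the column labels are illustrative, and the sample rows are copied from the `head(10)` preview, so the figures differ from the full-data results.

```python
import pandas as pd

# Five rows copied from the head(10) preview; the full CSV is not assumed here.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4],
    "Item ID": [108, 143, 92, 100, 131],
    "Price": [3.53, 1.56, 4.88, 3.27, 1.44],
})


def purchasing_summary(df):
    """Return a one-row frame with the overall purchasing totals (hypothetical helper)."""
    return pd.DataFrame({
        "Number of Unique Items": [df["Item ID"].nunique()],
        "Average Price": [df["Price"].mean()],
        "Number of Purchases": [df["Purchase ID"].nunique()],
        "Total Revenue": [df["Price"].sum()],
    })


summary = purchasing_summary(sample)
print(summary.round(2))
```

Calling `purchasing_summary(purchase_data)` on the full frame should reproduce the individual values shown above in one table.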

## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [7]:
male_df = purchase_data.loc[purchase_data["Gender"] == "Male"]
count_male = male_df["SN"].nunique()
count_male

484

In [8]:
male_percent = ((count_male/total_players) * 100)
male_percent

84.02777777777779

In [9]:
female_df = purchase_data.loc[purchase_data["Gender"] == "Female"]
count_female = female_df["SN"].nunique()
count_female

81

In [10]:
female_percent = ((count_female/total_players) * 100)
female_percent

14.0625

In [11]:
other_df = purchase_data.loc[purchase_data["Gender"] == "Other / Non-Disclosed"]
count_other = other_df["SN"].nunique()
count_other

11

In [12]:
other_percent = ((count_other/total_players) * 100)
other_percent

1.9097222222222223
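
The per-gender counts and percentages above can also be produced in one pass by dropping repeat purchases and counting what remains. This is a sketch under stated assumptions: `gender_demographics` and the column labels are illustrative names, and the sample uses only a handful of screen names that appear in this notebook's outputs.

```python
import pandas as pd

# A few (SN, Gender) pairs that appear in the outputs of this notebook.
# Lisosia93 is listed twice to mimic a player with several purchases.
sample = pd.DataFrame({
    "SN": ["Lisim78", "Lisosia93", "Lisosia93", "Marast30", "Reunasu60", "Chanosian48"],
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Other / Non-Disclosed"],
})


def gender_demographics(df):
    """Count unique players per gender and express each count as a percentage."""
    players = df.drop_duplicates(subset="SN")  # one row per player
    counts = players["Gender"].value_counts()
    percent = (counts / counts.sum() * 100).round(2)
    return pd.DataFrame({"Total Count": counts, "Percentage of Players": percent})


gender_table = gender_demographics(sample)
print(gender_table)
```

Deduplicating on `SN` first matters because the raw frame has one row per purchase, not one row per player.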


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [13]:
avg_male_price = male_df["Price"].mean()
avg_male_price

3.0178527607361953

In [14]:
purchase_count_male = male_df["Purchase ID"].nunique()
purchase_count_male

652

In [15]:
total_price_male = male_df["Price"].sum()
total_price_male

1967.64

In [16]:
# Average total purchase per person (male)
avg_total_per_male = total_price_male / count_male

In [17]:
avg_female_price = female_df["Price"].mean()
avg_female_price

3.203008849557519

In [18]:
purchase_count_female = female_df["Purchase ID"].nunique()
purchase_count_female

113

In [19]:
total_price_female = female_df["Price"].sum()
total_price_female

361.94

In [20]:
# Average total purchase per person (female)
avg_total_per_female = total_price_female / count_female

In [21]:
avg_other_price = other_df["Price"].mean()
avg_other_price

3.3460000000000005

In [22]:
purchase_count_other = other_df["Purchase ID"].nunique()
purchase_count_other

15

In [23]:
total_price_other = other_df["Price"].sum()
total_price_other

50.19

In [24]:
# Average total purchase per person (other / non-disclosed)
avg_total_per_other = total_price_other / count_other
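
The steps listed at the top of this section (purchase count, average price, total value, and average total per person, by gender) can be sketched with a single `groupby` instead of three filtered frames. This is one possible approach, not the notebook's own: `purchasing_by_gender` and the snake_case column names are illustrative, and the sample rows are taken from outputs elsewhere in this notebook.

```python
import pandas as pd

# Rows copied from tables shown in this notebook (a small subset of the data).
sample = pd.DataFrame({
    "Purchase ID": [74, 120, 0, 157, 18, 9],
    "SN": ["Lisosia93", "Lisosia93", "Lisim78", "Marast30", "Reunasu60", "Chanosian48"],
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Other / Non-Disclosed"],
    "Price": [4.64, 3.81, 3.53, 3.53, 4.90, 3.58],
})


def purchasing_by_gender(df):
    """Summarise purchases per gender using named aggregation."""
    summary = df.groupby("Gender").agg(
        purchase_count=("Purchase ID", "nunique"),
        avg_purchase_price=("Price", "mean"),
        total_purchase_value=("Price", "sum"),
        players=("SN", "nunique"),
    )
    # Average total per person divides by unique players, not by purchases.
    summary["avg_total_per_person"] = (
        summary["total_purchase_value"] / summary["players"]
    )
    return summary


by_gender = purchasing_by_gender(sample)
print(by_gender.round(2))
```

The per-person average uses the unique player count as its denominator, which is why it comes out higher than the average purchase price for any gender with repeat buyers.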

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [25]:
bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
# right=False makes each bin include its lower edge, so age 20 falls in "20-24"
age_range = pd.cut(purchase_data["Age"], bins, labels=labels, right=False)
age_range.head(10)

0    20-24
1      40+
2    20-24
3    20-24
4    20-24
5    20-24
6    35-39
7    20-24
8    20-24
9    35-39
Name: Age, dtype: category
Categories (8, object): [<10 < 10-14 < 15-19 < 20-24 < 25-29 < 30-34 < 35-39 < 40+]
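
The remaining steps (counts and percentages by age group) can be sketched as follows. Note that `pd.cut` closes bins on the right by default; passing `right=False` makes each bin include its lower edge, which is what labels such as "20-24" imply. The helper name `age_demographics` and the column labels are illustrative, and the sample ages are copied from rows shown in this notebook.

```python
import pandas as pd

# (SN, Age) pairs copied from tables in this notebook; Lisosia93 appears twice
# to mimic a player with more than one purchase.
players = pd.DataFrame({
    "SN": ["Lisim78", "Lisovynya38", "Ithergue48", "Lisosia93", "Lisosia93", "Chanosian48"],
    "Age": [20, 40, 24, 25, 25, 35],
})

bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]


def age_demographics(df):
    """Bin unique players by age and report counts and percentages."""
    unique_players = df.drop_duplicates(subset="SN").copy()
    # right=False gives left-closed bins: [20, 25) is labelled "20-24".
    unique_players["Age Range"] = pd.cut(
        unique_players["Age"], bins, labels=labels, right=False
    )
    counts = unique_players["Age Range"].value_counts(sort=False)
    percent = (counts / len(unique_players) * 100).round(2)
    return pd.DataFrame({"Total Count": counts, "Percentage of Players": percent})


age_table = age_demographics(players)
print(age_table)
```

Using `sort=False` keeps the rows in bin order rather than ordering them by count, which reads more naturally for an age table.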

## Top Spenders

In [26]:
top_sn = purchase_data["SN"].value_counts().nlargest(5) 
top_sn

Lisosia93      5
Iral74         4
Idastidru52    4
Ialallo29      3
Raesty92       3
Name: SN, dtype: int64

In [27]:
# Find the average and total purchase price per player

In [28]:
sn_numbers = purchase_data["SN"].groupby(purchase_data["SN"]).value_counts().nlargest()
sn_numbers

SN           SN         
Lisosia93    Lisosia93      5
Idastidru52  Idastidru52    4
Iral74       Iral74         4
Aelin32      Aelin32        3
Aina42       Aina42         3
Name: SN, dtype: int64

In [29]:
largest_five = pd.DataFrame(purchase_data.groupby('SN')['Price'].sum().nlargest(5))
largest_five

SN,Price
Lisosia93,18.96
Idastidru52,15.45
Chamjask73,13.83
Iral74,13.62
Iskadarya95,13.1


In [30]:
pd.merge(purchase_data, largest_five, how='inner', on=['SN'])


,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price_x,Price_y
0,74,Lisosia93,25,Male,89,"Blazefury, Protector of Delusions",4.64,18.96
1,120,Lisosia93,25,Male,24,Warped Fetish,3.81,18.96
2,224,Lisosia93,25,Male,157,"Spada, Etcher of Hatred",4.8,18.96
3,603,Lisosia93,25,Male,141,Persuasion,3.19,18.96
4,609,Lisosia93,25,Male,40,Second Chance,2.52,18.96
5,128,Iral74,21,Male,58,"Freak's Bite, Favor of Holy Might",4.14,13.62
6,623,Iral74,21,Male,114,Yearning Mageblade,3.82,13.62
7,758,Iral74,21,Male,182,Toothpick,4.03,13.62
8,776,Iral74,21,Male,164,Exiled Doomblade,1.63,13.62
9,148,Iskadarya95,20,Male,148,"Warmonger, Gift of Suffering's End",4.03,13.1


In [31]:
avg_five = pd.DataFrame(purchase_data.groupby('SN')['Price'].mean().nlargest(5))
avg_five 

SN,Price
Dyally87,4.99
Chanirrasta87,4.94
Lirtilsa71,4.94
Ririp86,4.94
Yarithsurgue62,4.94


In [32]:
pd.merge(purchase_data, avg_five, how='inner', on=['SN'])

,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price_x,Price_y
0,110,Ririp86,25,Male,139,"Mercy, Katana of Dismay",4.94,4.94
1,231,Yarithsurgue62,26,Male,139,"Mercy, Katana of Dismay",4.94,4.94
2,246,Lirtilsa71,24,Male,139,"Mercy, Katana of Dismay",4.94,4.94
3,493,Chanirrasta87,14,Male,139,"Mercy, Katana of Dismay",4.94,4.94
4,554,Dyally87,22,Male,63,Stormfury Mace,4.99,4.99
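
The purchase count, average price, and total value per player can be gathered in one table instead of separate frames joined back with `pd.merge`. A minimal sketch, assuming the illustrative helper name `top_spenders` and snake_case column names; the sample rows are the Lisosia93, Iral74, and Dyally87 purchases shown in the outputs above.

```python
import pandas as pd

# Purchases copied from the merge outputs above.
sample = pd.DataFrame({
    "Purchase ID": [74, 120, 224, 603, 609, 128, 623, 758, 776, 554],
    "SN": ["Lisosia93"] * 5 + ["Iral74"] * 4 + ["Dyally87"],
    "Price": [4.64, 3.81, 4.80, 3.19, 2.52, 4.14, 3.82, 4.03, 1.63, 4.99],
})


def top_spenders(df, n=5):
    """Rank players by total spend, keeping count and average alongside."""
    summary = df.groupby("SN").agg(
        purchase_count=("Purchase ID", "count"),
        avg_purchase_price=("Price", "mean"),
        total_purchase_value=("Price", "sum"),
    )
    return summary.sort_values("total_purchase_value", ascending=False).head(n)


top = top_spenders(sample)
print(top.round(2))
```

Ranking by total rather than by average keeps single-purchase players such as Dyally87 from crowding out the repeat buyers.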


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [33]:
purchase_data["Item Name"].value_counts().nlargest(5) 

Final Critic                                    13
Oathbreaker, Last Hope of the Breaking Storm    12
Persuasion                                       9
Extraction, Quickblade Of Trembling Hands        9
Fiery Glass Crusader                             9
Name: Item Name, dtype: int64

In [34]:
largest_item = pd.DataFrame(purchase_data.groupby('Item Name')['Price'].value_counts().nlargest(5))
largest_item

Item Name,Price,Price
"Oathbreaker, Last Hope of the Breaking Storm",4.23,12
"Extraction, Quickblade Of Trembling Hands",3.53,9
Fiery Glass Crusader,4.58,9
Nirvana,4.9,9
Brutality Ivory Warmace,2.42,8


In [35]:
pd.merge(purchase_data, largest_item, how='inner', on=['Item Name'], suffixes=('_per_item', '_purchase_count'))

,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price_per_item,Price_purchase_count
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
1,56,Raesty92,12,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
2,157,Marast30,18,Female,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
3,414,Marokian45,23,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
4,483,Saistyphos30,35,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
5,498,Firon67,35,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
6,558,Lassadarsda57,25,Female,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
7,640,Sausosia74,22,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
8,678,Rarallo90,33,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,9
9,18,Reunasu60,22,Female,82,Nirvana,4.9,9
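
The steps listed for this section call for grouping by both Item ID and Item Name. The outputs in this notebook suggest why that may matter: Final Critic is counted 13 times by name but only 8 times under Item ID 92, so one name appears to cover more than one ID. The sketch below shows the grouped summary; `popular_items` and its column names are illustrative, and the sample rows are copied from tables in this notebook.

```python
import pandas as pd

# Purchases copied from tables shown in this notebook.
sample = pd.DataFrame({
    "Purchase ID": [0, 56, 157, 18, 2, 603],
    "Item ID": [108, 108, 108, 82, 92, 141],
    "Item Name": [
        "Extraction, Quickblade Of Trembling Hands",
        "Extraction, Quickblade Of Trembling Hands",
        "Extraction, Quickblade Of Trembling Hands",
        "Nirvana",
        "Final Critic",
        "Persuasion",
    ],
    "Price": [3.53, 3.53, 3.53, 4.90, 4.88, 3.19],
})


def popular_items(df, n=5):
    """Summarise items by (Item ID, Item Name) and rank by purchase count."""
    summary = df.groupby(["Item ID", "Item Name"]).agg(
        purchase_count=("Purchase ID", "count"),
        item_price=("Price", "first"),  # assumes one price per Item ID
        total_purchase_value=("Price", "sum"),
    )
    return summary.sort_values("purchase_count", ascending=False).head(n)


popular = popular_items(sample)
print(popular)
```

Taking the first price per group assumes each Item ID has a single price; if that does not hold in the full data, `"mean"` would be a safer choice.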


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [36]:
item_five = pd.DataFrame(purchase_data.groupby('Item ID')['Price'].sum().nlargest(5))
item_five

Item ID,Price
178,50.76
82,44.1
145,41.22
92,39.04
103,34.8


In [37]:
merge = pd.merge(purchase_data, item_five, how='inner', on=['Item ID'])
merge_again = merge.groupby("Item Name")
merge_again["Item Name"].count().nlargest()

Item Name
Oathbreaker, Last Hope of the Breaking Storm    12
Fiery Glass Crusader                             9
Nirvana                                          9
Final Critic                                     8
Singed Scalpel                                   8
Name: Item Name, dtype: int64
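
For the sorting and optional formatting steps, one detail is worth illustrating: currency formatting turns numbers into strings, so the sort should happen first. The sketch below uses the per-item totals from the `item_five` output above, deliberately shuffled; the helper name `rank_and_format` and the column label are illustrative.

```python
import pandas as pd

# Totals copied from the item_five output above, in shuffled order.
item_totals = pd.DataFrame(
    {"Total Purchase Value": [39.04, 50.76, 34.8, 44.1, 41.22]},
    index=pd.Index([92, 178, 103, 82, 145], name="Item ID"),
)


def rank_and_format(df, column):
    """Sort by a numeric column, then return numeric and display copies."""
    ranked = df.sort_values(column, ascending=False)
    display_copy = ranked.copy()
    # Formatting yields strings, so it is applied only to the display copy.
    display_copy[column] = display_copy[column].map("${:,.2f}".format)
    return ranked, display_copy


ranked, formatted = rank_and_format(item_totals, "Total Purchase Value")
print(formatted)
```

Keeping the numeric frame separate from the formatted one leaves the values available for further calculations.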

In [38]:
# Note: wrapping the result above in pd.DataFrame() gave unexpected output; cause not yet identified
pd.DataFrame()