### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). There also exists, a smaller, but notable proportion of female players (14%).

* Our peak age demographic falls between 20-24 (44.8%) with secondary groups falling between 15-19 (18.60%) and 25-29 (13.4%).  
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [37]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

In [38]:
purchase_data.head()


Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [39]:
total_no_players = len(purchase_data["SN"].unique())

print("Total Number of Players: " + str(total_no_players))

Total Number of Players: 576


## Purchasing Analysis (Total)

In [40]:
number_items = purchase_data["Item Name"].nunique()  
average_price = round(purchase_data["Price"].mean(),2)
number_purchases = purchase_data["Purchase ID"].count()
total_revenue = purchase_data["Price"].sum()

data_summary = pd.DataFrame({"Number items":[str(number_items)],
                             "Average Price":["$ " + str(average_price)],
                             "Number Purchases":[str(number_purchases)],
                             "Total Revenue":["$ " + str(total_revenue)]
                            })
data_summary




Unnamed: 0,Number items,Average Price,Number Purchases,Total Revenue
0,179,$ 3.05,780,$ 2379.77



## Gender Demographics

In [41]:
total_counts = purchase_data['Gender'].count()
male_counts = purchase_data['Gender'].value_counts().loc['Male']
female_counts = purchase_data['Gender'].value_counts().loc['Female']
other_counts = purchase_data['Gender'].value_counts().loc['Other / Non-Disclosed']

male_percent = round((male_counts * 100) / (total_counts),2)                                                      
female_percent = round((female_counts * 100) / (total_counts),2)                                                        
other_percent =round((other_counts * 100) / (total_counts),2)

gender_demo_summary = pd.DataFrame({"Male counts":[str(male_counts)],
                                   "Male percentage":[str(male_percent) + "%"],
                                  "Female counts":[str(female_counts)],
                                   "Female percentage":[str(female_percent)+ "%"],
                                   "Other counts":[str(other_counts)],
                                   "Other percentage":[str(male_percent) + "%"],})


gender_demo_summary


Unnamed: 0,Male counts,Male percentage,Female counts,Female percentage,Other counts,Other percentage
0,652,83.59%,113,14.49%,15,83.59%


## Age Demographics

In [42]:
fullcount = purchase_data["SN"].nunique()
years_10 = purchase_data[purchase_data["Age"] < 10]
years_19 = purchase_data[(purchase_data["Age"] >= 10) & (purchase_data["Age"] <= 19)]
years_29 = purchase_data[(purchase_data["Age"] >= 20) & (purchase_data["Age"] <= 29)]
years_39 = purchase_data[(purchase_data["Age"] >= 30) & (purchase_data["Age"] <= 39)]
years_49 = purchase_data[(purchase_data["Age"] >= 40) & (purchase_data["Age"] <= 49)]

age_df = pd.DataFrame({"Age": ["<10", "10-19", "20-29", "30-39", "40-49"],
                        "Percentage of Players": [(years_10["SN"].nunique()/fullcount)*100, (years_19["SN"].nunique()/fullcount)*100,
                                                  (years_29["SN"].nunique()/fullcount)*100, (years_39["SN"].nunique()/fullcount)*100,
                                                 (years_49["SN"].nunique()/fullcount)*100]
                      })

age_final = age_df.set_index("Age")
age_final

Unnamed: 0_level_0,Percentage of Players
Age,Unnamed: 1_level_1
<10,2.951389
10-19,22.395833
20-29,58.159722
30-39,14.409722
40-49,2.083333


## Purchasing Analysis (Age)

In [20]:
#create bins according to age
age_bins =[0,15,20,25,29,100]

age_group =["Below 15","15-19","20-24","25-29","30 and up"]

In [21]:
purchase_data["Age group summary"] = pd.cut(purchase_data["Age"], age_bins, labels=age_group)

purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price,Age group summary
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53,15-19
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56,30 and up
2,2,Ithergue48,24,Male,92,Final Critic,4.88,20-24
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27,20-24
4,4,Iskosia90,23,Male,131,Fury,1.44,20-24


In [19]:
age_purchases = pd.DataFrame(purchase_data["Age group summary"].value_counts())
age_purchases.head()

Unnamed: 0,Age group summary
20-24,325
15-19,200
30 and up,127
Below 15,86
25-29,42


In [22]:
counts_age = purchase_data['Age group summary'].count()
counts_15 = purchase_data['Age group summary'].value_counts().loc['Below 15']
counts_19 = purchase_data['Age group summary'].value_counts().loc['15-19']
counts_24 = purchase_data['Age group summary'].value_counts().loc['20-24']
counts_29 = purchase_data['Age group summary'].value_counts().loc['25-29']
counts_30 = purchase_data['Age group summary'].value_counts().loc['30 and up']

percent_15 = round((counts_15 * 100) / (counts_age),2)
percent_19 = round((counts_19 * 100) / (counts_age),2)                                                      
percent_24 = round((counts_24 * 100) / (counts_age),2)                                                        
percent_29 =round((counts_29 * 100) / (counts_age),2)
percent_30 =round((counts_30 * 100) / (counts_age),2)

In [23]:
age_summary = pd.DataFrame({"Age Group":["Below 15","15-19","20-24","25-29","30 and up"],
                            "# of Players":[str(counts_15),str(counts_19),str(counts_24),str(counts_29),str(counts_30)],
                            "% of Players":[str(percent_15) + "%", str(percent_19) + "%", str(percent_24) + "%", str(percent_29) + "%", str(percent_30) + "%"],})


age_summary

Unnamed: 0,Age Group,# of Players,% of Players
0,Below 15,86,11.03%
1,15-19,200,25.64%
2,20-24,325,41.67%
3,25-29,42,5.38%
4,30 and up,127,16.28%


## Top Spenders

In [43]:
df = purchase_data
sn_total_purchase = df.groupby('SN')['Price'].sum().to_frame()
sn_purchase_count = df.groupby('SN')['Price'].count().to_frame()
sn_purchase_avg = df.groupby('SN')['Price'].mean().to_frame()

sn_total_purchase.columns=["Total Purchase Value"]
join_one = sn_total_purchase.join(sn_purchase_count, how="left")
join_one.columns=["Total Purchase Value", "Purchase Count"]

join_two = join_one.join(sn_purchase_avg, how="inner")
join_two.columns=["Total Purchase Value", "Purchase Count", "Average Purchase Price"]

top_spenders_df = join_two[["Purchase Count", "Average Purchase Price", "Total Purchase Value"]]
top_spenders_final = top_spenders_df.sort_values('Total Purchase Value', ascending=False).head()
top_spenders_final

Unnamed: 0_level_0,Purchase Count,Average Purchase Price,Total Purchase Value
SN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Lisosia93,5,3.792,18.96
Idastidru52,4,3.8625,15.45
Chamjask73,3,4.61,13.83
Iral74,4,3.405,13.62
Iskadarya95,3,4.366667,13.1


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame

