### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.79%), with secondary groups at 15-19 (18.58%) and 25-29 (13.37%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

In [2]:
purchase_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [3]:
num_players=len(purchase_data["SN"].unique())
print(num_players)

576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [4]:
uniq_items=len(purchase_data["Item ID"].unique())
print(uniq_items)

183


In [5]:
avg_price=purchase_data["Price"].mean()
print(avg_price)

3.050987179487176


In [6]:
numb_purchases=len(purchase_data["Purchase ID"])
print(numb_purchases)

780


In [7]:
total_revenue=purchase_data["Price"].sum()
print(total_revenue)

2379.77


In [8]:
summary_df=pd.DataFrame([{"Number of unique items":uniq_items,
                       "Average purchase price":avg_price,
                       "Total number of purchases":numb_purchases,
                        "Total revenue":total_revenue}])
summary_df

Unnamed: 0,Average purchase price,Number of unique items,Total number of purchases,Total revenue
0,3.050987,183,780,2379.77


In [9]:
summary_df=summary_df[["Number of unique items","Average purchase price","Total number of purchases","Total revenue"]]

In [10]:
summary_df

Unnamed: 0,Number of unique items,Average purchase price,Total number of purchases,Total revenue
0,183,3.050987,780,2379.77
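
The optional "cleaner formatting" step is not applied above. A minimal sketch of one way to do it, assuming currency-style strings are wanted for the two dollar columns; the values are copied from the cell outputs above, and `formatted_df` is a name introduced only for this example:

```python
import pandas as pd

# Values copied from the outputs of cells In [4] through In [7].
summary_df = pd.DataFrame([{
    "Number of unique items": 183,
    "Average purchase price": 3.050987179487176,
    "Total number of purchases": 780,
    "Total revenue": 2379.77,
}])

# Format a copy so the numeric frame stays usable for later calculations.
formatted_df = summary_df.copy()
for col in ["Average purchase price", "Total revenue"]:
    formatted_df[col] = formatted_df[col].map("${:,.2f}".format)

print(formatted_df.to_string(index=False))
```

Formatting turns the columns into strings, so it is usually best left as the last step before display.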


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [26]:
gender_purchase_data=purchase_data.groupby(["Gender"])
gender_uniq_data=gender_purchase_data["SN"].unique()
#print(gender_uniq_data)

num_female_uniq=len(gender_uniq_data["Female"])
num_male_uniq=len(gender_uniq_data["Male"])
num_other_uniq=len(gender_uniq_data["Other / Non-Disclosed"])
print(f'Count of male players: {num_male_uniq}')
print(f'Count of female players: {num_female_uniq}')
print(f'Count of other players: {num_other_uniq}')

perc_male=round((num_male_uniq/num_players)*100)
perc_female=round((num_female_uniq/num_players)*100)
perc_other=round((num_other_uniq/num_players)*100)
print(f'Percentage of male players: {perc_male}%')
print(f'Percentage of female players: {perc_female}%')
print(f'Percentage of other players: {perc_other}%')

Count of male players: 484
Count of female players: 81
Count of other players: 11
Percentage of male players: 84%
Percentage of female players: 14%
Percentage of other players: 2%
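
The same counts and percentages can also be collected into one table rather than printed line by line. This is a sketch under stated assumptions, not the notebook's method: the rows in `sample` are made up for illustration, and only the column names `SN` and `Gender` come from `purchase_data`.

```python
import pandas as pd

# Made-up sample rows; only the column names match purchase_data.
sample = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4", "D4"],
    "Gender": ["Male", "Male", "Female", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
})

# Count each player once per gender, however many purchases they made.
gender_counts = sample.groupby("Gender")["SN"].nunique()
total_players = sample["SN"].nunique()

gender_demo = pd.DataFrame({
    "Total count": gender_counts,
    "Percentage of players": (gender_counts / total_players * 100).round(2),
}).sort_values("Total count", ascending=False)

print(gender_demo)
```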



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [12]:
purch_value_gender=gender_purchase_data["Price"].sum()
purch_value_male=round(purch_value_gender["Male"],2)
purch_value_female=round(purch_value_gender["Female"],2)
purch_value_other=round(purch_value_gender["Other / Non-Disclosed"],2)
print(f'Total value of purchases for males: {purch_value_male}')
print(f'Total value of purchases for females: {purch_value_female}')
print(f'Total value of purchases for others: {purch_value_other}')

Total value of purchases for males: 1967.64
Total value of purchases for females: 361.94
Total value of purchases for others: 50.19


In [13]:
purch_count_gender=gender_purchase_data["Purchase ID"].count()
purch_count_male=round(purch_count_gender["Male"],2)
purch_count_female=round(purch_count_gender["Female"],2)
purch_count_other=round(purch_count_gender["Other / Non-Disclosed"],2)
print(f'Total purchases for males: {purch_count_male}')
print(f'Total purchases for females: {purch_count_female}')
print(f'Total purchases for others: {purch_count_other}')

Total purchases for males: 652
Total purchases for females: 113
Total purchases for others: 15


In [14]:
purch_avg_price=gender_purchase_data["Price"].mean()
purch_avg_male=round(purch_avg_price["Male"],2)
purch_avg_female=round(purch_avg_price["Female"],2)
purch_avg_other=round(purch_avg_price["Other / Non-Disclosed"],2)
print(f'Average purchases for males: {purch_avg_male}')
print(f'Average purchases for females: {purch_avg_female}')
print(f'Average purchases for others: {purch_avg_other}')

Average purchases for males: 3.02
Average purchases for females: 3.2
Average purchases for others: 3.35


In [15]:
# Average purchase total per person, by gender
avg_purch_person=purchase_data.groupby(["Gender","SN"])
avg_purch_person_sum=avg_purch_person["Price"].sum()
avg_total_male=round(avg_purch_person_sum["Male"].mean(),2)
print(f'Average of total purchases per person (males): {avg_total_male}')
avg_total_female=round(avg_purch_person_sum["Female"].mean(),2)
print(f'Average of total purchases per person (females): {avg_total_female}')
avg_total_other=round(avg_purch_person_sum["Other / Non-Disclosed"].mean(),2)
print(f'Average of total purchases per person (others): {avg_total_other}')

Average of total purchases per person (males): 4.07
Average of total purchases per person (females): 4.47
Average of total purchases per person (others): 4.56


In [16]:
gender_summary_df=pd.DataFrame({"Total value of purchases": [purch_value_male, purch_value_female, purch_value_other],
                                "Number of purchases": [purch_count_male, purch_count_female, purch_count_other],
                                "Average of purchases":[purch_avg_male, purch_avg_female, purch_avg_other],
                                "Average of total purchases per person":[avg_total_male, avg_total_female, avg_total_other],
                                "Gender":["male", "female","other"]})
gender_summary_df

Unnamed: 0,Total value of purchases,Number of purchases,Average of purchases,Average of total purchases per person,Gender
0,1967.64,652,3.02,4.07,male
1,361.94,113,3.2,4.47,female
2,50.19,15,3.35,4.56,other


In [17]:
gender_summary_df=gender_summary_df.set_index("Gender")
gender_summary_df

Unnamed: 0_level_0,Total value of purchases,Number of purchases,Average of purchases,Average of total purchases per person
Gender,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
male,1967.64,652,3.02,4.07
female,361.94,113,3.2,4.47
other,50.19,15,3.35,4.56
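
The four per-gender metrics above could also be computed in a single pass with named aggregation. The sketch below uses made-up rows, and the snake_case column names are this example's own. It relies on the fact that the mean of per-person totals equals total value divided by the number of unique players (for males above, 1967.64 / 484 rounds to 4.07).

```python
import pandas as pd

# Made-up sample rows; only the column names match purchase_data.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4],
    "SN": ["A1", "A1", "B2", "C3", "C3"],
    "Gender": ["Male", "Male", "Female", "Male", "Male"],
    "Price": [1.00, 3.00, 2.50, 4.00, 2.00],
})

by_gender = sample.groupby("Gender").agg(
    purchase_count=("Purchase ID", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Average total spent per player in each gender group.
by_gender["avg_total_per_person"] = by_gender["total_value"] / by_gender["players"]

print(by_gender.round(2))
```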


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [92]:
bins=[0,9,14,19,24,29,34,39,100]
# pd.cut bins are right-inclusive, so the last two groups cover ages 35-39 and 40+
group_names=["<10","10-14","15-19","20-24","25-29","30-34","35-39","40+"]
age_purch_data=pd.cut(purchase_data["Age"],bins,labels=group_names)
purchase_data["Age bins"]=age_purch_data
# Keep one row per player so each player is counted once
purchase_data_clean=purchase_data.drop_duplicates(["SN"])
# Name the counts column explicitly so the result does not depend on the pandas version
age_counts=purchase_data_clean["Age bins"].value_counts().rename("Total count").to_frame()
age_counts=age_counts.reindex(group_names)
age_counts["Percentage of players"]=round(age_counts["Total count"]*100/num_players,2)
age_counts

Unnamed: 0,Total count,Percentage of players
<10,17,2.95
10-14,22,3.82
15-19,107,18.58
20-24,258,44.79
25-29,77,13.37
30-34,52,9.03
35-39,31,5.38
40+,12,2.08

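To check which ages land in which group, it can help to run `pd.cut` on a few boundary values. With the default `right=True`, each bin includes its right edge, so with the edges used above age 39 falls in the seventh bin and age 40 in the last one. The labels in this sketch are named for the ages each bin actually covers and are this example's own choice.

```python
import pandas as pd

# Same bin edges as the Age Demographics cell.
bins = [0, 9, 14, 19, 24, 29, 34, 39, 100]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Ages chosen to sit on either side of several bin edges.
ages = pd.Series([9, 10, 24, 25, 39, 40])
binned = pd.cut(ages, bins, labels=labels)

print(pd.DataFrame({"Age": ages, "Age bin": binned}))
```
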

## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
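
This section has no code cell in the notebook. One possible shape for it is sketched below, assuming the same bin edges as in Age Demographics. The rows in `sample`, the bin labels, and the snake_case column names are made up for illustration.

```python
import pandas as pd

# Made-up sample rows; only the column names match purchase_data.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4],
    "SN": ["A1", "A1", "B2", "C3", "D4"],
    "Age": [22, 22, 17, 23, 41],
    "Price": [1.00, 3.00, 2.50, 4.00, 2.00],
})

bins = [0, 9, 14, 19, 24, 29, 34, 39, 100]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample["Age bins"] = pd.cut(sample["Age"], bins, labels=labels)

# observed=True keeps only the age groups that actually occur in the data.
by_age = sample.groupby("Age bins", observed=True).agg(
    purchase_count=("Purchase ID", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)
by_age["avg_total_per_person"] = by_age["total_value"] / by_age["players"]

print(by_age.round(2))
```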

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [93]:
# Identify the top 5 spenders in the game by total purchase value, then list (in a table):
purch_per_person=purchase_data.groupby(["SN"])
# numeric_only=True skips text and categorical columns such as "Gender" and "Age bins"
purch_value= purch_per_person.sum(numeric_only=True)
purch_value=purch_value.sort_values("Price",ascending=False)
top_5_sp=purch_value.head()

#SN
top_5_names=list(top_5_sp.index)

#Purchase Count
top_5_purch_count=purch_per_person.count()
first_count=top_5_purch_count.loc[top_5_names[0],"Purchase ID"]
second_count=top_5_purch_count.loc[top_5_names[1],"Purchase ID"]
third_count=top_5_purch_count.loc[top_5_names[2],"Purchase ID"]
fourth_count=top_5_purch_count.loc[top_5_names[3],"Purchase ID"]
fifth_count=top_5_purch_count.loc[top_5_names[4],"Purchase ID"]


#Average Purchase Price
avg_purch_price=purch_per_person.mean(numeric_only=True)
first_avg=round(avg_purch_price.loc[top_5_names[0],"Price"],2)
second_avg=round(avg_purch_price.loc[top_5_names[1],"Price"],2)
third_avg=round(avg_purch_price.loc[top_5_names[2],"Price"],2)
fourth_avg=round(avg_purch_price.loc[top_5_names[3],"Price"],2)
fifth_avg=round(avg_purch_price.loc[top_5_names[4],"Price"],2)

#Total Purchase Value
total_value_top_5=list(round(top_5_sp["Price"],2))

top_5_sp_df=pd.DataFrame({"Name":top_5_names,"Purchase count":[first_count,second_count,third_count,fourth_count,fifth_count],
                          "Average purchase price":[first_avg,second_avg,third_avg,fourth_avg,fifth_avg],
                          "Total purchase value":total_value_top_5})
top_5_sp_df=top_5_sp_df.set_index("Name")
top_5_sp_df


Unnamed: 0_level_0,Purchase count,Average purchase price,Total purchase value
Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Lisosia93,5,3.79,18.96
Idastidru52,4,3.86,15.45
Chamjask73,3,4.61,13.83
Iral74,4,3.4,13.62
Iskadarya95,3,4.37,13.1
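
The per-rank variables above can be avoided by aggregating once and sorting. This is an alternative sketch, not the notebook's approach; the rows in `sample` and the snake_case column names are made up for illustration.

```python
import pandas as pd

# Made-up sample rows; only the column names match purchase_data.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["A1", "A1", "B2", "C3", "C3", "C3"],
    "Price": [1.00, 3.00, 4.50, 2.00, 2.00, 1.50],
})

spenders = sample.groupby("SN").agg(
    purchase_count=("Purchase ID", "count"),
    avg_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
)

# Sort by total spend and keep the first five rows.
top_spenders = spenders.sort_values("total_purchase_value", ascending=False).head(5)

print(top_spenders.round(2))
```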


## Most Popular Items


* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
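
This section has no code cell in the notebook. The steps listed above might be sketched as follows. The item names and prices are taken from the `purchase_data.head()` output, but the number of times each appears is made up, and the snake_case column names are this example's own. Using `mean` for the item price assumes each item sells at a single price.

```python
import pandas as pd

# Names and prices from the head() preview; the repeat counts are made up.
sample = pd.DataFrame({
    "Item ID": [108, 108, 143, 92, 92, 92],
    "Item Name": [
        "Extraction, Quickblade Of Trembling Hands",
        "Extraction, Quickblade Of Trembling Hands",
        "Frenzied Scimitar",
        "Final Critic", "Final Critic", "Final Critic",
    ],
    "Price": [3.53, 3.53, 1.56, 4.88, 4.88, 4.88],
})

items = sample.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "mean"),
    total_value=("Price", "sum"),
)

# Most purchased items first.
popular = items.sort_values("purchase_count", ascending=False)

print(popular.head().round(2))
```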

## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
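
This section also has no code cell. A minimal sketch of the sort-and-format step follows, assuming an aggregated item table like the one the Most Popular Items step asks for. The table `items` is built by hand here with made-up purchase counts, and its column names are hypothetical.

```python
import pandas as pd

# Hand-built aggregated table; counts are made up, prices come from the head() preview.
items = pd.DataFrame(
    {
        "purchase_count": [3, 2, 1],
        "item_price": [1.56, 4.88, 3.53],
        "total_value": [4.68, 9.76, 3.53],
    },
    index=pd.MultiIndex.from_tuples(
        [
            (143, "Frenzied Scimitar"),
            (92, "Final Critic"),
            (108, "Extraction, Quickblade Of Trembling Hands"),
        ],
        names=["Item ID", "Item Name"],
    ),
)

# Highest total purchase value first; this need not match the popularity order.
profitable = items.sort_values("total_value", ascending=False)

# Format only the preview copy so the sorted frame stays numeric.
preview = profitable.head().copy()
for col in ["item_price", "total_value"]:
    preview[col] = preview[col].map("${:,.2f}".format)

print(preview)
```

In this made-up table the most purchased item (three sales) is not the most profitable one, which is why the two rankings are reported separately.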

