### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), followed by the 15-19 (18.6%) and 25-29 (13.4%) groups.
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [120]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
#purchase_data.head(10)

## Player Count

* Display the total number of players


In [141]:
# Count unique players by screen name (SN); a player with several purchases is counted once
uniquePlayers=purchase_data["SN"].nunique()
uniquePlayers

576

In [143]:
Total_df=pd.DataFrame({"Total Players":[uniquePlayers]})
Total_df

,Total Players
0,576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [7]:
# Number of distinct items sold (nunique gives the same result as value_counts().count())
unique_Items_count=purchase_data["Item ID"].nunique()

average_Price=round(purchase_data["Price"].mean(),2)

Num_purchases=purchase_data["Item Name"].count()

Total_revenue=purchase_data["Price"].sum()


In [6]:
#Create Purchase Summary dataframe
purchase_summary_df=pd.DataFrame({ "Number of Unique Items" : [unique_Items_count],
                                   "Average Price":[average_Price],
                                   "Number of Purchases":[Num_purchases],
                                   "Total Revenue":[Total_revenue]})
purchase_summary_df

,Number of Unique Items,Average Price,Number of Purchases,Total Revenue
0,183,3.05,780,2379.77
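
The optional formatting step is not carried out in the cells above. The following is a minimal sketch of one way to do it, assuming a one-row summary frame rebuilt by hand from the figures in the output above. The column labels and the name `formatted_summary` are illustrative, not part of the original notebook.

```python
import pandas as pd

# Summary values copied from the output above, rebuilt by hand so the sketch runs on its own.
summary = pd.DataFrame({
    "Number of Unique Items": [183],
    "Average Price": [3.05],
    "Number of Purchases": [780],
    "Total Revenue": [2379.77],
})

# Format a copy so the numeric frame stays usable for later calculations.
formatted_summary = summary.copy()
for column in ["Average Price", "Total Revenue"]:
    formatted_summary[column] = formatted_summary[column].map("${:,.2f}".format)

print(formatted_summary)
```

Formatting turns the columns into strings, so it is usually safest to apply it last, after any sorting or arithmetic.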


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [58]:
# Unique players (by SN) in each gender group
players_by_gender=purchase_data.groupby(["Gender"])["SN"].nunique()

players_by_gender_df=players_by_gender.to_frame()
players_by_gender_df.columns=["Total Count"]
Players_percentage_byGender=round(players_by_gender_df['Total Count']/uniquePlayers*100,2)
players_by_gender_df['Percentage of Players']=Players_percentage_byGender
players_by_gender_df.sort_values('Total Count',ascending=False)

Gender,Total Count,Percentage of Players
Male,484,84.03
Female,81,14.06
Other / Non-Disclosed,11,1.91


In [155]:
GenderCount=purchase_data.groupby(["Gender"])["SN"].nunique()
GenderCount

Gender
Female                    81
Male                     484
Other / Non-Disclosed     11
Name: SN, dtype: int64


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [157]:
Purchasing_by_Gender_df=purchase_data.loc[:,["Purchase ID","Gender","Item ID","Price"]]
groupby_obj=Purchasing_by_Gender_df.groupby(["Gender"])

Purchase_count=groupby_obj["Purchase ID"].count()
#Purchase_count

ava_PurPrice=round(groupby_obj["Price"].mean(),2)
#ava_PurPrice

Pur_byGender=groupby_obj["Price"].sum()
#Pur_byGender

ava_pur_perPerson= round(Pur_byGender/GenderCount,2)
print("Avg per person:" + str(ava_pur_perPerson))


Avg per person:Gender
Female                   4.47
Male                     4.07
Other / Non-Disclosed    4.56
dtype: float64


In [158]:
#Create Purchase Summary By Gender dataframe
purchase_summary_By_Gender_df=pd.DataFrame({ "Purchase Count" : Purchase_count,
                                             "Average Purchase Price":ava_PurPrice,
                                             "Total Purchase Value":Pur_byGender,
                                             "Avg Total Purchase per Person":ava_pur_perPerson})
purchase_summary_By_Gender_df

Gender,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Total Purchase per Person
Female,113,3.2,361.94,4.47
Male,652,3.02,1967.64,4.07
Other / Non-Disclosed,15,3.35,50.19,4.56


In [119]:
purchase_data.tail()

,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
775,775,Aethedru70,21,Female,60,Wolf,3.54
776,776,Iral74,21,Male,164,Exiled Doomblade,1.63
777,777,Yathecal72,20,Male,67,"Celeste, Incarnation of the Corrupted",3.46
778,778,Sisur91,7,Male,101,Final Critic,4.19
779,779,Ennrian78,24,Male,50,Dawn,4.6


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
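
Before binning the real data, it can help to check how `pd.cut()` treats the bin edges. The sketch below uses a handful of made-up ages (not taken from `purchase_data`) with the same edges and labels as the cells that follow. By default each bin includes its right edge, so 9 lands in `<10` and 10 lands in `10-14`.

```python
import pandas as pd

age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 155]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Made-up ages chosen to sit on or next to bin edges.
ages = pd.Series([9, 10, 24, 25, 40])

# Default right=True means the bins are (0, 9], (9, 14], (14, 19], and so on.
groups = pd.cut(ages, age_bins, labels=group_names)
print(groups.tolist())
```

Because the first bin is open on the left, an age of exactly 0 would fall outside every bin and come back as `NaN` unless `include_lowest=True` is passed.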


In [89]:
#Create Age bins
Age_bins=[0,9,14,19,24,29,34,39,155]
group_names=['<10','10-14','15-19','20-24','25-29','30-34','35-39','40+']

In [99]:
AgeGroup=purchase_data.groupby(['Age'])
#print (purchase_data.groupby('Age').groups)

In [102]:
Age_Demo_df=purchase_data.copy()
Age_Demo_df["Age Group"]=pd.cut(Age_Demo_df['Age'],Age_bins, labels=group_names)
#Age_Demo_df

In [105]:
Age_groupby_data=Age_Demo_df.groupby(["Age Group"])
#print (Age_Demo_df.groupby('Age Group').groups)


In [163]:
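# Unique players per age group (the name Purchase_count is reused here, but these are player counts, not purchases)
Purchase_count=Age_groupby_data["SN"].nunique()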
Purchase_count

Age Group
<10       17
10-14     22
15-19    107
20-24    258
25-29     77
30-34     52
35-39     31
40+       12
Name: SN, dtype: int64

In [164]:
PercentageOFPlayers=round(Purchase_count/uniquePlayers*100,2)
PercentageOFPlayers

Age Group
<10       2.95
10-14     3.82
15-19    18.58
20-24    44.79
25-29    13.37
30-34     9.03
35-39     5.38
40+       2.08
Name: SN, dtype: float64

In [165]:
Age_Demo_df=pd.DataFrame({"Total Count": Purchase_count,
                          "Percentage of Players": PercentageOFPlayers})
Age_Demo_df

Age Group,Total Count,Percentage of Players
<10,17,2.95
10-14,22,3.82
15-19,107,18.58
20-24,258,44.79
25-29,77,13.37
30-34,52,9.03
35-39,31,5.38
40+,12,2.08


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
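
No cell for this section is present in the notebook. The steps above could be sketched as follows. The `sample` frame is a hypothetical stand-in for `purchase_data` with made-up rows, and `age_summary` is an illustrative name. The bin edges and labels match the Age Demographics section.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (same column names, made-up rows).
sample = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dan4"],
    "Age": [22, 22, 17, 23, 41],
    "Price": [2.00, 4.00, 3.00, 1.00, 5.00],
})

age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 155]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample["Age Group"] = pd.cut(sample["Age"], age_bins, labels=group_names)

# observed=True drops age groups that have no rows in the sample.
by_age = sample.groupby("Age Group", observed=True)

age_summary = pd.DataFrame({
    "Purchase Count": by_age["Price"].count(),
    "Average Purchase Price": by_age["Price"].mean().round(2),
    "Total Purchase Value": by_age["Price"].sum(),
})

# The per-person average divides group revenue by unique players, not by purchases.
age_summary["Avg Total Purchase per Person"] = (
    age_summary["Total Purchase Value"] / by_age["SN"].nunique()
).round(2)

print(age_summary)
```

Dividing by `nunique()` of `SN` mirrors the gender analysis earlier in the notebook, where total purchase value is divided by the unique player count for each group.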

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
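
This section also has no cell in the notebook. A minimal sketch of the steps above, assuming purchases are grouped by screen name (`SN`); the `sample` rows are made up and `top_spenders` is an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows).
sample = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Cat3", "Cat3"],
    "Price": [2.00, 4.00, 3.00, 1.00, 1.50, 2.50],
})

# One row per player: purchase count, mean price, and total spent.
top_spenders = (
    sample.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

print(top_spenders.head())
```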



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
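
The notebook has no cell for this section either. One possible sketch of the steps above follows. The `sample` rows and item names are made up, and `item_summary` and `most_popular` are illustrative names. It assumes each item is sold at a single price, so the group mean equals the item price.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows and item names).
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword A", "Sword A", "Sword A", "Axe B", "Axe B", "Bow C"],
    "Price": [1.50, 1.50, 1.50, 4.00, 4.00, 2.00],
})

items = sample[["Item ID", "Item Name", "Price"]]
by_item = items.groupby(["Item ID", "Item Name"])["Price"]

item_summary = pd.DataFrame({
    "Purchase Count": by_item.count(),
    "Item Price": by_item.mean(),  # assumption: one price per item
    "Total Purchase Value": by_item.sum(),
})

most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```

If an item's price varies between purchases in the real data, the mean would hide that, so it may be worth checking `by_item.nunique()` first.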



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
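
A sketch of this last step, assuming a per-item summary table shaped like the one described under Most Popular Items. The table is rebuilt here with made-up values so the example runs on its own; `most_profitable` and `display_table` are illustrative names.

```python
import pandas as pd

# Hypothetical per-item summary (made-up values), indexed by Item ID and Item Name.
item_summary = pd.DataFrame(
    {
        "Purchase Count": [3, 2, 1],
        "Item Price": [1.50, 4.00, 2.00],
        "Total Purchase Value": [4.50, 8.00, 2.00],
    },
    index=pd.MultiIndex.from_tuples(
        [(1, "Sword A"), (2, "Axe B"), (3, "Bow C")],
        names=["Item ID", "Item Name"],
    ),
)

# Sort while the column is still numeric; formatting first would sort strings.
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Apply currency formatting to a copy used only for display.
display_table = most_profitable.copy()
for column in ["Item Price", "Total Purchase Value"]:
    display_table[column] = display_table[column].map("${:,.2f}".format)

print(display_table.head())
```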

