### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* The largest age group is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head(100)

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44
...,...,...,...,...,...,...,...
95,95,Ilassast39,18,Male,164,Exiled Doomblade,1.63
96,96,Lisassala98,16,Male,56,Foul Titanium Battle Axe,2.92
97,97,Aiduecal76,20,Male,134,Undead Crusader,4.50
98,98,Chadossa89,23,Male,132,Persuasion,3.19


In [2]:
purchase_data.tail()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
775,775,Aethedru70,21,Female,60,Wolf,3.54
776,776,Iral74,21,Male,164,Exiled Doomblade,1.63
777,777,Yathecal72,20,Male,67,"Celeste, Incarnation of the Corrupted",3.46
778,778,Sisur91,7,Male,92,Final Critic,4.19
779,779,Ennrian78,24,Male,50,Dawn,4.6


In [3]:
purchase_data.describe()

Unnamed: 0,Purchase ID,Age,Item ID,Price
count,780.0,780.0,780.0,780.0
mean,389.5,22.714103,91.755128,3.050987
std,225.310896,6.659444,52.697702,1.169549
min,0.0,7.0,0.0,1.0
25%,194.75,20.0,47.75,1.98
50%,389.5,22.0,92.0,3.15
75%,584.25,25.0,138.0,4.08
max,779.0,45.0,183.0,4.99


In [4]:
# Select columns by name
purchase_data[['SN', 'Age']]

# Positional indexing: purchase_data.iloc[rows, columns]
#   e.g. purchase_data.iloc[5:10, [1, 2]]          (end position 10 is excluded)
# Label-based indexing: purchase_data.loc[rows, columns]
#   e.g. purchase_data.loc[5:10, ['SN', 'Age']]    (end label 10 is included)
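
The difference between positional and label-based selection is easier to see on a small frame. The sketch below is only an illustration: `toy` and its values are made up and stand in for `purchase_data`.

```python
import pandas as pd

# Hypothetical frame standing in for purchase_data (values are invented)
toy = pd.DataFrame({
    "Purchase ID": range(12),
    "SN": [f"Player{i}" for i in range(12)],
    "Age": [20 + i for i in range(12)],
})

# iloc is positional: rows at positions 5..9 (end excluded), columns 1 and 2
by_position = toy.iloc[5:10, [1, 2]]

# loc is label-based: rows labelled 5..10 (end included), columns by name
by_label = toy.loc[5:10, ["SN", "Age"]]

print(len(by_position), len(by_label))
```

With the default integer index, the two calls select the same columns but `loc` returns one more row, because label slices include their end point.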

## Player Count

* Display the total number of players


In [None]:
purchase_data['SN'].nunique()

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [None]:
# Run basic calculations to obtain number of unique items, average price, etc.
unique_items = purchase_data['Item ID'].nunique()
average_price = purchase_data['Price'].mean()
print(unique_items)
print(average_price)

In [None]:
# Create a summary data frame to hold the results.
# The values are computed from the data rather than typed in by hand.
summary_results = pd.DataFrame({
    'name': ['Number of Unique Items', 'Average Price'],
    'summary': [purchase_data['Item ID'].nunique(), purchase_data['Price'].mean()],
})
print(summary_results)

In [None]:
# Optional: give the displayed data cleaner formatting (round to two decimals)
summary_results.round(2)

In [None]:
# Display the summary data frame
# (describe() would show statistics about the summary, not the summary itself)
summary_results
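
One way to lay out the total purchasing summary is a single row with one column per metric, formatted only for display. This is a sketch under assumptions: the `toy` rows are invented, and the column labels are illustrative choices rather than names required by the assignment.

```python
import pandas as pd

# Hypothetical rows; the notebook would use purchase_data instead
toy = pd.DataFrame({
    "Item ID": [1, 2, 2, 3],
    "Price": [1.00, 2.50, 2.50, 4.00],
})

# One row, one column per metric
summary = pd.DataFrame({
    "Number of Unique Items": [toy["Item ID"].nunique()],
    "Average Price": [toy["Price"].mean()],
    "Number of Purchases": [len(toy)],
    "Total Revenue": [toy["Price"].sum()],
})

# Format a copy so the numeric values in `summary` stay usable
formatted = summary.copy()
for col in ["Average Price", "Total Revenue"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)

print(formatted)
```

Formatting a copy keeps the original numbers available for later calculations, since the formatted columns become strings.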

## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [5]:
# count() tallies every row, so this is the number of purchases, not of unique players
total_purchases = purchase_data['Gender'].count()
print(total_purchases)

780


In [6]:
# Purchase counts for every gender (despite its name, Male_Players holds all genders)
Male_Players = purchase_data['Gender'].value_counts()
print(Male_Players)


Male                     652
Female                   113
Other / Non-Disclosed     15
Name: Gender, dtype: int64


In [32]:
# Counts and percentage shares by gender, computed per purchase row
gender_counts = purchase_data["Gender"].value_counts()
total_gender = gender_counts.sum()
perc_gender = gender_counts / total_gender * 100

# Combine both series into one table and format the percentages for display
demo_df = pd.concat([perc_gender, gender_counts], axis=1)
demo_df.columns = ["Percentage of Players", "Total Counts"]
demo_df["Percentage of Players"] = demo_df["Percentage of Players"].map("{:,.2f}%".format)
demo_df

Unnamed: 0,Percentage of Players,Total Counts
Male,83.59%,652
Female,14.49%,113
Other / Non-Disclosed,1.92%,15
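
The table above counts purchase rows, so a player who bought several items is counted several times. If the goal is a count of players, one possible approach is to drop repeat `SN` values first. The sketch below uses invented data and assumes each `SN` always appears with the same gender.

```python
import pandas as pd

# Hypothetical purchases: player "A" bought twice
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C", "D"],
    "Gender": ["Male", "Male", "Female", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player before counting
players = toy.drop_duplicates(subset="SN")
player_counts = players["Gender"].value_counts()
player_pct = (player_counts / player_counts.sum() * 100).round(2)

player_demo = pd.DataFrame({
    "Total Count": player_counts,
    "Percentage of Players": player_pct,
})
print(player_demo)
```

In this toy data there are three male purchase rows but only two male players, which is the distinction the deduplication step captures.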



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [33]:
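# Purchase count, total spend, and average purchase price per gender
gender_counts = purchase_data["Gender"].value_counts()
gender_spend = purchase_data.groupby("Gender")["Price"].sum()
avg_purchase = gender_spend / gender_counts

# Name each column explicitly so the table is self-describing
pag_df = pd.concat([gender_counts, avg_purchase, gender_spend], axis=1)
pag_df.columns = ["Purchase Count", "Average Purchase Price", "Total Purchase Value"]

pag_df

Unnamed: 0,Purchase Count,Average Purchase Price,Total Purchase Value
Male,652,3.017853,1967.64
Female,113,3.203009,361.94
Other / Non-Disclosed,15,3.346,50.19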
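
The instructions for this section also ask for the average purchase total per person, which divides each gender's total spend by its number of unique players rather than by its number of purchases. A minimal sketch on invented data, with illustrative column labels:

```python
import pandas as pd

# Hypothetical purchases: player "A" bought twice
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [1.00, 3.00, 2.00, 4.00],
})

by_gender = toy.groupby("Gender")

gender_summary = pd.DataFrame({
    "Purchase Count": by_gender["Price"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})

# Per-person average: total spend divided by unique players in the group
gender_summary["Avg Total Purchase per Person"] = (
    gender_summary["Total Purchase Value"] / by_gender["SN"].nunique()
)

print(gender_summary)
```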


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [35]:
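# Establish bins for ages; each upper edge sits just below the next whole age
age_bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Categorize every purchase row by age group
purchase_data["Age Group"] = pd.cut(purchase_data["Age"], age_bins, labels=group_names)
age_grouped_df = purchase_data.groupby(["Age Group"])

# Displaying a GroupBy object shows only its repr; an aggregation such as .size() produces a table
age_grouped_df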

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x0000025E2FC59B38>
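
To turn the binned ages into the counts and percentages the section asks for, the grouped data still needs an aggregation. The sketch below reuses the notebook's `age_bins` and `group_names` on invented rows. Counting one row per `SN` is an assumption about what "players" means here.

```python
import pandas as pd

age_bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Hypothetical purchases: player "A" bought twice
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C", "D"],
    "Age": [22, 22, 17, 24, 41],
})

# One row per player, then assign each player to an age group
players = toy.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(players["Age"], age_bins, labels=group_names)

# observed=False keeps empty age groups in the result with a count of 0
age_counts = players.groupby("Age Group", observed=False).size()
age_pct = (age_counts / age_counts.sum() * 100).round(2)

age_demo = pd.DataFrame({
    "Total Count": age_counts,
    "Percentage of Players": age_pct,
})
print(age_demo)
```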

## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
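
The steps above could be sketched as follows. The data is invented, the bins are the ones defined in the Age Demographics cell, and the column labels are illustrative.

```python
import pandas as pd

age_bins = [0, 9.90, 14.90, 19.90, 24.90, 29.90, 34.90, 39.90, 99999]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Hypothetical purchases
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Age": [22, 22, 17, 24],
    "Price": [1.00, 3.00, 2.00, 4.00],
})
toy["Age Group"] = pd.cut(toy["Age"], age_bins, labels=group_names)

# observed=True drops age groups with no purchases, avoiding division by zero
by_age = toy.groupby("Age Group", observed=True)

age_summary = pd.DataFrame({
    "Purchase Count": by_age["Price"].count(),
    "Average Purchase Price": by_age["Price"].mean(),
    "Total Purchase Value": by_age["Price"].sum(),
})
age_summary["Avg Total Purchase per Person"] = (
    age_summary["Total Purchase Value"] / by_age["SN"].nunique()
)

print(age_summary.round(2))
```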

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
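
A possible implementation of these steps groups by `SN`, aggregates `Price`, and sorts by total spend. The rows below are invented, and the preview size of five is an arbitrary choice.

```python
import pandas as pd

# Hypothetical purchases
toy = pd.DataFrame({
    "SN": ["A", "A", "B", "C", "C", "C"],
    "Price": [1.00, 3.00, 5.00, 1.00, 1.00, 1.00],
})

# Count, mean, and sum of Price for each player
spenders = (
    toy.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
)

# Highest total spend first, then keep a short preview
top_spenders = spenders.sort_values("Total Purchase Value", ascending=False).head(5)
print(top_spenders)
```

Note that the player with the most purchases ("C") is not the top spender here, which is why the sort uses total value rather than count.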



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
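
One way to carry out these steps is sketched below on invented data. Because the same item can appear at more than one price in the purchase data (for example, "Final Critic" is listed at both 4.88 and 4.19), the sketch reports the mean price per item. That choice is an assumption, not something the assignment specifies.

```python
import pandas as pd

# Hypothetical purchases; item names and prices are invented
toy = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Item A", "Item A", "Item A", "Item B", "Item B", "Item C"],
    "Price": [1.50, 1.50, 1.50, 4.00, 4.00, 3.00],
})

# Retrieve only the columns needed for the item analysis
items = toy[["Item ID", "Item Name", "Price"]]

item_summary = (
    items.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

# Most frequently purchased items first
popular = item_summary.sort_values("Purchase Count", ascending=False)
print(popular.head())
```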



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
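
The same per-item table can be re-sorted by total value to find the most profitable items. The sketch below rebuilds a small per-item summary from invented data so that it runs on its own. The dollar formatting is applied to a copy after sorting, because formatted values are strings and would no longer sort numerically.

```python
import pandas as pd

# Hypothetical purchases; item names and prices are invented
toy = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Item A", "Item A", "Item A", "Item B", "Item B", "Item C"],
    "Price": [1.50, 1.50, 1.50, 4.00, 4.00, 3.00],
})

item_summary = (
    toy.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

# Sort numerically first, format afterwards
profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

formatted = profitable.copy()
for col in ["Item Price", "Total Purchase Value"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)

print(formatted.head())
```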

