### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_df = pd.read_csv(file)

## Player Count

* Display the total number of players


In [2]:
# Each player is identified by a screen name (SN), so count the distinct values
player_count = purchase_df["SN"].nunique()
print(player_count, "Players in this game.")

576 Players in this game.
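
The count above relies on distinct screen names rather than on the number of rows, because one player can make several purchases. A minimal sketch of the difference, using a made-up toy frame (the names and prices are hypothetical, not taken from `purchase_data.csv`):

```python
import pandas as pd

# Hypothetical purchases: "Ann1" bought twice, so rows != players
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3"],
    "Price": [1.50, 2.00, 3.25, 4.00],
})

total_purchases = len(toy)           # one row per purchase
total_players = toy["SN"].nunique()  # distinct screen names

print(total_purchases, "purchases by", total_players, "players")
```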


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
# Purchase-level totals across the whole data set
unique_items = purchase_df['Item Name'].nunique()
item_sum = purchase_df['Price'].sum()
item_mean = purchase_df['Price'].mean()

# Note: this display option is global, so every float shown in later cells
# (ages, describe() counts, percentages) is also rendered with a dollar sign
pd.options.display.float_format = '${:,.2f}'.format

# One-row summary frame; display it directly rather than calling describe()
new_df = pd.DataFrame({
    "Number of Unique Items": [unique_items],
    "Total of All Transactions": [item_sum],
    "Average Cost per Item": [round(item_mean, 2)],
})
new_df

   Number of Unique Items  Total of All Transactions  Average Cost per Item
0                     179                  $2,379.77                  $3.05
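
Because `pd.options.display.float_format` applies to every float in the notebook, one alternative is to format only the money columns of the summary frame. The sketch below shows that idea on a hypothetical three-row frame; the item names and prices are invented for illustration.

```python
import pandas as pd

# Hypothetical purchases (not the real data)
toy = pd.DataFrame({
    "Item Name": ["Sword", "Shield", "Sword"],
    "Price": [2.50, 4.00, 2.50],
})

summary = pd.DataFrame({
    "Number of Unique Items": [toy["Item Name"].nunique()],
    "Total of All Transactions": [toy["Price"].sum()],
    "Average Cost per Item": [toy["Price"].mean()],
})

# Format only the money columns; the item count stays an integer
formatted = summary.copy()
for col in ["Total of All Transactions", "Average Cost per Item"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)

print(formatted)
```

The formatted columns become strings, so this step belongs at the very end, after all arithmetic is done.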


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [18]:
# Count purchase rows per gender. Note: these are purchases, not unique players;
# dropping duplicate SN values first would give per-player figures.
sex_df3 = purchase_df.groupby("Gender")[["SN"]].count().rename(columns={"SN": "Gender Count"})
# Share of all purchases; it prints with a dollar sign because of the global float_format set earlier
sex_df3["Percentage"] = sex_df3["Gender Count"] / len(purchase_df) * 100
print(sex_df3)


                       Gender Count  Percentage
Gender                                         
Female                          113      $14.49
Male                            652      $83.59
Other / Non-Disclosed            15       $1.92
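
The counts above are purchase rows (113 + 652 + 15 = 780), so a player who buys several items is counted several times. A minimal sketch of a per-player version, assuming one gender per screen name and using a hypothetical toy frame:

```python
import pandas as pd

# Hypothetical data: "Ann1" made three purchases, the others one each
toy = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Ann1", "Bob2", "Cy3"],
    "Gender": ["Female", "Female", "Female", "Male", "Male"],
})

# Keep one row per player before counting
players = toy.drop_duplicates(subset="SN")
gender_counts = players["Gender"].value_counts()
gender_pct = (gender_counts / len(players) * 100).round(2)

per_player = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_pct,
})
print(per_player)
```

In this toy frame 60% of the purchases come from the female player, but she is only one of three players, so the per-player share is about 33%.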



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [6]:
pur_gen = purchase_df[['Gender', 'Price', 'Item Name', 'SN']]
pur_gen_group = pur_gen.groupby('Gender')  # purchases grouped by gender
pur_gen_mean = pur_gen_group.mean(numeric_only=True)  # average price per gender (text columns skipped)
pur_gen_count = pur_gen_group['Price'].count()  # number of purchases per gender
av_spent = pur_gen_group['Price'].mean()  # average purchase price per gender
unq_gen = pur_gen_group['Item Name'].nunique()  # number of unique items purchased by each gender

pur_gen_mean["Num of Purchase"] = pur_gen_count
pur_gen_mean["Average Spent"] = av_spent
pur_gen_mean["Unique Items"] = unq_gen
# describe() summarizes across the three gender rows; display pur_gen_mean itself for per-gender values
pur_gen_mean.describe()

       Price  Num of Purchase  Average Spent  Unique Items
count  $3.00            $3.00          $3.00         $3.00
mean   $3.19          $260.00          $3.19        $93.67
std    $0.16          $343.00          $0.16        $82.56
min    $3.02           $15.00          $3.02        $13.00
25%    $3.11           $64.00          $3.11        $51.50
50%    $3.20          $113.00          $3.20        $90.00
75%    $3.27          $382.50          $3.27       $134.00
max    $3.35          $652.00          $3.35       $178.00
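
The per-gender figures the instructions ask for (purchase count, average purchase price, average total per person) can be computed in a single `groupby(...).agg(...)` call. This is only a sketch on hypothetical data; the output column names are illustrative choices, not names from the original notebook.

```python
import pandas as pd

# Hypothetical purchases (not the real data)
toy = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cy3"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Price": [1.00, 3.00, 2.00, 4.00],
})

# Named aggregation: one output column per (source column, function) pair
by_gender = toy.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    avg_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Average total spent per distinct player in each gender group
by_gender["avg_total_per_person"] = (
    by_gender["total_purchase_value"] / by_gender["players"]
)
print(by_gender)
```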


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [8]:
bins = [0, 9, 14, 19, 24, 29, 34, 39, 100]
age_groups = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Work on a copy so the binning column is not added to purchase_df itself
demographics = purchase_df.copy()

demographics["Demographics"] = pd.cut(demographics["Age"], bins, labels=age_groups)

demographics_df = demographics.groupby("Demographics", observed=False)
demographics_df1 = demographics_df.mean(numeric_only=True)  # skip text columns
demographics_df1['Age'] = demographics_df1['Age'].round(0)
demographics_df1['Count of Purchases'] = demographics_df['SN'].count()
demographics_df1['Individuals'] = demographics_df['SN'].nunique()
# "Average Purchase" is the number of purchases per individual, not a dollar amount
demographics_df1['Average Purchase'] = demographics_df1['Count of Purchases'] / demographics_df1['Individuals']
demographics_df2 = demographics_df1[['Age', 'Price', 'Count of Purchases', 'Individuals', 'Average Purchase']]
demographics_df2

                 Age  Price  Count of Purchases  Individuals  Average Purchase
Demographics
<10            $8.00  $3.35                  23           17             $1.35
10-14         $11.00  $2.96                  28           22             $1.27
15-19         $17.00  $3.04                 136          107             $1.27
20-24         $22.00  $3.05                 365          258             $1.41
25-29         $26.00  $2.90                 101           77             $1.31
30-34         $31.00  $2.93                  73           52             $1.40
35-39         $37.00  $3.60                  41           31             $1.32
40+           $42.00  $2.94                  13           12             $1.08
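
The instructions also ask for percentages by age group. With the Individuals column above, that is each group's count divided by the 576 players, for example 258 / 576 ≈ 44.8% for 20-24. A minimal sketch of the binning and percentage steps, using the same bins on a hypothetical four-player frame:

```python
import pandas as pd

# Hypothetical players, one row each (not the real data)
players = pd.DataFrame({
    "SN": ["a", "b", "c", "d"],
    "Age": [9, 10, 22, 24],
})

bins = [0, 9, 14, 19, 24, 29, 34, 39, 100]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Bins are right-inclusive by default, so age 9 falls in "<10" and 24 in "20-24"
players["Age Group"] = pd.cut(players["Age"], bins, labels=labels)

# sort=False keeps the groups in bin order; empty groups show a count of 0
counts = players["Age Group"].value_counts(sort=False)
pct = (counts / len(players) * 100).round(2)

age_summary = pd.DataFrame({"Total Count": counts, "Percentage of Players": pct})
print(age_summary)
```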


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [9]:
demographics_df2.describe()

          Age  Price  Count of Purchases  Individuals  Average Purchase
count   $8.00  $8.00               $8.00        $8.00             $8.00
mean   $24.25  $3.10              $97.50       $72.00             $1.30
std    $12.09  $0.25             $116.11       $81.99             $0.10
min     $8.00  $2.90              $13.00       $12.00             $1.08
25%    $15.50  $2.94              $26.75       $20.75             $1.27
50%    $24.00  $3.00              $57.00       $41.50             $1.32
75%    $32.50  $3.13             $109.75       $84.50             $1.37
max    $42.00  $3.60             $365.00      $258.00             $1.41


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [150]:
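sn_grouped = purchase_df.groupby("SN")

# Average age and average price per player (text columns skipped)
sn_df = sn_grouped.mean(numeric_only=True)
sn_df["Number of Purchases"] = sn_grouped['SN'].count()
# Ranked by purchase count; ranking by total spend would use sn_grouped['Price'].sum() instead
best_cust = sn_df.sort_values("Number of Purchases", ascending=False)

sn_df1 = best_cust[["Age", "Price", "Number of Purchases"]]
sn_df1.head()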

                Age  Price  Number of Purchases
SN
Lisosia93    $25.00  $3.79                    5
Iral74       $21.00  $3.40                    4
Idastidru52  $24.00  $3.86                    4
Asur53       $26.00  $2.48                    3
Inguron55    $23.00  $3.70                    3
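
The table above ranks players by number of purchases, while the instructions ask for a sort on total purchase value. The two rankings can differ, as this sketch on hypothetical data shows (the screen names and prices are invented):

```python
import pandas as pd

# Hypothetical purchases: "Cy3" buys most often, "Bob2" spends the most
toy = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cy3", "Cy3", "Cy3"],
    "Price": [4.50, 4.00, 9.00, 1.00, 1.50, 2.00],
})

spenders = toy.groupby("SN")["Price"].agg(["count", "mean", "sum"])
spenders.columns = ["Purchase Count", "Average Purchase Price", "Total Purchase Value"]

# Sort on the total spent rather than on the purchase count
top_spenders = spenders.sort_values("Total Purchase Value", ascending=False)
print(top_spenders.head())
```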


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [11]:
item_grouped = purchase_df.groupby("Item Name")

item_df = item_grouped.mean(numeric_only=True)  # skip text columns
item_df["Times Purchased"] = item_grouped['SN'].count()
# unique() returns an array of prices per item, so the next two columns have
# object dtype and describe() leaves them out of its numeric summary
item_df["Item Price"] = item_grouped['Price'].unique()
item_df["Total Purchase Value"] = item_df["Times Purchased"] * item_df["Item Price"]
item_df1 = item_df[["Item ID", "Times Purchased", "Item Price", "Total Purchase Value"]]
item_df1.describe()

       Item ID  Times Purchased
count  $179.00          $179.00
mean    $91.74            $4.36
std     $52.84            $2.04
min      $0.00            $1.00
25%     $47.50            $3.00
50%     $91.00            $4.00
75%    $137.50            $5.00
max    $183.00           $13.00
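
The instructions call for grouping by both Item ID and Item Name and then sorting by purchase count. A minimal sketch of that approach on hypothetical data follows; it takes the mean price as the item price and the sum of prices as the total purchase value, which keeps every column numeric.

```python
import pandas as pd

# Hypothetical purchases (not the real data)
toy = pd.DataFrame({
    "Item ID": [1, 1, 2, 3, 3, 3],
    "Item Name": ["Sword", "Sword", "Shield", "Bow", "Bow", "Bow"],
    "Price": [2.00, 2.00, 5.00, 1.50, 1.50, 1.50],
})

# One row per (Item ID, Item Name) pair with count, price and total value
items = toy.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "mean", "sum"])
items.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

most_popular = items.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```

Grouping on the ID as well as the name keeps items that share a name but have different IDs in separate rows.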


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



                                                       Purchase Count  Item Price  Total Purchase Value
Item ID  Item Name
178      Oathbreaker, Last Hope of the Breaking Storm              12       $4.23                $50.76
82       Nirvana                                                    9       $4.90                $44.10
145      Fiery Glass Crusader                                       9       $4.58                $41.22
92       Final Critic                                               8       $4.88                $39.04
103      Singed Scalpel                                             8       $4.35                $34.80
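
Each total in the table is the purchase count times the item price, for example 12 × $4.23 = $50.76 for the first row. The sketch below applies the same calculation and sort to a hypothetical summary frame, then formats the money columns for display only; the item names and numbers are invented.

```python
import pandas as pd

# Hypothetical per-item summary (not the real data)
summary = pd.DataFrame(
    {"Purchase Count": [3, 2, 1], "Item Price": [1.50, 2.00, 5.00]},
    index=pd.Index(["Bow", "Sword", "Shield"], name="Item Name"),
)
summary["Total Purchase Value"] = summary["Purchase Count"] * summary["Item Price"]

# Most profitable first
most_profitable = summary.sort_values("Total Purchase Value", ascending=False)

# Format a copy so the numeric frame stays usable for further calculations
display_df = most_profitable.copy()
for col in ["Item Price", "Total Purchase Value"]:
    display_df[col] = display_df[col].map("${:,.2f}".format)

print(display_df.head())
```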
