### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic falls between 19-22 (32.6% of purchases), with secondary groups falling between 23-26 (26.5%) and 15-18 (14.5%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [272]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
Heroes_file = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data_df = pd.read_csv(Heroes_file)

## Player Count

* Display the total number of players


In [273]:
# Count distinct screen names so repeat buyers are only counted once
total_players = purchase_data_df["SN"].nunique()
total_players

576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [274]:
unique_items_count = purchase_data_df["Item ID"].nunique()
average_price = round(purchase_data_df["Price"].mean(), 2)
total_purchases = purchase_data_df["Purchase ID"].count()
total_revenue = purchase_data_df["Price"].sum()
# One-row summary table holding the totals computed above
Total_Summary_table = pd.DataFrame({
    "Number of Unique Items": [unique_items_count],
    "Average Purchase Price": [average_price],
    "Total Number of Purchases": [total_purchases],
    "Total Revenue": [total_revenue],
})
Total_Summary_table

,Number of Unique Items,Average Purchase Price,Total Number of Purchases,Total Revenue
0,183,3.05,780,2379.77
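
The optional "cleaner formatting" step can be sketched as follows. This is a minimal illustration, not necessarily the formatting the assignment expects: the values are copied from the output above, and the names `summary` and `formatted` are assumptions introduced here.

```python
import pandas as pd

# Values copied from the summary output above; the variable names are illustrative.
summary = pd.DataFrame({
    "Number of Unique Items": [183],
    "Average Purchase Price": [3.05],
    "Total Number of Purchases": [780],
    "Total Revenue": [2379.77],
})

# Format a copy so the numeric columns stay usable for later calculations.
formatted = summary.copy()
for column in ["Average Purchase Price", "Total Revenue"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted)
```

Formatting a copy is a deliberate choice: once a column holds strings such as `$2,379.77`, it can no longer be summed or sorted numerically.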


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [287]:
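# Note: every count below is per purchase row, so repeat buyers are counted once per purchase
Gender_total = purchase_data_df["Gender"].count()
grouped_gender_df = purchase_data_df.groupby(["Gender"])
# Tuple of (percentage of purchases by gender, number of purchases by gender)
gender = 100 * (grouped_gender_df["SN"].count() / Gender_total), grouped_gender_df["SN"].count()
print(gender)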

(Gender
Female                   14.487179
Male                     83.589744
Other / Non-Disclosed     1.923077
Name: SN, dtype: float64, Gender
Female                   113
Male                     652
Other / Non-Disclosed     15
Name: SN, dtype: int64)
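
The counts above add up to 780, the number of purchases, rather than the 576 unique players. To report demographics per player, one option is to drop duplicate screen names before counting. The sketch below uses a small made-up purchase log; the player names and the `players` and `gender_demographics` variables are hypothetical.

```python
import pandas as pd

# Hypothetical purchase log: one row per purchase, so repeat buyers appear more than once.
purchases = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c", "player_c", "player_d"],
    "Gender": ["Male", "Male", "Female", "Male", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player before counting.
players = purchases.drop_duplicates(subset="SN")

gender_counts = players["Gender"].value_counts()
gender_percent = (gender_counts / len(players) * 100).round(2)

gender_demographics = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_percent,
})
print(gender_demographics)
```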



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [306]:
gender_purchase_count = grouped_gender_df["Purchase ID"].count()
gender_average_price = grouped_gender_df["Price"].mean()
gender_total_purchases = grouped_gender_df["Price"].sum()
# SN.count() is the number of purchase rows, so this ratio equals the average purchase price;
# dividing by grouped_gender_df["SN"].nunique() would give a per-player figure instead.
genderaverage_purchase_total = grouped_gender_df["Price"].sum() / grouped_gender_df["SN"].count()
Gender_Summary_table = pd.DataFrame({
    "Purchase Count": gender_purchase_count,
    "Average Purchase Price": gender_average_price,
    "Total Purchase Value": gender_total_purchases,
    "Average Total Per Person": genderaverage_purchase_total,
})
Gender_Summary_table

Gender,Purchase Count,Average Purchase Price,Total Purchase Value,Average Total Per Person
Female,113,3.203009,361.94,3.203009
Male,652,3.017853,1967.64,3.017853
Other / Non-Disclosed,15,3.346,50.19,3.346
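
In the table above, "Average Total Per Person" is identical to "Average Purchase Price" because both divide by the number of purchases. A per-person average divides the total by the number of distinct players in each group instead. The following is a minimal sketch on made-up data; the sample rows and the `by_gender` and `summary` names are assumptions.

```python
import pandas as pd

# Hypothetical purchases: player_a buys twice, the others once.
purchases = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [1.00, 3.00, 2.50, 4.00],
})

by_gender = purchases.groupby("Gender")
total_value = by_gender["Price"].sum()
unique_players = by_gender["SN"].nunique()  # distinct players, not purchase rows

summary = pd.DataFrame({
    "Purchase Count": by_gender["Price"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": total_value,
    "Average Total Per Person": total_value / unique_players,
})
print(summary)
```

Here the male group has three purchases from two players, so the average purchase price (8.00 / 3) and the average total per person (8.00 / 2) differ.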


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [284]:
# Bin edges are right-inclusive: (0, 10], (10, 14], ..., (42, 100]
age_bins = [0, 10, 14, 18, 22, 26, 30, 34, 38, 42, 100]
age_labels = ["0-10", "11-14", "15-18", "19-22", "23-26", "27-30", "31-34", "35-38", "39-42", "43-100"]
purchase_data_df["Age Groups"] = pd.cut(purchase_data_df["Age"], age_bins, labels=age_labels)
age_group_df = purchase_data_df.groupby(["Age Groups"])
# Number of purchase rows in each age group
age_group = age_group_df["SN"].count()
print(age_group)

Age Groups
0-10       32
11-14      19
15-18     113
19-22     254
23-26     207
27-30      63
31-34      38
35-38      35
39-42      15
43-100      4
Name: SN, dtype: int64
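
The output above counts purchases per age group and does not yet include percentages. A sketch of the remaining steps, counting each player once and adding a rounded percentage column, might look like this. The sample rows are made up, and `players` and `age_demographics` are illustrative names; the bins match the ones used above.

```python
import pandas as pd

# Hypothetical purchase log; player_a appears twice.
purchases = pd.DataFrame({
    "SN": ["player_a", "player_a", "player_b", "player_c", "player_d"],
    "Age": [20, 20, 24, 16, 45],
})

bins = [0, 10, 14, 18, 22, 26, 30, 34, 38, 42, 100]
labels = ["0-10", "11-14", "15-18", "19-22", "23-26",
          "27-30", "31-34", "35-38", "39-42", "43-100"]

# One row per player, then assign each player to an age bin.
players = purchases.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins=bins, labels=labels)

age_counts = players["Age Group"].value_counts(sort=False)
age_percent = (age_counts / len(players) * 100).round(2)

age_demographics = pd.DataFrame({
    "Total Count": age_counts,
    "Percentage of Players": age_percent,
})
print(age_demographics)
```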


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [307]:
age_purchase_count = age_group_df["Purchase ID"].count()
age_average_price = age_group_df["Price"].mean()
age_total_purchases = age_group_df["Price"].sum()
# SN.count() is the number of purchase rows, so this ratio equals the average purchase price;
# dividing by age_group_df["SN"].nunique() would give a per-player figure instead.
ageaverage_purchase_total = age_group_df["Price"].sum() / age_group_df["SN"].count()
Summary_table = pd.DataFrame({
    "Purchase Count": age_purchase_count,
    "Average Purchase Price": age_average_price,
    "Total Purchase Value": age_total_purchases,
    "Average Total Per Person": ageaverage_purchase_total,
})
Summary_table

Age Groups,Purchase Count,Average Purchase Price,Total Purchase Value,Average Total Per Person
0-10,32,3.405,108.96,3.405
11-14,19,2.681579,50.95,2.681579
15-18,113,3.034602,342.91,3.034602
19-22,254,3.038937,771.89,3.038937
23-26,207,3.063961,634.24,3.063961
27-30,63,2.876667,181.23,2.876667
31-34,38,2.728421,103.68,2.728421
35-38,35,3.552857,124.35,3.552857
39-42,15,3.366667,50.5,3.366667
43-100,4,2.765,11.06,2.765


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [390]:
grouped_spenders_df = purchase_data_df.groupby(["SN"])
top_total_purchases = grouped_spenders_df["Price"].sum()
Summary_table = pd.DataFrame({"Total Purchase Value": top_total_purchases})
Summary_table


SN,Total Purchase Value
Adairialis76,2.28
Adastirin33,4.48
Aeda94,4.91
Aela59,4.32
Aelaria33,1.79
Aelastirin39,7.29
Aelidru27,1.09
Aelin32,8.98
Aelly27,6.79
Aellynun67,3.74
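
The table above lists players alphabetically and holds only the total purchase value. The remaining steps (purchase count, average price, and sorting by total in descending order) could be sketched as below. The data is made up and the `top_spenders` name is an assumption.

```python
import pandas as pd

# Hypothetical purchases by three players.
purchases = pd.DataFrame({
    "SN": ["player_a", "player_b", "player_a", "player_c", "player_b", "player_a"],
    "Price": [1.50, 4.00, 2.50, 3.00, 4.50, 1.00],
})

top_spenders = (
    purchases.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

# Preview the biggest spenders first.
print(top_spenders.head())
```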


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [360]:
# Group purchases by item so each group holds every purchase of that item
grouped_items_df = purchase_data_df.groupby(["Item ID"])
# Item price is the mean recorded price (count() would only repeat the purchase count)
Item_price = grouped_items_df["Price"].mean()
Group_Purchase_count = grouped_items_df["Purchase ID"].count()
Group_purchase_total = grouped_items_df["Price"].sum()
Group_Summary_table = pd.DataFrame({
    "Purchase Count": Group_Purchase_count,
    "Item Price": Item_price,
    "Total Purchase Value": Group_purchase_total,
})
Group_Summary_table

Item ID,Purchase Count,Item Price,Total Purchase Value
0,4,1.28,5.12
1,3,3.26,9.78
2,6,2.48,14.88
3,6,2.49,14.94
4,5,1.70,8.50
5,4,4.08,16.32
6,2,3.70,7.40
7,7,1.33,9.31
8,3,3.93,11.79
9,4,2.73,10.92
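
The instructions for this section also ask for grouping by both Item ID and Item Name and for sorting by purchase count. A minimal sketch of that variant follows, using invented items. The `items` and `most_popular` names are assumptions, and taking the `first` price assumes each item sells at a single price.

```python
import pandas as pd

# Hypothetical purchases of three items.
purchases = pd.DataFrame({
    "Item ID": [1, 2, 1, 3, 2, 1],
    "Item Name": ["Sword", "Shield", "Sword", "Bow", "Shield", "Sword"],
    "Price": [2.00, 3.50, 2.00, 4.25, 3.50, 2.00],
})

# Grouping on both keys keeps the item name visible in the index.
items = purchases.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "first", "sum"])
items.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

most_popular = items.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```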


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [369]:
# Sort the summary table (a DataFrame); the GroupBy object itself has no sort_values method
grouped_purchase_df = Group_Summary_table.sort_values("Total Purchase Value", ascending=False)
grouped_purchase_df.head()

In [None]:
# Of the 780 purchases made by 576 unique players, the split between male and female players is highly imbalanced: 83.59% male and 14.49% female.
# The peak age demographic falls within ages 19-22, with secondary groups at 23-26 and 15-18.
