### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [95]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [230]:
#Calculate the number of unique players
#Use the unique function and then use the len function on that value
player_count = len(purchase_data["SN"].unique())
player_count


576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [327]:
#Display a list of all unique items 
#Use the value_counts function
unique_items = purchase_data["Item Name"].value_counts()
unique_items.head()


Final Critic                                    13
Oathbreaker, Last Hope of the Breaking Storm    12
Fiery Glass Crusader                             9
Persuasion                                       9
Extraction, Quickblade Of Trembling Hands        9
Name: Item Name, dtype: int64

In [231]:
#Calculate the number of unique items
#Use the unique function and then use the len function on that value
item_count = len(purchase_data["Item Name"].unique())
item_count 

179

In [86]:
#Calculate the total average price of all items
#Use the mean function for the 'Price' column
total_average_price = purchase_data["Price"].mean()
total_average_price



3.050987179487176

In [90]:
#Calculate the total number of purchases made
#This will be the length of all entries in the first column
number_of_purchases = len(purchase_data["Purchase ID"])
number_of_purchases


780

In [240]:
#Calculate the total revenue 
#by multiplying (the total average price of all items) by (the total number of purchases made)
total_revenue = total_average_price*number_of_purchases
total_revenue 


2379.7699999999973

In [244]:
#Create a summary DataFrame
summary_df = pd.DataFrame([{"Number of Unique Items": item_count, "Average Price": total_average_price, "Number of Purchases": number_of_purchases, "Total Revenue": total_revenue}])
reorganized_summary_df = summary_df[["Number of Unique Items", "Average Price", "Number of Purchases", "Total Revenue"]]

reorganized_summary_df


| | Number of Unique Items | Average Price | Number of Purchases | Total Revenue |
|---|---|---|---|---|
| 0 | 179 | 3.050987 | 780 | 2379.77 |
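
The total revenue above is derived as mean price times purchase count, which reproduces the sum only up to floating-point rounding (hence `2379.7699999999973`). The following is a minimal sketch of an alternative that sums `Price` directly and formats the currency columns for display. The `purchases` frame is a made-up stand-in for `purchase_data`, and `display_summary` is a hypothetical name.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (same column names, made-up rows)
purchases = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "Item Name": ["Sword", "Shield", "Shield", "Bow"],
    "Price": [2.00, 3.50, 3.50, 1.25],
})

summary = pd.DataFrame([{
    "Number of Unique Items": purchases["Item Name"].nunique(),
    "Average Price": purchases["Price"].mean(),
    "Number of Purchases": len(purchases),
    "Total Revenue": purchases["Price"].sum(),  # summed directly, not mean * count
}])

# Format a copy for display so the numeric values stay usable for later math
display_summary = summary.copy()
for col in ["Average Price", "Total Revenue"]:
    display_summary[col] = display_summary[col].map("${:,.2f}".format)

print(display_summary)
```

Formatting a copy rather than the original keeps `summary` numeric, which matters if the values are reused in later cells.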


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [264]:
#Calculate a total by gender
grouped_purchase_data = purchase_data["Gender"].value_counts()
grouped_purchase_data


Male                     652
Female                   113
Other / Non-Disclosed     15
Name: Gender, dtype: int64

In [270]:
#Calculate a total amount of all entries to later calculate a percentage value
total_grouped_purchases = purchase_data["Gender"].count()
total_grouped_purchases


780

In [271]:
#Calculate a percentage by gender and round off
percentage_purchase_data = ((grouped_purchase_data)/total_grouped_purchases)*100
percentage_purchase_data


Male                     83.589744
Female                   14.487179
Other / Non-Disclosed     1.923077
Name: Gender, dtype: float64

In [272]:
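# Create a DataFrame of gender counts and percentages from a list of dictionaries
## Had trouble building this programmatically, so the numeric values were entered manually from the outputs above (percentages rounded to whole numbers)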
gender_summary = pd.DataFrame([{"Gender": "Male", "Total Count": 652, "Percentage": 84.0}, {"Gender": "Female", "Total Count": 113, "Percentage": 14.0}, {"Gender": "Other/Non-Disclosed", "Total Count": 15, "Percentage": 2.0}])

organized_gender_summary = gender_summary[["Gender","Total Count","Percentage"]]
organized_gender_summary.head()

| | Gender | Total Count | Percentage |
|---|---|---|---|
| 0 | Male | 652 | 84.0 |
| 1 | Female | 113 | 14.0 |
| 2 | Other/Non-Disclosed | 15 | 2.0 |
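
The counts above are taken over all 780 purchase rows, so repeat buyers are counted more than once, and the table values were typed in by hand. The sketch below shows one way the same table could be derived from the data while counting each player once. It uses a small made-up frame in place of `purchase_data` and assumes every `SN` maps to a single gender.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows)
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Gender": ["Female", "Male", "Female",
               "Other / Non-Disclosed", "Male", "Male"],
})

# Keep one row per player so repeat buyers are not counted twice
players = purchases.drop_duplicates(subset="SN")

counts = players["Gender"].value_counts()
gender_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage": (counts / counts.sum() * 100).round(2),
})

print(gender_summary)
```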



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [326]:
# Show multiple specific columns--note the extra brackets
female_purchase_data = purchase_data.loc[purchase_data["Gender"]=="Female",["SN", "Item Name","Price"]]
female_purchase_data.head()


| | SN | Item Name | Price |
|---|---|---|---|
| 15 | Lisassa64 | Deadline, Voice Of Subtlety | 2.89 |
| 18 | Reunasu60 | Nirvana | 4.9 |
| 38 | Reulae52 | Renewed Skeletal Katana | 4.18 |
| 41 | Assosia88 | Thorn, Satchel of Dark Souls | 1.33 |
| 55 | Phaelap26 | Arcane Gem | 3.79 |


In [296]:
female_purchase_count = female_purchase_data["Item Name"].count()
female_purchase_count


113

In [297]:
female_total_purchase_value = female_purchase_data["Price"].sum()
female_total_purchase_value


361.94

In [298]:
average_female_purchase_value = (female_total_purchase_value)/(female_purchase_count)
average_female_purchase_value


3.203008849557522

In [317]:
unique_female_purchasers = len(female_purchase_data["SN"].unique())
unique_female_purchasers


81

In [318]:
average_total_purchase_per_female = (female_total_purchase_value)/(unique_female_purchasers)
average_total_purchase_per_female


4.468395061728395

In [324]:
male_purchase_data = purchase_data.loc[purchase_data["Gender"]=="Male",["SN", "Item Name","Price"]]
male_purchase_data.head()


| | SN | Item Name | Price |
|---|---|---|---|
| 0 | Lisim78 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | Lisovynya38 | Frenzied Scimitar | 1.56 |
| 2 | Ithergue48 | Final Critic | 4.88 |
| 3 | Chamassasya86 | Blindscythe | 3.27 |
| 4 | Iskosia90 | Fury | 1.44 |


In [301]:
male_purchase_count = male_purchase_data["Item Name"].count()
male_purchase_count


652

In [302]:
male_total_purchase_value = male_purchase_data["Price"].sum()
male_total_purchase_value


1967.64

In [304]:
average_male_purchase_value = (male_total_purchase_value)/(male_purchase_count)
average_male_purchase_value


3.0178527607361967

In [320]:
unique_male_purchasers = len(male_purchase_data["SN"].unique())
unique_male_purchasers


484

In [321]:
average_total_purchase_per_male = (male_total_purchase_value)/(unique_male_purchasers)
average_total_purchase_per_male


4.065371900826446

In [323]:
othernondisclosed_purchase_data = purchase_data.loc[purchase_data["Gender"]=="Other / Non-Disclosed",["SN", "Item Name","Price"]]
othernondisclosed_purchase_data.head()


| | SN | Item Name | Price |
|---|---|---|---|
| 9 | Chanosian48 | Ghastly Adamantite Protector | 3.58 |
| 22 | Siarithria38 | Warped Fetish | 3.81 |
| 82 | Haerithp41 | Azurewrath | 4.4 |
| 111 | Sundim98 | Orbit | 4.75 |
| 228 | Jiskirran77 | Dreamsong | 3.39 |


In [306]:
othernondisclosed_purchase_count = othernondisclosed_purchase_data["Item Name"].count()
othernondisclosed_purchase_count


15

In [307]:
othernondisclosed_total_purchase_value = othernondisclosed_purchase_data["Price"].sum()
othernondisclosed_total_purchase_value


50.19

In [308]:
average_othernondisclosed_purchase_value = (othernondisclosed_total_purchase_value)/(othernondisclosed_purchase_count)
average_othernondisclosed_purchase_value


3.3459999999999996

In [328]:
unique_othernondisclosed_purchasers = len(othernondisclosed_purchase_data["SN"].unique())
unique_othernondisclosed_purchasers


11

In [329]:
average_total_purchase_per_othernondisclosed = (othernondisclosed_total_purchase_value)/(unique_othernondisclosed_purchasers)
average_total_purchase_per_othernondisclosed


4.5627272727272725

In [330]:
#Create a DataFrame to display the generated gender data

gender_dataframe_df = pd.DataFrame({"Gender": ["Female", "Male", "Other/Non-Disclosed"], "Purchase Count": [female_purchase_count, male_purchase_count, othernondisclosed_purchase_count], "Average Purchase Price": [average_female_purchase_value, average_male_purchase_value, average_othernondisclosed_purchase_value], "Total Purchase Value":[female_total_purchase_value, male_total_purchase_value, othernondisclosed_total_purchase_value], "Av Total Purchase Per Person":[average_total_purchase_per_female, average_total_purchase_per_male, average_total_purchase_per_othernondisclosed]})
gender_dataframe_df


| | Gender | Purchase Count | Average Purchase Price | Total Purchase Value | Av Total Purchase Per Person |
|---|---|---|---|---|---|
| 0 | Female | 113 | 3.203009 | 361.94 | 4.468395 |
| 1 | Male | 652 | 3.017853 | 1967.64 | 4.065372 |
| 2 | Other/Non-Disclosed | 15 | 3.346 | 50.19 | 4.562727 |
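
The per-gender cells above repeat the same five calculations three times. As a sketch of a more compact route, a single `groupby` with named aggregation can produce all the columns at once. The data below is made up, and the snake_case column names are illustrative rather than the notebook's own.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows)
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Gender": ["Female", "Male", "Female",
               "Other / Non-Disclosed", "Male", "Male"],
    "Price": [2.00, 3.50, 3.50, 1.25, 2.00, 3.50],
})

# One pass over the data: each keyword is output_column=(input_column, function)
by_gender = purchases.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    average_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
    unique_players=("SN", "nunique"),
)

# Average total spend per person = total value / number of distinct players
by_gender["avg_total_per_person"] = (
    by_gender["total_purchase_value"] / by_gender["unique_players"]
)

print(by_gender)
```

Grouping also picks up any gender value present in the data, so no category has to be hard-coded.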


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [340]:
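#Create bins and group names
#The last bin edge is the maximum age printed below (45)
#Note: pd.cut intervals are right-inclusive by default, so the first bin (0, 10] also places age 10 under "<10"; an edge of 9 would match the labels
bins = [0, 10, 14, 19, 24, 29, 34, 39, 45]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]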

print(purchase_data["Age"].max())
print(purchase_data["Age"].min())


45
7
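
Because `pd.cut` treats each interval as open on the left and closed on the right by default, the bin edges defined above send age 10 to the `<10` label. The sketch below compares those edges with an adjusted set on a few made-up ages. The adjusted edges are an assumption about the intended grouping, not something taken from the notebook.

```python
import pandas as pd

# A few made-up ages that sit on or near the bin edges
ages = pd.Series([7, 9, 10, 14, 15, 45])

group_names = ["<10", "10-14", "15-19", "20-24",
               "25-29", "30-34", "35-39", "40+"]

# Edges as defined in the notebook: first interval is (0, 10]
original_bins = [0, 10, 14, 19, 24, 29, 34, 39, 45]
original = pd.cut(ages, original_bins, labels=group_names)

# Assumed fix: first edge at 9, so (0, 9] is "<10" and (9, 14] is "10-14"
adjusted_bins = [0, 9, 14, 19, 24, 29, 34, 39, 45]
adjusted = pd.cut(ages, adjusted_bins, labels=group_names)

print(pd.DataFrame({"Age": ages, "original": original, "adjusted": adjusted}))
```

Only the row for age 10 differs between the two columns, so switching edges would move purchases between the `<10` and `10-14` groups and leave the others unchanged.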


In [344]:
#Place the newly formed series as a new column into the original dataframe
purchase_data["Age Group"] = pd.cut(purchase_data["Age"], bins, labels = group_names)
purchase_data.head()


| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price | Age Group |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 | 20-24 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 | 40+ |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 | 20-24 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 | 20-24 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 | 20-24 |


In [374]:
#Create a groupby object based upon "Age Group"
purchase_group = purchase_data.groupby("Age Group")


In [385]:
#Count of purchases made by age group bin
purchase_group["Item Name"].count()

Age Group
<10       32
10-14     19
15-19    136
20-24    365
25-29    101
30-34     73
35-39     41
40+       13
Name: Item Name, dtype: int64
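
The count above tallies purchases per age group rather than players. For the demographics table the section asks for, one possible approach is to drop repeat purchases first and then count and convert to percentages. This sketch uses made-up rows, and its first bin edge is set to 9 (an assumption that differs from the notebook's 10) so that age 10 falls under `10-14`.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows)
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Age": [8, 22, 8, 40, 17, 22],
})

bins = [0, 9, 14, 19, 24, 29, 34, 39, 45]  # assumed edges, see note above
labels = ["<10", "10-14", "15-19", "20-24",
          "25-29", "30-34", "35-39", "40+"]

# One row per player, then assign each player to an age group
players = purchases.drop_duplicates(subset="SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins, labels=labels)

# sort_index orders the categorical index by bin order, not by count
counts = players["Age Group"].value_counts().sort_index()
age_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / len(players) * 100).round(2),
})

print(age_demographics)
```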

In [384]:
#Total Purchase Price paid by age in bin
purchase_group["Price"].sum()


Age Group
<10       108.96
10-14      50.95
15-19     412.89
20-24    1114.06
25-29     293.00
30-34     214.00
35-39     147.67
40+        38.24
Name: Price, dtype: float64

In [None]:
#Average Total purchase per person
#NEED HELP ON THIS ONE

## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [396]:
#Create DataFrame
#Inputting data manually after generating them above -- NEED HELP ON THIS ONE
age_dataframe_df = pd.DataFrame({"Age": ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"], "Purchase Count": [32,19,136,365,101,73,41,13], "Average Purchase Price":[3.405000,2.681579,3.035956,3.052219,2.900990,2.931507,3.601707,2.941538], "Total Purchase Value":[108.96,50.95,412.89,1114.06,293.00,214,147.67,38.24], "Average Total Purchase Per Person":["na","na","na","na","na","na","na","na"]})
age_dataframe_df


| | Age | Purchase Count | Average Purchase Price | Total Purchase Value | Average Total Purchase Per Person |
|---|---|---|---|---|---|
| 0 | <10 | 32 | 3.405 | 108.96 | na |
| 1 | 10-14 | 19 | 2.681579 | 50.95 | na |
| 2 | 15-19 | 136 | 3.035956 | 412.89 | na |
| 3 | 20-24 | 365 | 3.052219 | 1114.06 | na |
| 4 | 25-29 | 101 | 2.90099 | 293.0 | na |
| 5 | 30-34 | 73 | 2.931507 | 214.0 | na |
| 6 | 35-39 | 41 | 3.601707 | 147.67 | na |
| 7 | 40+ | 13 | 2.941538 | 38.24 | na |
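
The `na` column in the table above can be filled by dividing each group's total purchase value by its number of distinct players. The following sketch builds the whole table from a grouped frame instead of typing the values in. The rows are made up, and the bin edges (first edge at 9) are an assumption.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows)
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Age": [8, 22, 8, 40, 17, 22],
    "Price": [2.00, 3.50, 3.50, 1.25, 2.00, 3.50],
})

bins = [0, 9, 14, 19, 24, 29, 34, 39, 45]  # assumed edges
labels = ["<10", "10-14", "15-19", "20-24",
          "25-29", "30-34", "35-39", "40+"]
purchases["Age Group"] = pd.cut(purchases["Age"], bins, labels=labels)

# observed=True drops age groups that have no purchases in this tiny sample
grouped = purchases.groupby("Age Group", observed=True)

age_summary = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Average Purchase Price": grouped["Price"].mean(),
    "Total Purchase Value": grouped["Price"].sum(),
})

# Total value divided by distinct players gives the per-person average
age_summary["Average Total Purchase Per Person"] = (
    age_summary["Total Purchase Value"] / grouped["SN"].nunique()
)

print(age_summary.round(2))
```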


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [402]:
#Number of purchases per player, sorted from most to fewest
grouped_sn_data = purchase_data["SN"].value_counts()
grouped_sn_data.head(5)


Lisosia93      5
Iral74         4
Idastidru52    4
Tyidaim51      3
Silaera56      3
Name: SN, dtype: int64

In [None]:
#Need help with the rest of this problem
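
One way the remaining steps might be completed is to group by `SN`, aggregate `Price` three ways, and sort by the total. This is a sketch on made-up rows, not the reference solution for the assignment.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows)
purchases = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Price": [2.00, 3.50, 3.50, 1.25, 2.00, 3.50],
})

top_spenders = (
    purchases.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])          # one column per aggregation
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

print(top_spenders.head())
```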

## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [413]:
grouped_item_count_data = purchase_data["Item Name"].value_counts()
grouped_item_count_data.head(5)


Final Critic                                    13
Oathbreaker, Last Hope of the Breaking Storm    12
Fiery Glass Crusader                             9
Persuasion                                       9
Extraction, Quickblade Of Trembling Hands        9
Name: Item Name, dtype: int64

In [None]:
#Need help with the rest of this problem
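
The `value_counts` call above groups by item name only, so items that share a name under different IDs are merged. That may be why Final Critic shows 13 purchases here but 8 under Item ID 92 in the preview table at the end of the notebook. A sketch of the grouped version the instructions describe, on made-up rows and assuming each Item ID has a single price:

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows)
purchases = pd.DataFrame({
    "Item ID": [1, 2, 2, 3, 1, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Bow", "Sword", "Shield"],
    "Price": [2.00, 3.50, 3.50, 1.25, 2.00, 3.50],
})

items = purchases.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "first"),        # assumes one price per Item ID
    total_purchase_value=("Price", "sum"),
)

most_popular = items.sort_values("purchase_count", ascending=False)

print(most_popular.head())
```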

## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [10]:
#Ran out of time
##Need to study more
### Looking forward to the homework review so that I can study that and improve my awareness

| Item ID | Item Name | Purchase Count | Item Price | Total Purchase Value |
|---|---|---|---|---|
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | $4.23 | $50.76 |
| 82 | Nirvana | 9 | $4.90 | $44.10 |
| 145 | Fiery Glass Crusader | 9 | $4.58 | $41.22 |
| 92 | Final Critic | 8 | $4.88 | $39.04 |
| 103 | Singed Scalpel | 8 | $4.35 | $34.80 |
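
A table shaped like the one above could be produced by grouping on Item ID and Item Name, sorting by total purchase value, and formatting the currency columns afterwards. The sketch below uses made-up rows and assumes one price per Item ID. It is an illustration, not the code that generated the preview.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data (made-up rows)
purchases = pd.DataFrame({
    "Item ID": [1, 2, 2, 3, 1, 2],
    "Item Name": ["Sword", "Shield", "Shield", "Bow", "Sword", "Shield"],
    "Price": [2.00, 3.50, 3.50, 1.25, 2.00, 3.50],
})

items = purchases.groupby(["Item ID", "Item Name"])["Price"].agg(
    ["count", "first", "sum"]
)
items.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

# Sort while the column is still numeric, then format a copy for display
most_profitable = items.sort_values("Total Purchase Value", ascending=False)
display_table = most_profitable.copy()
for col in ["Item Price", "Total Purchase Value"]:
    display_table[col] = display_table[col].map("${:.2f}".format)

print(display_table.head())
```

Sorting has to come before formatting: once the values are strings such as `$10.50`, they sort character by character rather than numerically.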
