### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%), and about 2% are listed as Other / Non-Disclosed.

* Our peak age group is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

In [2]:
purchase_data.columns

Index(['Purchase ID', 'SN', 'Age', 'Gender', 'Item ID', 'Item Name', 'Price'], dtype='object')

## Player Count

* Display the total number of players


In [None]:
# Count each screen name (SN) once, since one player can make several purchases
total_players_count = purchase_data["SN"].nunique()
total_player_analysis = pd.DataFrame({"Total Players": [total_players_count]})
total_player_analysis

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [None]:
# Number of distinct items sold (counted by item name)
unique_item_count = purchase_data["Item Name"].nunique()
average_price = round(purchase_data["Price"].mean(), 2)
total_revenue = purchase_data["Price"].sum()
# Every row is one purchase
total_purchase = purchase_data["Item Name"].count()

In [None]:
purchasing_analysis = pd.DataFrame({"Number of Unique Items": [unique_item_count],
                                    "Average Price": [average_price],
                                    "Number of Purchases": [total_purchase],
                                    "Total Revenue": [total_revenue]})
purchasing_analysis
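
The optional "cleaner formatting" step could look something like the sketch below. The helper name `format_currency` and the toy values are assumptions made for illustration; they are not part of the original notebook or its data.

```python
import pandas as pd

def format_currency(frame, columns):
    """Return a copy of frame with the given columns rendered as $1,234.56 strings."""
    formatted = frame.copy()
    for column in columns:
        # str.format is applied to each value; the result is text, not a number
        formatted[column] = formatted[column].map("${:,.2f}".format)
    return formatted

# Toy values for illustration only, not results from purchase_data.csv
example = pd.DataFrame({"Average Price": [3.5], "Total Revenue": [1234.5]})
print(format_currency(example, ["Average Price", "Total Revenue"]))
```

Because the formatted columns become strings, it is usually safer to format only after all calculations and sorting are done.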

## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [11]:
#remove duplicated players
gender_demo_clean = purchase_data.drop_duplicates(subset ="SN")

#get total head counts of the new df
total_count = len(gender_demo_clean["SN"].unique())

#calculate # of male players
gender_male = gender_demo_clean.loc[gender_demo_clean["Gender"]=="Male"]
gender_male_count = len(gender_male["SN"].unique())
percent_male = round((gender_male_count/total_count)*100,2)

#calculate # of female players
gender_female = gender_demo_clean.loc[gender_demo_clean["Gender"] == "Female"]
gender_female_count = len(gender_female["SN"].unique())
percent_female = round((gender_female_count/total_count)*100,2)

#calculate # of other players
gender_other = gender_demo_clean.loc[gender_demo_clean["Gender"] == "Other / Non-Disclosed"]
gender_other_count = len(gender_other["SN"].unique())
percent_other = round((gender_other_count/total_count)*100,2)

#percent_gender = gender_demo_clean["Gender"].value_counts(normalize=True) * 100
#gender_count = gender_demo_clean["Gender"].value_counts()
#gender_analysis = pd.DataFrame({"Total Counts":[gender_count], "Percentage of Players":[percent_gender]})

#create a df to contain all values
percent_gender = pd.DataFrame({"Total Counts": [gender_male_count, gender_female_count, gender_other_count],
                               "Percentage of Players": [percent_male,percent_female,percent_other],
                               "Gender": ["Male","Female","Other"]})
percent_gender = percent_gender.set_index("Gender")
percent_gender

| Gender | Total Counts | Percentage of Players |
|---|---|---|
| Male | 484 | 84.03 |
| Female | 81 | 14.06 |
| Other | 11 | 1.91 |
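
The commented-out lines in the cell above hint at a shorter route through `value_counts`. A minimal sketch of that idea follows; the function name `gender_demographics` is hypothetical, and the approach assumes each player has a single gender value across all of their purchases.

```python
import pandas as pd

def gender_demographics(purchase_data):
    """Count and percentage of unique players per gender."""
    # Keep one row per player so repeat buyers are not double counted
    players = purchase_data.drop_duplicates(subset="SN")
    counts = players["Gender"].value_counts()
    # normalize=True returns fractions that sum to 1
    percents = (players["Gender"].value_counts(normalize=True) * 100).round(2)
    # Both Series share the gender labels as index, so they align by label
    return pd.DataFrame({"Total Counts": counts,
                         "Percentage of Players": percents})
```

Unlike the cell above, this keeps whatever labels appear in the `Gender` column, so the third row would read "Other / Non-Disclosed" rather than "Other".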



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:
# Split the purchase rows by gender
male = purchase_data.loc[purchase_data["Gender"] == "Male"]
female = purchase_data.loc[purchase_data["Gender"] == "Female"]
other = purchase_data.loc[purchase_data["Gender"] == "Other / Non-Disclosed"]

# Purchase count as a single number (DataFrame.count() would return one count per column)
purchase_male = male["Price"].count()
purchase_female = female["Price"].count()
purchase_other = other["Price"].count()

average_male = male["Price"].mean()
average_female = female["Price"].mean()
average_other = other["Price"].mean()

total_male = male["Price"].sum()
total_female = female["Price"].sum()
total_other = other["Price"].sum()

# Average total per person divides by unique players (SN), not by purchases
av_total_male = total_male / male["SN"].nunique()
av_total_female = total_female / female["SN"].nunique()
av_total_other = total_other / other["SN"].nunique()

summary_table = pd.DataFrame({"Purchase Count": [purchase_male, purchase_female, purchase_other],
                              "Average Purchase Price": [average_male, average_female, average_other],
                              "Total Purchase Value": [total_male, total_female, total_other],
                              "Avg Total Purchase per Person": [av_total_male, av_total_female, av_total_other]},
                             index=["Male", "Female", "Other / Non-Disclosed"])
summary_table


In [None]:
# Preview the male purchase rows
male.head()

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
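
The steps above can be sketched as follows. The bin edges, labels, and the function name `age_demographics` are assumptions chosen to match the five-year groups quoted in the summary at the top of the notebook; adjust them as needed.

```python
import pandas as pd

# Assumed five-year bins; edges and labels are illustrative choices
AGE_BINS = [0, 10, 15, 20, 25, 30, 35, 40, float("inf")]
AGE_LABELS = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

def age_demographics(purchase_data):
    """Count and percentage of unique players per age group."""
    # One row per player so repeat buyers are not double counted
    players = purchase_data.drop_duplicates(subset="SN")
    # right=False makes each bin closed on the left: [10, 15), [15, 20), ...
    age_group = pd.cut(players["Age"], bins=AGE_BINS, labels=AGE_LABELS, right=False)
    # sort_index() puts the groups back in bin order
    counts = age_group.value_counts().sort_index()
    percents = (counts / counts.sum() * 100).round(2)
    return pd.DataFrame({"Total Count": counts,
                         "Percentage of Players": percents})
```

In the notebook this would be called as `age_demographics(purchase_data)`. Empty age groups still appear in the result with a count of zero, because `pd.cut` produces a categorical column.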


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
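
One possible way to carry out these steps is sketched below. The bins, the "Age Group" column name, and the function name `purchasing_by_age` are assumptions for illustration, not the notebook's own code.

```python
import pandas as pd

# Assumed five-year bins; edges and labels are illustrative choices
AGE_BINS = [0, 10, 15, 20, 25, 30, 35, 40, float("inf")]
AGE_LABELS = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

def purchasing_by_age(purchase_data):
    """Purchase count, average price, total value and per-person total by age group."""
    # Bin every purchase row (not just unique players) by the buyer's age
    binned = purchase_data.copy()
    binned["Age Group"] = pd.cut(binned["Age"], bins=AGE_BINS,
                                 labels=AGE_LABELS, right=False)
    # observed=False keeps empty age groups in the result
    grouped = binned.groupby("Age Group", observed=False)
    summary = pd.DataFrame({
        "Purchase Count": grouped["Price"].count(),
        "Average Purchase Price": grouped["Price"].mean(),
        "Total Purchase Value": grouped["Price"].sum(),
    })
    # Per-person average divides by unique players in the group, not by purchases
    summary["Avg Total Purchase per Person"] = (
        summary["Total Purchase Value"] / grouped["SN"].nunique()
    )
    return summary.round(2)
```

Age groups with no purchases show `NaN` for the two averages, since there is nothing to divide by.
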

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
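
A minimal sketch of the top-spenders table, assuming the expected columns are purchase count, average purchase price, and total purchase value per screen name. The function name `top_spenders` and the default preview size are hypothetical.

```python
import pandas as pd

def top_spenders(purchase_data, n=5):
    """Per-player purchase summary, highest total spend first."""
    # Aggregate the Price column once per screen name (SN)
    summary = purchase_data.groupby("SN")["Price"].agg(["count", "mean", "sum"])
    summary = summary.rename(columns={"count": "Purchase Count",
                                      "mean": "Average Purchase Price",
                                      "sum": "Total Purchase Value"})
    # Sort while the values are still numeric, then keep the first n rows
    return summary.sort_values("Total Purchase Value", ascending=False).head(n)
```

Calling `top_spenders(purchase_data)` in the notebook would display the five highest-spending players.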



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
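
The grouping described above might be sketched as follows. The function names are hypothetical, and taking the mean as the "Item Price" is an assumption that only matters if an item was ever sold at more than one price.

```python
import pandas as pd

def item_summary(purchase_data):
    """Purchase count, item price and total purchase value per item."""
    items = purchase_data[["Item ID", "Item Name", "Price"]]
    grouped = items.groupby(["Item ID", "Item Name"])["Price"]
    return pd.DataFrame({
        "Purchase Count": grouped.count(),
        # Assumption: use the mean in case an item has more than one recorded price
        "Item Price": grouped.mean(),
        "Total Purchase Value": grouped.sum(),
    })

def most_popular_items(purchase_data, n=5):
    """Items ranked by how often they were bought."""
    summary = item_summary(purchase_data)
    return summary.sort_values("Purchase Count", ascending=False).head(n)
```

The result is indexed by the (`Item ID`, `Item Name`) pair, which keeps items with the same name but different IDs apart.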



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
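
As a rough sketch, the re-sort can be done on any item table that has a numeric "Total Purchase Value" column, such as the one the Most Popular Items section asks for. The function name `most_profitable_items` and the `pretty` flag are assumptions for illustration.

```python
import pandas as pd

def most_profitable_items(item_table, n=5, pretty=False):
    """Rank an item summary table by total purchase value, highest first."""
    ranked = item_table.sort_values("Total Purchase Value", ascending=False).head(n)
    if pretty:
        # Format only after sorting: formatted strings would sort alphabetically
        ranked = ranked.copy()
        ranked["Total Purchase Value"] = ranked["Total Purchase Value"].map(
            "${:,.2f}".format
        )
    return ranked

# Toy table for illustration only, not results from purchase_data.csv
example = pd.DataFrame({"Purchase Count": [3, 1, 1],
                        "Total Purchase Value": [6.0, 3.0, 9.0]},
                       index=["Sword", "Bow", "Axe"])
print(most_profitable_items(example, pretty=True))
```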

