### Heroes Of Pymoli Data Analysis
* Of the 780 purchase records in the data set, the vast majority come from male players (84%). A smaller but notable share comes from female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)

## Player Count

* Display the total number of players


In [2]:
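purchase_data.head()  # not displayed: a notebook shows only the last expression in a cell
# Counts rows (one per purchase); purchase_data["SN"].nunique() would count unique players instead
purchase_data["Purchase ID"].shape[0]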


780
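
The count above is the number of rows, and each row is one purchase. If a player can appear in more than one row, the number of players is the number of distinct `SN` values. The sketch below illustrates the difference on a small hypothetical sample (the `sample` frame and its rows are made up for illustration; only the column names come from `purchase_data`).

```python
import pandas as pd

# Hypothetical sample rows reusing the column names of purchase_data.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "SN": ["Lisim78", "Lisovynya38", "Lisim78", "Iskosia90"],
    "Price": [3.53, 1.56, 4.88, 1.44],
})

purchase_count = len(sample)           # one row per purchase
player_count = sample["SN"].nunique()  # distinct screen names

print("Purchases:", purchase_count)
print("Players:", player_count)
```

On the real data the same two expressions can be applied to `purchase_data` directly.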

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [3]:
# Only the last expression in a cell is displayed, so the first two results are not shown.
purchase_data["Item ID"].nunique()  # total number of unique items
purchase_data["Price"].mean()       # average price

# Display summary statistics for the numeric columns
purchase_data.describe()


| | Purchase ID | Age | Item ID | Price |
|---|---|---|---|---|
| count | 780.0 | 780.0 | 780.0 | 780.0 |
| mean | 389.5 | 22.714103 | 92.114103 | 3.050987 |
| std | 225.310896 | 6.659444 | 52.775943 | 1.169549 |
| min | 0.0 | 7.0 | 0.0 | 1.0 |
| 25% | 194.75 | 20.0 | 48.0 | 1.98 |
| 50% | 389.5 | 22.0 | 93.0 | 3.15 |
| 75% | 584.25 | 25.0 | 139.0 | 4.08 |
| max | 779.0 | 45.0 | 183.0 | 4.99 |
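
`describe()` reports general statistics, but the steps listed for this section ask for a dedicated one-row summary. A minimal sketch of one way to build it follows, assuming the columns `Item ID` and `Price`; the sample rows and the summary column labels are hypothetical choices, not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample rows reusing the column names of purchase_data.
sample = pd.DataFrame({
    "Item ID": [108, 143, 92, 108],
    "Price": [3.53, 1.56, 4.88, 3.53],
})

# One-row summary; the column labels are illustrative.
summary = pd.DataFrame({
    "Number of Unique Items": [sample["Item ID"].nunique()],
    "Average Price": [sample["Price"].mean()],
    "Number of Purchases": [len(sample)],
    "Total Revenue": [sample["Price"].sum()],
})

# Format the dollar columns on a copy so the numeric values stay usable.
display_summary = summary.copy()
for col in ["Average Price", "Total Revenue"]:
    display_summary[col] = display_summary[col].map("${:,.2f}".format)

print(display_summary)
```

Formatting turns the values into strings, which is why it is applied to a copy rather than to `summary` itself.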


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [4]:
Male_count = purchase_data[purchase_data['Gender'] == 'Male'].shape[0]
print("Total number of Male players: ", Male_count)

Total number of Male players:  652


In [5]:
percentage_male = purchase_data[purchase_data['Gender'] == 'Male'].shape[0] *100 / purchase_data.shape[0]
print("% of Male Players is: ", format(percentage_male, ".2f") + ' %')


% of Male Players is:  83.59 %


In [6]:
Female_count = purchase_data[purchase_data['Gender'] == 'Female'].shape[0]
print("Total number of Female players: ", Female_count)

Total number of Female players:  113


In [7]:
percentage_female = purchase_data[purchase_data['Gender'] == 'Female'].shape[0] *100 / purchase_data.shape[0]
print("% of Female Players is: ", format(percentage_female, ".2f") + ' %')


% of Female Players is:  14.49 %


In [8]:
others_count = (purchase_data.shape[0] - (Male_count + Female_count))
print("Total number of Other / Non-Disclosed players: ", others_count)

Total number of Other / Non-Disclosed players:  15


In [17]:
others_percent = others_count *100 / purchase_data.shape[0]
print("% of Other / Non-Disclosed players: ", format(others_percent, ".2f") + ' %')

% of Other / Non-Disclosed players:  1.92 %
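
The six cells above can be condensed into a single table. The sketch below is one possible approach, not the notebook's own method: it first keeps one row per `SN` so that repeat buyers are counted once, which is an assumption about how players should be counted. The sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows reusing the column names of purchase_data.
sample = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Ithergue48", "Chamassasya86", "Iskosia90"],
    "Gender": ["Male", "Male", "Female", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are counted once.
players = sample.drop_duplicates(subset="SN")

counts = players["Gender"].value_counts()
percentages = (counts / counts.sum() * 100).round(2)

gender_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percentages,
})
print(gender_demographics)
```

Skipping the `drop_duplicates` step gives per-purchase counts instead, which is what the cells above compute.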



## Purchasing Analysis (Gender)

In [18]:
purchase_data.head()

| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 |


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [21]:
# Group purchases by the "Gender" column
purchase_group = purchase_data.groupby("Gender")

# Count the purchase rows in each gender group
print(purchase_group["SN"].count())
# print(purchase_data["Gender"].value_counts())  # alternative way to get the same counts

# Average purchase price per gender ("SN" is text, so only "Price" is averaged)
avg_price = purchase_group["Price"].mean()

# Average total purchase per person: total spend divided by unique players (SN)
avg_total_per_person = purchase_group["Price"].sum() / purchase_group["SN"].nunique()

# Summary data frame with dollar formatting applied to the two price columns
gender_summary = pd.DataFrame({
    "Purchase Count": purchase_group["SN"].count(),
    "Average Purchase Price": avg_price.map("${:,.2f}".format),
    "Avg Total Purchase per Person": avg_total_per_person.map("${:,.2f}".format),
})


Gender
Female                   113
Male                     652
Other / Non-Disclosed     15
Name: SN, dtype: int64

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
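
The steps above can be sketched with `pd.cut()` as follows. The bin edges and labels are assumptions chosen to match the five-year groups mentioned in the summary at the top (15-19, 20-24, 25-29); the sample rows are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows reusing the column names of purchase_data.
sample = pd.DataFrame({
    "SN": ["Lisim78", "Lisovynya38", "Ithergue48", "Chamassasya86", "Iskosia90"],
    "Age": [7, 20, 24, 15, 40],
})

# Assumed bin edges and labels (five-year groups).
bin_edges = [0, 10, 15, 20, 25, 30, 35, 40, np.inf]
bin_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# right=False makes each bin include its left edge: [20, 25) holds ages 20-24.
sample["Age Group"] = pd.cut(
    sample["Age"], bins=bin_edges, labels=bin_labels, right=False
)

# sort=False keeps the groups in bin order rather than ordering by frequency.
counts = sample["Age Group"].value_counts(sort=False)

age_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / len(sample) * 100).round(2),
})
print(age_demographics)
```

If the real data holds several purchases per player, dropping duplicate `SN` rows before binning keeps each player from being counted more than once.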


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
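
A minimal sketch of these steps, under the same assumed age bins (five-year groups) and with hypothetical sample rows. "Avg Total Purchase per Person" is taken here to mean total spend in the group divided by the number of distinct `SN` values in it, which is an interpretation rather than something the notebook specifies.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows reusing the column names of purchase_data.
sample = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Ithergue48", "Iskosia90"],
    "Age": [22, 22, 23, 31],
    "Price": [1.0, 3.0, 2.0, 4.0],
})

# Assumed bin edges and labels (five-year groups).
bin_edges = [0, 10, 15, 20, 25, 30, 35, 40, np.inf]
bin_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample["Age Group"] = pd.cut(
    sample["Age"], bins=bin_edges, labels=bin_labels, right=False
)

# observed=True keeps only the age groups that actually occur in the data.
grouped = sample.groupby("Age Group", observed=True)

age_summary = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Average Purchase Price": grouped["Price"].mean(),
    "Total Purchase Value": grouped["Price"].sum(),
})
age_summary["Avg Total Purchase per Person"] = (
    age_summary["Total Purchase Value"] / grouped["SN"].nunique()
)
print(age_summary)
```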

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
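
One way to carry out these steps is to group by `SN` and aggregate `Price`. The sketch below uses hypothetical sample rows, and the output column labels are illustrative.

```python
import pandas as pd

# Hypothetical sample rows reusing the column names of purchase_data.
sample = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Ithergue48", "Iskosia90"],
    "Price": [3.5, 1.5, 4.0, 1.0],
})

top_spenders = (
    sample.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

# Preview the highest spenders.
print(top_spenders.head())
```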



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
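
These steps can be sketched as a two-key `groupby`. The sample rows are hypothetical, and taking the mean as the "Item Price" assumes each item sells at a single price.

```python
import pandas as pd

# Hypothetical sample rows reusing the column names of purchase_data.
sample = pd.DataFrame({
    "Item ID": [108, 108, 143, 92, 92, 92],
    "Item Name": [
        "Extraction, Quickblade Of Trembling Hands",
        "Extraction, Quickblade Of Trembling Hands",
        "Frenzied Scimitar",
        "Final Critic",
        "Final Critic",
        "Final Critic",
    ],
    "Price": [3.53, 3.53, 1.56, 4.88, 4.88, 4.88],
})

# Group by both keys so the item name stays attached to its ID.
item_summary = sample.groupby(["Item ID", "Item Name"])["Price"].agg(
    ["count", "mean", "sum"]
)
item_summary.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```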



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
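
The same per-item table can be re-sorted by revenue. In the hypothetical sample below, the most frequently bought item is not the most profitable one, which is the case this section is meant to surface. The column labels are illustrative.

```python
import pandas as pd

# Hypothetical sample rows: "Fury" sells more often, "Final Critic" earns more.
sample = pd.DataFrame({
    "Item ID": [131, 131, 131, 92, 92],
    "Item Name": ["Fury", "Fury", "Fury", "Final Critic", "Final Critic"],
    "Price": [1.44, 1.44, 1.44, 4.88, 4.88],
})

item_summary = sample.groupby(["Item ID", "Item Name"])["Price"].agg(
    ["count", "mean", "sum"]
)
item_summary.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

# Sort by revenue instead of by purchase count.
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Apply dollar formatting to a copy; sorting must happen before this step,
# because formatted values are strings and would sort alphabetically.
formatted = most_profitable.copy()
for col in ["Item Price", "Total Purchase Value"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)

print(formatted.head())
```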

