### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), followed by the 15-19 (18.6%) and 25-29 (13.4%) groups.
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [145]:
# import dependencies
import pandas as pd
import numpy as np

# load file
file = "../Instructions/HeroesOfPymoli/Resources/purchase_data.csv"

# read file and store into pandas dataframe
purchase_data = pd.read_csv(file)
print(purchase_data)

     Purchase ID               SN  Age                 Gender  Item ID  \
0              0          Lisim78   20                   Male      108   
1              1      Lisovynya38   40                   Male      143   
2              2       Ithergue48   24                   Male       92   
3              3    Chamassasya86   24                   Male      100   
4              4        Iskosia90   23                   Male      131   
5              5          Yalae81   22                   Male       81   
6              6        Itheria73   36                   Male      169   
7              7      Iskjaskst81   20                   Male      162   
8              8        Undjask33   22                   Male       21   
9              9      Chanosian48   35  Other / Non-Disclosed      136   
10            10        Inguron55   23                   Male       95   
11            11     Haisrisuir60   23                   Male      162   
12            12     Saelaephos52   21

In [146]:
# identify columns
purchase_data.columns

Index(['Purchase ID', 'SN', 'Age', 'Gender', 'Item ID', 'Item Name', 'Price'], dtype='object')

## Player Count

* Display the total number of players


In [147]:
# count unique players (each row is a purchase, so count distinct screen names)
print(purchase_data["SN"].nunique())

576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [150]:
# calculate totals
total_items = purchase_data["Item ID"].nunique()
average_price = purchase_data["Price"].mean()
total_purchases = purchase_data["Item Name"].count()
total_revenue = purchase_data["Price"].sum()

# create dataframe, formatting currency values to two decimal places
purchasing_analysis = pd.DataFrame({"Number of Unique Items": [total_items],
                                    "Average Price": ["${:,.2f}".format(average_price)],
                                    "Number of Purchases": [total_purchases],
                                    "Total Revenue": ["${:,.2f}".format(total_revenue)]})
# display dataframe
purchasing_analysis

   Number of Unique Items Average Price  Number of Purchases Total Revenue
0                     183         $3.05                  780     $2,379.77


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [312]:
# calculate total players (distinct screen names)
total_players = purchase_data["SN"].nunique()

# calculate individual player type totals
players_male = purchase_data[purchase_data["Gender"] == "Male"]["SN"].nunique()
players_female = purchase_data[purchase_data["Gender"] == "Female"]["SN"].nunique()
players_other = total_players - players_male - players_female

male = (players_male / total_players) * 100
female = (players_female / total_players) * 100
other = (players_other / total_players) * 100

# create dataframe formatted to round off to 2 decimal points
gender_analysis = pd.DataFrame({"": ['Male', 'Female', 'Other/Non-Disclosed'],
                                "Total Count": [players_male, players_female, players_other],
                                "Percentage of Players": [male, female, other]}).round(2)

# set index and print
gender_analysis = gender_analysis.set_index('')
gender_analysis

                     Total Count  Percentage of Players
Male                       484.0                  84.03
Female                      81.0                  14.06
Other/Non-Disclosed         11.0                   1.91



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
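
The steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's original solution: the six-row `purchase_data` frame is hypothetical stand-in data that only reuses the real column names, and the output column labels are assumptions.

```python
import pandas as pd

# Hypothetical six-row stand-in for purchase_data.csv (real column names, made-up rows).
purchase_data = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dee4", "Dee4"],
    "Age": [22, 22, 17, 31, 24, 24],
    "Gender": ["Female", "Female", "Male", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
    "Item ID": [1, 3, 1, 3, 1, 2],
    "Item Name": ["Sword", "Bow", "Sword", "Bow", "Sword", "Shield"],
    "Price": [2.00, 4.00, 2.00, 4.00, 2.00, 3.00],
})

by_gender = purchase_data.groupby("Gender")

# One row per gender: purchase count, mean price, and total spend.
gender_purchases = pd.DataFrame({
    "Purchase Count": by_gender["Purchase ID"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})

# Per-person average divides total spend by the number of distinct players.
gender_purchases["Avg Total Purchase per Person"] = (
    gender_purchases["Total Purchase Value"] / by_gender["SN"].nunique()
)

# Cleaner display: format the currency columns on a copy so the numbers stay usable.
formatted = gender_purchases.copy()
for column in ["Average Purchase Price", "Total Purchase Value",
               "Avg Total Purchase per Person"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted)
```

Dividing by `nunique()` of `SN` rather than by the purchase count is what separates the per-person average from the per-purchase average.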

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
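
A possible sketch of the binning steps is shown below. The bin edges and labels are assumptions (five-year groups, matching the 15-19, 20-24, and 25-29 groups mentioned in the summary), and the small `purchase_data` frame is hypothetical stand-in data with the real column names.

```python
import pandas as pd

# Hypothetical six-row stand-in for purchase_data.csv (real column names, made-up rows).
purchase_data = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dee4", "Dee4"],
    "Age": [22, 22, 17, 31, 24, 24],
    "Gender": ["Female", "Female", "Male", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
    "Item ID": [1, 3, 1, 3, 1, 2],
    "Item Name": ["Sword", "Bow", "Sword", "Bow", "Sword", "Shield"],
    "Price": [2.00, 4.00, 2.00, 4.00, 2.00, 3.00],
})

# Assumed bin edges: pd.cut uses right-inclusive intervals, so (14, 19] is "15-19".
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Keep one row per player so repeat purchasers are not counted twice.
players = purchase_data.drop_duplicates(subset="SN")
age_group = pd.cut(players["Age"], bins=bins, labels=labels)

# value_counts on a categorical keeps empty groups; sort_index restores bin order.
counts = age_group.value_counts().sort_index()

age_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / len(players) * 100).round(2),
})

print(age_demographics)
```

Dropping duplicate `SN` rows first matters here: binning the raw purchase rows would report percentages of purchases rather than of players.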


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
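
As a rough sketch under the same assumptions (hypothetical sample rows, assumed five-year bin edges, illustrative column labels), the age-based purchasing summary might be built like this:

```python
import pandas as pd

# Hypothetical six-row stand-in for purchase_data.csv (real column names, made-up rows).
purchase_data = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dee4", "Dee4"],
    "Age": [22, 22, 17, 31, 24, 24],
    "Gender": ["Female", "Female", "Male", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
    "Item ID": [1, 3, 1, 3, 1, 2],
    "Item Name": ["Sword", "Bow", "Sword", "Bow", "Sword", "Shield"],
    "Price": [2.00, 4.00, 2.00, 4.00, 2.00, 3.00],
})

# Assumed bin edges and labels (five-year groups).
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Bin every purchase row by the buyer's age.
binned = purchase_data.assign(
    **{"Age Group": pd.cut(purchase_data["Age"], bins=bins, labels=labels)}
)

# observed=True keeps only the age groups that actually occur in the data.
by_age = binned.groupby("Age Group", observed=True)

age_purchases = pd.DataFrame({
    "Purchase Count": by_age["Purchase ID"].count(),
    "Average Purchase Price": by_age["Price"].mean(),
    "Total Purchase Value": by_age["Price"].sum(),
})
age_purchases["Avg Total Purchase per Person"] = (
    age_purchases["Total Purchase Value"] / by_age["SN"].nunique()
)

print(age_purchases.round(2))
```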

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
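
One way to approach the top-spenders table is sketched below, again on hypothetical sample rows with the real column names; the output column labels and the preview size of five are assumptions.

```python
import pandas as pd

# Hypothetical six-row stand-in for purchase_data.csv (real column names, made-up rows).
purchase_data = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dee4", "Dee4"],
    "Age": [22, 22, 17, 31, 24, 24],
    "Gender": ["Female", "Female", "Male", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
    "Item ID": [1, 3, 1, 3, 1, 2],
    "Item Name": ["Sword", "Bow", "Sword", "Bow", "Sword", "Shield"],
    "Price": [2.00, 4.00, 2.00, 4.00, 2.00, 3.00],
})

# Aggregate each player's purchases in one pass, then give the columns readable names.
top_spenders = (
    purchase_data.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

# Preview the highest spenders; head(5) returns fewer rows if fewer players exist.
preview = top_spenders.head(5)
print(preview)
```

Sorting before any currency formatting is deliberate: once the values become strings such as `"$10.00"`, they sort alphabetically rather than numerically.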



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
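
The grouping described above might look like the following sketch. It uses hypothetical sample rows and assumes each item sells at a single price, so the per-group mean of `Price` stands in for the item price.

```python
import pandas as pd

# Hypothetical six-row stand-in for purchase_data.csv (real column names, made-up rows).
purchase_data = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["Ann1", "Ann1", "Bob2", "Cat3", "Dee4", "Dee4"],
    "Age": [22, 22, 17, 31, 24, 24],
    "Gender": ["Female", "Female", "Male", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
    "Item ID": [1, 3, 1, 3, 1, 2],
    "Item Name": ["Sword", "Bow", "Sword", "Bow", "Sword", "Shield"],
    "Price": [2.00, 4.00, 2.00, 4.00, 2.00, 3.00],
})

# Retrieve only the item columns, then group by ID and name together.
items = purchase_data[["Item ID", "Item Name", "Price"]]
by_item = items.groupby(["Item ID", "Item Name"])["Price"]

item_summary = pd.DataFrame({
    "Purchase Count": by_item.count(),
    "Item Price": by_item.mean(),  # assumes one price per item
    "Total Purchase Value": by_item.sum(),
})

# Most popular items first.
most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head(5))
```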



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
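
A minimal sketch of the re-sort follows. It rebuilds a small item summary from hypothetical sample rows so that it runs on its own; in the notebook, the existing item summary table would be sorted instead. The `formatted` copy and its column labels are illustrative assumptions.

```python
import pandas as pd

# Hypothetical stand-in rows; only the item columns are needed here.
purchase_data = pd.DataFrame({
    "Item ID": [1, 3, 1, 3, 1, 2],
    "Item Name": ["Sword", "Bow", "Sword", "Bow", "Sword", "Shield"],
    "Price": [2.00, 4.00, 2.00, 4.00, 2.00, 3.00],
})

by_item = purchase_data.groupby(["Item ID", "Item Name"])["Price"]
item_summary = pd.DataFrame({
    "Purchase Count": by_item.count(),
    "Item Price": by_item.mean(),  # assumes one price per item
    "Total Purchase Value": by_item.sum(),
})

# Sort numerically first; format afterwards so the ordering is not affected.
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

formatted = most_profitable.copy()
for column in ["Item Price", "Total Purchase Value"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted.head(5))
```

In this sample the most purchased item (Sword, three sales) is not the most profitable one (Bow, two sales at a higher price), which is why the two tables are sorted separately.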

