### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* The peak age demographic is 20-24 (44.8%), followed by the 15-19 (18.6%) and 25-29 (13.4%) groups.
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [2]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
purchaseFile = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchaseData = pd.read_csv(purchaseFile)

## Player Count

* Display the total number of players


In [3]:
print(purchaseData.columns)




Index(['Purchase ID', 'SN', 'Age', 'Gender', 'Item ID', 'Item Name', 'Price'], dtype='object')
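The cell above only lists the columns, so here is a minimal sketch of the player count itself. It assumes each distinct `SN` (screen name) is one player. The `sample` frame and its player names are made up for illustration and stand in for `purchaseData`.

```python
import pandas as pd

# Made-up rows for illustration only; the real notebook loads
# Resources/purchase_data.csv, which has these same column names.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "SN": ["PlayerA", "PlayerB", "PlayerA", "PlayerC"],
})

# Each screen name is treated as one player, so a repeat
# purchaser (PlayerA here) is counted only once.
total_players = sample["SN"].nunique()

player_count = pd.DataFrame({"Total Players": [total_players]})
print(player_count)
```

Counting rows instead would give the number of purchases, not players, because one player can buy more than once.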


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [85]:
# purchaseData is already a DataFrame; copy it under the name later cells use
purchaseDf = purchaseData.copy()

# nunique() counts distinct values directly, without an intermediate groupby
purchaseCount = purchaseDf["Purchase ID"].nunique()
uniqueCount = purchaseDf["Item ID"].nunique()
playerCount = purchaseDf["SN"].nunique()
avgPrice = purchaseDf["Price"].mean()
revenue = purchaseDf["Price"].sum()

purchaseSummary = pd.DataFrame({"Total Players": [playerCount],
                                "Unique Items": [uniqueCount],
                                "Avg Purchase Price": [avgPrice],
                                "Number of Purchases": [purchaseCount],
                                "Total Revenue": [revenue]})

# Format the currency columns for display
purchaseSummary["Avg Purchase Price"] = purchaseSummary["Avg Purchase Price"].map("${:.2f}".format)
purchaseSummary["Total Revenue"] = purchaseSummary["Total Revenue"].map("${:.2f}".format)

print(purchaseSummary.T)





                            0
Total Players             576
Unique Items              183
Avg Purchase Price      $3.05
Number of Purchases       780
Total Revenue        $2379.77




In [98]:
# Keep one row per player so each screen name is counted once
genderDf = purchaseDf.drop_duplicates("SN", keep="first")

# Number of players in each gender group
genderCounts = genderDf.groupby("Gender")["SN"].count()
male = genderCounts["Male"]
female = genderCounts["Female"]
others = genderCounts["Other / Non-Disclosed"]

# Share of all unique players in each group
malepct = male / playerCount
femalepct = female / playerCount
otherspct = others / playerCount

GenderSummary = pd.DataFrame({"Men": [male],
                              "Men Pct": [malepct],
                              "Women": [female],
                              "Women Pct": [femalepct],
                              "Others": [others],
                              "Other Pct": [otherspct]})
print(GenderSummary.T)


                    0
Men        484.000000
Men Pct      0.840278
Women       81.000000
Women Pct    0.140625
Others      11.000000
Other Pct    0.019097


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [147]:
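# Count the purchases made by male players
malepurchase = purchaseDf.loc[purchaseDf["Gender"] == "Male", "Purchase ID"].count()

# Mean of the numeric columns for each (Gender, Purchase ID) pair.
# Despite its name, maleavgpurchase covers every gender, not only male players.
genders = purchaseDf.groupby(["Gender", "Purchase ID"])
maleavgpurchase = genders.mean(numeric_only=True)
print(malepurchase)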


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [148]:
maleavgpurchase.describe()


             Age    Item ID     Price
count      780.0      780.0     780.0
mean   22.714103  92.114103  3.050987
std     6.659444  52.775943  1.169549
min          7.0        0.0       1.0
25%         20.0       48.0      1.98
50%         22.0       93.0      3.15
75%         25.0      139.0      4.08
max         45.0      183.0      4.99
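The per-gender steps listed in this section (purchase count, average price, average total per person) could be sketched as follows. This is one possible approach, not necessarily the intended solution. The `sample` rows and player names are made up; only the column names and gender labels come from the notebook.

```python
import pandas as pd

# Made-up rows for illustration only; the real notebook would use purchaseDf.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["PlayerA", "PlayerA", "PlayerB", "PlayerC", "PlayerD", "PlayerD"],
    "Gender": ["Male", "Male", "Female", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
    "Price": [1.00, 3.00, 2.50, 4.00, 2.00, 1.50],
})

by_gender = sample.groupby("Gender")

gender_purchases = pd.DataFrame({
    "Purchase Count": by_gender["Purchase ID"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})

# Divide by unique players, not by purchases, to get spend per person.
gender_purchases["Avg Total Purchase per Person"] = (
    gender_purchases["Total Purchase Value"] / by_gender["SN"].nunique()
)

# Format a copy so the numeric table stays usable for further calculations.
formatted = gender_purchases.copy()
for column in ["Average Purchase Price", "Total Purchase Value",
               "Avg Total Purchase per Person"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted)
```

Grouping by `Gender` alone gives one row per gender. Grouping by `Gender` and `Purchase ID` together, as in the cell above, gives one row per purchase instead.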


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
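The binning steps above can be sketched with `pd.cut()`. The bin edges and labels below are an assumption (five-year groups with open-ended first and last bins), chosen to be consistent with the 15-19, 20-24 and 25-29 groups mentioned in the summary. The `sample` rows are made up.

```python
import pandas as pd

# Made-up rows; PlayerA appears twice to mimic a repeat purchaser.
sample = pd.DataFrame({
    "SN": ["PlayerA", "PlayerA", "PlayerB", "PlayerC",
           "PlayerD", "PlayerE", "PlayerF"],
    "Age": [8, 8, 12, 21, 22, 23, 41],
})

# Assumed bin edges: each bin includes its right edge, e.g. (19, 24] is "20-24".
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24",
              "25-29", "30-34", "35-39", "40+"]

# Count each player once, then assign an age group.
players = sample.drop_duplicates("SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins=age_bins, labels=age_labels)

# observed=False keeps empty age groups in the table with a count of 0.
counts = players.groupby("Age Group", observed=False)["SN"].count()

age_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": (counts / counts.sum() * 100).round(2),
})
print(age_demographics)
```

Dropping duplicate screen names before binning matters here: binning the raw purchase rows would weight frequent buyers more heavily and skew the percentages.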


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
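A possible sketch of the age-based purchasing table follows. Unlike the demographics table, it bins every purchase row, since each purchase counts. The bin edges are the same assumed five-year groups, and the `sample` rows are made up.

```python
import pandas as pd

# Made-up purchase rows for illustration only.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["PlayerA", "PlayerA", "PlayerB", "PlayerC", "PlayerC", "PlayerD"],
    "Age": [8, 8, 21, 22, 22, 41],
    "Price": [2.00, 4.00, 3.00, 1.00, 2.00, 5.00],
})

# Assumed bin edges and labels (five-year groups).
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24",
              "25-29", "30-34", "35-39", "40+"]
sample["Age Group"] = pd.cut(sample["Age"], bins=age_bins, labels=age_labels)

# observed=True drops age groups with no purchases in this sample.
age_purchases = sample.groupby("Age Group", observed=True).agg(
    purchase_count=("Purchase ID", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Spend per person uses unique players in the group as the denominator.
age_purchases["avg_total_per_person"] = (
    age_purchases["total_value"] / age_purchases["players"]
)

age_purchases = age_purchases.drop(columns="players").rename(columns={
    "purchase_count": "Purchase Count",
    "avg_price": "Average Purchase Price",
    "total_value": "Total Purchase Value",
    "avg_total_per_person": "Avg Total Purchase per Person",
})
print(age_purchases)
```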

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
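The top-spender steps can be sketched by grouping on `SN` and sorting by total spend. The `sample` rows and player names are made up; the column names match the notebook's data.

```python
import pandas as pd

# Made-up purchase rows for illustration only.
sample = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["PlayerA", "PlayerB", "PlayerA", "PlayerC", "PlayerB", "PlayerA"],
    "Price": [3.00, 4.50, 2.00, 1.00, 4.00, 1.00],
})

by_player = sample.groupby("SN")

spenders = pd.DataFrame({
    "Purchase Count": by_player["Purchase ID"].count(),
    "Average Purchase Price": by_player["Price"].mean(),
    "Total Purchase Value": by_player["Price"].sum(),
})

# Sort while the values are still numeric; sorting "$"-formatted
# strings would order them alphabetically instead.
top_spenders = spenders.sort_values("Total Purchase Value", ascending=False)
print(top_spenders.head())
```

Any currency formatting is best applied to a copy after sorting, so the sort order reflects the actual amounts.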



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
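A minimal sketch of the item popularity table follows. It assumes each item has a single price and uses the group mean for "Item Price", so an item with varying prices would still yield a usable figure. The item names and rows are made up.

```python
import pandas as pd

# Made-up purchase rows for illustration only.
sample = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword A", "Sword A", "Sword A",
                  "Shield B", "Shield B", "Bow C"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 3.00],
})

# Retrieve only the item columns, then group by ID and name together.
items = sample[["Item ID", "Item Name", "Price"]]
by_item = items.groupby(["Item ID", "Item Name"])["Price"]

item_summary = pd.DataFrame({
    "Purchase Count": by_item.count(),
    "Item Price": by_item.mean(),
    "Total Purchase Value": by_item.sum(),
})

most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```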



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
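The same item table can be re-sorted by revenue. The sketch below rebuilds a small, made-up item summary so that it runs on its own; in the notebook the table from the previous section would be reused.

```python
import pandas as pd

# Made-up item summary standing in for the table built in the previous section.
item_summary = pd.DataFrame(
    {
        "Purchase Count": [3, 2, 1],
        "Item Price": [2.00, 4.50, 3.00],
        "Total Purchase Value": [6.00, 9.00, 3.00],
    },
    index=pd.MultiIndex.from_tuples(
        [(1, "Sword A"), (2, "Shield B"), (3, "Bow C")],
        names=["Item ID", "Item Name"],
    ),
)

# Sort on the numeric values first, then format a copy for display.
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

formatted = most_profitable.copy()
for column in ["Item Price", "Total Purchase Value"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted.head())
```

In this sample the most purchased item ("Sword A") is not the most profitable one, which is why the two rankings are reported separately.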

