### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [None]:
# Dependencies and Setup
import pandas as pd
import numpy as np
import os
import csv
# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)


In [None]:
purchase_data.head()

## Player Count

* Display the total number of players


In [None]:
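# Count unique players by screen name (SN); one player can make several purchases
totalnumberplayers = purchase_data["SN"].nunique()
tnp = totalnumberplayers
print("The total number of players is " + str(tnp))
# Total number of purchase rows, reused in later cells
tnpfull = len(purchase_data)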


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [None]:
itemlist = purchase_data["Item Name"].unique()
total_itemlist = len(itemlist)
ave_cost = purchase_data["Price"].mean()
total_purchases = tnpfull
total_revenue = purchase_data["Price"].sum()
analysisdata = {
    "Number of Unique Items": [total_itemlist],
    "Average Price": [ave_cost],
    "Total # of Purchases": [total_purchases],
    "Total Revenue":[total_revenue]
}
analysis_df = pd.DataFrame(analysisdata)
analysis_df
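
The optional "cleaner formatting" step could look something like the sketch below. The summary values are made up for illustration and are not taken from `purchase_data.csv`; formatting is applied to a copy so the numeric columns remain usable for later arithmetic.

```python
import pandas as pd

# Hypothetical summary values, made up for illustration only
summary = pd.DataFrame({
    "Number of Unique Items": [3],
    "Average Price": [2.5],
    "Total # of Purchases": [4],
    "Total Revenue": [10.0],
})

# Format a copy: the currency strings are for display, the original stays numeric
formatted = summary.copy()
for col in ["Average Price", "Total Revenue"]:
    formatted[col] = formatted[col].map("${:,.2f}".format)

print(formatted)
```

Once a column has been turned into strings it can no longer be summed or averaged, which is why the numeric frame is kept separately.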
    

## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [None]:
male_data = purchase_data.loc[purchase_data["Gender"] == "Male",:]
female_data = purchase_data.loc[purchase_data["Gender"] == "Female",:]
other_data = purchase_data.loc[purchase_data["Gender"] == "Other / Non-Disclosed",:]

male_number = len(male_data["SN"].unique())
male_percent = male_number *100/tnp
female_number = len(female_data["SN"].unique())
female_percent = female_number *100/tnp
other_number = len(other_data["SN"].unique())
other_percent = other_number *100/tnp
total_percent = male_percent + female_percent + other_percent


genderdata = [
    ("Male",male_number,male_percent),
    ("Female",female_number,female_percent),
    ("Other/Non-Disclosed",other_number,other_percent)
]

genderdatadf = pd.DataFrame(genderdata, columns=["Gender","Number of Players","Percent of Players"])
genderdatadf




## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:

# Purchase count is the number of purchase rows per gender.
# This differs from the unique-player counts used for the percentages above.
male_purchase_count = len(male_data)
female_purchase_count = len(female_data)
other_purchase_count = len(other_data)

male_ave_pur_price = male_data["Price"].mean()
female_ave_pur_price = female_data["Price"].mean()
other_ave_pur_price = other_data["Price"].mean()

male_ave_purchase_total = male_data["SN"].value_counts()
female_ave_purchase_total =female_data["SN"].value_counts()
other_ave_purchase_total = other_data["SN"].value_counts()

male_ave_2 = male_ave_purchase_total.mean()
female_ave_2 = female_ave_purchase_total.mean()
other_ave_2 = other_ave_purchase_total.mean()

gender_analysis_data1 = [
    ("Male",male_purchase_count,male_ave_pur_price,male_data["Price"].sum(),male_ave_2),
    ("Female",female_purchase_count,female_ave_pur_price,female_data["Price"].sum(),female_ave_2),
    ("Other/Non-Disclosed",other_purchase_count,other_ave_pur_price,other_data["Price"].sum(),other_ave_2)
]

gender_analysis_1_df = pd.DataFrame(gender_analysis_data1, columns=["Gender","Purchase Count","Average Purchase Price","Total Purchase Value","Average Purchases per Person"])
gender_analysis_1_df
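
As an alternative to filtering each gender by hand, the same figures can be sketched with a single `groupby`. The sample rows below are made up and only mirror the column names used in this notebook (`SN`, `Gender`, `Price`); the output column names are illustrative choices, not part of the assignment.

```python
import pandas as pd

# Made-up sample rows with the same column names as purchase_data
sample = pd.DataFrame({
    "SN": ["a", "a", "b", "c", "d"],
    "Gender": ["Male", "Male", "Male", "Female", "Other / Non-Disclosed"],
    "Price": [1.0, 3.0, 2.0, 4.0, 5.0],
})

# One pass over the data: named aggregation gives one output column per statistic
by_gender = sample.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    avg_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
    unique_players=("SN", "nunique"),
)

# Average amount spent per player = total value / number of unique players
by_gender["avg_total_per_person"] = (
    by_gender["total_purchase_value"] / by_gender["unique_players"]
)

print(by_gender)
```

Note that this computes the average purchase *total* per person (money spent), which is what the instructions ask for, whereas the cell above reports the average *number* of purchases per person.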



## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [None]:
bins = [0,10,20,30,40,50,60,70]
bin_label = ["0 to 10","10 to 20","20 to 30","30 to 40","40 to 50","50 to 60","60 to 70"]
purchase_data["Age Group"] = pd.cut(purchase_data["Age"], bins, labels=bin_label)
purchase_data.head()
#zero_ten = purchase_data.loc[purchase_data["Age Group"] == "0 to 10",:]
bin_groups = purchase_data.groupby("Age Group")

zero_data = purchase_data.loc[purchase_data["Age Group"] == "0 to 10",:]
ten_data = purchase_data.loc[purchase_data["Age Group"] == "10 to 20",:]
twenty_data = purchase_data.loc[purchase_data["Age Group"] == "20 to 30",:]
thirty_data = purchase_data.loc[purchase_data["Age Group"] == "30 to 40",:]
fourty_data = purchase_data.loc[purchase_data["Age Group"] == "40 to 50",:]
fifty_data = purchase_data.loc[purchase_data["Age Group"] == "50 to 60",:]
sixty_data = purchase_data.loc[purchase_data["Age Group"] == "60 to 70",:]

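# Share of purchase rows that fall into each bin.
# Note: these counts are purchases (rows), not unique players, so they are divided by tnpfull.
#print(bin_groups["Age Group"].count())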
zeroto10p = bin_groups["Age Group"].count().iloc[0]*100/tnpfull
tento20p = bin_groups["Age Group"].count().iloc[1]*100/tnpfull
twentyto30p = bin_groups["Age Group"].count().iloc[2]*100/tnpfull
thirtyto40p = bin_groups["Age Group"].count().iloc[3]*100/tnpfull
fourtyto50p = bin_groups["Age Group"].count().iloc[4]*100/tnpfull
fiftyto60p = bin_groups["Age Group"].count().iloc[5]*100/tnpfull
sixtyto70p = bin_groups["Age Group"].count().iloc[6]*100/tnpfull

bindata = [
    ("0-10",bin_groups["Age Group"].count().iloc[0],zeroto10p),
    ("10-20",bin_groups["Age Group"].count().iloc[1],tento20p),
    ("20-30",bin_groups["Age Group"].count().iloc[2],twentyto30p),
    ("30-40",bin_groups["Age Group"].count().iloc[3],thirtyto40p),
    ("40-50",bin_groups["Age Group"].count().iloc[4],fourtyto50p),
    ("50-60",bin_groups["Age Group"].count().iloc[5],fiftyto60p),
    ("60-70",bin_groups["Age Group"].count().iloc[6],sixtyto70p)
]
bin_df = pd.DataFrame(bindata, columns=["Age Group","# of Purchases in Age Group","% of Purchases in Age Group"])
bin_df
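
The summary at the top of this notebook reports five-year groups (15-19, 20-24, 25-29) for players rather than purchases. A minimal sketch of how such a table might be produced is shown below. The sample data, bin edges, and labels are assumptions chosen for illustration; `right=False` makes each bin include its lower edge so that age 20 lands in "20-24".

```python
import pandas as pd

# Made-up sample: player "a" bought twice but should be counted once
sample = pd.DataFrame({
    "SN": ["a", "a", "b", "c", "d"],
    "Age": [22, 22, 17, 24, 31],
})

# Count each player once, not once per purchase
players = sample.drop_duplicates(subset="SN").copy()

# Assumed bin edges; right=False gives intervals [15, 20), [20, 25), ...
edges = [0, 10, 15, 20, 25, 30, 35, 40, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
players["Age Group"] = pd.cut(players["Age"], bins=edges, labels=labels, right=False)

# sort=False keeps the rows in bin order instead of ordering by count
counts = players["Age Group"].value_counts(sort=False)
age_summary = pd.DataFrame({
    "Player Count": counts,
    "Percent of Players": (counts / len(players) * 100).round(2),
})

print(age_summary)
```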

## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [None]:

# Purchase count is the number of purchase rows in each bin, not the number of unique players
zerocount = len(zero_data)
tencount = len(ten_data)
twentycount = len(twenty_data)
thirtycount = len(thirty_data)
fourtycount = len(fourty_data)
fiftycount = len(fifty_data)
sixtycount = len(sixty_data)

ave0 = zero_data["Price"].mean()
ave10 = ten_data["Price"].mean()
ave20 = twenty_data["Price"].mean()
ave30 = thirty_data["Price"].mean()
ave40 = fourty_data["Price"].mean()
ave50 = fifty_data["Price"].mean()
ave60 = sixty_data["Price"].mean()

ave0per = zero_data["SN"].value_counts().mean()
ave10per = ten_data["SN"].value_counts().mean()
ave20per = twenty_data["SN"].value_counts().mean()
ave30per = thirty_data["SN"].value_counts().mean()
ave40per = fourty_data["SN"].value_counts().mean()
ave50per = fifty_data["SN"].value_counts().mean()
ave60per = sixty_data["SN"].value_counts().mean()

bindata = [
    ("0 to 10",zerocount,ave0,ave0per),
    ("10 to 20", tencount, ave10,ave10per),
    ("20 to 30",twentycount,ave20,ave20per),
    ("30 to 40",thirtycount,ave30,ave30per),
    ("40 to 50",fourtycount, ave40, ave40per),
    ("50 to 60",fiftycount,ave50,ave50per),
    ("60 to 70",sixtycount,ave60,ave60per)
]

bin2_df = pd.DataFrame(bindata, columns=["Age Group","Purchase Count","Average Purchase Price","Average Purchases per Person"])
bin2_df
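
The per-bin variables above can be collapsed into one grouped calculation. The following is a sketch on made-up rows, using the same ten-year bins as this notebook; `observed=False` is passed explicitly so that empty age groups still appear in the result.

```python
import pandas as pd

# Made-up sample rows with the same column names as purchase_data
sample = pd.DataFrame({
    "SN": ["a", "a", "b", "c"],
    "Age": [22, 22, 17, 35],
    "Price": [1.0, 3.0, 2.0, 4.0],
})
sample["Age Group"] = pd.cut(
    sample["Age"],
    bins=[0, 10, 20, 30, 40],
    labels=["0 to 10", "10 to 20", "20 to 30", "30 to 40"],
)

# Group once by the categorical bin column instead of slicing per bin
by_age = sample.groupby("Age Group", observed=False).agg(
    purchase_count=("Price", "count"),
    avg_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
    unique_players=("SN", "nunique"),
)

# Empty bins yield NaN here because both counts are zero
by_age["avg_purchases_per_person"] = (
    by_age["purchase_count"] / by_age["unique_players"]
)

print(by_age)
```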

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [None]:
# Total spend per player, sorted from highest to lowest
pricesum = purchase_data.groupby("SN")["Price"].sum().sort_values(ascending=False)
# Rank of each player by total spend (1 = top spender)
ranking = pricesum.rank(ascending=False)
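
To turn the per-player totals into the summary table the instructions describe, one possible approach is sketched below. The sample rows are made up, and the column labels follow the naming used in the earlier summary tables.

```python
import pandas as pd

# Made-up purchases for three players
sample = pd.DataFrame({
    "SN": ["a", "a", "b", "c", "c", "c"],
    "Price": [1.0, 3.0, 5.0, 1.0, 1.0, 1.0],
})

top_spenders = (
    sample.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    # Highest total spend first
    .sort_values("Total Purchase Value", ascending=False)
)

# head() gives the preview the instructions ask for
print(top_spenders.head())
```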


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
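
No code was written for this section, so the following is only a sketch of the listed steps on made-up rows. It assumes the data has `Item ID`, `Item Name`, and `Price` columns, as the instructions state, and that each item has a single price, so the mean price per group equals the item price.

```python
import pandas as pd

# Made-up sample rows; item names and prices are illustrative
sample = pd.DataFrame({
    "Item ID": [1, 1, 2, 3, 3, 3],
    "Item Name": ["Sword", "Sword", "Shield", "Bow", "Bow", "Bow"],
    "Price": [4.0, 4.0, 10.0, 2.0, 2.0, 2.0],
})

# Retrieve only the columns needed for the item analysis
items = sample[["Item ID", "Item Name", "Price"]]

# Group by both ID and name so that the name is kept in the index
item_summary = items.groupby(["Item ID", "Item Name"]).agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
)

# Most purchased items first
most_popular = item_summary.sort_values("purchase_count", ascending=False)
print(most_popular.head())
```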



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
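
This section also has no code yet. As a sketch, re-sorting a per-item summary by total purchase value is a one-line change. The table below is a hypothetical stand-in with made-up values, indexed by `Item ID` and `Item Name`.

```python
import pandas as pd

# Hypothetical per-item summary table (values made up for illustration)
item_summary = pd.DataFrame(
    {
        "Purchase Count": [3, 2, 1],
        "Item Price": [2.0, 4.0, 10.0],
        "Total Purchase Value": [6.0, 8.0, 10.0],
    },
    index=pd.MultiIndex.from_tuples(
        [(3, "Bow"), (1, "Sword"), (2, "Shield")],
        names=["Item ID", "Item Name"],
    ),
)

# Same table, ordered by revenue instead of by purchase count
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)
print(most_profitable.head())
```

In this made-up example the least purchased item is the most profitable, which illustrates why the two rankings are reported separately.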



In [None]:
male_data["SN"].value_counts().mean()