### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable share of players is female (14%).

* Our peak age demographic is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [28]:
# Dependencies and Setup
import pandas as pd
import numpy as np
import os

# Build the file path portably and load the purchase data from it
csvpath = os.path.join("Resources", "heroes_data.csv")
print(csvpath)
heroes_df = pd.read_csv(csvpath)
heroes_df.head()

Resources\heroes_data.csv


| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 |


## Player Count

* Display the total number of players


In [8]:
PlayerCount = len(heroes_df["SN"].unique())
playercount_disp = pd.DataFrame({"Player Count": [PlayerCount]})
playercount_disp

| | Player Count |
|---|---|
| 0 | 576 |


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [9]:
# Set up the calculations:
# mean for the average price, sum for the total revenue
UniqueItems = len(heroes_df["Item Name"].unique())
AvgPurchase = heroes_df["Price"].mean()
NumPurchase = len(heroes_df["Item Name"])
Total_Revenue = heroes_df["Price"].sum()

# Total_Revenue is the variable defined above (the original referenced an undefined name)
Purchasing_Analysis_total = pd.DataFrame({"number of unique items": [UniqueItems],
                                          "Average price": [AvgPurchase],
                                          "number of purchases": [NumPurchase],
                                          "Total Revenue": [Total_Revenue]})

Purchasing_Analysis_total

                                       


| | number of unique items | Average price | number of purchases | Total Revenue |
|---|---|---|---|---|
| 0 | 179 | 3.050987 | 780 | 2379.77 |


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [10]:
duplicate = heroes_df.drop_duplicates(subset='SN', keep="first")
TotalGen = duplicate["Gender"].count()
dudes = duplicate["Gender"].value_counts()['Male']
ladies = duplicate["Gender"].value_counts()['Female']
NonGen = TotalGen - dudes - ladies

# percentage generation 

dudegen=(dudes/TotalGen)*100
ladiesgen=(ladies/TotalGen)*100
other=(NonGen/TotalGen)*100

Gender_Demo = pd.DataFrame({"": ['Male', 'Female', 'Other/Non-Disclosed'],
                            "Percentage of Players": [dudegen, ladiesgen, other],
                            "Total Count": [dudes, ladies, NonGen]})

Gender_Demo["Percentage of Players"] = Gender_Demo["Percentage of Players"].map("{:.2f}%".format)
Gender_Demo = Gender_Demo.set_index('')
Gender_Demo












| | Percentage of Players | Total Count |
|---|---|---|
| Male | 84.03% | 484.0 |
| Female | 14.06% | 81.0 |
| Other/Non-Disclosed | 1.91% | 11.0 |
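
As an alternative to subtracting the male and female counts from the total, `value_counts` can tally every gender category directly, so any label other than Male or Female gets its own row. The sketch below is only illustrative: it runs on a small made-up frame (the screen names and rows are hypothetical, not taken from `heroes_data.csv`) and assumes the same `SN` and `Gender` column names used above.

```python
import pandas as pd

# Toy purchase log: hypothetical rows, not the real data set.
toy_df = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4", "D4"],
    "Gender": ["Male", "Male", "Female", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not counted twice.
players = toy_df.drop_duplicates(subset="SN")

# Raw counts and shares (as percentages) per gender label.
counts = players["Gender"].value_counts()
percents = players["Gender"].value_counts(normalize=True) * 100

gender_summary = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percents.round(2),
})
print(gender_summary)
```

Because both series are indexed by the gender label, pandas aligns them automatically when the summary frame is built.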



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [25]:
# Group purchases by gender
group_data = heroes_df.groupby("Gender")

# Per-gender purchase count, average price, and total value
# (computed on the grouped data, not on the whole frame)
purch_count = group_data["SN"].count()
purch_price = group_data["Price"].mean()
purch_value = group_data["Price"].sum()

# Normalize each gender's total by its number of unique players
purchNorm = purch_value / group_data["SN"].nunique()

# Create new DataFrame
Purch_Analy_Gen = pd.DataFrame({"Purchase Count": purch_count,
                                "Average Purchase Price": purch_price,
                                "Total Purchase Value": purch_value,
                                "Avg Price per person": purchNorm})

# DataFrame formatting
Purch_Analy_Gen["Average Purchase Price"] = Purch_Analy_Gen["Average Purchase Price"].map("${:.2f}".format)
Purch_Analy_Gen["Total Purchase Value"] = Purch_Analy_Gen["Total Purchase Value"].map("${:.2f}".format)
Purch_Analy_Gen["Avg Price per person"] = Purch_Analy_Gen["Avg Price per person"].map("${:.2f}".format)
Purch_Analy_Gen = Purch_Analy_Gen[["Purchase Count", "Average Purchase Price", "Total Purchase Value", "Avg Price per person"]]
Purch_Analy_Gen


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [27]:
# Bin ages into discrete intervals
bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
bin_int = ['Under 10', '10 - 14', '15 - 19', '20 - 24', '25 - 29', '30 - 34', '35 - 39', '40+']

# Count each player once, then add the age group and group by it
binner_df = heroes_df.drop_duplicates(subset="SN").copy()
# right=False makes each bin include its lower edge, e.g. [10, 15) for '10 - 14'
binner_df["Age Groups"] = pd.cut(binner_df["Age"], bins, labels=bin_int, right=False)
group_bin = binner_df.groupby("Age Groups", observed=False)

# Data manipulation
binne_Count = group_bin["SN"].count()
count_Total = binner_df["SN"].count()
percentage = (binne_Count / count_Total) * 100

# Create new DataFrame
Age_Perc = pd.DataFrame({"Total Count": binne_Count,
                         "Percentage of Players": percentage})

# DataFrame formatting using map string
Age_Perc["Percentage of Players"] = Age_Perc["Percentage of Players"].map("{:.2f}%".format)
Age_Perc
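
The age bins depend on how `pd.cut` treats bin edges. By default each interval is closed on the right, so with edges `[0, 10, 15, ...]` an age of exactly 10 lands in the first bin and 15 lands in the second, which does not match labels such as "10 - 14". Passing `right=False` closes the intervals on the left instead. The sketch below compares the two settings on a few boundary ages. The sample ages and the "40+" label are chosen for illustration only.

```python
import pandas as pd

# Ages chosen to sit on or next to bin edges.
ages = pd.Series([9, 10, 14, 15, 39, 40])
bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
labels = ["Under 10", "10 - 14", "15 - 19", "20 - 24",
          "25 - 29", "30 - 34", "35 - 39", "40+"]

# Default (right=True): intervals are (0, 10], (10, 15], ...
right_closed = pd.cut(ages, bins, labels=labels)

# right=False: intervals are [0, 10), [10, 15), ...
left_closed = pd.cut(ages, bins, labels=labels, right=False)

comparison = pd.DataFrame({
    "Age": ages,
    "right=True": right_closed.astype(str),
    "right=False": left_closed.astype(str),
})
print(comparison)
```

With `right=False`, ages 10, 15, and 40 fall into "10 - 14", "15 - 19", and "40+" respectively, which is what the labels imply.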

## Purchasing Analysis (Age)

* Bin the purchase data frame (`heroes_df` in this notebook) by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
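
One possible way to approach the steps above is sketched here on a small made-up frame. The rows are hypothetical, and the column names `SN`, `Age`, and `Price` are assumed to match the purchase data. The per-person figure divides each group's total by its number of unique players, which is one reasonable reading of "avg. purchase total per person".

```python
import pandas as pd

# Toy purchase log: hypothetical rows, not the real data set.
toy_df = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "C3", "D4"],
    "Age": [12, 12, 22, 23, 23, 41],
    "Price": [1.00, 3.00, 2.50, 4.00, 2.00, 5.00],
})

bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
labels = ["Under 10", "10 - 14", "15 - 19", "20 - 24",
          "25 - 29", "30 - 34", "35 - 39", "40+"]

# right=False so that each bin includes its lower edge.
toy_df["Age Group"] = pd.cut(toy_df["Age"], bins, labels=labels, right=False)

# observed=True keeps only the age groups that actually occur.
grouped = toy_df.groupby("Age Group", observed=True)

age_summary = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Average Purchase Price": grouped["Price"].mean(),
    "Total Purchase Value": grouped["Price"].sum(),
})

# Total spent per age group divided by the number of distinct buyers in it.
age_summary["Avg Total Purchase per Person"] = (
    age_summary["Total Purchase Value"] / grouped["SN"].nunique()
)
print(age_summary.round(2))
```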

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [26]:
# Per-player purchase count, total spend, and average purchase price
groupedByPlayer = heroes_df.groupby(["SN"])
groupedPlayerCount = groupedByPlayer["Item ID"].count()
groupedPlayerTotal = groupedByPlayer["Price"].sum()
groupedPlayerAvg = (groupedPlayerTotal / groupedPlayerCount)

purchase_analysis = pd.DataFrame({"Purchase Count": groupedPlayerCount,
                         "Average Purchase Price": groupedPlayerAvg,
                         "Total Purchase Value": groupedPlayerTotal})

# Sort on the numeric totals first, then format both money columns of the sorted frame
purchase_analysis1 = purchase_analysis.sort_values("Total Purchase Value", ascending=False)
purchase_analysis1["Average Purchase Price"] = purchase_analysis1["Average Purchase Price"].map("${:.2f}".format)
purchase_analysis1["Total Purchase Value"] = purchase_analysis1["Total Purchase Value"].map("${:.2f}".format)
purchase_analysis1 = purchase_analysis1[["Purchase Count", "Average Purchase Price", "Total Purchase Value"]]
purchase_analysis1.head()


| SN | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| Lisosia93 | 5 | $3.79 | $18.96 |
| Idastidru52 | 4 | $3.86 | $15.45 |
| Chamjask73 | 3 | $4.61 | $13.83 |
| Iral74 | 4 | $3.40 | $13.62 |
| Iskadarya95 | 3 | $4.37 | $13.10 |


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
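
The steps above could be sketched roughly as follows. This is a minimal illustration on a made-up frame: the item names and prices are hypothetical, and only the column names `Item ID`, `Item Name`, and `Price` are taken from the purchase data. It also assumes the item price can be taken as the mean of the recorded prices for that item.

```python
import pandas as pd

# Toy purchase log: hypothetical items, not the real data set.
toy_df = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

# Retrieve only the item columns, then group by ID and name.
items = toy_df[["Item ID", "Item Name", "Price"]]
grouped = items.groupby(["Item ID", "Item Name"])

item_summary = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Item Price": grouped["Price"].mean(),
    "Total Purchase Value": grouped["Price"].sum(),
})

# Most purchased items first.
most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```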



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
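
A possible sketch of this step, again on a hypothetical toy frame rather than the real data, is shown below. It sorts on the numeric totals before applying any currency formatting, since formatted strings such as "$9.00" would otherwise be ordered as text rather than as numbers.

```python
import pandas as pd

# Toy purchase log: hypothetical items, not the real data set.
toy_df = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

grouped = toy_df.groupby(["Item ID", "Item Name"])
item_summary = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Item Price": grouped["Price"].mean(),
    "Total Purchase Value": grouped["Price"].sum(),
})

# Sort while the totals are still numeric.
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Format a copy for display so the numeric frame stays usable.
display_df = most_profitable.copy()
for column in ["Item Price", "Total Purchase Value"]:
    display_df[column] = display_df[column].map("${:.2f}".format)
print(display_df.head())
```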

