### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is Young Adults, ages 19-35 (71.88%), with secondary groups of Teens, ages 14-18 (15.97%), and Adults, ages 36-50 (5.73%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
heroes = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
heroes_df = pd.read_csv(heroes)
heroes_df.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [17]:
unique_players = heroes_df["SN"].unique()
player_count = len(unique_players)
player_ct_df = pd.DataFrame({"Player Count":[player_count]})
player_ct_df.head()

Unnamed: 0,Player Count
0,576


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [3]:
unique_items = heroes_df["Item ID"].unique()
item_count = len(unique_items)
item_count

avg_price = heroes_df["Price"].mean()
avg_price

total_rev = heroes_df["Price"].sum()
total_rev

num_purchases = heroes_df["Purchase ID"].count()
num_purchases

summary_df = pd.DataFrame([{"Item Count": item_count,"Average Price": avg_price,
                           "Total Revenue": total_rev,"Total Number of Purchases": num_purchases}])
summary_df["Average Price"]=summary_df["Average Price"].astype(float).map("${:.2f}".format)
summary_df["Total Revenue"]=summary_df["Total Revenue"].astype(float).map("${:,.2f}".format)
summary_df.head()

Unnamed: 0,Average Price,Item Count,Total Number of Purchases,Total Revenue
0,$3.05,183,780,"$2,379.77"


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [4]:
male_df = heroes_df.loc[heroes_df["Gender"]=="Male"]
unique_male = male_df["SN"].unique()
num_males = len(unique_male)
per_males = (num_males/player_count)*100

female_df = heroes_df.loc[heroes_df["Gender"]=="Female"]
unique_female = female_df["SN"].unique()
num_females = len(unique_female)
per_females = (num_females/player_count)*100
per_females

other_and_male_df = heroes_df.loc[heroes_df["Gender"]!= "Female"]

other_df = other_and_male_df.loc[other_and_male_df["Gender"]!="Male"]
unique_other = other_df["SN"].unique()
num_other = len(unique_other)
per_other = (num_other/player_count)*100
per_other

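# Keep labels, percentages and counts in the same explicit order.
# heroes_df["Gender"].unique() returns labels in first-appearance order,
# which would only line up with these lists by coincidence.
percentages = [per_males,per_other,per_females]
numbers = [num_males,num_other,num_females]
genders = ["Male","Other / Non-Disclosed","Female"]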
gender_df = pd.DataFrame({"Gender":genders,"Percentage of Players":percentages,"Number of Players":numbers},
                         columns = ["Gender","Percentage of Players","Number of Players"])
gender_df["Percentage of Players"] = gender_df["Percentage of Players"].astype(float).map("{:.2f}%".format)
df = gender_df.set_index("Gender")
df.head()

Gender,Percentage of Players,Number of Players
Male,84.03%,484
Other / Non-Disclosed,1.91%,11
Female,14.06%,81
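
The same counts and percentages can be derived without building three filtered frames by hand. The sketch below is a minimal illustration on made-up data (the `purchases` frame and its screen names are hypothetical, not rows from purchase_data.csv). It keeps one row per player, then lets `value_counts()` carry the gender labels alongside the numbers so the two cannot fall out of order.

```python
import pandas as pd

# Hypothetical purchase log: one row per purchase, so players can repeat.
purchases = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cy3", "Dee4", "Dee4"],
    "Gender": ["Female", "Female", "Male", "Male",
               "Other / Non-Disclosed", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are counted once.
players = purchases.drop_duplicates(subset="SN")

# value_counts() returns counts indexed by gender label.
counts = players["Gender"].value_counts()
percentages = counts / counts.sum() * 100

gender_summary = pd.DataFrame({
    "Percentage of Players": percentages.map("{:.2f}%".format),
    "Number of Players": counts,
})
print(gender_summary)
```

This assumes each screen name maps to a single gender, which is also what the cell above assumes when it counts unique `SN` values per gender.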



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [5]:
male_pur_count = male_df["Purchase ID"].count()
total_male_rev = male_df["Price"].sum()
avg_male_total = total_male_rev/male_pur_count
male_pur_count

female_pur_count = female_df["Purchase ID"].count()
total_female_rev = female_df["Price"].sum()
avg_female_total = total_female_rev/female_pur_count
female_pur_count

other_pur_count = other_df["Purchase ID"].count()
total_other_rev = other_df["Price"].sum()
avg_other_total = total_other_rev/other_pur_count
other_pur_count
avg_other_total
total_pur_value = [total_male_rev,total_other_rev,total_female_rev]
avg_pur_price = [avg_male_total,avg_other_total,avg_female_total]

In [6]:
# male_df, female_df and other_df already contain the "SN" column, so it does not
# need to be reassigned (doing so on a filtered slice triggers SettingWithCopyWarning).
# Average total spend per player, by gender.
grouped_males = male_df.groupby(["SN"])
total_male_pur = grouped_males["Price"].sum().sum()
spend_per_male = total_male_pur/num_males

grouped_females = female_df.groupby(["SN"])
total_female_pur = grouped_females["Price"].sum().sum()
spend_per_female = total_female_pur/num_females

grouped_other = other_df.groupby(["SN"])
total_other_pur = grouped_other["Price"].sum().sum()
spend_per_other = total_other_pur/num_other


avg_per = [spend_per_male,spend_per_other,spend_per_female]
total_num_pur = [male_pur_count,other_pur_count,female_pur_count]


more_gender_df = pd.DataFrame({"Gender": genders, "Total Purchase Value":total_pur_value,
                               "Average Purchase Price":avg_pur_price, "Average Purchase per Person":avg_per,
                              "Purchase Count": total_num_pur}
                             ,columns = ["Gender","Total Purchase Value","Average Purchase Price","Average Purchase per Person",
                                        "Purchase Count"])
more_gender_df["Total Purchase Value"] = more_gender_df["Total Purchase Value"].astype(float).map("${:,.2f}".format)
more_gender_df["Average Purchase Price"] = more_gender_df["Average Purchase Price"].astype(float).map("${:.2f}".format)
more_gender_df["Average Purchase per Person"] = more_gender_df["Average Purchase per Person"].astype(float).map("${:.2f}".format)

df_2 = more_gender_df.set_index("Gender")
df_2.head()


Gender,Total Purchase Value,Average Purchase Price,Average Purchase per Person,Purchase Count
Male,"$1,967.64",$3.02,$4.07,652
Other / Non-Disclosed,$50.19,$3.35,$4.56,15
Female,$361.94,$3.20,$4.47,113
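
The per-gender figures above can also be computed in a single pass. The following is a rough sketch on invented data (the `purchases` frame, its values, and the snake_case column names are assumptions for illustration). It uses pandas named aggregation to get purchase count, total value, average price, and unique player count per gender, then divides to get the average spend per person.

```python
import pandas as pd

# Hypothetical purchase log, not taken from purchase_data.csv.
purchases = pd.DataFrame({
    "SN": ["Ann1", "Ann1", "Bob2", "Cy3", "Cy3", "Cy3"],
    "Gender": ["Female", "Female", "Male", "Male", "Male", "Male"],
    "Price": [2.00, 4.00, 3.00, 1.00, 2.00, 3.00],
})

# One row per gender; each keyword becomes an output column.
by_gender = purchases.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    total_value=("Price", "sum"),
    avg_price=("Price", "mean"),
    players=("SN", "nunique"),
)

# Average total spend per unique player in each gender group.
by_gender["avg_per_person"] = by_gender["total_value"] / by_gender["players"]
print(by_gender)
```

Because the result is indexed by the group label, the same pattern should carry over to the age analysis by grouping on the "Age Range" column instead.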


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [7]:
# pd.cut bins are right-inclusive by default: (0,10], (10,13], (13,18], (18,35], (35,50]
bins = [0,10,13,18,35,50]
age_range = ["Children","Tweens","Teens","Young Adults","Adults"]
heroes_df["Age Range"] = pd.cut(heroes_df["Age"],bins,labels = age_range)

children_df = heroes_df.loc[heroes_df["Age Range"]=="Children"]
unique_children = children_df["SN"].unique()
num_children = len(unique_children)
per_children = (num_children/player_count)*100

tween_df = heroes_df.loc[heroes_df["Age Range"]=="Tweens"]
unique_tween = tween_df["SN"].unique()
num_tween = len(unique_tween)
per_tween = (num_tween/player_count)*100

teen_df = heroes_df.loc[heroes_df["Age Range"]=="Teens"]
unique_teen = teen_df["SN"].unique()
num_teen = len(unique_teen)
per_teen = (num_teen/player_count)*100

y_adult_df = heroes_df.loc[heroes_df["Age Range"]=="Young Adults"]
unique_y_adult = y_adult_df["SN"].unique()
num_y_adult = len(unique_y_adult)
per_y_adult = (num_y_adult/player_count)*100

adult_df = heroes_df.loc[heroes_df["Age Range"]=="Adults"]
unique_adult = adult_df["SN"].unique()
num_adult = len(unique_adult)
per_adult = (num_adult/player_count)*100

num_tot = [num_children,num_tween,num_teen,num_y_adult,num_adult]
age_per = [per_children,per_tween,per_teen,per_y_adult,per_adult]

age_df = pd.DataFrame({"Age Range":age_range,"Number of Players":num_tot,"Percentage of Players":age_per},
                     columns = ["Age Range","Number of Players","Percentage of Players"])
age_df["Percentage of Players"]= age_df["Percentage of Players"].astype(float).map("{:.2f}%".format)
new_age_df = age_df.set_index("Age Range")
new_age_df.head()

Age Range,Number of Players,Percentage of Players
Children,24,4.17%
Tweens,13,2.26%
Teens,92,15.97%
Young Adults,414,71.88%
Adults,33,5.73%
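
It helps to check how `pd.cut()` treats the bin edges, since that decides which group a boundary age falls into. The short sketch below reuses the bins and labels from the cell above on a handful of made-up ages (the `ages` series is hypothetical) and shows that each bin includes its right edge but not its left.

```python
import pandas as pd

# Hypothetical ages chosen to sit on and around the bin edges.
ages = pd.Series([7, 10, 11, 13, 18, 19, 35, 36, 50])
bins = [0, 10, 13, 18, 35, 50]
labels = ["Children", "Tweens", "Teens", "Young Adults", "Adults"]

# Default right=True: intervals are (0,10], (10,13], (13,18], (18,35], (35,50].
age_range = pd.cut(ages, bins, labels=labels)
print(pd.DataFrame({"Age": ages, "Age Range": age_range}))

# Ages outside every bin (here, above 50) come back as NaN.
outside = pd.cut(pd.Series([51]), bins, labels=labels)
print(outside.isna().all())
```

With these bins, "Teens" covers ages 14 to 18 and "Young Adults" covers 19 to 35. Any player older than 50 would be left unlabeled and dropped from the per-group counts.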


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [8]:
purchase_count_children = children_df["Purchase ID"].count()
children_rev = children_df["Price"].sum()
avg_children_pur = children_rev/purchase_count_children
avg_per_children = children_rev/num_children
purchase_count_children

purchase_count_tween = tween_df["Purchase ID"].count()
tween_rev = tween_df["Price"].sum()
avg_tween_pur = tween_rev/purchase_count_tween
avg_per_tween = tween_rev/num_tween
purchase_count_tween

purchase_count_teen = teen_df["Purchase ID"].count()
teen_rev = teen_df["Price"].sum()
avg_teen_pur = teen_rev/purchase_count_teen
avg_per_teen = teen_rev/num_teen
purchase_count_teen

purchase_count_y_adult = y_adult_df["Purchase ID"].count()
y_adult_rev = y_adult_df["Price"].sum()
avg_y_adult_pur = y_adult_rev/purchase_count_y_adult
avg_per_y_adult = y_adult_rev/num_y_adult
purchase_count_y_adult

purchase_count_adult = adult_df["Purchase ID"].count()
adult_rev = adult_df["Price"].sum()
avg_adult_pur = adult_rev/purchase_count_adult
avg_per_adult = adult_rev/num_adult
purchase_count_adult

purchase_count_age = [purchase_count_children,purchase_count_tween,purchase_count_teen,purchase_count_y_adult,
                      purchase_count_adult]
rev_per = [children_rev,tween_rev,teen_rev,y_adult_rev,adult_rev]
avg_pur_age = [avg_children_pur,avg_tween_pur,avg_teen_pur,avg_y_adult_pur,avg_adult_pur]
avg_per_age = [avg_per_children,avg_per_tween,avg_per_teen,avg_per_y_adult,avg_per_adult]

In [9]:
pur_age_df = pd.DataFrame({"Age Range":age_range,"Purchase Count":purchase_count_age,
                          "Total Purchase Value":rev_per,"Average Purchase Price":avg_pur_age,
                          "Average Purchase per Person": avg_per_age},
                         columns=["Age Range","Purchase Count","Total Purchase Value","Average Purchase Price",
                                 "Average Purchase per Person"])
pur_age_df["Total Purchase Value"] = pur_age_df["Total Purchase Value"].astype(float).map("${:,.2f}".format)
pur_age_df["Average Purchase Price"] = pur_age_df["Average Purchase Price"].astype(float).map("${:.2f}".format)
pur_age_df["Average Purchase per Person"] = pur_age_df["Average Purchase per Person"].astype(float).map("${:.2f}".format)
pur_age = pur_age_df.set_index("Age Range")
pur_age.head()

Age Range,Purchase Count,Total Purchase Value,Average Purchase Price,Average Purchase per Person
Children,32,$108.96,$3.41,$4.54
Tweens,17,$44.04,$2.59,$3.39
Teens,115,$349.82,$3.04,$3.80
Young Adults,576,"$1,743.07",$3.03,$4.21
Adults,40,$133.88,$3.35,$4.06


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [10]:
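# Purchases per player (value_counts) and total spend per player (groupby sum)
# are both indexed by SN, so the division below aligns on screen name.
num_pur_sn = heroes_df["SN"].value_counts()
groupe = heroes_df.groupby("SN")
total_spent_sn = groupe["Price"].sum()
avg_pur_sn = total_spent_sn/num_pur_sn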
top_spender_df = pd.DataFrame({"Number of Purchases":num_pur_sn,"Total Spent":total_spent_sn,
                               "Average Purchase Price":avg_pur_sn},columns = ["Number of Purchases","Total Spent",
                                                                              "Average Purchase Price"])
top_spender_df = top_spender_df.sort_values("Total Spent",ascending=False)
top_spender_df["Total Spent"] = top_spender_df["Total Spent"].astype(float).map("${:.2f}".format)
top_spender_df["Average Purchase Price"] = top_spender_df["Average Purchase Price"].astype(float).map("${:.2f}".format)
top_spender_df.index.name = "SN"
top_spender_df.head()


SN,Number of Purchases,Total Spent,Average Purchase Price
Lisosia93,5,$18.96,$3.79
Idastidru52,4,$15.45,$3.86
Chamjask73,3,$13.83,$4.61
Iral74,4,$13.62,$3.40
Iskadarya95,3,$13.10,$4.37
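
The order of operations in the cell above matters: it sorts by "Total Spent" before applying the dollar formatting. The sketch below, on invented totals (the `spend` series is hypothetical), illustrates why. Once the values are formatted they are strings, and strings sort character by character rather than numerically.

```python
import pandas as pd

# Hypothetical per-player totals.
spend = pd.Series({"Ann1": 9.50, "Bob2": 18.96, "Cy3": 100.00}, name="Total Spent")

# Sort while the values are still floats, then format for display.
ranked = spend.sort_values(ascending=False)
display_ranked = ranked.map("${:.2f}".format)
print(display_ranked)

# Formatting first yields strings; "$9.50" then sorts ahead of "$18.96"
# and "$100.00" because "9" compares greater than "1".
formatted_first = spend.map("${:.2f}".format).sort_values(ascending=False)
print(formatted_first)
```

Keeping a numeric copy of the table and formatting only the frame that gets displayed avoids this class of mistake.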


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [11]:
popular_df = heroes_df[["Item ID","Item Name","Price"]]
purchase_count = popular_df["Item Name"].value_counts()

# Note: grouping by "Item Name" alone pools any items that share a name but differ
# in Item ID or price, so "Item Price" below is the mean price of everything sold
# under that name.
pop_df_2 = popular_df.groupby(["Item Name"])
price_per_game = pop_df_2["Price"].mean()
total_rev_per_game = price_per_game*purchase_count
unique_pop_df = pd.DataFrame({"Number of Sales":purchase_count,"Item Price":price_per_game,"Total Revenue":total_rev_per_game},
                            columns = ["Number of Sales","Item Price","Total Revenue"])
unique_pop_df = unique_pop_df.sort_values("Number of Sales",ascending = False)
pop_df_sorted = unique_pop_df.sort_values("Total Revenue",ascending = False) 
unique_pop_df.index.name = "Item Name"
unique_pop_df["Item Price"] = unique_pop_df["Item Price"].astype(float).map("${:.2f}".format)
unique_pop_df["Total Revenue"] = unique_pop_df["Total Revenue"].astype(float).map("${:.2f}".format)
unique_pop_df.head()


Item Name,Number of Sales,Item Price,Total Revenue
Final Critic,13,$4.61,$59.99
"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
Persuasion,9,$3.22,$28.99
Nirvana,9,$4.90,$44.10
"Extraction, Quickblade Of Trembling Hands",9,$3.53,$31.77


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [12]:
pop_df_sorted["Item Price"] = pop_df_sorted["Item Price"].astype(float).map("${:.2f}".format)
pop_df_sorted["Total Revenue"] = pop_df_sorted["Total Revenue"].astype(float).map("${:.2f}".format)
pop_df_sorted.index.name = "Item Name"
pop_df_sorted.head()

Item Name,Number of Sales,Item Price,Total Revenue
Final Critic,13,$4.61,$59.99
"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
Nirvana,9,$4.90,$44.10
Fiery Glass Crusader,9,$4.58,$41.22
Singed Scalpel,8,$4.35,$34.80
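
The instructions for the item tables ask for grouping by both Item ID and Item Name, whereas the cells above group by name only. The following is a small sketch of the two-key grouping on made-up data (the `purchases` frame, item names, prices, and snake_case column names are all hypothetical). It shows that two items sharing a name but carrying different IDs stay on separate rows.

```python
import pandas as pd

# Hypothetical purchases: "Blade" exists under two Item IDs at different prices.
purchases = pd.DataFrame({
    "Item ID": [1, 1, 2, 3],
    "Item Name": ["Blade", "Blade", "Blade", "Shield"],
    "Price": [4.00, 4.00, 5.00, 2.50],
})

# Group on both keys so each (ID, name) pair gets its own row.
items = (
    purchases.groupby(["Item ID", "Item Name"])
    .agg(
        number_of_sales=("Price", "count"),
        item_price=("Price", "first"),
        total_revenue=("Price", "sum"),
    )
    .sort_values("number_of_sales", ascending=False)
)
print(items)

# Re-sorting the numeric frame by revenue gives the "most profitable" view.
print(items.sort_values("total_revenue", ascending=False).head())
```

Taking the `first` price per group assumes a given Item ID always sells at one price. If that does not hold in the real data, `mean` would be the safer choice.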
