### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic falls between 20-24 (44.8%), with secondary groups falling between 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [42]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head()

| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 |


In [43]:
#How many values are in each column
purchase_data.count()

Purchase ID    780
SN             780
Age            780
Gender         780
Item ID        780
Item Name      780
Price          780
dtype: int64

In [44]:
# Number of purchases per player (screen name)
player_counts = purchase_data["SN"].value_counts()
player_counts.head()

Lisosia93      5
Idastidru52    4
Iral74         4
Iskadarya95    3
Lisim78        3
Name: SN, dtype: int64

## Player Count

* Display the total number of players


In [45]:
#Number of total players
player_counts.count()

576
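
Counting the entries of `value_counts()` works, but `nunique()` states the intent more directly. The following is a minimal sketch on a small made-up frame (the rows and the screen name `Fury01` are illustrative assumptions, not taken from `purchase_data.csv`):

```python
import pandas as pd

# Hypothetical toy data: one row per purchase, as in purchase_data
toy = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Iral74", "Fury01"],
    "Price": [3.53, 1.56, 4.88, 3.27],
})

# Number of distinct players, ignoring repeat purchases
total_players = toy["SN"].nunique()

# Same figure obtained the way the cell above does it
via_value_counts = toy["SN"].value_counts().count()

# Small summary frame, as the instructions for this section ask for
player_summary = pd.DataFrame({"Total Players": [total_players]})
print(player_summary)
```

Both approaches should agree; `nunique()` simply avoids building the intermediate counts when only the total is needed.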

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [46]:
# Number of purchases per item name
items = purchase_data["Item Name"].value_counts()
items.head()

Final Critic                                    13
Oathbreaker, Last Hope of the Breaking Storm    12
Nirvana                                          9
Persuasion                                       9
Fiery Glass Crusader                             9
Name: Item Name, dtype: int64

In [47]:
# Total unique items
items.count()

179
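
The name-based count above can differ from an ID-based count. `Final Critic` shows 13 purchases when counted by name here, while the Most Popular Items table lists 8 for Item ID 92, which suggests that one name may be shared by more than one ID. A rough way to check this, sketched on made-up rows (the second ID, 101, is an assumption for illustration only):

```python
import pandas as pd

# Hypothetical toy data: the same item name listed under two different IDs
toy = pd.DataFrame({
    "Item ID": [92, 92, 101, 143],
    "Item Name": ["Final Critic", "Final Critic", "Final Critic", "Frenzied Scimitar"],
})

unique_names = toy["Item Name"].nunique()
unique_ids = toy["Item ID"].nunique()

# Names that map to more than one Item ID
ids_per_name = toy.groupby("Item Name")["Item ID"].nunique()
shared_names = ids_per_name[ids_per_name > 1]

print(unique_names, unique_ids)
print(shared_names)
```

If `shared_names` is non-empty on the real data, counting unique items by `Item ID` is probably the safer choice.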

In [48]:
# Average purchase price (plain mean over all purchases)
paverage = purchase_data["Price"].mean()
print(paverage)

3.050987179487176


In [49]:
# Total number of purchases
purchase_data['Purchase ID'].count()

780

In [50]:
# Total Revenue
total_revenue = purchase_data["Price"].sum()
print(total_revenue)

2379.77
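
The instructions for this section ask for a summary data frame, which the cells above do not build. One possible sketch follows, using made-up rows and column titles chosen for illustration; it is not the notebook's own output:

```python
import pandas as pd

# Hypothetical toy data with the same column names as purchase_data
toy = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3],
    "Item ID": [108, 143, 92, 92],
    "Price": [3.53, 1.56, 4.88, 4.88],
})

# Numeric summary: keep this frame for any further calculations
purchasing_summary = pd.DataFrame({
    "Number of Unique Items": [toy["Item ID"].nunique()],
    "Average Price": [toy["Price"].mean()],
    "Number of Purchases": [len(toy)],
    "Total Revenue": [toy["Price"].sum()],
})

# Display copy: currency columns become strings, so format only at the end
display_summary = purchasing_summary.copy()
for col in ["Average Price", "Total Revenue"]:
    display_summary[col] = display_summary[col].map("${:,.2f}".format)

print(display_summary)
```

Formatting a copy keeps the numeric values available, since formatted columns are strings and can no longer be summed or averaged.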


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [51]:
# Purchase counts by gender (one row per purchase, so repeat buyers are counted more than once)
gender_counts = purchase_data["Gender"].value_counts()
gender_counts

Male                     652
Female                   113
Other / Non-Disclosed     15
Name: Gender, dtype: int64

In [52]:
# Percentage of purchases made by male players
Males = gender_counts['Male'] / gender_counts.sum() * 100
print(Males)

83.58974358974359


In [53]:
# Percentage of purchases made by female players
Females = gender_counts['Female'] / gender_counts.sum() * 100
print(Females)

14.487179487179489


In [54]:
# Percentage of purchases made by Other / Non-Disclosed players
Non = gender_counts['Other / Non-Disclosed'] / gender_counts.sum() * 100
print(Non)

1.9230769230769231
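
The figures above are shares of the 780 purchase rows. Since a player can buy more than once, player-level demographics would need one row per `SN` first. A minimal sketch of that idea on made-up rows (the screen name `Aela59` and all values are illustrative assumptions):

```python
import pandas as pd

# Hypothetical toy data: Aela59 buys three times, Lisim78 twice
toy = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Iral74", "Aela59", "Aela59", "Aela59"],
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
})

# One row per player, so repeat buyers are not counted more than once
players = toy.drop_duplicates(subset="SN")

gender_player_counts = players["Gender"].value_counts()
gender_player_pct = players["Gender"].value_counts(normalize=True) * 100

gender_demographics = pd.DataFrame({
    "Total Count": gender_player_counts,
    "Percentage of Players": gender_player_pct.round(2),
})
print(gender_demographics)
```

In the toy data, purchases are split evenly between genders, yet two of the three players are male, which shows how the two views can diverge.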



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [112]:
gender_groups= purchase_data.groupby(['Gender'])

In [113]:
purchase_count = gender_groups["SN"].count()
print(purchase_count)

Gender
Female                   113
Male                     652
Other / Non-Disclosed     15
Name: SN, dtype: int64


In [114]:
price_average = gender_groups["Price"].mean()
print(price_average)

Gender
Female                   3.203009
Male                     3.017853
Other / Non-Disclosed    3.346000
Name: Price, dtype: float64


In [115]:
price_sum = gender_groups["Price"].sum()
print(price_sum)

Gender
Female                    361.94
Male                     1967.64
Other / Non-Disclosed      50.19
Name: Price, dtype: float64


In [116]:
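# Note: dividing by the purchase count reproduces the average purchase price;
# the average total per person needs the number of unique players instead
average_persn = price_sum / purchase_count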
average_persn.head()

Gender
Female                   3.203009
Male                     3.017853
Other / Non-Disclosed    3.346000
dtype: float64

In [59]:
# TODO: avg total purchase per person is still missing. Results: Female 4.47, Male 4.07, Other / Non-Disclosed 4.56

In [60]:
# Summary data frame of purchasing metrics by gender
summary_data = pd.DataFrame({"Purchase Count": purchase_count, "Average Purchase Price": price_average, "Total Purchase Value": price_sum})
summary_data

| Gender | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| Female | 113 | 3.203009 | 361.94 |
| Male | 652 | 3.017853 | 1967.64 |
| Other / Non-Disclosed | 15 | 3.346 | 50.19 |
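
The average total purchase per person is not computed in this section. One way it could be obtained is to divide each group's total spend by its number of distinct players rather than by its number of purchases. A sketch under that assumption, on made-up rows (the screen name `Aela59` and the prices are illustrative):

```python
import pandas as pd

# Hypothetical toy data: Lisim78 makes two purchases
toy = pd.DataFrame({
    "SN": ["Lisim78", "Lisim78", "Iral74", "Aela59"],
    "Gender": ["Male", "Male", "Male", "Female"],
    "Price": [3.00, 1.00, 4.00, 2.50],
})

by_gender = toy.groupby("Gender")

total_value = by_gender["Price"].sum()
unique_players = by_gender["SN"].nunique()

# Total spend divided by distinct players, not by number of purchases
avg_total_per_person = total_value / unique_players
print(avg_total_per_person)
```

For the male group in the toy data this gives 8.00 / 2 players = 4.00, whereas dividing by the three purchases would give the average purchase price instead.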


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [61]:
# Check the youngest and oldest ages before choosing bins (only the last expression, the maximum, is displayed)
purchase_data["Age"].min()
purchase_data["Age"].max()

45

In [62]:
bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

In [63]:
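# pd.cut closes each interval on the right by default, so age 10 falls in "<10" and age 20 in "15-19"
purchase_data["Age Range"] = pd.cut(purchase_data["Age"], bins, labels=group_names)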
type(purchase_data)

pandas.core.frame.DataFrame
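
Because `pd.cut` uses right-closed intervals by default, the labels above do not line up exactly with the edges in `bins`. The sketch below compares the default with `right=False` on a few hand-picked ages (the ages are illustrative, not drawn from the dataset):

```python
import pandas as pd

# Hypothetical ages chosen to sit on or near the bin edges
ages = pd.Series([7, 10, 19, 20, 24, 25, 45])

bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Default (right=True): intervals are (0, 10], (10, 15], (15, 20], ...
right_closed = pd.cut(ages, bins, labels=group_names)

# right=False: intervals are [0, 10), [10, 15), [15, 20), ...
left_closed = pd.cut(ages, bins, labels=group_names, right=False)

comparison = pd.DataFrame({
    "Age": ages,
    "right=True": right_closed,
    "right=False": left_closed,
})
print(comparison)
```

With `right=False`, ages 10, 20 and 25 land in the groups whose labels actually name them, which may be what the remaining "adjust the ranges" note in this notebook refers to.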

In [64]:
age_range= purchase_data.groupby(['Age Range'])
age_range.count().head()

| Age Range | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| <10 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
| 10-14 | 54 | 54 | 54 | 54 | 54 | 54 | 54 |
| 15-19 | 200 | 200 | 200 | 200 | 200 | 200 | 200 |
| 20-24 | 325 | 325 | 325 | 325 | 325 | 325 | 325 |
| 25-29 | 77 | 77 | 77 | 77 | 77 | 77 | 77 |


In [65]:
total_count = age_range["Age"].count()
print(total_count)

Age Range
<10       32
10-14     54
15-19    200
20-24    325
25-29     77
30-34     52
35-39     33
40+        7
Name: Age, dtype: int64


In [66]:
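# Share of purchases per age group (the denominator is the total number of purchase rows)
percentage_players = total_count / total_count.sum() * 100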
print(percentage_players)

Age Range
<10       4.102564
10-14     6.923077
15-19    25.641026
20-24    41.666667
25-29     9.871795
30-34     6.666667
35-39     4.230769
40+       0.897436
Name: Age, dtype: float64


In [67]:
summary_age = pd.DataFrame({"Total Count": total_count, "Percentage of Players": percentage_players})
summary_age

| Age Range | Total Count | Percentage of Players |
|---|---|---|
| <10 | 32 | 4.102564 |
| 10-14 | 54 | 6.923077 |
| 15-19 | 200 | 25.641026 |
| 20-24 | 325 | 41.666667 |
| 25-29 | 77 | 9.871795 |
| 30-34 | 52 | 6.666667 |
| 35-39 | 33 | 4.230769 |
| 40+ | 7 | 0.897436 |


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [68]:
age_ave_price = age_range['Price'].mean()
print(age_ave_price)

Age Range
<10      3.405000
10-14    2.900000
15-19    3.107800
20-24    3.020431
25-29    2.875584
30-34    2.994423
35-39    3.404545
40+      3.075714
Name: Price, dtype: float64


In [69]:
age_sum_price = age_range['Price'].sum()
print(age_sum_price)

Age Range
<10      108.96
10-14    156.60
15-19    621.56
20-24    981.64
25-29    221.42
30-34    155.71
35-39    112.35
40+       21.53
Name: Price, dtype: float64


In [70]:
# TODO: avg total purchase per person is still missing; the age ranges also need adjusting

In [71]:
summary_age = pd.DataFrame({"Purchase Count": total_count, "Average Purchase Price": age_ave_price, "Total Purchase Value": age_sum_price})
summary_age

| Age Range | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| <10 | 32 | 3.405 | 108.96 |
| 10-14 | 54 | 2.9 | 156.6 |
| 15-19 | 200 | 3.1078 | 621.56 |
| 20-24 | 325 | 3.020431 | 981.64 |
| 25-29 | 77 | 2.875584 | 221.42 |
| 30-34 | 52 | 2.994423 | 155.71 |
| 35-39 | 33 | 3.404545 | 112.35 |
| 40+ | 7 | 3.075714 | 21.53 |
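
The whole age table, including the missing per-person column, could also be produced in a single `groupby(...).agg(...)` call with named aggregation. This is only a sketch: the rows, the screen names and the snake_case column names are made up for illustration, and `right=False` is an assumption about how the bins are meant to work.

```python
import pandas as pd

# Hypothetical toy data: player A1 makes two purchases
toy = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4"],
    "Age": [22, 22, 23, 17, 41],
    "Price": [3.00, 1.00, 4.00, 2.50, 1.50],
})

bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
toy["Age Range"] = pd.cut(toy["Age"], bins, labels=group_names, right=False)

# observed=True keeps only the age groups that actually occur in the data
age_summary = toy.groupby("Age Range", observed=True).agg(
    purchase_count=("Price", "count"),
    average_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Total spend per distinct player in each age group
age_summary["avg_total_per_person"] = age_summary["total_value"] / age_summary["players"]
print(age_summary)
```

Dropping `observed=True` would instead list every label in `group_names`, with zero counts for empty groups.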


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [72]:
sn_group= purchase_data.groupby(['SN'])

In [73]:
purchase_count = sn_group['Price'].count()

In [74]:
purchase_mean = sn_group['Price'].mean()

In [75]:
purchase_sum = sn_group['Price'].sum()

In [76]:
summary_sn = pd.DataFrame({"Purchase Count": purchase_count, "Average Purchase Price": purchase_mean, "Total Purchase Value": purchase_sum})
sn_sorted =summary_sn.sort_values("Total Purchase Value", ascending=False)
sn_sorted.head()

| SN | Purchase Count | Average Purchase Price | Total Purchase Value |
|---|---|---|---|
| Lisosia93 | 5 | 3.792 | 18.96 |
| Idastidru52 | 4 | 3.8625 | 15.45 |
| Chamjask73 | 3 | 4.61 | 13.83 |
| Iral74 | 4 | 3.405 | 13.62 |
| Iskadarya95 | 3 | 4.366667 | 13.1 |


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [85]:
ordered_data= purchase_data[["Item ID", "Item Name", "Price"]]
ordered_data.head()

| | Item ID | Item Name | Price |
|---|---|---|---|
| 0 | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 143 | Frenzied Scimitar | 1.56 |
| 2 | 92 | Final Critic | 4.88 |
| 3 | 100 | Blindscythe | 3.27 |
| 4 | 131 | Fury | 1.44 |


In [104]:
items_groups=ordered_data.groupby(['Item ID', 'Item Name', 'Price'])

In [105]:
item_count= items_groups['Price'].count()
item_price_sum = items_groups['Price'].sum()
price= items_groups['Price']

In [108]:
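summary_items = pd.DataFrame({"Purchase Count": item_count, "Total Purchase Value": item_price_sum})
# Sorted by total purchase value (the "Most Profitable Items" ordering); sort by "Purchase Count" to rank by popularity
sorted_items = summary_items.sort_values('Total Purchase Value', ascending=False)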

sorted_items.head()

| Item ID | Item Name | Price | Purchase Count | Total Purchase Value |
|---|---|---|---|---|
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 4.23 | 12 | 50.76 |
| 82 | Nirvana | 4.9 | 9 | 44.1 |
| 145 | Fiery Glass Crusader | 4.58 | 9 | 41.22 |
| 92 | Final Critic | 4.88 | 8 | 39.04 |
| 103 | Singed Scalpel | 4.35 | 8 | 34.8 |
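
The table above is ordered by total purchase value, while this section's instructions ask for purchase count. The sketch below builds one summary grouped by `Item ID` and `Item Name` and then sorts it both ways. The rows are made up, and taking the first price per group assumes each item has a single price.

```python
import pandas as pd

# Hypothetical toy data: Fury is bought most often, Nirvana earns the most
toy = pd.DataFrame({
    "Item ID": [131, 131, 131, 82, 82, 92],
    "Item Name": ["Fury", "Fury", "Fury", "Nirvana", "Nirvana", "Final Critic"],
    "Price": [1.44, 1.44, 1.44, 4.90, 4.90, 4.88],
})

# Named aggregation; the ** form allows column names that contain spaces
item_summary = toy.groupby(["Item ID", "Item Name"]).agg(
    **{
        "Purchase Count": ("Price", "count"),
        "Item Price": ("Price", "first"),
        "Total Purchase Value": ("Price", "sum"),
    }
)

most_popular = item_summary.sort_values("Purchase Count", ascending=False)
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

print(most_popular.head())
print(most_profitable.head())
```

Keeping `Price` out of the group keys and reporting it as a column gives the `Item Price` layout shown in the Most Profitable Items table.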


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [10]:
# The commands are already above

| Item ID | Item Name | Purchase Count | Item Price | Total Purchase Value |
|---|---|---|---|---|
| 178 | Oathbreaker, Last Hope of the Breaking Storm | 12 | $4.23 | $50.76 |
| 82 | Nirvana | 9 | $4.90 | $44.10 |
| 145 | Fiery Glass Crusader | 9 | $4.58 | $41.22 |
| 92 | Final Critic | 8 | $4.88 | $39.04 |
| 103 | Singed Scalpel | 8 | $4.35 | $34.80 |
