### Heroes Of Pymoli Data Analysis
* Of the 576 unique players in the dataset, the vast majority are male (84%). There is also a smaller but notable proportion of female players (14%).

* Our peak age demographic is 20-24 (44.8%), with secondary groups at 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

In [2]:
# File to Load (Remember to Change These)
video_games = "Resources/purchase_data.csv"

In [3]:
# Read Purchasing File and store into Pandas data frame
video_games_data = pd.read_csv(video_games)

In [4]:
video_games_data.head(4)

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27


In [5]:
video_games_data.describe()

Unnamed: 0,Purchase ID,Age,Item ID,Price
count,780.0,780.0,780.0,780.0
mean,389.5,22.714103,92.114103,3.050987
std,225.310896,6.659444,52.775943,1.169549
min,0.0,7.0,0.0,1.0
25%,194.75,20.0,48.0,1.98
50%,389.5,22.0,93.0,3.15
75%,584.25,25.0,139.0,4.08
max,779.0,45.0,183.0,4.99


In [6]:
# Keep one row per screen name (SN) so each player is counted once
player_count = video_games_data.drop_duplicates(subset="SN")

In [7]:
vg = video_games_data["SN"].count()
vg

780

## Player Count

* Display the total number of players


In [8]:
# Total number of unique players (z is reused later for the gender percentages)
z = player_count["SN"].count()
z

576

In [9]:
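# Purchase rows per gender (not unique players; gender_count is recomputed per player in a later cell)
gender_count = video_games_data.groupby("Gender").count()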
gender_count

Gender,Purchase ID,SN,Age,Item ID,Item Name,Price
Female,113,113,113,113,113,113
Male,652,652,652,652,652,652
Other / Non-Disclosed,15,15,15,15,15,15


In [10]:
players_count = video_games_data.groupby("SN").count()
players_count.describe()

Unnamed: 0,Purchase ID,Age,Gender,Item ID,Item Name,Price
count,576.0,576.0,576.0,576.0,576.0,576.0
mean,1.354167,1.354167,1.354167,1.354167,1.354167,1.354167
std,0.626585,0.626585,0.626585,0.626585,0.626585,0.626585
min,1.0,1.0,1.0,1.0,1.0,1.0
25%,1.0,1.0,1.0,1.0,1.0,1.0
50%,1.0,1.0,1.0,1.0,1.0,1.0
75%,2.0,2.0,2.0,2.0,2.0,2.0
max,5.0,5.0,5.0,5.0,5.0,5.0


In [11]:
video_games_data.columns

Index(['Purchase ID', 'SN', 'Age', 'Gender', 'Item ID', 'Item Name', 'Price'], dtype='object')

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [12]:
# Run basic calculations to obtain number of unique items, average price, etc.
Unique_Items = video_games_data.drop_duplicates(subset="Item ID")
x = Unique_Items["Item ID"].count()

average_price = video_games_data["Price"].mean()

Total_purchases = video_games_data["Purchase ID"].count()

Total_Revenue = video_games_data["Price"].sum()

print(x)
print(average_price)
print(Total_purchases)
print(Total_Revenue)

183
3.050987179487176
780
2379.77


In [13]:
Purchase_Analysis = pd.DataFrame({"# of Unique Items": [x], "Average Price": [average_price],
                                  "# Of Purchases": [Total_purchases], "Total Revenue": [Total_Revenue]})
Purchase_Analysis.round(2)

Unnamed: 0,# of Unique Items,Average Price,# Of Purchases,Total Revenue
0,183,3.05,780,2379.77


In [14]:
#format DataFrame
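
The formatting cell is empty in the notebook. One possible way to fill it in is sketched below, assuming currency-style strings are wanted for the two money columns. The frame is rebuilt from the values printed above so the snippet runs on its own; the name `formatted` is illustrative and does not appear in the notebook.

```python
import pandas as pd

# Values copied from the summary output above so the sketch is self-contained
purchase_analysis = pd.DataFrame({
    "# of Unique Items": [183],
    "Average Price": [3.050987179487176],
    "# Of Purchases": [780],
    "Total Revenue": [2379.77],
})

# Format a copy so the numeric columns stay available for later calculations
formatted = purchase_analysis.copy()
for column in ["Average Price", "Total Revenue"]:
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted)
```

Formatting a copy is a deliberate choice: once a column holds strings such as `$3.05`, it can no longer be summed or averaged.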


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [15]:
gender_count = player_count.groupby("Gender").count()["Age"]
gender_count

Gender
Female                    81
Male                     484
Other / Non-Disclosed     11
Name: Age, dtype: int64

In [16]:
gender_percentage = (gender_count/z)*100
gender_percentage.round(2)

Gender
Female                   14.06
Male                     84.03
Other / Non-Disclosed     1.91
Name: Age, dtype: float64

In [17]:
demographic_Analysis = pd.DataFrame({"Total Count": gender_count,"% of Players":gender_percentage})
demographic_Analysis.sort_values("Total Count",ascending=False)

Gender,Total Count,% of Players
Male,484,84.027778
Female,81,14.0625
Other / Non-Disclosed,11,1.909722



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [18]:
player_count.head(4)

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27


In [19]:
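# Select the column before aggregating so text columns (SN, Item Name) are never averaged or summed
by_gender = video_games_data.groupby("Gender")
purchase_count = by_gender["Item ID"].count()
average_purchase = by_gender["Price"].mean()
total_purchase_value = by_gender["Price"].sum()
# gender_count holds unique players per gender, from the Gender Demographics cells
average_pp_person = total_purchase_value / gender_count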

In [20]:
Purchasing_Analysis = pd.DataFrame({"Purchase Count": purchase_count,"Avg. Purchase Price":average_purchase,
                                    "Total Purchase Value":total_purchase_value, 
                                    "Avg. Total Purchase/Person":average_pp_person})
Purchasing_Analysis.round(2)

Gender,Purchase Count,Avg. Purchase Price,Total Purchase Value,Avg. Total Purchase/Person
Female,113,3.2,361.94,4.47
Male,652,3.02,1967.64,4.07
Other / Non-Disclosed,15,3.35,50.19,4.56


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
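
The notebook has no cell for these steps yet. A minimal sketch of the binning approach follows. It assumes five-year bins with a catch-all `40+` group and uses a handful of made-up rows in place of `purchase_data.csv`. The bin edges, labels, and the names `players`, `age_bins`, and `age_labels` are all illustrative choices.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from purchase_data.csv
sample = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4", "E5"],
    "Age": [20, 20, 24, 9, 17, 41],
})

# One row per player so repeat buyers are not counted twice
players = sample.drop_duplicates(subset="SN")

# Assumed bin edges; pd.cut uses right-closed intervals by default,
# so an age of 24 falls in (19, 24] and is labelled "20-24"
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

age_group = pd.cut(players["Age"], bins=age_bins, labels=age_labels)

# Count players per bin, ordered by bin rather than by frequency
age_counts = age_group.value_counts().sort_index()
age_percent = (age_counts / len(players) * 100).round(2)

age_demographics = pd.DataFrame({"Total Count": age_counts, "% of Players": age_percent})
print(age_demographics)
```

Dropping duplicate `SN` rows first matters here: binning the raw purchase rows would give each player's age extra weight for every additional purchase.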


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
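
These steps could be sketched as follows, again on made-up rows and with assumed bin edges. The per-person average divides each group's total by its number of unique screen names, mirroring how the gender table above divides by unique players. The names `purchases`, `grouped`, and `summary` are illustrative.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from purchase_data.csv
sample = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "D4"],
    "Age": [20, 20, 24, 17, 41],
    "Price": [3.00, 1.00, 4.00, 2.00, 5.00],
})

# Assumed bin edges and labels
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Bin every purchase row (not just unique players) by the buyer's age
purchases = sample.copy()
purchases["Age Group"] = pd.cut(purchases["Age"], bins=age_bins, labels=age_labels)

# observed=True keeps only the age groups that actually occur in the data
grouped = purchases.groupby("Age Group", observed=True)

summary = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Avg. Purchase Price": grouped["Price"].mean(),
    "Total Purchase Value": grouped["Price"].sum(),
    "Unique Players": grouped["SN"].nunique(),
})
summary["Avg. Total Purchase/Person"] = (
    summary["Total Purchase Value"] / summary["Unique Players"]
)

print(summary.round(2))
```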

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
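
No cell implements this section. One way it might look is sketched below on made-up rows, grouping by `SN` and aggregating `Price` in a single pass. The column labels are assumptions modelled on the other summary tables in this notebook.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from purchase_data.csv
sample = pd.DataFrame({
    "SN": ["A1", "A1", "B2", "C3", "C3", "C3"],
    "Price": [3.00, 1.00, 4.50, 2.00, 2.00, 1.00],
})

# Count, mean and sum of Price per player, sorted so the biggest spender is first
top_spenders = (
    sample.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Total Purchase Value", ascending=False)
)

print(top_spenders.head().round(2))
```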



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [21]:
video_games_data.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [22]:
removed_video_games_data = video_games_data[["Item ID", "Item Name", "Price"]]
removed_video_games_data.head()

Unnamed: 0,Item ID,Item Name,Price
0,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,143,Frenzied Scimitar,1.56
2,92,Final Critic,4.88
3,100,Blindscythe,3.27
4,131,Fury,1.44


In [26]:
# Number of purchases per item name
ID_Purchase_Count = video_games_data.groupby("Item Name")["Item ID"].count()
ID_Purchase_Count.head(4)

Item Name
Abyssal Shard                     5
Aetherius, Boon of the Blessed    5
Agatha                            6
Alpha                             3
Name: Item ID, dtype: int64

In [23]:
# Total revenue per item name (despite the variable name, this is a sum, not a unit price)
ID_Item_price = video_games_data.groupby("Item Name")["Price"].sum()
ID_Item_price.head(4)

Item Name
Abyssal Shard                     13.35
Aetherius, Boon of the Blessed    16.95
Agatha                            18.48
Alpha                              6.21
Name: Price, dtype: float64

In [32]:
# Average price paid per item: total revenue divided by purchase count
Item_Price = ID_Item_price / ID_Purchase_Count
Item_Price.head(3)

Item Name
Abyssal Shard                     2.67
Aetherius, Boon of the Blessed    3.39
Agatha                            3.08
dtype: float64

In [33]:
# Recomputes total revenue per item; equivalent to ID_Item_price
Total_Purchase_Value = ID_Purchase_Count * Item_Price
Total_Purchase_Value.head(3)

Item Name
Abyssal Shard                     13.35
Aetherius, Boon of the Blessed    16.95
Agatha                            18.48
dtype: float64

In [36]:
Popular_Item_Analysis = pd.DataFrame({"Purchase Count": ID_Purchase_Count,"Item Price":Item_Price,
                                    "Total Purchase Value":Total_Purchase_Value})
Popular_Item_Analysis.round(2)

Item Name,Purchase Count,Item Price,Total Purchase Value
Abyssal Shard,5,2.67,13.35
"Aetherius, Boon of the Blessed",5,3.39,16.95
Agatha,6,3.08,18.48
Alpha,3,2.07,6.21
"Alpha, Oath of Zeal",3,4.05,12.15
"Alpha, Reach of Ending Hope",1,3.58,3.58
Amnesia,6,2.18,13.08
Apocalyptic Battlescythe,6,1.97,11.82
Arcane Gem,3,3.79,11.37
Avenger,6,3.44,20.64
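
The cells above group by `Item Name` only and do not sort, whereas the instructions for this section ask for grouping by both `Item ID` and `Item Name` and sorting by purchase count. A sketch of that variant follows, on made-up rows in which one name is shared by two IDs. Whether the real data contains such cases is not shown in this notebook, so treat that as an assumption.

```python
import pandas as pd

# Made-up rows for illustration only; "Sword" deliberately appears under two IDs
sample = pd.DataFrame({
    "Item ID": [1, 2, 1, 3, 3, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Shield", "Shield", "Shield"],
    "Price": [2.00, 3.00, 2.00, 1.50, 1.50, 1.50],
})

# Grouping by both keys keeps same-named items with different IDs separate
popular_items = (
    sample.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
    .sort_values("Purchase Count", ascending=False)
)

print(popular_items.head().round(2))
```

With name-only grouping, the two "Sword" entries would collapse into one row whose price is an average of two different unit prices.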


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
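
The sort described here could be sketched as below. The small frame stands in for the item summary table and holds made-up values; `most_profitable` is an illustrative name.

```python
import pandas as pd

# Made-up stand-in for the item summary table
item_summary = pd.DataFrame(
    {
        "Purchase Count": [2, 1, 3],
        "Item Price": [2.00, 3.00, 1.50],
        "Total Purchase Value": [4.00, 3.00, 4.50],
    },
    index=pd.Index(["Sword", "Axe", "Shield"], name="Item Name"),
)

# Highest-revenue items first; round only for display
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

print(most_profitable.head().round(2))
```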



In [None]:
video_games_data