## Heroes of Pymoli Data Analysis

1) The main point to note in the analysis below is that your clientele is mostly men. There are other takeaways to be gained from this data, but keep in mind that about 81 percent of your playerbase, as counted in the Gender Demographics table, is male.

2) Male players also account for most of your purchases (633 out of 780 total purchases).

3) Alongside the highest number of total purchases, male players also have a higher average purchase price than female players (about 2.95 versus 2.82). The Other / Non-Disclosed group has the highest average (about 3.25), but on only 11 purchases. A higher average purchase price yields more revenue if the total number of purchases stays the same. If you grow the female playerbase and the total playerbase, average purchase price will matter less to total revenue.
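As a rough check on the points above, the short sketch below recomputes each group's share of purchases and revenue from the figures in the Purchasing Analysis (Gender) table of this notebook. The dictionary layout and variable names are illustrative only and are not part of the original notebook.

```python
# Figures copied from the Purchasing Analysis (Gender) table in this notebook.
summary = {
    "Female": {"count": 136, "avg_price": 2.815515},
    "Male": {"count": 633, "avg_price": 2.950521},
    "Other / Non-Disclosed": {"count": 11, "avg_price": 3.249091},
}

total_purchases = sum(g["count"] for g in summary.values())

# Revenue per group is the purchase count multiplied by the average price.
revenue = {name: g["count"] * g["avg_price"] for name, g in summary.items()}
total_revenue = sum(revenue.values())

for name, g in summary.items():
    share_of_purchases = g["count"] / total_purchases * 100
    share_of_revenue = revenue[name] / total_revenue * 100
    print(f"{name}: {share_of_purchases:.1f}% of purchases, "
          f"{share_of_revenue:.1f}% of revenue")
```

Because revenue is count times average price, the purchase count dominates here: the average prices differ by well under a dollar across groups, while the counts differ by a factor of more than four.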

In [1]:
import pandas as pd
import numpy as np

In [2]:
purchase_data_df = pd.read_json('purchase_data.json', typ='frame', orient='columns')
purchase_data_df.head()

| | Age | Gender | Item ID | Item Name | Price | SN |
|---|---|---|---|---|---|---|
| 0 | 38 | Male | 165 | Bone Crushing Silver Skewer | 3.37 | Aelalis34 |
| 1 | 21 | Male | 119 | Stormbringer, Dark Blade of Ending Misery | 2.32 | Eolo46 |
| 2 | 34 | Male | 174 | Primitive Blade | 2.46 | Assastnya25 |
| 3 | 21 | Male | 92 | Final Critic | 1.36 | Pheusrical25 |
| 4 | 23 | Male | 63 | Stormfury Mace | 1.27 | Aela59 |


## Player Total

In [3]:
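# value_counts() yields one row per distinct screen name (SN), so its length is the number of unique players
player_count = purchase_data_df['SN'].value_counts()
player_count_unique = len(player_count)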
player_count_table = pd.DataFrame({'Total Players':[player_count_unique]})
player_count_table

| | Total Players |
|---|---|
| 0 | 573 |
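The same count can also be obtained with `Series.nunique()`. The sketch below uses a small made-up purchase log (the prices and the repeat purchase are invented for illustration) to show why the number of rows and the number of players differ.

```python
import pandas as pd

# Toy purchase log (made-up values): one row per purchase,
# so a repeat buyer appears more than once.
toy_df = pd.DataFrame({
    "SN": ["Aela59", "Eolo46", "Aela59", "Assastnya25"],
    "Price": [1.27, 2.32, 3.37, 2.46],
})

rows = len(toy_df)                        # number of purchases
unique_players = toy_df["SN"].nunique()   # number of distinct screen names

player_count_table = pd.DataFrame({"Total Players": [unique_players]})
print(player_count_table)
```

In this toy log there are four purchases but three players, which is the same distinction as 780 purchases versus 573 players in the real data.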


## Purchasing Analysis (Total)

Number of Unique Items

In [33]:
# One row per distinct item name, so the length is the number of unique items
items_unique = purchase_data_df["Item Name"].value_counts()
items_unique_total = len(items_unique)

Average Purchase Price

In [34]:
avg_purchase_price = purchase_data_df["Price"].mean()

Total Number of Purchases

In [35]:
total_number_of_purchases = purchase_data_df["Price"].count()

Total Revenue

In [36]:
total_revenue = purchase_data_df["Price"].sum()

In [8]:
purchasing_analysis = pd.DataFrame({"Number of Unique Items":[items_unique_total], 
                                    "Average Price":[avg_purchase_price],
                                    "Number of Purchases":[total_number_of_purchases],
                                    "Total Revenue":[total_revenue]
})
purchasing_analysis

| | Average Price | Number of Purchases | Number of Unique Items | Total Revenue |
|---|---|---|---|---|
| 0 | 2.931192 | 780 | 179 | 2286.33 |
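The four separate cells above could also be collapsed into a single aggregation. This is a minimal sketch on made-up rows, not the notebook's own method; the item names are borrowed from the sample output, and the rounding to two decimals is an added assumption.

```python
import pandas as pd

# Toy data (made-up rows) with the two columns the summary needs.
toy_df = pd.DataFrame({
    "Item Name": ["Final Critic", "Primitive Blade", "Final Critic"],
    "Price": [1.36, 2.46, 1.36],
})

# Compute mean, count and sum of Price in one pass.
price_stats = toy_df["Price"].agg(["mean", "count", "sum"])

purchasing_analysis = pd.DataFrame({
    "Number of Unique Items": [toy_df["Item Name"].nunique()],
    "Average Price": [round(price_stats["mean"], 2)],
    "Number of Purchases": [int(price_stats["count"])],
    "Total Revenue": [round(price_stats["sum"], 2)],
})
print(purchasing_analysis)
```

Note that counting unique items by `Item Name` can give a different result from counting by `Item ID` if two IDs ever share a name.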


## Gender Demographics

Percentage and Count of Players by Gender

In [68]:
grouped_gender_df = purchase_data_df.groupby(['Gender'])
total_player_count_df = grouped_gender_df["Gender"].count()

In [21]:
# These counts are taken over purchase rows, so a player with several purchases is counted once per purchase
total_player_count = total_player_count_df.sum()
male_player_count = total_player_count_df['Male']
female_player_count = (purchase_data_df.Gender == 'Female').sum()
other_player_count = (purchase_data_df.Gender == 'Other / Non-Disclosed').sum()

In [23]:
male_player_percentage = male_player_count/total_player_count * 100
female_player_percentage = female_player_count/total_player_count * 100
other_player_percentage = other_player_count/total_player_count * 100

In [54]:
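# The list follows the alphabetical index order of total_player_count_df: Female, Male, Other / Non-Disclosed
gender_demo = pd.DataFrame({"Percentage of Players":[female_player_percentage, male_player_percentage, other_player_percentage],
                            "Total Count":total_player_count_df
})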
gender_demo

| Gender | Percentage of Players | Total Count |
|---|---|---|
| Female | 17.435897 | 136 |
| Male | 81.153846 | 633 |
| Other / Non-Disclosed | 1.410256 | 11 |
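The counts in this table add up to 780, the number of purchases, rather than 573, the number of unique players. If a per-player breakdown is wanted, one possible approach is to drop duplicate screen names before counting. The sketch below uses made-up rows (`Lisim78` is a hypothetical screen name) and assumes each screen name maps to a single gender.

```python
import pandas as pd

# Toy purchase log (made-up rows): Aela59 buys twice, Lisim78 three times.
toy_df = pd.DataFrame({
    "SN": ["Aela59", "Aela59", "Eolo46", "Lisim78", "Lisim78", "Lisim78"],
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
})

# Counting rows counts purchases, not players.
purchases_by_gender = toy_df["Gender"].value_counts()

# Keep one row per screen name, then count players.
players = toy_df.drop_duplicates(subset="SN")
players_by_gender = players["Gender"].value_counts()
player_share = players["Gender"].value_counts(normalize=True) * 100

gender_demo = pd.DataFrame({
    "Percentage of Players": player_share.round(2),
    "Total Count": players_by_gender,
})
print(gender_demo)
```

In the toy data the purchase split is 3 to 3 while the player split is 2 to 1, so the two views can diverge noticeably when some players buy more often than others.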


## Purchasing Analysis (Gender)

In [78]:
purchase_count_gender_grouped_df = purchase_data_df.groupby(['Gender'])
purchase_count_gender = purchase_count_gender_grouped_df["Gender"].count()

In [82]:
average_purchase_price_gender = purchase_data_df.groupby('Gender').Price.mean()

In [83]:
total_purchase_value_gender = purchase_data_df.groupby('Gender').Price.sum()
total_purchase_value_male = total_purchase_value_gender['Male'].sum()
total_purchase_value_female = total_purchase_value_gender['Female'].sum()
total_purchase_value_other = total_purchase_value_gender['Other / Non-Disclosed'].sum()

In [84]:
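# Dividing each total by its purchase count reproduces the average purchase price,
# so "Normalized Totals" matches the "Average Purchase Price" column in the summary
normalized_values_male = total_purchase_value_male/(purchase_count_gender['Male'].sum())
normalized_values_female = total_purchase_value_female/(purchase_count_gender['Female'].sum())
normalized_values_other = total_purchase_value_other/(purchase_count_gender['Other / Non-Disclosed'].sum())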

In [85]:
purchasing_analysis_summary = pd.DataFrame({"Purchase Count":purchase_count_gender,
                                            "Average Purchase Price":average_purchase_price_gender,
                                            "Total Purchase Value":total_purchase_value_gender,
                                            "Normalized Totals":[normalized_values_female, normalized_values_male, normalized_values_other]
                                           })

purchasing_analysis_summary

| Gender | Average Purchase Price | Normalized Totals | Purchase Count | Total Purchase Value |
|---|---|---|---|---|
| Female | 2.815515 | 2.815515 | 136 | 382.91 |
| Male | 2.950521 | 2.950521 | 633 | 1867.68 |
| Other / Non-Disclosed | 3.249091 | 3.249091 | 11 | 35.74 |
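In the table above, "Normalized Totals" is identical to "Average Purchase Price" because both divide total value by purchase count. One alternative reading of "normalized" is total value per unique player. The sketch below shows that variant with named aggregation on made-up rows; the snake_case column names and the `total_per_player` column are hypothetical and are not taken from the original notebook.

```python
import pandas as pd

# Toy purchase log (made-up rows and prices).
toy_df = pd.DataFrame({
    "SN": ["Aela59", "Aela59", "Eolo46", "Lisim78"],
    "Gender": ["Male", "Male", "Male", "Female"],
    "Price": [1.00, 2.00, 3.00, 4.00],
})

# One groupby with named aggregations builds every summary column at once.
summary = toy_df.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    average_purchase_price=("Price", "mean"),
    total_purchase_value=("Price", "sum"),
    unique_players=("SN", "nunique"),
)

# Total value divided by unique players, not by purchase count.
summary["total_per_player"] = (
    summary["total_purchase_value"] / summary["unique_players"]
)
print(summary)
```

With this definition the normalized figure differs from the average price whenever players make more than one purchase, as the Male row in the toy data shows (average 2.00, per-player total 3.00).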


## Age Demographics

In [86]:
bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

In [98]:
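# Note: pd.cut is right-closed by default, so with these edges age 10 is labelled "<10" and age 15 is labelled "10-14"
purchase_data_df["Age Demographics"] = pd.cut(purchase_data_df['Age'], bins, labels=group_names)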
age_bins = purchase_data_df.groupby('Age Demographics').count()
age_bins

| Age Demographics | Age | Gender | Item ID | Item Name | Price | SN |
|---|---|---|---|---|---|---|
| <10 | 32 | 32 | 32 | 32 | 32 | 32 |
| 10-14 | 78 | 78 | 78 | 78 | 78 | 78 |
| 15-19 | 184 | 184 | 184 | 184 | 184 | 184 |
| 20-24 | 305 | 305 | 305 | 305 | 305 | 305 |
| 25-29 | 76 | 76 | 76 | 76 | 76 | 76 |
| 30-34 | 58 | 58 | 58 | 58 | 58 | 58 |
| 35-39 | 44 | 44 | 44 | 44 | 44 | 44 |
| 40+ | 3 | 3 | 3 | 3 | 3 | 3 |
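The bin edges used above interact with the default behaviour of `pd.cut`, which treats each interval as closed on the right. The sketch below, using a handful of made-up ages, compares the default with `right=False`, which may match the labels more closely. It is an illustration only; the counts in the table above were produced with the default setting.

```python
import pandas as pd

# Made-up ages chosen to sit on or near the bin edges.
ages = pd.Series([9, 10, 14, 15, 40, 45])
bins = [0, 10, 15, 20, 25, 30, 35, 40, 100]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Default: intervals are (0, 10], (10, 15], ..., so edge values fall in the lower bin.
right_closed = pd.cut(ages, bins, labels=group_names)

# right=False: intervals are [0, 10), [10, 15), ..., so edge values fall in the upper bin.
left_closed = pd.cut(ages, bins, labels=group_names, right=False)

comparison = pd.DataFrame({
    "Age": ages,
    "default": right_closed,
    "right=False": left_closed,
})
print(comparison)
```

With `right=False` the last interval becomes [40, 100), so an age of exactly 100 would fall outside every bin; a larger final edge would be needed if such values can occur.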
