# Heroes Of Pymoli Report

### Importing Modules

In [1]:
import pandas as pd
import os

### Reading in Data Files 

In [2]:
#path to the data file
file_url = os.path.join('Data','purchase_data_1.json')

#reading the json file into a DataFrame
user_data_df = pd.read_json(file_url)

### Display DataFrame

In [3]:
user_data_df.head()

|   | Age | Gender | Item ID | Item Name | Price | SN |
|---|-----|--------|---------|-----------|-------|----|
| 0 | 38 | Male | 165 | Bone Crushing Silver Skewer | 3.37 | Aelalis34 |
| 1 | 21 | Male | 119 | Stormbringer, Dark Blade of Ending Misery | 2.32 | Eolo46 |
| 2 | 34 | Male | 174 | Primitive Blade | 2.46 | Assastnya25 |
| 3 | 21 | Male | 92 | Final Critic | 1.36 | Pheusrical25 |
| 4 | 23 | Male | 63 | Stormfury Mace | 1.27 | Aela59 |


## Data Cleaning 

### Check that all columns have the same non-null count: Pass

In [4]:
user_data_df.count()

Age          780
Gender       780
Item ID      780
Item Name    780
Price        780
SN           780
dtype: int64

### Drop rows with missing values and check the column counts again

In [5]:
user_data_df = user_data_df.dropna(how='any')

user_data_df.count()

Age          780
Gender       780
Item ID      780
Item Name    780
Price        780
SN           780
dtype: int64
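
The equal counts above suggest that no values are missing. A more direct check is to count the nulls per column before dropping rows. The following is a minimal sketch on a small made-up frame; `sample_df` and its values are hypothetical stand-ins, not the purchase data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample, not the real purchase data.
sample_df = pd.DataFrame({
    "Age": [38, 21, np.nan],
    "Price": [3.37, 2.32, 1.27],
})

# Number of missing values in each column.
missing_per_column = sample_df.isna().sum()
print(missing_per_column)

# Drop every row that has at least one missing value.
cleaned_df = sample_df.dropna(how="any")
print(len(cleaned_df))
```

If every entry of `missing_per_column` is zero, `dropna` leaves the frame unchanged, which matches the identical counts before and after cleaning in this report.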

### Rename the `SN` column to `Username`

In [6]:
user_data_df = user_data_df.rename(columns={'SN':"Username"})

### Display the DataFrame after Data Cleaning

In [7]:
user_data_df.head()

|   | Age | Gender | Item ID | Item Name | Price | Username |
|---|-----|--------|---------|-----------|-------|----------|
| 0 | 38 | Male | 165 | Bone Crushing Silver Skewer | 3.37 | Aelalis34 |
| 1 | 21 | Male | 119 | Stormbringer, Dark Blade of Ending Misery | 2.32 | Eolo46 |
| 2 | 34 | Male | 174 | Primitive Blade | 2.46 | Assastnya25 |
| 3 | 21 | Male | 92 | Final Critic | 1.36 | Pheusrical25 |
| 4 | 23 | Male | 63 | Stormfury Mace | 1.27 | Aela59 |


## Analysis

* Player Count
* Purchasing Analysis (Total)
* Gender Demographics
* Purchasing Analysis (Gender)
* Age Demographics
* Top Spenders
* Most Popular Items
* Most Profitable Items



### Player Count

In [8]:
player_count = len(user_data_df['Username'].unique())
print("Number of Players: {}".format(player_count))

Number of Players: 573


### Purchasing Analysis (Total)
* Number of Unique Items
* Average Purchase Price
* Total Number of Purchases
* Total Revenue

In [9]:
unique_items = len(user_data_df['Item ID'].unique())
avg_price = user_data_df['Price'].mean()
total_no_of_purchases = user_data_df['Price'].count()
total_revenue = user_data_df['Price'].sum()

purchasing_analysis_df = pd.DataFrame({
    "Number of Unique Items": unique_items,
    "Average Purchase Price": avg_price,
    "Total Number of Purchases": total_no_of_purchases,
    "Total Revenue": total_revenue,
},index = [0])

# data munging: format the dollar columns as currency strings
purchasing_analysis_df['Average Purchase Price'] = purchasing_analysis_df['Average Purchase Price'].map("$ {:,.2f}".format)
purchasing_analysis_df['Total Revenue'] = purchasing_analysis_df['Total Revenue'].map("$ {:,.2f}".format)

purchasing_analysis_df

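|   | Number of Unique Items | Average Purchase Price | Total Number of Purchases | Total Revenue |
|---|------------------------|------------------------|---------------------------|---------------|
| 0 | 183 | $ 2.93 | 780 | $ 2,286.33 |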


### Gender Demographics
* Percentage and Count of Male Players
* Percentage and Count of Female Players
* Percentage and Count of Other / Non-Disclosed

In [24]:
player_demographics = user_data_df[['Gender','Age','Username']]
player_demographics = player_demographics.drop_duplicates()

total_count = player_demographics['Gender'].value_counts()

gender_demographics_df = pd.DataFrame({
    "Total Count": total_count
})

gender_demographics_df

|   | Total Count |
|---|-------------|
| Male | 465 |
| Female | 100 |
| Other / Non-Disclosed | 8 |
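
The section lists percentages as well as counts, but the cell above only produces counts. Below is a minimal sketch of how the percentage column could be added, using a small made-up frame; `sample_df`, its rows, and the column name `Percentage of Players` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for user_data_df (not the real data).
sample_df = pd.DataFrame({
    "Username": ["a1", "a1", "b2", "c3", "d4"],
    "Gender": ["Male", "Male", "Female", "Male", "Other / Non-Disclosed"],
    "Age": [20, 20, 31, 25, 40],
})

# One row per player, so repeat buyers are not counted twice.
players = sample_df[["Gender", "Age", "Username"]].drop_duplicates()

total_count = players["Gender"].value_counts()
percentage = total_count / total_count.sum() * 100

gender_summary = pd.DataFrame({
    "Total Count": total_count,
    "Percentage of Players": percentage.round(2),
})
print(gender_summary)
```

Applied to the counts reported above (465, 100 and 8 out of 573 players), the same arithmetic gives roughly 81.15%, 17.45% and 1.40%.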


### Purchasing Analysis (Gender)

`Each of the following, broken down by gender:`
* Purchase Count
* Average Purchase Price
* Total Purchase Value
* Normalized Totals

In [29]:
# select 'Price' before aggregating so the non-numeric columns are never averaged or summed
gender_prices = user_data_df.groupby('Gender')['Price']
purchase_count = gender_prices.count()
avg_purchase_price = gender_prices.mean()
total_purchase_value = gender_prices.sum()
# total spend per gender divided by the number of unique players of that gender
normalized_total = total_purchase_value / gender_demographics_df['Total Count']

pa = pd.DataFrame({
    "Purchase Count": purchase_count,
    "Average Purchase Price": avg_purchase_price,
    "Total Purchase Value": total_purchase_value,
    "Normalized Totals": normalized_total
})

dollar_value_handler(pa['Average Purchase Price']) 
pa

| Gender | Purchase Count | Average Purchase Price | Total Purchase Value | Normalized Totals |
|--------|----------------|------------------------|----------------------|-------------------|
| Female | 136 | 2.815515 | 382.91 | 3.8291 |
| Male | 633 | 2.950521 | 1867.68 | 4.016516 |
| Other / Non-Disclosed | 11 | 3.249091 | 35.74 | 4.4675 |


### Age Demographics

`Each of the following, broken into 5-year age bins (e.g. <10, 10-14, 15-19, etc.):`
* Purchase Count
* Average Purchase Price
* Total Purchase Value
* Normalized Totals
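
No cell is shown for this section. One possible approach is to bin ages with `pd.cut` and aggregate per bin. This is a sketch on made-up rows; the bin edges, the `40+` upper bin, and the `Age Group` column name are assumptions. Normalized Totals follows the definition used in the gender analysis (total purchase value divided by the number of unique players in the group).

```python
import pandas as pd

# Hypothetical sample standing in for user_data_df (not the real data).
sample_df = pd.DataFrame({
    "Username": ["a1", "a1", "b2", "c3", "d4"],
    "Age": [8, 8, 12, 17, 17],
    "Price": [1.00, 2.00, 3.00, 4.00, 5.00],
})

# pd.cut intervals are right-inclusive by default, so (9, 14] covers ages 10-14.
bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
sample_df["Age Group"] = pd.cut(sample_df["Age"], bins=bins, labels=labels)

# observed=True keeps only the bins that actually contain purchases.
grouped = sample_df.groupby("Age Group", observed=True)
total_value = grouped["Price"].sum()
players_per_group = grouped["Username"].nunique()

age_summary = pd.DataFrame({
    "Purchase Count": grouped["Price"].count(),
    "Average Purchase Price": grouped["Price"].mean(),
    "Total Purchase Value": total_value,
    "Normalized Totals": total_value / players_per_group,
})
print(age_summary)
```

Counting unique usernames per bin, rather than rows, keeps repeat buyers from deflating the normalized totals.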

### Top Spenders
`Identify the top 5 spenders in the game by total purchase value, then list (in a table):`
* Username (SN)
* Purchase Count
* Average Purchase Price
* Total Purchase Value
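
No cell is shown for this section. A minimal sketch of one way to build the table: group by username, aggregate the price column, and sort by total. The rows in `sample_df` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for user_data_df (not the real data).
sample_df = pd.DataFrame({
    "Username": ["a1", "a1", "b2", "c3", "c3", "c3"],
    "Price": [1.50, 2.50, 5.00, 1.00, 1.00, 1.00],
})

# Count, mean and sum of purchases per player.
spenders = sample_df.groupby("Username")["Price"].agg(["count", "mean", "sum"])
spenders.columns = ["Purchase Count", "Average Purchase Price", "Total Purchase Value"]

# Keep the five largest totals, highest first.
top_spenders = spenders.sort_values("Total Purchase Value", ascending=False).head(5)
print(top_spenders)
```

In this sample the player with a single 5.00 purchase ranks above the player with three 1.00 purchases, since the ranking is by total value and not by purchase count.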

### Most Popular Items

`Identify the 5 most popular items by purchase count, then list (in a table):`
* Item ID
* Item Name
* Purchase Count
* Item Price
* Total Purchase Value
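
No cell is shown for this section. The sketch below groups by item and sorts by purchase count. The sample rows are made up, and taking the mean as "Item Price" assumes each item always sells at a single price.

```python
import pandas as pd

# Hypothetical sample standing in for user_data_df (not the real data).
sample_df = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Mace", "Mace", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

# Aggregate purchases per item; the mean stands in for the item price (assumption).
items = sample_df.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "mean", "sum"])
items.columns = ["Purchase Count", "Item Price", "Total Purchase Value"]

# Five most frequently purchased items.
most_popular = items.sort_values("Purchase Count", ascending=False).head(5)
print(most_popular)
```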

### Most Profitable Items

`Identify the 5 most profitable items by total purchase value, then list (in a table):`
* Item ID
* Item Name
* Purchase Count
* Item Price
* Total Purchase Value
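
No cell is shown for this section. A minimal sketch that ranks items by revenue using `DataFrame.nlargest`; the sample rows are made up, and the mean is used as the item price on the assumption that each item has a single price.

```python
import pandas as pd

# Hypothetical sample standing in for user_data_df (not the real data).
sample_df = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Mace", "Mace", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 4.50, 1.25],
})

item_totals = (
    sample_df.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

# Five items with the highest total purchase value.
most_profitable = item_totals.nlargest(5, "Total Purchase Value")
print(most_profitable)
```

In this sample the item bought most often (Sword, three purchases) is not the one with the highest revenue (Mace, 9.00), which is why popularity and profitability are reported separately.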

## Handler Functions
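
The gender analysis calls `dollar_value_handler`, but its definition is not shown here. The following is only a sketch of what such a helper might look like, assuming it takes a numeric Series and returns currency strings in the same `$ {:,.2f}` format used in the purchasing analysis. The signature and return behavior are assumptions.

```python
import pandas as pd

def dollar_value_handler(series):
    """Return a new Series with each number formatted as a dollar string.

    Hypothetical helper: the original definition is not shown in the report.
    """
    return series.map("$ {:,.2f}".format)

prices = pd.Series([2.815515, 2286.33])
formatted = dollar_value_handler(prices)
print(formatted.tolist())
```

A helper written this way returns a new Series and does not modify its argument, so the caller would need to assign the result back, for example `pa['Average Purchase Price'] = dollar_value_handler(pa['Average Purchase Price'])`.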