### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
purchase_data = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
Unclean_purchase_data_df = pd.read_csv(purchase_data)
Unclean_purchase_data_df.head()


,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [2]:
# Drop any rows that contain missing values
purchase_data_df = Unclean_purchase_data_df.dropna(how="any")
purchase_data_df.count()

Purchase ID    780
SN             780
Age            780
Gender         780
Item ID        780
Item Name      780
Price          780
dtype: int64

In [3]:
purchase_data_df.head()

,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


In [4]:
# Count unique players by screen name (SN)
player_count = purchase_data_df["SN"].nunique()
player_count

576

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [5]:
#Calculates the average price
average_price = purchase_data_df["Price"].mean()
average_price_format = '${:.2f}'.format(average_price)
print(average_price_format)

$3.05


In [6]:
# Count the number of purchases (unique purchase IDs)
num_purchaseID = purchase_data_df["Purchase ID"].nunique()
num_purchaseID

780

In [7]:
# Count the number of unique items (by item name)
items_unique = purchase_data_df["Item Name"].nunique()
items_unique

179

In [8]:
#Calculate Total Revenue
total_revenue = purchase_data_df["Price"].sum()
total_revenue_format = "${:.2f}".format(total_revenue)
total_revenue_format

'$2379.77'

In [9]:
# Collect the results in a one-row summary DataFrame
summary1_df = pd.DataFrame({"Number of Unique Items": [items_unique],
                            "Average Price": [average_price_format],
                            "Number of Purchases": [num_purchaseID],
                            "Total Revenue": [total_revenue_format]})
summary1_df

,Number of Unique Items,Average Price,Number of Purchases,Total Revenue
0,179,$3.05,780,$2379.77


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [10]:
# Count the purchase rows for each "Gender" value and store the result in a DataFrame
gender_counts_df = purchase_data_df["Gender"].value_counts().to_frame()
gender_counts_df

,Gender
Male,652
Female,113
Other / Non-Disclosed,15


In [11]:
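# Percentage of players by gender.
# Caution: value_counts() here counts purchases (780 rows), while player_count is the
# number of unique players (576), so these results are inflated and sum to more than 100%.
# Counting one row per "SN" would give true per-player percentages.
percentage_gender_df = (purchase_data_df["Gender"].value_counts().to_frame()/player_count)*100
percentage_gender_df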

,Gender
Male,113.194444
Female,19.618056
Other / Non-Disclosed,2.604167


In [12]:
# Add the percentage to Data Frame
gender_counts_df["Percentage of Players"]= percentage_gender_df
gender_counts_df

,Gender,Percentage of Players
Male,652,113.194444
Female,113,19.618056
Other / Non-Disclosed,15,2.604167


In [13]:
# Format the "Percentage of Players" column as percentage strings
gender_counts_df["Percentage of Players"] = gender_counts_df["Percentage of Players"].map("{:,.2f}%".format)
gender_counts_df

,Gender,Percentage of Players
Male,652,113.19%
Female,113,19.62%
Other / Non-Disclosed,15,2.60%
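
The percentages above exceed 100% because each purchase row is counted, while the denominator is the number of unique players. A minimal sketch of one way to count each player once, assuming the same `SN` and `Gender` column names; the toy data and the `Total Count` column name are hypothetical and only serve to illustrate the idea:

```python
import pandas as pd

# Hypothetical toy data: player "A" buys twice, so there are 4 purchases but 3 players.
toy_df = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Gender": ["Male", "Male", "Female", "Male"],
})

# Keep one row per player before counting, so repeat buyers are counted once.
players_df = toy_df.drop_duplicates(subset="SN")
gender_counts = players_df["Gender"].value_counts()

# Divide by the number of unique players; the shares now sum to 100.
gender_pct = gender_counts / players_df["SN"].nunique() * 100

gender_summary = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_pct.round(2),
})
print(gender_summary)
```

Applied to `purchase_data_df`, the same steps should yield counts that add up to `player_count` rather than to the number of purchases.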



## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, average purchase price, average purchase total per person, etc., by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [15]:
# Sort the rows by gender (this orders the data; it does not group or aggregate it)
group_gender_df = purchase_data_df.sort_values("Gender")
print(group_gender_df)

     Purchase ID           SN  Age                 Gender  Item ID  \
119          119  Frichosia58   15                 Female       87   
667          667    Inasuir29   40                 Female       66   
666          666   Assilsan72   20                 Female       82   
134          134     Phyali88   15                 Female        2   
658          658   Quilassa66    7                 Female      178   
..           ...          ...  ...                    ...      ...   
629          629   Maluncil97   25  Other / Non-Disclosed      107   
350          350    Rairith81   15  Other / Non-Disclosed       34   
637          637       Airi27   24  Other / Non-Disclosed      163   
401          401     Lirtim36   15  Other / Non-Disclosed       46   
82            82   Haerithp41   16  Other / Non-Disclosed      160   

                                        Item Name  Price  
119                      Deluge, Edge of the West   4.43  
667                            Victor Iro
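
Sorting alone does not produce the per-gender figures this section asks for. The sketch below shows one possible approach with `groupby`, run on hypothetical toy data; the output column names are assumptions chosen to mirror the bullet points, not names taken from the original notebook.

```python
import pandas as pd

# Hypothetical toy data using the same column names as purchase_data_df.
toy_df = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Gender": ["Male", "Male", "Female", "Male"],
    "Price": [1.00, 3.00, 2.50, 4.00],
})

by_gender = toy_df.groupby("Gender")

gender_purchases = pd.DataFrame({
    "Purchase Count": by_gender["Price"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})

# The per-person average divides total spend by unique players, not by purchases.
gender_purchases["Avg Total Purchase per Person"] = (
    gender_purchases["Total Purchase Value"] / by_gender["SN"].nunique()
)
print(gender_purchases)
```

In the toy data the two male players spend 8.00 in total across three purchases, so the average purchase price (about 2.67) and the average total per person (4.00) differ.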

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
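
The steps above could be sketched as follows. The bin edges, labels, and toy data are assumptions for illustration; the notebook does not specify them. Players are deduplicated on `SN` first so that each one is counted once.

```python
import pandas as pd

# Hypothetical toy data: player "A" appears twice but should be counted once.
toy_df = pd.DataFrame({
    "SN": ["A", "A", "B", "C", "D"],
    "Age": [8, 8, 15, 22, 41],
})

# Assumed bin edges and labels; pd.cut includes the right edge of each bin by default.
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

players_df = toy_df.drop_duplicates(subset="SN").copy()
players_df["Age Group"] = pd.cut(players_df["Age"], bins=age_bins, labels=age_labels)

# sort=False keeps the groups in bin order instead of ordering by frequency.
age_counts = players_df["Age Group"].value_counts(sort=False)
age_pct = (age_counts / len(players_df) * 100).round(2)

age_summary = pd.DataFrame({
    "Total Count": age_counts,
    "Percentage of Players": age_pct,
})
print(age_summary)
```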


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, average purchase price, average purchase total per person, etc., in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
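
One possible way to carry out these steps is to bin every purchase row by age and aggregate with named aggregation. This is a sketch on hypothetical toy data; the bin edges, labels, and result column names are assumptions.

```python
import pandas as pd

# Hypothetical toy data using the same column names as purchase_data_df.
toy_df = pd.DataFrame({
    "SN": ["A", "A", "B", "C"],
    "Age": [8, 8, 22, 23],
    "Price": [1.00, 3.00, 2.50, 4.50],
})

# Assumed bins; every purchase row gets the age group of its buyer.
age_bins = [0, 9, 14, 19, 24, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25+"]
toy_df["Age Group"] = pd.cut(toy_df["Age"], bins=age_bins, labels=age_labels)

# observed=True keeps only the age groups that actually occur in the data.
by_age = toy_df.groupby("Age Group", observed=True)

age_purchases = by_age.agg(
    purchase_count=("Price", "count"),
    average_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)
age_purchases["avg_total_per_person"] = (
    age_purchases["total_value"] / age_purchases["players"]
)
print(age_purchases)
```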

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
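
A minimal sketch of these steps, assuming purchases are grouped by screen name (`SN`). The toy data and the result column names are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: three players with different purchase histories.
toy_df = pd.DataFrame({
    "SN": ["A", "A", "B", "C", "C", "C"],
    "Price": [1.00, 3.00, 2.50, 4.00, 1.50, 2.00],
})

top_spenders = (
    toy_df.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    # Sort while the values are still numeric.
    .sort_values("Total Purchase Value", ascending=False)
)
print(top_spenders.head())
```

Sorting before any currency formatting matters: once the values are strings such as `'$10.00'`, they sort lexicographically rather than numerically.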



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, average item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
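
The steps above might look like the following sketch. The item names, prices, and result column names are hypothetical; only `Item ID`, `Item Name`, and `Price` come from the data set.

```python
import pandas as pd

# Hypothetical toy purchases of three items.
toy_df = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 3, 3],
    "Item Name": ["Sword", "Sword", "Sword", "Axe", "Bow", "Bow"],
    "Price": [2.00, 2.00, 2.00, 4.50, 3.00, 3.00],
})

# Retrieve only the columns needed for the item analysis.
items = toy_df[["Item ID", "Item Name", "Price"]]

item_summary = (
    items.groupby(["Item ID", "Item Name"])["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Item Price",
        "sum": "Total Purchase Value",
    })
)

most_popular = item_summary.sort_values("Purchase Count", ascending=False)
print(most_popular.head())
```

Grouping on both `Item ID` and `Item Name` keeps the name visible in the result and keeps items apart if two IDs happen to share a name.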



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
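
As a sketch, the item summary can be re-sorted by total purchase value and formatted only for display. The summary table here is built by hand from hypothetical values so that the example runs on its own; in the notebook it would come from the previous section.

```python
import pandas as pd

# Hypothetical item summary, indexed by (Item ID, Item Name).
item_summary = pd.DataFrame(
    {
        "Purchase Count": [3, 2, 1],
        "Item Price": [2.00, 4.50, 3.00],
        "Total Purchase Value": [6.00, 9.00, 3.00],
    },
    index=pd.MultiIndex.from_tuples(
        [(1, "Sword"), (2, "Axe"), (3, "Bow")],
        names=["Item ID", "Item Name"],
    ),
)

# Sort on the numeric column first.
most_profitable = item_summary.sort_values("Total Purchase Value", ascending=False)

# Format a copy for display so the numeric table stays usable for further analysis.
display_df = most_profitable.copy()
for col in ["Item Price", "Total Purchase Value"]:
    display_df[col] = display_df[col].map("${:,.2f}".format)
print(display_df.head())
```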

