# Game Analysis

## Analysis of in-app purchases of a game

* Analysis is done in Python using the pandas library

* Includes summary of purchase information and demographic breakdown of users

* Insights included at conclusion

## Import pandas then load the "purchase_data.csv" file

In [1]:
# Import the pandas library
import pandas as pd

# File to Load
file_to_load = "purchase_data.csv"

# Read csv file of purchase data and store into Pandas data frame
purchase_df = pd.read_csv(file_to_load)

# Display first 5 rows
purchase_df.head()

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Total number of unique players

In [2]:
# total_players holds number of unique SN (screen names)
total_players = purchase_df["SN"].nunique()

# Convert to dataframe to display
players_df = pd.DataFrame({
    "Total Players": [total_players]
})
players_df

Unnamed: 0,Total Players
0,576


## Purchasing Analysis (Total)

### Displays a summary dataframe with the following calculations:

* Number of Unique Items

* Average Price

* Number of Purchases

* Total Revenue

In [3]:
# Confirm data types before analysis
print(purchase_df.dtypes)

Purchase ID      int64
SN              object
Age              int64
Gender          object
Item ID          int64
Item Name       object
Price          float64
dtype: object


In [4]:
# Unique number of items
unique_items = purchase_df["Item ID"].nunique()

# Average price of items
average_price = purchase_df["Price"].mean()

# Total number of purchases
total_purchase = purchase_df["Purchase ID"].count()

# Total revenue
total_revenue = purchase_df["Price"].sum()

# Convert variables into a dataframe
totals_df = pd.DataFrame({
    "Number of Unique Items": [unique_items],
    "Average Price": [average_price],
    "Number of Purchases": [total_purchase],
    "Total Revenue": [total_revenue]
})

# Change formatting for financial columns
totals_df["Average Price"] = totals_df["Average Price"].map("${:,.2f}".format)
totals_df["Total Revenue"] = totals_df["Total Revenue"].map("${:,.2f}".format)

# Display results
totals_df


Unnamed: 0,Number of Unique Items,Average Price,Number of Purchases,Total Revenue
0,179,$3.05,780,"$2,379.77"


## Gender Demographics

Count and Percentage of Players by Gender:
 * Male, Female, and Other / Non-Disclosed


In [15]:
# Group by gender
gender_df = purchase_df.groupby(["Gender"])

# Variable with unique players based on SN (screen name)
count_gender = gender_df["SN"].nunique()

# Variable with percentage of players (reuses the per-gender counts)
percent_gender = count_gender / count_gender.sum() * 100

# New dataframe with summary variables and formatting
summary_gender_df = pd.DataFrame({
    "Total Count": count_gender,
    "Percentage of Players": percent_gender.map("{:.2f}%".format)
})

# Sort by descending values
summary_gender_df = summary_gender_df.sort_values(["Total Count"], ascending=False)

# Remove index name in the corner
summary_gender_df.index.name = None

# Display new dataframe
summary_gender_df


Unnamed: 0,Total Count,Percentage of Players
Male,484,84.03%
Female,81,14.06%
Other / Non-Disclosed,11,1.91%



## Purchasing Analysis (Gender)

Purchasing calculations aggregated by gender:
* Purchase Count
* Average Purchase Price
* Total Purchase Value
* Avg Total Purchase per Person

In [21]:
# Uses the gender_df created in previous step (dataframe grouped by gender)

# Purchase Count
count_purchase_gender = gender_df["Purchase ID"].count()

# Average Purchase Price
avg_price_gender = gender_df["Price"].mean()

# Total Purchase Value
total_price_gender = gender_df["Price"].sum()

# Avg Total Purchase per Person
avg_price_per_person_gender = total_price_gender / count_gender

# Place variables into dataframe and format price numbers
purchase_gender_df = pd.DataFrame({
    "Purchase Count": count_purchase_gender,
    "Average Purchase Price": avg_price_gender.map("${:,.2f}".format),
    "Total Purchase Value": total_price_gender.map("${:,.2f}".format),
    "Avg Total Purchase per Person": avg_price_per_person_gender.map("${:,.2f}".format)
})

# Display dataframe
purchase_gender_df

Gender,Purchase Count,Average Purchase Price,Total Purchase Value,Avg Total Purchase per Person
Female,113,$3.20,$361.94,$4.47
Male,652,$3.02,"$1,967.64",$4.07
Other / Non-Disclosed,15,$3.35,$50.19,$4.56


## Age Demographics

* Establish bins for ages


* Categorize the existing players into the age bins using pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
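
The steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual cell: the sample rows are hypothetical stand-ins for purchase_data.csv, and the bin edges and labels (age_bins, age_labels) are assumptions chosen for the example. It also assumes each screen name reports a single age.

```python
import pandas as pd

# Hypothetical sample rows standing in for purchase_data.csv (not real data)
purchase_df = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Age": [9, 22, 9, 24, 41, 22],
})

# Assumed bin edges and labels; with right=False each bin is [low, high)
age_bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Categorize every purchase row by the buyer's age
purchase_df["Age Group"] = pd.cut(
    purchase_df["Age"], bins=age_bins, labels=age_labels, right=False
)

# Keep one row per player so repeat buyers are counted once
players = purchase_df.drop_duplicates(subset="SN")

# Count and percentage of players per age group, in bin order
count_age = players["Age Group"].value_counts().sort_index()
percent_age = count_age / count_age.sum() * 100

# Summary dataframe with the percentage rounded to two decimals
summary_age_df = pd.DataFrame({
    "Total Count": count_age,
    "Percentage of Players": percent_age.map("{:.2f}%".format),
})
summary_age_df.index.name = None
print(summary_age_df)
```

Dropping duplicate screen names before counting keeps the table consistent with the Player Count section, which counts unique players rather than purchases.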


## Purchasing Analysis (Age)

* Bin the purchase_df data frame by age


* Calculate purchase count, average purchase price, total purchase value, and average total purchase per person for each age group


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
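
One possible way to carry out these steps, mirroring the gender breakdown earlier in the notebook, is sketched below. The sample rows are hypothetical, and the bin edges and labels are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical sample rows standing in for purchase_data.csv (not real data)
purchase_df = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Age": [9, 22, 9, 24, 41, 22],
    "Price": [1.00, 2.50, 2.50, 4.00, 2.50, 4.00],
})

# Assumed bin edges and labels; with right=False each bin is [low, high)
age_bins = [0, 10, 15, 20, 25, 30, 35, 40, 200]
age_labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]
purchase_df["Age Group"] = pd.cut(
    purchase_df["Age"], bins=age_bins, labels=age_labels, right=False
)

# Group by age bin; observed=True skips bins that contain no purchases
age_grouped = purchase_df.groupby("Age Group", observed=True)

count_purchase_age = age_grouped["Purchase ID"].count()
avg_price_age = age_grouped["Price"].mean()
total_price_age = age_grouped["Price"].sum()

# Divide by unique players (not purchases) to get spend per person
avg_per_person_age = total_price_age / age_grouped["SN"].nunique()

# Summary dataframe with currency formatting
purchase_age_df = pd.DataFrame({
    "Purchase Count": count_purchase_age,
    "Average Purchase Price": avg_price_age.map("${:,.2f}".format),
    "Total Purchase Value": total_price_age.map("${:,.2f}".format),
    "Avg Total Purchase per Person": avg_per_person_age.map("${:,.2f}".format),
})
print(purchase_age_df)
```

Skipping empty bins avoids dividing by a player count of zero when computing the per-person average.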

## Top Spenders

* Calculate purchase count, average purchase price, and total purchase value for each player


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
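
A minimal sketch of a top-spenders table is shown below, assuming the summary holds purchase count, average purchase price, and total purchase value per screen name. The sample rows are hypothetical and the variable names (sn_grouped, spenders_df) are illustrative.

```python
import pandas as pd

# Hypothetical sample rows standing in for purchase_data.csv (not real data)
purchase_df = pd.DataFrame({
    "Purchase ID": [0, 1, 2, 3, 4, 5],
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Dan4", "Bob2"],
    "Price": [1.00, 2.50, 2.50, 4.00, 2.50, 4.00],
})

# Group purchases by screen name
sn_grouped = purchase_df.groupby("SN")

spenders_df = pd.DataFrame({
    "Purchase Count": sn_grouped["Purchase ID"].count(),
    "Average Purchase Price": sn_grouped["Price"].mean(),
    "Total Purchase Value": sn_grouped["Price"].sum(),
})

# Sort while the values are still numeric; formatted strings would sort as text
spenders_df = spenders_df.sort_values("Total Purchase Value", ascending=False)

# Apply currency formatting after sorting
for column in ["Average Purchase Price", "Total Purchase Value"]:
    spenders_df[column] = spenders_df[column].map("${:,.2f}".format)

# Preview the highest spenders
print(spenders_df.head())
```

Formatting is applied last because a string such as "$10.00" would sort before "$9.00".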



## Most Popular Items

* Retrieve the Item ID, Item Name, and Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, average item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
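
The item-level summary might be built along these lines. This is a sketch on hypothetical sample rows, and it assumes each item sells at a single price, so the mean price per group equals the item price.

```python
import pandas as pd

# Hypothetical sample rows standing in for purchase_data.csv (not real data)
purchase_df = pd.DataFrame({
    "Item ID": [1, 2, 2, 3, 2, 3],
    "Item Name": ["Sword", "Shield", "Shield", "Bow", "Shield", "Bow"],
    "Price": [1.00, 2.50, 2.50, 4.00, 2.50, 4.00],
})

# Keep only the item columns, then group by ID and name
items = purchase_df[["Item ID", "Item Name", "Price"]]
item_grouped = items.groupby(["Item ID", "Item Name"])

items_df = pd.DataFrame({
    "Purchase Count": item_grouped["Price"].count(),
    "Item Price": item_grouped["Price"].mean(),
    "Total Purchase Value": item_grouped["Price"].sum(),
})

# Most purchased items first; sort before formatting so values stay numeric
popular_df = items_df.sort_values("Purchase Count", ascending=False)

# Currency formatting for display
for column in ["Item Price", "Total Purchase Value"]:
    popular_df[column] = popular_df[column].map("${:,.2f}".format)

print(popular_df.head())
```

Grouping on both Item ID and Item Name keeps the name visible in the index while still distinguishing items that might share a name.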



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
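
Re-sorting the same item summary by total purchase value could look like the sketch below. The sample rows are hypothetical, and the summary is rebuilt here only so the example runs on its own.

```python
import pandas as pd

# Hypothetical sample rows standing in for purchase_data.csv (not real data)
purchase_df = pd.DataFrame({
    "Item ID": [1, 2, 2, 3, 2, 3],
    "Item Name": ["Sword", "Shield", "Shield", "Bow", "Shield", "Bow"],
    "Price": [1.00, 2.50, 2.50, 4.00, 2.50, 4.00],
})

# Rebuild the numeric item summary (same shape as the Most Popular Items table)
item_grouped = purchase_df.groupby(["Item ID", "Item Name"])
items_df = pd.DataFrame({
    "Purchase Count": item_grouped["Price"].count(),
    "Item Price": item_grouped["Price"].mean(),
    "Total Purchase Value": item_grouped["Price"].sum(),
})

# Highest revenue first; sorting uses the unformatted numeric column
profitable_df = items_df.sort_values("Total Purchase Value", ascending=False)

# Currency formatting for display
for column in ["Item Price", "Total Purchase Value"]:
    profitable_df[column] = profitable_df[column].map("${:,.2f}".format)

print(profitable_df.head())
```

In this sample the most purchased item (Shield, 3 purchases) is not the most profitable one (Bow, $8.00 total), which is why the two rankings are reported separately.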

