### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
file = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
pdata = pd.read_csv(file)

## Player Count

* Display the total number of players


In [2]:
pdata.head()

| | Purchase ID | SN | Age | Gender | Item ID | Item Name | Price |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Lisim78 | 20 | Male | 108 | Extraction, Quickblade Of Trembling Hands | 3.53 |
| 1 | 1 | Lisovynya38 | 40 | Male | 143 | Frenzied Scimitar | 1.56 |
| 2 | 2 | Ithergue48 | 24 | Male | 92 | Final Critic | 4.88 |
| 3 | 3 | Chamassasya86 | 24 | Male | 100 | Blindscythe | 3.27 |
| 4 | 4 | Iskosia90 | 23 | Male | 131 | Fury | 1.44 |


In [3]:
pdata.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 780 entries, 0 to 779
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Purchase ID  780 non-null    int64  
 1   SN           780 non-null    object 
 2   Age          780 non-null    int64  
 3   Gender       780 non-null    object 
 4   Item ID      780 non-null    int64  
 5   Item Name    780 non-null    object 
 6   Price        780 non-null    float64
dtypes: float64(1), int64(3), object(3)
memory usage: 42.8+ KB


In [4]:
snsum = pdata["SN"].unique().size

In [5]:
iid = pdata["Item ID"].unique().size

In [6]:
iid = pdata["Item ID"].nunique()  # cross-check of In [5]: number of distinct items

In [7]:
avgpr = pdata["Price"].mean()

In [8]:
ttlpur = pdata["Purchase ID"].count()

In [9]:
rev = pdata["Price"].sum()

In [10]:
analyze_df = pd.DataFrame({"Total Users":[snsum], "Number of Unique Items":[iid], 
"Average Price" : [f'${avgpr:,.2f}'],
"Number of Purchases" : [ttlpur],
"Total Revenue" : [f'${rev:,.2f}']})

## Player Count/Purchasing Analysis

In [11]:
analyze_df

| | Total Users | Number of Unique Items | Average Price | Number of Purchases | Total Revenue |
|---|---|---|---|---|---|
| 0 | 576 | 179 | $3.05 | 780 | $2,379.77 |


In [12]:
genders = pdata.drop_duplicates("SN")

In [13]:
gdemos = pd.DataFrame(genders)

In [14]:
gdemos["Gender"].value_counts()

Male                     484
Female                    81
Other / Non-Disclosed     11
Name: Gender, dtype: int64

In [15]:
##gdemos["Gender"].value_counts(ascending=False)

In [16]:
gders = gdemos["Gender"].unique()
gders

array(['Male', 'Other / Non-Disclosed', 'Female'], dtype=object)

In [17]:
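# Look counts up by label: positional [0]/[1]/[2] follows the sort order of
# value_counts() (Male, Female, Other), so [2] is Other, not Female.
gcount_by_label = gdemos["Gender"].value_counts()
male = gcount_by_label["Male"]

In [18]:
female = gcount_by_label["Female"]

In [19]:
other = gcount_by_label["Other / Non-Disclosed"]

In [20]:
mper = (male/snsum)

In [21]:
fper = (female/snsum)

In [22]:
oper = (other/snsum)

In [23]:
# Order must match gders: Male, Other / Non-Disclosed, Female
gcounts = (male, other, female)

In [24]:
pctg = (mper, oper, fper)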

In [25]:
gdemo_df = pd.DataFrame()
gdemo_df["Genders"] = gders
gdemo_df["Count"] = gcounts
gdemo_df["Percentage"] = pctg 

In [26]:
gper = gdemo_df["Percentage"].map('{:.2%}'.format)

In [27]:
gdemo_df["Percent"] = gper

In [28]:
del gdemo_df["Percentage"]

## Gender Demographics

In [29]:
gdemo_df.head()

| | Genders | Count | Percent |
|---|---|---|---|
| 0 | Male | 484 | 84.03% |
| 1 | Other / Non-Disclosed | 11 | 1.91% |
| 2 | Female | 81 | 14.06% |


## **do not edit code above this line**


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
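
The steps above can be sketched as follows. This is a minimal illustration, not the notebook's final answer: the `toy` frame is a hypothetical stand-in for `purchase_data.csv`, and the snake_case column names are placeholders. It assumes "avg. purchase total per person" means total purchase value divided by the number of unique `SN` values in each gender.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data.csv (values are illustrative only).
toy = pd.DataFrame({
    "SN": ["a1", "a1", "b2", "c3", "d4"],
    "Gender": ["Male", "Male", "Female", "Male", "Female"],
    "Price": [2.00, 4.00, 3.00, 1.00, 5.00],
})

# One groupby with named aggregation yields every per-gender statistic.
gender_summary = toy.groupby("Gender").agg(
    purchase_count=("Price", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Assumed definition: total spend divided by unique players in the group.
gender_summary["avg_total_per_person"] = (
    gender_summary["total_value"] / gender_summary["players"]
)

# Format a copy for display so the numeric frame stays usable for math.
gender_display = gender_summary.copy()
for col in ["avg_price", "total_value", "avg_total_per_person"]:
    gender_display[col] = gender_display[col].map("${:,.2f}".format)

print(gender_display)
```

Formatting a copy rather than the original keeps the numeric columns available for later sorting or arithmetic.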

In [38]:
cbg = pdata["Gender"].value_counts()
cbg

Male                     652
Female                   113
Other / Non-Disclosed     15
Name: Gender, dtype: int64

In [34]:
### test code

# MultiIndex.from_arrays builds index levels, not data columns, so the
# summary is built directly from a groupby on Gender instead.
by_gender = pdata.groupby("Gender")["Price"]
df = pd.DataFrame({"Purchase Count": by_gender.count(),
                   "Average Purchase Price": by_gender.mean(),
                   "Total Purchase Value": by_gender.sum()})
df

In [None]:
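# Reference notes: grouping on MultiIndex levels (pandas groupby example,
# kept as comments; not run against pdata).
#
#                 Max Speed
# Animal Type
# Falcon Captive      390.0
#        Wild         350.0
# Parrot Captive       30.0
#        Wild          20.0
#
# df.groupby(level=0).mean()
#         Max Speed
# Animal
# Falcon      370.0
# Parrot       25.0
#
# df.groupby(level="Type").mean()
#          Max Speed
# Type
# Captive      210.0
# Wild         185.0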

## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table
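
One possible way to work through these steps with `pd.cut()` is sketched below. The bin edges, labels, and the `toy` frame are all assumptions chosen for illustration; adjust them to the brackets you want to report. Players are de-duplicated on `SN` first so that repeat buyers are counted once.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data.csv (values are illustrative only).
toy = pd.DataFrame({
    "SN": ["a1", "a1", "b2", "c3", "d4", "e5"],
    "Age": [9, 9, 12, 22, 24, 41],
})

# Assumed bin edges and labels. pd.cut uses right-closed intervals by
# default, so the edge 9 puts age 9 in the first bin and age 10 in the second.
age_bins = [0, 9, 14, 19, 24, 29, 34, 39, 200]
age_labels = ["Under 10", "10-14", "15-19", "20-24",
              "25-29", "30-34", "35-39", "40+"]

# Count each player once, then assign an age group.
players = toy.drop_duplicates("SN").copy()
players["Age Group"] = pd.cut(players["Age"], bins=age_bins, labels=age_labels)

# sort_index() orders the rows by bin rather than by count.
age_counts = players["Age Group"].value_counts().sort_index()

age_demo = pd.DataFrame({
    "Total Count": age_counts,
    "Percentage of Players": (age_counts / len(players) * 100).round(2),
})

print(age_demo)
```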


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame
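
A hedged sketch of these steps follows. Here the binning is applied to every purchase row rather than to unique players, since the statistics are per purchase. The bins, labels, `toy` data, and snake_case column names are illustrative assumptions. Passing `observed=True` keeps only age groups that actually occur in the data.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data.csv (values are illustrative only).
toy = pd.DataFrame({
    "SN": ["a1", "a1", "b2", "c3", "d4"],
    "Age": [21, 21, 23, 31, 33],
    "Price": [2.00, 4.00, 3.00, 1.00, 5.00],
})

# Assumed bins and labels, coarser than a real report would likely use.
age_bins = [0, 19, 29, 39, 200]
age_labels = ["Under 20", "20-29", "30-39", "40+"]
toy["Age Group"] = pd.cut(toy["Age"], bins=age_bins, labels=age_labels)

# observed=True drops age groups with no purchases from the result.
age_purchases = toy.groupby("Age Group", observed=True).agg(
    purchase_count=("Price", "count"),
    avg_price=("Price", "mean"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Assumed definition: total spend divided by unique players in the group.
age_purchases["avg_total_per_person"] = (
    age_purchases["total_value"] / age_purchases["players"]
)

print(age_purchases.round(2))
```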

## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
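
As a rough sketch of the steps above, the purchases can be grouped by `SN` and sorted on the total. The `toy` frame and the column names `purchase_count`, `avg_price`, and `total_value` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data.csv (values are illustrative only).
toy = pd.DataFrame({
    "SN": ["a1", "a1", "b2", "c3", "c3", "c3"],
    "Price": [2.00, 4.00, 3.50, 1.00, 1.50, 2.00],
})

# Aggregate per player, then sort by total spend, largest first.
top_spenders = (
    toy.groupby("SN")["Price"]
    .agg(purchase_count="count", avg_price="mean", total_value="sum")
    .sort_values("total_value", ascending=False)
)

# head() gives the preview; the full frame stays sorted for later use.
print(top_spenders.head())
```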



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, average item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
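
The grouping described above might look like the sketch below. It assumes the price column is named `Price`, as in the loaded data, and uses a hypothetical `toy` frame with made-up item names. Grouping on both `Item ID` and `Item Name` produces a two-level index.

```python
import pandas as pd

# Hypothetical stand-in for purchase_data.csv (values are illustrative only).
toy = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 3, 3],
    "Item Name": ["Alpha", "Alpha", "Alpha", "Beta", "Gamma", "Gamma"],
    "Price": [1.00, 1.00, 1.00, 4.50, 2.00, 2.00],
})

# Keep only the columns the item analysis needs.
items = toy[["Item ID", "Item Name", "Price"]]

# Group on ID and name together; the result has a two-level index.
item_summary = items.groupby(["Item ID", "Item Name"])["Price"].agg(
    purchase_count="count", item_price="mean", total_value="sum"
)

# Most popular means most purchases, so sort on the count.
most_popular = item_summary.sort_values("purchase_count", ascending=False)

print(most_popular.head())
```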



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame
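
A minimal sketch of the re-sort, assuming an item summary shaped like the one described in the Most Popular Items section. The `item_summary` frame here is built by hand with hypothetical values so the example runs on its own.

```python
import pandas as pd

# Hypothetical item summary (values and column names are illustrative only).
item_summary = pd.DataFrame(
    {
        "purchase_count": [3, 1, 2],
        "item_price": [1.00, 4.50, 2.00],
        "total_value": [3.00, 4.50, 4.00],
    },
    index=pd.MultiIndex.from_tuples(
        [(1, "Alpha"), (2, "Beta"), (3, "Gamma")],
        names=["Item ID", "Item Name"],
    ),
)

# Sort on the numeric totals first, then format a copy for display.
most_profitable = item_summary.sort_values("total_value", ascending=False)

display_table = most_profitable.copy()
for col in ["item_price", "total_value"]:
    display_table[col] = display_table[col].map("${:,.2f}".format)

print(display_table.head())
```

Sorting before formatting matters: once the values are strings such as `$4.50`, they sort alphabetically rather than numerically.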

