Skip to content

jammilo/Pandas-Jupyter-Notebook

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Pandas Jupyter Notebook

Background

Fantasy

Analyze the data for an independent gaming company for their most recent fantasy game Heroes of Pymoli as a Lead Analyst.

Like many others in its genre, the game is free-to-play, but players are encouraged to purchase optional items that enhance their playing experience. As a first task, the company would like to generate a report that breaks down the game's purchasing data into meaningful insights.

The final report should include each of the following:

Player Count

  • Total Number of Players
#get the count
playerCount = df.SN.nunique()

#create table
playerCount_df = pd.DataFrame()
playerCount_df["Total Players"] = [playerCount]

#print
playerCount_df

圖片

Purchasing Analysis (Total)

  • Number of Unique Items
  • Average Purchase Price
  • Total Number of Purchases
  • Total Revenue
#get the number
uniqItem = df["Item Name"].nunique()
avePrice = df.Price.mean()
numPurchasing = df["Purchase ID"].count() 
totRev = df.Price.sum()


#create summary table
purchasingAnalysis = pd.DataFrame()
purchasingAnalysis["Number of Unique Items"] = [uniqItem]
purchasingAnalysis["Average Price"] = [avePrice]
purchasingAnalysis["Number of Purchases"] = [numPurchasing]
purchasingAnalysis["Total Revenue"] = [totRev]

#format the number
purchasingAnalysis["Average Price"] = purchasingAnalysis["Average Price"].map("${:.2f}".format)
purchasingAnalysis["Total Revenue"] = purchasingAnalysis["Total Revenue"].map("${:.2f}".format)

purchasingAnalysis

圖片

Gender Demographics

  • Percentage and Count of Male Players
  • Percentage and Count of Female Players
  • Percentage and Count of Other / Non-Disclosed
people_df = df.groupby(['SN','Gender']).size().reset_index().rename(columns={0:'count'})
#make the columns first
genderCount = people_df.Gender.value_counts()
genderPerc = genderCount / len(people_df)

#make the table
genderTable = pd.DataFrame()
genderTable["Total Count"] = genderCount
genderTable["Percentage of Players"] = genderPerc * 100

#format
genderTable["Percentage of Players"] = genderTable["Percentage of Players"].map("{:.2f}%".format)

#print
genderTable

圖片

Purchasing Analysis (Gender)

  • The below each broken by gender
    • Purchase Count
    • Average Purchase Price
    • Total Purchase Value
    • Average Purchase Total per Person by Gender
#get the columns
genderSize = df.groupby("Gender").size()
genderAvgPrice = df.groupby("Gender").Price.mean()
genderTotPrice = df.groupby("Gender").Price.sum()
genderAvgPerPerson = df.groupby("Gender").Price.sum() / genderTable["Total Count"]
#make the table
genderPurchasingTable = pd.DataFrame()
genderPurchasingTable["Purchase Count"] = genderSize
genderPurchasingTable["Average Purchase Price"] = genderAvgPrice
genderPurchasingTable["Total Purchase Value"] = genderTotPrice
genderPurchasingTable["Avg Total Purchase per Person"] = genderAvgPerPerson

#format
genderPurchasingTable["Average Purchase Price"] = genderPurchasingTable["Average Purchase Price"].map("${:,.2f}".format)
genderPurchasingTable["Total Purchase Value"] = genderPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)
genderPurchasingTable["Avg Total Purchase per Person"] = genderPurchasingTable["Avg Total Purchase per Person"].map("${:,.2f}".format)

#print
genderPurchasingTable

圖片

Age Demographics

  • The below each broken into bins of 4 years (i.e. <10, 10-14, 15-19, etc.)
    • Purchase Count
    • Average Purchase Price
    • Total Purchase Value
    • Average Purchase Total per Person by Age Group
# Create bins in which to place values based upon player ages
bins = [0, 9, 14, 19, 24, 29, 34, 39, 50]

# Create labels for these bins
group_labels = [" <10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Slice the data and place it into bins
df["Age Group"] = pd.cut(df["Age"], bins, labels=group_labels).astype(str)
df.head()
people_df2 = df.groupby(['SN','Age Group']).size().reset_index().rename(columns={0:'count'})
people_df2
# Create bins in which to place values based upon player ages
bins = [0, 9, 14, 19, 24, 29, 34, 39, 50]

# Create labels for these bins
group_labels = [" <10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Slice the data and place it into bins
df["Age Group"] = pd.cut(df["Age"], bins, labels=group_labels).astype(str)


ageGroupTable = people_df2.groupby("Age Group").size().reset_index().rename(columns={0:'Total Count'})
ageGroupTable["Percentage of Players"] = ageGroupTable["Total Count"] / len(people_df2) * 100

#format
ageGroupTable["Percentage of Players"] = ageGroupTable["Percentage of Players"].map("{:.2f}%".format)

ageGroupTable = ageGroupTable.set_index("Age Group")
ageGroupTable
#get the columns
ageSize = df.groupby("Age Group").size()
ageAvgPrice = df.groupby("Age Group").Price.mean()
ageTotPrice = df.groupby("Age Group").Price.sum()
ageAvgPerPerson = df.groupby("Age Group").Price.sum() / ageGroupTable["Total Count"] # NOTE THE AGE COUNT USE HERE - HAVE TO GET UNIQUE PEOPLE

#make the table
agePurchasingTable = pd.DataFrame()
agePurchasingTable["Purchase Count"] = ageSize
agePurchasingTable["Average Purchase Price"] = ageAvgPrice
agePurchasingTable["Total Purchase Value"] = ageTotPrice
agePurchasingTable["Avg Total Purchase per Person"] = ageAvgPerPerson

#format
agePurchasingTable["Average Purchase Price"] = agePurchasingTable["Average Purchase Price"].map("${:,.2f}".format)
agePurchasingTable["Total Purchase Value"] = agePurchasingTable["Total Purchase Value"].map("${:,.2f}".format)
agePurchasingTable["Avg Total Purchase per Person"] = agePurchasingTable["Avg Total Purchase per Person"].map("${:,.2f}".format)

#print
agePurchasingTable

圖片

圖片

圖片

Top Spenders

  • Identify the the top 5 spenders in the game by total purchase value, then list (in a table):
    • SN
    • Purchase Count
    • Average Purchase Price
    • Total Purchase Value
#get the columns
sNSize = df.groupby("SN").size()
sNAvgPrice = df.groupby("SN").Price.mean()
sNTotPrice = df.groupby("SN").Price.sum()

#make the table
sNPurchasingTable = pd.DataFrame()
sNPurchasingTable["Purchase Count"] = sNSize
sNPurchasingTable["Average Purchase Price"] = sNAvgPrice
sNPurchasingTable["Total Purchase Value"] = sNTotPrice

#format
sNPurchasingTable["Average Purchase Price"] = sNPurchasingTable["Average Purchase Price"].map("${:,.2f}".format)
sNPurchasingTable["Total Purchase Value"] = sNPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)

#print
sNPurchasingTable = sNPurchasingTable.sort_values(by="Total Purchase Value", ascending=False)
sNPurchasingTable.head()

圖片

Most Popular Items

  • Identify the 5 most popular items by purchase count, then list (in a table):
    • Item ID
    • Item Name
    • Purchase Count
    • Item Price
    • Total Purchase Value
#get the columns
itemSize = df.groupby(["Item ID", "Item Name"]).size()
itemAvgPrice = df.groupby(["Item ID", "Item Name"]).Price.mean()
itemTotPrice = df.groupby(["Item ID", "Item Name"]).Price.sum()

#make the table
itemPurchasingTable = pd.DataFrame()
itemPurchasingTable["Purchase Count"] = itemSize
itemPurchasingTable["Item Price"] = itemAvgPrice
itemPurchasingTable["Total Purchase Value"] = itemTotPrice

#sort
itemPurchasingTable = itemPurchasingTable.sort_values(by="Purchase Count", ascending=False)

#format
itemPurchasingTable["Item Price"] = itemPurchasingTable["Item Price"].map("${:,.2f}".format)
itemPurchasingTable["Total Purchase Value"] = itemPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)

#print
itemPurchasingTable.head()

圖片

Most Profitable Items

  • Identify the 5 most profitable items by total purchase value, then list (in a table):
    • Item ID
    • Item Name
    • Purchase Count
    • Item Price
    • Total Purchase Value
#get the columns
itemSize = df.groupby(["Item ID", "Item Name"]).size()
itemAvgPrice = df.groupby(["Item ID", "Item Name"]).Price.mean()
itemTotPrice = df.groupby(["Item ID", "Item Name"]).Price.sum()

#make the table
itemPurchasingTable = pd.DataFrame()
itemPurchasingTable["Purchase Count"] = itemSize
itemPurchasingTable["Item Price"] = itemAvgPrice
itemPurchasingTable["Total Purchase Value"] = itemTotPrice

#sort
itemPurchasingTable = itemPurchasingTable.sort_values(by="Total Purchase Value", ascending=False)

#format
itemPurchasingTable["Item Price"] = itemPurchasingTable["Item Price"].map("${:,.2f}".format)
itemPurchasingTable["Total Purchase Value"] = itemPurchasingTable["Total Purchase Value"].map("${:,.2f}".format)

#print
itemPurchasingTable.head()

圖片

About

Python Pandas Data Analysis for Gaming Company

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published