# Heroes Of Pymoli Data Analysis
* Of the 573 active players, the vast majority are male (81%). There is also a smaller but notable proportion of female players (17%).

* Our peak age demographic is 20-24 (42%), with secondary groups at 15-19 (17.80%) and 25-29 (15.48%).

* Our players spend consistently over the lifetime of their gameplay. Across all major age and gender demographics, the average purchase price is roughly $3 ($2.93 overall), and the average player spends about $4 in total.

-----

In [2]:
import pandas as pd
import numpy


## Player Count

In [30]:
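file = 'purchase_data.json'

# Load the purchase log: one row per purchase
df = pd.read_json(file)

# Count distinct screen names to get the number of players
sn_count = df["SN"].nunique()
total_players = pd.DataFrame([sn_count], columns=["Total Players"])
print(total_players)
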
   Total Players
0            573
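
The player count comes down to counting distinct screen names, since each row of the data is a purchase rather than a player. A minimal sketch of that idea follows, using a small made-up DataFrame (the screen names and prices are hypothetical, not taken from `purchase_data.json`):

```python
import pandas as pd

# Hypothetical purchase log: one row per purchase, "SN" is the player's screen name.
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3"],
    "Price": [1.50, 2.00, 3.25, 4.00],
})

# Distinct players, regardless of how many purchases each one made.
toy_player_count = toy["SN"].nunique()
toy_total_players = pd.DataFrame({"Total Players": [toy_player_count]})
print(toy_total_players)
```

Here `Ann1` appears twice but is counted once, so the toy log has four purchases and three players.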


## Purchasing Analysis (Total)

In [35]:
unique_things=df["Item ID"].nunique()
total_sales=round(df["Price"].sum(),2)
mean_price=round(df["Price"].mean(),2)
# Number of purchases (a count, not a dollar amount)
purchase_count=df["Price"].count()
print("Purchasing Analysis-Total")
print("The number of unique items was " + str(unique_things) + ".")
print("The average purchase price was $"+str(mean_price)+".")
print("The total number of purchases was "+str(purchase_count)+".")
print("The total revenue was $"+str(total_sales)+".")

Purchasing Analysis-Total
The number of unique items was 183.
The average purchase price was $2.93.
The total number of purchases was 780.
The total revenue was $2286.33.


## Gender Demographics

In [61]:
# One row per player: keep the last purchase recorded for each screen name
players_df = df.drop_duplicates(['SN'], keep ='last')
# Determine the number of male, female, and other players
malePlayer_df=players_df.loc[players_df["Gender"]=="Male"]
malePlayers=len(malePlayer_df)
femalePlayer_df=players_df.loc[players_df["Gender"]=="Female"]
femalePlayers=len(femalePlayer_df)
otherPlayer_df=players_df.loc[players_df["Gender"]=="Other / Non-Disclosed"]
otherPlayers=len(otherPlayer_df)

print("There were",malePlayers,"males,",femalePlayers,"females, and",otherPlayers,"other (non-disclosed) players.")
# Format each share as a percentage with two decimals to avoid floating-point noise
print(f"Of all players, {malePlayers/sn_count:.2%} were male, {femalePlayers/sn_count:.2%} were female, and {otherPlayers/sn_count:.2%} were other (non-disclosed).")

There were 465 males, 100 females, and 8 other (non-disclosed) players.
Of all players, 81.15% were male, 17.45% were female, and 1.40% were other (non-disclosed).
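
The same counts and shares can be produced in one step with `value_counts`, which avoids filtering once per gender. The following is only a sketch on made-up data (the rows and the names `toy`, `toy_players`, and `gender_table` are hypothetical):

```python
import pandas as pd

# Hypothetical purchase log; "Ann1" bought twice.
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4"],
    "Gender": ["Female", "Male", "Female", "Male", "Male"],
})

# Reduce to one row per player before counting genders.
toy_players = toy.drop_duplicates("SN")

gender_counts = toy_players["Gender"].value_counts()
# normalize=True returns fractions of the total; scale to percentages.
gender_share = (toy_players["Gender"].value_counts(normalize=True) * 100).round(2)

gender_table = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_share,
})
print(gender_table)
```

Deduplicating first matters: counting genders on the raw purchase log would weight each player by how many purchases they made.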


## Purchasing Analysis (Gender)

In [67]:
# Male players: purchase count, average price, total value, and total value per player
male_df=df.loc[df["Gender"]=="Male"]
male_purchase_count=male_df["Item ID"].count()
male_mean_price=round(male_df["Price"].mean(),2)
male_total_price=round(male_df["Price"].sum(),2)
male_player_df=players_df.loc[players_df["Gender"]=="Male"]
male_players=len(male_player_df)
normal_male_total_price=round(male_total_price/male_players,2)
print("----------------------------")
print("PURCHASING ANALYSIS (GENDER)")
print("The total purchases made by male players were " + str(male_purchase_count) + ".")
print("The average purchase price for males was $"+str(male_mean_price)+".")
print("The total value purchased by males was $"+str(male_total_price)+".")
print("The average male player bought a total of $"+str(normal_male_total_price)+".\n")

# Female players
female_df=df.loc[df["Gender"]=="Female"]
female_purchase_count=female_df["Item ID"].count()
female_mean_price=round(female_df["Price"].mean(),2)
female_total_price=round(female_df["Price"].sum(),2)
female_player_df=players_df.loc[players_df["Gender"]=="Female"]
female_players=len(female_player_df)
normal_female_total_price=round(female_total_price/female_players,2)
print("The total purchases made by female players were " + str(female_purchase_count) + ".")
print("The average purchase price for females was $"+str(female_mean_price)+".")
print("The total value purchased by females was $"+str(female_total_price)+".")
print("The average female player bought a total of $"+str(normal_female_total_price)+".\n")

# Other / non-disclosed players
other_df=df.loc[df["Gender"]=="Other / Non-Disclosed"]
other_purchase_count=other_df["Item ID"].count()
other_mean_price=round(other_df["Price"].mean(),2)
other_total_price=round(other_df["Price"].sum(),2)
other_player_df=players_df.loc[players_df["Gender"]=="Other / Non-Disclosed"]
other_players=len(other_player_df)
normal_other_total_price=round(other_total_price/other_players,2)
print("The total purchases made by other/non-disclosed players were " + str(other_purchase_count) + ".")
print("The average purchase price for others was $"+str(other_mean_price)+".")
print("The total value purchased by others was $"+str(other_total_price)+".")
print("The average other player bought a total of $"+str(normal_other_total_price)+".")

----------------------------
PURCHASING ANALYSIS (GENDER)
The total purchases made by male players were 633.
The average purchase price for males was $2.95.
The total value purchased by males was $1867.68.
The average male player bought a total of $4.02.

The total purchases made by female players were 136.
The average purchase price for females was $2.82.
The total value purchased by females was $382.91.
The average female player bought a total of $3.83.

The total purchases made by other/non-disclosed players were 11.
The average purchase price for others was $3.25.
The total value purchased by others was $35.74.
The average other player bought a total of $4.47.
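
The per-gender figures all follow one pattern: filter by gender, then count, average, and sum the prices, and divide the total by the number of distinct players. A `groupby` with named aggregation can express that pattern once for every gender. This is a sketch on made-up data (the rows and column names such as `total_per_player` are hypothetical), and it assumes a pandas version that supports named aggregation (0.25 or later):

```python
import pandas as pd

# Hypothetical purchase log; "Ann1" bought twice.
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3"],
    "Gender": ["Female", "Male", "Female", "Male"],
    "Price": [1.00, 2.00, 3.00, 4.00],
})

by_gender = toy.groupby("Gender").agg(
    purchase_count=("Price", "count"),   # number of purchases
    average_price=("Price", "mean"),     # mean price per purchase
    total_value=("Price", "sum"),        # revenue from this group
    players=("SN", "nunique"),           # distinct players in this group
)

# Total spend divided by distinct players, i.e. the per-player total.
by_gender["total_per_player"] = by_gender["total_value"] / by_gender["players"]
print(by_gender.round(2))
```

Counting players with `nunique` inside the same aggregation removes the need for a separate deduplicated DataFrame for this step.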


## Purchasing Analysis (Age)

In [15]:
# pd.cut builds right-closed intervals from the edges: (10, 15], (15, 20], ..., (40, 120].
# Ages of 10 and under fall outside these bins and are left unassigned (NaN).
bins=[10,15,20,25,30,35,40,120]
group_labels=["11 to 15", "16 to 20", "21 to 25", "26 to 30", "31 to 35", "36 to 40", "41 and older"]

df["Age Range"]=pd.cut(df["Age"],bins,labels=group_labels)
# Work on an explicit copy so the column assignment does not raise SettingWithCopyWarning
players_df=players_df.copy()
players_df["Age Range"]=pd.cut(players_df["Age"],bins,labels=group_labels)
obj1=df.groupby("Age Range")

count_in_obj1=obj1["SN"].count()
count_table=pd.DataFrame({"Purchase Count":count_in_obj1})

avgprice_in_obj1=round(obj1["Price"].mean(),2)
avgprice_table=pd.DataFrame({"Average Price ($)":avgprice_in_obj1})

totprice_in_obj1=obj1["Price"].sum()
totprice_table=pd.DataFrame({"Purchase Value ($)":totprice_in_obj1})

age_df=pd.merge(count_table,avgprice_table,left_index=True,right_index=True).merge(totprice_table,left_index=True,
            right_index=True)

objAge=players_df.groupby("Age Range")

count_in_objAge=objAge["Age"].count()
player_count_table=pd.DataFrame({"Purchase Count":count_in_objAge})

totprice_in_objage=objAge["Price"].sum()
player_totprice_table=pd.DataFrame({"Purchase Value ($)":totprice_in_objage})

# players_df holds one row per player (their last purchase), so this column is the
# mean price of those rows in each age range, not the total spent per player.
age_df["Normalized Price ($)"]=round((player_totprice_table["Purchase Value ($)"]/
                                             player_count_table["Purchase Count"]),2)
age_df

| Age Range | Purchase Count | Average Price ($) | Purchase Value ($) | Normalized Price ($) |
|---|---|---|---|---|
| 11 to 15 | 78 | 2.87 | 224.15 | 2.84 |
| 16 to 20 | 184 | 2.87 | 528.74 | 2.86 |
| 21 to 25 | 305 | 2.96 | 902.61 | 2.9 |
| 26 to 30 | 76 | 2.89 | 219.82 | 2.9 |
| 31 to 35 | 58 | 3.07 | 178.26 | 2.99 |
| 36 to 40 | 44 | 2.9 | 127.49 | 2.74 |
| 41 and older | 3 | 2.88 | 8.64 | 2.88 |
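
Age binning with `pd.cut` depends on how the bin edges and labels line up: by default each interval is open on the left and closed on the right, and `n` labels need `n + 1` edges. The sketch below illustrates this on made-up data. The lower edge of 0 (so that the youngest players get a bin) and the per-player total are assumptions chosen for illustration, not the notebook's own settings, and all rows are hypothetical:

```python
import pandas as pd

# Hypothetical purchase log; "Ann1" (age 9) bought twice.
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Dee4"],
    "Age": [9, 15, 9, 16, 41],
    "Price": [1.00, 2.00, 3.00, 4.00, 5.00],
})

# Right-closed intervals: (0, 10], (10, 15], (15, 20], ..., (40, 120].
bins = [0, 10, 15, 20, 25, 30, 35, 40, 120]
labels = ["10 and younger", "11 to 15", "16 to 20", "21 to 25",
          "26 to 30", "31 to 35", "36 to 40", "41 and older"]
toy["Age Range"] = pd.cut(toy["Age"], bins=bins, labels=labels)

# observed=True keeps only the age ranges that actually occur in the data.
by_age = toy.groupby("Age Range", observed=True).agg(
    purchase_count=("Price", "count"),
    total_value=("Price", "sum"),
    players=("SN", "nunique"),
)

# Total spend per distinct player in each age range.
by_age["total_per_player"] = by_age["total_value"] / by_age["players"]
print(by_age)
```

Because the intervals are right-closed, an age of exactly 15 lands in "11 to 15" and an age of 16 in "16 to 20".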


## Top Spenders

In [20]:
objSN=df.groupby("SN")

count_in_objSN=objSN["Age"].count()
SN_count_table=pd.DataFrame({"Purchase Count":count_in_objSN})

average_price_in_objSN=round(objSN["Price"].mean(),2)
SN_average_price_table=pd.DataFrame({"Average Price ($)":average_price_in_objSN})

totalprice_in_objSN=objSN["Price"].sum()
SN_total_price_table=pd.DataFrame({"Total Purchase Value ($)":totalprice_in_objSN})

SN_df=pd.merge(SN_count_table,SN_average_price_table,left_index=True,right_index=True).merge(
    SN_total_price_table,left_index=True,right_index=True)

sorted_SN=SN_df.sort_values(by="Total Purchase Value ($)",ascending=False)

print("----------------------------------------------------------------------")
print("TOP SPENDERS")
sorted_SN.head()

----------------------------------------------------------------------
TOP SPENDERS


| SN | Purchase Count | Average Price ($) | Total Purchase Value ($) |
|---|---|---|---|
| Undirrala66 | 5 | 3.41 | 17.06 |
| Saedue76 | 4 | 3.39 | 13.56 |
| Mindimnya67 | 4 | 3.18 | 12.74 |
| Haellysu29 | 3 | 4.24 | 12.73 |
| Eoda93 | 3 | 3.86 | 11.58 |
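
The top-spender ranking groups purchases by screen name, computes count, mean, and sum of the prices, and sorts by the sum. A compact way to sketch the same steps is a single `agg` call followed by `nlargest`. The data and the names `toy`, `spenders`, and `top_two` are hypothetical:

```python
import pandas as pd

# Hypothetical purchase log.
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cy3", "Bob2", "Ann1"],
    "Price": [1.00, 2.00, 3.00, 4.00, 3.00, 0.50],
})

# One row per player with purchase count, mean price, and total spend.
spenders = (
    toy.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Price ($)",
        "sum": "Total Purchase Value ($)",
    })
)

# Keep the two players with the highest total spend, largest first.
top_two = spenders.nlargest(2, "Total Purchase Value ($)")
print(top_two.round(2))
```

`nlargest` returns the rows already sorted in descending order, so no separate `sort_values` call is needed for a top-N view.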


## Most Popular Items

In [22]:
objid=df.groupby("Item ID")

# Per-item purchase count, average price, and total sales (all indexed by Item ID)
count_in_objid=objid["Item ID"].count()
ID_count_table=pd.DataFrame({"Purchase Count":count_in_objid})

averageprice_in_objid=round(objid["Price"].mean(),2)
ID_average_price_table=pd.DataFrame({"Item Price ($)":averageprice_in_objid})

totalprice_in_objid=objid["Price"].sum()
Id_totalprice_table=pd.DataFrame({"Item Total Sales ($)":totalprice_in_objid})

# One row per item, used to look up item names
singleID_df = df.drop_duplicates(['Item ID'], keep ='last')

reduced_singleID_df=singleID_df.loc[:,["Item ID","Item Name"]]

# Caution: left_index=True joins on reduced_singleID_df's row labels, not on its
# "Item ID" column, so item names may not line up with the per-item statistics.
# Merging with left_on="Item ID" would align them.
ID_df=pd.merge(reduced_singleID_df,ID_count_table,left_index=True,right_index=True).merge(ID_average_price_table,
               left_index=True,right_index=True).merge(Id_totalprice_table,left_index=True,right_index=True)

indexed_ID_df=ID_df.set_index("Item ID")

sorted_ID=indexed_ID_df.sort_values(by="Purchase Count",ascending=False)

print("----------------------------------------------------------------------")
print("MOST POPULAR ITEMS")
sorted_ID.head()

----------------------------------------------------------------------
MOST POPULAR ITEMS


| Item ID | Item Name | Purchase Count | Item Price ($) | Item Total Sales ($) |
|---|---|---|---|---|
| 0 | Splinter | 5 | 4.83 | 24.15 |
| 28 | Flux, Destroyer of Due Diligence | 4 | 1.27 | 5.08 |
| 126 | Exiled Mithril Longsword | 4 | 1.55 | 6.2 |
| 155 | War-Forged Gold Deflector | 4 | 4.89 | 19.56 |
| 59 | Lightning, Etcher of the King | 3 | 3.47 | 10.41 |
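
When item names sit in an ordinary column while the per-item statistics are indexed by Item ID, the merge has to match the `Item ID` column against that index. A sketch of one way to do this on made-up data follows, using `left_on="Item ID"` with `right_index=True`. The items, prices, and names such as `item_stats` are hypothetical:

```python
import pandas as pd

# Hypothetical purchase log with two items.
toy = pd.DataFrame({
    "Item ID": [7, 3, 7, 3, 7],
    "Item Name": ["Sword", "Shield", "Sword", "Shield", "Sword"],
    "Price": [2.0, 1.5, 2.0, 1.5, 2.0],
})

# Per-item statistics, indexed by Item ID.
item_stats = toy.groupby("Item ID").agg(
    purchase_count=("Price", "count"),
    item_price=("Price", "mean"),
    total_sales=("Price", "sum"),
)

# One row per item for the name lookup; its index is still the original row labels.
item_names = toy.drop_duplicates("Item ID")[["Item ID", "Item Name"]]

# Match the "Item ID" column on the left against the Item ID index on the right.
items = item_names.merge(item_stats, left_on="Item ID", right_index=True).set_index("Item ID")

most_popular = items.sort_values("purchase_count", ascending=False)
most_profitable = items.sort_values("total_sales", ascending=False)
print(most_popular)
```

The same merged table serves both rankings; only the sort column changes.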


## Most Profitable Items

In [23]:
sorted_ID=indexed_ID_df.sort_values(by="Item Total Sales ($)",ascending=False)

print("----------------------------------------------------------------------")
print("MOST PROFITABLE ITEMS")
sorted_ID.head()

----------------------------------------------------------------------
MOST PROFITABLE ITEMS


| Item ID | Item Name | Purchase Count | Item Price ($) | Item Total Sales ($) |
|---|---|---|---|---|
| 0 | Splinter | 5 | 4.83 | 24.15 |
| 155 | War-Forged Gold Deflector | 4 | 4.89 | 19.56 |
| 59 | Lightning, Etcher of the King | 3 | 3.47 | 10.41 |
| 3 | Phantomlight | 3 | 3.27 | 9.81 |
| 132 | Persuasion | 2 | 4.1 | 8.2 |
