### Heroes Of Pymoli Data Analysis
* Of the 576 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* The peak age group is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
file_to_load = "Resources/purchase_data.csv"

# Read Purchasing File and store into Pandas data frame
purchase_data = pd.read_csv(file_to_load)
purchase_data.head(10)

Unnamed: 0,Purchase ID,SN,Age,Gender,Item ID,Item Name,Price
0,0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,2,Ithergue48,24,Male,92,Final Critic,4.88
3,3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,4,Iskosia90,23,Male,131,Fury,1.44
5,5,Yalae81,22,Male,81,Dreamkiss,3.61
6,6,Itheria73,36,Male,169,"Interrogator, Blood Blade of the Queen",2.18
7,7,Iskjaskst81,20,Male,162,Abyssal Shard,2.67
8,8,Undjask33,22,Male,21,Souleater,1.1
9,9,Chanosian48,35,Other / Non-Disclosed,136,Ghastly Adamantite Protector,3.58


## Player Count

* Display the total number of players


In [2]:
# Number of purchase rows (one row per purchase)
total = purchase_data['SN'].count()

# "SN" is the player's screen name; nunique() counts each player once
total_players = purchase_data['SN'].nunique()

print("Total number of Players", total_players)

Total number of Players 576
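
The difference between counting purchase rows and counting players can be seen on a small example. This is a minimal sketch on a hypothetical toy frame (the names `toy`, `purchase_rows`, and `unique_players` are made up for illustration), not the notebook's data.

```python
import pandas as pd

# Toy purchase log (hypothetical data): one row per purchase, not per player
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Bob2"],
    "Price": [1.00, 2.50, 3.25, 4.00, 1.75],
})

purchase_rows = toy["SN"].count()      # counts every non-null row
unique_players = toy["SN"].nunique()   # counts each screen name once

print("Purchase rows:", purchase_rows)
print("Unique players:", unique_players)
```

A player who buys more than once appears in several rows, so `count()` is an upper bound on the number of players.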


## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [3]:
# Number of distinct item names sold
purchase_item_name = purchase_data["Item Name"].nunique()

# Mean and total of all purchase prices
purchase_average = purchase_data["Price"].mean()
purchase_total = purchase_data["Price"].sum()

summary_table = pd.DataFrame(
    {"Total Unique Items": [purchase_item_name],
     "Average Item Price": [purchase_average],
     "Total Purchase Price": [purchase_total]})
summary_table

Unnamed: 0,Total Unique Items,Average Item Price,Total Purchase Price
0,179,3.050987,2379.77
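
For the optional formatting step, one possible approach is to convert the price columns to currency strings. The sketch below reuses the two price figures from the summary output above; the names `toy_summary` and `formatted` are hypothetical.

```python
import pandas as pd

# Values copied from the summary output above, for illustration only
toy_summary = pd.DataFrame({
    "Average Item Price": [3.050987],
    "Total Purchase Price": [2379.77],
})

formatted = toy_summary.copy()
for column in ["Average Item Price", "Total Purchase Price"]:
    # str.format turns each float into a currency string with two decimals
    formatted[column] = formatted[column].map("${:,.2f}".format)

print(formatted)
```

Formatting on a copy keeps the numeric columns available for later calculations, since the formatted columns hold strings.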


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed




In [4]:

gender_total = purchase_data.groupby("Gender")

# Count each player once per gender (a player can appear in several purchase rows)
total_gender = gender_total["SN"].nunique()

# Percentage of all unique players, not of purchase rows
player_percentage = total_gender / purchase_data["SN"].nunique() * 100

# Pass the Series directly so that each gender becomes its own row
summary_gender = pd.DataFrame({
    "Gender totals": total_gender,
    "Percentage of players": player_percentage})
summary_gender.index.name = None
summary_gender.sort_values("Gender totals", ascending=False).style.format({"Percentage of players": "{:.2f}"})

Unnamed: 0,Gender totals,Percentage of players
Male,484,84.03
Female,81,14.06
Other / Non-Disclosed,11,1.91


In [5]:
gender_total = purchase_data.groupby("Gender")
total_gender = gender_total["SN"].nunique()
# Divide by the number of unique players rather than the number of purchase rows
player_percentage = total_gender / purchase_data["SN"].nunique() * 100
player_percentage

Gender
Female                   14.062500
Male                     84.027778
Other / Non-Disclosed     1.909722
Name: SN, dtype: float64
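
An alternative way to get the same kind of table is to reduce the purchase log to one row per player first and then count genders. This is a minimal sketch on hypothetical toy data; it assumes each screen name maps to a single gender.

```python
import pandas as pd

# Toy purchase log (hypothetical data)
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Bob2", "Dan4"],
    "Gender": ["Female", "Male", "Female", "Male", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not counted twice
players = toy.drop_duplicates(subset="SN")

gender_counts = players["Gender"].value_counts()
gender_percent = (gender_counts / gender_counts.sum() * 100).round(2)

gender_summary = pd.DataFrame({
    "Total Count": gender_counts,
    "Percentage of Players": gender_percent,
})
print(gender_summary)
```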


## Purchasing Analysis (Gender)

* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [6]:
# Purchase-level statistics by gender: average price, total value, spread, and purchase count
price_total = gender_total["Price"].agg(["mean", "sum", "std", "count"])
price_total

Unnamed: 0_level_0,mean,sum,std,count
Gender,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Female,3.203009,361.94,1.158194,113
Male,3.017853,1967.64,1.175625,652
Other / Non-Disclosed,3.346,50.19,0.883813,15
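
The instructions also ask for the average purchase total per person, which the table above does not include. One way to sketch it, on hypothetical toy data with made-up column labels, is to divide each group's total by its number of unique players.

```python
import pandas as pd

# Toy purchase log (hypothetical data)
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Bob2", "Bob2"],
    "Gender": ["Female", "Male", "Female", "Male", "Male", "Male"],
    "Price": [2.00, 1.00, 4.00, 3.00, 2.00, 3.00],
})

by_gender = toy.groupby("Gender")

gender_purchases = pd.DataFrame({
    "Purchase Count": by_gender["Price"].count(),
    "Average Purchase Price": by_gender["Price"].mean(),
    "Total Purchase Value": by_gender["Price"].sum(),
})
# Divide by unique players, not by purchases, to get spend per person
gender_purchases["Avg Total Purchase per Person"] = (
    gender_purchases["Total Purchase Value"] / by_gender["SN"].nunique()
)
print(gender_purchases)
```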


## Age Demographics

* Establish bins for ages


* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [11]:
# Bin edges are right-inclusive: (0, 10], (10, 14], (14, 19], ..., (34, 41], (41, 100]
Age_bin = [0,10,14,19,24,29,34,41,100]
group_names = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

purchase_data["age group"] = pd.cut(purchase_data["Age"], Age_bin, labels=group_names)

age_costgroup = purchase_data.groupby("age group")

# Unique players per age group
age_count = age_costgroup["SN"].nunique()

# Share of all unique players, not of purchase rows
Percentage = age_count / purchase_data["SN"].nunique() * 100

Playercount_byage = pd.DataFrame({"Player Percentage": Percentage, "Count": age_count})

Playercount_byage.head()


Unnamed: 0_level_0,Player Percentage,Count
age group,Unnamed: 1_level_1,Unnamed: 2_level_1
<10,4.166667,24
10-14,2.604167,15
15-19,18.576389,107
20-24,44.791667,258
25-29,13.368056,77
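
Because `pd.cut` includes the right edge of each bin by default, the edges `10` and `41` in the cell above place age 10 under `<10` and ages 40 and 41 under `35-39`. The sketch below compares those edges with a hypothetical alternative whose edges match the labels, using a few made-up boundary ages rather than the real data.

```python
import pandas as pd

# Boundary ages chosen to show where the two sets of edges disagree
ages = pd.Series([9, 10, 14, 15, 39, 40, 41])
labels = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

# Edges as written in the cell above
edges_as_written = [0, 10, 14, 19, 24, 29, 34, 41, 100]
# Hypothetical alternative: each edge is the largest age its label should include
edges_matching_labels = [0, 9, 14, 19, 24, 29, 34, 39, 100]

comparison = pd.DataFrame({
    "Age": ages,
    "As written": pd.cut(ages, edges_as_written, labels=labels),
    "Matching labels": pd.cut(ages, edges_matching_labels, labels=labels),
})
print(comparison)
```

Only the `<10`, `10-14`, `35-39`, and `40+` groups are affected by the choice of edges; the groups in between use the same boundaries either way.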


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [12]:
total_purch_age = age_costgroup["Purchase ID"].count()   # purchases per age group
average_purch = age_costgroup["Price"].mean()            # average price per purchase
total_age_purch = age_costgroup["Price"].sum()           # total purchase value

# Average total spend per player: a dollar amount, so no scaling by 100
purchase_per_age_avg = total_age_purch / age_count

Age_Analysis_pd = pd.DataFrame({"Avg Total Purchase per Person": purchase_per_age_avg,
                                "Player Count": age_count})

Age_Analysis_pd.head()


Unnamed: 0_level_0,Avg Total Purchase per Person,Player Count
age group,Unnamed: 1_level_1,Unnamed: 2_level_1
<10,4.54,24
10-14,3.396667,15
15-19,3.858785,107
20-24,4.318062,258
25-29,3.805195,77


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [22]:
top_spender =purchase_data.groupby("SN")

spender_count = top_spender["Purchase ID"].count()

spender_count

SN
Adairialis76       1
Adastirin33        1
Aeda94             1
Aela59             1
Aelaria33          1
Aelastirin39       2
Aelidru27          1
Aelin32            3
Aelly27            2
Aellynun67         1
Aellyria80         1
Aelollo59          2
Aenarap34          1
Aeral43            1
Aeral68            1
Aeral97            1
Aeralria27         1
Aeralstical35      1
Aeri84             1
Aerillorin70       1
Aerithllora36      2
Aerithnucal56      1
Aerithnuphos61     1
Aerithriaphos45    1
Aerithriaphos46    1
Aesri53            1
Aesty53            2
Aestysu37          2
Aesur96            1
Aesurstilis64      1
                  ..
Undosia27          1
Undosian34         2
Undotesta33        1
Wailin72           1
Yadacal26          2
Yadaisuir65        1
Yadam35            1
Yadanu52           1
Yadaphos40         2
Yalae81            2
Yalaeria91         1
Yaliru88           1
Yalo85             1
Yalostiphos68      1
Yana46             1
Yarithllodeu72     1
Yarithrgue

In [21]:
top_spender = purchase_data.groupby("SN")

# Per-player purchase count, average price, and total spend
spender_count = top_spender["Purchase ID"].count()
avg_spender = top_spender["Price"].mean()
spender_total = top_spender["Price"].sum()

Most_spenders_pd = pd.DataFrame({"Spender Count": spender_count,
                                 "Avg spent": avg_spender,
                                 "Spender total": spender_total})

# head() on the unsorted frame previews the first players in alphabetical order
Most_spenders_pd.head()
# Was up late; not sure what I am doing wrong. Can't get it to count all the SN.

Unnamed: 0_level_0,Spender Count,Avg spent,Spender total
SN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Adairialis76,1,2.28,2.28
Adastirin33,1,4.48,4.48
Aeda94,1,4.91,4.91
Aela59,1,4.32,4.32
Aelaria33,1,1.79,1.79
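
The preview above is in alphabetical order because the frame is never sorted. A minimal sketch of the sorting step, on hypothetical toy data with made-up column labels, could look like this:

```python
import pandas as pd

# Toy purchase log (hypothetical data)
toy = pd.DataFrame({
    "SN": ["Ann1", "Bob2", "Ann1", "Cat3", "Bob2", "Bob2"],
    "Price": [2.00, 1.00, 4.00, 3.50, 2.00, 2.50],
})

top_spenders = (
    toy.groupby("SN")["Price"]
    .agg(["count", "mean", "sum"])
    .rename(columns={
        "count": "Purchase Count",
        "mean": "Average Purchase Price",
        "sum": "Total Purchase Value",
    })
    # Largest total first, so head() previews the biggest spenders
    .sort_values("Total Purchase Value", ascending=False)
)
print(top_spenders.head())
```

In this toy data the player with the most purchases is not the one with the highest total, which is why the sort column matters.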


## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [30]:
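# Keep only the columns needed for the item-level summary
most = purchase_data[["Item ID", "Item Name", "Price"]]
item_grup = most.groupby(["Item ID", "Item Name"])

# Number of times each item was purchased
item_count = item_grup["Price"].count()

# Total purchase value per item (sum of all sale prices)
Item_price = item_grup["Price"].sum()

# Per-item price: total purchase value divided by purchase count
real_cost = Item_price/item_count

# "item price" holds the total purchase value; "Real cost" holds the per-item price
Most_popular_Stuff = pd.DataFrame({"count of items": item_count,
                                  "item price": Item_price,
                                  "Real cost": real_cost})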



## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [31]:
# Ranks items by "Real cost" (the per-item price), highest first
most_profit = Most_popular_Stuff.sort_values(["Real cost"], ascending=False).head()

most_profit

Unnamed: 0_level_0,Unnamed: 1_level_0,count of items,item price,Real cost
Item ID,Item Name,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
63,Stormfury Mace,2,9.98,4.99
139,"Mercy, Katana of Dismay",5,24.7,4.94
173,Stormfury Longsword,2,9.86,4.93
147,"Hellreaver, Heirloom of Inception",3,14.79,4.93
128,"Blazeguard, Reach of Eternity",5,24.55,4.91
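
The table above is ranked by `Real cost`, the per-item price, whereas the instructions ask for purchase count (most popular) and total purchase value (most profitable). The sketch below shows, on hypothetical toy items with made-up column names, that these three rankings can each put a different item first.

```python
import pandas as pd

# Toy purchase log (hypothetical items and prices)
toy = pd.DataFrame({
    "Item ID": [1, 1, 1, 2, 2, 3],
    "Item Name": ["Dagger", "Dagger", "Dagger", "Axe", "Axe", "Staff"],
    "Price": [1.00, 1.00, 1.00, 4.00, 4.00, 4.50],
})

# Named aggregation: one output column per keyword
items = toy.groupby(["Item ID", "Item Name"])["Price"].agg(
    purchase_count="count",
    item_price="mean",
    total_purchase_value="sum",
)

most_popular = items.sort_values("purchase_count", ascending=False)
most_profitable = items.sort_values("total_purchase_value", ascending=False)
highest_priced = items.sort_values("item_price", ascending=False)

print(most_popular.head())
print(most_profitable.head())
```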
