### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). There also exists, a smaller, but notable proportion of female players (14%).

* Our peak age demographic falls between 20-24 (44.8%) with secondary groups falling between 15-19 (18.60%) and 25-29 (13.4%).  
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [7]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
purchase_data = pd.read_csv("Resources/purchase_data.csv", delimiter=",",
                            index_col="Purchase ID")

# Read Purchasing File and store into Pandas data frame
purchase_data.head()


Unnamed: 0_level_0,SN,Age,Gender,Item ID,Item Name,Price
Purchase ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
0,Lisim78,20,Male,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,Lisovynya38,40,Male,143,Frenzied Scimitar,1.56
2,Ithergue48,24,Male,92,Final Critic,4.88
3,Chamassasya86,24,Male,100,Blindscythe,3.27
4,Iskosia90,23,Male,131,Fury,1.44


## Player Count

* Display the total number of players


In [14]:
players = purchase_data['SN'].count()
players

780

## Purchasing Analysis (Total)

* Run basic calculations to obtain number of unique items, average price, etc.


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame


In [12]:
uniq_players=purchase_data['SN'].nunique()
avg_purchase=purchase_data['Price'].mean()
total_purchases=purchase_data['Price'].count()
total_revenue=purchase_data['Price'].sum()

print("# of Unique Items: "+ str(uniq_players))
print("Average Purchase Price: " +str(avg_purchase))
print("Total # of Purchases: " + str(total_purchases))
print("Total Revenue: "+ str(total_revenue))

purchase_analysis = pd.DataFrame(
    {
      "# of Unique Items" : [uniq_players],
        "Average Purchase Price" : [avg_purchase],
        "Total # of Purchases" : [total_purchases],
        "Total Revenue" : [total_revenue]
    })
purchase_analysis

# of Unique Items: 576
Average Purchase Price: 3.050987179487176
Total # of Purchases: 780
Total Revenue: 2379.77


Unnamed: 0,# of Unique Items,Average Purchase Price,Total # of Purchases,Total Revenue
0,576,3.050987,780,2379.77


## Gender Demographics

* Percentage and Count of Male Players


* Percentage and Count of Female Players


* Percentage and Count of Other / Non-Disclosed





## Purchasing Analysis (Gender)

In [291]:
count_gender = purchase_data['Gender'].value_counts()
#count_gender = purchase_data.groupby('Gender')
#t_count_gender = count_gender['SN'].nunique()
percent = count_gender /total_purchases
 

demo_gender = pd.DataFrame({"Gender Total" :count_gender, "Percent" :percent})

demo_gender






Unnamed: 0,Gender Total,Percent
Male,652,0.835897
Female,113,0.144872
Other / Non-Disclosed,15,0.019231


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. by gender




* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [287]:



#avg_total_per = purchase_data.groupby(['Gender']).sum()
#avg_per_gender = pd.concat([demo_gender, purchase_analysis], axis=0)
#avg_per_gender


# Count the total purchases by gender 
purch_total = count_gender


# Average purchase total by gender divivded by purchase count by unique shoppers
avg_purch_per_person = avg_purch_total/uniq_players

# Create data frame with obtained values 
gender_analysis = pd.DataFrame(
    {
      "# of Unique Items" : [uniq_players],
        "Average Purchase Price" : [avg_purchase],
        "Total # of Purchases" : [total_purchases],
        "Total Revenue" : [total_revenue]
    })


final_gender_a = pd.concat([gender_analysis, demo_gender], axis=0)

# Provide index in top left as "Gender"

#gender_analysis.index.name='Gender'
#gender_analysis

final_gender_a


of pandas will change to not sort by default.

To accept the future behavior, pass 'sort=False'.




Unnamed: 0,# of Unique Items,Average Purchase Price,Gender Total,Percent,Total # of Purchases,Total Revenue
0,576.0,3.050987,,,780.0,2379.77
Male,,,652.0,0.835897,,
Female,,,113.0,0.144872,,
Other / Non-Disclosed,,,15.0,0.019231,,


## Age Demographics

 
* Establish bins for ages

* Categorize the existing players using the age bins. Hint: use pd.cut()


* Calculate the numbers and percentages by age group


* Create a summary data frame to hold the results


* Optional: round the percentage column to two decimal points


* Display Age Demographics Table


In [317]:

#age_min = purchase_data['Age'].min()
#age_max = purchase_data['Age'].max()
#print(age_min)
#print(age_max)
#min_max_age = 7 to 45

bins = [0, 7, 12, 21, 35, 40]
group_labels  = ["Adolesant", "Teen", "Prime", "Ripe", "Full"]

age_bin=purchase_data["Player Categories"] = pd.cut(purchase_data["Age"],bins, labels=group_labels)

#age_analysis = pd.DataFrame(age_bin).groupby['Age']

avg_per_gender = pd.concat([final_gender_a, age_analysis], axis=0)

avg_per_gender

of pandas will change to not sort by default.

To accept the future behavior, pass 'sort=False'.


  


Unnamed: 0,# of Unique Items,Age,Average Purchase Price,Gender Total,Percent,Total # of Purchases,Total Revenue
0,576.0,,3.050987,,,780.0,2379.77
Male,,,,652.0,0.835897,,
Female,,,,113.0,0.144872,,
Other / Non-Disclosed,,,,15.0,0.019231,,
0,,Prime,,,,,
1,,Full,,,,,
2,,Ripe,,,,,
3,,Ripe,,,,,
4,,Ripe,,,,,
5,,Ripe,,,,,


## Purchasing Analysis (Age)

* Bin the purchase_data data frame by age


* Run basic calculations to obtain purchase count, avg. purchase price, avg. purchase total per person etc. in the table below


* Create a summary data frame to hold the results


* Optional: give the displayed data cleaner formatting


* Display the summary data frame

In [224]:
#purchase_analysis = purchase_data.groupby('Player Categories')
#purchase_analysis.count()
purch_anls_age = purchase_data

Unnamed: 0_level_0,SN,Age,Gender,Item ID,Item Name,Price,Percentage
Player Categories,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Adolesant,9,9,9,9,9,9,0
Teen,36,36,36,36,36,36,0
Prime,303,303,303,303,303,303,0
Ripe,392,392,392,392,392,392,0
Full,33,33,33,33,33,33,0


## Top Spenders

* Run basic calculations to obtain the results in the table below


* Create a summary data frame to hold the results


* Sort the total purchase value column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame



In [332]:
#popi = purchase_data.groupby('SN')
#item_id = popi['Item ID'].count()
#avg_item_price = popi['Price'].mean() 
#item_total = popi['Price'].sum()


#t_popi = pd.DataFrame({"Purchase Count": [item_id],
                      #"Item Price": [item_price],
                      #"Total Value": [item_total]})

#t_popi

popi = purchase_data.groupby("SN")
item_id = popi["Item ID"].count()
avg_item_price = popi["Price"].mean()
item_total = popi["Price"].sum()

top_pop = pd.DataFrame({"Purchase Count": item_id,
                             "Average Item Price": avg_item_price,
                             "Total Item Value":item_total})
top_pop.head()

Unnamed: 0_level_0,Purchase Count,Average Item Price,Total Item Value
SN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Adairialis76,1,2.28,2.28
Adastirin33,1,4.48,4.48
Aeda94,1,4.91,4.91
Aela59,1,4.32,4.32
Aelaria33,1,1.79,1.79


## Most Profitable Items

* Sort the above table by total purchase value in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the data frame



In [336]:
top_pop.sort_values(by='Total Item Value', ascending=False)

Unnamed: 0_level_0,Purchase Count,Average Item Price,Total Item Value
SN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Lisosia93,5,3.792000,18.96
Idastidru52,4,3.862500,15.45
Chamjask73,3,4.610000,13.83
Iral74,4,3.405000,13.62
Iskadarya95,3,4.366667,13.10
Ilarin91,3,4.233333,12.70
Ialallo29,3,3.946667,11.84
Tyidaim51,3,3.943333,11.83
Lassilsala30,3,3.836667,11.51
Chadolyla44,3,3.820000,11.46
