### Heroes Of Pymoli Data Analysis
* Of the 1163 active players, the vast majority are male (84%). A smaller but notable proportion are female (14%).

* Our peak age demographic is 20-24 (44.8%), followed by 15-19 (18.6%) and 25-29 (13.4%).
-----

### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

## Most Popular Items

* Retrieve the Item ID, Item Name, and Item Price columns


* Group by Item ID and Item Name. Perform calculations to obtain purchase count, item price, and total purchase value


* Create a summary data frame to hold the results


* Sort the purchase count column in descending order


* Optional: give the displayed data cleaner formatting


* Display a preview of the summary data frame
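
The steps above can be sketched end to end on a tiny, made-up data set. This is only an illustration under assumptions: the `toy` records and the `summarize_items` helper are hypothetical and do not appear in the notebook, which instead builds the summary cell by cell from `purchase_data.csv`. The sketch uses pandas named aggregation so the result comes back with flat, readable column names.

```python
import pandas as pd

# Hypothetical purchase records; the notebook itself reads Resources/purchase_data.csv.
toy = pd.DataFrame({
    "Item ID": [1, 1, 2, 2, 2, 3],
    "Item Name": ["Axe", "Axe", "Bow", "Bow", "Bow", "Club"],
    "Price": [2.00, 2.00, 1.50, 1.50, 1.50, 4.25],
})


def summarize_items(purchases: pd.DataFrame) -> pd.DataFrame:
    """Return purchase count, mean price, and total value per item, most-bought first."""
    summary = purchases.groupby(["Item ID", "Item Name"]).agg(
        # Named aggregation: output column -> (input column, aggregation function).
        # The dict is unpacked because the output names contain spaces.
        **{
            "Purchase Count": ("Price", "count"),
            "Item Price": ("Price", "mean"),
            "Total Purchase Value": ("Price", "sum"),
        }
    )
    return summary.sort_values("Purchase Count", ascending=False)


toy_summary = summarize_items(toy)
print(toy_summary.head())
```

Keeping the numeric summary separate from any display formatting makes it easier to sort or export the values later.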



In [76]:
# Import modules (pandas relies on an installed openpyxl package when writing .xlsx files):
import os

import pandas as pd

# Set parameters for input and output files:
path = os.path.join("Resources", "purchase_data.csv")
#output_path = os.path.join("Output", "purchase_df.xlsx")
purchase_data = pd.read_csv(path)

In [77]:
# Keep only the item columns needed for this analysis:
df = purchase_data[["Item ID", "Item Name", "Price"]]
df.head()

,Item ID,Item Name,Price
0,108,"Extraction, Quickblade Of Trembling Hands",3.53
1,143,Frenzied Scimitar,1.56
2,92,Final Critic,4.88
3,100,Blindscythe,3.27
4,131,Fury,1.44


In [78]:
# Group by item and aggregate the Price column into a purchase count and a total:
grouped_df = df.groupby(['Item ID', 'Item Name']).agg({'Price': ['count', 'sum']})
grouped_df

Item ID,Item Name,Price (count),Price (sum)
0,Splinter,4,5.12
1,Crucifer,4,11.77
2,Verdict,6,14.88
3,Phantomlight,6,14.94
4,Bloodlord's Fetish,5,8.50
...,...,...,...
178,"Oathbreaker, Last Hope of the Breaking Storm",12,50.76
179,"Wolf, Promise of the Moonwalker",6,26.88
181,Reaper's Toll,5,8.30
182,Toothpick,3,12.09


In [79]:
# The aggregation above yields two-level column labels such as ('Price', 'count'),
# so grouped_df['sum'] fails with a KeyError. Keep only the inner level of the labels:
grouped_df.columns = grouped_df.columns.get_level_values(1)
grouped_df

Item ID,Item Name,count,sum
0,Splinter,4,5.12
1,Crucifer,4,11.77
2,Verdict,6,14.88
3,Phantomlight,6,14.94
4,Bloodlord's Fetish,5,8.50
...,...,...,...
178,"Oathbreaker, Last Hope of the Breaking Storm",12,50.76
179,"Wolf, Promise of the Moonwalker",6,26.88
181,Reaper's Toll,5,8.30
182,Toothpick,3,12.09
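
To make the column flattening concrete, here is a small sketch on made-up data (the `toy` frame and its item names are hypothetical). It shows the two-level labels that a dict-of-lists aggregation produces, the `get_level_values(1)` fix used in the cell above, and one possible alternative that never creates the outer level in the first place.

```python
import pandas as pd

# Hypothetical records standing in for the notebook's purchase data.
toy = pd.DataFrame({
    "Item ID": [1, 1, 2],
    "Item Name": ["Axe", "Axe", "Bow"],
    "Price": [2.00, 2.00, 1.50],
})

# Aggregating with {'column': [functions]} gives two-level column labels.
nested = toy.groupby(["Item ID", "Item Name"]).agg({"Price": ["count", "sum"]})
print(nested.columns.tolist())  # [('Price', 'count'), ('Price', 'sum')]

# Option 1: keep only the inner level of the labels.
flat_inner = nested.copy()
flat_inner.columns = flat_inner.columns.get_level_values(1)

# Option 2: select the Price column before aggregating, so no outer level appears.
flat_direct = toy.groupby(["Item ID", "Item Name"])["Price"].agg(["count", "sum"])

print(flat_inner.columns.tolist())
print(flat_direct.columns.tolist())
```

Both options end with plain `count` and `sum` columns, so an expression such as `frame['sum'] / frame['count']` works on either result.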


In [80]:
# Calculate the average price and put in a new column:
grouped_df['Item Price'] = grouped_df['sum'] / grouped_df['count']
grouped_df.head()

Item ID,Item Name,count,sum,Item Price
0,Splinter,4,5.12,1.28
1,Crucifer,4,11.77,2.9425
2,Verdict,6,14.88,2.48
3,Phantomlight,6,14.94,2.49
4,Bloodlord's Fetish,5,8.5,1.7
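
Dividing the sum by the count gives the mean price paid per item. A value such as 2.9425 for Crucifer is not a whole number of cents, which suggests that some items were sold at more than one price. One way to check this, sketched here on made-up data (the `toy` frame is hypothetical), is to count the distinct prices per item:

```python
import pandas as pd

# Hypothetical records; item 2 was sold at two different prices.
toy = pd.DataFrame({
    "Item ID": [1, 1, 2, 2],
    "Item Name": ["Axe", "Axe", "Bow", "Bow"],
    "Price": [2.00, 2.00, 1.50, 1.75],
})

# Number of distinct prices observed for each item.
prices_per_item = toy.groupby(["Item ID", "Item Name"])["Price"].nunique()

# Items with more than one price: for these, 'Item Price' is an average, not a list price.
varying = prices_per_item[prices_per_item > 1]
print(varying)
```

If the same check on the real data returns any rows, the `Item Price` column is best read as an average selling price.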


In [81]:
# Rename the columns:
grouped_df = grouped_df.rename(columns={'count': 'Purchase Count',
                                        'sum': 'Total Purchase Value'})
grouped_df                                

Item ID,Item Name,Purchase Count,Total Purchase Value,Item Price
0,Splinter,4,5.12,1.2800
1,Crucifer,4,11.77,2.9425
2,Verdict,6,14.88,2.4800
3,Phantomlight,6,14.94,2.4900
4,Bloodlord's Fetish,5,8.50,1.7000
...,...,...,...,...
178,"Oathbreaker, Last Hope of the Breaking Storm",12,50.76,4.2300
179,"Wolf, Promise of the Moonwalker",6,26.88,4.4800
181,Reaper's Toll,5,8.30,1.6600
182,Toothpick,3,12.09,4.0300


In [82]:
# Sort by Purchase Count in descending order to rank the best-selling items:
grouped_df = grouped_df.sort_values(by='Purchase Count', ascending=False)
grouped_df

Item ID,Item Name,Purchase Count,Total Purchase Value,Item Price
92,Final Critic,13,59.99,4.614615
178,"Oathbreaker, Last Hope of the Breaking Storm",12,50.76,4.230000
145,Fiery Glass Crusader,9,41.22,4.580000
132,Persuasion,9,28.99,3.221111
108,"Extraction, Quickblade Of Trembling Hands",9,31.77,3.530000
...,...,...,...,...
42,The Decapitator,1,1.75,1.750000
51,Endbringer,1,4.66,4.660000
118,"Ghost Reaver, Longsword of Magic",1,2.17,2.170000
104,Gladiator's Glaive,1,1.93,1.930000
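
Several items in the ranking tie at 9 purchases, and `sort_values` with its default sort algorithm does not promise any particular order among tied rows. The sketch below, on a made-up summary (`toy_summary` and its item names are hypothetical), shows two possible ways to make the ranking explicit: adding a second sort key, or using `nlargest` to take the top rows directly.

```python
import pandas as pd

# Hypothetical summary frame with a tie on Purchase Count.
toy_summary = pd.DataFrame(
    {
        "Purchase Count": [4, 9, 13, 9],
        "Total Purchase Value": [5.12, 41.22, 59.99, 28.99],
    },
    index=pd.MultiIndex.from_tuples(
        [(0, "A"), (145, "B"), (92, "C"), (132, "D")],
        names=["Item ID", "Item Name"],
    ),
)

# Break ties on count by total purchase value, both in descending order.
ranked = toy_summary.sort_values(
    ["Purchase Count", "Total Purchase Value"], ascending=[False, False]
)

# nlargest returns the top-n rows by a column without sorting the whole frame by hand.
top2 = toy_summary.nlargest(2, "Purchase Count")

print(ranked)
print(top2)
```

Which tie-breaker is appropriate depends on the question being asked; total purchase value is only one reasonable choice.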


In [83]:
# Get the top 5 items:
grouped_df_top5 = grouped_df[:5]
grouped_df_top5

Item ID,Item Name,Purchase Count,Total Purchase Value,Item Price
92,Final Critic,13,59.99,4.614615
178,"Oathbreaker, Last Hope of the Breaking Storm",12,50.76,4.23
145,Fiery Glass Crusader,9,41.22,4.58
132,Persuasion,9,28.99,3.221111
108,"Extraction, Quickblade Of Trembling Hands",9,31.77,3.53


In [84]:
# Reorder the columns, taking an explicit copy so later assignments do not affect grouped_df:
grouped_df_top5 = grouped_df_top5[['Purchase Count', 'Item Price', 'Total Purchase Value']].copy()
grouped_df_top5

Item ID,Item Name,Purchase Count,Item Price,Total Purchase Value
92,Final Critic,13,4.614615,59.99
178,"Oathbreaker, Last Hope of the Breaking Storm",12,4.23,50.76
145,Fiery Glass Crusader,9,4.58,41.22
132,Persuasion,9,3.221111,28.99
108,"Extraction, Quickblade Of Trembling Hands",9,3.53,31.77


In [85]:
# Format the price columns as currency strings (for display only):
grouped_df_top5["Item Price"] = grouped_df_top5["Item Price"].map("${:,.2f}".format)
grouped_df_top5["Total Purchase Value"] = grouped_df_top5["Total Purchase Value"].map("${:,.2f}".format)
grouped_df_top5

Item ID,Item Name,Purchase Count,Item Price,Total Purchase Value
92,Final Critic,13,$4.61,$59.99
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
145,Fiery Glass Crusader,9,$4.58,$41.22
132,Persuasion,9,$3.22,$28.99
108,"Extraction, Quickblade Of Trembling Hands",9,$3.53,$31.77
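
Mapping a format string over a column replaces the numbers with text, so the formatted columns can no longer be sorted or summed as numbers. A minimal sketch of one way to guard against this, using made-up values (the `toy` frame is hypothetical), is to format a copy and keep the numeric frame untouched:

```python
import pandas as pd

# Hypothetical numeric summary values.
toy = pd.DataFrame({
    "Item Price": [4.614615, 10.5],
    "Total Purchase Value": [59.99, 1050.0],
})

# Format a copy so the original frame keeps its float columns.
display_copy = toy.copy()
for col in ["Item Price", "Total Purchase Value"]:
    display_copy[col] = display_copy[col].map("${:,.2f}".format)

print(display_copy)
print(toy.dtypes)  # the original columns are still floats

# Sorting the formatted strings compares characters, not amounts.
print(sorted(display_copy["Item Price"]))  # ['$10.50', '$4.61']
```

Applying the formatting as the very last step, as the notebook does, avoids this problem as long as no further calculations follow.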


In [86]:
# Save the top-5 summary to an Excel file, in a sheet named 'Most Popular Items'.
# The context manager writes and closes the file on exit, so no explicit save call is needed:
with pd.ExcelWriter("6_Most_Popular_Items.xlsx") as writer:
    grouped_df_top5.to_excel(writer, sheet_name='Most Popular Items')


In [9]:
# Example below (do not run this cell)

Item ID,Item Name,Purchase Count,Item Price,Total Purchase Value
92,Final Critic,13,$4.61,$59.99
178,"Oathbreaker, Last Hope of the Breaking Storm",12,$4.23,$50.76
145,Fiery Glass Crusader,9,$4.58,$41.22
132,Persuasion,9,$3.22,$28.99
108,"Extraction, Quickblade Of Trembling Hands",9,$3.53,$31.77
