## Perform the following operations using Python on the Facebook metrics dataset
  * Create data subsets
  * Merge Data
  * Sort Data
  * Transposing Data
  * Shape and reshape Data
---

## Dataset Description:
Credit: 
   (Moro et al., 2016) S. Moro, P. Rita and B. Vala. Predicting social media performance metrics and evaluation 
   of the impact on brand building: A data mining approach. Journal of Business Research, Elsevier, In press.
 
   Available at: http://dx.doi.org/10.1016/j.jbusres.2016.02.010


1. Title: Facebook performance metrics

2. Sources
   Created by: Sérgio Moro, Paulo Rita and Bernardo Vala (ISCTE-IUL) @ 2016
   
3. Past Usage:

   The full dataset was described and analyzed in:

   S. Moro, P. Rita and B. Vala. Predicting social media performance metrics and evaluation of the impact on 
   brand building: A data mining approach. Journal of Business Research, Elsevier, In press, Available online 
   since 28 February 2016.

4. Relevant Information:

   The data relates to posts published during 2014 on the Facebook page of a renowned cosmetics brand.
   This dataset contains 500 of the 790 rows and part of the features analyzed by Moro et al. (2016). The remaining rows
   and features were omitted due to confidentiality issues.


5. Number of Instances: 500

6. Number of Attributes: 19

7. Attribute information:

   It includes 7 features known prior to post publication and 12 features for evaluating post impact (see Tables 2 and 3
   in Moro et al., 2016; the complete reference is given under "Credit" above).

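8. Missing Attribute Values: None reported in the original description. Note, however, that the `count` row of
   `df.describe()` below shows 499 non-missing values for `Paid` and `like` and 496 for `share`, so this file
   appears to contain a few missing entries.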


In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('dataset_Facebook.csv',delimiter=';')

In [3]:
df

Unnamed: 0,Page total likes,Type,Category,Post Month,Post Weekday,Post Hour,Paid,Lifetime Post Total Reach,Lifetime Post Total Impressions,Lifetime Engaged Users,Lifetime Post Consumers,Lifetime Post Consumptions,Lifetime Post Impressions by people who have liked your Page,Lifetime Post reach by people who like your Page,Lifetime People who have liked your Page and engaged with your post,comment,like,share,Total Interactions
0,139441,Photo,2,12,4,3,0.0,2752,5091,178,109,159,3078,1640,119,4,79.0,17.0,100
1,139441,Status,2,12,3,10,0.0,10460,19057,1457,1361,1674,11710,6112,1108,5,130.0,29.0,164
2,139441,Photo,3,12,3,3,0.0,2413,4373,177,113,154,2812,1503,132,0,66.0,14.0,80
3,139441,Photo,2,12,2,10,1.0,50128,87991,2211,790,1119,61027,32048,1386,58,1572.0,147.0,1777
4,139441,Photo,2,12,2,3,0.0,7244,13594,671,410,580,6228,3200,396,19,325.0,49.0,393
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,85093,Photo,3,1,7,2,0.0,4684,7536,733,708,985,4750,2876,392,5,53.0,26.0,84
496,81370,Photo,2,1,5,8,0.0,3480,6229,537,508,687,3961,2104,301,0,53.0,22.0,75
497,81370,Photo,1,1,5,2,0.0,3778,7216,625,572,795,4742,2388,363,4,93.0,18.0,115
498,81370,Photo,3,1,4,11,0.0,4156,7564,626,574,832,4534,2452,370,7,91.0,38.0,136


In [4]:
df.describe()

Unnamed: 0,Page total likes,Category,Post Month,Post Weekday,Post Hour,Paid,Lifetime Post Total Reach,Lifetime Post Total Impressions,Lifetime Engaged Users,Lifetime Post Consumers,Lifetime Post Consumptions,Lifetime Post Impressions by people who have liked your Page,Lifetime Post reach by people who like your Page,Lifetime People who have liked your Page and engaged with your post,comment,like,share,Total Interactions
count,500.0,500.0,500.0,500.0,500.0,499.0,500.0,500.0,500.0,500.0,500.0,500.0,500.0,500.0,500.0,499.0,496.0,500.0
mean,123194.176,1.88,7.038,4.15,7.84,0.278557,13903.36,29585.95,920.344,798.772,1415.13,16766.38,6585.488,609.986,7.482,177.945892,27.266129,212.12
std,16272.813214,0.852675,3.307936,2.030701,4.368589,0.448739,22740.78789,76803.25,985.016636,882.505013,2000.594118,59791.02,7682.009405,612.725618,21.18091,323.398742,42.613292,380.233118
min,81370.0,1.0,1.0,1.0,1.0,0.0,238.0,570.0,9.0,9.0,9.0,567.0,236.0,9.0,0.0,0.0,0.0,0.0
25%,112676.0,1.0,4.0,2.0,3.0,0.0,3315.0,5694.75,393.75,332.5,509.25,3969.75,2181.5,291.0,1.0,56.5,10.0,71.0
50%,129600.0,2.0,7.0,4.0,9.0,0.0,5281.0,9051.0,625.5,551.5,851.0,6255.5,3417.0,412.0,3.0,101.0,19.0,123.5
75%,136393.0,3.0,10.0,6.0,11.0,1.0,13168.0,22085.5,1062.0,955.5,1463.0,14860.5,7989.0,656.25,7.0,187.5,32.25,228.5
max,139441.0,3.0,12.0,7.0,23.0,1.0,180480.0,1110282.0,11452.0,11328.0,19779.0,1107833.0,51456.0,4376.0,372.0,5172.0,790.0,6334.0
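
The `count` row above is lower than 500 for `Paid`, `like` and `share`, which suggests missing entries in those columns. A minimal sketch of how this could be checked directly is shown here. It uses a small hypothetical frame (`demo`, with made-up values) rather than the real file, so the numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame (not the real data); NaN marks a missing entry
demo = pd.DataFrame({
    "Paid": [0.0, 1.0, np.nan, 0.0],
    "like": [79.0, np.nan, 66.0, 1572.0],
    "share": [17.0, 29.0, np.nan, np.nan],
    "comment": [4, 5, 0, 58],
})

# Number of missing values in each column
missing_per_column = demo.isna().sum()
print(missing_per_column)

# count() skips missing values, which is why describe() can report fewer
# observations for some columns than the frame has rows
non_missing = demo.count()
print(non_missing)
```

On the real data, the same two calls applied to `df` should show which columns fall short of 500.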


# Create Data Subsets

In [5]:
# First subset: Like and Share
df_subset_1 = df[['like','share']]
df_subset_1

Unnamed: 0,like,share
0,79.0,17.0
1,130.0,29.0
2,66.0,14.0
3,1572.0,147.0
4,325.0,49.0
...,...,...
495,53.0,26.0
496,53.0,22.0
497,93.0,18.0
498,91.0,38.0


In [6]:
# Second subset: Comment and Type
df_subset_2 = df[['comment','Type']]
df_subset_2

Unnamed: 0,comment,Type
0,4,Photo
1,5,Status
2,0,Photo
3,58,Photo
4,19,Photo
...,...,...
495,5,Photo
496,0,Photo
497,4,Photo
498,7,Photo
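
The subsets above select columns only. Subsets can also be taken by row, either with a condition or by position. The following is a sketch under the assumption of a small hypothetical frame (`demo`) that mimics three columns of the dataset; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical sample rows that mimic a few columns of the dataset
demo = pd.DataFrame({
    "Type": ["Photo", "Status", "Photo", "Link"],
    "comment": [4, 5, 0, 58],
    "like": [79.0, 130.0, 66.0, 1572.0],
})

# Label-based: rows where Type is 'Photo', keeping only two columns
photos = demo.loc[demo["Type"] == "Photo", ["comment", "like"]]
print(photos)

# Position-based: first two rows and first two columns
head_block = demo.iloc[:2, :2]
print(head_block)
```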


# Merge Data

In [7]:
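# Inner join on values: a row pair is kept when 'comment' (left) equals 'like' (right), so repeated values multiply rows
merged_data = pd.merge(df_subset_2, df_subset_1, left_on='comment', right_on='like')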
merged_data

Unnamed: 0,comment,Type,like,share
0,4,Photo,4.0,2.0
1,4,Photo,4.0,1.0
2,4,Photo,4.0,0.0
3,4,Photo,4.0,1.0
4,4,Status,4.0,2.0
...,...,...,...,...
1462,56,Photo,56.0,17.0
1463,56,Photo,56.0,8.0
1464,56,Photo,56.0,12.0
1465,56,Photo,56.0,9.0
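
The merged result has far more rows than either 500-row subset. A likely explanation is that the join keys are not unique: every left row with a given `comment` value is paired with every right row whose `like` has the same value. The sketch below reproduces the effect on two tiny hypothetical frames (`left` and `right`) and contrasts it with a row-by-row alignment on the index.

```python
import pandas as pd

# Hypothetical miniature versions of the two subsets
left = pd.DataFrame({"comment": [4, 4, 0], "Type": ["Photo", "Status", "Photo"]})
right = pd.DataFrame({"like": [4.0, 4.0, 7.0], "share": [2.0, 1.0, 0.0]})

# Two left rows have comment == 4 and two right rows have like == 4,
# so the value-based join yields 2 x 2 = 4 rows
pairs = pd.merge(left, right, left_on="comment", right_on="like")
print(len(pairs))

# Aligning the subsets row by row uses the shared index instead
aligned = left.join(right)
print(aligned.shape)
```

If the goal is simply to put the columns of both subsets side by side for the same post, the index-based `join` is probably closer to the intent than a merge on unrelated values.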


## Example for merge

In [8]:
# Define two dictionaries of employee data that share a 'key' column
data1 = {
    'key': ['K0', 'K1', 'K2', 'K3'],
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Age': [27, 24, 22, 32]}
data2 = {
    'key': ['K0', 'K1', 'K2', 'K3'],
    'Address': ['Nagpur', 'Kanpur', 'Allahabad', 'Kannuaj'],
    'Qualification': ['Btech', 'B.A', 'Bcom', 'B.hons']}
# Convert the dictionaries into DataFrames
data1 = pd.DataFrame(data1)
data2 = pd.DataFrame(data2)

# Inner join on the shared 'key' column
res = pd.merge(data1, data2, on='key')
res

Unnamed: 0,key,Name,Age,Address,Qualification
0,K0,Jai,27,Nagpur,Btech
1,K1,Princi,24,Kanpur,B.A
2,K2,Gaurav,22,Allahabad,Bcom
3,K3,Anuj,32,Kannuaj,B.hons
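
In the example above every key appears in both frames, so the join type makes no difference. When the keys only partly overlap, the `how` argument decides which rows survive. The sketch below uses two hypothetical frames (`people` and `places`) with partly overlapping keys; `indicator=True` adds a `_merge` column recording where each row came from.

```python
import pandas as pd

# Hypothetical frames whose keys only partly overlap
people = pd.DataFrame({"key": ["K0", "K1", "K2"], "Name": ["Jai", "Princi", "Gaurav"]})
places = pd.DataFrame({"key": ["K1", "K2", "K3"], "Address": ["Kanpur", "Allahabad", "Kannuaj"]})

# Default how="inner": only keys present in both frames are kept
inner = pd.merge(people, places, on="key")
print(inner)

# how="outer": all keys are kept, with NaN where one side has no match
outer = pd.merge(people, places, on="key", how="outer", indicator=True)
print(outer)
```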


# Sort Data

In [9]:
# Sort merged_data by 'like' in descending order (returns a new DataFrame; merged_data itself is unchanged)
merged_data.sort_values(by=['like'],ascending=False)

Unnamed: 0,comment,Type,like,share
1461,146,Photo,146.0,15.0
1460,146,Photo,146.0,9.0
1438,144,Photo,144.0,10.0
1439,144,Photo,144.0,29.0
1448,64,Photo,64.0,22.0
...,...,...,...,...
563,0,Photo,0.0,0.0
564,0,Photo,0.0,0.0
565,0,Photo,0.0,0.0
566,0,Photo,0.0,0.0
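
Sorting by a single column leaves ties in an arbitrary relative order and keeps the original row labels, as the output above shows. A possible refinement, sketched on a hypothetical frame (`demo`), is to sort by several columns with a direction for each and then renumber the rows.

```python
import pandas as pd

# Hypothetical sample rows
demo = pd.DataFrame({
    "Type": ["Photo", "Status", "Photo", "Link"],
    "like": [66.0, 130.0, 130.0, 79.0],
    "share": [14.0, 29.0, 10.0, 17.0],
})

# Sort by like (descending), breaking ties by share (ascending)
ranked = demo.sort_values(by=["like", "share"], ascending=[False, True])

# Renumber the rows 0..n-1 after sorting
ranked = ranked.reset_index(drop=True)
print(ranked)
```

`sort_values` returns a new DataFrame, so `demo` keeps its original order unless the result is assigned back.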


# Transposing Data

In [10]:
# Method 1
merged_data.transpose()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,1457,1458,1459,1460,1461,1462,1463,1464,1465,1466
comment,4,4,4,4,4,4,4,4,4,4,...,51,51,51,146,146,56,56,56,56,56
Type,Photo,Photo,Photo,Photo,Status,Status,Status,Status,Status,Status,...,Photo,Photo,Photo,Photo,Photo,Photo,Photo,Photo,Photo,Photo
like,4.0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,...,51.0,51.0,51.0,146.0,146.0,56.0,56.0,56.0,56.0,56.0
share,2.0,1.0,0.0,1.0,2.0,1.0,0.0,1.0,2.0,1.0,...,11.0,6.0,6.0,9.0,15.0,17.0,8.0,12.0,9.0,25.0


In [11]:
# Method 2
merged_data.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,1457,1458,1459,1460,1461,1462,1463,1464,1465,1466
comment,4,4,4,4,4,4,4,4,4,4,...,51,51,51,146,146,56,56,56,56,56
Type,Photo,Photo,Photo,Photo,Status,Status,Status,Status,Status,Status,...,Photo,Photo,Photo,Photo,Photo,Photo,Photo,Photo,Photo,Photo
like,4.0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,4.0,...,51.0,51.0,51.0,146.0,146.0,56.0,56.0,56.0,56.0,56.0
share,2.0,1.0,0.0,1.0,2.0,1.0,0.0,1.0,2.0,1.0,...,11.0,6.0,6.0,9.0,15.0,17.0,8.0,12.0,9.0,25.0
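
Both methods give the same result; `.T` is shorthand for `.transpose()`. One side effect worth knowing is that transposing a frame with mixed column types leaves every resulting column holding a mix of numbers and strings. The sketch below checks this on a hypothetical two-row frame (`demo`).

```python
import pandas as pd

# Hypothetical frame with integer, string and float columns
demo = pd.DataFrame({
    "comment": [4, 5],
    "Type": ["Photo", "Status"],
    "like": [79.0, 130.0],
})

flipped = demo.T

# Rows and columns swap places: (2, 3) becomes (3, 2)
print(demo.shape, flipped.shape)

# Each transposed column mixes numbers and strings, so its dtype is object
print(flipped.dtypes)
```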


# Shape And Reshape Data

In [12]:
df

Unnamed: 0,Page total likes,Type,Category,Post Month,Post Weekday,Post Hour,Paid,Lifetime Post Total Reach,Lifetime Post Total Impressions,Lifetime Engaged Users,Lifetime Post Consumers,Lifetime Post Consumptions,Lifetime Post Impressions by people who have liked your Page,Lifetime Post reach by people who like your Page,Lifetime People who have liked your Page and engaged with your post,comment,like,share,Total Interactions
0,139441,Photo,2,12,4,3,0.0,2752,5091,178,109,159,3078,1640,119,4,79.0,17.0,100
1,139441,Status,2,12,3,10,0.0,10460,19057,1457,1361,1674,11710,6112,1108,5,130.0,29.0,164
2,139441,Photo,3,12,3,3,0.0,2413,4373,177,113,154,2812,1503,132,0,66.0,14.0,80
3,139441,Photo,2,12,2,10,1.0,50128,87991,2211,790,1119,61027,32048,1386,58,1572.0,147.0,1777
4,139441,Photo,2,12,2,3,0.0,7244,13594,671,410,580,6228,3200,396,19,325.0,49.0,393
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,85093,Photo,3,1,7,2,0.0,4684,7536,733,708,985,4750,2876,392,5,53.0,26.0,84
496,81370,Photo,2,1,5,8,0.0,3480,6229,537,508,687,3961,2104,301,0,53.0,22.0,75
497,81370,Photo,1,1,5,2,0.0,3778,7216,625,572,795,4742,2388,363,4,93.0,18.0,115
498,81370,Photo,3,1,4,11,0.0,4156,7564,626,574,832,4534,2452,370,7,91.0,38.0,136


In [13]:
df.Type.unique()

array(['Photo', 'Status', 'Link', 'Video'], dtype=object)
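
The shape of a DataFrame, meaning its number of rows and columns, is available through the `shape` attribute. The sketch below uses a hypothetical frame (`demo`) and also counts rows per `Type` value, which complements `unique()`.

```python
import pandas as pd

# Hypothetical sample rows
demo = pd.DataFrame({
    "Type": ["Photo", "Status", "Photo", "Link"],
    "comment": [4, 5, 0, 58],
    "like": [79.0, 130.0, 66.0, 1572.0],
})

# shape is a (rows, columns) tuple
n_rows, n_cols = demo.shape
print(n_rows, n_cols)

# Number of rows for each distinct Type value
type_counts = demo["Type"].value_counts()
print(type_counts)
```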

In [17]:
# Reshape
# 'Type' is the id_vars column (kept as is) and 'comment' is the value_vars column (unpivoted into variable/value)
pd.melt(df, id_vars =['Type'], value_vars =['comment'])

Unnamed: 0,Type,variable,value
0,Photo,comment,4
1,Status,comment,5
2,Photo,comment,0
3,Photo,comment,58
4,Photo,comment,19
...,...,...,...
495,Photo,comment,5
496,Photo,comment,0
497,Photo,comment,4
498,Photo,comment,7
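
With a single `value_vars` column the melted frame has the same number of rows as the input. The reshaping becomes more visible when several columns are melted at once. The sketch below does this on a hypothetical two-row frame (`demo`); the names `metric` and `count` passed to `var_name` and `value_name` are illustrative choices.

```python
import pandas as pd

# Hypothetical wide-format sample: one row per post
demo = pd.DataFrame({
    "Type": ["Photo", "Status"],
    "comment": [4, 5],
    "like": [79.0, 130.0],
    "share": [17.0, 29.0],
})

# Wide to long: one row per (post, metric) pair
long_form = pd.melt(
    demo,
    id_vars=["Type"],
    value_vars=["comment", "like", "share"],
    var_name="metric",
    value_name="count",
)
print(long_form)
print(demo.shape, long_form.shape)
```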


## Examples 

In [15]:
# Reshape
df_temp = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],                    
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
df_temp

Unnamed: 0,foo,bar,baz,zoo
0,one,A,1,x
1,one,B,2,y
2,one,C,3,z
3,two,A,4,q
4,two,B,5,w
5,two,C,6,t


In [16]:
df_temp.pivot(index='foo', columns='bar', values='baz')

bar,A,B,C
foo,,,
one,1,2,3
two,4,5,6
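
`pivot` works here because each (`foo`, `bar`) pair occurs exactly once. When pairs repeat, `pivot` raises an error and `pivot_table`, which aggregates the duplicates, is the usual alternative. The sketch below uses a hypothetical frame (`demo`) with column names borrowed from the dataset; the values are made up.

```python
import pandas as pd

# Hypothetical sample in which the pair (Photo, 0) occurs twice
demo = pd.DataFrame({
    "Type": ["Photo", "Photo", "Status", "Status"],
    "Paid": [0, 0, 0, 1],
    "like": [79.0, 66.0, 130.0, 10.0],
})

# pivot() would raise here because of the duplicate (Type, Paid) pair;
# pivot_table() aggregates the duplicates instead (mean of 79 and 66)
summary = demo.pivot_table(index="Type", columns="Paid", values="like", aggfunc="mean")
print(summary)
```

Combinations that never occur, such as a paid Photo in this sample, appear as NaN in the result.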
