## Final Exam WQD 7005 Data Mining 
## Question 4

### Name: Nurullainy binti Mat Rashid                   
### ID :  17036591

### Topic: Rule Mining of Internet Movie Database (IMDb)

The objectives are to find frequent itemsets and to mine association rules using data from IMDb. The data was collected from the following website: https://www.imdb.com/search/title/?year=2017


To achieve this, I will cover the following steps:

    1) Importing required libraries
    2) Creating a list from dataset (Question 1)
    3) Convert list to dataframe with boolean values
    4) Find frequently occurring itemsets using Apriori Algorithm
    5) Find frequently occurring itemsets using F-P Growth
    6) Mine the Association Rules

### 1) Importing required libraries

In [1]:
import pandas as pd
import numpy as np

from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import fpgrowth
from mlxtend.frequent_patterns import association_rules

import csv

In [2]:
# Load dataset

df = pd.read_csv('movies_imdb_preprocessed.csv')
df.head()

Unnamed: 0,movie_name,year_released,runtime_in_min,genre,revenues,imdb_rating,user_votes,director,actor
0,Gladiator,2000,155,"Action, Adventure, Drama",187705427,8.5,1295546,Ridley Scott,"Russell Crowe, Joaquin Phoenix, Connie Nielsen..."
1,Memento,2000,113,"Mystery, Thriller",25544867,8.4,1088700,Christopher Nolan,"Guy Pearce, Carrie-Anne Moss, Joe Pantoliano, ..."
2,Snatch,2000,104,"Comedy, Crime",30328156,8.3,760646,Guy Ritchie,"Jason Statham, Brad Pitt, Benicio Del Toro, De..."
3,Requiem for a Dream,2000,102,Drama,3635482,8.3,742193,Darren Aronofsky,"Ellen Burstyn, Jared Leto, Jennifer Connelly, ..."
4,X-Men,2000,104,"Action, Adventure, Sci-Fi",157299717,7.4,558716,Bryan Singer,"Patrick Stewart, Hugh Jackman, Ian McKellen, F..."


In [3]:
df.isnull().sum()

movie_name         0
year_released      0
runtime_in_min     0
genre              0
revenues           0
imdb_rating        0
user_votes         0
director          78
actor             78
dtype: int64

In [4]:
df.dropna(how='any', inplace=True)

In [5]:
df.shape

(816, 9)

### 2) Creating a list from dataset (Question 1)



In [6]:
# Create new subset of dataset

revenue_actor = df[['revenues', 'genre']]

revenue_actor.head(10)

Unnamed: 0,revenues,genre
0,187705427,"Action, Adventure, Drama"
1,25544867,"Mystery, Thriller"
2,30328156,"Comedy, Crime"
3,3635482,Drama
4,157299717,"Action, Adventure, Sci-Fi"
5,233632142,"Adventure, Drama, Romance"
6,15070285,"Comedy, Crime, Drama"
7,95011339,"Drama, Mystery, Sci-Fi"
8,215409889,"Action, Adventure, Thriller"
9,166244045,"Comedy, Romance"


In [7]:
# Change format of revenues data

revenue_actor = revenue_actor.copy()  # explicit copy, so the assignments below do not act on a slice of df
revenue_actor['revenues'] = revenue_actor['revenues'].div(1000000)  # Change to Million notation
revenue_actor.info()


revenue_actor['revenues'] = revenue_actor['revenues'].round(0).astype(int)
revenue_actor.columns = ['revenues in mil', 'genre']  # Rename the columns

revenue_actor.head()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 816 entries, 0 to 815
Data columns (total 2 columns):
revenues    816 non-null float64
genre       816 non-null object
dtypes: float64(1), object(1)
memory usage: 19.1+ KB



Unnamed: 0,revenues in mil,genre
0,188,"Action, Adventure, Drama"
1,26,"Mystery, Thriller"
2,30,"Comedy, Crime"
3,4,Drama
4,157,"Action, Adventure, Sci-Fi"


The output above shows revenues rounded to whole millions, which I use as the transaction identifier. There are 298 unique revenue values among the 816 rows that remain after dropping missing values.

Next, consolidate the genres into one transaction per revenue value (in millions), so that each row holds all genre strings of the movies sharing that revenue value.
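
As a rough illustration of this consolidation step, the sketch below groups a few (revenue, genre) rows into one transaction per rounded revenue value using only the standard library. The helper name `build_baskets` is hypothetical, the last input row is made up to force two movies into the same group, and Python's built-in `round` is assumed to be an acceptable stand-in for the pandas rounding used in this notebook.

```python
from collections import defaultdict


def build_baskets(rows):
    """Group genre strings by revenue rounded to the nearest million (hypothetical helper)."""
    baskets = defaultdict(list)
    for revenue, genre in rows:
        key = int(round(revenue / 1_000_000))  # revenue in whole millions
        baskets[key].append(genre)             # the full genre string is one item
    return dict(baskets)


rows = [
    (3635482, "Drama"),
    (25544867, "Mystery, Thriller"),
    (30328156, "Comedy, Crime"),
    (187705427, "Action, Adventure, Drama"),
    (29900000, "Drama"),  # made-up row, added so that one group holds two movies
]

baskets = build_baskets(rows)
for revenue_in_mil in sorted(baskets):
    print(revenue_in_mil, baskets[revenue_in_mil])
```

Note that each item in a transaction is the complete genre string of one movie (for example "Comedy, Crime"), not an individual genre.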

In [8]:
# Group genre by revenue generated (298 unique revenue values)

basket = revenue_actor.groupby(['revenues in mil'])['genre'].apply(list)

print("\n", basket[:6])


 revenues in mil
0    [Action, Crime, Thriller, Crime, Drama, Myster...
1    [Drama, Mystery, Sci-Fi, Action, Drama, Sci-Fi...
2    [Crime, Drama, Crime, Drama, Comedy, Romance, ...
3                                     [Drama, Romance]
4    [Drama, Crime, Drama, Musical, Comedy, Drama, ...
5    [Drama, Thriller, Animation, Adventure, Family...
Name: genre, dtype: object


In [9]:
# Convert the grouped genres to a list of lists (for model preparation)

basket_list = list(basket)

print("\n", basket_list[:10])


 [['Action, Crime, Thriller', 'Crime, Drama, Mystery', 'Action, Crime, Drama', 'Crime, Drama, Sport', 'Adventure, Comedy, Sci-Fi', 'Drama', 'Drama, Fantasy, Romance', 'Comedy, Horror', 'Action, Adventure, Drama'], ['Drama, Mystery, Sci-Fi', 'Action, Drama, Sci-Fi', 'Action, Drama, Mystery', 'Drama, Thriller', 'Drama, Thriller', 'Drama'], ['Crime, Drama', 'Crime, Drama', 'Comedy, Romance, Sport', 'Drama, Horror, Romance', 'Biography, Crime, Drama'], ['Drama, Romance'], ['Drama', 'Crime, Drama, Musical', 'Comedy, Drama, Romance', 'Action, Comedy, Crime', 'Sci-Fi, Thriller'], ['Drama, Thriller', 'Animation, Adventure, Family', 'Drama, Mystery, Sci-Fi', 'Action, Drama, Sci-Fi', 'Animation, Drama, Fantasy'], ['Comedy, Drama', 'Biography, Drama, History', 'Drama, Romance', 'Action, Crime, Thriller', 'Drama, Mystery, Romance', 'Action, Adventure, Comedy', 'Comedy, Drama'], ['Drama, Mystery, Thriller', 'Comedy, Drama', 'Drama'], ['Crime, Drama', 'Comedy, Crime, Drama'], ['Comedy, Drama, Roman

### 3) Convert list to dataframe with Boolean values

In [10]:
# Convert list to dataframe with Boolean values

te = TransactionEncoder()
te_ary = te.fit(basket_list).transform(basket_list)

df2 = pd.DataFrame(te_ary, columns=te.columns_)
df2.head(10)

Unnamed: 0,"Action, Adventure","Action, Adventure, Biography","Action, Adventure, Comedy","Action, Adventure, Crime","Action, Adventure, Drama","Action, Adventure, Family","Action, Adventure, Fantasy","Action, Adventure, History","Action, Adventure, Horror","Action, Adventure, Mystery",...,Horror,"Horror, Mystery","Horror, Mystery, Thriller","Horror, Sci-Fi","Horror, Sci-Fi, Thriller","Horror, Thriller","Mystery, Sci-Fi, Thriller","Mystery, Thriller","Romance, Sci-Fi, Thriller","Sci-Fi, Thriller"
0,False,False,False,False,True,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,True
5,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
6,False,False,True,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
7,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
8,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
9,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


The table above shows which genre strings occur in each revenue group. Each row is one revenue value and each column is one genre string: `True` means that at least one movie with that genre string generated that revenue value, and `False` means that none did.
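
The idea behind this Boolean encoding can be sketched in plain Python. This is only a conceptual illustration on made-up transactions, with a hypothetical helper `one_hot_encode`; it is not how `TransactionEncoder` is implemented.

```python
def one_hot_encode(transactions):
    """Return (sorted column names, list of Boolean rows), one row per transaction."""
    columns = sorted({item for t in transactions for item in t})
    rows = [[col in set(t) for col in columns] for t in transactions]
    return columns, rows


# Toy transactions: each item is a full genre string, as in the notebook
transactions = [
    ["Drama", "Comedy, Crime"],
    ["Drama, Romance"],
    ["Drama", "Drama, Romance"],
]

columns, rows = one_hot_encode(transactions)
print(columns)
for row in rows:
    print(row)
```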

### 4) Find frequently occurring itemsets using Apriori Algorithm

`Apriori` is an algorithm for frequent itemset mining and association rule learning over relational databases. The algorithm identifies the frequent individual items in the database and extends them to larger and larger itemsets as long as those itemsets appear sufficiently often in the database.

The frequent itemsets determined by `Apriori` can be used to determine `Association Rules` which highlight general trends in the database. This has applications in domains such as market basket analysis.

`Support` and `Confidence` measure how interesting a rule is. They are used to exclude rules whose `Support` or `Confidence` falls below the minimum support and minimum confidence, respectively. I experimented with several minimum support values, and 0.01 worked best for this dataset.

1)	Pros: Easy to code up

2)	Cons: May be slow on large datasets

3)	Works with: Numeric values, nominal values


#### General approach to Apriori algorithm:

    1)	Preparation: Any data type will work because we are storing sets.
    2)	Train: Use the Apriori algorithm to find frequent itemsets.
    3)	Test: Doesn’t apply.
    4)	Application: This will be used to find frequent itemsets and association rules between items.
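
The level-wise idea behind Apriori can be sketched as follows. This is a simplified illustration on made-up transactions, not the mlxtend implementation used in the next cell; the function name `apriori_sketch` and the toy data are assumptions.

```python
from itertools import combinations


def apriori_sketch(transactions, min_support):
    """Return {frozenset: support} for every itemset with support >= min_support."""
    tsets = [frozenset(t) for t in transactions]
    n = len(tsets)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset
        return sum(itemset <= t for t in tsets) / n

    # Level 1: frequent single items
    items = {item for t in tsets for item in t}
    current = {frozenset([i]): support(frozenset([i])) for i in items}
    current = {c: s for c, s in current.items() if s >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates}
        current = {c: s for c, s in current.items() if s >= min_support}
        frequent.update(current)
        k += 1
    return frequent


transactions = [
    ["Action", "Comedy", "Drama"],
    ["Action", "Comedy"],
    ["Action", "Drama"],
    ["Comedy", "Drama"],
    ["Action", "Comedy", "Drama"],
]

frequent = apriori_sketch(transactions, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The pruning step is what keeps the search manageable: an itemset is only counted if all of its smaller subsets were already found to be frequent.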

In [11]:
# Frequently occurring itemsets using Apriori Algorithm

frequent_itemsets_apriori = apriori(df2, min_support=0.01, 
                                    use_colnames=True).sort_values(by='support', ascending=0)
frequent_itemsets_apriori

Unnamed: 0,support,itemsets
7,0.204698,"(Action, Adventure, Sci-Fi)"
31,0.117450,"(Animation, Adventure, Comedy)"
5,0.097315,"(Action, Adventure, Fantasy)"
39,0.083893,(Comedy)
45,0.080537,"(Comedy, Drama, Romance)"
3,0.070470,"(Action, Adventure, Drama)"
1,0.070470,"(Action, Adventure, Comedy)"
51,0.057047,"(Crime, Drama, Thriller)"
10,0.057047,"(Action, Comedy, Crime)"
48,0.057047,"(Comedy, Romance)"


### 5) Find frequently occurring itemsets using F-P Growth (Frequent Pattern Growth) 

1) Pros: Usually faster than Apriori.

2) Cons: Difficult to implement; certain datasets degrade the performance.

3) Works with: Nominal values

#### General approach to FP-growth algorithm

    1) Preparation: Discrete data is needed because we’re storing sets. For continuous data, it will need to be 
    quantized into discrete values.
    2) Train: Build an FP-tree and mine the tree.
    3) Test: Doesn’t apply.
    4) Application: This can be used to identify commonly occurring items that can be used to make decisions, 
    suggest items, make forecasts, and so on.
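
The "build an FP-tree" step can be sketched in plain Python as below. This covers only tree construction, not the recursive mining of the tree, and it is not the mlxtend implementation; the class `FPNode`, the function `build_fp_tree`, and the toy transactions are illustrative assumptions.

```python
from collections import Counter


class FPNode:
    """One node of the FP-tree: an item, its count, and its child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Build an FP-tree and return (root, counts of the frequent items)."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    root = FPNode(None)
    for t in transactions:
        # Keep only frequent items, most frequent first (ties broken by name)
        ordered = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1  # shared prefixes are counted, not duplicated
            node = child
    return root, frequent


transactions = [
    ["Drama", "Comedy"],
    ["Drama", "Action"],
    ["Drama", "Comedy", "Action"],
    ["Comedy"],
    ["Drama"],
]

root, frequent = build_fp_tree(transactions, min_count=2)
drama = root.children["Drama"]
print(frequent)
print(drama.count, {k: v.count for k, v in drama.children.items()})
```

Because transactions that share a prefix share a path in the tree, the data is scanned only twice (once to count items, once to insert), which is the usual explanation for FP-growth being faster than Apriori.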

The frequently occurring movie genres found using F-P Growth are as follows:

In [12]:
# Frequently occurring itemsets using F-P Growth (Frequent Pattern Growth)

frequent_itemsets_fpgrowth = fpgrowth(df2, min_support=0.01, 
                                     use_colnames=True).sort_values(by='support', ascending=0)
frequent_itemsets_fpgrowth

Unnamed: 0,support,itemsets
39,0.204698,"(Action, Adventure, Sci-Fi)"
37,0.117450,"(Animation, Adventure, Comedy)"
45,0.097315,"(Action, Adventure, Fantasy)"
33,0.083893,(Comedy)
15,0.080537,"(Comedy, Drama, Romance)"
0,0.070470,"(Action, Adventure, Drama)"
18,0.070470,"(Action, Adventure, Comedy)"
1,0.057047,"(Action, Crime, Thriller)"
32,0.057047,"(Crime, Drama, Thriller)"
16,0.057047,"(Action, Comedy, Crime)"


### 6) Mine the Association Rules

Association rule analysis is a technique to uncover how items are associated with each other. There are 3 common ways to measure association:

1) Measure 1: `Support`. This says how popular an itemset is, as measured by the proportion of transactions in which the itemset appears. If we discover that sales of items beyond a certain proportion tend to have a significant impact on our profits, we might consider using that proportion as our `support` threshold. We then treat itemsets with `support` values above this threshold as significant itemsets.

2) Measure 2: `Confidence`. This says how likely item B is purchased when item A is purchased, expressed as {A -> B}. This is measured by the proportion of transactions with item A in which item B also appears. One drawback of the `confidence` measure is that it might misrepresent the importance of an association, because it only accounts for how popular A is, but not B. If B is also very popular in general, there will be a higher chance that a transaction containing A will also contain B, thus inflating the confidence measure. To account for the base popularity of both constituent items, we use a third measure called `Lift`.

3) Measure 3: `Lift`. This says how likely item B is purchased when item A is purchased, while controlling for how popular item B is. 

    a) Lift of {A -> B} = 1 means no association between the items.
    b) Lift of {A -> B} > 1 means that item B is likely to be bought if item A is bought.
    c) Lift of {A -> B} < 1 means that item B is unlikely to be bought if item A is bought.
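
To make the three measures concrete, here is a minimal sketch that computes them on a made-up set of four transactions. The function names and the toy data are illustrative assumptions and are not part of this notebook or of mlxtend.

```python
def support(transactions, itemset):
    """Proportion of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, a, b):
    """Proportion of the transactions containing A that also contain B."""
    return support(transactions, set(a) | set(b)) / support(transactions, a)


def lift(transactions, a, b):
    """Confidence of {A -> B} divided by the base popularity of B."""
    return confidence(transactions, a, b) / support(transactions, b)


transactions = [
    {"A", "B"},
    {"A", "B"},
    {"A"},
    {"C"},
]

print("support(A)       =", support(transactions, {"A"}))
print("support(A, B)    =", support(transactions, {"A", "B"}))
print("confidence(A->B) =", confidence(transactions, {"A"}, {"B"}))
print("lift(A->B)       =", lift(transactions, {"A"}, {"B"}))
```

Here A appears in 3 of 4 transactions and B in 2 of 4, and both appear together in 2 of 4, so the confidence of {A -> B} is 2/3 and the lift is (2/3) / (1/2) = 4/3, which is above 1.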

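Now I want to build the association rules. The cells below use the `association_rules` function from mlxtend, which takes the frequent itemsets dataframe together with a `metric` (here `lift`) and a `min_threshold` (here 1), so only rules with a lift of at least 1 are returned. For comparison, the `apriori` function of the separate Apyori package accepts a number of arguments, mainly:

    a) transactions: list of lists of items in transactions (e.g. [['A', 'B'], ['B', 'C']]).
    b) min_support: Minimum support of relations, as a float. Default 0.1.
    c) min_confidence: Minimum confidence of relations, as a float. Default 0.0.
    d) min_lift: Minimum lift of relations, as a float. Default 0.0.
    e) max_length: Max length of the relations. Default None.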

#### 6a) Mine the Association Rules using Apriori Algorithm

In [13]:
# Generate the Association Rules using Apriori Algorithm with their corresponding support, confidence and lift. 

rules_apriori = association_rules(frequent_itemsets_apriori, metric="lift", min_threshold=1)
rules_apriori = rules_apriori.sort_values(by='lift', ascending=0)

In [14]:
# View top 10 rules 

rules_apriori.head(10)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
71,"(Comedy, Drama)","(Action, Adventure, Comedy, Comedy)",0.053691,0.016779,0.010067,0.187500,11.175000,0.009166,1.210119
70,"(Action, Adventure, Comedy, Comedy)","(Comedy, Drama)",0.016779,0.053691,0.010067,0.600000,11.175000,0.009166,2.365772
69,"(Comedy, Drama, Comedy)","(Action, Adventure, Comedy)",0.013423,0.070470,0.010067,0.750000,10.642857,0.009121,3.718121
72,"(Action, Adventure, Comedy)","(Comedy, Drama, Comedy)",0.070470,0.013423,0.010067,0.142857,10.642857,0.009121,1.151007
47,"(Comedy, Drama, Romance)","(Comedy, Crime, Drama)",0.080537,0.020134,0.010067,0.125000,6.208333,0.008446,1.119847
46,"(Comedy, Crime, Drama)","(Comedy, Drama, Romance)",0.020134,0.080537,0.010067,0.500000,6.208333,0.008446,1.838926
49,"(Comedy, Drama)","(Drama, Mystery, Sci-Fi)",0.053691,0.030201,0.010067,0.187500,6.208333,0.008446,1.193598
48,"(Drama, Mystery, Sci-Fi)","(Comedy, Drama)",0.030201,0.053691,0.010067,0.333333,6.208333,0.008446,1.419463
68,"(Action, Adventure, Comedy, Comedy, Drama)",(Comedy),0.020134,0.083893,0.010067,0.500000,5.960000,0.008378,1.832215
44,"(Comedy, Crime)",(Comedy),0.020134,0.083893,0.010067,0.500000,5.960000,0.008378,1.832215


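Each rule consists of an `Antecedent` and a `Consequent`, both of which are sets of genre strings. Because each item is the full genre string of a movie, an entry printed as (Action, Adventure, Comedy, Comedy) is the two-item set containing "Action, Adventure, Comedy" and "Comedy". Note that the implication here is co-occurrence and not causality.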
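
The `leverage` and `conviction` columns in the table can be recomputed by hand from the supports. The sketch below does this for the rule with antecedent support 0.016779 and consequent support 0.053691. The raw counts (5, 16 and 3 out of 298 transactions) are my own inference from the printed supports and are an assumption, not values reported by the notebook.

```python
n = 298        # number of transactions (unique revenue values)
count_a = 5    # assumed: 5 / 298 is about 0.016779 (antecedent support)
count_b = 16   # assumed: 16 / 298 is about 0.053691 (consequent support)
count_ab = 3   # assumed: 3 / 298 is about 0.010067 (rule support)

support_a = count_a / n
support_b = count_b / n
support_ab = count_ab / n

confidence = support_ab / support_a
lift = confidence / support_b
# Leverage: observed joint support minus the support expected under independence
leverage = support_ab - support_a * support_b
# Conviction: how much more often A would occur without B if the two were independent
conviction = (1 - support_b) / (1 - confidence)

print(f"confidence = {confidence:.6f}")
print(f"lift       = {lift:.6f}")
print(f"leverage   = {leverage:.6f}")
print(f"conviction = {conviction:.6f}")
```

Under these assumed counts the results agree with the table row: confidence 0.6, lift 11.175, leverage 0.009166 and conviction 2.365772.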

The maximum value of `Lift` is `11.175` and the maximum value of `Confidence` is `0.75` among the association rules found using the Apriori algorithm. I want to view the rules with a `Lift` above 1 and a `Confidence` of at least 0.5. For such rules, genre B is likely to appear when genre A appears.

In [15]:
# Filter the dataframe for Lift > 1 and high confidence >= 0.5

rules_apriori[(rules_apriori['lift'] > 1) & 
              (rules_apriori['confidence'] >= 0.5)].sort_values(by='lift', ascending=0)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
70,"(Action, Adventure, Comedy, Comedy)","(Comedy, Drama)",0.016779,0.053691,0.010067,0.6,11.175,0.009166,2.365772
69,"(Comedy, Drama, Comedy)","(Action, Adventure, Comedy)",0.013423,0.07047,0.010067,0.75,10.642857,0.009121,3.718121
46,"(Comedy, Crime, Drama)","(Comedy, Drama, Romance)",0.020134,0.080537,0.010067,0.5,6.208333,0.008446,1.838926
68,"(Action, Adventure, Comedy, Comedy, Drama)",(Comedy),0.020134,0.083893,0.010067,0.5,5.96,0.008378,1.832215
44,"(Comedy, Crime)",(Comedy),0.020134,0.083893,0.010067,0.5,5.96,0.008378,1.832215
59,"(Drama, Sport)","(Action, Adventure, Sci-Fi)",0.013423,0.204698,0.010067,0.75,3.663934,0.007319,3.181208


Now I want to view the rules with a `Lift` below 1 and a `Confidence` of at least 0.5. A `Lift` below 1 would mean that genre B is unlikely to appear when genre A appears.

In [16]:
# Filter the dataframe for Lift < 1 and high confidence >= 0.5

rules_apriori[(rules_apriori['lift'] < 1) & 
              (rules_apriori['confidence'] >= 0.5)].sort_values(by='lift', ascending=0)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction


In [17]:
# Filter the dataframe for Lift = 1 and high confidence >= 0.5

rules_apriori[(rules_apriori['lift'] == 1) & 
              (rules_apriori['confidence'] >= 0.5)].sort_values(by='lift', ascending=0)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
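
One caveat about the `lift == 1` filter: lift is a floating-point ratio, so a rule whose lift is mathematically 1 may not compare exactly equal to 1. A tolerance-based comparison is a more cautious check. The sketch below uses made-up lift values and the standard library's `math.isclose`; it is an illustration, not part of the original analysis.

```python
import math

# Made-up lift values; the first is mathematically 1 but is computed from floats
lift_values = [0.1 * 3 / 0.3, 1.0, 11.175]

exact = [v for v in lift_values if v == 1]
close = [v for v in lift_values if math.isclose(v, 1.0, rel_tol=1e-9)]

print("exact matches:", exact)
print("close matches:", close)
```

With a pandas column, the same idea could be expressed with a small tolerance around 1 instead of `==`.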


#### Findings 1:

1) About 74 rules have a `Lift` above 1, which means that the antecedent increases the chance of the genres in `Consequents` occurring, regardless of the `Confidence` value.

2) A `Lift` greater than 1 indicates a positive association between `Antecedents` and `Consequents`. The greater the `Lift`, the stronger the association. Here, a revenue group that contains the genre strings (Action, Adventure, Comedy) and (Comedy) is about 11 times more likely than average to also contain (Comedy, Drama).

3) Rules involving the Comedy genre have the highest `Lift` values.

4) `Lift` is the measure that can help a movie producer decide what kind of movie genre to produce next, based on the revenue generated by individual movies.

5) These 74 rules also have a wide range of `Confidence` values, between 0.04 and 0.75.

6) No association rule with a `Lift` below 1 appears. This is expected, because the rules were generated with `metric="lift"` and `min_threshold=1`, which excludes such rules.

#### 6b) Mine the Association Rules using F-P Growth

In [18]:
rules_fpgrowth = association_rules(frequent_itemsets_fpgrowth, metric="lift", min_threshold=1)
rules_fpgrowth = rules_fpgrowth.sort_values(by='lift', ascending=0)

In [19]:
rules_fpgrowth

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
32,"(Action, Adventure, Comedy, Comedy)","(Comedy, Drama)",0.016779,0.053691,0.010067,0.600000,11.175000,0.009166,2.365772
33,"(Comedy, Drama)","(Action, Adventure, Comedy, Comedy)",0.053691,0.016779,0.010067,0.187500,11.175000,0.009166,1.210119
31,"(Comedy, Drama, Comedy)","(Action, Adventure, Comedy)",0.013423,0.070470,0.010067,0.750000,10.642857,0.009121,3.718121
34,"(Action, Adventure, Comedy)","(Comedy, Drama, Comedy)",0.070470,0.013423,0.010067,0.142857,10.642857,0.009121,1.151007
57,"(Comedy, Drama, Romance)","(Comedy, Crime, Drama)",0.080537,0.020134,0.010067,0.125000,6.208333,0.008446,1.119847
56,"(Comedy, Crime, Drama)","(Comedy, Drama, Romance)",0.020134,0.080537,0.010067,0.500000,6.208333,0.008446,1.838926
71,"(Comedy, Drama)","(Drama, Mystery, Sci-Fi)",0.053691,0.030201,0.010067,0.187500,6.208333,0.008446,1.193598
70,"(Drama, Mystery, Sci-Fi)","(Comedy, Drama)",0.030201,0.053691,0.010067,0.333333,6.208333,0.008446,1.419463
48,"(Comedy, Crime)",(Comedy),0.020134,0.083893,0.010067,0.500000,5.960000,0.008378,1.832215
30,"(Action, Adventure, Comedy, Comedy, Drama)",(Comedy),0.020134,0.083893,0.010067,0.500000,5.960000,0.008378,1.832215


The maximum value of `Lift` is `11.175` and the maximum value of `Confidence` is `0.75` among the association rules found using the F-P Growth algorithm, the same as for the Apriori algorithm. I again want to view the rules with a `Lift` above 1 and a `Confidence` of at least 0.5.

In [20]:
# Filter the dataframe for Lift > 1 and high confidence >= 0.5

rules_fpgrowth[(rules_fpgrowth['lift'] > 1) & 
               (rules_fpgrowth['confidence'] >= 0.5)].sort_values(by='lift', ascending=0)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
32,"(Action, Adventure, Comedy, Comedy)","(Comedy, Drama)",0.016779,0.053691,0.010067,0.6,11.175,0.009166,2.365772
31,"(Comedy, Drama, Comedy)","(Action, Adventure, Comedy)",0.013423,0.07047,0.010067,0.75,10.642857,0.009121,3.718121
56,"(Comedy, Crime, Drama)","(Comedy, Drama, Romance)",0.020134,0.080537,0.010067,0.5,6.208333,0.008446,1.838926
48,"(Comedy, Crime)",(Comedy),0.020134,0.083893,0.010067,0.5,5.96,0.008378,1.832215
30,"(Action, Adventure, Comedy, Comedy, Drama)",(Comedy),0.020134,0.083893,0.010067,0.5,5.96,0.008378,1.832215
43,"(Drama, Sport)","(Action, Adventure, Sci-Fi)",0.013423,0.204698,0.010067,0.75,3.663934,0.007319,3.181208


In [21]:
# Filter the dataframe for Lift < 1 and high confidence >= 0.5

rules_fpgrowth[(rules_fpgrowth['lift'] < 1) & 
               (rules_fpgrowth['confidence'] >= 0.5)].sort_values(by='lift', ascending=0)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction


In [22]:
# Filter the dataframe for Lift = 1 and high confidence >= 0.5

rules_fpgrowth[(rules_fpgrowth['lift'] == 1) & 
               (rules_fpgrowth['confidence'] >= 0.5)].sort_values(by='lift', ascending=0)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction


#### Findings 2:

1) The output from the `F-P Growth algorithm` is the same as that from the `Apriori algorithm` for this case study. About 74 rules have a `Lift` above 1, which means that the antecedent increases the chance of the genres in `Consequents` occurring, regardless of the `Confidence` value.

2) These 74 rules also have a wide range of `Confidence` values, between 0.04 and 0.75.

3) The `F-P Growth algorithm` processed the data faster than the `Apriori algorithm`.

4) No association rule with a `Lift` below 1 appears in this case study, which is expected because the rules were generated with `min_threshold=1` on `lift`.
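
Since the two algorithms number their rows differently, a quick way to confirm that the rule sets match is to compare them as sets of (antecedent, consequent) pairs, ignoring order and index. The sketch below does this for two rules copied from the tables above. Splitting each printed tuple into individual genre strings is my reading of the output, so treat the item boundaries as an assumption.

```python
# Each rule is an (antecedent, consequent) pair of frozensets of genre strings
rule_1 = (
    frozenset({"Comedy, Drama"}),
    frozenset({"Action, Adventure, Comedy", "Comedy"}),
)
rule_2 = (
    frozenset({"Action, Adventure, Comedy", "Comedy"}),
    frozenset({"Comedy, Drama"}),
)

apriori_rules = [rule_1, rule_2]   # order as listed in the Apriori table
fpgrowth_rules = [rule_2, rule_1]  # order as listed in the F-P Growth table

same_rules = set(apriori_rules) == set(fpgrowth_rules)
print("Rule sets match:", same_rules)
```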

In [23]:
rules_apriori.to_csv('Apriori_Revenues_Genre.csv',mode = 'w', index=False, header=True)

In [24]:
rules_fpgrowth.to_csv('FPGrowth_Revenues_Genre.csv',mode = 'w', index=False, header=True)