Problem Statement: 

A film distribution company wants to target audiences based on their likes and dislikes. As Chief Data Scientist, analyze the data and derive association rules over the movie list so that this business objective is achieved.

Business Problem:

This dataset can help solve various business problems, especially for streaming platforms or movie theaters:

Recommendation Systems: Build a recommendation engine that suggests movies to users based on their viewing history or ratings. This helps improve user engagement and retention.

Understanding Popularity: Analyzing which genres, actors, or directors are most popular among viewers helps in deciding which content to invest in or acquire.

Content Strategy: For streaming platforms, the data can be used to decide which movies to produce, license, or retire based on viewership trends.

Sentiment Analysis: By incorporating user reviews or ratings, businesses can gauge audience satisfaction and identify which movies are liked or disliked.

In [1]:
# Import required libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

In [2]:
# Load the dataset (malformed lines, if any, are skipped)
df = pd.read_csv('my_movies.csv', on_bad_lines='skip')
df

Unnamed: 0,Sixth Sense,Gladiator,LOTR1,Harry Potter1,Patriot,LOTR2,Harry Potter2,LOTR,Braveheart,Green Mile
0,1,0,1,1,0,1,0,0,0,1
1,0,1,0,0,1,0,0,0,1,0
2,0,0,1,0,0,1,0,0,0,0
3,1,1,0,0,1,0,0,0,0,0
4,1,1,0,0,1,0,0,0,0,0
5,1,1,0,0,1,0,0,0,0,0
6,0,0,0,1,0,0,1,0,0,0
7,0,1,0,0,1,0,0,0,0,0
8,1,1,0,0,1,0,0,0,0,0
9,1,1,0,0,0,0,0,1,0,1


In [3]:
df.head()

Unnamed: 0,Sixth Sense,Gladiator,LOTR1,Harry Potter1,Patriot,LOTR2,Harry Potter2,LOTR,Braveheart,Green Mile
0,1,0,1,1,0,1,0,0,0,1
1,0,1,0,0,1,0,0,0,1,0
2,0,0,1,0,0,1,0,0,0,0
3,1,1,0,0,1,0,0,0,0,0
4,1,1,0,0,1,0,0,0,0,0
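
The table above is already one-hot encoded: one row per viewer, one column per movie. If the raw data instead arrived as a list of titles per viewer, it would need converting first. The following is a minimal plain-Python sketch of that conversion; the `one_hot` helper and the three sample transactions are hypothetical and not part of the notebook. mlxtend's `TransactionEncoder` serves the same purpose within the library.

```python
def one_hot(transactions):
    """Turn lists of movie titles into (column names, boolean rows)."""
    # One column per distinct title, in alphabetical order.
    columns = sorted({movie for t in transactions for movie in t})
    # One boolean row per transaction: True where the title was watched.
    rows = [[movie in t for movie in columns] for t in transactions]
    return columns, rows


# Hypothetical raw input: one list of watched titles per viewer.
columns, rows = one_hot([
    ["Gladiator", "Patriot"],
    ["LOTR1", "LOTR2"],
    ["Gladiator"],
])
print(columns)  # ['Gladiator', 'LOTR1', 'LOTR2', 'Patriot']
print(rows[0])  # [True, False, False, True]
```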


In [6]:
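# The dataset is already one-hot encoded (one column per movie), which is the format Apriori requires.
# Cast to bool first: recent mlxtend versions warn when the DataFrame is not boolean.
# Step 2: apply the Apriori algorithm to find frequent itemsets (support >= 0.2)
frequent_itemsets = apriori(df.astype(bool), min_support=0.2, use_colnames=True)
frequent_itemsets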



Unnamed: 0,support,itemsets
0,0.6,(Sixth Sense)
1,0.7,(Gladiator)
2,0.2,(LOTR1)
3,0.2,(Harry Potter1)
4,0.6,(Patriot)
5,0.2,(LOTR2)
6,0.2,(Green Mile)
7,0.5,"(Sixth Sense, Gladiator)"
8,0.4,"(Patriot, Sixth Sense)"
9,0.2,"(Sixth Sense, Green Mile)"
10,0.6,"(Patriot, Gladiator)"
11,0.2,"(LOTR1, LOTR2)"
12,0.4,"(Patriot, Sixth Sense, Gladiator)"
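
To make the support values concrete, here is a minimal plain-Python sketch (no mlxtend) that recomputes support for a few itemsets directly from the ten rows shown in the `df` output. The `transactions` list and the `support` helper are illustrative names, not part of the notebook.

```python
# Each set holds the movies marked 1 in one row of the df output (rows 0-9).
transactions = [
    {"Sixth Sense", "LOTR1", "Harry Potter1", "LOTR2", "Green Mile"},
    {"Gladiator", "Patriot", "Braveheart"},
    {"LOTR1", "LOTR2"},
    {"Sixth Sense", "Gladiator", "Patriot"},
    {"Sixth Sense", "Gladiator", "Patriot"},
    {"Sixth Sense", "Gladiator", "Patriot"},
    {"Harry Potter1", "Harry Potter2"},
    {"Gladiator", "Patriot"},
    {"Sixth Sense", "Gladiator", "Patriot"},
    {"Sixth Sense", "Gladiator", "LOTR", "Green Mile"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every movie in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"Gladiator"}, transactions))             # 0.7
print(support({"Patriot", "Gladiator"}, transactions))  # 0.6
print(support({"LOTR1", "LOTR2"}, transactions))        # 0.2
```

An itemset is "frequent" when this fraction is at least `min_support`, which is 0.2 in the cell above, so any combination watched by at least two of the ten viewers qualifies.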


In [7]:
# Step 3: generate association rules from the frequent itemsets, keeping rules with lift >= 1
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
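
For reference, the two metrics used to judge a rule A -> C follow the standard definitions: confidence = support(A and C) / support(A), and lift = confidence / support(C). The sketch below applies them to the rule (Patriot) -> (Gladiator), using the support values printed for the frequent itemsets. The `rule_metrics` helper is a hypothetical name introduced here for illustration.

```python
def rule_metrics(support_a, support_c, support_ac):
    """Return (confidence, lift) for the rule A -> C.

    support_a  : support of the antecedent A
    support_c  : support of the consequent C
    support_ac : support of A and C together
    """
    confidence = support_ac / support_a
    lift = confidence / support_c
    return confidence, lift


# (Patriot) -> (Gladiator): support(Patriot)=0.6, support(Gladiator)=0.7,
# support(Patriot, Gladiator)=0.6, all taken from the frequent itemsets.
confidence, lift = rule_metrics(0.6, 0.7, 0.6)
print(confidence)      # 1.0
print(round(lift, 3))  # 1.429
```

A lift above 1 means the two titles are watched together more often than independence would predict, which is why `min_threshold=1` on lift is used as the filter.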

In [8]:
# Step 4: output the results
print("Frequent Itemsets:")
print(frequent_itemsets)

Frequent Itemsets:
    support                           itemsets
0       0.6                      (Sixth Sense)
1       0.7                        (Gladiator)
2       0.2                            (LOTR1)
3       0.2                    (Harry Potter1)
4       0.6                          (Patriot)
5       0.2                            (LOTR2)
6       0.2                       (Green Mile)
7       0.5           (Sixth Sense, Gladiator)
8       0.4             (Patriot, Sixth Sense)
9       0.2          (Sixth Sense, Green Mile)
10      0.6               (Patriot, Gladiator)
11      0.2                     (LOTR1, LOTR2)
12      0.4  (Patriot, Sixth Sense, Gladiator)


In [9]:
print("\nAssociation Rules:")
print(rules[['antecedents','consequents','support','confidence','lift']])



Association Rules:
                 antecedents               consequents  support  confidence  \
0              (Sixth Sense)               (Gladiator)      0.5    0.833333   
1                (Gladiator)             (Sixth Sense)      0.5    0.714286   
2                  (Patriot)             (Sixth Sense)      0.4    0.666667   
3              (Sixth Sense)                 (Patriot)      0.4    0.666667   
4              (Sixth Sense)              (Green Mile)      0.2    0.333333   
5               (Green Mile)             (Sixth Sense)      0.2    1.000000   
6                  (Patriot)               (Gladiator)      0.6    1.000000   
7                (Gladiator)                 (Patriot)      0.6    0.857143   
8                    (LOTR1)                   (LOTR2)      0.2    1.000000   
9                    (LOTR2)                   (LOTR1)      0.2    1.000000   
10    (Patriot, Sixth Sense)               (Gladiator)      0.4    1.000000   
11      (Patriot, Gladiator)    

Conclusion:

Applying association rule learning to movie viewing data shows streaming services and cinemas which titles are watched together. In this dataset, every viewer of Patriot also watched Gladiator (confidence 1.0), and LOTR1 and LOTR2 always appear together, so each title is a natural recommendation for viewers of the other. With only ten transactions these rules are indicative rather than conclusive. Recommendations built on such rules can improve engagement, viewer satisfaction, and subscriber retention, and they help marketing highlight the most relevant titles for each audience. The data used here contains only watch indicators; extending the analysis to genres, directors, actors, or ratings would require those attributes. Refining the rules as new viewing data and feedback arrive keeps the recommendations current.
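
One way the rules could feed a recommendation step is sketched below in plain Python. The antecedents, consequents, and confidence values are copied from rules 0 to 10 of the printed output; the `recommend` helper and its `min_confidence` default of 0.7 are assumptions made for illustration, not part of the notebook or of mlxtend.

```python
# (antecedent, consequent, confidence) triples from the printed rules 0-10.
rules = [
    ({"Sixth Sense"}, {"Gladiator"}, 0.833333),
    ({"Gladiator"}, {"Sixth Sense"}, 0.714286),
    ({"Patriot"}, {"Sixth Sense"}, 0.666667),
    ({"Sixth Sense"}, {"Patriot"}, 0.666667),
    ({"Sixth Sense"}, {"Green Mile"}, 0.333333),
    ({"Green Mile"}, {"Sixth Sense"}, 1.0),
    ({"Patriot"}, {"Gladiator"}, 1.0),
    ({"Gladiator"}, {"Patriot"}, 0.857143),
    ({"LOTR1"}, {"LOTR2"}, 1.0),
    ({"LOTR2"}, {"LOTR1"}, 1.0),
    ({"Patriot", "Sixth Sense"}, {"Gladiator"}, 1.0),
]


def recommend(watched, rules, min_confidence=0.7):
    """Suggest unwatched movies whose rule antecedent the viewer has fully watched."""
    watched = set(watched)
    best = {}  # movie -> highest confidence among the rules that suggest it
    for antecedent, consequent, confidence in rules:
        if confidence < min_confidence or not antecedent <= watched:
            continue
        for movie in consequent - watched:
            best[movie] = max(best.get(movie, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


print(recommend({"Patriot"}, rules))    # [('Gladiator', 1.0)]
print(recommend({"Gladiator"}, rules))  # [('Patriot', 0.857143), ('Sixth Sense', 0.714286)]
```

The confidence threshold is a tuning choice: lowering it surfaces more suggestions, such as Sixth Sense for Patriot viewers at 0.67, at the cost of weaker evidence.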