# Building a Basic Content-Based Filtering Recommendation System Model for MyShoplivery E-commerce Platform

This project develops a simple content-based recommendation system for MyShoplivery, an e-commerce shopping platform in Nigeria. The system suggests similar items based on item metadata, such as descriptions, categories, and tags. The general idea behind such recommender systems is that if a customer selects a particular item, he or she will also like an item that is similar to it. With the MyShoplivery app, you can walk into stores from your mobile phone, select items, pay, and watch them delivered to your doorstep in minutes.


This project shows how to build a content-based recommendation system without being put off by the complexity of getting started. Even this basic version gives reasonable results, and it can be improved further to meet industry standards of quality and accuracy.

Personalization of the user experience has been a high priority and has become the
new mantra in the consumer-focused industry. You might have observed e-commerce
companies showing you personalized ads that suggest what to buy, which news to
read, which video to watch, where and what to eat, and who you might be interested in
networking with (friends or professionals) on social media sites. Recommender systems are
the core information filtering systems designed to predict user preferences and help
recommend the right items to create a user-specific, personalized experience. There are
two main types of recommendation systems: 1) content-based filtering and 2) collaborative
filtering.

# Content-Based Recommender System

Before building the model, let's import the libraries that will help in building the recommendation model.

In [1]:
# Import Pandas
import pandas as pd

#Import TfIdfVectorizer from scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer


Unnamed: 0,Id,TITLE,DESCRIPTION,Categories,Tags,Weight,Images,Price,Unnamed: 8
0,11616,HIVE FIVE HONEY 227G,Hive Five is Bax Bees’ premium product contain...,"Groceries , Food Cupboard",Condiments & Salad Dressing | Food Cupboard | ...,0.227,https://www.picclickimg.com/d/l400/pict/224019...,1200,
1,11581,CHI HAPPY HOUR ORANGE SAFARI 1LTR,Happy hour Orange Safari is a rare blend of na...,"Groceries , Beverages",Beverages | Drinks,1.0,http://oldenglishsuperstores.com/image/cache/c...,400,
2,11574,REXONA SEXY BOUQUET SPRAY 200ML,"The best accessory, No white marks.","Health & Beauty , Personal Care",Beauty & Personal Care | Personal Care,0.2,https://tellme.ng/wp-content/uploads/2020/10/1...,1200,
3,11565,FEFFERETTI CUBED CHINCHIN COCONUT 70G,,"Groceries , Snacks",Snacks | Cooking & Baking,0.07,,100,
4,11564,FEFFERETTI CUBED CHINCHIN VANILLA 70G,,"Groceries , Snacks",Snacks | Cooking & Baking,0.07,,100,


Let's load the MyShoplivery metadata dataset into a pandas DataFrame:

In [None]:
# Load the MyShoplivery product metadata
myshoplivery_data = pd.read_csv('myshoplivery_data1 - Sheet2.csv', low_memory=False)

# Print the first five rows
myshoplivery_data.head()

In [2]:
myshoplivery_data.shape

(11999, 9)

In [3]:
#Print the descriptions of the first 5 products.
myshoplivery_data['DESCRIPTION'].head()

0    Hive Five is Bax Bees’ premium product contain...
1    Happy hour Orange Safari is a rare blend of na...
2                  The best accessory, No white marks.
3                                                  NaN
4                                                  NaN
Name: DESCRIPTION, dtype: object

Here we convert the product descriptions into numeric vectors so that we can compute the similarity and dissimilarity between them. We use Term Frequency-Inverse Document Frequency (TF-IDF). Computing the TF-IDF vector of every document gives a matrix in which each row represents a product description and each column represents a word in the description vocabulary.

In Natural Language Processing, word vectors are vectorized representations of the words in a document. Learned word embeddings can carry semantic meaning, so that related words such as man and king end up with vectors close to each other. TF-IDF vectors are simpler: they capture how characteristic each word is of a document, not what the word means.

In essence, the TF-IDF score is the frequency of a word occurring in a document, down-weighted by the number of documents in which it occurs.

This is done to reduce the importance of words that occur frequently across product descriptions and, therefore, their significance in computing the final similarity score. Fortunately, scikit-learn gives you a built-in TfidfVectorizer class that produces the TF-IDF matrix in a couple of lines.
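To make the down-weighting concrete, here is a minimal sketch that computes TF-IDF by hand on three made-up product descriptions. It uses the textbook form, term frequency times log(N / document frequency). scikit-learn's TfidfVectorizer uses a smoothed variant of the IDF term and normalizes each row by default, so its numbers will differ, but the effect is the same. The toy corpus and the helper name `tf_idf` are assumptions for illustration only.

```python
import math
from collections import Counter

# Toy corpus (made up for illustration, not taken from the MyShoplivery data)
docs = [
    "pure natural honey",
    "natural orange drink",
    "natural body spray",
]
tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents does each word appear?
df = Counter(word for tokens in tokenized for word in set(tokens))

def tf_idf(word, tokens):
    """Textbook TF-IDF: term frequency times log(N / document frequency)."""
    tf = tokens.count(word) / len(tokens)
    idf = math.log(n_docs / df[word])
    return tf * idf

# "natural" occurs in every description, so it carries no weight;
# "honey" occurs in only one, so it is distinctive for that product.
print(tf_idf("natural", tokenized[0]))  # 0.0
print(tf_idf("honey", tokenized[0]))    # about 0.366
```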

- Import the Tfidf module using scikit-learn;
- Remove stop words like 'the', 'an', etc. since they do not give any useful information about the topic;
- Replace not-a-number values with a blank string;
- Finally, construct the TF-IDF matrix on the data.

In [4]:
#Import TfIdfVectorizer from scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer

#Define a TF-IDF Vectorizer Object. Remove all english stop words such as 'the', 'a'
tfidf = TfidfVectorizer(stop_words='english')

#Replace NaN with an empty string
myshoplivery_data['DESCRIPTION'] = myshoplivery_data['DESCRIPTION'].fillna('')

#myshoplivery_data['Categories'] = myshoplivery_data['Categories'].fillna('')

#Construct the required TF-IDF matrix by fitting and transforming the data
tfidf_matrix = tfidf.fit_transform(myshoplivery_data['DESCRIPTION'])

#Output the shape of tfidf_matrix
tfidf_matrix.shape

(11999, 13900)

From the above output, you can see that 13,900 distinct words make up the vocabulary used to describe the 11,999 products in the MyShoplivery dataset.

In [5]:
#Array mapping from feature integer indices to feature names.
#Note: get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() there.
tfidf.get_feature_names()[10000:10500]

['qci',
 'qei',
 'qently',
 'qt',
 'quadruple',
 'quaker',
 'qualified',
 'qualities',
 'quality',
 'qualitygreat',
 'quantitative',
 'quantities',
 'quantity',
 'quantum',
 'quart',
 'quarter',
 'quarterback',
 'quarts',
 'quartz',
 'queen',
 'quench',
 'quenched',
 'quencher',
 'quenches',
 'quenching',
 'quest',
 'question',
 'questions',
 'quick',
 'quicker',
 'quickest',
 'quickly',
 'quicly',
 'quiet',
 'quilted',
 'quilters',
 'quilts',
 'quinine',
 'quintessential',
 'quip',
 'quit',
 'quite',
 'quo',
 'quote',
 'rabbit',
 'race',
 'racing',
 'rack',
 'radiance',
 'radiant',
 'radiants',
 'radiate',
 'radiates',
 'radiating',
 'radiation',
 'radiaton',
 'radiator',
 'radical',
 'radically',
 'radicals',
 'radio',
 'raditional',
 'radix',
 'radler',
 'radox',
 'raffermit',
 'rafraîchie',
 'rag',
 'rage',
 'raid',
 'raids',
 'rain',
 'rainbow',
 'rainbows',
 'raining',
 'raise',
 'raised',
 'raisers',
 'raises',
 'raisin',
 'raising',
 'raisins',
 'raisons',
 'randomly',
 'ranee'

With this matrix in hand, you can now compute a similarity score. There are several similarity metrics that you can use for this, such as:
- the Manhattan,
- the Euclidean,
- the Pearson, and
- the cosine similarity scores.

Different scores work well in different scenarios, and it is often a good idea to experiment with different metrics and observe the results.
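As a small illustration of how these metrics behave, the sketch below computes all four on two made-up vectors that point in the same direction but differ in magnitude. Note that Manhattan and Euclidean are distances (smaller means more similar), while Pearson and cosine are similarities (larger means more similar). The function names and the vectors are assumptions for illustration.

```python
import math

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pearson(a, b):
    # Pearson correlation is the cosine similarity of the mean-centered vectors
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction as a, twice the magnitude

print(manhattan(a, b))  # 6.0
print(euclidean(a, b))  # about 3.742
print(cosine(a, b))     # 1.0 up to floating-point rounding
print(pearson(a, b))    # 1.0 up to floating-point rounding
```

The two distances grow with the difference in magnitude, whereas the cosine score only looks at the direction of the vectors.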


You will be using the cosine similarity to calculate a numeric quantity that denotes the similarity between two products.

You use the cosine similarity score since it is independent of magnitude and is relatively easy and fast to calculate (especially when used in conjunction with the TF-IDF scores described above). Mathematically, for two vectors $x$ and $y$, it is defined as follows:

$$\text{cosine}(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$$

Since the TF-IDF vectorizer normalizes every vector to unit length by default, calculating the dot product between two vectors directly gives you the cosine similarity score.

Therefore, you will use sklearn's linear_kernel() instead of cosine_similarity() since it is faster. This returns a matrix of shape 11999x11999, which holds the cosine similarity score of each product description with every other product description. Hence, each product corresponds to a 1x11999 row vector in which each column is the similarity score with one product.

In [6]:
# Import linear_kernel
from sklearn.metrics.pairwise import linear_kernel

# Compute the cosine similarity matrix
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)

In [7]:
cosine_sim.shape

(11999, 11999)

In [8]:
cosine_sim[1]

array([0., 1., 0., ..., 0., 0., 0.])
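The claim that a plain dot product is enough can be checked on a tiny example. The sketch below, using three made-up descriptions, compares linear_kernel() with cosine_similarity() on a TF-IDF matrix built with the default settings of TfidfVectorizer (which include `norm='l2'`).

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Toy descriptions (made up for illustration)
toy_docs = [
    "raw organic honey",
    "organic honey hair cream",
    "orange fruit drink",
]

# TfidfVectorizer uses norm='l2' by default, so every row has unit length
toy_tfidf = TfidfVectorizer().fit_transform(toy_docs)

dot_products = linear_kernel(toy_tfidf, toy_tfidf)
cosines = cosine_similarity(toy_tfidf, toy_tfidf)

# On unit-length rows the plain dot product equals the cosine similarity
print(np.allclose(dot_products, cosines))  # True
# Each document is maximally similar to itself
print(np.allclose(np.diag(dot_products), 1.0))  # True
```

If the vectorizer were created with `norm=None`, the two results would no longer agree, and cosine_similarity() would be the function to use.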

You're going to define a function that takes in a product or item title as an input and outputs a list of the 10 most similar products. 

Firstly, for this, you need a reverse mapping of product titles and DataFrame indices. In other words, you need a mechanism to identify the index of a product in your metadata DataFrame, given its title.

In [9]:
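#Construct a reverse map from product titles to DataFrame indices
indices = pd.Series(myshoplivery_data.index, index=myshoplivery_data['TITLE'])
#Keep the first row for each title so that a title lookup returns a single index
indices = indices[~indices.index.duplicated(keep='first')]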

In [10]:
indices[:10]

TITLE
HIVE FIVE HONEY 227G                       0
CHI HAPPY HOUR ORANGE SAFARI 1LTR          1
REXONA SEXY BOUQUET SPRAY 200ML            2
FEFFERETTI CUBED CHINCHIN COCONUT 70G      3
FEFFERETTI CUBED CHINCHIN VANILLA  70G     4
FEFFERETTI CUBED CHINCHIN NUTMEG 70G       5
FEFFERETTI CUBED CHINCHIN CINNAMON  70G    6
SOULMATE HAIR CONDITIONER PLUS 330G        7
MEGA GROWTH NON-LYN RELAXER 12S            8
GOOD KNIGHT POWER SHOTs 120ML              9
dtype: int64
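One caveat with a title-based lookup: if two rows share the same title, looking up that title returns several positions instead of one. Note also that `Series.drop_duplicates()` removes duplicate values (here the row positions, which are always unique), not duplicate index labels. The sketch below uses a made-up three-row catalogue to show the difference and one way to keep a single position per title with `Index.duplicated`.

```python
import pandas as pd

# Made-up catalogue in which one title appears twice
toy = pd.DataFrame({"TITLE": ["HONEY 500G", "BODY SPRAY 150ML", "HONEY 500G"]})

toy_indices = pd.Series(toy.index, index=toy["TITLE"])

# drop_duplicates() compares the values (0, 1, 2), so nothing is removed
print(len(toy_indices.drop_duplicates()))  # 3

# A duplicated title yields a Series of positions instead of a single integer
print(toy_indices["HONEY 500G"].tolist())  # [0, 2]

# Keep only the first row for each title
unique_indices = toy_indices[~toy_indices.index.duplicated(keep="first")]
print(unique_indices["HONEY 500G"])  # 0
```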

We are now in good shape to define the recommendation function. These are the steps you'll follow:
1. Get the index of the product given its title.
2. Get the list of cosine similarity scores for that particular product with all products. Convert it into a list of tuples where the first element is its position, and the second is the similarity score.
3. Sort the aforementioned list of tuples based on the similarity scores; that is, the second element.
4. Get the top 10 elements of this list. Ignore the first element as it refers to self (the product most similar to a particular product is the product itself).
5. Return the titles corresponding to the indices of the top elements.

In [11]:
# Function that takes in item title as input and outputs most similar products


def get_recommendations(TITLE, cosine_sim=cosine_sim):
    # Get the index of the product that matches the title
    idx = indices[TITLE]

    # Get the pairwise similarity scores of all products with that item
    sim_scores = list(enumerate(cosine_sim[idx]))

    # Sort the products based on the similarity scores
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

    # Get the scores of the 10 most similar items
    sim_scores = sim_scores[1:11]

    # Get the items indices
    shop_indices = [i[0] for i in sim_scores]

    # Return the top 10 most similar items
    return myshoplivery_data['TITLE'].iloc[shop_indices]
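
The function above drops the first sorted entry on the assumption that it is the query product itself. That holds when the product's similarity to itself is the strict maximum, but a product with an empty description has a similarity of 0 to everything, including itself, and ties are possible when two products share an identical description. The sketch below is one possible variant that removes the query product by its position instead. `top_k_similar` is a hypothetical helper name, and the small matrix is made up.

```python
import numpy as np

def top_k_similar(sim_matrix, idx, k=10):
    """Return the positions of the k most similar items to idx, excluding idx."""
    scores = np.asarray(sim_matrix[idx], dtype=float)
    # Stable sort on the negated scores: highest score first, ties by position
    order = np.argsort(-scores, kind="stable")
    # Drop the query item explicitly instead of assuming it comes first
    order = order[order != idx]
    return order[:k]

# Made-up similarity matrix; item 2 has an all-zero row (e.g. empty description)
toy_sim = np.array([
    [1.0, 0.8, 0.0, 0.3],
    [0.8, 1.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0],
    [0.3, 0.1, 0.0, 1.0],
])

print(top_k_similar(toy_sim, 0, k=2))  # [1 3]
print(top_k_similar(toy_sim, 2, k=2))  # [0 1]
```

The returned positions could then be passed to `myshoplivery_data['TITLE'].iloc[...]` in the same way as in the function above.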

In [12]:
get_recommendations('HIVE FIVE HONEY 227G')

1077               KIRKLAND ORGANIC RAW HONEY 608G
392                          BEEBI HILL HONEY 720G
3252                GLUPA SKIN WHITENING CREAM 50G
2133               Honey Tree Honey 440Gm(Squeezy)
480         PALMOLIVE NATURAL SHOWER GEL MIX 500ML
4465               GLAM'S LIP GLOSS -7ML VARIETIES
88                         LEGEND GINGER DRINK 10S
6230    HONEY INFUSION DAILY HEAT HAIR CREAM -170G
7303                 LASER PURE BLOSSOM HONEY 500G
3314                    REVLON NATURAL HONEY 600ML
Name: TITLE, dtype: object

In [13]:
get_recommendations('REXONA SEXY BOUQUET SPRAY 200ML')

2505     SUPREME WHITE -500ML BODY LOTION CARROT EXT.
53                               SURE DRY SPRAY 150ML
52                    SURE INVISIBLE PURE SPRAY 150ML
1917            SURE INVISIBLE PURE WOMEN SPRAY 250ML
4508                         GLAM'S PRO TOUCH PINCEAU
1909            SURE MOTION SENSE BLACK + WHITE 250ML
8526                               DOVE INVISIBLE DRY
5230              AMOS WHITE AS STRETCH MARKS SOAP X3
2899     NANO EXTRA WHITE PAPAYA & CARROT BODY LOTION
11606                     DOVE INVIBLE DRY BODY SPRAY
Name: TITLE, dtype: object

While the system has done a decent job of finding products with similar descriptions, the quality of the recommendations is not that great.

For example, "HIVE FIVE HONEY 227G" returns a skin whitening cream, a shower gel, and a lip gloss alongside other honeys, most likely because their descriptions share some of the same words, whereas a customer who picked a jar of honey is more likely to be interested in other food items. This is something that cannot be captured by your present system.

# Categories and Tags Based Recommender


The quality of your recommender would be increased with the usage of better metadata and by capturing more of the finer details. That is precisely what you are going to do in this section. 

You will build a recommender system based on the following metadata: 
The Categories, and
The product tags.

Both columns are already present in your main DataFrame, so there is nothing to load or merge. The first step is simply to drop the unneeded Unnamed: 8 column.

In [16]:
#drop unnecessary column
myshoplivery_data= myshoplivery_data.drop(['Unnamed: 8'],axis=1)

In [17]:
myshoplivery_data.head()

Unnamed: 0,Id,TITLE,DESCRIPTION,Categories,Tags,Weight,Images,Price
0,11616,HIVE FIVE HONEY 227G,Hive Five is Bax Bees’ premium product contain...,"Groceries , Food Cupboard",Condiments & Salad Dressing | Food Cupboard | ...,0.227,https://www.picclickimg.com/d/l400/pict/224019...,1200
1,11581,CHI HAPPY HOUR ORANGE SAFARI 1LTR,Happy hour Orange Safari is a rare blend of na...,"Groceries , Beverages",Beverages | Drinks,1.0,http://oldenglishsuperstores.com/image/cache/c...,400
2,11574,REXONA SEXY BOUQUET SPRAY 200ML,"The best accessory, No white marks.","Health & Beauty , Personal Care",Beauty & Personal Care | Personal Care,0.2,https://tellme.ng/wp-content/uploads/2020/10/1...,1200
3,11565,FEFFERETTI CUBED CHINCHIN COCONUT 70G,,"Groceries , Snacks",Snacks | Cooking & Baking,0.07,,100
4,11564,FEFFERETTI CUBED CHINCHIN VANILLA 70G,,"Groceries , Snacks",Snacks | Cooking & Baking,0.07,,100


In [18]:
myshoplivery_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11999 entries, 0 to 11998
Data columns (total 8 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Id           11875 non-null  object
 1   TITLE        11989 non-null  object
 2   DESCRIPTION  11999 non-null  object
 3   Categories   11986 non-null  object
 4   Tags         11947 non-null  object
 5   Weight       10568 non-null  object
 6   Images       9639 non-null   object
 7   Price        11995 non-null  object
dtypes: object(8)
memory usage: 750.1+ KB


In [19]:
myshoplivery_data.shape

(11999, 8)

The Categories and Tags columns each hold several values per product in a single string: the categories are separated by commas and the tags by pipes (|). Before they can be fed to the vectorizer, you need to convert them into a consistent, usable form.

In [25]:
# Print the cleaned features of the first 3 products
myshoplivery_data[['TITLE', 'DESCRIPTION', 'Categories', 'Tags']].head(3)

Unnamed: 0,TITLE,DESCRIPTION,Categories,Tags
0,HIVE FIVE HONEY 227G,Hive Five is Bax Bees’ premium product contain...,"groceries , food cupboard","condiments & salad dressing , food cupboard , ..."
1,CHI HAPPY HOUR ORANGE SAFARI 1LTR,Happy hour Orange Safari is a rare blend of na...,"groceries , beverages","beverages , drinks"
2,REXONA SEXY BOUQUET SPRAY 200ML,"The best accessory, No white marks.","health & beauty , personal care","beauty & personal care , personal care"


The next step is to convert the category and tag strings into lowercase and to replace the pipe separators with commas, so that both columns use the same format. A common further preprocessing step is to strip the spaces inside multi-word values, so that the vectorizer treats a tag such as "food cupboard" as the single token "foodcupboard" rather than as the separate words "food" and "cupboard". Otherwise, unrelated phrases that share a word, such as "bread jam" and "traffic jam", would look partly alike to the vectorizer. The function below performs only the lowercasing and the separator replacement:

In [21]:
def clean_data(x):
    # Lowercase every element of a list and use commas as the separator
    if isinstance(x, list):
        return [i.replace("|", ",").lower() for i in x]
    # Lowercase a plain string and use commas as the separator
    if isinstance(x, str):
        return x.replace("|", ",").lower()
    # Missing values (NaN) become an empty string
    return ''
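
The function above keeps multi-word values such as "food cupboard" as they are. If you also want each category or tag to count as a single token, one option is to split on the separator and remove the inner spaces. This is a minimal sketch; `to_tokens` is a hypothetical helper that is not used elsewhere in this notebook.

```python
import re

def to_tokens(value, separator=","):
    """Turn 'Beverages , Soft Drinks' into 'beverages softdrinks'."""
    if not isinstance(value, str):
        # Missing values (NaN) become an empty string
        return ""
    parts = [part.strip().lower() for part in value.split(separator)]
    # Remove whitespace inside each value so that it becomes one token
    tokens = [re.sub(r"\s+", "", part) for part in parts if part]
    return " ".join(tokens)

print(to_tokens("Groceries , Food Cupboard"))          # groceries foodcupboard
print(to_tokens("Beverages | Drinks", separator="|"))  # beverages drinks
print(to_tokens(float("nan")))                         # prints an empty line
```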

In [22]:
# Apply clean_data function to your features.  
features = ['Categories', 'Tags']

In [23]:
for feature in features:
    myshoplivery_data[feature] = myshoplivery_data[feature].apply(clean_data)

In [24]:
# Print the first two products after cleaning
myshoplivery_data.head(2)

Unnamed: 0,Id,TITLE,DESCRIPTION,Categories,Tags,Weight,Images,Price
0,11616,HIVE FIVE HONEY 227G,Hive Five is Bax Bees’ premium product contain...,"groceries , food cupboard","condiments & salad dressing , food cupboard , ...",0.227,https://www.picclickimg.com/d/l400/pict/224019...,1200
1,11581,CHI HAPPY HOUR ORANGE SAFARI 1LTR,Happy hour Orange Safari is a rare blend of na...,"groceries , beverages","beverages , drinks",1.0,http://oldenglishsuperstores.com/image/cache/c...,400


Because Categories and Tags are already plain strings, no further extraction functions are needed for this dataset.

We are now in a position to create the "metadata soup", which is a string that contains all the metadata that you want to feed to your vectorizer (namely the tags, categories, and description). The create_soup function is meant to join all the required columns with spaces. This is the final preprocessing step, and the output of this function will be fed into the vectorizer.

In [26]:
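def create_soup(x):
    # Note: Tags and DESCRIPTION are strings at this point, so ' '.join(...)
    # inserts a space between every character (see the output below)
    return ' '.join(x['Tags']) + ' ' + x['Categories'] + ' ' + ' '.join(x['DESCRIPTION'])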

In [27]:
# Create a new soup feature
myshoplivery_data['soup'] = myshoplivery_data.apply(create_soup, axis=1)

In [28]:
myshoplivery_data[['soup']].head(2)

Unnamed: 0,soup
0,c o n d i m e n t s & s a l a d d r e s ...
1,"b e v e r a g e s , d r i n k s groceries ..."
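
In the output above, the tags are spelled out one character at a time. Tags and DESCRIPTION are plain strings rather than lists at this point, so `' '.join(...)` places a space between every character, and CountVectorizer's default token pattern ignores single-character tokens. In effect, only the Categories text reaches the vectorizer. A minimal sketch of a variant that joins the three columns as whole strings follows. `create_soup_whole` is a hypothetical name, and the results shown elsewhere in this notebook were produced with the original function, not with this one.

```python
def create_soup_whole(row):
    """Join Tags, Categories and DESCRIPTION as whole strings."""
    parts = [row["Tags"], row["Categories"], row["DESCRIPTION"]]
    # Skip anything that is not a string (for example NaN) and empty strings
    return " ".join(part for part in parts if isinstance(part, str) and part)

# Made-up row for illustration
example_row = {
    "Tags": "beverages , drinks",
    "Categories": "groceries , beverages",
    "DESCRIPTION": "A blend of orange juice.",
}

# Joining a string splits it into characters
print(" ".join(example_row["Tags"])[:17])  # b e v e r a g e s
print(create_soup_whole(example_row))
# beverages , drinks groceries , beverages A blend of orange juice.
```

It could be applied with `myshoplivery_data.apply(create_soup_whole, axis=1)`. The vocabulary size and the recommendations would then change, so the outputs would need to be regenerated.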


The next steps are the same as what you did with your description-based recommender. One key difference is that you use CountVectorizer() instead of TF-IDF. This is because you do not want to down-weight a category or tag just because it appears in relatively more products; it doesn't make much intuitive sense to down-weight it in this context. The major difference between CountVectorizer() and TF-IDF is the inverse document frequency (IDF) component, which is present in the latter and not in the former.
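
The difference can be seen on two made-up soups in which one word is shared by both and another appears in only one. This is a small sketch using the default settings of both vectorizers.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Made-up soups: "groceries" appears in both, "beverages" in only one
toy_soups = ["snacks snacks groceries", "groceries beverages"]

count_vec = CountVectorizer()
counts = count_vec.fit_transform(toy_soups).toarray()

tfidf_vec = TfidfVectorizer()
weights = tfidf_vec.fit_transform(toy_soups).toarray()

# Raw counts: both words count equally in the second soup
c_groceries = counts[1, count_vec.vocabulary_["groceries"]]
c_beverages = counts[1, count_vec.vocabulary_["beverages"]]
print(c_groceries, c_beverages)  # 1 1

# TF-IDF: the word shared by both soups gets the smaller weight
w_groceries = weights[1, tfidf_vec.vocabulary_["groceries"]]
w_beverages = weights[1, tfidf_vec.vocabulary_["beverages"]]
print(w_groceries < w_beverages)  # True
```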

In [29]:
# Import CountVectorizer and create the count matrix
from sklearn.feature_extraction.text import CountVectorizer

count = CountVectorizer(stop_words='english')
count_matrix = count.fit_transform(myshoplivery_data['soup'])

In [30]:
count_matrix.shape

(11999, 177)

From the above output, you can see that there are 177 distinct terms in the vocabulary built from the metadata that you fed to it. Next, you will use cosine_similarity to measure the similarity between the count vectors.

In [31]:
# Compute the Cosine Similarity matrix based on the count_matrix
from sklearn.metrics.pairwise import cosine_similarity

cosine_sim2 = cosine_similarity(count_matrix, count_matrix)

In [32]:
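# Reset index of your main DataFrame and construct reverse mapping as before
myshoplivery_data = myshoplivery_data.reset_index()

indices = pd.Series(myshoplivery_data.index, index=myshoplivery_data['TITLE'])
# Keep the first row for each title so that a title lookup returns a single index
indices = indices[~indices.index.duplicated(keep='first')]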

You can now reuse your get_recommendations() function by passing in the new cosine_sim2 matrix as your second argument.

In [33]:
get_recommendations('HIVE FIVE HONEY 227G', cosine_sim2)

44                      MORNING TIME WHITE OAT 500G
98                  REMIA VITAL FULL MARGARINE 500G
99                             REMIA MARGARINE 450G
140              MARYLAND BIG & CHUNKY COOKIES 180G
141         MARYLAND TREATS CHOC CHIPS COOKIES 200G
142        MARYLAND TREATS DOUBLE CHOC COOKIES 200G
161                          HERMAN WHITE OATS 500G
221    POST HONEY BUNCHES OF OATS WITH ALMONDS 510G
226                    KELLOGG'S MOON AND STAR 400G
251                           QUAKER ROLLED OAT 1KG
Name: TITLE, dtype: object

In [34]:
get_recommendations('REXONA SEXY BOUQUET SPRAY 200ML', cosine_sim2)

14     BIC RAZOR CLASSIC NORMAL 5X 24
33       ADIDAS GET READY SPRAY 150ML
34     PRETTY COSMETIC PAD SQUARE 80S
35               PLUS COTTONSWAB 100S
37         PRETTY COTTON BUD 200COUNT
50         DOVE SENSITIVE SPRAY 150ML
51        SURE COTTON DRY SPRAY 150ML
52    SURE INVISIBLE PURE SPRAY 150ML
53               SURE DRY SPRAY 150ML
54    SURE BRIGHT BOUQUET SPRAY 150ML
Name: TITLE, dtype: object

In [35]:
myshoplivery_data.head(10)

Unnamed: 0,index,Id,TITLE,DESCRIPTION,Categories,Tags,Weight,Images,Price,soup
0,0,11616,HIVE FIVE HONEY 227G,Hive Five is Bax Bees’ premium product contain...,"groceries , food cupboard","condiments & salad dressing , food cupboard , ...",0.227,https://www.picclickimg.com/d/l400/pict/224019...,1200,c o n d i m e n t s & s a l a d d r e s ...
1,1,11581,CHI HAPPY HOUR ORANGE SAFARI 1LTR,Happy hour Orange Safari is a rare blend of na...,"groceries , beverages","beverages , drinks",1.0,http://oldenglishsuperstores.com/image/cache/c...,400,"b e v e r a g e s , d r i n k s groceries ..."
2,2,11574,REXONA SEXY BOUQUET SPRAY 200ML,"The best accessory, No white marks.","health & beauty , personal care","beauty & personal care , personal care",0.2,https://tellme.ng/wp-content/uploads/2020/10/1...,1200,b e a u t y & p e r s o n a l c a r e ...
3,3,11565,FEFFERETTI CUBED CHINCHIN COCONUT 70G,,"groceries , snacks","snacks , cooking & baking",0.07,,100,"s n a c k s , c o o k i n g & b a k i ..."
4,4,11564,FEFFERETTI CUBED CHINCHIN VANILLA 70G,,"groceries , snacks","snacks , cooking & baking",0.07,,100,"s n a c k s , c o o k i n g & b a k i ..."
5,5,11562,FEFFERETTI CUBED CHINCHIN NUTMEG 70G,,"groceries ,snacks","snacks , cooking & baking",0.07,,100,"s n a c k s , c o o k i n g & b a k i ..."
6,6,11561,FEFFERETTI CUBED CHINCHIN CINNAMON 70G,,"groceries ,snacks","snacks , cooking & baking",0.07,,100,"s n a c k s , c o o k i n g & b a k i ..."
7,7,11556,SOULMATE HAIR CONDITIONER PLUS 330G,Magnificent hair conditioning and hair shine *...,"health & beauty , beauty & personal care",beauty & personal care,0.33,https://www.paketznpiecezltd.com/wp-content/up...,850,b e a u t y & p e r s o n a l c a r e he...
8,8,11555,MEGA GROWTH NON-LYN RELAXER 12S,This MegaGrowth Crème on Crème Relaxer System ...,"health & beauty ,beauty & personal care",beauty & personal care,,https://images-na.ssl-images-amazon.com/images...,5500,b e a u t y & p e r s o n a l c a r e he...
9,9,11552,GOOD KNIGHT POWER SHOTs 120ML,This is a ready to use non propellant spray in...,"groceries , household supplies","household supplies , insecticides",0.12,https://static-s3.supermart.ng/productImage/sp...,1100,"h o u s e h o l d s u p p l i e s , i n ..."


Great! You see that your recommender has been successful in capturing more information due to more metadata and has given you better recommendations. 

There are, of course, numerous ways of experimenting with this system to improve recommendations.

Some suggestions:
- Introduce a popularity filter: this recommender would take the 30 most similar products, rank them by a popularity signal such as customer ratings or sales volume (neither is available in the current dataset), and return the top 10 products.
- Other metadata: other fields, such as the product title, could also be included in the soup.
- Increase the weight of a feature: to give more weight to the tags, they could be mentioned multiple times in the soup to increase the similarity scores of product items with the same tag.
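
The idea of weighting by repetition can be sketched in a few lines. With a count-based vectorizer, repeating a piece of text multiplies the counts of its words, so it contributes more to the similarity score. `weighted_soup` and its `tag_weight` parameter are hypothetical names used only for illustration.

```python
def weighted_soup(tags, categories, tag_weight=2):
    """Repeat the tags tag_weight times so that they count more than categories."""
    return " ".join([tags] * tag_weight + [categories])

print(weighted_soup("beverages drinks", "groceries"))
# beverages drinks beverages drinks groceries
```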

# Model Evaluation

In [36]:
# Look up a recommended product to confirm whether the recommendation is sensible.

value_list=["MORNING TIME WHITE OAT 500G"]

value_series=myshoplivery_data.TITLE.isin(value_list)

filtered_myshoplivery_data=myshoplivery_data[value_series]

In [37]:
filtered_myshoplivery_data

Unnamed: 0,index,Id,TITLE,DESCRIPTION,Categories,Tags,Weight,Images,Price,soup
44,44,11423,MORNING TIME WHITE OAT 500G,"Good Morning White Oats is high in fibre, give...",groceries > food cupboard,"breakfast foods , food cupboard , jarred & pac...",0.5,https://static-s3.supermart.ng/productImage/al...,1200,"b r e a k f a s t f o o d s , f o o d ..."


In [38]:
value_list=["HIVE FIVE HONEY 227G"]

value_series=myshoplivery_data.TITLE.isin(value_list)

filtered_myshoplivery_data=myshoplivery_data[value_series]
filtered_myshoplivery_data

Unnamed: 0,index,Id,TITLE,DESCRIPTION,Categories,Tags,Weight,Images,Price,soup
0,0,11616,HIVE FIVE HONEY 227G,Hive Five is Bax Bees’ premium product contain...,"groceries , food cupboard","condiments & salad dressing , food cupboard , ...",0.227,https://www.picclickimg.com/d/l400/pict/224019...,1200,c o n d i m e n t s & s a l a d d r e s ...
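
Looking up products one at a time is a manual check. As a rough complement, one could measure how often a recommended product shares at least one category with the query product. This is only a sanity check, not a measure of how useful the recommendations are to customers. The sketch below uses made-up category strings, and `category_overlap_rate` is a hypothetical helper.

```python
def split_categories(value):
    """Split a cleaned category string such as 'groceries , snacks' into a set."""
    return {part.strip() for part in value.split(",") if part.strip()}

def category_overlap_rate(query_categories, recommended_categories):
    """Fraction of recommendations sharing at least one category with the query."""
    query = split_categories(query_categories)
    hits = sum(
        1 for cats in recommended_categories if query & split_categories(cats)
    )
    return hits / len(recommended_categories)

rate = category_overlap_rate(
    "groceries , food cupboard",
    [
        "groceries , food cupboard",
        "health & beauty , personal care",
        "groceries , snacks",
        "groceries , beverages",
    ],
)
print(rate)  # 0.75
```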
