<a href="https://colab.research.google.com/github/Iamjuhwan/Deep-Learing/blob/main/Fiirst_NLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📌 NLP Text Preprocessing (Beginner-Friendly Interactive Guide)

## 🔹 Objective
Text preprocessing is an essential step in NLP. Before applying deep learning algorithms, we must clean and normalize the text so that models can process it effectively. This notebook guides the practical sessions on text preprocessing in this course and covers:

*   Lowercasing
*   Removing punctuation
*   Removing stopwords
*   Stemming / lemmatization

In [1]:
import pandas as pd
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
print("Success")

Success


In [2]:
# Download required NLTK datasets
nltk.download('stopwords')  # Stopwords list
print("Stopword Downloaded")
nltk.download('punkt')  # Tokenization
print("Punkt Downloaded")
nltk.download('wordnet')  # WordNet for Lemmatization
print("WordNet Downloaded")

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...


Stopword Downloaded


[nltk_data]   Unzipping tokenizers/punkt.zip.


Punkt Downloaded
WordNet Downloaded


[nltk_data] Downloading package wordnet to /root/nltk_data...


## Creating a Large, Diverse Dataset
We will create a dataset of 49 customer reviews with:

*   ✔️ Mixed sentiments (positive, negative, neutral)
*   ✔️ Punctuation and special characters
*   ✔️ Emojis for expression






In [3]:
# Sample dataset with 49 diverse reviews
reviews = [
    "I absolutely LOVE this product!! ❤️ It's super efficient and really worth the money. Definitely recommend! 👍",
    "Worst purchase ever... 😡 Waste of money. DO NOT BUY!! Full of issues.",
    "This product does what it says, but nothing special. 🤷‍♂️ It's okay for the price, I guess.",
    "AMAZING quality and fast shipping!!! 🚀🔥 #satisfied #fastdelivery",
    "Terrible! Had high expectations, but it broke in a week. Really disappointed. 😞",
    "This phone is great 📱, but the battery drains too fast. 🔋😕",
    "I love how easy it is to use! 🥰 Definitely a game-changer.",
    "Do not buy this laptop! 👎 It crashes every 10 minutes. So frustrating! 😡",
    "The camera quality is excellent! 📸 Love the night mode. 🌙✨",
    "Meh... the product is just average. 😐 I expected more for this price.",
    "Great customer service! 🙌 They replaced my faulty item within 24 hours.",
    "Horrible experience!! 💔 Received a broken item and no refund.",
    "I use this every day now. Super helpful! ✅",
    "The features are nice, but the software is laggy. 🤦‍♀️ Annoying!",
    "Best investment I've made this year! 🔥🔥",
    "Delivery took 2 months 😤 but the product is okay.",
    "Wouldn't recommend to anyone. Waste of time. 🙅‍♂️",
    "Super happy with my purchase!! 🎉 Everything works perfectly.",
    "This product changed my life. ✨ Absolutely incredible!",
    "Not bad, but also not great. Just okay. 😶",
    "Expected more for the price I paid. 😕",
    "Highly recommended! 👏 Fast shipping and great quality.",
    "This is my third purchase from this store and I'm never disappointed! ❤️",
    "Overpriced and underwhelming. 🫤 Could be better.",
    "Works well, but setup was a nightmare. 🛠️ Took me 2 hours!",
    "Love the design but the materials feel cheap. 🧐",
    "Absolutely worth it!! 💎 Super happy with this purchase.",
    "Stopped working after a month. 😔 Very disappointed.",
    "10/10! Would buy again. ⭐⭐⭐⭐⭐",
    "Disgusting smell 🤢, returned immediately.",
    "Best headphones I’ve ever used! 🎧 Sound quality is top-notch.",
    "Regret buying this. Not as advertised. 😠",
    "It does the job. Nothing exceptional. 🤷",
    "The size is perfect, but the fabric feels cheap. 🏷️",
    "Feels premium! 🔝 Great value for money.",
    "Returned because it didn't fit. 🚚 Hassle-free process.",
    "The manual is useless. Had to figure it out myself. 📖",
    "Why is this so expensive?? 💰 Not worth the price.",
    "Awesome customer support! 🎈 They solved my issue instantly.",
    "Very fragile. Broke after one drop. 🫣",
    "Exactly what I needed! 🎯 Highly recommended.",
    "Didn't expect much, but it exceeded my expectations! 🎊",
    "Scratches easily, but functions well. 🏁",
    "This brand never disappoints! 🏆 Will keep buying.",
    "Fake reviews everywhere. Product is terrible. 😒",
    "No words… just amazing. 🔥🔥🔥",
    "This was a gift and the recipient loved it! 🎁",
    "Cheap plastic, feels like a toy. 😡 Not recommended.",
    "Perfect for my needs. 👍 Would purchase again."
]


# List syntax reminder:
# list_syntax = ['element 1', 'element 2', 'element 3', 'element 4']
# list_syntax = [6, 3, 2, 6, 8, 9, 7, 6]

In [4]:
# Convert to DataFrame
df_reviews = pd.DataFrame(reviews, columns=["Review"])

# Display the dataset
df_reviews.head(20)  # Show first 20 rows for preview

Unnamed: 0,Review
0,I absolutely LOVE this product!! ❤️ It's super...
1,Worst purchase ever... 😡 Waste of money. DO NO...
2,"This product does what it says, but nothing sp..."
3,AMAZING quality and fast shipping!!! 🚀🔥 #satis...
4,"Terrible! Had high expectations, but it broke ..."
5,"This phone is great 📱, but the battery drains ..."
6,I love how easy it is to use! 🥰 Definitely a g...
7,Do not buy this laptop! 👎 It crashes every 10 ...
8,The camera quality is excellent! 📸 Love the ni...
9,Meh... the product is just average. 😐 I expect...


## 3️⃣ Lowercasing
📌 Why?
NLP models treat "Love" and "love" as two different tokens, so we standardize the text by converting it all to lowercase.
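
To see the effect concretely, the sketch below counts tokens before and after lowercasing. It uses only the standard library; the sample sentence and the naive whitespace split are illustrative assumptions, not part of the review dataset.

```python
from collections import Counter

# Illustrative sentence (not taken from the review dataset)
text = "Love it. LOVE the design, love the price"

# Naive whitespace tokenization, stripping punctuation at token edges
tokens = [tok.strip(".,!?") for tok in text.split()]

raw_counts = Counter(tokens)
lower_counts = Counter(tok.lower() for tok in tokens)

print(raw_counts["love"])    # counts only the exact string "love"
print(lower_counts["love"])  # counts "Love", "LOVE", and "love" together
```

Without lowercasing, the three spellings end up as three separate vocabulary entries.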

In [5]:
# Convert text to lowercase
df_reviews["Review_Lowercase"] = df_reviews["Review"].str.lower()

# Display results
df_reviews.head(10)  # Show first 10 rows


Unnamed: 0,Review,Review_Lowercase
0,I absolutely LOVE this product!! ❤️ It's super...,i absolutely love this product!! ❤️ it's super...
1,Worst purchase ever... 😡 Waste of money. DO NO...,worst purchase ever... 😡 waste of money. do no...
2,"This product does what it says, but nothing sp...","this product does what it says, but nothing sp..."
3,AMAZING quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping!!! 🚀🔥 #satis...
4,"Terrible! Had high expectations, but it broke ...","terrible! had high expectations, but it broke ..."
5,"This phone is great 📱, but the battery drains ...","this phone is great 📱, but the battery drains ..."
6,I love how easy it is to use! 🥰 Definitely a g...,i love how easy it is to use! 🥰 definitely a g...
7,Do not buy this laptop! 👎 It crashes every 10 ...,do not buy this laptop! 👎 it crashes every 10 ...
8,The camera quality is excellent! 📸 Love the ni...,the camera quality is excellent! 📸 love the ni...
9,Meh... the product is just average. 😐 I expect...,meh... the product is just average. 😐 i expect...


## 4️⃣ Removing Punctuation Marks
📌 Why?
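In many classical NLP pipelines (for example, bag-of-words models), punctuation carries little useful signal.
Removing it simplifies the text and shrinks the vocabulary. Note that the pattern used here keeps only word characters and whitespace, so emojis and symbols such as `#` are removed as well.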
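
The sketch below contrasts the notebook's pattern with an alternative that removes only ASCII punctuation. The alternative based on `string.punctuation` is an assumption for illustration, not the method used in this notebook, and the sample string is made up.

```python
import re
import string

# Made-up sample containing an emoji, a hashtag, and punctuation
sample = "great phone 📱, but #overpriced!!"

# Pattern used in this notebook: drops everything that is not a word
# character or whitespace, which includes emojis and '#'
no_punct_regex = re.sub(r"[^\w\s]", "", sample)

# Alternative (assumption): remove only the ASCII punctuation listed in
# string.punctuation, leaving emojis in place
no_punct_ascii = sample.translate(str.maketrans("", "", string.punctuation))

print(repr(no_punct_regex))
print(repr(no_punct_ascii))
```

Which variant fits depends on whether emojis should count as sentiment signals in later steps.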

In [6]:
# Function to remove punctuation from text
def remove_punctuation(text):
    r"""
    Remove all punctuation marks from the given text.

    - The function uses `re.sub(r'[^\w\s]', '', text)`, which means:
      - `[^\w\s]` → Matches any character that is NOT a word character (`\w`) or whitespace (`\s`).
      - `''` → Replaces every matched character with an empty string.
      - Emojis and symbols such as `#` also match, so they are removed too.

    Example:
    ----------
    Input  : "Hello, World!!!"
    Output : "Hello World"
    """
    return re.sub(r'[^\w\s]', '', text)

# Apply the function to each review in the dataset
df_reviews["Review_NoPunct"] = df_reviews["Review_Lowercase"].apply(remove_punctuation)

# Display the first 20 rows to observe the changes
df_reviews.head(20)


Unnamed: 0,Review,Review_Lowercase,Review_NoPunct
0,I absolutely LOVE this product!! ❤️ It's super...,i absolutely love this product!! ❤️ it's super...,i absolutely love this product its super effi...
1,Worst purchase ever... 😡 Waste of money. DO NO...,worst purchase ever... 😡 waste of money. do no...,worst purchase ever waste of money do not buy...
2,"This product does what it says, but nothing sp...","this product does what it says, but nothing sp...",this product does what it says but nothing spe...
3,AMAZING quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping satisfied f...
4,"Terrible! Had high expectations, but it broke ...","terrible! had high expectations, but it broke ...",terrible had high expectations but it broke in...
5,"This phone is great 📱, but the battery drains ...","this phone is great 📱, but the battery drains ...",this phone is great but the battery drains to...
6,I love how easy it is to use! 🥰 Definitely a g...,i love how easy it is to use! 🥰 definitely a g...,i love how easy it is to use definitely a gam...
7,Do not buy this laptop! 👎 It crashes every 10 ...,do not buy this laptop! 👎 it crashes every 10 ...,do not buy this laptop it crashes every 10 mi...
8,The camera quality is excellent! 📸 Love the ni...,the camera quality is excellent! 📸 love the ni...,the camera quality is excellent love the nigh...
9,Meh... the product is just average. 😐 I expect...,meh... the product is just average. 😐 i expect...,meh the product is just average i expected mo...


## 5️⃣ Removing Stopwords
📌 Why?
Stopwords (e.g., "is", "the", "of") appear frequently but add little meaning.

In [8]:
# Load stopwords from the NLTK library
stop_words = set(stopwords.words('english'))

# Explanation:
# - `stopwords.words('english')` loads a predefined list of common English stopwords.
# - `set(stopwords.words('english'))` converts the list into a set for faster lookup.
# - Example stopwords: "the", "is", "in", "at", "which", "and", "but", "or", "a", "an"
#
# Why use a set?
# - Checking whether a word is in a set takes O(1) time on average,
#   compared to O(n) for a list.
# - This makes stopword removal more efficient on large datasets.

# Function to remove stopwords from a given text
def remove_stopwords(text):
    """
    This function removes all stopwords from the given text.

    - It takes a sentence, splits it into individual words,
      and filters out any word that is present in the `stop_words` set.
    - It then joins the remaining words back into a single cleaned string.

    Example:
    ----------
    Input  : "this is a great product with amazing quality"
    Output : "great product amazing quality"
    """

    # Split the sentence into words and keep only words not in stop_words
    cleaned_text = ' '.join(word for word in text.split() if word not in stop_words)

    return cleaned_text  # Return the processed text without stopwords

# Apply the remove_stopwords function to each review in the dataset
df_reviews["Review_NoStopwords"] = df_reviews["Review_NoPunct"].apply(remove_stopwords)

# Display the first 10 rows to observe the changes
df_reviews.head(10)


Unnamed: 0,Review,Review_Lowercase,Review_NoPunct,Review_NoStopwords
0,I absolutely LOVE this product!! ❤️ It's super...,i absolutely love this product!! ❤️ it's super...,i absolutely love this product its super effi...,absolutely love product super efficient really...
1,Worst purchase ever... 😡 Waste of money. DO NO...,worst purchase ever... 😡 waste of money. do no...,worst purchase ever waste of money do not buy...,worst purchase ever waste money buy full issues
2,"This product does what it says, but nothing sp...","this product does what it says, but nothing sp...",this product does what it says but nothing spe...,product says nothing special okay price guess
3,AMAZING quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping satisfied f...,amazing quality fast shipping satisfied fastde...
4,"Terrible! Had high expectations, but it broke ...","terrible! had high expectations, but it broke ...",terrible had high expectations but it broke in...,terrible high expectations broke week really d...
5,"This phone is great 📱, but the battery drains ...","this phone is great 📱, but the battery drains ...",this phone is great but the battery drains to...,phone great battery drains fast
6,I love how easy it is to use! 🥰 Definitely a g...,i love how easy it is to use! 🥰 definitely a g...,i love how easy it is to use definitely a gam...,love easy use definitely gamechanger
7,Do not buy this laptop! 👎 It crashes every 10 ...,do not buy this laptop! 👎 it crashes every 10 ...,do not buy this laptop it crashes every 10 mi...,buy laptop crashes every 10 minutes frustrating
8,The camera quality is excellent! 📸 Love the ni...,the camera quality is excellent! 📸 love the ni...,the camera quality is excellent love the nigh...,camera quality excellent love night mode
9,Meh... the product is just average. 😐 I expect...,meh... the product is just average. 😐 i expect...,meh the product is just average i expected mo...,meh product average expected price
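
One side effect is visible in the table above: the NLTK English list includes negations such as "not", so "do not buy this laptop" is reduced to "buy laptop ...", which flips the apparent sentiment. The sketch below shows one possible way to keep negations. The tiny `stop_words` set is a hand-written stand-in for the NLTK list, and `NEGATIONS` and `remove_stopwords_keep_negation` are hypothetical names introduced only for this example.

```python
# Small stand-in for the NLTK English stopword list (illustrative only)
stop_words = {"do", "not", "this", "it", "is", "the", "a", "of", "no"}

# Hypothetical set of negation words we choose to keep
NEGATIONS = {"not", "no", "nor", "never"}

# Remove the negations from the stopword set before filtering
stop_words_keep_neg = stop_words - NEGATIONS

def remove_stopwords_keep_negation(text):
    """Drop stopwords but keep negation words such as 'not'."""
    return " ".join(w for w in text.split() if w not in stop_words_keep_neg)

print(remove_stopwords_keep_negation("do not buy this laptop"))
```

With the real list, the same idea would be `set(stopwords.words('english')) - NEGATIONS`.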


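## 6️⃣ Stemming and Lemmatization
📌 Why?
Inflected forms such as "crashes" and "crash" carry the same core meaning. Stemming and lemmatization reduce them to a common base so they are counted together.

In [9]: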
from nltk.stem import PorterStemmer, WordNetLemmatizer
import nltk

# Download WordNet dataset for lemmatization
nltk.download('wordnet')

# Initialize Stemmer and Lemmatizer
ps = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# Test words
words = ["running", "flies", "happiness", "better", "wolves", "studies"]

# Apply Stemming
stemmed_words = [ps.stem(word) for word in words]
print("Stemmed Words:", stemmed_words)

# Apply Lemmatization
lemmatized_words = [lemmatizer.lemmatize(word) for word in words]
print("Lemmatized Words:", lemmatized_words)


[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


Stemmed Words: ['run', 'fli', 'happi', 'better', 'wolv', 'studi']
Lemmatized Words: ['running', 'fly', 'happiness', 'better', 'wolf', 'study']


In [10]:
# Initialize the Porter Stemmer (for stemming) and WordNet Lemmatizer (for lemmatization)
ps = PorterStemmer()  # Stemming
lemmatizer = WordNetLemmatizer()  # Lemmatization

"""
What is Stemming?
-----------------
- Stemming chops words down to a stem by **removing suffixes**.
- It applies **heuristic rules** rather than a dictionary lookup.
- It does NOT always produce a real English word.

Example (Porter stemmer):
- "running" → "run"
- "happiness" → "happi"
- "better" → "better"  (unchanged; a stemmer cannot relate it to "good")

What is Lemmatization?
----------------------
- Lemmatization is usually **more accurate** than stemming.
- It uses a **dictionary-based approach** (here, WordNet) to return the base form (lemma) of a word.
- The result is a **real** English word, but it depends on the part of speech (POS).
- `lemmatizer.lemmatize(word)` treats every word as a noun by default (`pos='n'`).

Example:
- "running" → "running" by default; "run" with `pos='v'`
- "happiness" → "happiness"  (unchanged because it's already a base form)
- "better" → "better" by default; "good" with `pos='a'`
"""

# Function to apply stemming
def apply_stemming(text):
    """
    This function applies stemming to all words in a given text.

    - It splits the sentence into individual words.
    - Each word is reduced to its base form using the Porter Stemmer.
    - The words are then joined back into a processed string.

    Example:
    ----------
    Input  : "running quickly towards happiness"
    Output : "run quickli toward happi"
    """
    return ' '.join(ps.stem(word) for word in text.split())
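
To make the contrast between rule-based stemming and dictionary-based lemmatization concrete, here is a deliberately simplified sketch. `toy_stem`, `toy_lemmatize`, and `TOY_LEMMAS` are hypothetical names invented for this example; the suffix rules are NOT the Porter algorithm, and the dictionary is a tiny hand-written stand-in for WordNet.

```python
# Toy suffix-stripping "stemmer": applies fixed rules, knows no vocabulary
def toy_stem(word):
    for suffix in ("ing", "ies", "es", "s"):
        # Strip the first matching suffix if enough of the word remains
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Toy "lemmatizer": looks words up in a small hand-written dictionary
TOY_LEMMAS = {"wolves": "wolf", "studies": "study", "flies": "fly"}

def toy_lemmatize(word):
    # Unknown words are returned unchanged
    return TOY_LEMMAS.get(word, word)

for w in ["wolves", "studies", "flies"]:
    print(w, "->", toy_stem(w), "|", toy_lemmatize(w))
```

The rules are cheap but can produce non-words, while the lookup returns real words only for entries the dictionary knows.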

In [11]:
# Apply stemming to the reviews
df_reviews["Review_Stemmed"] = df_reviews["Review_NoStopwords"].apply(apply_stemming)

# Display the first 10 rows to observe changes
df_reviews.head(10)


Unnamed: 0,Review,Review_Lowercase,Review_NoPunct,Review_NoStopwords,Review_Stemmed
0,I absolutely LOVE this product!! ❤️ It's super...,i absolutely love this product!! ❤️ it's super...,i absolutely love this product its super effi...,absolutely love product super efficient really...,absolut love product super effici realli worth...
1,Worst purchase ever... 😡 Waste of money. DO NO...,worst purchase ever... 😡 waste of money. do no...,worst purchase ever waste of money do not buy...,worst purchase ever waste money buy full issues,worst purchas ever wast money buy full issu
2,"This product does what it says, but nothing sp...","this product does what it says, but nothing sp...",this product does what it says but nothing spe...,product says nothing special okay price guess,product say noth special okay price guess
3,AMAZING quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping satisfied f...,amazing quality fast shipping satisfied fastde...,amaz qualiti fast ship satisfi fastdeliveri
4,"Terrible! Had high expectations, but it broke ...","terrible! had high expectations, but it broke ...",terrible had high expectations but it broke in...,terrible high expectations broke week really d...,terribl high expect broke week realli disappoint
5,"This phone is great 📱, but the battery drains ...","this phone is great 📱, but the battery drains ...",this phone is great but the battery drains to...,phone great battery drains fast,phone great batteri drain fast
6,I love how easy it is to use! 🥰 Definitely a g...,i love how easy it is to use! 🥰 definitely a g...,i love how easy it is to use definitely a gam...,love easy use definitely gamechanger,love easi use definit gamechang
7,Do not buy this laptop! 👎 It crashes every 10 ...,do not buy this laptop! 👎 it crashes every 10 ...,do not buy this laptop it crashes every 10 mi...,buy laptop crashes every 10 minutes frustrating,buy laptop crash everi 10 minut frustrat
8,The camera quality is excellent! 📸 Love the ni...,the camera quality is excellent! 📸 love the ni...,the camera quality is excellent love the nigh...,camera quality excellent love night mode,camera qualiti excel love night mode
9,Meh... the product is just average. 😐 I expect...,meh... the product is just average. 😐 i expect...,meh the product is just average i expected mo...,meh product average expected price,meh product averag expect price


In [12]:
'''
import spacy

# Load English model (after installation)
nlp = spacy.load("en_core_web_sm")

def spacy_lemmatization(text):
    """
    Applies SpaCy lemmatization to a given text.

    - Uses the "en_core_web_sm" model to analyze words.
    - Converts each word to its base form (lemma).
    """
    doc = nlp(text)
    return " ".join([token.lemma_ for token in doc])

# Example usage
print(spacy_lemmatization("wolves better going"))
'''


In [13]:
import nltk
from nltk.stem import WordNetLemmatizer

# Ensure that WordNet is downloaded
nltk.download('wordnet')

# Initialize the WordNet Lemmatizer
lemmatizer = WordNetLemmatizer()

# Function to apply lemmatization
def apply_lemmatization(text):
    """
    This function applies lemmatization to all words in a given text.

    - It splits the sentence into individual words.
    - Each word is converted to its base form using the WordNet Lemmatizer.
    - The words are then joined back into a processed string.

    Example:
    ----------
    Input  : "wolves better going"
    Output : "wolf better going"  (words are treated as nouns by default, so only "wolves" changes)
    """
    return ' '.join(lemmatizer.lemmatize(word) for word in text.split())

# Apply lemmatization to the reviews
df_reviews["Review_Lemmatized"] = df_reviews["Review_NoStopwords"].apply(apply_lemmatization)

# Display the first 10 rows to observe changes
df_reviews.head(10)


[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


Unnamed: 0,Review,Review_Lowercase,Review_NoPunct,Review_NoStopwords,Review_Stemmed,Review_Lemmatized
0,I absolutely LOVE this product!! ❤️ It's super...,i absolutely love this product!! ❤️ it's super...,i absolutely love this product its super effi...,absolutely love product super efficient really...,absolut love product super effici realli worth...,absolutely love product super efficient really...
1,Worst purchase ever... 😡 Waste of money. DO NO...,worst purchase ever... 😡 waste of money. do no...,worst purchase ever waste of money do not buy...,worst purchase ever waste money buy full issues,worst purchas ever wast money buy full issu,worst purchase ever waste money buy full issue
2,"This product does what it says, but nothing sp...","this product does what it says, but nothing sp...",this product does what it says but nothing spe...,product says nothing special okay price guess,product say noth special okay price guess,product say nothing special okay price guess
3,AMAZING quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping satisfied f...,amazing quality fast shipping satisfied fastde...,amaz qualiti fast ship satisfi fastdeliveri,amazing quality fast shipping satisfied fastde...
4,"Terrible! Had high expectations, but it broke ...","terrible! had high expectations, but it broke ...",terrible had high expectations but it broke in...,terrible high expectations broke week really d...,terribl high expect broke week realli disappoint,terrible high expectation broke week really di...
5,"This phone is great 📱, but the battery drains ...","this phone is great 📱, but the battery drains ...",this phone is great but the battery drains to...,phone great battery drains fast,phone great batteri drain fast,phone great battery drain fast
6,I love how easy it is to use! 🥰 Definitely a g...,i love how easy it is to use! 🥰 definitely a g...,i love how easy it is to use definitely a gam...,love easy use definitely gamechanger,love easi use definit gamechang,love easy use definitely gamechanger
7,Do not buy this laptop! 👎 It crashes every 10 ...,do not buy this laptop! 👎 it crashes every 10 ...,do not buy this laptop it crashes every 10 mi...,buy laptop crashes every 10 minutes frustrating,buy laptop crash everi 10 minut frustrat,buy laptop crash every 10 minute frustrating
8,The camera quality is excellent! 📸 Love the ni...,the camera quality is excellent! 📸 love the ni...,the camera quality is excellent love the nigh...,camera quality excellent love night mode,camera qualiti excel love night mode,camera quality excellent love night mode
9,Meh... the product is just average. 😐 I expect...,meh... the product is just average. 😐 i expect...,meh the product is just average i expected mo...,meh product average expected price,meh product averag expect price,meh product average expected price


In [14]:
df_reviews

Unnamed: 0,Review,Review_Lowercase,Review_NoPunct,Review_NoStopwords,Review_Stemmed,Review_Lemmatized
0,I absolutely LOVE this product!! ❤️ It's super...,i absolutely love this product!! ❤️ it's super...,i absolutely love this product its super effi...,absolutely love product super efficient really...,absolut love product super effici realli worth...,absolutely love product super efficient really...
1,Worst purchase ever... 😡 Waste of money. DO NO...,worst purchase ever... 😡 waste of money. do no...,worst purchase ever waste of money do not buy...,worst purchase ever waste money buy full issues,worst purchas ever wast money buy full issu,worst purchase ever waste money buy full issue
2,"This product does what it says, but nothing sp...","this product does what it says, but nothing sp...",this product does what it says but nothing spe...,product says nothing special okay price guess,product say noth special okay price guess,product say nothing special okay price guess
3,AMAZING quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping!!! 🚀🔥 #satis...,amazing quality and fast shipping satisfied f...,amazing quality fast shipping satisfied fastde...,amaz qualiti fast ship satisfi fastdeliveri,amazing quality fast shipping satisfied fastde...
4,"Terrible! Had high expectations, but it broke ...","terrible! had high expectations, but it broke ...",terrible had high expectations but it broke in...,terrible high expectations broke week really d...,terribl high expect broke week realli disappoint,terrible high expectation broke week really di...
5,"This phone is great 📱, but the battery drains ...","this phone is great 📱, but the battery drains ...",this phone is great but the battery drains to...,phone great battery drains fast,phone great batteri drain fast,phone great battery drain fast
6,I love how easy it is to use! 🥰 Definitely a g...,i love how easy it is to use! 🥰 definitely a g...,i love how easy it is to use definitely a gam...,love easy use definitely gamechanger,love easi use definit gamechang,love easy use definitely gamechanger
7,Do not buy this laptop! 👎 It crashes every 10 ...,do not buy this laptop! 👎 it crashes every 10 ...,do not buy this laptop it crashes every 10 mi...,buy laptop crashes every 10 minutes frustrating,buy laptop crash everi 10 minut frustrat,buy laptop crash every 10 minute frustrating
8,The camera quality is excellent! 📸 Love the ni...,the camera quality is excellent! 📸 love the ni...,the camera quality is excellent love the nigh...,camera quality excellent love night mode,camera qualiti excel love night mode,camera quality excellent love night mode
9,Meh... the product is just average. 😐 I expect...,meh... the product is just average. 😐 i expect...,meh the product is just average i expected mo...,meh product average expected price,meh product averag expect price,meh product average expected price


In [15]:
# Reviews are long, so pandas truncates each cell by default; show the full content instead

import pandas as pd

# Set the maximum column width to display the full content of each cell
pd.set_option('display.max_colwidth', None)

# Display the DataFrame
df_reviews

# Reset the option to the default value if needed
# pd.reset_option('display.max_colwidth')


Unnamed: 0,Review,Review_Lowercase,Review_NoPunct,Review_NoStopwords,Review_Stemmed,Review_Lemmatized
0,I absolutely LOVE this product!! ❤️ It's super efficient and really worth the money. Definitely recommend! 👍,i absolutely love this product!! ❤️ it's super efficient and really worth the money. definitely recommend! 👍,i absolutely love this product its super efficient and really worth the money definitely recommend,absolutely love product super efficient really worth money definitely recommend,absolut love product super effici realli worth money definit recommend,absolutely love product super efficient really worth money definitely recommend
1,Worst purchase ever... 😡 Waste of money. DO NOT BUY!! Full of issues.,worst purchase ever... 😡 waste of money. do not buy!! full of issues.,worst purchase ever waste of money do not buy full of issues,worst purchase ever waste money buy full issues,worst purchas ever wast money buy full issu,worst purchase ever waste money buy full issue
2,"This product does what it says, but nothing special. 🤷‍♂️ It's okay for the price, I guess.","this product does what it says, but nothing special. 🤷‍♂️ it's okay for the price, i guess.",this product does what it says but nothing special its okay for the price i guess,product says nothing special okay price guess,product say noth special okay price guess,product say nothing special okay price guess
3,AMAZING quality and fast shipping!!! 🚀🔥 #satisfied #fastdelivery,amazing quality and fast shipping!!! 🚀🔥 #satisfied #fastdelivery,amazing quality and fast shipping satisfied fastdelivery,amazing quality fast shipping satisfied fastdelivery,amaz qualiti fast ship satisfi fastdeliveri,amazing quality fast shipping satisfied fastdelivery
4,"Terrible! Had high expectations, but it broke in a week. Really disappointed. 😞","terrible! had high expectations, but it broke in a week. really disappointed. 😞",terrible had high expectations but it broke in a week really disappointed,terrible high expectations broke week really disappointed,terribl high expect broke week realli disappoint,terrible high expectation broke week really disappointed
5,"This phone is great 📱, but the battery drains too fast. 🔋😕","this phone is great 📱, but the battery drains too fast. 🔋😕",this phone is great but the battery drains too fast,phone great battery drains fast,phone great batteri drain fast,phone great battery drain fast
6,I love how easy it is to use! 🥰 Definitely a game-changer.,i love how easy it is to use! 🥰 definitely a game-changer.,i love how easy it is to use definitely a gamechanger,love easy use definitely gamechanger,love easi use definit gamechang,love easy use definitely gamechanger
7,Do not buy this laptop! 👎 It crashes every 10 minutes. So frustrating! 😡,do not buy this laptop! 👎 it crashes every 10 minutes. so frustrating! 😡,do not buy this laptop it crashes every 10 minutes so frustrating,buy laptop crashes every 10 minutes frustrating,buy laptop crash everi 10 minut frustrat,buy laptop crash every 10 minute frustrating
8,The camera quality is excellent! 📸 Love the night mode. 🌙✨,the camera quality is excellent! 📸 love the night mode. 🌙✨,the camera quality is excellent love the night mode,camera quality excellent love night mode,camera qualiti excel love night mode,camera quality excellent love night mode
9,Meh... the product is just average. 😐 I expected more for this price.,meh... the product is just average. 😐 i expected more for this price.,meh the product is just average i expected more for this price,meh product average expected price,meh product averag expect price,meh product average expected price


In [16]:
# Save the DataFrame to a CSV file
df_reviews.to_csv('01_preprocessed_reviews.csv', index=False)

# Download the CSV file (Colab-specific)
from google.colab import files
files.download('01_preprocessed_reviews.csv')
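
The `google.colab` import above only works inside Colab. As a hedged sketch, the helper below (a hypothetical name, `download_if_colab`, not part of any library) skips the download step when the notebook runs elsewhere, so the saved CSV is simply left in the working directory.

```python
def download_if_colab(path):
    """Trigger a browser download in Colab; do nothing elsewhere.

    Returns True if the Colab download helper was called, else False.
    """
    try:
        # This module is only importable inside Google Colab
        from google.colab import files
    except ImportError:
        return False
    files.download(path)
    return True

called = download_if_colab("01_preprocessed_reviews.csv")
print("Download triggered:", called)
```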

