### Using NLP for Text Data Quality
**Objective**: Enhance text data quality using NLP techniques.

**Task**: Handling Noisy Text Data

**Steps**:
1. Dataset: Obtain a dataset of customer reviews that contain noise (e.g., random characters).
2. Clean Data: Use regex patterns to remove the noise from the text.
3. Evaluate: Compare each review before and after cleaning to check that the noise was removed.


import pandas as pd
import re

# Step 1: Sample noisy customer reviews
data = {
    'ReviewID': [1, 2, 3, 4],
    'ReviewText': [
        "Great product!!! $$$ Loved it :) #awesome123",
        "Terrible service...!!! #### Will not buy again!!!",
        "Good value for $$$ money!!! But shipping was slow...##",
        "??? Worst experience ever!!! Call 123-456-7890!!!"
    ]
}

df = pd.DataFrame(data)

# Function to clean noisy text using regex
def clean_text(text):
    # Keep only letters, digits, whitespace, and , . ! ?
    # (this also drops hyphens, so phone numbers lose their separators)
    text = re.sub(r'[^a-zA-Z0-9\s,.!?]', '', text)
    # Collapse any run of whitespace (spaces, tabs, newlines) into a single space
    text = re.sub(r'\s+', ' ', text)
    # Strip leading/trailing spaces
    return text.strip()

# Step 2: Clean the reviews
df['CleanedReview'] = df['ReviewText'].apply(clean_text)

# Step 3: Evaluate before and after cleaning
for _, row in df.iterrows():
    print(f"Original Review [{row['ReviewID']}]: {row['ReviewText']}")
    print(f"Cleaned Review  [{row['ReviewID']}]: {row['CleanedReview']}")
    print("-" * 60)


Original Review [1]: Great product!!! $$$ Loved it :) #awesome123
Cleaned Review  [1]: Great product!!! Loved it awesome123
------------------------------------------------------------
Original Review [2]: Terrible service...!!! #### Will not buy again!!!
Cleaned Review  [2]: Terrible service...!!! Will not buy again!!!
------------------------------------------------------------
Original Review [3]: Good value for $$$ money!!! But shipping was slow...##
Cleaned Review  [3]: Good value for money!!! But shipping was slow...
------------------------------------------------------------
Original Review [4]: ??? Worst experience ever!!! Call 123-456-7890!!!
Cleaned Review  [4]: ??? Worst experience ever!!! Call 1234567890!!!
------------------------------------------------------------
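
The printed comparison above is a visual check. As a rough numeric complement, the share of characters that fall outside the allow-list can be measured before and after cleaning. The sketch below is one possible way to do that; `NOISE_PATTERN` and `noise_ratio` are hypothetical names introduced here, and treating "noise" as "anything the cleaning regex would remove" is an assumption, not a standard metric.

```python
import re

# Assumption: "noise" means any character outside the allow-list used by clean_text
# (letters, digits, whitespace, and , . ! ?).
NOISE_PATTERN = re.compile(r'[^a-zA-Z0-9\s,.!?]')

def noise_ratio(text):
    """Return the fraction of characters in `text` that match NOISE_PATTERN."""
    if not text:
        return 0.0
    return len(NOISE_PATTERN.findall(text)) / len(text)

# Review 1 from the sample data, before and after cleaning
before = "Great product!!! $$$ Loved it :) #awesome123"
after = "Great product!!! Loved it awesome123"

print(f"Noise ratio before: {noise_ratio(before):.3f}")
print(f"Noise ratio after:  {noise_ratio(after):.3f}")
```

Applied to the DataFrame, the same function could be used as `df['ReviewText'].apply(noise_ratio)` and `df['CleanedReview'].apply(noise_ratio)` to compare the two columns row by row.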

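The cleaned reviews still contain runs of repeated punctuation such as `!!!` and `...`. If those also count as noise for a given use case, one more regex pass could collapse them. This is a minimal sketch under that assumption; `collapse_punctuation` is a hypothetical helper and not part of the cleaning function above.

```python
import re

def collapse_punctuation(text):
    """Collapse runs of the same punctuation mark (, . ! ?) into a single mark."""
    # ([,.!?]) captures one mark; \1+ matches one or more repeats of that same mark
    return re.sub(r'([,.!?])\1+', r'\1', text)

# Cleaned review 2 from the output above
print(collapse_punctuation("Terrible service...!!! Will not buy again!!!"))
```

Note that this reduces an ellipsis to a single period, which may not be wanted if ellipses carry meaning in the reviews.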