# **TASK-1**
**Sentiment Analysis:** Analyze social media or product reviews to determine sentiment (positive, negative, neutral).

Sentiment analysis is a process that analyzes written or spoken language to determine whether the tone is positive, negative, or neutral. It is a type of text analysis that uses natural language processing (NLP) to assign a score to each clause based on the sentiment expressed in the text.

Here we analyze customer sentiment in reviews of eSparse Matrix Solutions collected from Google, Facebook, and Justdial.



In [8]:
import pandas as pd
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, accuracy_score
from nltk.sentiment import SentimentIntensityAnalyzer
import nltk
nltk.download('vader_lexicon')


[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

**1. Load and Inspect the Dataset**




In [9]:
# Load the dataset
data = pd.read_csv("/content/sample_data/EMS2.csv", encoding='latin-1')
data

Unnamed: 0,link,Review,date,source,user
0,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern.I have learned lots...,a year ago,Google,"{\n ""name"": ""KALSE PRATHMESH PRASHANT"",\n ""l..."
1,https://www.google.com/maps/reviews/data=!4m8!...,Being a part of eSparse Matrix Solution pvt lt...,3 years ago,Google,"{\n ""name"": ""Prashansha Jadon"",\n ""link"": ""h..."
2,https://www.google.com/maps/reviews/data=!4m8!...,"Team of dedicated, talented and experienced pr...",2 years ago,Google,"{\n ""name"": ""Sohail Daudani"",\n ""link"": ""htt..."
3,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern. I have learned lot...,3 years ago,Google,"{\n ""name"": ""Pallabi Devi"",\n ""link"": ""https..."
4,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern . here team leaders...,3 years ago,Google,"{\n ""name"": ""Dhage Vaishnavi"",\n ""link"": ""ht..."
...,...,...,...,...,...
89,"<iframe src=""https://www.facebook.com/plugins/...",fantastic team. They gove had a long time to c...,"Monday, June 19, 2023 at 11:08 AM",Facebook,PROMINENT PUBLICATION LLP
90,"<iframe src=""https://www.facebook.com/plugins/...",Best Software Development Company with Best Se...,"Thursday, November 18, 2021 at 10:47 AM",Facebook,Swami Raj Villa
91,https://www.justdial.com/Pune/Sparse-Matrix-So...,"Sparse Matrix Solutions is great work place, m...","Thursday, 10/24/2019",Justdial,KRISHNA
92,https://www.justdial.com/Pune/Sparse-Matrix-So...,I have been working at Best Companies full-time.,"Thursday, 10/24/2019",Justdial,Onkar


In [10]:
# View basic details
data.head()
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 94 entries, 0 to 93
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   link    94 non-null     object
 1   Review  72 non-null     object
 2   date    94 non-null     object
 3   source  94 non-null     object
 4   user    94 non-null     object
dtypes: object(5)
memory usage: 3.8+ KB


**2. Preprocess the Data**

Clean the `Review` column by removing punctuation and digits and converting the text to lowercase.

In [11]:
# Text cleaning function
def clean_text(text):
    text = re.sub(r'[^\w\s]', '', text)  # Remove punctuation
    text = re.sub(r'\d+', '', text)      # Remove numbers
    text = text.lower()                  # Convert to lowercase
    return text

# Apply cleaning
data['cleaned_review_summary'] = data['Review'].apply(lambda x: clean_text(str(x)))

# Optionally drop rows with missing reviews (disabled here, so rows without review text are kept)
#data.dropna(subset=['cleaned_review_summary','Review' ], inplace=True)
data

Unnamed: 0,link,Review,date,source,user,cleaned_review_summary
0,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern.I have learned lots...,a year ago,Google,"{\n ""name"": ""KALSE PRATHMESH PRASHANT"",\n ""l...",here i joined as an interni have learned lots ...
1,https://www.google.com/maps/reviews/data=!4m8!...,Being a part of eSparse Matrix Solution pvt lt...,3 years ago,Google,"{\n ""name"": ""Prashansha Jadon"",\n ""link"": ""h...",being a part of esparse matrix solution pvt lt...
2,https://www.google.com/maps/reviews/data=!4m8!...,"Team of dedicated, talented and experienced pr...",2 years ago,Google,"{\n ""name"": ""Sohail Daudani"",\n ""link"": ""htt...",team of dedicated talented and experienced pro...
3,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern. I have learned lot...,3 years ago,Google,"{\n ""name"": ""Pallabi Devi"",\n ""link"": ""https...",here i joined as an intern i have learned lots...
4,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern . here team leaders...,3 years ago,Google,"{\n ""name"": ""Dhage Vaishnavi"",\n ""link"": ""ht...",here i joined as an intern here team leaders ...
...,...,...,...,...,...,...
89,"<iframe src=""https://www.facebook.com/plugins/...",fantastic team. They gove had a long time to c...,"Monday, June 19, 2023 at 11:08 AM",Facebook,PROMINENT PUBLICATION LLP,fantastic team they gove had a long time to co...
90,"<iframe src=""https://www.facebook.com/plugins/...",Best Software Development Company with Best Se...,"Thursday, November 18, 2021 at 10:47 AM",Facebook,Swami Raj Villa,best software development company with best se...
91,https://www.justdial.com/Pune/Sparse-Matrix-So...,"Sparse Matrix Solutions is great work place, m...","Thursday, 10/24/2019",Justdial,KRISHNA,sparse matrix solutions is great work place ma...
92,https://www.justdial.com/Pune/Sparse-Matrix-So...,I have been working at Best Companies full-time.,"Thursday, 10/24/2019",Justdial,Onkar,i have been working at best companies fulltime
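
The cleaning step above passes every value through `str(x)`, so a missing review (a float `NaN` in pandas) becomes the literal text `nan` before cleaning. The following is a minimal sketch of one way to guard against that, assuming missing values arrive as `None` or `NaN`; the helper name `clean_or_empty` and the sample sentence are illustrative and not part of the original notebook.

```python
import math
import re


def clean_text(text):
    """Strip punctuation and digits, then lowercase (same steps as the cell above)."""
    text = re.sub(r'[^\w\s]', '', text)  # remove punctuation
    text = re.sub(r'\d+', '', text)      # remove numbers
    return text.lower()


def clean_or_empty(value):
    """Hypothetical helper: return '' for missing values instead of the string 'nan'."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return ''
    return clean_text(str(value))


sample = "I had worked for 3 months, great team!"  # made-up example text
cleaned = clean_or_empty(sample)
missing = clean_or_empty(float('nan'))
naive_missing = clean_text(str(float('nan')))  # what str(x) alone would produce

print(repr(cleaned))
print(repr(missing))
print(repr(naive_missing))
```

With a DataFrame, the same helper could be passed to `data['Review'].apply(clean_or_empty)`. Note that removing digits leaves a double space where the number used to be.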


Format the dates so that they all share the same `YYYY-MM-DD` format.

In [12]:

from datetime import datetime, timedelta


# Define a function to clean and standardize the date format
def clean_date(date):
    try:
        # Check if the date is a relative format like "3 years ago"
        if "years ago" in date:
            years_ago = int(date.split()[0])
            return (datetime.now() - timedelta(days=years_ago*365)).strftime('%Y-%m-%d')
        elif "months ago" in date:
            months_ago = int(date.split()[0])
            return (datetime.now() - timedelta(days=months_ago*30)).strftime('%Y-%m-%d')
        elif "days ago" in date:
            days_ago = int(date.split()[0])
            return (datetime.now() - timedelta(days=days_ago)).strftime('%Y-%m-%d')
        else:
            # Convert to datetime object and standardize the format
            return pd.to_datetime(date).strftime('%Y-%m-%d')
    except Exception as e:
        print(f"Error parsing date: {date}, {e}")
        return None

# Apply the cleaning function to the 'date' column
data['cleaned_date'] = data['date'].apply(clean_date)

# Drop rows where date conversion failed
data.dropna(subset=['cleaned_date'], inplace=True)

# Optionally, replace the old date column with the cleaned one
data['date'] = data['cleaned_date']
data.drop(columns=['cleaned_date'], inplace=True)


data


Error parsing date: a year ago, Unknown datetime string format, unable to parse: a year ago, at position 0
Error parsing date: a year ago, Unknown datetime string format, unable to parse: a year ago, at position 0
Error parsing date: a year ago, Unknown datetime string format, unable to parse: a year ago, at position 0
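
The errors above come from the value `a year ago`: it matches none of the `"years ago"`, `"months ago"`, or `"days ago"` checks, falls through to `pd.to_datetime`, and the row is then dropped. Below is a hedged sketch of a parser that also accepts singular forms such as `a year ago`. It uses only the standard library; the function name `parse_review_date`, the `now` parameter, and the list of absolute formats are assumptions based on the sample rows shown, and the list may not cover every value in the file.

```python
import re
from datetime import datetime, timedelta

# Approximate day counts per unit, keeping the 365/30 approximation used above.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

# Absolute formats seen in the sample rows (assumed, possibly incomplete).
ABSOLUTE_FORMATS = ("%A, %B %d, %Y at %I:%M %p", "%A, %m/%d/%Y")

# Matches "3 years ago", "a year ago", "an ..." and similar relative dates.
RELATIVE = re.compile(r"^(a|an|\d+)\s+(day|week|month|year)s?\s+ago$")


def parse_review_date(text, now=None):
    """Return the date as 'YYYY-MM-DD', or None if the text is not recognized."""
    now = now or datetime.now()
    text = text.strip()
    match = RELATIVE.match(text.lower())
    if match:
        count = 1 if match.group(1) in ("a", "an") else int(match.group(1))
        delta = timedelta(days=count * UNIT_DAYS[match.group(2)])
        return (now - delta).strftime("%Y-%m-%d")
    for fmt in ABSOLUTE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next known format
    return None


# Fixed reference date so the relative results are reproducible.
reference = datetime(2024, 12, 6)
print(parse_review_date("a year ago", reference))
print(parse_review_date("3 years ago", reference))
print(parse_review_date("Monday, June 19, 2023 at 11:08 AM"))
print(parse_review_date("Thursday, 10/24/2019"))
```

Passing `now` explicitly keeps results stable between runs; relative dates remain approximations, since the source only states a whole number of years, months, or days.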


Unnamed: 0,link,Review,date,source,user,cleaned_review_summary
1,https://www.google.com/maps/reviews/data=!4m8!...,Being a part of eSparse Matrix Solution pvt lt...,2021-12-07,Google,"{\n ""name"": ""Prashansha Jadon"",\n ""link"": ""h...",being a part of esparse matrix solution pvt lt...
2,https://www.google.com/maps/reviews/data=!4m8!...,"Team of dedicated, talented and experienced pr...",2022-12-07,Google,"{\n ""name"": ""Sohail Daudani"",\n ""link"": ""htt...",team of dedicated talented and experienced pro...
3,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern. I have learned lot...,2021-12-07,Google,"{\n ""name"": ""Pallabi Devi"",\n ""link"": ""https...",here i joined as an intern i have learned lots...
4,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern . here team leaders...,2021-12-07,Google,"{\n ""name"": ""Dhage Vaishnavi"",\n ""link"": ""ht...",here i joined as an intern here team leaders ...
5,https://www.google.com/maps/reviews/data=!4m8!...,I had worked for 3 months with esparse matrix ...,2021-12-07,Google,"{\n ""name"": ""Binny Chouhan"",\n ""link"": ""http...",i had worked for months with esparse matrix s...
...,...,...,...,...,...,...
89,"<iframe src=""https://www.facebook.com/plugins/...",fantastic team. They gove had a long time to c...,2023-06-19,Facebook,PROMINENT PUBLICATION LLP,fantastic team they gove had a long time to co...
90,"<iframe src=""https://www.facebook.com/plugins/...",Best Software Development Company with Best Se...,2021-11-18,Facebook,Swami Raj Villa,best software development company with best se...
91,https://www.justdial.com/Pune/Sparse-Matrix-So...,"Sparse Matrix Solutions is great work place, m...",2019-10-24,Justdial,KRISHNA,sparse matrix solutions is great work place ma...
92,https://www.justdial.com/Pune/Sparse-Matrix-So...,I have been working at Best Companies full-time.,2019-10-24,Justdial,Onkar,i have been working at best companies fulltime


**3. Initialize VADER Sentiment Analyzer**


In [13]:
# Initialize SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()


**4. Analyze Sentiment of Review**
Use VADER to assign a sentiment score to each review.

In [14]:
# Function to get sentiment from VADER
def analyze_sentiment(text):
    score = sia.polarity_scores(str(text))  # Get sentiment scores
    if score['compound'] > 0.05:
        return 'positive'
    elif score['compound'] < -0.05:
        return 'negative'
    else:
        return 'neutral'

# Apply the sentiment analysis
data['predicted_sentiment'] = data['Review'].apply(analyze_sentiment)

# Display results
data[['Review', 'predicted_sentiment']]


Unnamed: 0,Review,predicted_sentiment
1,Being a part of eSparse Matrix Solution pvt lt...,positive
2,"Team of dedicated, talented and experienced pr...",positive
3,Here I joined as an intern. I have learned lot...,positive
4,Here I joined as an intern . here team leaders...,positive
5,I had worked for 3 months with esparse matrix ...,positive
...,...,...
89,fantastic team. They gove had a long time to c...,positive
90,Best Software Development Company with Best Se...,positive
91,"Sparse Matrix Solutions is great work place, m...",positive
92,I have been working at Best Companies full-time.,positive
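
The labeling rule inside `analyze_sentiment` depends only on the compound score, so it can be isolated and checked without loading VADER. The sketch below assumes the same ±0.05 cutoffs as the cell above; the function name `label_from_compound` and the example scores are made up for illustration and are not outputs of the analyzer.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label."""
    if compound > threshold:
        return 'positive'
    if compound < -threshold:
        return 'negative'
    return 'neutral'


# Hypothetical compound scores, not taken from the dataset.
example_scores = [0.84, -0.42, 0.0, 0.05]
labels = [label_from_compound(score) for score in example_scores]
print(labels)
```

A score exactly at the cutoff (0.05 here) is labeled neutral because the comparison is strict. A review with no scoreable words should end up with a compound score of 0.0, which this rule also labels neutral.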


**5. Remove Negative Reviews**

In [15]:
# Filter out negative reviews
data = data[(data['predicted_sentiment'] != 'negative')]
data
# Save filtered dataset
#data.to_csv("filtered_reviews.csv", index=False)


Unnamed: 0,link,Review,date,source,user,cleaned_review_summary,predicted_sentiment
1,https://www.google.com/maps/reviews/data=!4m8!...,Being a part of eSparse Matrix Solution pvt lt...,2021-12-07,Google,"{\n ""name"": ""Prashansha Jadon"",\n ""link"": ""h...",being a part of esparse matrix solution pvt lt...,positive
2,https://www.google.com/maps/reviews/data=!4m8!...,"Team of dedicated, talented and experienced pr...",2022-12-07,Google,"{\n ""name"": ""Sohail Daudani"",\n ""link"": ""htt...",team of dedicated talented and experienced pro...,positive
3,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern. I have learned lot...,2021-12-07,Google,"{\n ""name"": ""Pallabi Devi"",\n ""link"": ""https...",here i joined as an intern i have learned lots...,positive
4,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern . here team leaders...,2021-12-07,Google,"{\n ""name"": ""Dhage Vaishnavi"",\n ""link"": ""ht...",here i joined as an intern here team leaders ...,positive
5,https://www.google.com/maps/reviews/data=!4m8!...,I had worked for 3 months with esparse matrix ...,2021-12-07,Google,"{\n ""name"": ""Binny Chouhan"",\n ""link"": ""http...",i had worked for months with esparse matrix s...,positive
...,...,...,...,...,...,...,...
89,"<iframe src=""https://www.facebook.com/plugins/...",fantastic team. They gove had a long time to c...,2023-06-19,Facebook,PROMINENT PUBLICATION LLP,fantastic team they gove had a long time to co...,positive
90,"<iframe src=""https://www.facebook.com/plugins/...",Best Software Development Company with Best Se...,2021-11-18,Facebook,Swami Raj Villa,best software development company with best se...,positive
91,https://www.justdial.com/Pune/Sparse-Matrix-So...,"Sparse Matrix Solutions is great work place, m...",2019-10-24,Justdial,KRISHNA,sparse matrix solutions is great work place ma...,positive
92,https://www.justdial.com/Pune/Sparse-Matrix-So...,I have been working at Best Companies full-time.,2019-10-24,Justdial,Onkar,i have been working at best companies fulltime,positive


Segregate and display the reviews from Facebook and Google

In [16]:
FBreviews = data[data['source'] == 'Facebook']
FBreviews


Unnamed: 0,link,Review,date,source,user,cleaned_review_summary,predicted_sentiment
87,"<iframe src=""https://www.facebook.com/plugins/...",I Highly review Esparse Matrix Solutions for a...,2023-06-22,Facebook,Gernari Holidays,i highly review esparse matrix solutions for a...,positive
88,"<iframe src=""https://www.facebook.com/plugins/...",Best Software Development company. They develo...,2023-06-19,Facebook,Diwanji Services,best software development company they develop...,positive
89,"<iframe src=""https://www.facebook.com/plugins/...",fantastic team. They gove had a long time to c...,2023-06-19,Facebook,PROMINENT PUBLICATION LLP,fantastic team they gove had a long time to co...,positive
90,"<iframe src=""https://www.facebook.com/plugins/...",Best Software Development Company with Best Se...,2021-11-18,Facebook,Swami Raj Villa,best software development company with best se...,positive


In [17]:
GReviews = data[data['source'] == 'Google']
GReviews

Unnamed: 0,link,Review,date,source,user,cleaned_review_summary,predicted_sentiment
1,https://www.google.com/maps/reviews/data=!4m8!...,Being a part of eSparse Matrix Solution pvt lt...,2021-12-07,Google,"{\n ""name"": ""Prashansha Jadon"",\n ""link"": ""h...",being a part of esparse matrix solution pvt lt...,positive
2,https://www.google.com/maps/reviews/data=!4m8!...,"Team of dedicated, talented and experienced pr...",2022-12-07,Google,"{\n ""name"": ""Sohail Daudani"",\n ""link"": ""htt...",team of dedicated talented and experienced pro...,positive
3,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern. I have learned lot...,2021-12-07,Google,"{\n ""name"": ""Pallabi Devi"",\n ""link"": ""https...",here i joined as an intern i have learned lots...,positive
4,https://www.google.com/maps/reviews/data=!4m8!...,Here I joined as an intern . here team leaders...,2021-12-07,Google,"{\n ""name"": ""Dhage Vaishnavi"",\n ""link"": ""ht...",here i joined as an intern here team leaders ...,positive
5,https://www.google.com/maps/reviews/data=!4m8!...,I had worked for 3 months with esparse matrix ...,2021-12-07,Google,"{\n ""name"": ""Binny Chouhan"",\n ""link"": ""http...",i had worked for months with esparse matrix s...,positive
...,...,...,...,...,...,...,...
82,https://www.google.com/maps/reviews/data=!4m8!...,,2021-12-07,Google,"{\n ""name"": ""Pankaj Keshwani"",\n ""link"": ""ht...",,neutral
83,https://www.google.com/maps/reviews/data=!4m8!...,,2019-12-08,Google,"{\n ""name"": ""Amol Sirsat"",\n ""link"": ""https:...",,neutral
84,https://www.google.com/maps/reviews/data=!4m8!...,,2019-12-08,Google,"{\n ""name"": ""Rahul Chinchore"",\n ""link"": ""ht...",,neutral
85,https://www.google.com/maps/reviews/data=!4m8!...,,2020-12-07,Google,"{\n ""name"": ""pranav vadnere"",\n ""link"": ""htt...",,neutral
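
Beyond viewing each source separately, it can help to tally how many reviews of each sentiment every source contributes. The following is a small sketch using plain Python records in place of the DataFrame; the records are invented for illustration and do not reflect the real counts in the dataset.

```python
from collections import Counter

# Hypothetical records standing in for rows of the DataFrame.
records = [
    {"source": "Google", "predicted_sentiment": "positive"},
    {"source": "Google", "predicted_sentiment": "neutral"},
    {"source": "Google", "predicted_sentiment": "positive"},
    {"source": "Facebook", "predicted_sentiment": "positive"},
    {"source": "Justdial", "predicted_sentiment": "positive"},
]

# Count each (source, sentiment) pair.
counts = Counter((r["source"], r["predicted_sentiment"]) for r in records)

for (source, sentiment), n in sorted(counts.items()):
    print(f"{source:10s} {sentiment:10s} {n}")
```

On the actual DataFrame, a comparable summary could be obtained with `data.groupby(['source', 'predicted_sentiment']).size()`.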
