In [1]:

import pandas as pd
from transformers import pipeline
import warnings
warnings.filterwarnings("ignore")

# Load pre-trained models
sentiment_pipeline = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
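# No model is named here, so transformers falls back to its default NER checkpoint.
# aggregation_strategy="simple" replaces the deprecated grouped_entities=True.
ner_pipeline = pipeline("ner", aggregation_strategy="simple")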
summarization_pipeline = pipeline("summarization", model="facebook/bart-large-cnn")

# Example list of customer reviews
reviews = [
    "The Scooter and Bike quality is absolutely superb. I recently purchased both items, and while the scooter exceeded my expectations in terms of performance and build quality, I did encounter a delay in the delivery process. The delivery took longer than anticipated, which was a bit inconvenient. Santosh from Nashik, who handled my queries, was quite responsive and helpful throughout the process. Despite the delay, I am overall satisfied with my purchase, as the quality of the products speaks for itself.",
    "I am extremely disappointed with the Samsung tablet that I bought recently. Unfortunately, the device broke down after just one use. The quality of the product was far from what I had expected, and it left me frustrated with the overall experience. To make matters worse, the customer support from the company was not very helpful in addressing the issue. Additionally, the tablet’s performance was below average, which is a major letdown considering the reputation of the brand. I would not recommend this product to others based on my experience.",
    "I had a very positive experience with my recent purchase from India. The delivery was impressively fast, and the customer service was top-notch. Roshan and Josh, who handled my order, were exceptionally quick in responding to my queries and resolving any issues I had. The quality of the product was exactly as described, and I am very happy with the overall service. The company’s attention to customer satisfaction is commendable, and I will definitely consider purchasing from them again in the future. The entire process from ordering to receiving the product was smooth and efficient.",
    "The recent purchase I made was somewhat satisfactory, although it didn't fully meet my expectations. The product itself was decent, but there were certain aspects that left me wanting more. The delivery was on time, but the product did not stand out as much as I had hoped. Europe was mentioned in the context of the service, but it did not really enhance the overall experience. Hazel, who was supposed to handle customer support, was late in picking up my call, which added to my frustration. Overall, while the product was okay, the service experience could have been improved."
]


# Function to return reviews with their sentiment
def sentiment_analysis_df(reviews):
    results = sentiment_pipeline(reviews)
    df = pd.DataFrame({
        'Review': reviews,
        'Sentiment': [res['label'] for res in results],
        'Score': [res['score'] for res in results]
    })
    return df

# Function to return, for each review, a list of (word, entity_group) tuples
def extract_entities(reviews):
    return [
        [(ent['word'], ent['entity_group']) for ent in ner_pipeline(review)]
        for review in reviews
    ]

# Function to summarize reviews
def summarize_reviews(reviews):
    summaries = summarization_pipeline(reviews, max_length=40, min_length=10, do_sample=False)
    return [summary['summary_text'] for summary in summaries]

# Function to create DataFrame with reviews, sentiments, extracted entities, and summaries
def sentiment_ner_summary_analysis_df(reviews):
    sentiment_df = sentiment_analysis_df(reviews)
    sentiment_df['Entities'] = extract_entities(reviews)
    sentiment_df['Highlight'] = summarize_reviews(reviews)
    return sentiment_df

# Create a DataFrame of reviews with sentiment analysis, NER results, and summaries
combined_df = sentiment_ner_summary_analysis_df(reviews)

# Convert the 'Entities' column to string representation
combined_df['Entities'] = combined_df['Entities'].apply(lambda x: ', '.join([f"{word}({entity})" for word, entity in x]))

# Save the DataFrame to a CSV file
csv_filename = 'review_analysis.csv'
combined_df.to_csv(csv_filename, index=False)

print(f"DataFrame has been saved to {csv_filename}")
print("\nFirst few rows of the DataFrame:")
print(combined_df.head())

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


DataFrame has been saved to review_analysis.csv

First few rows of the DataFrame:
                                              Review Sentiment     Score  \
0  The Scooter and Bike quality is absolutely sup...  POSITIVE  0.990219   
1  I am extremely disappointed with the Samsung t...  NEGATIVE  0.999754   
2  I had a very positive experience with my recen...  POSITIVE  0.999857   
3  The recent purchase I made was somewhat satisf...  NEGATIVE  0.996332   

                              Entities  \
0  Sc(MISC), Santosh(PER), Nashik(LOC)   
1                         Samsung(ORG)   
2   India(LOC), Roshan(PER), Josh(PER)   
3              Europe(LOC), Hazel(PER)   

                                           Highlight  
0  The delivery took longer than anticipated, whi...  
1  The device broke down after just one use. The ...  
2  The delivery was impressively fast, and the cu...  
3  The product itself was decent, but there were ...  
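
The first row of the Entities column shows "Sc(MISC)", a word-piece fragment of "Scooter" rather than a whole word. One possible way to tidy this up, sketched below, is to use the character offsets that aggregated NER results carry (the `start` and `end` keys) to widen each span to the full word, and to drop low-confidence entities. The helper names `expand_to_word` and `format_entities`, the `min_score` threshold, and the sample scores are all assumptions made for illustration; they are not part of the original notebook or of the transformers API.

```python
def expand_to_word(text, start, end):
    """Widen a character span until it covers whole alphanumeric words."""
    while start > 0 and text[start - 1].isalnum():
        start -= 1
    while end < len(text) and text[end].isalnum():
        end += 1
    return text[start:end]


def format_entities(text, entities, min_score=0.5):
    """Render entities as 'word(LABEL)', skipping those below min_score."""
    parts = []
    for ent in entities:
        if ent["score"] < min_score:
            continue
        word = expand_to_word(text, ent["start"], ent["end"])
        parts.append(f"{word}({ent['entity_group']})")
    return ", ".join(parts)


# Hand-written sample in the shape of an aggregated NER result.
# The scores are illustrative values, not real model output.
text = "The Scooter and Bike quality is superb. Santosh from Nashik helped."
sample = [
    {"entity_group": "MISC", "score": 0.61, "word": "Sc", "start": 4, "end": 6},
    {"entity_group": "PER", "score": 0.99, "word": "Santosh", "start": 40, "end": 47},
    {"entity_group": "LOC", "score": 0.98, "word": "Nashik", "start": 53, "end": 59},
]

print(format_entities(text, sample))
```

In the notebook, a helper like this would be applied per review to the raw `ner_pipeline(review)` output, before the entities are reduced to (word, entity_group) tuples, since the offsets are lost after that step.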
