<a href="https://colab.research.google.com/github/arssite/GENAi-Assessment/blob/main/Sentiment_Analyzer_Using_DistilBert.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install transformers torch gradio pandas matplotlib openpyxl

Collecting gradio
  Downloading gradio-4.44.1-py3-none-any.whl.metadata (15 kB)
Collecting aiofiles<24.0,>=22.0 (from gradio)
  Downloading aiofiles-23.2.1-py3-none-any.whl.metadata (9.7 kB)
Collecting fastapi<1.0 (from gradio)
  Downloading fastapi-0.115.0-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Downloading ffmpy-0.4.0-py3-none-any.whl.metadata (2.9 kB)
Collecting gradio-client==1.3.0 (from gradio)
  Downloading gradio_client-1.3.0-py3-none-any.whl.metadata (7.1 kB)
Collecting httpx>=0.24.1 (from gradio)
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Collecting orjson~=3.0 (from gradio)
  Downloading orjson-3.10.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (50 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.4/50.4 kB[0m [31m1.4 MB/s[0m eta [36m0:00:00[0m
Collecting pydub (from gradio)
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting python-multipart>=0.0.9 (from g

In [8]:
!pip install python-docx PyPDF2

Collecting python-docx
  Downloading python_docx-1.1.2-py3-none-any.whl.metadata (2.0 kB)
Downloading python_docx-1.1.2-py3-none-any.whl (244 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m244.3/244.3 kB[0m [31m3.3 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: python-docx
Successfully installed python-docx-1.1.2
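
The two install cells pull in packages whose import names do not always match their pip names (for example, `python-docx` is imported as `docx`). A minimal sketch for checking that everything the application cell imports is available follows; `find_missing` and `REQUIRED_MODULES` are hypothetical names, not part of any of these libraries.

```python
import importlib.util

# Import names used by the application cell.
# Note: the pip package python-docx is imported as "docx".
REQUIRED_MODULES = [
    "torch", "gradio", "pandas", "matplotlib",
    "transformers", "docx", "PyPDF2", "openpyxl",
]

def find_missing(module_names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

The check only locates modules without importing them, so it is quick even for heavy packages such as `torch`.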


In [12]:
import torch
import gradio as gr
import pandas as pd
import matplotlib.pyplot as plt
from transformers import pipeline
from docx import Document
from PyPDF2 import PdfReader

# Initialize the sentiment analysis pipeline
analyzer = pipeline("text-classification", model="distilbert/distilbert-base-uncased-finetuned-sst-2-english")

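# Function to analyze sentiment for a single sentence
def sentiment_analyzer(review):
    # truncation=True keeps inputs longer than the model's 512-token limit from raising an error
    sentiment = analyzer(review, truncation=True)
    return sentiment[0]['label']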

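# Create a pie chart visualization for the sentiment
def sentiment_pie_chart(sentiment_labels):
    sentiment_counts = pd.Series(sentiment_labels).value_counts()

    # Map colors by label: value_counts() orders by frequency, so a fixed
    # color list would paint NEGATIVE green whenever it is the majority class
    color_map = {'POSITIVE': 'green', 'NEGATIVE': 'red'}
    colors = [color_map.get(label, 'gray') for label in sentiment_counts.index]

    fig, ax = plt.subplots()
    sentiment_counts.plot(kind='pie', autopct='%1.1f%%', colors=colors, ax=ax)
    ax.set_ylabel('')
    ax.set_title('Sentiment Distribution')

    return fig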

# Function to analyze a single input sentence
def analyze_single_sentence(sentence):
    sentiment = sentiment_analyzer(sentence)
    return f"The sentiment of the sentence is: {sentiment}"

# Function to read and analyze reviews from Excel, PDF, or DOCX files
def read_reviews_and_analyze_sentiment(file_object):
    if file_object.name.endswith('.xlsx'):
        # Load the Excel file into a DataFrame
        df = pd.read_excel(file_object)
        if 'Reviews' not in df.columns:
            raise ValueError("Excel file must contain a 'Reviews' column.")
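        # Drop empty cells and coerce to str; NaN or numeric values would break the pipeline
        reviews = df['Reviews'].dropna().astype(str).tolist()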

    elif file_object.name.endswith('.docx'):
        # Read the content of the DOCX file
        doc = Document(file_object)
        reviews = [para.text for para in doc.paragraphs if para.text.strip()]

    elif file_object.name.endswith('.pdf'):
        # Read the content of the PDF file
        reader = PdfReader(file_object)
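        # extract_text() can return None or an empty string for image-only pages
        pages = [page.extract_text() or "" for page in reader.pages]
        # Assume one review per line; join pages with a newline so lines do not
        # run together across page breaks, and skip blank lines
        reviews = [line.strip() for line in "\n".join(pages).split('\n') if line.strip()]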

    else:
        raise ValueError("Unsupported file format. Please upload .xlsx, .pdf, or .docx files.")

    # Analyze the sentiment of each review
    sentiments = [sentiment_analyzer(review) for review in reviews]
    df = pd.DataFrame({'Reviews': reviews, 'Sentiment': sentiments})

    # Generate pie chart
    chart_object = sentiment_pie_chart(sentiments)

    return df, chart_object

# Gradio interface combining single sentence analysis and file-based review sentiment analysis
def main_interface(input_option, sentence=None, file=None):
    if input_option == "Single Sentence":
        if sentence:
            result = analyze_single_sentence(sentence)
            return None, None, result  # Single sentence output
        else:
            return None, None, "Please enter a sentence."
    elif input_option == "File Upload":
        if file:
            df, chart_object = read_reviews_and_analyze_sentiment(file)
            return df, chart_object, None  # File output
        else:
            return None, None, "Please upload a file."

# Gradio interface
demo = gr.Interface(
    fn=main_interface,
    inputs=[
        gr.Radio(label="Choose Input Type", choices=["Single Sentence", "File Upload"], value="Single Sentence"),
        gr.Textbox(label="Enter a sentence for sentiment analysis (if selected)", placeholder="Type your sentence here..."),
        gr.File(file_types=[".xlsx", ".pdf", ".docx"], label="Upload your review comment file (if selected)")
    ],
    outputs=[
        gr.Dataframe(label="Sentiment Analysis Results (For File Uploads)"),
        gr.Plot(label="Sentiment Distribution Chart (For File Uploads)"),
        gr.Textbox(label="Single Sentence Sentiment Result (For Single Sentence Input)")
    ],
    title="@GenAILearniverse Project 3: Sentiment Analyzer",
    description="This application analyzes the sentiment of either a single sentence or reviews in uploaded files (Excel, PDF, DOCX)."
)

demo.launch()




Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://0d9c320d3dab444d6c.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
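
The app above reports only the predicted label. The `text-classification` pipeline also returns a confidence `score` alongside each label, as a list of dicts such as `[{'label': 'POSITIVE', 'score': 0.98}]`. A minimal sketch of surfacing that score follows; `describe_prediction` and the `min_score` cutoff of 0.75 are illustrative assumptions rather than part of the original notebook, and the sample dicts are hand-written stand-ins for real pipeline output.

```python
def describe_prediction(prediction, min_score=0.75):
    """Format one pipeline result dict, flagging low-confidence predictions.

    `prediction` is expected to look like {'label': 'POSITIVE', 'score': 0.98}.
    `min_score` is an arbitrary illustrative cutoff, not a tuned value.
    """
    label = prediction["label"]
    score = float(prediction["score"])
    if score < min_score:
        return f"{label} (low confidence: {score:.2f})"
    return f"{label} (confidence: {score:.2f})"

# Hand-written stand-ins for analyzer(...) output, so the sketch runs without the model.
print(describe_prediction({"label": "POSITIVE", "score": 0.98}))
print(describe_prediction({"label": "NEGATIVE", "score": 0.60}))
```

In the notebook this could be called as `describe_prediction(analyzer(sentence)[0])` in place of returning the bare label.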




In [14]:
!gradio deploy


mnist_test.csv: 100% 18.3M/18.3M [00:01<00:00, 12.6MB/s]
mnist_train_small.csv: 100% 36.5M/36.5M [00:02<00:00, 13.2MB/s]
Upload 2 LFS files: 100% 2/2 [00:02<00:00,  1.48s/it]
Space available at https://huggingface.co/spaces/arssite/Sentiment_Analyzer_Using_Distilbert
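
The upload log above shows two MNIST CSV files being pushed to the Space. These look like Colab's bundled sample data rather than anything the app uses, which suggests the deploy picked up unrelated files from the working directory. A minimal sketch for listing large files before deploying follows; `list_large_files` and the 10 MB threshold are hypothetical choices for illustration.

```python
import os

def list_large_files(root=".", min_bytes=10 * 1024 * 1024):
    """Return (relative_path, size_in_bytes) for files under `root` of at least `min_bytes`."""
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path):
                continue  # skip symlinks, which may be broken or point outside root
            size = os.path.getsize(path)
            if size >= min_bytes:
                large.append((os.path.relpath(path, root), size))
    # Largest files first
    return sorted(large, key=lambda item: item[1], reverse=True)

for rel_path, size in list_large_files("."):
    print(f"{size / 1_000_000:.1f} MB  {rel_path}")
```

Anything listed that the app does not need can be moved or deleted before running the deploy command.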
