<a href="https://colab.research.google.com/github/leaBroe/Deep_Learning_in_Python/blob/main/deep_learning_python_winter_school_24.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Project Overview

This is our project for the course "Deep Learning in Python" of the Machine Learning Winter School 2024 at the University of Fribourg.

Authors:  
Lea Brönnimann lea.broennimann@unifr.ch (19-107-010)  
Mohamed Mansour Faye mohamedmansour.faye@unifr.ch (19-505-197)  
Laura Dekker laura.dekker@unifr.ch (22-112-346)  

The code, as well as some example texts to test the app, can be found in the GitHub repo: [https://github.com/leaBroe/Deep_Learning_in_Python.git](https://github.com/leaBroe/Deep_Learning_in_Python.git)  


**Abstract:**

We have created an app that performs sentiment analysis on written text, that is, it assesses whether the tone of a text is more positive or more negative. Because we included a translation model, the app can also analyze text in languages other than English. The final product is a Gradio interface in which you upload an image containing digital text and receive the extracted text, its English translation, and the sentiment and emotion inferred from it. In addition, named entities are extracted from the text, which can be useful, for example, for quickly analyzing companies' annual reports.

An app like this can help convey emotion more clearly in the digital world: almost everyone has had trouble reading between the lines of a text to deduce the emotion behind it. Other possible applications include AI customer service, where the detected sentiment could help an AI give more appropriate responses to customers; social media monitoring, where it could ease the work of moderators; and the analysis of customer feedback. All in all, there is enough reason to develop software that can analyze emotions and extract relevant entities in written text.

### Model Card

- **Model Details**: This app combines several components to perform sentiment analysis on images of text. We first use the python-tesseract OCR tool from the `pytesseract` package to extract text from images. We then perform sentiment analysis using `distilbert-base-uncased-finetuned-sst-2-english`, which is available on Hugging Face. Because this model only works with English text, we also use the `m2m100_418M` model from Meta (also available on Hugging Face) to translate the input text when it is not in English. The code below additionally uses `bhadresh-savani/distilbert-base-uncased-emotion` for emotion classification and the spaCy model `en_core_web_sm` for named entity extraction.
- **Data**: The DistilBERT SST-2 model is a distilled version of Google's BERT transformer, which was trained on large amounts of English text (from Wikipedia, among others) in a self-supervised fashion. It was then fine-tuned on the Stanford Sentiment Treebank (SST-2) to improve performance on sentiment analysis.
- **Performance**: The fine-tuned DistilBERT model reaches an accuracy of 91.3 on the development set of SST-2, one of the tasks in the GLUE benchmark, but as we will discuss later, it might struggle with some specific subtasks or topics. The overall pipeline's performance also depends on the language of the input text: the many-to-many translation model used here does not perform as well when translating Wolof as when translating German, for example.
- **Ethical Considerations**: The model's developers warn that it can produce biased predictions that affect underrepresented populations. Because the SST-2 dataset was sourced from movie reviews on Rotten Tomatoes, many of the statements used for fine-tuning contain judgements on the way (American) movies in particular portray one topic or another, which might result in biased predictions.
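
The language dependence mentioned under Performance could be checked with a small hand-labelled test set. The following is a minimal sketch and not part of the app: `accuracy_by_language` is a hypothetical helper, and the records are placeholders for illustration rather than measured results.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Compute the share of correct predictions per source language.

    `records` is a list of (language_code, predicted_label, true_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, predicted, expected in records:
        total[lang] += 1
        if predicted == expected:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Placeholder records for illustration only; these are NOT measured results
records = [
    ("de", "POSITIVE", "POSITIVE"),
    ("de", "NEGATIVE", "NEGATIVE"),
    ("wo", "POSITIVE", "NEGATIVE"),
    ("wo", "NEGATIVE", "NEGATIVE"),
]
scores = accuracy_by_language(records)
print(scores)
```

In practice, the predicted labels would come from running the full pipeline (OCR, translation, sentiment analysis) on texts whose sentiment has been labelled by hand.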

### Outlook

Given an extra month, we might consider:

- **Expanding Data Sources**: Including more diverse sources of images and text, such as news articles or blogs, to enrich the analysis.
- **Model Fine-Tuning**: Fine-tuning the sentiment analysis model on a dataset tailored to the use case we are interested in. We could probably also have used more advanced models, such as a multilingual sentiment analysis model or a better translation model, but the ones we chose were easy to use and gave decent results.
- **Feature Expansion**: Adding functionality to track sentiment trends over time, e.g. enabling longitudinal studies on public sentiment toward environmental issues.
- **More Advanced Analyses**: To better analyze the overall sentiment of customer feedback or trends in the perception of environmental issues, it would be helpful to analyze a larger quantity of texts in parallel and, for example, calculate the percentage of positive or negative feedback.
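
The batch analysis described in the last point could look roughly like this. It is a minimal sketch and not part of the current app: `summarize_sentiment` is a hypothetical helper, and it assumes the input has the shape that a Hugging Face text-classification pipeline returns for a list of texts, i.e. one dict with a `label` and a `score` per text.

```python
from collections import Counter


def summarize_sentiment(results):
    """Return the percentage of each sentiment label in a batch of predictions.

    `results` is assumed to be a list of dicts with a "label" key.
    """
    if not results:
        return {}
    counts = Counter(r["label"] for r in results)
    total = len(results)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Illustrative predictions (made-up values, not real model output)
example_results = [
    {"label": "POSITIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.91},
    {"label": "POSITIVE", "score": 0.76},
    {"label": "POSITIVE", "score": 0.88},
]
summary = summarize_sentiment(example_results)
print(summary)
```

In the app, the list would come from calling the sentiment pipeline on a list of translated texts instead of a single string.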

In [9]:
!sudo apt install tesseract-ocr

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
tesseract-ocr is already the newest version (4.1.1-2.1build1).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.


In [10]:
!pip install pytesseract



In [11]:
!pip install langdetect



In [12]:
!pip install gradio



In [13]:
import gradio as gr
from PIL import Image
import pytesseract
import spacy
from transformers import pipeline

# Pipelines
emotion_pipeline = pipeline("text-classification", model="bhadresh-savani/distilbert-base-uncased-emotion")
sentiment_pipeline = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

# Load the spaCy English model
nlp = spacy.load("en_core_web_sm")

# Extract text from an image file with Tesseract OCR
def extract_text_from_image(image_path):
    image = Image.open(image_path)
    extracted_text = pytesseract.image_to_string(image)
    return extracted_text.strip()


from langdetect import detect
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Load the M2M100 translation model and tokenizer
model_name = "facebook/m2m100_418M"
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

def translate_text_to_english(text):
    # OCR may return an empty string; langdetect raises an exception on such input
    if not text.strip():
        return text, "unknown"

    # Detect the language of the input text
    detected_lang = detect(text)
    print(f"Detected language: {detected_lang}")  # For debugging

    # Return the original text if it is already in English
    if detected_lang == 'en':
        return text, detected_lang

    # Set the source language for the tokenizer. langdetect reports Chinese as
    # 'zh-cn' or 'zh-tw', while M2M100 expects 'zh', so keep only the base code.
    # Languages that M2M100 does not support will still raise an error here.
    tokenizer.src_lang = detected_lang.split('-')[0]

    # Encode the text for the model (input beyond 512 tokens is truncated)
    encoded = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

    # Generate translation tokens and decode them to text
    # Note: forced_bos_token_id forces the model to translate to English
    generated_tokens = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("en"))
    translated_text = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]

    return translated_text, detected_lang


# Get the sentiment of a text; truncation=True keeps long OCR output
# within the model's maximum input length
def get_sentiment(text):
    return sentiment_pipeline(text, truncation=True)

# Get the emotion of a text, truncating long input in the same way
def get_emotion(text):
    return emotion_pipeline(text, truncation=True)

# Function to extract named entities using spaCy
def extract_entities(text):
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return entities

# Main processing function to integrate OCR, translation, sentiment and emotion analysis, and NER
def process_image(image):
    extracted_text = extract_text_from_image(image)
    translated_text, detected_lang = translate_text_to_english(extracted_text)
    sentiment_result = get_sentiment(translated_text)
    emotion_result = get_emotion(translated_text)
    entities = extract_entities(translated_text)  # Use spaCy to extract entities
    entities_str = ', '.join([f"{text} ({label})" for text, label in entities])  # Format entities for display
    return extracted_text, detected_lang, translated_text, sentiment_result, emotion_result, entities_str

# Define Gradio interface
iface = gr.Interface(fn=process_image,
                     inputs=gr.Image(label="Upload Image", type="filepath"),
                     outputs=[gr.Textbox(label="Extracted Text"),
                              gr.Textbox(label="Detected Language"),
                              gr.Textbox(label="Translated Text"),
                              gr.Textbox(label="Sentiment Analysis Result"),
                              gr.Textbox(label="Emotion Analysis Result"),
                              gr.Textbox(label="Extracted Entities")],
                     title="Image to Sentiment and Emotion Analysis",
                     description="Upload an image containing text, and the app will translate the text to English, then perform sentiment and emotion analysis and extract named entities.")

# Launch the app
# Pass debug=True to launch() to show errors in the Colab cell output
iface.launch(share=True)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://73a3bad08bfc5cec2b.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
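
The sentiment and emotion textboxes above receive the raw pipeline output, which for a single text is a list containing one dict with a `label` and a `score`. A small helper could make this easier to read in the interface. This is a sketch under that assumption: `format_prediction` is a hypothetical name and is not part of the notebook code above.

```python
def format_prediction(results):
    """Turn pipeline output such as [{"label": "POSITIVE", "score": 0.9987}]
    into a short, readable string with the score shown as a percentage."""
    return ", ".join(f"{r['label']} ({r['score']:.1%})" for r in results)


# Illustrative value (made up), in the shape the pipeline returns for one text
formatted = format_prediction([{"label": "POSITIVE", "score": 0.9987}])
print(formatted)
```

To use it, `process_image` would return `format_prediction(sentiment_result)` and `format_prediction(emotion_result)` in place of the raw lists.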


