<a href="https://colab.research.google.com/github/cdtlaura/nlp2/blob/main/Final_project_with_UserInterface_Document_Summarization_and_Translation_App_compatible_with_mobile_devices.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

"Document Summarization and Translation App". This app summarizes lengthy documents and translates the summaries into multiple languages (e.g., French, Spanish, Chinese). Businesses could use it for:
Quickly understanding key points of reports.
Communicating summaries to international teams.

Modified Application Plan

Input: A user uploads a document (e.g., Word .docx) via a file uploader.

Processing:
Extract content from the uploaded document.
Summarize the document using an LLM (Together AI in this case).

Translate the summary into multiple languages (e.g., French, Spanish, Chinese).

Output:
Display:
The original summary.

Translations of the summary.

Optionally save the summary and translations as a downloadable file (e.g., PDF or .txt).

Explanation of the Workflow
Upload Document:


Uses Colab's files.upload() to allow users to upload a Word document.

Extract Content:


The uploaded document is read and its text content is extracted using the Docx2txtLoader from langchain_community.

Summarize Content:


The full document content is summarized using Together AI's meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo model.
The prompt asks the model to summarize the document and passes the full extracted text along with the request.

Translate Summary:


Translations are handled by MarianMT models for French, Spanish, and Chinese.

Display Results:


The original summary and translations are displayed in Markdown format for readability.

Save Results:


The summary and translations are saved to a .txt file, which is then made available for download.
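
The save step described above is not shown in the code cells of this notebook. A minimal sketch of how it could look is given here, assuming the summary is a string and the translations are a dict keyed by language name. The function names `save_results` and `offer_download` and the default file name are hypothetical; `google.colab.files.download` is only available when running inside Colab, so the import is guarded.

```python
from pathlib import Path


def save_results(summary, translations, path="summary_and_translations.txt"):
    """Write the summary and each translation to a UTF-8 text file (hypothetical helper)."""
    heading = "Summary (English)"
    sections = [heading, "=" * len(heading), summary.strip(), ""]
    for language, text in translations.items():
        heading = f"Translated Summary ({language})"
        sections += [heading, "=" * len(heading), text.strip(), ""]
    out_path = Path(path)
    # UTF-8 keeps accented and Chinese characters intact
    out_path.write_text("\n".join(sections), encoding="utf-8")
    return out_path


def offer_download(path):
    """Trigger a browser download in Colab; return False when not running in Colab."""
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        return False
    files.download(str(path))
    return True
```

In the notebook this would be called as `offer_download(save_results(summary_text, translations))` once both variables exist.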

In [None]:
!pip install together -qqq


In [None]:
!pip install langchain_together python-dotenv langchain-community langchain youtube_transcript_api pytube numpy -qqq


In [None]:
%pip install --upgrade --quiet docx2txt



In [None]:
!pip install langchain-together --upgrade -qqq


In [None]:
# Import necessary libraries
from langchain_community.document_loaders import Docx2txtLoader
from transformers import MarianMTModel, MarianTokenizer
from IPython.display import Markdown, display
import os
from together import Together
import warnings
warnings.filterwarnings("ignore")

In [None]:
# Set your API key as an environment variable
os.environ["TOGETHER_API_KEY"] = ""
client = Together()
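
The cell above leaves the key as an empty string to be filled in by hand. One way to avoid pasting a secret into the notebook is to read it from the environment and prompt only when it is missing. This is a sketch, not part of the original notebook; `resolve_api_key` is a hypothetical helper name.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, prompt=getpass, name="TOGETHER_API_KEY"):
    """Return the key from env, prompting once if it is missing or empty."""
    key = env.get(name, "").strip()
    if not key:
        # getpass hides the typed value so it does not appear in cell output
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} is required")
        env[name] = key  # store it so later clients can pick it up
    return key
```

Calling `resolve_api_key()` before `client = Together()` leaves the rest of the notebook unchanged, since the client reads `TOGETHER_API_KEY` from the environment as in the cell above.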

In [None]:
# Step 1: Load the Word document
loader = Docx2txtLoader("/content/Computer Vision10.docx")
data = loader.load()

# Step 2: Extract content from the document
document_content = " ".join([doc.page_content for doc in data])

# Step 3: Query the LLM for a summary
stream = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[{"role": "user", "content": f"Please summarize the following document:\n\n{document_content}"}],
    stream=True,
)

# Step 4: Collect the streamed summary (no printing here)
summary_text = ""  # Initialize an empty string to store the final summary
for chunk in stream:
    chunk_content = chunk.choices[0].delta.content or ""
    summary_text += chunk_content  # Concatenate the chunk content to build the summary

# Step 5: Display the summary only once
print("Summary:")
display(Markdown(summary_text))

# Step 6: Load MarianMT models and tokenizers for multiple translations
translation_models = {
    "French": {
        "model": MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-fr"),
        "tokenizer": MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr"),
    },
    "Spanish": {
        "model": MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-es"),
        "tokenizer": MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-es"),
    },
    "Chinese": {
        "model": MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-zh"),
        "tokenizer": MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh"),
    },
}

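# Step 7: Function to split text into smaller chunks
def split_text(text, max_chunk_size=512):
    """
    Splits text on whitespace into chunks of at most max_chunk_size characters.

    The limit is measured in characters, not tokens. A single word longer
    than the limit becomes its own chunk.
    """
    chunks = []
    current_chunk = []
    current_length = 0

    for word in text.split():
        # Length added by this word, plus one joining space if the chunk is not empty
        added_length = len(word) + (1 if current_chunk else 0)
        if current_chunk and current_length + added_length > max_chunk_size:
            # Close the current chunk before it would exceed the limit
            chunks.append(" ".join(current_chunk))
            current_chunk = []
            current_length = 0
            added_length = len(word)
        current_chunk.append(word)
        current_length += added_length

    if current_chunk:  # Add any remaining words as the last chunk
        chunks.append(" ".join(current_chunk))

    return chunks

# Step 8: Function to translate text in chunks
def translate_text_in_chunks(text, model, tokenizer, max_tokens=512):
    """
    Translates large text by splitting it into chunks and translating each one.

    max_tokens is used both as the chunk size in characters (see split_text)
    and as the maximum generated length in tokens.
    """
    chunks = split_text(text, max_chunk_size=max_tokens)  # Character-based split
    translated_chunks = []

    for chunk in chunks:
        # truncation=True cuts any chunk that still exceeds the tokenizer's input limit
        input_ids = tokenizer.encode(chunk, return_tensors="pt", truncation=True)
        output_ids = model.generate(input_ids, max_length=max_tokens, num_beams=4)
        translated_chunks.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))

    return " ".join(translated_chunks)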

# Step 9: Translate the LLM summary into multiple languages
translations = {}
for language, resources in translation_models.items():
    translations[language] = translate_text_in_chunks(summary_text, resources["model"], resources["tokenizer"])



Summary:


The document is an introduction to image smoothing and thresholding techniques using OpenCV, a widely used open-source computer vision library. The author, Laura Castillo, and co-author, Yamil Guevara, provide an overview of the following techniques:

1. **Averaging Smoothing**: This technique reduces image noise by calculating the mean of pixel values within a kernel. It is simple and effective, making it a common starting point for image noise reduction.

2. **Gaussian Blurring**: This technique improves on averaging by applying a Gaussian function to the pixels, emphasizing the central pixels in the kernel area. It is useful for reducing Gaussian noise in images and is widely used in preprocessing steps for edge detection and image segmentation.

3. **Median Blurring**: This technique is ideal for removing "salt-and-pepper" noise from images by calculating the median of all pixels within a kernel. It preserves the edges of objects in an image, making it useful for applications where maintaining structural details is important.

4. **Bilateral Filtering**: This technique smooths an image while preserving its edges by considering both the spatial distance and intensity difference of pixels in the kernel. It is computationally intensive but suitable for applications like facial recognition and feature extraction.

5. **Simple Thresholding**: This technique separates the foreground from the background by setting pixel values above or below a certain threshold to a maximum or minimum intensity. It is ideal for images with consistent lighting.

6. **Adaptive Thresholding**: This technique extends simple thresholding by allowing the threshold value to vary across different regions of the image. It is effective for images with varying illumination and provides better segmentation accuracy compared to simple thresholding.

The authors conclude that OpenCV provides a versatile set of tools for image smoothing and thresholding, each with unique characteristics suited to specific applications. Mastering these fundamental techniques is crucial for developing sophisticated computer vision applications.


In [None]:
# Step 10: Display the translated summaries
print("\nTranslated Summaries:")
for language, translated_text in translations.items():
    print(f"\nTranslated Summary in {language}:")
    display(Markdown(translated_text))


Translated Summaries:

Translated Summary in French:


Le document est une introduction aux techniques de lissage d'images et de seuil utilisant OpenCV, une bibliothèque de vision en open source largement utilisée. L'auteur, Laura Castillo, et le coauteur, Yamil Guevara, donnent un aperçu des techniques suivantes : 1. **Lissage moyen** : Cette technique réduit le bruit d'image en calculant la moyenne des valeurs de pixel au sein d'un noyau. Il est simple et efficace, ce qui en fait un point de départ commun pour la réduction du bruit d'image. 2. **Gaussian Blurring** : Cette technique améliore la moyenne Il est utile pour réduire le bruit gaussien dans les images et est largement utilisé dans les étapes de prétraitement pour la détection des bords et la segmentation de l'image. 3. **Blurring médian**: Cette technique est idéale pour éliminer le bruit "sel et poivre" des images en calculant la médiane de tous les pixels à l'intérieur d'un noyau. Il préserve les bords des objets dans une image, ce qui le rend utile pour les applications où le maintien des détails structurels est important. 4. ** Filtrage bilatéral**: Cette technique lisse une image tout en préservant ses bords en tenant compte à la fois de la distance spatiale et de la différence d'intensité des pixels dans le noyau. Elle est intensive par calcul mais convient à des applications comme la reconnaissance faciale et l'extraction des fonctionnalités. 5. **Simple Thresholding**: Cette technique sépare le premier plan de l'arrière-plan en fixant des valeurs de pixel au-dessus ou au-dessous d'un certain seuil à une intensité maximale ou minimale. Elle est idéale pour les images avec un éclairage cohérent. 6. **Thresholding adaptatif**: Cette technique étend le seuil simple en permettant à la valeur seuil de varier entre les différentes régions de l'image. Elle est efficace pour les images avec un éclairage variable et offre une meilleure précision de segmentation par rapport au seuil simple. 
Les auteurs concluent qu'OpenCV fournit un ensemble polyvalent d'outils pour le lissage et le seuil d'image, chacun avec des caractéristiques uniques adaptées à des applications spécifiques. applications de vision informatique sophistiquées.


Translated Summary in Spanish:


El documento es una introducción a las técnicas de suavizado y umbralado de imágenes utilizando OpenCV, una biblioteca de visión de código abierto ampliamente utilizada. La autora, Laura Castillo, y la coautora, Yamil Guevara, proporcionan una visión general de las siguientes técnicas: 1. **Average Smoothing**: Esta técnica reduce el ruido de la imagen calculando la media de los valores de píxeles dentro de un núcleo. Es simple y eficaz, lo que lo convierte en un punto de partida común para la reducción del ruido de imagen. 2. **Gaussian Blurring**: Esta técnica mejora en el promedio mediante la aplicación de una función gaussiana a los píxeles, haciendo hincapié en los píxeles centrales en el área del núcleo. Es útil para reducir el ruido gaussiano en las imágenes y se utiliza ampliamente en los pasos de preprocesamiento para la detección de bordes y la segmentación de imágenes. 3. **Blurring medio**: Esta técnica es ideal para eliminar el ruido "sal y pimienta" de las imágenes mediante el cálculo de la mediana de todos los píxeles dentro de un núcleo. 4. ** Filtrado bilateral**: Esta técnica suaviza una imagen al tiempo que preserva sus bordes considerando tanto la distancia espacial como la diferencia de intensidad de píxeles en el núcleo. Es computacionalmente intensiva pero adecuada para aplicaciones como reconocimiento facial y extracción de características. 5. **Sencillo Thresholding**: Esta técnica separa el primer plano del fondo estableciendo valores de píxeles por encima o por debajo de un determinado umbral a una intensidad máxima o mínima. Es ideal para imágenes con iluminación consistente. 6. **Adaptive Thresholding**: Esta técnica extiende el umbral simple al permitir que el valor umbral varíe entre diferentes regiones de la imagen. Es eficaz para imágenes con iluminación variable y proporciona una mejor precisión de segmentación en comparación con el umbral simple. 
Los autores concluyen que OpenCV proporciona un conjunto versátil de herramientas para el suavizado de imágenes y el umbral, cada una con características únicas adecuadas a aplicaciones específicas. sofisticadas aplicaciones de visión por ordenador.


Translated Summary in Chinese:


本文件介绍使用开放源码计算机视觉图书馆OpenCV的图像平滑和临界技术。作者Laura Castillo和共同作者Yamil Guevara概述了以下技术:** 平均平滑**:这一技术通过计算内核中像素值的平均值来减少图像噪音。它简单而有效,使它成为减少图像噪音的共同起点。** Gausian Blurring**:这一技术在平均水平上有所改进。 通过对像素应用高斯函数,强调内核区域的中心像素。它有助于减少图像中的高斯噪音,并被广泛用于边缘探测和图像分割的预处理步骤。 3. ** Median Blurring **:这一技术是理想的,通过计算内核内核中所有像素的中位数,从图像中去除“盐和派普尔”噪音。它保存图像中对象的边缘,使其在维护结构细节很重要的地方对应用程序有用。 4. ** 双边过滤**:这一技术既考虑到内核像素的空间距离和强度差异,又在保持其边缘的同时平滑图像,它具有计算强度,但适合面部识别和特征提取等应用。 6. ** 适应性悬赏**:这一技术通过允许图像不同区域的临界值不同而扩大了简单的临界值,从而扩大了简单的临界值,对不同照明度的图像有效,与简单的临界值相比,提供了更好的分解准确性,作者得出结论认为,开放CV为图像平滑和阈值提供了一套多功能工具,每个工具都有适合具体应用的独特特征,掌握这些基本技术对于开发至关重要。 精密计算机视觉应用。
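
The `split_text` helper in the translation cell breaks the summary on whitespace and measures chunk size in characters, so a chunk can begin or end in the middle of a sentence. A sentence-aware variant is sketched below as one possible alternative; it is not the notebook's method, and whether it improves the translations shown above has not been tested here. The name `split_into_sentence_chunks` and its parameters are hypothetical. The regular expression is a simple heuristic that also splits after list numbers such as "1.", which is harmless because neighbouring pieces are regrouped.

```python
import re


def split_into_sentence_chunks(text, max_units=100, count_units=lambda s: len(s.split())):
    """Group whole sentences into chunks of at most max_units each.

    count_units measures a string; the default counts whitespace-separated words.
    A single sentence longer than max_units is kept as its own chunk.
    """
    # Heuristic: a sentence ends at ., ! or ? followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_units(candidate) > max_units:
            # Adding this sentence would overflow, so close the current chunk
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

To measure real tokens instead of words, a caller could pass `count_units=lambda s: len(tokenizer.encode(s))` with one of the MarianMT tokenizers loaded earlier; that pairing is an assumption and is not exercised in this sketch.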

Business Use Case:

This application addresses the challenge of efficiently extracting critical insights from lengthy Word documents, which often consume significant time and resources to review. By leveraging advanced summarization and translation technologies, the app empowers users to quickly distill essential information from complex content and communicate it across diverse linguistic markets.

Business Value:

This app offers substantial value by significantly reducing the time and effort required for document review and translation, streamlining workflows for knowledge management and decision-making processes. Companies can enhance productivity, ensure better collaboration, and improve global communication by providing instant summaries and multilingual support. Furthermore, the application reduces reliance on human translation services, enabling cost savings while maintaining high accuracy. Ultimately, this tool drives operational efficiency and fosters inclusivity in multilingual environments, allowing businesses to scale more effectively in international markets.


In [None]:
!pip install gradio -qqq


interface 3

In [None]:
import gradio as gr

# Define a function to summarize and translate
def summarize_and_translate_ui(file):
    if file is None:
        # Fallback for mobile users: Use a preloaded sample document
        sample_path = "/content/sample.docx"  # Replace with the path to your sample document
        loader = Docx2txtLoader(sample_path)
    else:
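        # Load the uploaded document. With type="filepath" Gradio passes the path
        # as a string; fall back to .name for file-like objects.
        file_path = file if isinstance(file, str) else file.name
        loader = Docx2txtLoader(file_path)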

    data = loader.load()
    document_content = " ".join([doc.page_content for doc in data])

    # Generate the summary using LLM
    stream = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        messages=[{"role": "user", "content": f"Please summarize the following document:\n\n{document_content}"}],
        stream=True,
    )
    summary_text = ""
    for chunk in stream:
        chunk_content = chunk.choices[0].delta.content or ""
        summary_text += chunk_content

    # Translate the summary into multiple languages
    translations = {}
    for language, resources in translation_models.items():
        translations[language] = translate_text_in_chunks(
            summary_text, resources["model"], resources["tokenizer"]
        )

    # Return the summary and translations
    return [summary_text] + list(translations.values())

iface = gr.Interface(
    fn=summarize_and_translate_ui,
    inputs=gr.File(label="Upload a Word Document (.docx)", type="filepath", file_types=[".docx"]),
    outputs=[
        gr.Textbox(label="📝Summary (English)"),
        gr.Textbox(label="🇫🇷Translated Summary (French)"),
        gr.Textbox(label="🇪🇸Translated Summary (Spanish)"),
        gr.Textbox(label="🇨🇳Translated Summary (Chinese)"),
    ],
    title="🌟 Document Summarizer and Translator 🌟",
    description="""
        **Business Use Case:**
        This application addresses the challenge of efficiently extracting critical insights from lengthy Word documents, which often consume significant time and resources to review. By leveraging advanced summarization and translation technologies, the app empowers users to quickly distill essential information from complex content and communicate it across diverse linguistic markets.

        **Business Value:**
        This app offers substantial value by significantly reducing the time and effort required for document review and translation, streamlining workflows for knowledge management and decision-making processes. Companies can enhance productivity, ensure better collaboration, and improve global communication by providing instant summaries and multilingual support. Furthermore, the application reduces reliance on human translation services, enabling cost savings while maintaining high accuracy. Ultimately, this tool drives operational efficiency and fosters inclusivity in multilingual environments, allowing businesses to scale more effectively in international markets.

        **Note:** This app works best on desktop browsers. Mobile users can test the app using a preloaded sample document if upload functionality is unavailable.
    """,
    article="Upload your document, and this tool will generate a summary in English and translate it into multiple languages, streamlining your workflow and improving productivity."
)

# Launch the app
iface.launch()

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://8cff2002570e79cf4b.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


