

---

# **Project Presentation**: _[Smart Summarizer]_

---

## **Team Members**

- **Mähönen Janne**: Worked on text-to-speech (TTS), translation, and summary types integration.
- **Ocampo Heidi**: [Brief description of your role and contribution].
- **Sillanaukee Joonas**: [Brief description of your role and contribution].
- **Silvola Izabel**: [Brief description of your role and contribution].
- **Vihanto Jami**: [Brief description of your role and contribution].

---

## **Introduction**

- **Objective**: Provide users with faster analysis of large text sets by generating summaries for articles, documents, and educational materials.
- **Key Tools**:  
   - **BART model** for summarization.
   - **deep_translator** for translation between languages.
   - **gTTS** for text-to-speech (TTS).
   - **nltk/wordnet** for extracting key terms and definitions.

---


## **Design**

- **Text Input**: Users can input text directly, upload files (PDF, DOCX), or provide URLs for summarization.
- **Summary Types**: Multiple summarization formats are available:
   - Main Points
   - Short, Medium, and Long Summaries
   - Concepts List (with definitions for key terms)

- **Language Detection**: The app uses `langdetect` to identify the language of the text. If the user requests text-to-speech, the audio is generated in the detected language.
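
The input routes listed under **Text Input** (direct text, PDF/DOCX uploads, URLs) could be told apart with a small dispatcher before any extraction happens. This is a minimal sketch, not the app's actual code: the function name `classify_input`, the `SUPPORTED_FILE_TYPES` table, and the returned labels are hypothetical, and the extraction step for each kind is left out.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical labels for the file types named in the Design section
SUPPORTED_FILE_TYPES = {".pdf": "pdf", ".docx": "docx"}

def classify_input(value: str) -> str:
    """Return 'url', 'pdf', 'docx', or 'text' for a user-supplied string."""
    candidate = value.strip()
    parsed = urlparse(candidate)
    # Treat only well-formed http(s) addresses as URLs
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # A single-line value ending in a supported extension is treated as a file path
    if "\n" not in candidate:
        suffix = Path(candidate).suffix.lower()
        if suffix in SUPPORTED_FILE_TYPES:
            return SUPPORTED_FILE_TYPES[suffix]
    # Anything else is summarized as plain text
    return "text"
```

Keeping this check separate from the extraction code means each route (web page, PDF, DOCX) can be developed and tested on its own.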

---

## **Challenges (Janne)**

###  **Dependency Conflicts and Tool Integration**:
We ran into library conflicts when using Google Translate alongside Gradio's dependencies, and resolved them by switching to Deep Translator. Integrating multiple tools such as Gradio, NLP models, and TTS systems without introducing compatibility issues was crucial.


###  **Concept List Extraction**:
Generating concept lists by extracting key nouns and adjectives was complicated. We relied on NLTK’s WordNet for definitions, which only works reliably with English input, so precise language detection was needed to work around this shortcoming.
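
A minimal sketch of this idea follows; it is not the project's actual implementation. It assumes the text has already been POS-tagged (for example with `nltk.pos_tag(nltk.word_tokenize(text))`) and that the NLTK `wordnet` corpus has been downloaded. The function names `select_candidate_terms` and `define_terms` and the `max_terms` parameter are hypothetical.

```python
def select_candidate_terms(tagged_tokens, max_terms=10):
    """Keep unique nouns and adjectives from (word, POS tag) pairs, in order of appearance."""
    seen, terms = set(), []
    for word, tag in tagged_tokens:
        key = word.lower()
        # Penn Treebank tags: NN* = nouns, JJ* = adjectives
        if tag.startswith(("NN", "JJ")) and key.isalpha() and key not in seen:
            seen.add(key)
            terms.append(key)
    return terms[:max_terms]

def define_terms(terms):
    """Look up the first WordNet definition for each term (English only)."""
    # Imported lazily; requires the NLTK 'wordnet' corpus to be available
    from nltk.corpus import wordnet
    definitions = {}
    for term in terms:
        synsets = wordnet.synsets(term)
        if synsets:  # Skip terms that WordNet does not know
            definitions[term] = synsets[0].definition()
    return definitions
```

Because WordNet lookups are English-only, a caller would typically run this only when the detected language is English.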

###  **Maintaining Workflow Between Summarization and Translation**:
Ensuring smooth transitions between summarization, translation, and TTS required careful data handling, especially when toggling between original and translated content to maintain output quality.

---

## **Challenges (Izabel)**


- **Remote communication and lack of face-to-face interaction:**
One of the challenges we faced was working remotely, which made communication through Teams more difficult at times. Without face-to-face interaction, it was harder to have quick, spontaneous conversations to solve problems or brainstorm ideas.
Scheduling meetings was also a challenge, because some group members had other responsibilities to manage, which made it hard to find a time that worked for the whole team. Without non-verbal cues, it was also sometimes tricky to fully understand tone or intent during discussions.

- **Varied programming skill levels and educational backgrounds:**
Another challenge we encountered was the difference in programming skills and educational backgrounds within the team. Some members had more experience with certain tools and technologies, while others were still learning or came from different academic programs with varying focuses. This created a bit of a learning curve for some team members and sometimes slowed down progress.

- **Integrating features developed separately:**
We developed each new feature individually and then tried to combine them later, which turned out to be a challenge. While programming features one at a time helped us focus on each aspect, integrating them into a single system was more complex than expected. There were unexpected compatibility issues between the different components, and combining everything required more troubleshooting and coordination than we initially anticipated. This made the integration phase more difficult than planned.

---

## **Code Logic**

This section explains the **Code Logic** for summarization, translation, and text-to-speech (TTS), with relevant snippets:

### 1. **Summarization Logic**:
The app uses the **BART model** from Hugging Face’s transformers library to summarize the input text. Based on the user’s selection (short, medium, or long summary), the model is configured with different maximum and minimum lengths for the generated summary.

**Snippet:**
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load BART model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

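# Summarization function
def summarize_bart(input_text, max_length, min_length):
    # Tokenize the input, truncating it to at most 1024 tokens
    inputs = tokenizer(input_text, return_tensors="pt", max_length=1024, truncation=True)
    # Generate without sampling; lengths are measured in tokens, not words
    summary_ids = model.generate(
        inputs["input_ids"], attention_mask=inputs["attention_mask"],
        max_length=max_length, min_length=min_length, do_sample=False,
    )
    # Decode the generated token IDs back into plain text
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Example for generating a short summary (`content` holds the text to summarize)
short_summary = summarize_bart(content, max_length=80, min_length=40)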
```

In this example, the **BART model** processes the input text and generates a summary. The length of the summary is controlled by `max_length` and `min_length` parameters to match the user's choice of summary type (e.g., short, medium, long).
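
One way to connect the user's summary type to these parameters is a small lookup table. The sketch below is illustrative only: the `short` values (80/40) match the example above, while the `medium` and `long` values, the `SUMMARY_LENGTHS` name, and the `length_settings` helper are assumptions rather than the project's actual settings.

```python
# Only the "short" values come from the snippet above;
# the "medium" and "long" values are illustrative assumptions.
SUMMARY_LENGTHS = {
    "short": {"max_length": 80, "min_length": 40},
    "medium": {"max_length": 150, "min_length": 80},
    "long": {"max_length": 300, "min_length": 150},
}

def length_settings(summary_type: str) -> dict:
    """Return the generation length limits for a summary type (case-insensitive)."""
    key = summary_type.strip().lower()
    if key not in SUMMARY_LENGTHS:
        raise ValueError(f"Unknown summary type: {summary_type!r}")
    # Return a copy so callers cannot modify the shared table
    return dict(SUMMARY_LENGTHS[key])
```

With this helper, a call such as `summarize_bart(content, **length_settings("short"))` would produce the same short summary as the example.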

---

### 2. **Translation Logic**:
The app uses the **Deep Translator** library to translate the summarized text into the selected language. If the user selects a language other than "Original," the app translates the summary to the target language.

**Snippet:**
```python
from deep_translator import GoogleTranslator

# Function to translate the summary
def translate_summary(text, target_lang):
    if target_lang != "Original":  # Only translate if the target language is not "Original"
        translator = GoogleTranslator(source="en", target=target_lang)
        translated_text = translator.translate(text)
        return translated_text
    return text  # If "Original" is selected, return the original text

# Example of translating a summary to Spanish
translated_summary = translate_summary(short_summary, target_lang="es")
```

The **GoogleTranslator** class from Deep Translator handles the translation, with `target_lang` being the user-selected output language. If the user chooses "Original," the app keeps the summary in its original language.
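
When the user toggles between the original and translated summary, it can help to keep the original text untouched and cache each translation. The following is a hedged sketch of that idea, not the app's code: the `SummaryView` class and its `get` method are hypothetical, and the `translate` callable is assumed to have the same signature as `translate_summary` above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class SummaryView:
    """Holds the original summary and caches translations per language."""
    original: str
    translate: Callable[[str, str], str]  # e.g. translate_summary(text, target_lang)
    _cache: Dict[str, str] = field(default_factory=dict)

    def get(self, target_lang: str) -> str:
        # "Original" always returns the untouched summary
        if target_lang == "Original":
            return self.original
        # Translate from the original each time, never from another translation
        if target_lang not in self._cache:
            self._cache[target_lang] = self.translate(self.original, target_lang)
        return self._cache[target_lang]
```

Always translating from the original avoids quality loss from chained translations, and the cache avoids repeated network calls when the user switches back and forth.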

---

### 3. **Text-to-Speech (TTS) Logic**:
Using **gTTS (Google Text-to-Speech)**, the app can convert the summary into audio. The language for TTS is determined by detecting the language of the summarized text using the **langdetect** library, ensuring the speech is generated in the correct language.

**Snippet:**
```python
from langdetect import detect
from gtts import gTTS
from io import BytesIO

# Function to convert text to speech
def text_to_speech(input_text, summary_text, summary_generated):
    text_to_read = summary_text if summary_generated else input_text
    detected_lang = detect(text_to_read)  # Detect the language of the text
    # tts_language_map (defined elsewhere in the app) maps langdetect codes to gTTS codes; falls back to English
    tts_lang = tts_language_map.get(detected_lang, 'en')

    # Generate speech using gTTS
    tts = gTTS(text=text_to_read, lang=tts_lang)
    audio_file = BytesIO()
    tts.write_to_fp(audio_file)
    audio_file.seek(0)
    return audio_file

# Example of converting a summary to speech
audio_output = text_to_speech(input_text, translated_summary, summary_generated=True)
```

In this step, the app uses **gTTS** to generate speech in the detected language (`detected_lang`). The detected language is mapped to the appropriate TTS language code using `tts_language_map`, with English as the fallback. The audio is written to an in-memory buffer (`BytesIO`) rather than to a file on disk, and the buffer is returned so the user can play it.
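
The snippet above does not show `tts_language_map` itself. A plausible shape for it is sketched below; the entries are illustrative assumptions, not the project's actual table, and `resolve_tts_language` is a hypothetical helper. The Chinese entries assume that `langdetect` reports lowercase codes such as `zh-cn` while gTTS expects `zh-CN`.

```python
# Hypothetical mapping from langdetect codes to gTTS language codes.
# The entries are illustrative; the real app may support a different set.
tts_language_map = {
    "en": "en",
    "fi": "fi",
    "sv": "sv",
    "es": "es",
    "de": "de",
    "fr": "fr",
    "zh-cn": "zh-CN",  # assumed difference in code casing between the two libraries
    "zh-tw": "zh-TW",
}

def resolve_tts_language(detected_lang: str, default: str = "en") -> str:
    """Return the gTTS code for a detected language, or the default if unmapped."""
    return tts_language_map.get(detected_lang, default)
```

Falling back to a default keeps TTS working when detection returns a language the map does not cover, although the pronunciation will then not match the text.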

---

This structure ensures a smooth workflow: 
1. The text is summarized using the **BART model**.
2. The summary can be **translated** into different languages.
3. Finally, it can be converted to **speech** using **gTTS**, with the correct language automatically detected and applied.
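
The three steps can be chained in a single function. This is a minimal sketch under stated assumptions: `run_pipeline` is a hypothetical name, and the `summarize`, `translate`, and `speak` arguments stand in for wrappers around `summarize_bart`, `translate_summary`, and `text_to_speech` from the snippets above.

```python
from typing import Any, Callable, Dict, Optional

def run_pipeline(
    text: str,
    summarize: Callable[[str], str],
    translate: Callable[[str, str], str],
    speak: Optional[Callable[[str], Any]] = None,
    target_lang: str = "Original",
) -> Dict[str, Any]:
    """Summarize, optionally translate, and optionally convert the result to speech."""
    summary = summarize(text)                  # Step 1: summarization
    output = translate(summary, target_lang)   # Step 2: translation ("Original" is a no-op)
    audio = speak(output) if speak is not None else None  # Step 3: optional TTS
    return {"summary": summary, "output": output, "audio": audio}
```

Passing the steps in as callables keeps the pipeline independent of any one library, which also makes it easy to test with simple stand-in functions.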

---

## **Future Improvements**

- Improve the concept list generation to provide more accurate and relevant terms, especially for educational material.
- Expand summary generation methods to handle larger text sets and create educational summaries more effectively.

---

## **In Conclusion**

- We built a tool that allows users to input text, upload files, or provide URLs to generate summaries. Users can choose different summary types, translate the content into their preferred language, and listen to the summary using text-to-speech. This enhances accessibility and usability, offering an all-in-one solution for summarization, translation, and audio output.

- Reflect on the learning experiences and outcomes of the project. `Do this 24.10`
