# 📝 Text Summarization with Audio Generation

This notebook builds an interactive text summarization app that:
- Loads any Hugging Face summarization model
- Converts the summary to audio (text-to-speech)
- Provides a Gradio interface for input and output
- Exports results (text + audio)
- Offers a Dark/Light theme option


## 1️⃣ Install Required Packages


In [None]:
# Install required packages
!pip install -q gradio transformers torch sentencepiece accelerate librosa soundfile gtts



## 2️⃣ Set Hugging Face Token


In [None]:
import os

# Option 1: Paste your token here (replace the placeholder)
hf_token = "your_hf_token_here"

# Option 2: Or use secrets from Google Colab (recommended)
# from google.colab import userdata
# hf_token = userdata.get('HF_TOKEN')

# Set the token only if a real value was provided
if hf_token and hf_token != "your_hf_token_here":
    os.environ['HF_TOKEN'] = hf_token
    print("✅ Hugging Face token set successfully!")
else:
    print("⚠️ Please replace 'your_hf_token_here' with your actual Hugging Face token")
    print("   Get your token from: https://huggingface.co/settings/tokens")
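
The token cell above can also be written as a small helper that checks an explicit value first and then the `HF_TOKEN` environment variable. This is a minimal sketch, not part of the original notebook: `resolve_hf_token` and `PLACEHOLDER` are hypothetical names chosen for illustration.

```python
import os

# Hypothetical placeholder string, matching the one used in the cell above
PLACEHOLDER = "your_hf_token_here"

def resolve_hf_token(explicit_token=None, environ=None):
    """Return a usable token, or None when none was provided."""
    environ = os.environ if environ is None else environ
    # Prefer the explicit value, then fall back to the environment
    for candidate in (explicit_token, environ.get("HF_TOKEN")):
        if candidate and candidate.strip() and candidate != PLACEHOLDER:
            return candidate.strip()
    return None

# An empty explicit token falls back to the environment value
print(resolve_hf_token("", {"HF_TOKEN": "hf_example"}))
```

Returning `None` for empty or placeholder values keeps an unusable string from being stored in `os.environ['HF_TOKEN']`.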


## 3️⃣ Load Summarization Model


In [None]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# Default model - you can change this to any Hugging Face model
MODEL_NAME = "facebook/bart-large-cnn"  # Change this to your desired model

print(f"🔄 Loading model: {MODEL_NAME}...")

# Get token if available
token = os.environ.get('HF_TOKEN', None)

try:
    # Load tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_NAME,
        token=token if token else None
    )

    # Load model
    model = AutoModelForSeq2SeqLM.from_pretrained(
        MODEL_NAME,
        token=token if token else None
    )

    # Move to GPU if available
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = model.to(device)
    model.eval()

    print(f"✅ Model loaded successfully on {device}!")
    print(f"📊 Model: {MODEL_NAME}")
    print(f"💻 Device: {device}")

except Exception as e:
    print(f"❌ Error loading model: {e}")
    print("💡 For gated or private models, make sure your Hugging Face token is set correctly")


🔄 Loading model: facebook/bart-large-cnn...


✅ Model loaded successfully on cpu!
📊 Model: facebook/bart-large-cnn
💻 Device: cpu


## 4️⃣ Summarization Function


In [None]:
def summarize_text(text, max_length=1024, min_length=50, return_tokens=False):
    """
    Summarize input text using the loaded model.

    Args:
        text: Input text to summarize
        max_length: Maximum length of the summary, in tokens
        min_length: Minimum length of the summary, in tokens
        return_tokens: Whether to also return token counts

    Returns:
        The summary string, or (summary, (input_tokens, output_tokens))
        when return_tokens is True. On failure the summary is a message
        and both counts are 0.
    """
    def _result(summary, counts=(0, 0)):
        # Keep the return shape consistent with return_tokens
        return (summary, counts) if return_tokens else summary

    if not text or not text.strip():
        return _result("⚠️ Please provide some text to summarize.")

    try:
        # Tokenize input (anything past 1024 tokens is truncated)
        inputs = tokenizer(
            text,
            max_length=1024,
            truncation=True,
            return_tensors="pt"
        ).to(device)

        # Generate summary
        with torch.no_grad():
            summary_ids = model.generate(
                inputs["input_ids"],
                attention_mask=inputs["attention_mask"],
                max_length=int(max_length),
                min_length=int(min_length),
                num_beams=4,
                length_penalty=2.0,
                early_stopping=True
            )

        # Decode summary
        summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

        # Count the tokens the model actually received and produced
        input_tokens = inputs["input_ids"].shape[1]
        output_tokens = len(summary_ids[0])

        return _result(summary, (input_tokens, output_tokens))

    except Exception as e:
        return _result(f"❌ Error during summarization: {str(e)}")
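
Because the tokenizer call above truncates at 1024 tokens, anything past that point in a long document never reaches the model. One common workaround is to split the text into sentence-aligned chunks and summarize each chunk separately. The sketch below is an illustration under stated assumptions, not part of the original notebook: `split_into_chunks` is a hypothetical helper, and it uses word count as a rough stand-in for token count.

```python
import re

def split_into_chunks(text, max_words=400):
    """Group sentences into chunks of at most max_words words each.

    Word count only approximates token count (an assumption); a single
    sentence longer than max_words becomes its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        if n_words == 0:
            continue
        # Start a new chunk when adding this sentence would exceed the budget
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks

demo_text = "First sentence here. Second sentence follows! Third one asks why? Fourth ends it."
for chunk in split_into_chunks(demo_text, max_words=6):
    print(chunk)
```

Each chunk could then be passed to `summarize_text` and the partial summaries joined, at the cost of one model call per chunk.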


## 5️⃣ Load Text-to-Audio Model


In [None]:
from gtts import gTTS
import tempfile
import os

# Use gTTS (Google Text-to-Speech) - free and fast (requires an internet connection)
# For more advanced models, you can use:
# - facebook/fastspeech2-en-ljspeech
# - microsoft/speecht5_tts

print("🔄 Loading text-to-audio system...")

def text_to_audio(text, language='en', slow=False):
    """
    Convert text to audio using Google Text-to-Speech.

    Args:
        text: Text to convert
        language: Language code (default: 'en')
        slow: Whether to speak slowly

    Returns:
        Path to audio file
    """
    if not text or not text.strip():
        return None

    try:
        # Reserve a temporary file name, then close the handle before writing
        with tempfile.NamedTemporaryFile(delete=False, suffix='.mp3') as tmp:
            audio_file = tmp.name

        # Generate speech
        tts = gTTS(text=text, lang=language, slow=slow)
        tts.save(audio_file)

        print("✅ Audio generated successfully!")
        return audio_file
    except Exception as e:
        print(f"❌ Error generating audio: {e}")
        return None

def text_to_audio_advanced(text, language='en', slow=False):
    """
    Advanced text-to-audio with error handling and file management.
    """
    if not text or not text.strip():
        return None

    try:
        # Clean text for TTS (remove special characters that might cause issues)
        cleaned_text = text.replace('"', '').replace('\n', ' ')

        # Create temporary file
        temp_dir = tempfile.gettempdir()
        audio_file = os.path.join(temp_dir, f"summary_{hash(text) % 10000}.mp3")

        # Generate speech
        tts = gTTS(text=cleaned_text, lang=language, slow=slow)
        tts.save(audio_file)

        return audio_file
    except Exception as e:
        print(f"❌ Error in advanced audio generation: {e}")
        return None

print("✅ Text-to-Audio system ready!")


🔄 Loading text-to-audio system...
✅ Text-to-Audio system ready!
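
`text_to_audio_advanced` names its output file with `hash(text) % 10000`. Python salts string hashes per process by default, so the same text can map to a different file name after a restart, and the modulo leaves only 10,000 possible names. If repeatable names matter, a content digest is one alternative. The following is a minimal sketch with a hypothetical helper name (`audio_cache_path`), not the notebook's own approach.

```python
import hashlib
import os
import tempfile

def audio_cache_path(text, directory=None):
    """Build a repeatable .mp3 path from a SHA-256 digest of the text."""
    directory = directory or tempfile.gettempdir()
    # The first 12 hex characters keep the name short while making clashes unlikely
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    return os.path.join(directory, f"summary_{digest}.mp3")

path_a = audio_cache_path("Hello world")
path_b = audio_cache_path("Hello world")
print(path_a == path_b)  # True: same text, same path
```

A repeatable path also makes it possible to skip regeneration when the file already exists.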


## 6️⃣ Test Audio Generation


In [None]:
# Test audio generation
test_text = "Hello! This is a test of the text-to-audio conversion."
print(f"📝 Test text: {test_text}")
print("🔊 Generating audio...")

audio_path = text_to_audio_advanced(test_text)
if audio_path:
    print(f"✅ Audio saved to: {audio_path}")
    # In Colab, you can play it with:
    # from IPython.display import Audio
    # Audio(audio_path)


📝 Test text: Hello! This is a test of the text-to-audio conversion.
🔊 Generating audio...
✅ Audio saved to: /tmp/summary_4250.mp3


## 7️⃣ Test the Summarization Model


In [None]:
# Test with sample text
sample_text = """
Artificial intelligence (AI) has revolutionized many aspects of our daily lives.
From virtual assistants in our phones to recommendation systems on streaming platforms,
AI is everywhere. Machine learning, a subset of AI, enables computers to learn from data
without being explicitly programmed. Deep learning, which uses neural networks with multiple
layers, has enabled breakthroughs in image recognition, natural language processing, and more.
These technologies are transforming industries from healthcare to finance, making processes
more efficient and enabling new capabilities that were previously thought impossible.
"""

print("📝 Original text:")
print(sample_text[:200] + "...")
print("\n" + "="*50 + "\n")

summary, (in_tokens, out_tokens) = summarize_text(sample_text, return_tokens=True)
print("✅ Summary:")
print(summary)
print(f"\n📊 Input tokens: {in_tokens}, Output tokens: {out_tokens}")


📝 Original text:

Artificial intelligence (AI) has revolutionized many aspects of our daily lives. 
From virtual assistants in our phones to recommendation systems on streaming platforms, 
AI is everywhere. Machine le...


✅ Summary:
Artificial intelligence (AI) has revolutionized many aspects of our daily lives. Machine learning, a subset of AI, enables computers to learn from data. Deep learning has enabled breakthroughs in image recognition, natural language processing, and more. These technologies are transforming industries from healthcare to finance.

📊 Input tokens: 119, Output tokens: 61
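
The Gradio cell reports a "Compression Ratio" computed as `(1 - out_tokens/in_tokens) * 100`. Applied to the token counts printed above, the arithmetic looks like this. The function name `token_reduction_percent` is a hypothetical one for illustration, and the zero guard is an addition that avoids a division error when no input tokens were counted.

```python
def token_reduction_percent(input_tokens, output_tokens):
    """Percentage of input tokens removed by the summary; 0.0 for empty input."""
    if input_tokens <= 0:
        return 0.0
    return (1 - output_tokens / input_tokens) * 100

# 119 input tokens reduced to 61 output tokens
print(f"{token_reduction_percent(119, 61):.1f}%")  # 48.7%
```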


## 8️⃣ Create Gradio Interface with Audio & Theme Features


In [None]:
import gradio as gr
import datetime

def process_summarization(input_text, max_len, min_len, theme, generate_audio):
    """
    Process summarization with Gradio interface and audio generation.

    Note: `theme` is received from the UI but not used here; the theme is
    fixed when the Blocks app is created.
    """
    if not input_text or not input_text.strip():
        return "Please provide some text to summarize.", None, None

    # Get summary
    summary, (in_tokens, out_tokens) = summarize_text(
        input_text,
        max_length=max_len,
        min_length=min_len,
        return_tokens=True
    )

    # Token counts of 0 mean summarization failed; show the message only
    if in_tokens == 0:
        return summary, None, None

    # Share of input tokens removed by the summary
    reduction = (1 - out_tokens / in_tokens) * 100

    # Create detailed output
    detailed_output = f"""# Text Summarization Results

**Generated on:** {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}

## Original Text
{input_text}

## Summary
{summary}

## Statistics
- Input Tokens: {in_tokens}
- Output Tokens: {out_tokens}
- Compression Ratio: {reduction:.1f}%
"""

    # Generate audio if requested
    audio_output = None
    if generate_audio and summary:
        audio_output = text_to_audio_advanced(summary)

    return summary, detailed_output, audio_output

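# Needed for writing export files (also imported in section 5)
import os
import tempfile

def export_results(detailed_output):
    """
    Write the detailed results to a Markdown file and return its path.
    A gr.File output is given a file path, not the text itself.
    """
    if not detailed_output or detailed_output.startswith("Please"):
        return None

    timestamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
    export_path = os.path.join(tempfile.gettempdir(), f"summary_results_{timestamp}.md")
    with open(export_path, "w", encoding="utf-8") as f:
        f.write(detailed_output)
    return export_path

def export_summary_with_audio(summary_text, audio_file):
    """
    Write the summary to a timestamped text file and return its path.
    `audio_file` is accepted for future use but is not bundled here.
    """
    if not summary_text or summary_text.startswith("Please"):
        return None

    # Create a timestamped file with the summary
    timestamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
    export_path = os.path.join(tempfile.gettempdir(), f"summary_{timestamp}.txt")
    with open(export_path, "w", encoding="utf-8") as f:
        f.write(summary_text)

    # For more advanced export, you could zip text + audio
    return export_path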

# Define the theme
def create_theme(theme_choice):
    """
    Create theme based on user choice.
    """
    if theme_choice == "Dark":
        return gr.themes.Soft(
            primary_hue="blue",
            secondary_hue="purple",
            neutral_hue="slate",
            font=[gr.themes.GoogleFont('Inter')],
            font_mono=[gr.themes.GoogleFont('JetBrains Mono')]
        ).set(
            body_background_fill_dark="#1a1a1a",
            panel_background_fill_dark="#2d2d2d",
            button_primary_background_fill_dark="#0066cc",
            button_primary_background_fill_hover_dark="#0052a3"
        )
    else:
        return gr.themes.Soft(
            primary_hue="blue",
            secondary_hue="purple",
            neutral_hue="slate",
            font=[gr.themes.GoogleFont('Inter')],
            font_mono=[gr.themes.GoogleFont('JetBrains Mono')]
        )

print("🎨 Creating Gradio interface...")

with gr.Blocks(title="Text Summarization App", theme=create_theme("Light")) as demo:
    gr.Markdown(
        """
        # 📝 Text Summarization with Audio Generation
        ### Powered by transformers, Gradio & Text-to-Speech

        Enter your text below and get AI-generated summaries with audio output!
        """
    )

    with gr.Row():
        with gr.Column(scale=1):
            gr.Markdown("### ⚙️ Settings")
            max_length = gr.Slider(
                minimum=50,
                maximum=512,
                value=150,
                step=10,
                label="Max Summary Length",
                info="Maximum number of tokens in summary"
            )
            min_length = gr.Slider(
                minimum=10,
                maximum=100,
                value=30,
                step=5,
                label="Min Summary Length",
                info="Minimum number of tokens in summary"
            )
            generate_audio = gr.Checkbox(
                value=True,
                label="🔊 Generate Audio",
                info="Convert summary to speech"
            )
            theme_choice = gr.Radio(
                choices=["Light", "Dark"],
                value="Light",
                label="Theme"
            )

        with gr.Column(scale=2):
            gr.Markdown("### 📝 Input Text")
            input_text = gr.Textbox(
                label="Enter text to summarize",
                placeholder="Paste your text here...",
                lines=10,
                max_lines=15
            )

            with gr.Row():
                summarize_btn = gr.Button("✨ Summarize", variant="primary")
                clear_btn = gr.Button("🗑️ Clear")

    with gr.Row():
        with gr.Column(scale=1):
            gr.Markdown("### 📄 Summary")
            summary_output = gr.Textbox(
                label="Summary",
                lines=8,
                interactive=True
            )
            audio_output = gr.Audio(
                label="🔊 Audio Summary",
                type="filepath",
                autoplay=False
            )

            with gr.Row():
                export_text_btn = gr.Button("💾 Export Text", variant="secondary")
                export_audio_btn = gr.Button("🎵 Export Audio", variant="secondary")

        with gr.Column(scale=1):
            gr.Markdown("### 📊 Detailed Results")
            detailed_output = gr.Markdown()
            with gr.Row():
                file_output_text = gr.File(label="📄 Text Export", interactive=False)
                file_output_audio = gr.File(label="🎵 Audio Export", interactive=False)

    # Examples
    gr.Markdown("### 💡 Example Texts")
    examples = gr.Examples(
        examples=[
            ["Artificial intelligence has transformed many aspects of technology. Machine learning enables computers to learn patterns from data. Deep learning uses neural networks to solve complex problems. These technologies are revolutionizing healthcare, finance, and transportation."],
            ["Climate change is one of the most pressing issues of our time. Rising global temperatures are causing extreme weather events. Melting ice caps threaten coastal cities. Renewable energy solutions are crucial for reducing greenhouse gas emissions. Individual actions combined with policy changes can make a significant impact."],
            ["Space exploration has always captivated human imagination. From the moon landing to Mars rovers, we've achieved incredible milestones. Private companies are now advancing space travel. The goal is to establish human presence beyond Earth within this century. This requires international cooperation and sustainable technology."]
        ],
        inputs=input_text
    )

    # Event handlers
    summarize_btn.click(
        fn=process_summarization,
        inputs=[input_text, max_length, min_length, theme_choice, generate_audio],
        outputs=[summary_output, detailed_output, audio_output]
    )

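    export_text_btn.click(
        fn=export_results,
        inputs=detailed_output,
        outputs=file_output_text
    )

    # The audio component already holds a file path, so pass it through
    export_audio_btn.click(
        fn=lambda audio_path: audio_path,
        inputs=audio_output,
        outputs=file_output_audio
    )

    clear_btn.click(
        fn=lambda: ("", "", "", None, None, None),
        outputs=[input_text, summary_output, detailed_output, audio_output,
                 file_output_text, file_output_audio]
    )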

print("✅ Gradio interface created successfully!")
print("🚀 Ready to launch...")
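
A comment in `export_summary_with_audio` mentions that a more advanced export could zip the text and audio together. One way to do that with the standard library is sketched below. The helper name `bundle_export` and the archive file names are hypothetical; the function returns a path, which is the kind of value a file download output would be given.

```python
import os
import tempfile
import zipfile

def bundle_export(report_text, audio_path=None, directory=None):
    """Write report_text to summary.md inside a zip, adding the audio file if present."""
    directory = directory or tempfile.mkdtemp()
    zip_path = os.path.join(directory, "summary_export.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store the report straight from memory
        archive.writestr("summary.md", report_text)
        # Add the audio only when a real file was generated
        if audio_path and os.path.exists(audio_path):
            archive.write(audio_path, arcname=os.path.basename(audio_path))
    return zip_path

zip_file = bundle_export("# Summary\n\nExample text.")
with zipfile.ZipFile(zip_file) as archive:
    print(archive.namelist())
```

In the app, such a function could be wired to a single export button with `detailed_output` and `audio_output` as inputs; that wiring is left out here because it depends on how the interface is laid out.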


## 9️⃣ Launch the App


In [None]:
# Launch the app
demo.launch(
    share=True,  # Creates public link
    server_name="0.0.0.0",
    server_port=7860,
    debug=True
)


Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://603e21757e5fcecd0c.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/gradio/queueing.py", line 759, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/gradio/route_utils.py", line 354, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/gradio/blocks.py", line 2127, in process_api
    data = await self.postprocess_data(block_fn, result["prediction"], state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/gradio/blocks.py", line 1904, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/gradio/components/file.py", line 227, in postprocess


## 📚 How to Use

1. **Run all cells** in order
2. **Set your Hugging Face token** in cell 4, under section 2️⃣ (get it from [Hugging Face Settings](https://huggingface.co/settings/tokens))
3. **Optionally change the model** in cell 6, under section 3️⃣ (default: `facebook/bart-large-cnn`)
4. **Launch the app** in cell 18, under section 9️⃣
5. **Test the app** with the example texts or your own input

### Features
- ✨ **Interactive summarization** with real-time processing
- 🔊 **Text-to-Audio conversion** using Google Text-to-Speech
- 🎨 **Dark/Light theme** toggle
- 💾 **Export functionality** to download results
- 📊 **Detailed statistics** including token counts
- 📝 **Example texts** for quick testing
- 🎵 **Audio player** integrated in the interface

### Popular Models to Try
- `facebook/bart-large-cnn` (default) - Best for general summarization
- `google/pegasus-xsum` - For abstractive summaries
- `t5-small` - Lightweight and fast
- `allenai/led-large-16384` - For long document summaries

### Text-to-Speech Options
- **gTTS** (default) - Fast and free, but requires an internet connection
- **Azure Speech** - For advanced voices (requires API key)
- **Amazon Polly** - For natural-sounding speech (requires API key)
