## Building a text summarization app with Gradio

In [3]:
!pip install transformers gradio

Collecting transformers
  Downloading transformers-4.44.2-py3-none-any.whl.metadata (43 kB)
Collecting gradio
  Downloading gradio-4.43.0-py3-none-any.whl.metadata (15 kB)
Collecting huggingface-hub<1.0,>=0.23.2 (from transformers)
  Downloading huggingface_hub-0.24.6-py3-none-any.whl.metadata (13 kB)
Collecting regex!=2019.12.17 (from transformers)
  Downloading regex-2024.7.24-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
Collecting safetensors>=0.4.1 (from transformers)
  Downloading safetensors-0.4.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
Collecting tokenizers<0.20,>=0.19 (from transformers)
  Downloading tokenizers-0.19.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting aiofiles<24.0,>=22.0 (from gradio)
  Downloading aiofiles-23.2.1-py3-none-any.whl.metadata (9.7 kB)
Collecting fastapi<0.113.0 (from gradio)
  Downloading fastapi-0.112.4-py3-none-any.whl.metadata (27 kB)
Coll

In [11]:
from transformers import pipeline
import gradio as gr
import os


In [17]:
# Initialize the summarization pipeline
get_completion = pipeline("summarization", model="facebook/bart-large-cnn")

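# Function to split text into smaller chunks.
# Note: despite its name, max_tokens counts whitespace-separated words, not model tokens.
def split_text_into_chunks(text, max_tokens=500):
    words = text.split()
    # Join each run of up to max_tokens words back into one chunk
    chunks = [' '.join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]
    return chunks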

# Generator function for streaming summarization with accumulation
def summarize(text):  # renamed from `input` to avoid shadowing the built-in
    # Split input text into smaller chunks
    chunks = split_text_into_chunks(text, max_tokens=500)
    accumulated_summary = ""  # Initialize an empty string to accumulate the summaries
    # Process each chunk and yield accumulated summaries one by one
    for chunk in chunks:
        # truncation=True guards against a chunk that still exceeds the model's maximum input length
        summary = get_completion(chunk, truncation=True)[0]['summary_text']
        accumulated_summary += summary + " "  # Accumulate each new summary with a space
        yield accumulated_summary  # Yield the accumulated summary
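
The chunk-and-accumulate pattern above can be checked without downloading the model. The sketch below is only an illustration: `fake_summarizer` and `summarize_with` are hypothetical names that are not part of the notebook, and the stand-in simply keeps the first three words of each chunk while returning the same list-of-dicts shape that the pipeline call is indexed with.

```python
def split_text_into_chunks(text, max_tokens=500):
    # Same word-based splitting as in the cell above
    words = text.split()
    return [' '.join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def fake_summarizer(chunk):
    # Hypothetical stand-in for the pipeline: keeps the first three words
    return [{"summary_text": " ".join(chunk.split()[:3])}]

def summarize_with(summarizer, text, max_tokens=500):
    # Yields the running summary after each chunk, as summarize() does
    accumulated = ""
    for chunk in split_text_into_chunks(text, max_tokens):
        accumulated += summarizer(chunk)[0]["summary_text"] + " "
        yield accumulated

text = " ".join(f"w{i}" for i in range(12))      # twelve dummy words
chunks = split_text_into_chunks(text, max_tokens=5)
partials = list(summarize_with(fake_summarizer, text, max_tokens=5))

for step, partial in enumerate(partials, start=1):
    print(step, repr(partial))
```

Each yielded value contains everything produced so far, which is why the Gradio output box appears to grow as chunks are processed rather than being replaced by each new chunk summary.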



In [18]:
# Close any previous Gradio instances
gr.close_all()

# Create the Gradio interface with a submit button and streaming enabled
demo = gr.Interface(
    fn=summarize,
    inputs=gr.Textbox(
        lines=10,
        placeholder="Enter the text you want to summarize...",
        label="Input Text"
    ),
    outputs=gr.Textbox(
        lines=5,
        label="Summarized Text"
    ),
    title="Text Summarization with facebook/bart-large-cnn",
    description="This app uses the facebook/bart-large-cnn transformer model to condense long texts into short summaries. Enter your text in the input box and click 'Submit' to get the summary.",
    examples=[
        "Artificial intelligence is a field of study that gives computers the ability to learn and perform tasks that require human intelligence, such as decision-making, visual perception, and speech recognition.",
        "Climate change is a long-term shift in weather patterns and average temperatures, mainly due to human activities such as burning fossil fuels and deforestation."
    ],
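    theme="default",  # "dark" is not a built-in theme name; built-in names include "default", "soft", "monochrome", "glass"
    # live=True removed: live mode hides the Submit button and reruns on every change;
    # a generator function already streams its output without it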
)

# Launch the app
demo.launch(share=True, server_port=int(os.environ.get('PORT1', 7860)))

Closing server running on port: 7860



Sorry, we can't find the page you are looking for.


Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://c60088bb2e2021a2b5.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




Traceback (most recent call last):
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/gradio/routes.py", line 789, in predict
    output = await route_utils.call_process_api(
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/gradio/route_utils.py", line 321, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/gradio/blocks.py", line 1488, in call_function
    raise IndexError("function has no backend method.")
IndexError: function has no backend method.
Your max_length is set to 142, but your input_length is only 114. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_len
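
The warning above appears when a chunk is shorter than the generation limit. One possible way to respond is to scale the limits to the input length before calling the pipeline. The helper below is a sketch under stated assumptions: `suggest_lengths` is a hypothetical name, the 0.5 ratio is an arbitrary choice, the cap of 142 is taken from the warning text, and the floor of 56 is assumed to match the model's default `min_length`.

```python
def suggest_lengths(input_len, max_cap=142, min_floor=56, ratio=0.5):
    """Pick (min_length, max_length) for a chunk of input_len tokens.

    Hypothetical helper: keeps max_length below the input length so the
    summary is shorter than the source, and keeps min_length below max_length.
    """
    max_len = max(1, min(max_cap, int(input_len * ratio)))
    min_len = min(min_floor, max_len // 2)
    return min_len, max_len

# The short input from the warning (114 tokens) gets tighter limits,
# while a long chunk falls back to the caps.
print(suggest_lengths(114))
print(suggest_lengths(1000))
```

Inside `summarize`, the token count could be taken from `len(get_completion.tokenizer(chunk)["input_ids"])` and the result passed as `get_completion(chunk, min_length=..., max_length=...)`. This wiring has not been run against the notebook, so treat it as a starting point.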