<a href="https://colab.research.google.com/github/gkjrtech/initial-setup/blob/main/News_Analyzer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install newspaper3k transformers torch "lxml[html_clean]"

In [None]:
from transformers import pipeline

# Load the summarization 'pipeline' (the AI model)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_news(text):
    # BART accepts at most 1024 tokens per input. Slicing to 3000 characters is only
    # a rough pre-trim; truncation=True lets the tokenizer enforce the real limit.
    trimmed_text = text[:3000]

    summary = summarizer(trimmed_text, max_length=130, min_length=30,
                         do_sample=False, truncation=True)
    return summary[0]['summary_text']
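
Slicing to the first 3000 characters discards everything after the opening of a long article. The following is a minimal sketch of one alternative, assuming it is acceptable to summarize the article piece by piece: split the text into sentence-aligned chunks and summarize each chunk. The names `chunk_text` and `summarize_long` are hypothetical, and `summarize` stands for any callable that maps text to a summary, such as `summarize_news` above.

```python
import re

def chunk_text(text, max_chars=3000):
    """Split text into chunks of at most max_chars, breaking between sentences."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

def summarize_long(text, summarize, max_chars=3000):
    """Summarize each chunk with `summarize` and join the partial summaries."""
    return " ".join(summarize(chunk) for chunk in chunk_text(text, max_chars))
```

A call such as `summarize_long(body, summarize_news)` would then cover the whole article, at the cost of one model call per chunk.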


# Task
Create an interactive tool that allows users to input a URL, fetch the news article content from that URL, and generate an AI summary using the pre-trained `facebook/bart-large-cnn` model. The tool should consist of a URL input box, a button to trigger the summarization, and an output area to display the generated summary. The article content will be fetched from the provided URL using the `newspaper3k` library.

## Prepare Article Retrieval and Summarization Functions

### Subtask:
Ensure the `get_article_text` and `summarize_news` functions are correctly defined and ready for use. I will also update `test_url` with the latest URL provided by you for the demonstration.


In [None]:
from newspaper import Article

def get_article_text(url):
    article = Article(url)
    article.download()
    article.parse()
    return article.text
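
Downloads can fail for transient reasons, so it may help to retry before giving up. The sketch below is an illustration under assumptions, not part of `newspaper3k`: `fetch_with_retry` is a hypothetical helper, and `fetch` is any callable that takes a URL and returns text, such as `get_article_text`. Retrying will not help with permanent failures such as a missing page.

```python
import time

def fetch_with_retry(fetch, url, attempts=3, delay=1.0):
    """Call fetch(url), retrying on any exception up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except Exception as error:  # broad on purpose: the fetcher is pluggable
            last_error = error
            if attempt < attempts:
                # Linear backoff: wait delay, then 2 * delay, and so on.
                time.sleep(delay * attempt)
    # Every attempt failed; surface the most recent error to the caller.
    raise last_error
```

For example, `fetch_with_retry(get_article_text, url, attempts=3, delay=2.0)` would try the download up to three times.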


## Create Interactive Widgets

### Subtask:
Set up `ipywidgets.Text` for URL input, `ipywidgets.Button` to trigger the summary, and `ipywidgets.Output` to display the AI summary result.


**Reasoning**:
To create the interactive tool, I need to import the necessary `ipywidgets` components and instantiate them as described in the instructions.



In [None]:
import ipywidgets as widgets
from IPython.display import display

# Create a Text widget for URL input
url_input = widgets.Text(
    value='',
    placeholder='Enter article URL here',
    description='URL:',
    disabled=False
)

# Create a Button widget to trigger summarization
summarize_button = widgets.Button(
    description='Summarize Article',
    disabled=False,
    button_style='', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Click to summarize the article',
    icon='file-text'
)

# Create an Output widget to display the summary
output_area = widgets.Output()

print("Widgets initialized successfully.")

## Implement Summary Generation Callback

### Subtask:
Write a Python function that will be called when the button is clicked. This function will retrieve the URL from the input textbox, fetch the article content, generate an AI summary using the `summarizer` model, and display the summary in the output widget.


**Reasoning**:
I need to define the `on_summarize_button_clicked` function as instructed, incorporating URL retrieval, article fetching, AI summarization, and error handling within the output widget.



In [None]:
# ArticleException must be imported, otherwise the except clause below
# raises NameError as soon as any error occurs.
from newspaper.article import ArticleException

def on_summarize_button_clicked(button):
    with output_area:
        output_area.clear_output()
        # Strip stray whitespace that often comes with a pasted URL
        url = url_input.value.strip()
        if not url:
            print("Please enter a URL.")
            return

        try:
            print(f"Fetching article from: {url}")
            body = get_article_text(url)
            print("Article fetched. Generating summary...")
            ai_summary = summarize_news(body)
            print(f"--- AI SUMMARY ---\n{ai_summary}")
        except ArticleException as e:
            print(f"Error fetching or parsing article: {e}")
        except Exception as e:
            print(f"An unexpected error occurred: {e}")

print("Summary generation callback function defined.")
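
The callback only checks that the URL is non-empty. A small sketch of a stricter check, using only the standard library, could reject input that is not an `http` or `https` URL before any download is attempted. The helper name `is_valid_article_url` is hypothetical.

```python
from urllib.parse import urlparse

def is_valid_article_url(url):
    """Return True if url has an http(s) scheme and a host part."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Inside the callback, this could replace the `if not url:` test, for example `if not is_valid_article_url(url): print("Please enter a full http(s) URL.")`.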

**Reasoning**:
Now that the `on_summarize_button_clicked` function is defined, I need to link it to the `summarize_button` click event to make the interactive tool functional.



In [None]:
summarize_button.on_click(on_summarize_button_clicked)

print("Button click event handler assigned.")

**Reasoning**:
With the button's event handler assigned, the next step is to display the interactive widgets (URL input, button, and output area) to the user to make the tool functional.



In [None]:
display(url_input, summarize_button, output_area)

print("Interactive widgets displayed.")

## Final Task

### Subtask:
Provide instructions on how to use the interactive interface to input a URL and generate an AI summary.


## Summary:

### Q&A
The interactive interface allows users to generate an AI summary by following these steps:
1.  **Input URL:** Enter the URL of a news article into the text box labeled "URL:".
2.  **Trigger Summarization:** Click the "Summarize Article" button.
3.  **View Summary:** The AI-generated summary will appear in the output area below the button, along with progress messages or any error notifications.
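
The same three steps can also be run without widgets, which is convenient for testing. The sketch below assumes a hypothetical helper `summarize_url` that receives the fetch and summarize functions as arguments, so it does not depend on the model or the network by itself.

```python
def summarize_url(url, fetch, summarize):
    """Fetch the article at url and return its summary.

    fetch:     callable taking a URL and returning the article text
    summarize: callable taking text and returning a summary string
    Raises ValueError for an empty URL or an empty article body.
    """
    url = url.strip()
    if not url:
        raise ValueError("Please enter a URL.")
    body = fetch(url)
    if not body.strip():
        raise ValueError("No article text could be extracted.")
    return summarize(body)
```

In this notebook the call would look like `summarize_url(url_input.value, get_article_text, summarize_news)`.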

### Data Analysis Key Findings
*   The `get_article_text` function, which uses `newspaper3k`, retrieved content from a Wikipedia URL, showing that it works when the download succeeds.
*   `newspaper3k` consistently failed to fetch articles from several news sources (e.g., CNN, BBC, New York Times). The failures were HTTP errors such as "404 Client Error: Not Found", "500 Server Error: Internal Server Error", and "403 Client Error: Forbidden".
*   The `summarize_news` function operated as expected, successfully generating an AI summary when provided with article text.
*   All necessary `ipywidgets` components (`Text` for URL input, `Button` for triggering summarization, and `Output` for displaying results) were successfully initialized and displayed.
*   A callback function (`on_summarize_button_clicked`) was effectively implemented to manage the end-to-end process: retrieving the URL, fetching content, summarizing, and displaying output or error messages. This callback was correctly assigned to the summarization button.
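
The HTTP errors listed above reach the user as raw exception text. As a rough sketch, the status code could be pulled out of the message and mapped to a short hint. The helper `friendly_error` and the wording in `FRIENDLY_MESSAGES` are hypothetical, and the approach assumes the status code appears in the error string as it does in the messages quoted above.

```python
import re

# Hypothetical wording for the status codes seen in the findings above.
FRIENDLY_MESSAGES = {
    403: "The site refused the request (403). It may block automated access.",
    404: "The page was not found (404). Check the URL for typos.",
    500: "The site reported a server error (500). Try again later.",
}

def friendly_error(message):
    """Map an error string containing an HTTP status code to a readable hint."""
    # Look for a standalone 4xx or 5xx number anywhere in the message.
    match = re.search(r"\b([45]\d{2})\b", message)
    if match:
        code = int(match.group(1))
        if code in FRIENDLY_MESSAGES:
            return FRIENDLY_MESSAGES[code]
        return f"The request failed with HTTP status {code}."
    return f"The article could not be fetched: {message}"
```

In the callback, `print(friendly_error(str(e)))` could then replace the raw error output.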

### Insights or Next Steps
*   To enhance the tool's reliability, explore alternative or supplementary libraries/methods for article content extraction (e.g., `requests-html`, `BeautifulSoup` with custom parsing rules, or commercial APIs) to overcome the `newspaper3k` limitations with various news websites.
*   Improve user feedback by adding a loading indicator during the article fetching and summarization process, and provide more user-friendly error messages that guide the user on potential issues or provide workarounds.
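
As an illustration of the custom-parsing idea, the sketch below extracts paragraph text from an HTML string using only the standard library's `html.parser`. It is a simplified stand-in for the libraries named above, not a replacement for them: the names `ParagraphExtractor` and `extract_paragraphs` are hypothetical, the HTML is assumed to be downloaded already, and real news pages will usually need site-specific rules.

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect the text inside <p> elements, ignoring <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._skip_depth = 0
        self._buffer = []

    def flush(self):
        # Collapse whitespace and store the paragraph if it has any text.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "p":
            self.flush()  # an unclosed <p> ends when the next one starts
            self._in_p = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "p":
            self.flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p and self._skip_depth == 0:
            self._buffer.append(data)

def extract_paragraphs(html):
    """Return the paragraph text of an HTML document, one blank line apart."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # keep text from a final <p> that was never closed
    return "\n\n".join(parser.paragraphs)
```

The returned string can be passed to `summarize_news` in place of `article.text`.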
