# News Summarizer

## API Key

In [1]:
API_FILE = 'api_key.txt'

with open(API_FILE, 'r') as api_content:
    api_key = api_content.read().strip()  # drop the trailing newline that would corrupt the request
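
Keeping the key in a plain text file works for a notebook, but it is easy to commit that file by accident. A minimal sketch of an alternative, assuming a hypothetical environment variable named `NEWSAPI_KEY` (the name is not defined by NewsAPI; pick any) with the text file as a fallback:

```python
import os
from pathlib import Path

def load_api_key(env_var="NEWSAPI_KEY", key_file="api_key.txt"):
    """Return the key from the environment if set, otherwise from a local file.

    `NEWSAPI_KEY` is a hypothetical variable name chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    path = Path(key_file)
    if path.is_file():
        # strip() removes the trailing newline most editors append
        return path.read_text(encoding="utf-8").strip()
    raise RuntimeError(f"No API key found in ${env_var} or {key_file}")
```

With this helper, `api_key = load_api_key()` could replace the cell above without changing the rest of the notebook.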

## Fetch the Latest News Articles
We will use **NewsAPI** to retrieve news articles. You can sign up for a free API key at https://newsapi.org/.

In [2]:
import requests

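def fetch_news(api_key, query="technology", num_articles=5):
    # Pass parameters via `params` so requests URL-encodes the query for us.
    params = {"q": query, "pageSize": num_articles, "apiKey": api_key}
    response = requests.get("https://newsapi.org/v2/everything", params=params, timeout=10)
    response.raise_for_status()  # surface HTTP errors (bad key, rate limit) early
    articles = response.json().get("articles", [])
    return [(article['title'], article['description'], article['url']) for article in articles]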

# Example usage
news_articles = fetch_news(api_key, query="AI", num_articles=5)
for title, desc, url in news_articles:
    print(f"Title: {title}\nDescription: {desc}\nLink: {url}\n")

Description: Starting next week, Meta will no longer put an easy-to-see label on Facebook images that were edited using AI tools, and it will make it much harder to determine if they appear in their original state or had been doctored. To be clear, the company will still …
Link: https://consent.yahoo.com/v2/collectConsent?sessionId=1_cc-session_2f675784-4666-497b-8867-104365facbab

Title: ByteDance will reportedly use Huawei chips to train a new AI model
Description: As first reported by Reuters, ByteDance, the Chinese parent company of TikTok, is planning to train and develop an AI model using chips from fellow Chinese company Huawei. Three anonymous sources approached Reuters with this information; a fourth source cou…
Link: https://consent.yahoo.com/v2/collectConsent?sessionId=1_cc-session_696f4cdb-5436-43e1-b3cd-6cdaee93cbe7

Title: Google is using AI to make fake podcasts from your research
Description: Google’s AI note-taking app, NotebookLM, will now let you generate a conversa
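
The tuples returned by `fetch_news` are passed straight to the later steps, so an article with a missing title or description would end up in the summarizer as the literal text `None`. The sketch below shows one way to filter such entries first. It assumes the API can return null or empty fields; `clean_articles` and the sample records are hypothetical and only for illustration.

```python
def clean_articles(raw_articles):
    """Keep only articles that have a title, a description, and a URL."""
    cleaned = []
    for article in raw_articles:
        # `or ""` turns a missing key or a null value into an empty string
        title = (article.get("title") or "").strip()
        desc = (article.get("description") or "").strip()
        url = (article.get("url") or "").strip()
        if title and desc and url:
            cleaned.append((title, desc, url))
    return cleaned

# Made-up records shaped like the entries of the "articles" list
sample = [
    {"title": "Example headline", "description": "Example description.", "url": "https://example.com/a"},
    {"title": "No description", "description": None, "url": "https://example.com/b"},
]
print(clean_articles(sample))
```

The function takes the raw dictionaries rather than the tuples, so it would be applied to the `articles` list inside `fetch_news` before the tuples are built.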

## Summarize the Articles using an LLM
We use a pre-trained Hugging Face model, **T5** (`t5-small`), to summarize each article's title and description.

In [3]:
from transformers import pipeline

def summarize_articles(articles):
    summarizer = pipeline("summarization", model="t5-small")
    summaries = []
    for title, desc, url in articles:
        full_text = f"{title}. {desc or ''}"  # avoid feeding the literal "None" when a description is missing
        summary = summarizer(full_text, max_length=50, min_length=25, do_sample=False)[0]["summary_text"]
        summaries.append((title, summary, url))
    return summaries
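
The inputs here are short (a title plus a one or two sentence description), so a fixed `max_length=50, min_length=25` can ask for a summary longer than the text itself, and the pipeline warns when that happens. A rough sketch of adapting the bounds to the input follows. `summary_length_bounds` is a hypothetical helper, and it uses the whitespace word count as a crude stand-in for the model's token count, which is an assumption rather than an exact measure.

```python
def summary_length_bounds(text, max_cap=50, min_cap=25):
    """Pick (max_length, min_length) that do not exceed the input's rough size.

    Word count is only an approximation of the tokenizer's token count.
    """
    n_words = len(text.split())
    # Never ask for more than the caps, nor for more words than the input has
    max_length = max(2, min(max_cap, n_words))
    # Keep the minimum at or below half of the maximum
    min_length = max(1, min(min_cap, max_length // 2))
    return max_length, min_length

# Possible use inside summarize_articles (not executed here):
#   max_len, min_len = summary_length_bounds(full_text)
#   summarizer(full_text, max_length=max_len, min_length=min_len, do_sample=False)
```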

## Perform Sentiment Analysis on the Articles
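We'll use **VADER** from **NLTK** for sentiment analysis. It is a rule-based, lexicon-driven analyzer that needs no training and is quick to set up. It was designed for short, informal text, so treat its scores on news summaries as a rough signal rather than a precise measurement.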

In [4]:
from nltk.sentiment import SentimentIntensityAnalyzer
import nltk

nltk.download('vader_lexicon')

def analyze_sentiment(articles):
    sia = SentimentIntensityAnalyzer()
    sentiment_scores = []
    for title, summary, url in articles:
        sentiment = sia.polarity_scores(summary)["compound"]
        sentiment_label = "Positive" if sentiment > 0 else "Negative" if sentiment < 0 else "Neutral"
        sentiment_scores.append((title, summary, sentiment_label, url))
    return sentiment_scores

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\Ratch\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!
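
`analyze_sentiment` labels any nonzero compound score as positive or negative, so a summary scoring 0.01 counts as "Positive". A commonly cited convention for VADER is to treat scores between -0.05 and 0.05 as neutral. The sketch below applies that convention; `label_from_compound` is a hypothetical helper, and the threshold is a tunable assumption, not something the notebook prescribes.

```python
def label_from_compound(score, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label."""
    if score >= threshold:
        return "Positive"
    if score <= -threshold:
        return "Negative"
    return "Neutral"

for score in (0.6, 0.01, -0.3):
    print(score, label_from_compound(score))
```

Swapping this in for the inline conditional would only change the labels of near-zero scores.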


## Build the Retrieval System using FAISS
We can implement a simple retrieval system that allows the user to input a search term and retrieve the most relevant news articles.
**FAISS** is used to efficiently search through the dataset of articles.

In [5]:
import faiss
from sentence_transformers import SentenceTransformer

def create_faiss_index(articles):
    model = SentenceTransformer('all-MiniLM-L6-v2')
    article_embeddings = model.encode([summary for _, summary, _ in articles])
    index = faiss.IndexFlatL2(article_embeddings.shape[1])  # L2 distance for similarity
    index.add(article_embeddings)
    return index, article_embeddings

def search_news(index, article_embeddings, query, articles):
    model = SentenceTransformer('all-MiniLM-L6-v2')
    query_embedding = model.encode([query])  # shape (1, dim): FAISS expects a 2-D float32 array
    k = min(3, index.ntotal)  # never request more neighbours than indexed articles
    D, I = index.search(query_embedding, k)  # Retrieve up to 3 articles
    return [articles[i] for i in I[0]]
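
To make the retrieval step less of a black box, the following sketch reproduces in plain NumPy what an exhaustive L2 index computes: the squared Euclidean distance from the query to every stored vector, followed by a sort. `flat_l2_search` is a hypothetical name, and the toy 2-D vectors are made up; real sentence embeddings have many more dimensions.

```python
import numpy as np

def flat_l2_search(embeddings, query, k=3):
    """Return (squared distances, indices) of the k nearest rows to `query`."""
    embeddings = np.asarray(embeddings, dtype="float32")
    query = np.asarray(query, dtype="float32")
    k = min(k, len(embeddings))  # cannot return more neighbours than rows
    # Squared L2 distance between the query and every row
    dists = ((embeddings - query) ** 2).sum(axis=1)
    order = np.argsort(dists)[:k]  # closest first
    return dists[order], order

# Made-up 2-D vectors standing in for article embeddings
toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
distances, indices = flat_l2_search(toy, [0.9, 0.1], k=2)
print(indices, distances)
```

For a handful of articles this brute-force version is enough; a FAISS index becomes worthwhile as the collection grows.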

## Build the Frontend with Streamlit

In [7]:
import streamlit as st

def display_app():
    st.title("News Summarizer and Sentiment Analyzer")
    api_key = st.text_input("Enter NewsAPI Key:", type="password")  # mask the key on screen
    query = st.text_input("Enter a search term (e.g., AI, Technology):")
    
    if api_key and query:
        news_articles = fetch_news(api_key, query=query, num_articles=5)
        st.write("### Retrieved News Articles")
        for title, desc, url in news_articles:
            st.write(f"**{title}**\n\n{desc}\n\n[Read More]({url})")  # blank lines so Markdown renders separate paragraphs
        
        if st.button("Summarize and Analyze"):
            summarized_articles = summarize_articles(news_articles)
            sentiment_scores = analyze_sentiment(summarized_articles)
            
            st.write("### Summarized Articles with Sentiment")
            for title, summary, sentiment, url in sentiment_scores:
                st.write(f"**Title**: {title}")
                st.write(f"**Summary**: {summary}")
                st.write(f"**Sentiment**: {sentiment}")
                st.write(f"[Read Full Article]({url})")

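## Running the Application through Terminal

Save the code above in `app.py` and add a call to `display_app()` at the end of the file (the function is defined but never invoked, so the page would otherwise stay empty), then run: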

```bash
streamlit run app.py
```