In [1]:
from dotenv import load_dotenv, dotenv_values
import os

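# Load environment variables (e.g. OPENAI_API_KEY) from the .env file
env_path = "C:/Users/SACHENDRA/Documents/Activeloop/.env"
config = dotenv_values(env_path)  # the same values as a dict, without touching os.environ
load_dotenv(env_path)  # returns True if at least one variable was set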

True
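
Before calling the model, it can help to confirm that the key actually reached the environment. This is a minimal sketch, assuming the .env file defines OPENAI_API_KEY (the variable ChatOpenAI reads by default). `require_env` is a hypothetical helper, not part of python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; check the path passed to load_dotenv")
    return value


# Usage (assumes the .env file defines OPENAI_API_KEY):
# api_key = require_env("OPENAI_API_KEY")
```

Failing here gives a clearer message than an authentication error raised later, when the model is first called.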

In [2]:
'''
Here are the steps in more detail:

1. Install required libraries: To get started, ensure you have the necessary libraries installed: requests, newspaper3k, and langchain.
2. Scrape articles: Use the requests library to scrape the content of the target news articles from their respective URLs.
3. Extract titles and text: Employ the newspaper library to parse the scraped HTML and extract the titles and text of the articles.
4. Preprocess the text: Clean and preprocess the extracted texts to make them suitable for input to ChatGPT.
5. Generate summaries: Utilize ChatGPT to summarize the extracted articles' text concisely.
6. Output the results: Present the summaries along with the original titles, allowing users to grasp the main points of each article quickly.
'''

!pip install -q newspaper3k python-dotenv

In [4]:
!pip install --upgrade lxml_html_clean

Collecting lxml_html_clean
  Downloading lxml_html_clean-0.1.1-py3-none-any.whl.metadata (1.5 kB)
Downloading lxml_html_clean-0.1.1-py3-none-any.whl (11 kB)
Installing collected packages: lxml_html_clean
Successfully installed lxml_html_clean-0.1.1


In [5]:
import requests
from newspaper import Article

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36'
}

article_url = "https://www.artificialintelligence-news.com/2022/01/25/meta-claims-new-ai-supercomputer-will-set-records/"

session = requests.Session()

try:
    response = session.get(article_url, headers=headers, timeout=10)
    
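    if response.status_code == 200:
        article = Article(article_url)
        # reuse the HTML already fetched above instead of downloading it a second time
        article.download(input_html=response.text)
        article.parse()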
        
        print(f"Title: {article.title}")
        print(f"Text: {article.text}")
        
    else:
        print(f"Failed to fetch article at {article_url}")
except Exception as e:
    print(f"Error occurred while fetching article at {article_url}: {e}")

Title: Meta claims its new AI supercomputer will set records
Text: Ryan Daws is a senior editor at TechForge Media, with a seasoned background spanning over a decade in tech journalism. His expertise lies in identifying the latest technological trends, dissecting complex topics, and weaving compelling narratives around the most cutting-edge developments. His articles and interviews with leading industry figures have gained him recognition as a key influencer by organisations such as Onalytica. Publications under his stewardship have since gained recognition from leading analyst houses like Forrester for their performance. Find him on X (@gadget_ry) or Mastodon (@gadgetry@techhub.social)

Meta (formerly Facebook) has unveiled an AI supercomputer that it claims will be the world’s fastest.

The supercomputer is called the AI Research SuperCluster (RSC) and is yet to be fully complete. However, Meta’s researchers have already begun using it for training large natural language processing (
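
Step 4 of the plan (preprocessing) is not shown in the cells above, and the extracted text begins with an author bio rather than the article body. The following is a minimal sketch of one possible cleanup, using only the standard library. `clean_article_text`, the `max_chars` cap and the `skip_prefixes` heuristic are all assumptions for illustration, not something newspaper3k provides.

```python
import re


def clean_article_text(text, max_chars=12000, skip_prefixes=()):
    """Normalize whitespace, drop paragraphs with unwanted prefixes, cap length."""
    # Split on blank lines, collapsing internal whitespace in each paragraph.
    paragraphs = [re.sub(r"\s+", " ", p).strip() for p in re.split(r"\n\s*\n", text)]
    paragraphs = [p for p in paragraphs if p]
    # Drop paragraphs that start with a known boilerplate prefix (e.g. an author bio).
    paragraphs = [p for p in paragraphs if not p.startswith(tuple(skip_prefixes))]
    cleaned = "\n\n".join(paragraphs)
    if len(cleaned) <= max_chars:
        return cleaned
    # Prefer cutting at the last paragraph break before the cap.
    cut = cleaned.rfind("\n\n", 0, max_chars)
    return cleaned[:cut] if cut > 0 else cleaned[:max_chars]
```

For this article, a call such as `clean_article_text(article.text, skip_prefixes=("Ryan Daws is",))` would remove the bio paragraph. The 12000-character default is an arbitrary placeholder and should be tuned to the model's context window.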

In [6]:
from langchain.schema import (
    HumanMessage
)

# we get the article data from the scraping part
article_title = article.title
article_text = article.text

# prepare template for prompt
template = """You are a very good assistant that summarizes online articles.

Here's the article you want to summarize.

==================
Title: {article_title}

{article_text}
==================

Write a summary of the previous article.
"""

prompt = template.format(article_title=article_title, article_text=article_text)

messages = [HumanMessage(content=prompt)]



In [7]:
from langchain.chat_models import ChatOpenAI

# load the model
chat = ChatOpenAI(model_name="gpt-4", temperature=0)

In [8]:
# generate summary
summary = chat(messages)
print(summary.content)

Meta, formerly known as Facebook, has announced the development of an AI supercomputer, the AI Research SuperCluster (RSC), which it claims will be the world's fastest upon completion in mid-2022. The RSC is already being used by Meta's researchers for training large natural language processing and computer vision models. The supercomputer is expected to be 20 times faster than Meta's current V100-based clusters and will be capable of training models with trillions of parameters. The RSC is designed with security and privacy controls to allow Meta to use real-world examples from its production systems in training. This will enable Meta to advance research for tasks such as identifying harmful content on its platforms using real data.


In [9]:
# if we want a bulleted summary instead

# prepare template for prompt
template = """You are an advanced AI assistant that summarizes online articles into bulleted lists.

Here's the article you need to summarize.

==================
Title: {article_title}

{article_text}
==================

Now, provide a summarized version of the article in a bulleted list format.
"""

# format prompt
prompt = template.format(article_title=article.title, article_text=article.text)

# generate summary
summary = chat([HumanMessage(content=prompt)])
print(summary.content)

- Meta (formerly Facebook) has announced the development of an AI supercomputer, the AI Research SuperCluster (RSC), which it claims will be the world's fastest.
- The RSC is not yet fully built, but is already being used by Meta's researchers for training large natural language processing (NLP) and computer vision models.
- The supercomputer is expected to be fully operational by mid-2022 and is designed to train models with trillions of parameters.
- Meta hopes that the RSC will help build new AI systems for real-time voice translations and other applications, ultimately contributing to the development of the metaverse.
- Once complete, RSC is expected to be 20 times faster than Meta's current V100-based clusters, 9 times faster at running the NVIDIA Collective Communication Library (NCCL), and 3 times faster at training large-scale NLP workflows.
- A model with tens of billions of parameters can finish training in three weeks with RSC, compared to nine weeks prior to its use.
- The 
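
The paragraph and bulleted prompts above differ only in their instructions, so the template could be built by a small function. This is a sketch under that assumption. `build_summary_prompt` and its `bulleted` and `language` parameters are hypothetical names, and the generated wording is only modelled on the templates above.

```python
def build_summary_prompt(title, text, bulleted=False, language=None):
    """Build a summarization prompt in the same layout as the templates above."""
    style = "a bulleted list" if bulleted else "a short paragraph"
    # Only mention the output language when one is requested.
    suffix = f", in {language}" if language else ""
    return (
        "You are an advanced AI assistant that summarizes online articles.\n\n"
        "Here's the article you need to summarize.\n\n"
        "==================\n"
        f"Title: {title}\n\n"
        f"{text}\n"
        "==================\n\n"
        f"Now, provide a summarized version of the article as {style}{suffix}.\n"
    )
```

The result can be passed to the model exactly as before, for example `chat([HumanMessage(content=build_summary_prompt(article.title, article.text, bulleted=True))])`.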

In [10]:
# if we want the summary in French

# prepare template for prompt
template = """You are an advanced AI assistant that summarizes online articles into bulleted lists in French.

Here's the article you need to summarize.

==================
Title: {article_title}

{article_text}
==================

Now, provide a summarized version of the article in a bulleted list format, in French.
"""

# format prompt
prompt = template.format(article_title=article.title, article_text=article.text)

# generate summary
summary = chat([HumanMessage(content=prompt)])
print(summary.content)

- Ryan Daws, rédacteur en chef chez TechForge Media, parle de la nouvelle superordinateur d'IA de Meta (anciennement Facebook).
- Meta a dévoilé un superordinateur d'IA, appelé AI Research SuperCluster (RSC), qu'elle prétend être le plus rapide au monde.
- Le RSC n'est pas encore entièrement construit, mais les chercheurs de Meta l'utilisent déjà pour former de grands modèles de traitement du langage naturel (NLP) et de vision par ordinateur.
- Le RSC devrait être entièrement construit à la mi-2022. Meta affirme qu'il sera le plus rapide du monde une fois terminé et qu'il sera capable de former des modèles avec des trillions de paramètres.
- Meta espère que le RSC aidera à construire de nouveaux systèmes d'IA qui pourront, par exemple, permettre des traductions vocales en temps réel pour de grands groupes de personnes parlant différentes langues.
- Pour la production, Meta prévoit que le RSC sera 20 fois plus rapide que ses clusters actuels basés sur V100. Le RSC est également estimé ê