First, let's load the environment variables from the .env file.


In [5]:
import json 
from dotenv import load_dotenv
load_dotenv(override=True)

True
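`load_dotenv` returning `True` only tells us that a .env file was found, not that it contains the key we need. A minimal sketch of an explicit check follows. It assumes the OpenAI key is stored under `OPENAI_API_KEY`, the variable the OpenAI client reads by default; `require_env` is a hypothetical helper, not part of python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        # Hypothetical helper: raise early instead of failing later inside the API call.
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv` surfaces a missing key before any request is made.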

Now let's scrape the article from the website. We will use the requests library to check that the page is reachable and the newspaper library to download and parse the article.

In [6]:
import requests
from newspaper import Article

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36'
}

article_url = "https://www.artificialintelligence-news.com/2022/01/25/meta-claims-new-ai-supercomputer-will-set-records/"

session = requests.Session()

try:
    response = session.get(article_url, headers=headers, timeout=10)
    
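    # This request only confirms that the page is reachable with our headers.
    if response.status_code == 200:
        # newspaper fetches the page again itself, then extracts the title and text.
        article = Article(article_url)
        article.download()
        article.parse()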
        
        print(f"Title: {article.title}")
        print(f"Text: {article.text}")
        
    else:
        print(f"Failed to fetch article at {article_url}")
except Exception as e:
    print(f"Error occurred while fetching article at {article_url}: {e}")

Title: Meta claims its new AI supercomputer will set records
Text: Ryan is a senior editor at TechForge Media with over a decade of experience covering the latest technology and interviewing leading industry figures. He can often be sighted at tech conferences with a strong coffee in one hand and a laptop in the other. If it's geeky, he’s probably into it. Find him on Twitter (@Gadget_Ry) or Mastodon (@gadgetry@techhub.social)

Meta (formerly Facebook) has unveiled an AI supercomputer that it claims will be the world’s fastest.

The supercomputer is called the AI Research SuperCluster (RSC) and is yet to be fully complete. However, Meta’s researchers have already begun using it for training large natural language processing (NLP) and computer vision models.

RSC is set to be fully built in mid-2022. Meta says that it will be the fastest in the world once complete and the aim is for it to be capable of training models with trillions of parameters.

“We hope RSC will help us build entire

Let's now create our chatbot. We will use OpenAI's GPT-3.5-turbo for this.

In [14]:
from langchain.schema import HumanMessage

title = article.title
text = article.text

# prepare the prompt template
template = """
You are a very good assistant that summarizes online articles.

Here's the article you want to summarize.

==================
Title: {article_title}

{article_text}
==================

Write a summary of the previous article.
"""

prompt = template.format(article_title=title, article_text=text)
# we have to use the HumanMessage class to send the prompt to the model
messages = [HumanMessage(content=prompt)]
print(messages)



In [15]:
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

Now let's use the chat model to summarize the article.

In [18]:
summary = chat(messages)
print(summary.content)

Meta, formerly known as Facebook, has introduced a new AI supercomputer called the AI Research SuperCluster (RSC) that is expected to be the fastest in the world once completed in mid-2022. The supercomputer will be capable of training models with trillions of parameters and aims to advance research for tasks such as real-time voice translations and identifying harmful content on Meta's platforms. RSC is projected to be 20x faster than Meta's current clusters and will significantly improve training times for large-scale NLP workflows. The supercomputer was designed with security and privacy controls in mind to allow Meta to use real-world examples from its production systems in training.


We can modify the prompt to produce the summary as bullet points. We can also ask the model to write the summary in a different language.

In [19]:
template = """You are an advanced AI assistant that summarizes online articles into bulleted lists.

Here's the article you need to summarize.

==================
Title: {article_title}

{article_text}
==================

Now, provide a summarized version of the article in a bulleted list format.
"""
prompt = template.format(article_title=title, article_text=text)

messages = [HumanMessage(content=prompt)]
print(messages)



In [20]:
print(chat(messages).content)

- Meta (formerly Facebook) has unveiled an AI supercomputer called the AI Research SuperCluster (RSC) that it claims will be the world's fastest.
- RSC is still under construction but is already being used for training large natural language processing (NLP) and computer vision models.
- Meta aims for RSC to be capable of training models with trillions of parameters and to power AI-driven applications in the metaverse.
- The supercomputer is expected to be 20x faster than Meta's current V100-based clusters and 9x faster at running the NVIDIA Collective Communication Library (NCCL).
- RSC will enable Meta to advance research on tasks like identifying harmful content on its platforms using real data from production systems.
- Meta designed RSC with security and privacy controls in mind to ensure the protection of real-world examples used in production training.
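The notebook does not show the different-language case, so here is a minimal sketch of how it could look. It reuses the bulleted-list template and adds a `language` placeholder; the function name `build_summary_prompt` and the default of French are assumptions made for illustration only.

```python
def build_summary_prompt(title: str, text: str, language: str = "French") -> str:
    """Fill the bulleted-list template and ask for the summary in a given language."""
    template = """You are an advanced AI assistant that summarizes online articles into bulleted lists.

Here's the article you need to summarize.

==================
Title: {article_title}

{article_text}
==================

Now, provide a summarized version of the article in a bulleted list format, in {language}.
"""
    # Only the template is parsed for placeholders, so braces in the article text are safe.
    return template.format(article_title=title, article_text=text, language=language)
```

The result can be wrapped exactly as before, for example `messages = [HumanMessage(content=build_summary_prompt(title, text))]`, and passed to `chat`.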


We can also instruct the chatbot to keep the summary within a specific length.

In [25]:
template = """
You are an advanced AI assistant that summarizes online articles.

Here's the article title you need to summarize.

{article_title}
and here is article text :
{article_text}

Now, provide a summarized version of the article in one paragraph. Don't include any unnecessary details. The summary shouldn't be longer than 200-250 words.

"""
prompt = template.format(article_title=title, article_text=text)
messages = [HumanMessage(content=prompt)]
print(messages)

[HumanMessage(content="\nYou are an advanced AI assistant that summarizes online articles.\n\nHere's the article title you need to summarize.\n\nMeta claims its new AI supercomputer will set records\nand here is article text :\nRyan is a senior editor at TechForge Media with over a decade of experience covering the latest technology and interviewing leading industry figures. He can often be sighted at tech conferences with a strong coffee in one hand and a laptop in the other. If it's geeky, he’s probably into it. Find him on Twitter (@Gadget_Ry) or Mastodon (@gadgetry@techhub.social)\n\nMeta (formerly Facebook) has unveiled an AI supercomputer that it claims will be the world’s fastest.\n\nThe supercomputer is called the AI Research SuperCluster (RSC) and is yet to be fully complete. However, Meta’s researchers have already begun using it for training large natural language processing (NLP) and computer vision models.\n\nRSC is set to be fully built in mid-2022. Meta says that it will

In [26]:
print(chat(messages).content)

Meta (formerly Facebook) has unveiled its new AI supercomputer, the AI Research SuperCluster (RSC), which is set to be the world's fastest once fully built in mid-2022. The RSC will be capable of training models with trillions of parameters and aims to advance AI systems for applications like real-time voice translations and AR games in the metaverse. Meta expects the RSC to be 20x faster than its current clusters, with improved performance in training large-scale NLP workflows. The supercomputer was designed with security and privacy controls in mind, allowing Meta to use real-world data from its platforms for research purposes, such as identifying harmful content. This marks a significant advancement in performance, reliability, security, and privacy at scale for AI research infrastructure.
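Language models do not always follow length instructions exactly, so it can be worth checking the result. The sketch below counts whitespace-separated words, which is only an approximation of how a reader would count them; `word_count` and `within_limit` are hypothetical helpers, and the 250-word default mirrors the upper bound in the prompt above.

```python
def word_count(summary: str) -> int:
    """Count whitespace-separated words in a summary."""
    return len(summary.split())


def within_limit(summary: str, max_words: int = 250) -> bool:
    """Return True if the summary does not exceed max_words."""
    return word_count(summary) <= max_words
```

For example, `within_limit(chat(messages).content)` reports whether the model stayed under the requested limit; if it did not, the prompt can be tightened and the request repeated.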
