In [2]:
# langchain 0.0.208 predates the openai 1.0 client, so keep openai below 1.0
!pip install -q langchain==0.0.208 "openai<1" newspaper3k python-dotenv

In [3]:
from dotenv import load_dotenv

!echo "OPENAI_API_KEY='insert your key here'" > .env

load_dotenv()

True
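
`load_dotenv()` returns `True` as soon as it finds a `.env` file, even if the file still holds the placeholder text. A minimal sketch of a sanity check follows; the helper name `api_key_is_set` is hypothetical, and the placeholder string is assumed to be the one written by the `echo` command above.

```python
import os
from typing import Mapping, Optional

# Placeholder written to .env by the echo command in the cell above
PLACEHOLDER = "insert your key here"

def api_key_is_set(env: Optional[Mapping[str, str]] = None,
                   name: str = "OPENAI_API_KEY") -> bool:
    """Return True if `name` holds a non-empty value other than the placeholder."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    return bool(value) and value != PLACEHOLDER

if not api_key_is_set():
    print("OPENAI_API_KEY is missing or still the placeholder; "
          "edit .env and call load_dotenv(override=True).")
```

`override=True` matters here because `load_dotenv()` leaves variables that are already set in the environment untouched by default.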

### Web scraping to extract contents of articles

In [6]:
import requests
from newspaper import Article

In [7]:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36'
}

article_url = "https://www.artificialintelligence-news.com/2022/01/25/meta-claims-new-ai-supercomputer-will-set-records/"

session = requests.Session()

try:
    # Fetch the page once, sending the browser-like headers
    response = session.get(article_url, headers=headers, timeout=10)

    if response.status_code == 200:
        article = Article(article_url)
        # Reuse the HTML fetched above instead of downloading the page a second time
        article.download(input_html=response.text)
        article.parse()

        print(f"Title: {article.title}")
        print(f"Text: {article.text}")

    else:
        print(f"Failed to fetch article at {article_url} (status {response.status_code})")
except Exception as e:
    print(f"Error occurred while fetching article at {article_url}: {e}")

Title: Meta claims its new AI supercomputer will set records
Text: Ryan Daws is a senior editor at TechForge Media, with a seasoned background spanning over a decade in tech journalism. His expertise lies in identifying the latest technological trends, dissecting complex topics, and weaving compelling narratives around the most cutting-edge developments. His articles and interviews with leading industry figures have gained him recognition as a key influencer by organisations such as Onalytica. Publications under his stewardship have since gained recognition from leading analyst houses like Forrester for their performance. Find him on X (@gadget_ry) or Mastodon (@gadgetry@techhub.social)

Meta (formerly Facebook) has unveiled an AI supercomputer that it claims will be the world’s fastest.

The supercomputer is called the AI Research SuperCluster (RSC) and is yet to be fully complete. However, Meta’s researchers have already begun using it for training large natural language processing (

### Import LangChain functions and set up ChatOpenAI instance

In [9]:
from langchain.schema import HumanMessage

# we get the article data from the scraping part
article_title = article.title
article_text = article.text

# prepare template for prompt
template = """You are a very good assistant that summarizes online articles.

Here's the article you want to summarize.

==================
Title: {article_title}

{article_text}
==================

Write a summary of the previous article.
"""

prompt = template.format(article_title=article_title, article_text=article_text)

messages = [HumanMessage(content=prompt)]
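
The prompt above embeds the full article text, so a very long article could exceed the model's context window. A minimal sketch of one way to guard against that is to keep whole paragraphs up to a character budget before formatting the prompt. The helper `truncate_text` and the default of 12000 characters are assumptions for illustration, not values taken from this notebook or from the OpenAI API.

```python
def truncate_text(text: str, max_chars: int = 12000) -> str:
    """Keep whole paragraphs from the start of `text` until `max_chars` is reached."""
    if len(text) <= max_chars:
        return text

    kept = []
    used = 0
    for paragraph in text.split("\n\n"):
        # The extra 2 characters account for the blank line that rejoins paragraphs
        extra = len(paragraph) + (2 if kept else 0)
        if used + extra > max_chars:
            break
        kept.append(paragraph)
        used += extra

    # Fall back to a hard cut if even the first paragraph is too long
    return "\n\n".join(kept) if kept else text[:max_chars]
```

It could be applied as `article_text = truncate_text(article.text)` before calling `template.format(...)`. A character budget is only a rough proxy for tokens; a tokenizer-based count would be more precise.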

In [10]:
from langchain.chat_models import ChatOpenAI

# load the model
chat = ChatOpenAI(model_name="gpt-4", temperature=0)

In [11]:
# generate summary
summary = chat(messages)
print(summary.content)

Meta, formerly known as Facebook, has announced the development of an AI supercomputer, the AI Research SuperCluster (RSC), which it claims will be the world's fastest upon completion in mid-2022. The RSC is already being used by Meta's researchers for training large natural language processing and computer vision models. The supercomputer is expected to be 20 times faster than Meta's current V100-based clusters and will be capable of training models with trillions of parameters. The RSC is designed with security and privacy controls to allow Meta to use real-world examples from its production systems in training. This will enable Meta to advance research for tasks such as identifying harmful content on its platforms using real data.

### Modifying the prompt to get a bulleted list

In [12]:
# prepare template for prompt
template = """You are an advanced AI assistant that summarizes online articles into bulleted lists.

Here's the article you need to summarize.

==================
Title: {article_title}

{article_text}
==================

Now, provide a summarized version of the article in a bulleted list format.
"""

# format prompt
prompt = template.format(article_title=article.title, article_text=article.text)

# generate summary
summary = chat([HumanMessage(content=prompt)])
print(summary.content)

- Meta (formerly Facebook) has announced the development of an AI supercomputer, the AI Research SuperCluster (RSC), which it claims will be the world's fastest.
- The RSC is not yet fully built, but is already being used by Meta's researchers for training large natural language processing (NLP) and computer vision models.
- The supercomputer is expected to be fully operational by mid-2022 and is designed to train models with trillions of parameters.
- Meta hopes that the RSC will enable the creation of new AI systems, such as real-time voice translations for large groups of people speaking different languages.
- The company also expects the RSC to contribute to the development of the metaverse, where AI-driven applications and products will play a significant role.
- In terms of production, the RSC is expected to be 20x faster than Meta's current V100-based clusters, 9x faster at running the NVIDIA Collective Communication Library (NCCL), and 3x faster at training large-scale NLP work

### Modifying the prompt to get a summary in French

In [13]:
# prepare template for prompt
template = """You are an advanced AI assistant that summarizes online articles into bulleted lists in French.

Here's the article you need to summarize.

==================
Title: {article_title}

{article_text}
==================

Now, provide a summarized version of the article in a bulleted list format, in French.
"""

# format prompt
prompt = template.format(article_title=article.title, article_text=article.text)

# generate summary
summary = chat([HumanMessage(content=prompt)])
print(summary.content)

- Meta (anciennement Facebook) a dévoilé un superordinateur d'IA qu'il prétend être le plus rapide du monde.
- Le superordinateur est appelé AI Research SuperCluster (RSC) et n'est pas encore entièrement terminé.
- Les chercheurs de Meta ont déjà commencé à l'utiliser pour former de grands modèles de traitement du langage naturel (NLP) et de vision par ordinateur.
- RSC devrait être entièrement construit à la mi-2022. Meta affirme qu'il sera le plus rapide du monde une fois terminé.
- L'objectif est qu'il soit capable de former des modèles avec des trillions de paramètres.
- Meta espère que RSC aidera à construire de nouveaux systèmes d'IA qui pourront, par exemple, permettre des traductions vocales en temps réel à de grands groupes de personnes parlant différentes langues.
- Pour la production, Meta s'attend à ce que RSC soit 20 fois plus rapide que les clusters actuels basés sur V100 de Meta.
- RSC est également estimé être 9 fois plus rapide pour exécuter la bibliothèque de communic
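
The bulleted-list and French templates differ only in the role sentence and the final instruction. As a sketch, they could be produced by one helper. The function `build_summary_prompt` and its `bulleted` and `language` parameters are hypothetical names introduced here; the wording is copied from the templates above.

```python
from typing import Optional

def build_summary_prompt(article_title: str, article_text: str,
                         bulleted: bool = False,
                         language: Optional[str] = None) -> str:
    """Assemble a summarization prompt with an optional list format and output language."""
    role = "You are an advanced AI assistant that summarizes online articles"
    task = "Now, provide a summarized version of the article"
    if bulleted:
        role += " into bulleted lists"
        task += " in a bulleted list format"
    if language:
        role += f" in {language}"
        task += f", in {language}"

    # Same layout as the templates above: role, delimited article, instruction
    return (
        f"{role}.\n\n"
        "Here's the article you need to summarize.\n\n"
        "==================\n"
        f"Title: {article_title}\n\n"
        f"{article_text}\n"
        "==================\n\n"
        f"{task}.\n"
    )
```

With the scraped article this would be used as `prompt = build_summary_prompt(article.title, article.text, bulleted=True, language="French")`, followed by `chat([HumanMessage(content=prompt)])` as before.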