<a href="https://colab.research.google.com/github/MUmairAB/NewsGPT-Using-LangChain-and-FastAPI/blob/main/NLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NewsGPT

NewsGPT is a web-based application that utilizes the **LangChain** and **FastAPI** frameworks to generate concise summaries of news articles.

## Instructions for Utilizing NewsGPT:

To use the application, supply the URL of the news article you want to condense. The application then uses a Large Language Model (LLM) to generate a summary of the article.

## Technical Specifications:

The model leverages an open-source Large Language Model (LLM) available on the HuggingFace Hub. The current version of the notebook utilizes the [BART-Large-CNN](https://huggingface.co/facebook/bart-large-cnn) LLM developed by **Facebook**. However, if you prefer to use **OpenAI's ChatGPT** and possess the requisite API key, that option is also available. The notebook includes the necessary code and guidance for making this switch.

The application retrieves the news article content from the provided link using LangChain. The LLM then generates the summary, and the result is served through FastAPI, a Python web framework for building APIs.


To obtain the LLM from HuggingFace you need an **Access Key**, which must stay out of any notebook you share on GitHub. To keep it private, the key is first stored in a Python file (`secret_key.py`), which is uploaded to the current Colab session. The file is then moved to a hidden folder, and its permissions are changed so that only the current user can read or write it. If you are replicating this code and won't be sharing it publicly, you may skip this part.
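
The notebook imports a variable named `API_key` from `secret_key.py`, so the file only needs to define that one name. A minimal sketch of its contents, assuming a placeholder value (the string shown is hypothetical and must be replaced with your own token):

```python
# secret_key.py
# Hypothetical placeholder: replace with your own HuggingFace access token.
# Keep this file out of version control.
API_key = "hf_your_token_here"
```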

In [1]:
#Upgrade pip
!pip install -q --upgrade pip

In [2]:
#Install the required packages.
# The -q flag keeps pip's output quiet.

#Install LangChain
!pip install -q langchain

#Install uvicorn
!pip install -q uvicorn

#Install FastAPI
!pip install -q fastapi

#Install pyngrok
!pip install pyngrok

#Install HuggingFace
!pip install -q huggingface_hub

In [3]:
#Create the files_upload session to upload the file secret_key.py
# in which the HuggingFace API key is stored
"""
from google.colab import files
files.upload()
"""

#The code above is commented out because I upload the secret_key.py file manually

'\nfrom google.colab import files\nfiles.upload()\n'

In [4]:
#Create the hidden ".key_directory" directory (-p avoids an error if it already exists)
!mkdir -p ~/.key_directory

#Move the secret_key.py file to this directory
!mv secret_key.py ~/.key_directory/

#Change the file access rights to the current user only
!chmod 600 ~/.key_directory/secret_key.py

In [5]:
#Change the current directory to the hidden directory
%cd ~/.key_directory

/root/.key_directory


In [6]:
#Extract the API key as API_key
from secret_key import API_key

In [7]:
#cd out of the current hidden directory to the parent directory
%cd ..

/root


In [8]:
#Create an Environment Variable named HUGGINGFACEHUB_API_TOKEN to store the API_key
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = API_key

In [9]:
#Load necessary libraries
from langchain import HuggingFaceHub
from langchain import PromptTemplate, LLMChain

#Declare the HuggingFace repo id
repo_id = "facebook/bart-large-cnn"

#Instantiate the LLM
llm = HuggingFaceHub(repo_id=repo_id)

I am currently using [BART-Large-CNN](https://huggingface.co/facebook/bart-large-cnn) by **Facebook**. If you would rather use OpenAI's ChatGPT and have an API key, store the key as an environment variable, then un-comment the code in the following cell to instantiate ChatGPT as the LLM.

In [10]:
#If you want to use OpenAI and have an API key, you can un-comment the following code
"""
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo-16k")
"""

'\nfrom langchain.chat_models import ChatOpenAI\nllm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo-16k")\n'
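
`ChatOpenAI` reads its key from the `OPENAI_API_KEY` environment variable by default. As a minimal sketch, a small check like the one below can fail early with a clear message if the variable has not been set. The helper name `require_env` is hypothetical and not part of the notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before creating the LLM.")
    return value


# Usage, once the key has been stored (for example via os.environ["OPENAI_API_KEY"] = ...):
# require_env("OPENAI_API_KEY")
```

Running the check before instantiating the model surfaces a missing key immediately, rather than on the first request to `/summary`.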

## Define FastAPI

In [11]:
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import HTMLResponse


#Instantiate the app
app = FastAPI()

#The following CORS middleware is necessary to run the FastAPI app in Colab.
# If you are running the app locally, you can comment it out.
app.add_middleware(
    CORSMiddleware,
    allow_origins=['*'],
    allow_credentials=True,
    allow_methods=['*'],
    allow_headers=['*'],
)

#Define the root function/landing page
@app.get("/")
async def root():
    return {"message": "Welcome to NewsGPT Website"}

#Define the summary page
@app.get("/summary")
async def summarizer(URL:str):
    #Load the necessary libraries
    from langchain.document_loaders import WebBaseLoader
    from langchain.chains.summarize import load_summarize_chain

    #Scrape the URL
    loader = WebBaseLoader(URL)
    docs = loader.load()

    #Instantiate the Chain Object
    chain = load_summarize_chain(llm, chain_type="stuff")

    #Use the chain on the scraped website
    result = chain.run(docs)

    return {"Summary of the news article":result}

In [12]:
#Run the FastAPI app on a local server and expose it through an ngrok tunnel
import nest_asyncio
from pyngrok import ngrok
import uvicorn

ngrok_tunnel = ngrok.connect(8000)
print('Public URL:', ngrok_tunnel.public_url)
nest_asyncio.apply()
uvicorn.run(app, port=8000)



INFO:     Started server process [336]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)


Public URL: https://ea78-35-194-207-7.ngrok.io
INFO:     39.59.22.204:0 - "GET / HTTP/1.1" 200 OK
INFO:     39.59.22.204:0 - "GET /favicon.ico HTTP/1.1" 404 Not Found
INFO:     39.59.22.204:0 - "GET / HTTP/1.1" 200 OK
INFO:     39.59.22.204:0 - "GET /summary?URL=https://www.theguardian.com/commentisfree/2023/aug/11/ai-tech-designers-tool-communities HTTP/1.1" 200 OK
INFO:     39.59.22.204:0 - "GET /summary?URL=https://www.dawn.com/news/1775013/climate-activists-block-dutch-motorway-in-major-protest HTTP/1.1" 200 OK
INFO:     39.59.22.204:0 - "GET /summary?URL=https://www.dawn.com/news/1774979/musks-x-sues-over-having-to-post-moderation-policies HTTP/1.1" 200 OK
INFO:     39.59.22.204:0 - "GET /summary?URL=https://www.theguardian.com/technology/2023/sep/10/china-troubles-could-upset-apples-cart-as-it-prepares-to-launch-the-iphone-15 HTTP/1.1" 200 OK
INFO:     39.59.22.204:0 - "GET /summary?URL=https://www.theguardian.com/world/2023/sep/10/chinas-good-for-marriage-womens-trend-ignites-so

INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [336]


The app is accessible through the **Public URL** for as long as the cell is running. This endpoint can be called from Python code in a local IDE or from any Command Line Interface (CLI) terminal such as Git Bash.
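
As an illustration of calling the endpoint from Python, here is a minimal sketch that uses only the standard library. The `PUBLIC_URL` value is a placeholder and the helper names are hypothetical. The `URL` query parameter and the `"Summary of the news article"` key come from the `/summary` route defined in the notebook.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical placeholder: replace with the Public URL printed by the notebook.
PUBLIC_URL = "https://your-tunnel.ngrok.io"


def build_summary_url(base_url: str, article_url: str) -> str:
    """Return the /summary request URL with the article link percent-encoded."""
    query = urlencode({"URL": article_url})
    return f"{base_url.rstrip('/')}/summary?{query}"


def fetch_summary(base_url: str, article_url: str, timeout: float = 120.0) -> str:
    """Call the running app and return the summary text from its JSON reply."""
    with urlopen(build_summary_url(base_url, article_url), timeout=timeout) as response:
        payload = json.load(response)
    return payload["Summary of the news article"]


# Usage (requires the notebook cell with uvicorn to be running):
# print(fetch_summary(PUBLIC_URL, "https://www.dawn.com/news/1775013/climate-activists-block-dutch-motorway-in-major-protest"))
```

Encoding the article link with `urlencode` keeps any `?` or `&` inside it from being read as part of the app's own query string.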