# Using LLM for Information Retrieval from Natural Disaster News

**Author: [Kshitiz Sahay](https://www.linkedin.com/in/k-kshitiz26/)**

**Blog: [Using LLM for Information Retrieval from Natural Disaster News](https://medium.com/@kshitiz.sahay26/using-llm-for-information-retrieval-from-news-articles-5bbfb16f5625)**

In this notebook, I have built a simple yet efficient end-to-end information retrieval tool for extracting data points from news articles related to natural disaster events, utilizing the **Falcon-7B-Instruct** model. These extracted data points could serve various downstream tasks, including:

*   Conducting impact analyses of natural disasters
*   Utilizing them as features in machine learning models to predict disaster risk
*   Creating new open or commercial datasets

Furthermore, I have extended this tool into a conversational chatbot using `langchain`, a popular framework for developing LLM applications. The chatbot stops out-of-context text generation and retains recent messages as memory, which keeps conversations coherent while extracting information from news articles. Such chatbots could be integrated with larger analytical systems and data warehouses that hold both structured and unstructured data, making it easier to extract insights from that data.

Although Falcon-40B-Instruct outperforms its 7B variant significantly, it necessitates powerful GPUs and extensive memory. Nonetheless, Falcon-7B-Instruct has performed impressively in accomplishing the given task. It remains a valuable option for fast prototyping, especially when dealing with limited compute resources.



---

Note that running this on a CPU is practically impossible. If running on Google Colab, go to **Runtime** > **Change runtime type**. Change **Hardware accelerator** to **GPU**. Change **GPU type** to **T4**. Change **Runtime shape** to **High-RAM**.

---

Let's get started.

### Checking GPU Memory Usage

In [None]:
!nvidia-smi

Mon Jul 24 16:10:59 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   61C    P8    10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### Installing Required Libraries

I have installed several libraries, each pinned to a specific version, to create a question-answering information retrieval tool using the Falcon-7B-Instruct model.

Side notes on some of the libraries used:

`transformers` will allow loading the Falcon-7B-Instruct model, its tokenizer, and creating a pipeline for inference.

`bitsandbytes` will allow loading the model in 8-bit.

`xformers` will allow faster inference.

`langchain` will allow creating a conversation chain and enable memory inclusion for the chatbot.

In [None]:
!pip install -Uqqq pip --progress-bar off
!pip install -qqq bitsandbytes==0.40.0 --progress-bar off
!pip install -qqq torch==2.0.1 --progress-bar off
!pip install -qqq transformers==4.30.0 --progress-bar off
!pip install -qqq accelerate==0.20.1 --progress-bar off
!pip install -qqq xformers==0.0.20 --progress-bar off
!pip install -qqq einops==0.6.1 --progress-bar off
!pip install -qqq langchain==0.0.233 --progress-bar off
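
Because the versions are pinned, it can help to confirm that the runtime actually has them before loading the model. The following is a minimal sketch using only the standard library; `PINNED` and `check_versions` are hypothetical names introduced here, and the version strings are copied from the install cell above. Local build suffixes such as `+cu118` are ignored in the comparison, which is an assumption about how the wheels are labeled.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the pip install cell.
PINNED = {
    "bitsandbytes": "0.40.0",
    "torch": "2.0.1",
    "transformers": "4.30.0",
    "accelerate": "0.20.1",
    "xformers": "0.0.20",
    "einops": "0.6.1",
    "langchain": "0.0.233",
}


def check_versions(pinned):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            # Drop a local build suffix such as "+cu118" before comparing.
            installed = version(name).split("+")[0]
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in check_versions(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```

If nothing is printed, every pinned package is present at the expected version.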


### Loading Required Libraries

Side notes on some of the imported classes:

`StoppingCriteria` and `StoppingCriteriaList` will allow detecting the start of out-of-context text generation and stopping its generation.

`PromptTemplate` will allow creating custom prompt templates for chatbot conversations.

`ConversationChain` will allow creating conversation chains using a prompt template and querying against the LLM.

`ConversationBufferWindowMemory` will allow holding recent messages as memory.

`BaseOutputParser` will allow cleaning the response to remove any unwanted tokens.
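
To make the idea of windowed memory concrete, here is a small standard-library sketch of a buffer that keeps only the last `k` exchanges. It illustrates the concept only and is not how `langchain` implements `ConversationBufferWindowMemory`; the `WindowMemory` class and the sample exchanges are hypothetical.

```python
from collections import deque


class WindowMemory:
    """Keep only the last k (human, ai) exchanges; older ones are dropped."""

    def __init__(self, k):
        # A deque with maxlen discards the oldest item automatically.
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        self.turns.append((human, ai))

    def as_history(self):
        # Render the retained turns in the "Human:/AI:" style used in prompts.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = WindowMemory(k=2)
memory.save("Which hurricane is this about?", "Hurricane Ian.")
memory.save("Where did it make landfall?", "Lee County, Florida.")
memory.save("What category was it?", "Category 4.")
print(memory.as_history())
```

With `k=2`, the first exchange is no longer part of the history once the third one is saved, which keeps the prompt short at the cost of forgetting older context.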

In [None]:
import re
import warnings
from typing import List

import pandas as pd
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList, pipeline
from langchain import PromptTemplate
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.llms import HuggingFacePipeline
from langchain.schema import BaseOutputParser
warnings.filterwarnings("ignore")

### Data Preparation

As I don't have access to an existing dataset on hurricane news, I have created a small sample dataset as a pandas `DataFrame` to quickly develop this prototype. The dataset consists of `news_title` and `news_body` columns. You can use your own dataset of news articles or any other unstructured text, provided it is loaded into a pandas `DataFrame`.

In [None]:
# Creating news title and body lists

news_title_1 = "First Hurricane of 2023 arrives ahead of schedule"

news_body_1 = """Hurricane Don is the first Hurricane of the 2023 season.

COLUMBIA, S.C. — Don will not go away; as it turns out, it has become the first hurricane of the 2023 Hurricane season.

The National Hurricane Center said Saturday afternoon that Don had reached hurricane status with wind speeds of 75 mph. The storm, which has been in the Northern Atlantic, is expected to stay out at sea, so no impacts are expected in the US.

This is a bit ahead of schedule for a typical season, as our average first hurricane typically forms in August. Even with El Nino creating an overall hostile environment for storm development, record Atlantic Ocean temperatures will provide plenty of energy for storms that do form.

Looking elsewhere, Tropical Invest 95-L has been given a medium chance by the NHC of developing into a tropical system in the coming week. The current setup should keep the area of low pressure well south. If anything develops, the storm's motion could change, but the chances of US impacts look relatively low at this point.

Looking past next week, the Climate Prediction Center is expecting elevated amounts of rain in the tropics. This could be a sign of somewhat favorable conditions for tropical development. Models are very spread on the outlooks ahead, with some tropical waves trying to develop in the coming weeks. This means that we will likely have to watch things over the next seven days to see how things trend, but as for a specific forecast, we will be unable to say for now."""

news_title_2 = "Hurricane Ian: Florida death toll rises as criticism mounts"

news_body_2 = """The death toll from Hurricane Ian has reportedly risen to nearly 100 in Florida as rescue personnel continue to search for survivors.

Officials in the US state have come under fire as critics allege residents in some hard-hit areas did not receive enough advance warning to evacuate.

More than half of the deaths recorded are in Lee County, where Ian made landfall as a Category 4 hurricane.

President Joe Biden is expected to visit Florida on Wednesday.

On Monday, Mr Biden visited Puerto Rico - which was struck by Hurricane Fiona just days before Ian struck Florida - where he promised $60m (£53m) in aid to help the US territory.

"We're going to make sure you get every single dollar promised," he said in the municipality of Ponce, parts of which were still without power.

According to the BBC's US partner network CBS, the hurricane's death toll in Florida climbed to at least 99 on Monday night. Another four deaths have been confirmed in North Carolina.

Florida officials said the latest death toll is at least 68. The figures differ as while local officials may report additional storm-related deaths, the medical examiner's officer is only attributing a death to the hurricane after an autopsy is performed.

The majority of the deaths - 55 - have been reported in Lee County, which includes the hard-hit areas Fort Myers, Sanibel and Pine Island, Sheriff Carmine Marceno said at a news conference.

Mr Marceno said access to Fort Myers Beach was being restricted to allow authorities to investigate deaths and to preserve potential crime scenes. He added that four arrests had been made after looting incidents were reported.

On Friday, Florida governor Ron DeSantis described the county as "ground zero" for the hurricane.

Confusion over death tolls is common in the wake of hurricanes. In 2020, for example, fewer than 20 deaths were reported from Hurricane Laura days after it made landfall in Louisiana - a figure which the National Hurricane Center later revised to 47.

While the death toll from Hurricane Ian already makes it one of the deadliest hurricanes in recent memory, it still pales in comparison to 2005's Hurricane Katrina, which killed more than 1,800 people, and 2017's Hurricane Maria, which killed nearly 3,000.

In the wake of the storm, officials in Lee County have faced questions about the timing of their evacuation order, which was issued on 27 September, less than 24 hours before Ian made landfall. Several other counties in the path of the incoming hurricane issued their own evacuation orders a day before.

Local officials, as well as Florida's Governor Ron DeSantis, have defended Lee County's preparations for the hurricane.

"Everyone wants to focus on a plan that might have been done differently," Mr Marceno said on Sunday. "I stand 100% with my county commissioners, my county manager. We did what we had to do at the exact same time. I wouldn't have changed anything."

A 2015 planning document on the official website of Lee County's government notes that "due to our large population and limited system, Southwest Florida is the hardest place in the country to evacuate in a disaster." The document adds that evacuation decision-making procedures consider "evacuation risks, the disruption to both the lives of our residents/visitors, businesses and the potential magnitude of the impending threat".

The death toll cited by Florida officials does not include at least 16 Cuban migrants who remain missing after their boat capsized off the state's coastline during the hurricane. Of the 27 people on board, nine were rescued by the US Coast Guard and two managed to swim ashore at Stock Island, near Key West, The bodies of two more who died have been recovered. The Coast Guard has suspended the search for those still missing.

Approximately 430,000 homes and businesses remain without power across the state, according to data from poweroutage.us.

The utility company with the largest number of outages, Florida Power & Light Co, said that the majority of customers will have their power restored by 7 October, but that storm damage has made some properties "unable to safely accept power".

While officials are still assessing the damage caused by the hurricane across the state, experts have warned that the economic cost could ultimately rise to tens of billions of dollars. So far, insurers have reported about $1.44bn (£1.28bn) in preliminary claims.

A preliminary forecast from data firm Enki Research published on 1 October estimated that total damages will amount to at least $66bn, but could rise as high as $75bn."""

news_title_3 = "Hurricane Fiona: Canada hit by 'historic, extreme event'"

news_body_3 = """Hundreds of thousands of people have been left without power, after Storm Fiona hit Canada's coastline.

Fiona was downgraded from a hurricane to a tropical storm on Friday.

But parts of three provinces experienced torrential rain and winds of up to 160km/h (99mph), with trees and powerlines felled and houses washed into the sea.

Prime Minister Justin Trudeau said the situation was critical, and promised to provide support through the army.

Officials have yet to share reports of fatalities or serious injuries, but authorities are dealing with extensive flooding.

In a briefing Mr Trudeau described Fiona as "a very powerful and dangerous storm" and said the army will be deployed to help with assessment and clean-up efforts. His government has already responded positively to a request by Nova Scotia authorities for assistance.

"If there is anything the federal government can do to help, we will be there," he said, adding that he would no longer travel to Japan to attend the funeral of former Prime Minister Shinzo Abe.

Tropical storm warnings were issued for the Atlantic provinces of Nova Scotia, Prince Edward Island, Newfoundland and New Brunswick, as well as in parts of Quebec.

The country's eastern region could receive up to 10in (25cm) of rain, increasing the risk of flash flooding.

In Nova Scotia, shelters were prepared in Halifax and Cape Breton for people to take cover ahead of the storm.

"We have been through these types of events before, but my fear is, not to this extent," said Amanda McDougall, mayor of Cape Breton Regional Municipality. "The impacts are going to be large, real and immediate."

In Port aux Basques, with a population of 4,067 on the southwest tip of Newfoundland, intense flooding saw some homes and office buildings washed out to sea, local journalist Rene Roy, told CBC. The area is under a state of emergency.

"This is hands down the most terrifying thing I've ever seen in my life," Mr Roy said.

He added that many homes were left as "a pile of rubble in the ocean right now", adding: "There is an apartment building that's literally gone. There are entire streets that are gone."

Officials later confirmed that at least 20 homes had been lost.

And the Royal Canadian Mounted Police said a woman was rescued after being "tossed into the water as her home collapsed" in the area. They said another report of a women being swept out from her basement had been received, but conditions remained too dangerous to conduct a search.

Power companies have warned that it could take days to restore electricity, as wind speeds remain too high to start work on downed power lines.

Severe hurricanes in Canada are rare, as storms lose their energy once they hit colder waters in the north and become post-tropical instead. But pressure in the region is predicted to be historically low as Storm Fiona hits, making way for a heavier storm.

Nova Scotia was last battered by a tropical cyclone in 2003 with Hurricane Juan, a category two storm that killed two people and heavily damaged structures and vegetation.

Meteorologist Bob Robichaud warned on Friday afternoon that Fiona will be bigger than Juan, and stronger than 2019's Hurricane Dorian, which also reached the shores of Nova Scotia.

Fiona had already wreaked havoc on Puerto Rico and the Dominican Republic earlier this week, with many still left without power or running water.

Florida also faces a hurricane threat as tropical storm Ian strengthened as it moved over the Caribbean on Saturday. It could approach Florida early next week as a major hurricane.

Ian's projected path takes it just south of Jamaica, over western Cuba and into Florida, the hurricane centre said.

Florida Keys and South Florida could be hit by heavy rains on Monday, according to forecasters.

Florida Governor Ron DeSantis declared a state of emergency on Friday, freeing up funding and emergency services in advance of the storm."""

news_title_4 = "Hurricane Ida: One million people in Louisiana without power"

news_body_4 = """Louisiana residents may be in the dark for weeks as officials take stock of the damage from Hurricane Ida.

Ida made landfall on Sunday with 150mph (240km/h) winds, the fifth strongest to ever hit the US mainland. About one million locals remain without power.

"It's going to be a difficult life for quite some time," said one local leader in the Greater New Orleans area.

About 5,000 National Guard members have been deployed to aid search and rescue.

In addition, more than 25,000 workers from around the country have mobilised to support power restoration in the state, according to CNN.

At least one person is dead after a tree fell on their home in Ascension Parish, in Louisiana's Baton Rouge area.

State and local officials have conceded that number is likely to grow as search and rescue efforts continue, but argued the city had largely "held the line".

"The systems we depended on to save lives and protect our city did just that and we are grateful, but there is so much more work to be done," said New Orleans Mayor LaToya Cantrell on Monday.

She urged residents who had already evacuated their homes to stay put and not return until power and communications have been restored.

As the slow-moving Ida continues to move inland, it has weakened to a tropical storm - but the National Hurricane Centre warned that heavy rain could still bring flooding to parts of Mississippi, Alabama and Florida.

Ida was previously deemed "life-threatening", drawing comparisons to Hurricane Katrina, a 2005 storm that had a path similar to Ida and killed 1,800 people.

But it seemed that New Orleans' flood defences, strengthened in Katrina's aftermath, have done their job. Governor John Bel Edwards said the levee systems had "performed magnificently" and none have thus far been breached.

"But the damage is still catastrophic," he acknowledged on Monday. "We are still in a life saving mode."

President Joe Biden has declared a major disaster in the state, releasing extra funds for rescue and recovery efforts.

On Monday, he pledged that the federal government would "stand with the people of the Gulf [Coast] for as long as it takes for you to recover".

Entergy, the largest power company in Louisiana, warned that it would take days - and likely weeks in the hardest hit areas - to restore electricity to the more than one million homes without power across the state.

Ida gathered strength over the warm waters of the Gulf of Mexico during the weekend. More than 90% of oil production there has been shut down as a result of the storm.

On Sunday, Ida made landfall south of New Orleans as a category four hurricane - meaning it would cause severe damage to buildings, trees and power lines. As it moves inland, Ida's winds have dropped to 95mph (153km/h), meaning it is now a category one storm.

There are still fears of storm surges along the coast - which could be as high as 16ft (4.8m), potentially submerging parts of the low-lying coastline."""

In [None]:
# Creating a Pandas DataFrame for news titles and bodies
news_df = pd.DataFrame({"news_title": [news_title_1, news_title_2, news_title_3, news_title_4], "news_body": [news_body_1, news_body_2, news_body_3, news_body_4]})

In [None]:
news_df

|   | news_title | news_body |
|---|---|---|
| 0 | First Hurricane of 2023 arrives ahead of schedule | Hurricane Don is the first Hurricane of the 20... |
| 1 | Hurricane Ian: Florida death toll rises as cri... | The death toll from Hurricane Ian has reported... |
| 2 | Hurricane Fiona: Canada hit by 'historic, extr... | Hundreds of thousands of people have been left... |
| 3 | Hurricane Ida: One million people in Louisiana... | Louisiana residents may be in the dark for wee... |


### Loading Model

In the following cell, I have loaded the Falcon-7B-Instruct model and its tokenizer from the Hugging Face model hub. The model is loaded on the GPU in 8-bit mode using the `load_in_8bit` and `device_map` parameters.

In [None]:
# Initializing model name/location on Hugging Face model hub
MODEL_NAME = "tiiuae/falcon-7b-instruct"

# Loading the model
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, trust_remote_code = True, load_in_8bit = True, device_map = "auto"
)
model = model.eval()

# Loading model tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)


Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

 and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so...


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [None]:
model

RWForCausalLM(
  (transformer): RWModel(
    (word_embeddings): Embedding(65024, 4544)
    (h): ModuleList(
      (0-31): 32 x DecoderLayer(
        (input_layernorm): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
        (self_attention): Attention(
          (maybe_rotary): RotaryEmbedding()
          (query_key_value): Linear8bitLt(in_features=4544, out_features=4672, bias=False)
          (dense): Linear8bitLt(in_features=4544, out_features=4544, bias=False)
          (attention_dropout): Dropout(p=0.0, inplace=False)
        )
        (mlp): MLP(
          (dense_h_to_4h): Linear8bitLt(in_features=4544, out_features=18176, bias=False)
          (act): GELU(approximate='none')
          (dense_4h_to_h): Linear8bitLt(in_features=18176, out_features=4544, bias=False)
        )
      )
    )
    (ln_f): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=4544, out_features=65024, bias=False)
)

### Checking Memory Usage

In [None]:
!nvidia-smi

Mon Jul 24 16:13:25 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   61C    P0    28W /  70W |   9073MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

After the model is downloaded and loaded, GPU memory usage reaches approximately 60% (9073 MiB of 15360 MiB).
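
As a rough cross-check, the weight memory can be estimated from the layer shapes printed for `RWForCausalLM`. This sketch assumes one byte per weight in the `Linear8bitLt` layers and two bytes (float16) for the unquantized embedding, `lm_head`, and `LayerNorm` parameters; both are assumptions, and the embedding and `lm_head` are counted separately even though they may share weights. The figure reported by `nvidia-smi` also includes overhead that this estimate ignores, so the two numbers are not expected to match.

```python
# Layer shapes copied from the printed RWForCausalLM module tree.
HIDDEN, VOCAB, N_LAYERS = 4544, 65024, 32
QKV_OUT, MLP_HIDDEN = 4672, 18176

# Linear8bitLt layers (assumption: 1 byte per weight, no bias).
int8_params = N_LAYERS * (
    HIDDEN * QKV_OUT        # query_key_value
    + HIDDEN * HIDDEN       # dense
    + HIDDEN * MLP_HIDDEN   # dense_h_to_4h
    + MLP_HIDDEN * HIDDEN   # dense_4h_to_h
)

# Unquantized modules (assumption: float16, 2 bytes per parameter).
fp16_params = (
    2 * VOCAB * HIDDEN               # word_embeddings and lm_head
    + (N_LAYERS + 1) * 2 * HIDDEN    # LayerNorm weight and bias, incl. ln_f
)

MIB = 1024 ** 2
weights_mib = (int8_params * 1 + fp16_params * 2) / MIB

# Values read from the nvidia-smi output above.
used_mib, total_mib = 9073, 15360
used_fraction = used_mib / total_mib

print(f"Estimated weight memory: {weights_mib:.0f} MiB")
print(f"Reported GPU usage: {used_mib} MiB ({used_fraction:.0%} of {total_mib} MiB)")
```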

### Creating Generation Configuration Parameters
In the following cell, I have loaded the default generation configurations and customized them for precise responses.

In [None]:
generation_config = model.generation_config
generation_config.temperature = 0
generation_config.num_return_sequences = 1
generation_config.max_new_tokens = 256
generation_config.use_cache = False
generation_config.repetition_penalty = 1.7
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id

In [None]:
generation_config

GenerationConfig {
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 11,
  "max_new_tokens": 256,
  "pad_token_id": 11,
  "repetition_penalty": 1.7,
  "temperature": 0,
  "transformers_version": "4.30.0",
  "use_cache": false
}
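
To give some intuition for `repetition_penalty`, the sketch below applies the usual rule on a toy vocabulary: logits of tokens that were already generated are divided by the penalty when positive and multiplied by it when negative. To my understanding this mirrors what the `transformers` repetition penalty does, but the code is a pure-Python illustration, not the library's implementation. The logits and token ids are made up. Since `do_sample` is left at its default here, decoding should be greedy, so the sketch simply picks the largest logit.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Down-weight every token id that was already generated.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so a penalty > 1 always makes a repeat less likely.
    """
    penalized = list(logits)
    for token_id in set(generated_ids):
        score = penalized[token_id]
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized


def greedy_pick(logits):
    """Greedy decoding: return the index of the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [2.0, 1.5, -0.5, 0.1]   # toy scores for a 4-token vocabulary
generated = [0, 2]               # token ids already present in the sequence

penalized = apply_repetition_penalty(logits, generated, penalty=1.7)
print("without penalty:", greedy_pick(logits))
print("with penalty:   ", greedy_pick(penalized))
```

In this toy case the penalty is strong enough to move the greedy choice away from a token that was already produced, which is the effect a value of 1.7 is meant to have on repeated phrases.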

In [None]:
model.config

RWConfig {
  "_name_or_path": "tiiuae/falcon-7b-instruct",
  "alibi": false,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "RWForCausalLM"
  ],
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "tiiuae/falcon-7b-instruct--configuration_RW.RWConfig",
    "AutoModelForCausalLM": "tiiuae/falcon-7b-instruct--modelling_RW.RWForCausalLM"
  },
  "bias": false,
  "bos_token_id": 11,
  "eos_token_id": 11,
  "hidden_dropout": 0.0,
  "hidden_size": 4544,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "RefinedWebModel",
  "multi_query": true,
  "n_head": 71,
  "n_layer": 32,
  "parallel_attn": true,
  "quantization_config": {
    "bnb_4bit_compute_dtype": "float32",
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": false,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": false,
    "load_i

The `load_in_8bit` parameter shows the model is loaded in 8-bit mode.

### Creating a Custom Prompt Template
In the following cell, I have created a custom prompt template with instructions to query any news article provided as a string. The Falcon-7B-Instruct model is quite flexible and doesn't require predefined prompts; any custom prompt with clear instructions will suffice for querying.

Before creating the pipeline to query the hurricane news dataset, I conducted a test using the custom prompt template on a recent news article about an earthquake to check if the model responses were correct.

In [None]:
# Static pre-prompt to define bot behavior
PRE_PROMPT = """
A news article is provided below by a human. You are an intelligent AI that can answer questions based on the news article. Keep your answers short and concise. Do not generate any new information that is not present in the news article. If you don't know the answer, respond "I don't know."

News article:\n
"""

In [None]:
# News article for knowledge extraction
NEWS = """
An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS).

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres.

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted.

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning.

The earthquake occurred at 6.56 am. According to the NCS, the earthquake occurred at a depth of 5 kilometres.

"Earthquake of Magnitude:3.3, Occurred on 22-07-2023, 06:56:08 IST, Lat: 27.44 & Long: 92.51, Depth: 5 Km, Location: Tawang, Arunachal Pradesh, India," the NCS tweeted.
"""

In [None]:
# Prompt template generation function
def prompt_template_generator(PRE_PROMPT, NEWS, QUESTION):
  # Creating the prompt template
  prompt = f"""
{PRE_PROMPT}
{NEWS}

Q-A with user:

Human: {QUESTION}
AI:
""".strip()

  return prompt

# Prompt tokenizer function
def prompt_tokenizer(prompt):
  # Tokenizing prompt into token IDs
  input_ids = tokenizer(prompt, return_tensors="pt").input_ids
  # Storing token IDs in the CUDA device
  input_ids = input_ids.to(model.device)
  return input_ids

# Inference function
def inference(input_ids, generation_config):
  with torch.inference_mode():
    outputs = model.generate(input_ids = input_ids, generation_config = generation_config)
  # De-tokenizing response and skipping special tokens
  response = tokenizer.decode(outputs[0], skip_special_tokens = True)
  return response

In [None]:
# User question
QUESTION = "What was the location of the earthquake?"

In [None]:
prompt = prompt_template_generator(PRE_PROMPT, NEWS, QUESTION)

In [None]:
print(prompt)

A news article is provided below by a human. You are an intelligent AI that can answer questions based on the news article. Keep your answers short and concise. Do not generate any new information that is not present in the news article. If you don't know the answer, respond "I don't know."

News article:



An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The earthquake occurred at 6.56 am. According to the NCS, the earthquake occurred at a depth of 5 kilome

In [None]:
# Tokenizing the prompt and generating model inference
response = inference(prompt_tokenizer(prompt), generation_config)

In [None]:
print(response)

A news article is provided below by a human. You are an intelligent AI that can answer questions based on the news article. Keep your answers short and concise. Do not generate any new information that is not present in the news article. If you don't know the answer, respond "I don't know."

News article:



An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The earthquake occurred at 6.56 am. According to the NCS, the earthquake occurred at a depth of 5 kilome

The model correctly identified the region hit by the earthquake. Impressive!

In [None]:
# Another query to test the model output
response = inference(prompt_tokenizer(prompt_template_generator(PRE_PROMPT, NEWS, "How strong was the earthquake?")), generation_config)

In [None]:
print(response)

A news article is provided below by a human. You are an intelligent AI that can answer questions based on the news article. Keep your answers short and concise. Do not generate any new information that is not present in the news article. If you don't know the answer, respond "I don't know."

News article:



An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The earthquake occurred at 6.56 am. According to the NCS, the earthquake occurred at a depth of 5 kilome

Once again, the model correctly answered how strong the earthquake was. However, due to the mention of two earthquakes in the news article, the model provided magnitudes for both instances.

The model response also contains an unwanted token "User", which can be cleaned using `StoppingCriteria` from the `transformers` library and `BaseOutputParser` from the `langchain` library.
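
Independently of those two classes, the cleaning idea itself is simple: drop the echoed prompt and cut the text at the first marker of a hallucinated next turn. Here is a minimal sketch, where `extract_answer`, the marker list, and the example response are all hypothetical (the actual model output is truncated above, so the response string is invented for illustration).

```python
import re


def extract_answer(prompt, response, stop_markers=("User", "Human:", "AI:")):
    """Return only the newly generated text, cut at the first stop marker."""
    # The decoded output echoes the prompt, so drop it when it is a prefix.
    answer = response[len(prompt):] if response.startswith(prompt) else response
    # Cut at the earliest marker that signals a hallucinated next turn.
    pattern = "|".join(re.escape(marker) for marker in stop_markers)
    answer = re.split(pattern, answer, maxsplit=1)[0]
    return answer.strip()


# Invented prompt/response pair, for illustration only.
prompt = "Human: How strong was the earthquake?\nAI:"
response = prompt + " Magnitude 3.2 in Pithoragarh and 3.3 in Tawang.\nUser"
print(extract_answer(prompt, response))
```

Post-processing like this only hides the extra tokens after they are generated; stopping generation early additionally saves the compute spent on producing them.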

### Stopping Text Generation

In the following cell, I have extended the abstract base class `StoppingCriteria` into a new class `CustomStoppingCriteria` to stop the model from hallucinating questions and conversations when the stopping criteria are met. This controls the output and prevents the model from generating off-topic and irrelevant text.

In [None]:
# Creating a custom stopping criteria class from the StoppingCriteria abstract base class in the Transformers library
class CustomStoppingCriteria(StoppingCriteria):
  def __init__(self, tokens: List[List[str]], tokenizer: AutoTokenizer, device: torch.device):
    # Converting tokens to token IDs
    stop_token_ids = [tokenizer.convert_tokens_to_ids(t) for t in tokens]
    # Converting token IDs to long tensors and placing them on the same device as the model
    self.stop_token_ids = [torch.tensor(x, dtype = torch.long, device = device) for x in stop_token_ids]

  # Stop generation if the last few generated token IDs match any of the stop sequences
  def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
    for stop_ids in self.stop_token_ids:
      if torch.eq(input_ids[0][-len(stop_ids) :], stop_ids).all():
        return True
    # Return False only after every stop sequence has been checked
    return False

In [None]:
# Initializing stop tokens to detect
stop_tokens = [["Human", ":"], ["AI", ":"]]
stopping_criteria = StoppingCriteriaList([CustomStoppingCriteria(stop_tokens, tokenizer, model.device)])
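
The check that `__call__` is meant to perform boils down to a suffix comparison: generation stops once the most recent token IDs equal any one of the stop sequences. The sketch below reproduces that logic with plain Python lists so it can be run without a GPU or a model. The function name `ends_with_stop_sequence` and the token IDs are made up for illustration and are not the real Falcon vocabulary IDs.

```python
from typing import List, Sequence


def ends_with_stop_sequence(generated_ids: Sequence[int], stop_sequences: List[List[int]]) -> bool:
    """Return True if generated_ids ends with any of the stop sequences."""
    for stop_ids in stop_sequences:
        # Skip empty sequences and sequences longer than what has been generated so far
        if not stop_ids or len(stop_ids) > len(generated_ids):
            continue
        if list(generated_ids[-len(stop_ids):]) == list(stop_ids):
            return True
    # Only report "no match" after every stop sequence has been checked
    return False


# Hypothetical token IDs: pretend 101 = "Human", 102 = "AI", 58 = ":"
stop_sequences = [[101, 58], [102, 58]]

print(ends_with_stop_sequence([7, 8, 9, 102, 58], stop_sequences))  # sequence ends with "AI" ":"
print(ends_with_stop_sequence([7, 8, 9, 102], stop_sequences))      # no trailing ":" yet
```

Note that the second stop sequence only has a chance to match because the function keeps looping after the first one fails.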

### Final Pipeline
In the following cell, I have created a text generation pipeline that accepts a model, model tokenizer, stopping criteria, custom configuration for text generation, and the prompt to generate responses.

In [None]:
generation_pipeline = pipeline(
    model = model,
    tokenizer = tokenizer,
    return_full_text = True,
    task = "text-generation",
    stopping_criteria = stopping_criteria,
    generation_config = generation_config
)

llm = HuggingFacePipeline(pipeline = generation_pipeline)

The model 'RWForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForC

In [None]:
response = llm(prompt)

In [None]:
print(response)

 The earthquake occurred in Pithoragarh, Uttarakhand, India.
User 


In [None]:
prompt = prompt_template_generator(PRE_PROMPT, NEWS, "What was the magnitude of the earthquake?")
response = llm(prompt)

In [None]:
print(response)

 The magnitude of the earthquake was 3.2 and 3.3 on the Richter scale.
User 


Both `inference()` and `llm()` can be used to query the model and receive responses.

The unwanted token "User" is still attached to the response. The `BaseOutputParser` class from `langchain` can help remove it.
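
Before wiring up an output parser, the cleanup itself can be prototyped as a plain function. The sketch below uses only the standard `re` module. The helper name `strip_speaker_tokens` and the list of speaker labels are assumptions based on the responses shown above, not part of any library.

```python
import re

# Speaker labels that may leak into the end of a response (assumed from the outputs above)
SPEAKER_PATTERN = re.compile(r"\n(?:User|Human:|AI:)\s*$")


def strip_speaker_tokens(text: str) -> str:
    """Remove a trailing speaker label such as "User" and any surrounding whitespace."""
    return SPEAKER_PATTERN.sub("", text).strip()


raw = " The earthquake occurred in Pithoragarh, Uttarakhand, India.\nUser "
print(strip_speaker_tokens(raw))
```

Anchoring the pattern to the end of the string keeps the function from touching a legitimate occurrence of the word "User" in the middle of an answer.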

### Information Retrieval from the Hurricane News Dataset

In the subsequent code cells, I utilized my dummy hurricane news dataset and generation pipeline to extract crucial data points, including the hurricane name, hurricane category, and the regions affected. These extracted data points could be extended and utilized in various downstream tasks, such as impact analysis, machine learning modeling for predictive purposes, or even creating new public or commercial datasets.

In [None]:
# Static pre-prompt to define bot behavior
PRE_PROMPT = """
You are an intelligent AI that can answer questions based on the news article on hurricane provided by the user. The questions will be asked regarding a set of pre-defined topics. You have to generate the response based on the topics.

Here are the list of topics and the expected response:
1. Hurricane Name: Extract the name of the hurricane from the news article. If there are no hurricanes mentioned, respond "NA". If the news article is unrelated to hurricane, respond "Irrelevant News".
2. Hurricane Location: Extract the location where the hurricane hit. If there are no locations that were hit by hurricane, respond "NA". If the news article is unrelated to hurricane, respond "Irrelevant News".
3. Hurricane Category: Extract the category of the hurricane. The answer would be either Category 1, Category 2, Category 3, Category 4, or Category 5. If no category is mentioned, respond "NA". If the news article is unrelated to hurricane, respond "Irrelevant News".

Do not generate any new information which is not available in the news article. If the user asks a question outside the scope of the news article, respond "I don't know."

News article:\n
"""

In [None]:
# Function to extract hurricane name from the news body
def extract_hurricane_name(news_body):
  NEWS = news_body
  # User question
  QUESTION = "What is the name of hurricane?"
  # Generate prompt template
  prompt = prompt_template_generator(PRE_PROMPT, NEWS, QUESTION)
  # Generate response
  response = llm(prompt)
  return response

In [None]:
# Function to extract hurricane location from the news body
def extract_hurricane_location(news_body):
  NEWS = news_body
  # User question
  QUESTION = "Which region was affected by the hurricane?"
  # Generate prompt template
  prompt = prompt_template_generator(PRE_PROMPT, NEWS, QUESTION)
  # Generate response
  response = llm(prompt)
  return response

In [None]:
# Function to extract hurricane category from the news body
def extract_hurricane_category(news_body):
  NEWS = news_body
  # User question
  QUESTION = "What is the category of the hurricane?"
  # Generate prompt template
  prompt = prompt_template_generator(PRE_PROMPT, NEWS, QUESTION)
  # Generate response
  response = llm(prompt)
  return response

In [None]:
# Extracting data points from the news body
news_df['hurricane_name'] = news_df['news_body'].apply(lambda x: extract_hurricane_name(x))
news_df['hurricane_location'] = news_df['news_body'].apply(lambda x: extract_hurricane_location(x))
news_df['hurricane_category'] = news_df['news_body'].apply(lambda x: extract_hurricane_category(x))
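
The three extraction functions differ only in the question they ask, so they could be collapsed into a single helper. The sketch below is one possible refactor, not the notebook's actual code: `make_extractor`, `build_prompt`, and `fake_llm` are hypothetical names, and `fake_llm` is a stub so that the example runs without loading Falcon. In the notebook, `llm` and `prompt_template_generator` would take their place.

```python
from typing import Callable, Dict

# Questions taken from the three functions above, keyed by output column name
QUESTIONS: Dict[str, str] = {
    "hurricane_name": "What is the name of hurricane?",
    "hurricane_location": "Which region was affected by the hurricane?",
    "hurricane_category": "What is the category of the hurricane?",
}


def build_prompt(pre_prompt: str, news: str, question: str) -> str:
    # Hypothetical stand-in for prompt_template_generator(); the real format may differ
    return f"{pre_prompt}{news}\n\nQuestion: {question}\nAnswer:"


def make_extractor(llm: Callable[[str], str], pre_prompt: str, question: str) -> Callable[[str], str]:
    """Return a function that asks one fixed question about a news body."""
    def extract(news_body: str) -> str:
        return llm(build_prompt(pre_prompt, news_body, question))
    return extract


def fake_llm(prompt: str) -> str:
    # Stub model: echoes the question line so the example runs without a GPU
    question_line = [line for line in prompt.splitlines() if line.startswith("Question: ")][-1]
    return f"(model answer to: {question_line[len('Question: '):]})"


extractors = {column: make_extractor(fake_llm, "News article:\n", q) for column, q in QUESTIONS.items()}
print(extractors["hurricane_category"]("Ian made landfall as a Category 4 hurricane."))
```

With a pandas DataFrame, each extractor could then be passed directly to `news_df['news_body'].apply(...)`, one column at a time.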

In [None]:
# Cleaning model response
news_df['hurricane_name'] = news_df['hurricane_name'].str.replace('\nUser', '')
news_df['hurricane_location'] = news_df['hurricane_location'].str.replace('\nUser', '')
news_df['hurricane_category'] = news_df['hurricane_category'].str.replace('\nUser', '')

In [None]:
news_df

| Unnamed: 0 | news_title | news_body | hurricane_name | hurricane_location | hurricane_category |
|---|---|---|---|---|---|
| 0 | First Hurricane of 2023 arrives ahead of schedule | Hurricane Don is the first Hurricane of the 20... | Hurricane Don | The hurricane impacted the coastal areas of t... | Category 1 |
| 1 | Hurricane Ian: Florida death toll rises as cri... | The death toll from Hurricane Ian has reported... | Hurricane Ian | The hurricane affected the state of Florida. | The category of the hurricane is Category 1. |
| 2 | Hurricane Fiona: Canada hit by 'historic, extr... | Hundreds of thousands of people have been left... | Hurricane Fiona | The hurricane affected the Atlantic provinces... | The category of the hurricane is Category 1. |
| 3 | Hurricane Ida: One million people in Louisiana... | Louisiana residents may be in the dark for wee... | Hurricane Ida | The hurricane affected the Gulf Coast region ... | Category 1 |


The model successfully extracted the `hurricane_name` and `hurricane_location` entities from all four news articles. However, it labeled every hurricane as Category 1, which is incorrect: the Hurricane Ian article, for example, states that Ian made landfall as a Category 4 hurricane. The category answers are also inconsistent in form, with some returned as a bare label and others as a full sentence.

This could likely be fixed by customizing `PRE_PROMPT` with clearer instructions and output format examples, guiding the model toward more accurate results.

Furthermore, we can instruct the model to act as a Named Entity Recognition (NER) system that simultaneously extracts various entities such as hurricane name, location, category, number of deaths, number of injuries, or financial loss.
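
As a sketch of what "output format examples" could look like for the category question, the snippet below assembles an instruction block with two worked examples. Everything in it is an assumption for illustration: the helper name `build_category_instructions`, the wording of the instructions, and the example snippets, which are made up and not taken from the dataset. Whether this actually improves Falcon-7B-Instruct's answers would need to be checked empirically.

```python
from typing import List, Tuple

# Made-up snippets used purely as format demonstrations; they are not from the dataset
FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("The storm strengthened into a Category 3 hurricane before landfall.", "Category 3"),
    ("Officials urged residents to evacuate ahead of the hurricane.", "NA"),
]


def build_category_instructions(examples: List[Tuple[str, str]]) -> str:
    """Assemble instructions plus worked examples for the hurricane category question."""
    lines = [
        "Extract the hurricane category from the news article.",
        'Answer with exactly one of: "Category 1", "Category 2", "Category 3", '
        '"Category 4", "Category 5", or "NA" if no category is mentioned.',
        "Do not write a full sentence.",
        "",
        "Examples:",
    ]
    for snippet, answer in examples:
        lines.append(f"Article: {snippet}")
        lines.append(f"Category: {answer}")
        lines.append("")
    return "\n".join(lines)


print(build_category_instructions(FEW_SHOT_EXAMPLES))
```

The resulting text could be placed in `PRE_PROMPT` ahead of the `News article:` marker.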

### Sample Named Entity Recognition Use-Case

In [None]:
# Static pre-prompt to define bot behavior
PRE_PROMPT = """
You are an intelligent Named Entity Recognition (NER) system. From the below news article on hurricane disaster, extract hurricane name, location, hurricane category, and number of deaths in the following output format:

Do not generate any new information which is not available in the news article. If the user asks a question outside the scope of the news article, respond "I don't know."

News article:\n
"""
#### Prompt that did not work ####

# Entity definition:
# 1. NAME: Extract the name of the hurricane from the news article. If there are no hurricanes mentioned, respond "NA". If the news article is unrelated to hurricane, respond "Irrelevant News".
# 2. LOCATION: Extract the locations impacted by the hurricane in the news article. If there are no locations that were hit by hurricane, respond "NA". If the news article is unrelated to hurricane, respond "Irrelevant News".
# 3. CATEGORY: Extract the category of the hurricane mentioned in the news article. If no category is mentioned, respond "NA". If the news article is unrelated to hurricane, respond "Irrelevant News".
# 4. DEATHS: Extract the number of deaths due to the hurricane mentioned in the news article. If there are no deaths mentioned, respond "0". If the news article is unrelated to hurricane, respond "Irrelevant News".

# Output format:
# {{'NAME': [list of entities present], 'LOCATION': [list of entities present], 'CATEGORY': [list of entities present],'DEATHS': [list of entities present]}}

In [None]:
# Selecting a news body from the hurricane news dataset
NEWS = news_df['news_body'][1]

In [None]:
# User question
QUESTION = "What is the hurricane name, location impacted by the hurricane, hurricane category, and number of deaths?"

In [None]:
prompt = prompt_template_generator(PRE_PROMPT, NEWS, QUESTION)

In [None]:
print(prompt)

You are an intelligent Named Entity Recognition (NER) system. From the below news article on hurricane disaster, extract hurricane name, location, hurricane category, and number of deaths in the following output format:

Do not generate any new information which is not available in the news article. If the user asks a question outside the scope of the news article, respond "I don't know."

News article:


The death toll from Hurricane Ian has reportedly risen to nearly 100 in Florida as rescue personnel continue to search for survivors.


More than half of the deaths recorded are in Lee County, where Ian made landfall as a Category 4 hurricane.

President Joe Biden is expected to visit Florida on Wednesday.

On Monday, Mr Biden visited Puerto Rico - which was struck by Hurricane Fiona just days before Ian struck Florida - where he promised $60m (£53m) in aid to help the US territory.

"We're going to make sure you get every single dollar promised," he said in the municipality of Ponce,

In [None]:
response = llm(prompt)

In [None]:
# NER output
print(response)

 The hurricane name is Hurricane Ian, the location impacted by the hurricane is Florida, and the hurricane category is Category 4. As of now, there are 99 deaths reported in Florida.


After conducting multiple tests using various prompts, the model demonstrated the ability to accurately recognize the specified entities in a single attempt. To enhance the output format, adjustments can be made to display only the identified entities while omitting other unnecessary information.
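
One way to keep only the identified entities is to post-process the free-text answer. The sketch below does this with regular expressions tuned to the phrasing of the response shown above. The `PATTERNS` table and the `parse_entities` helper are hypothetical, and the patterns are assumptions that would likely need adjusting for differently worded answers.

```python
import re
from typing import Dict, Optional

# Patterns tailored to the phrasing of the sample response (an assumption, not a general solution)
PATTERNS = {
    "NAME": re.compile(r"\b(Hurricane [A-Z][a-z]+)"),
    "LOCATION": re.compile(r"location impacted by the hurricane is ([A-Z][\w ]*?)[,.]"),
    "CATEGORY": re.compile(r"\b(Category [1-5])\b"),
    "DEATHS": re.compile(r"\b(\d+) deaths\b"),
}


def parse_entities(response: str) -> Dict[str, Optional[str]]:
    """Return the first match for each entity label, or None when nothing matches."""
    entities: Dict[str, Optional[str]] = {}
    for label, pattern in PATTERNS.items():
        match = pattern.search(response)
        entities[label] = match.group(1) if match else None
    return entities


response_text = (
    " The hurricane name is Hurricane Ian, the location impacted by the hurricane is Florida, "
    "and the hurricane category is Category 4. As of now, there are 99 deaths reported in Florida."
)
print(parse_entities(response_text))
```

Regex post-processing is brittle by nature; prompting the model for a fixed output format and validating it would be a sturdier alternative if the model follows such instructions reliably.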

### Conversational Chatbot with Memory

In the subsequent cells, I have utilized the `langchain` framework to interact with the LLM in a conversational manner, much like a chatbot such as ChatGPT. This conversational chatbot could be used to ask questions regarding news articles or any unstructured text in a general context. Additionally, it can be instructed to adopt a persona, such as a recruiter or a developer, and provide relevant solutions to queries based on those predefined characteristics.

To maintain contextual coherence during conversations, I have utilized the `ConversationBufferWindowMemory` class, which allows retaining recent messages as a conversation history. This conversation history, acting as a memory, is incorporated into the chain during each query, ensuring that the chatbot comprehends and responds appropriately based on the preceding interactions.
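
The idea behind a windowed memory can be illustrated without `langchain`: keep only the last `k` exchanges and render them as the `{history}` text that is fed back into the prompt. The class below is a simplified sketch of that idea, not how `ConversationBufferWindowMemory` is actually implemented. `WindowMemory` is a hypothetical name, and the answers are placeholders rather than real model output.

```python
from collections import deque
from typing import Deque, Tuple


class WindowMemory:
    """Keep only the last k (human, ai) exchanges, mimicking a sliding-window memory."""

    def __init__(self, k: int = 5) -> None:
        # A deque with maxlen silently drops the oldest exchange once k is exceeded
        self.exchanges: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.exchanges.append((human, ai))

    def history(self) -> str:
        # Render the retained exchanges in the "Human: ... / AI: ..." layout used by the prompt
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.exchanges)


memory_sketch = WindowMemory(k=2)
memory_sketch.save("What was the location of the earthquake?", "<answer 1>")
memory_sketch.save("Which city in Uttarakhand?", "<answer 2>")
memory_sketch.save("At what time did the earthquake hit Pithoragarh?", "<answer 3>")
print(memory_sketch.history())  # only the last two exchanges remain
```

A small window keeps the prompt short, which matters for a 7B model with a limited context length, at the cost of forgetting earlier turns.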

In [None]:
# Initializing conversation chain
chain = ConversationChain(llm = llm)

In [None]:
# Default prompt template
print(chain.prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:


In [None]:
NEWS = """
An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS).

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres.

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted.

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning.

The earthquake occurred at 6.56 am. According to the NCS, the earthquake occurred at a depth of 5 kilometres.

"Earthquake of Magnitude:3.3, Occurred on 22-07-2023, 06:56:08 IST, Lat: 27.44 & Long: 92.51, Depth: 5 Km, Location: Tawang, Arunachal Pradesh, India," the NCS tweeted.
"""

In [None]:
# Custom prompt template
template = """
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:
""" + NEWS + """
Current conversation:
{history}
Human: {input}
AI:""".strip()

# Generating prompt template using Langchain
prompt = PromptTemplate(input_variables = ["history", "input"], template = template)

In [None]:
print(prompt)

input_variables=['history', 'input'] output_parser=None partial_variables={} template='\nThe following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don\'t know the answer, respond "I don\'t know."\n\nNews article:\n\nAn earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand\'s Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). \n\nAccording to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. \n\n"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. \n\nEarlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh\'s Tawang on Sunday morn

In [None]:
# Creating an output parser to clean unwanted tokens such as the trailing "\nUser" string from the model response
class CleanupOutputParser(BaseOutputParser):
    def parse(self, text: str) -> str:
        user_pattern = r"\nUser"
        text = re.sub(user_pattern, "", text)
        human_pattern = r"\nHuman:"
        text = re.sub(human_pattern, "", text)
        ai_pattern = r"\nAI:"
        return re.sub(ai_pattern, "", text).strip()

    @property
    def _type(self) -> str:
        return "output_parser"

In [None]:
# Creating memory object to hold last k (5) messages as memory
memory = ConversationBufferWindowMemory(
    memory_key = "history", k = 5, return_only_outputs = True
)

In [None]:
# Langchain pipeline
chain = ConversationChain(
    llm = llm,
    memory = memory,
    prompt = prompt,
    output_parser = CleanupOutputParser(),
    verbose = True,
)

In [None]:
# Q&A chatbot response
text = """
Can you summarize the news in one sentence?
""".strip()
res = chain.predict(input = text)
print(res)



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:

An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The e

In [None]:
text = "What was the location of the earthquake?"
res = chain.predict(input = text)
print(res)



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:

An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The e

In [None]:
text = "Which city in Uttarakhand?"
res = chain.predict(input = text)
print(res)



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:

An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The e

In [None]:
text = """
At what time did the earthquake hit Pithoragarh?
""".strip()
res = chain.predict(input = text)
print(res)



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:

An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The e

In [None]:
text = """
At what time did earthquake hit Arunachal Pradesh?
""".strip()
res = chain.predict(input = text)
print(res)



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:

An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The e

In [None]:
text = """
What was the depth of earthquake at Arunachal Pradesh?
""".strip()
res = chain.predict(input = text)
print(res)



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:

An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The e

In [None]:
text = """
How many people were affected in Uttarakhand?
""".strip()
res = chain.predict(input = text)
print(res)



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a news article provided by the human. You are an intelligent AI that can answer questions based on the news article. Do not generate any new information that is not present in the news article. Keep your answers short and concise. If you don't know the answer, respond "I don't know."

News article:

An earthquake of magnitude 3.2 on the Richter scale hit Uttarakhand's Pithoragarh on Sunday evening, according to the National Center for Seismology (NCS). 

According to the NCS, the earthquake occurred at 6.34 pm at a depth of 5 kilometres. 

"Earthquake of Magnitude:3.2, Occurred on 23-07-2023, 18:34:39 IST, Lat: 30.58 & Long: 80.18, Depth: 5 Km, Region: Pithoragarh, Uttarakhand, India," the NCS tweeted. 

Earlier in the day, the National Centre for Seismology reported that an earthquake of magnitude 3.3 on the Richter scale hit Arunachal Pradesh's Tawang on Sunday morning. 

The e

The chatbot maintains conversational context well and provides relevant responses to user queries about the news article. Such Large Language Model (LLM)-driven chatbots have the potential to integrate with large analytical systems and data warehouses containing both structured and unstructured data. This integration would let data professionals and other stakeholders interact with these systems in natural language, posing complex queries in a more human-like manner.

**References: Abhishek Thakur, James Briggs, Venelin Valkov**