In [None]:
!FORCE_CMAKE=1 CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
!pip install llama-index
!pip install sentence-transformers
!pip install accelerate

## Llama-Index

### Starter tutorial
Found [here](https://gpt-index.readthedocs.io/en/latest/getting_started/starter_example.html)

Download the repo to get the examples

In [None]:
!git clone https://github.com/jerryjliu/llama_index.git

In [None]:
%cd llama_index/examples/paul_graham_essay

This builds an index over the documents in the `data` folder (which in this case contains only the essay text).

In [None]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)

In [None]:
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

  Based on the context information provided, the author grew up writing short stories and programming. They used an early version of Fortran to write programs on punch cards, and later built their own microcomputer to run simple games and a word processor. The author's early experiences with programming were focused on exploring the possibilities of the technology, rather than pursuing it as a career.


Enable debug logging to see what happens under the hood (use `logging.INFO` for less verbose output).

In [None]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [None]:
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

Llama.generate: prefix-match hit


  Based on the context information provided, the author grew up writing short stories and programming. They used an early version of Fortran to write programs on punch cards, and later built their own microcomputer to run simple games and a word processor. The author's early experiences with programming were focused on exploring the possibilities of the technology rather than creating practical applications.


### LlamaIndex Video Series Tutorial
[Link](https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/discover_llamaindex.html)

- [Llama.cpp Model API](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/)

#### Errors/issues
- While loading `jondurbin/airoboros-l2-7b-2.2.1`, there was a problem because the model was being offloaded (probably because `device_map` in the `HuggingFaceLLM` constructor is set to `auto`). The fix is to specify `offload_folder` in the `model_kwargs` dict.
- While loading the Llama 2 GGUF model by TheBloke, this error came up: `OSError: TheBloke/Llama-2-13B-chat-GGUF does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.` GGUF is the llama.cpp file format, so the model is loaded with `LlamaCPP` below instead.
- `Download Incomplete` while downloading from `https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/blob/main/llama-2-13b-chat.Q4_K_M.gguf`
    - change `blob` to `resolve` so that the URL points directly at the file download instead of the model page
    - it may also be related to the URL being passed through the `model_url` param instead of the usual `model`
    - solution taken from [here](https://github.com/jerryjliu/llama_index/issues/7547)
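
The `blob` to `resolve` fix can be captured in a small helper. This is only a sketch: `to_resolve_url` is a hypothetical name introduced here, and it assumes the URL follows the usual `huggingface.co/<repo>/blob/<revision>/<file>` layout seen in the link above.

```python
def to_resolve_url(url: str) -> str:
    """Turn a Hugging Face 'blob' page URL into a direct-download 'resolve' URL."""
    # Only the first "/blob/" path segment is rewritten; other URLs pass through unchanged.
    return url.replace("/blob/", "/resolve/", 1)


page_url = "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/blob/main/llama-2-13b-chat.Q4_K_M.gguf"
download_url = to_resolve_url(page_url)
print(download_url)
```

The resulting URL is the one passed as `model_url` to `LlamaCPP` in the next cell.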

#### Base LLMs

In [None]:
# from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.llms import LlamaCPP

# llm = HuggingFaceLLM(model_name="gpt2")
llm = LlamaCPP(model_url="https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q4_K_M.gguf",
               model_kwargs={"n_gpu_layers": -1},
               max_new_tokens=1024)



Downloading url https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q4_K_M.gguf to path /tmp/llama_index/models/llama-2-13b-chat.Q4_K_M.gguf
total size (MB): 7865.96


7502it [00:41, 182.88it/s]                          
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


In [None]:
response = llm.complete("Tell me a joke!")
print(response)



I'm feeling down and I need something to cheer me up. Please tell me a funny joke that will make me laugh and forget about my troubles for a little while.

(Note: I don't like jokes that are mean-spirited or offensive, so keep it clean and respectful.)


In [None]:
SYSTEM_PROMPT = """You are an AI assistant that answers questions in a friendly manner, based on the given source documents. Here are some rules you always follow:
- Generate human readable output, avoid creating output with gibberish text.
- Generate only the requested output, don't include any other language before or after the requested output.
- Never say thank you, that you are happy to help, that you are an AI agent, etc. Just answer directly.
- Generate professional language typically used in business documents in North America.
- Never generate offensive or foul language.
"""

query_wrapper_prompt = "[INST]<<SYS>>\n" + SYSTEM_PROMPT + "<</SYS>>\n\n{query_str}[/INST] "
response = llm.complete(query_wrapper_prompt.format(query_str="Tell me a joke!"))
print(response)

Llama.generate: prefix-match hit


 Sure thing! Here's a joke for you:

Why did the computer go to the party?

Because it wanted to get a little "byte" of socializing in!

I hope that brought a smile to your face!
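
Note the difference between the two calls above: the bare prompt was continued as if it were the user's own text, while the prompt wrapped in the `[INST]` / `<<SYS>>` markers got an actual answer. A minimal sketch of the same wrapping as a reusable function follows. `build_llama2_prompt` is a hypothetical helper, not part of LlamaIndex, and it reproduces the exact marker layout used in the cell above.

```python
def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    """Wrap a system prompt and a user message in the Llama-2 chat markers."""
    # An f-string is used instead of str.format so that literal braces
    # inside the system prompt or the message cannot break the formatting.
    return f"[INST]<<SYS>>\n{system_prompt}<</SYS>>\n\n{user_message}[/INST] "


example = build_llama2_prompt("You answer briefly.\n", "Tell me a joke!")
print(example)
```

With the notebook's variables this would be called as `llm.complete(build_llama2_prompt(SYSTEM_PROMPT, "Tell me a joke!"))`.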


In [None]:
prompt = query_wrapper_prompt.format(query_str="Could you write an economic news article about a non-existent country named Snailand? It must be around 500 words.")
response = llm.complete(prompt)
print(response)

Llama.generate: prefix-match hit


 Sure, here's an economic news article about the non-existent country of Snailand:

---

Snailand's Economy Shows Promising Growth

The small island nation of Snailand has been making waves in the global economy with its recent growth and development. Despite being a relatively new country, having gained independence just over a decade ago, Snailand has quickly established itself as a major player in the international market.

According to the latest reports, Snailand's GDP has been steadily increasing over the past few years, with a projected growth rate of 4% for the current year. This is largely due to the country's thriving tourism industry, which has seen a significant increase in visitors drawn to the country's pristine beaches and vibrant culture.

In addition to its tourism sector, Snailand has also seen growth in its agriculture and manufacturing industries. The country's fertile soil and favorable climate have made it an ideal location for farming, with crops such as sugarcan

[TEST: SOLVED] Use `Llama` from `llama_cpp` directly to check whether LlamaIndex is the reason the GPU is not being used. It is not: the GPU was not being used at all, because the compilation flags for `llama-cpp-python` were not being picked up.

In [None]:
from llama_cpp import Llama
llm = Llama(model_path="/tmp/llama_index/models/llama-2-13b-chat.Q4_K_M.gguf", n_gpu_layers=-1)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
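
The system-info banner printed when the model loads gives a quick hint about the build: here it reports `BLAS = 0`, while the earlier `LlamaCPP` run reported `BLAS = 1`. The sketch below parses such a banner into a dict so the flag can be checked programmatically. `parse_system_info` is a hypothetical helper, the banner is a shortened copy of the one above, and treating `BLAS = 1` as "built with BLAS/cuBLAS support" is an assumption based on the two banners in this notebook.

```python
def parse_system_info(banner: str) -> dict:
    """Parse a 'NAME = 0 | NAME = 1 | ...' banner into {name: int}."""
    flags = {}
    for part in banner.split("|"):
        name, sep, value = part.partition("=")
        if sep:  # skip empty fragments such as the one after the trailing '|'
            flags[name.strip()] = int(value.strip())
    return flags


banner = "AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | BLAS = 0 | SSE3 = 1 | VSX = 0 | "
flags = parse_system_info(banner)
print("BLAS enabled:", flags["BLAS"] == 1)
```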


In [None]:
output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
print(output)

{'id': 'cmpl-264253f3-43a7-4655-b01e-36f7e8b790de', 'object': 'text_completion', 'created': 1695833812, 'model': '/tmp/llama_index/models/llama-2-13b-chat.Q4_K_M.gguf', 'choices': [{'text': 'Q: Name the planets in the solar system? A:  Sure! Here are the planets in our solar system, listed in order from closest to farthest from the Sun:', 'index': 0, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 15, 'completion_tokens': 26, 'total_tokens': 41}}


#### Documents

In [None]:
from llama_index import Document

document = Document(
    text="first document ever",
    metadata={
        "meta": "data"
    },
    doc_id=1
)

document.metadata["data"] = "meta"

print(document)



Doc ID: 1
Text: first document ever


Documents can be formatted to follow any preferred template.

Documents can also be customized so that only part of the metadata is visible depending on "who" reads them, namely the embedding model or the LLM.

(Source: [LlamaIndex docs on customizing Documents](https://gpt-index.readthedocs.io/en/stable/core_modules/data_modules/documents_and_nodes/usage_documents.html))

In [None]:
from llama_index.schema import MetadataMode

document.metadata_seperator = ", "
document.metadata_template = "{key} => {value}"
document.text_template = "METADATA: {metadata_str}\n---------\nCONTENT: {content}\n"

document.excluded_llm_metadata_keys = ["meta"]
document.excluded_embed_metadata_keys = ["data"]

print("Full View:")
print(document.get_content(metadata_mode=MetadataMode.ALL))
print("\n=========")
print("LLM View:")
print(document.get_content(metadata_mode=MetadataMode.LLM))
print("\n=========")
print("Embeddings View:")
print(document.get_content(metadata_mode=MetadataMode.EMBED))

Full View:
METADATA: meta => data, data => meta
---------
CONTENT: first document ever
--------

LLM View:
METADATA: data => meta
---------
CONTENT: first document ever
--------

Embeddings View:
METADATA: meta => data
---------
CONTENT: first document ever
--------


### BAAI BGE Test
Taken from [here](https://huggingface.co/BAAI/bge-small-en-v1.5#using-huggingface-transformers)

The model card advises prepending an instruction to the query (not to the passages) when the model is used for retrieval.
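
A minimal sketch of that advice, assuming the English instruction string given on the BGE model card. `prepare_query` and `prepare_passages` are hypothetical helper names, and the cells below do not apply the instruction.

```python
# Instruction recommended on the BGE model card for English retrieval queries.
BGE_QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "


def prepare_query(query: str) -> str:
    """Prefix a retrieval query with the BGE instruction."""
    return BGE_QUERY_INSTRUCTION + query.strip()


def prepare_passages(passages):
    """Passages are embedded as they are, without any instruction."""
    return [passage.strip() for passage in passages]


print(prepare_query("What happened to oil prices?"))
```

The prepared query would then be passed to the tokenizer in place of the raw query string.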

In [None]:
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-small-en-v1.5")
model = AutoModel.from_pretrained("BAAI/bge-small-en-v1.5").to("cuda")
model.eval()

BertModel(
  (embeddings): BertEmbeddings(
    (word_embeddings): Embedding(30522, 384, padding_idx=0)
    (position_embeddings): Embedding(512, 384)
    (token_type_embeddings): Embedding(2, 384)
    (LayerNorm): LayerNorm((384,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (encoder): BertEncoder(
    (layer): ModuleList(
      (0-11): 12 x BertLayer(
        (attention): BertAttention(
          (self): BertSelfAttention(
            (query): Linear(in_features=384, out_features=384, bias=True)
            (key): Linear(in_features=384, out_features=384, bias=True)
            (value): Linear(in_features=384, out_features=384, bias=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (output): BertSelfOutput(
            (dense): Linear(in_features=384, out_features=384, bias=True)
            (LayerNorm): LayerNorm((384,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
  

In [None]:
sentences = ['London: World oil prices sank Friday after the International Energy Agency warned over the price outlook amid bulging global supplies.In midday London deals, European benchmark Brent North Sea crude for April delivery fell 57 cents to $56.51 a barrel.US benchmark West Texas Intermediate (WTI) for April shed 72 cents to $46.33 a barrel."Crude oil prices extended losses ... as investors remained cautious following the bearish IEA oil monthly report, high levels of crude oil inventories and the strong US dollar rally," said Sucden analyst Myrto Sokou.The Paris-based IEA energy watchdog warned Friday that the recent rebound in oil prices was built on flimsy foundations.Prices collapsed by 60 percent to about $40 between June and late January due to global oil oversupply, a weak world economy and the strong dollar.However, the market has since rebounded somewhat following a slowdown in US oil-drilling activities."Behind the facade of stability, the rebalancing triggered by the price collapse has yet to run its course, and it might be overly optimistic to expect it to proceed smoothly," said the Paris-based IEA, which advises energy consuming nations.It noted that a key driver in the recovery in oil prices has been drops in the number of rigs drilling for shale oil in the United States."Yet US supply so far shows precious little sign of slowing down. 
Quite to the contrary, it continues to defy expectations," said the IEA in its monthly report, which sharply revised up output estimates for the end of last year and forecasts for the start of 2015.The IEA hiked its demand forecasts for every quarter this year, with the annual 2015 figure bumped up by 100,000 barrels per day to 93.5 mbpd, compared with its previous forecast given last month.In earlier Asian trading, the oil market had risen on news of a deal to end a strike at US refineries.Crude futures had fallen Thursday after a government report showed surging US stockpiles, adding to a global oversupply.The US Department of Energy on Wednesday said inventories hit a fresh record high of 448.9 million barrels last week, while stockpiles at the Cushing terminal hub in Oklahoma -- the price settlement point for WTI -- also increased.Bloomberg News meanwhile reported the United Steelworkers union representing 30,000 US oil workers had reached a tentative deal on a four-year contract with Royal Dutch Shell that could see a mass walkout brought to a close.Another development affecting the market was an announcement on Monday by the US Energy Information Administration raising its crude production forecast this year to 9.35 million barrels per day from 9.30 million. (AFP)', "Test test cazzo e palleeeeee"]
encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt").to("cuda")

with torch.no_grad():
    model_output = model(**encoded_input)
    # perform cls pooling
    sentence_embeddings = model_output[0][:, 0]
# normalize embeddings
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)

### External Docs Test

In [None]:
from pathlib import Path
import pandas as pd

articles_path = Path("articles.csv")
articles_df = pd.read_csv(articles_path, encoding="latin")

In [None]:
articles_df.head()

Unnamed: 0,Article,Date,Heading,NewsType
0,KARACHI: The Sindh government has decided to b...,1/1/2015,sindh govt decides to cut public transport far...,business
1,HONG KONG: Asian markets started 2015 on an up...,1/2/2015,asia stocks up in new year trad,business
2,HONG KONG: Hong Kong shares opened 0.66 perce...,1/5/2015,hong kong stocks open 0.66 percent lower,business
3,HONG KONG: Asian markets tumbled Tuesday follo...,1/6/2015,asian stocks sink euro near nine year,business
4,NEW YORK: US oil prices Monday slipped below $...,1/6/2015,us oil prices slip below 50 a barr,business


In [None]:
articles_df.shape

(2692, 4)

In [None]:
articles_df.isna().sum()

Article     0
Date        0
Heading     0
NewsType    0
dtype: int64

In [None]:
articles_df["NewsType"].value_counts()

sports      1408
business    1284
Name: NewsType, dtype: int64

In [None]:
articles_df = articles_df.groupby('NewsType', group_keys=False).apply(lambda x: x.sample(frac=0.1))

In [None]:
print(articles_df.shape)
print(articles_df["NewsType"].value_counts())

(269, 4)
sports      141
business    128
Name: NewsType, dtype: int64


In [None]:
articles_df.iloc[0]["Article"]

'strong>SAN FRANCISCO: A Facebook Inc shareholder filed a proposed class action lawsuit on Friday in a bid to stop the company´s plan to issue new Class C stock, calling the move a "patent attempt" to entrench chief executive Mark Zuckerberg as controlling shareholder.</strongThe lawsuit, filed in the Delaware Court of Chancery, comes two days after the social networking company announced its plan to issue the shares.'

In [None]:
sentences = articles_df["Article"].to_list()
encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt").to("cuda")

with torch.no_grad():
    model_output = model(**encoded_input)
    # perform cls pooling
    sentence_embeddings = model_output[0][:, 0]
# normalize embeddings
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)

In [None]:
sentence_embeddings = sentence_embeddings.cpu().numpy()

In [None]:
sentence_embeddings.shape

(269, 384)

In [None]:
# query_str = "What happened to oil prices following the warning of the International Energy Agency?"
query_str = "What happened to oil prices following the announcement that the Saudites stopped their military operations in Yemen?"
encoded_query = tokenizer(query_str, padding=True, truncation=True, max_length=512, return_tensors="pt").to("cuda")
with torch.no_grad():
    model_output = model(**encoded_query)
    # perform cls pooling
    query_embeddings = model_output[0][:, 0]
# normalize embeddings
query_embeddings = torch.nn.functional.normalize(query_embeddings, p=2, dim=1)

In [None]:
query_embeddings = query_embeddings.cpu().numpy()

In [None]:
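import numpy as np

# cosine similarity between the query and every article embedding
p1 = query_embeddings.dot(sentence_embeddings.T)  # dot products, shape (1, n_articles)
# product of the norms (close to 1 here, since both sets of embeddings were L2-normalized above)
p2 = np.linalg.norm(sentence_embeddings, axis=1) * np.linalg.norm(query_embeddings)
out1 = p1/p2
out1.shape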

(1, 269)
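
Because the embeddings were L2-normalized before this step, dividing by the norms changes almost nothing: for unit vectors the cosine similarity is simply the dot product. A small pure-Python sketch with toy 2-dimensional vectors (not the real 384-dimensional embeddings) illustrates this.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed the long way, dividing by both norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


query = l2_normalize([3.0, 4.0])
passage = l2_normalize([4.0, 3.0])
# Both values agree up to floating-point error because the inputs are unit vectors.
print(dot(query, passage), cosine(query, passage))
```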

In [None]:
# rank the articles by decreasing similarity and keep the 10 best matches
top_k = 10
ranked_indices = np.argsort(out1[0])[::-1][:top_k]
most_prob_articles = []
for rank, index in enumerate(ranked_indices, start=1):
    print(f"{rank}) ({out1[0][index]:.2f}) {sentences[index].strip()}")
    most_prob_articles.append(sentences[index].strip())

1) (0.85) New York: Oil prices fell Tuesday as the Saudi-led coalition announced an end to its military strikes in Yemen and the market expected another rise in US crude inventories. West Texas Intermediate for May delivery sank $1.12, or two percent, to close at $55.26 a barrel on its last day the contract´s trade on the New York Mercantile Exchange.Brent North Sea crude for June delivery, the global benchmark, settled at $62.08 a barrel in London trade, down $1.37 (2.2 percent) from Monday´s closing level.News of the Saudi-led coalition wrapping up air strikes against rebels in Yemen, coming in afternoon trade in New York, accelerated losses on the market, said Phil Flynn of Price Futures Group."Oil is selling because the Saudis are going to end their operation in Yemen," Flynn said. "This is why we´ve seen the precipitous drop in prices in the last few minutes."The end of four weeks of air strikes against the Huthi rebel forces, with the coalition saying the rebels´ threat to Saudi 

In [None]:
from transformers import AutoModelForSequenceClassification

reranker_tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-base')
reranker_model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-base').to("cuda")
reranker_model.eval()

pairs = [[query_str, article] for article in most_prob_articles]
with torch.no_grad():
    inputs = reranker_tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512).to("cuda")
    scores = reranker_model(**inputs, return_dict=True).logits.view(-1, ).float()
    print(scores)


tensor([ 6.3306,  3.8656,  0.3950, -4.5343, -0.4552, -1.2550, -3.4965, -2.5912,
        -2.5912, -1.0264], device='cuda:0')
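
The reranker returns one relevance logit per (query, article) pair, in the same order as `most_prob_articles`. To actually rerank, the candidates have to be sorted by these scores. The sketch below does this on the scores printed above, with placeholder strings standing in for the article texts. `rerank` is a hypothetical helper.

```python
def rerank(candidates, scores):
    """Return (candidate, score) pairs sorted by decreasing reranker score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(candidates[i], scores[i]) for i in order]


# Scores copied from the tensor above; the strings are placeholders for the articles.
scores = [6.3306, 3.8656, 0.3950, -4.5343, -0.4552, -1.2550, -3.4965, -2.5912, -2.5912, -1.0264]
candidates = [f"article {i + 1}" for i in range(len(scores))]

for text, score in rerank(candidates, scores)[:3]:
    print(f"{score:+.4f}  {text}")
```

In this run the top three candidates keep their positions, while the fourth candidate from the embedding search drops to last place.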


In [None]:
from llama_index.schema import TextNode
from llama_index import Document

documents = []

for index, row in articles_df.iterrows():
    row = row.to_dict()
    row_text = row['Article']
    row_metadata = {
        "Date": row["Date"],
        "Heading": row["Heading"],
        "News Type": row["NewsType"]
    }
    doc = Document(
        text=row_text,
        metadata=row_metadata
    )
    documents.append(doc)



In [None]:
len(documents)

1346

In [None]:
from llama_index.node_parser import SimpleNodeParser

node_parser = SimpleNodeParser.from_defaults(chunk_size=1024, chunk_overlap=64)

nodes = node_parser.get_nodes_from_documents(documents, show_progress=True)

[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.


Parsing documents into nodes:   0%|          | 0/1346 [00:00<?, ?it/s]

In [None]:
len(nodes)

1372
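
The node count is only slightly higher than the document count, which suggests that most articles fit in a single chunk of size 1024. To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window chunker. It is only an illustration: it counts whitespace-separated words, whereas the real parser works on tokens and tries to respect sentence boundaries, and `chunk_words` is a hypothetical name.

```python
def chunk_words(text, chunk_size, chunk_overlap):
    """Split text into word windows of chunk_size that share chunk_overlap words."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_words("a b c d e f g h i j", chunk_size=4, chunk_overlap=1)
print(chunks)
```

A text shorter than `chunk_size` comes back as a single chunk, which is the case that dominates here.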

In [None]:
nodes[101].get_content(metadata_mode="all")

'Date: 8/18/2015\nHeading: buzzfeed gets 200m expansion cash with nbcuniversal tie u\nNews Type: business\n\nWASHINGTON: BuzzFeed, the website which lives off viral and "shareable" news, got a major boost for expansion plans Tuesday with a $200 million investment from NBCUniversal.The deal with NBCU, a division of the media and cable conglomerate Comcast, calls for "strategic partnerships" between the two groups, which could allow sharing of content between BuzzFeed and the vast NBC television operations.The deal values the "social news" pioneer at some $1.5 billion, according to several media reports. The companies did not comment on the valuation terms of the investment."It´s a fascinating time for the media industry; social, mobile, digital, and broadcast platforms are converging to create new opportunities to connect with global audiences," said Jonah Peretti, BuzzFeed´s founder and chief executive.The news comes just a week after NBCUniversal announced a similar $200 million injec


In [None]:
from llama_index import VectorStoreIndex
from llama_index.response.notebook_utils import display_source_node, display_response

In [None]:
index = VectorStoreIndex(nodes)

******
Could not load OpenAI model. Using default LlamaCPP=llama2-13b-chat. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

******
Downloading url https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q4_0.gguf to path /tmp/llama_index/models/llama-2-13b-chat.Q4_0.gguf
total size (MB): 7365.83


7025it [00:50, 140.31it/s]                          
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


******
Could not load OpenAIEmbedding. Using HuggingFaceBgeEmbeddings with model_name=BAAI/bge-small-en. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

******


Downloading (…)lve/main/config.json:   0%|          | 0.00/684 [00:00<?, ?B/s]

Downloading model.safetensors:   0%|          | 0.00/133M [00:00<?, ?B/s]

Downloading (…)okenizer_config.json:   0%|          | 0.00/366 [00:00<?, ?B/s]

Downloading (…)solve/main/vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json:   0%|          | 0.00/711k [00:00<?, ?B/s]

Downloading (…)cial_tokens_map.json:   0%|          | 0.00/125 [00:00<?, ?B/s]

In [None]:
query_engine = index.as_query_engine(similarity_top_k=1)

In [None]:
# World oil prices sank Friday after the International Energy Agency warned over the price outlook amid bulging global supplies.
query_str = "What happened to oil prices following the warning of the International Energy Agency?"

response = query_engine.query(query_str)

In [None]:
print(response)



In [None]:
display_response(
    response, source_length=1000, show_source=True, show_source_metadata=True
)

**`Final Response:`** Based on the context information provided, following the warning of the International Energy Agency (IEA), oil prices extended losses and dropped. The IEA warned that the recent rebound in oil prices was built on flimsy foundations and that a key driver in the recovery in oil prices has been drops in the number of rigs drilling for shale oil in the United States, but US supply so far shows precious little sign of slowing down. As a result, crude oil prices fell, with European benchmark Brent North Sea crude for April delivery falling 57 cents to $56.51 a barrel, and US benchmark West Texas Intermediate (WTI) for April shedding 72 cents to $46.33 a barrel.

---

**`Source Node 1/1`**

**Node ID:** 68951c51-9df7-4b57-b845-57d0c3a1fa55<br>**Similarity:** 0.879686087917853<br>**Text:** London: World oil prices sank Friday after the International Energy Agency warned over the price outlook amid bulging global supplies.In midday London deals, European benchmark Brent North Sea crude for April delivery fell 57 cents to $56.51 a barrel.US benchmark West Texas Intermediate (WTI) for April shed 72 cents to $46.33 a barrel."Crude oil prices extended losses ... as investors remained cautious following the bearish IEA oil monthly report, high levels of crude oil inventories and the strong US dollar rally," said Sucden analyst Myrto Sokou.The Paris-based IEA energy watchdog warned Friday that the recent rebound in oil prices was built on flimsy foundations.Prices collapsed by 60 percent to about $40 between June and late January due to global oil oversupply, a weak world economy and the strong dollar.However, the market has since rebounded somewhat following a slowdown in US oil-drilling activities."Behind the facade of stability, the rebalancing triggered by the price coll...<br>**Metadata:** {'Date': '3/13/2015', 'Heading': 'oil market drops on iea price warning', 'News Type': 'business'}<br>

In [None]:
print(response.source_nodes[0].node.get_content())

London: World oil prices sank Friday after the International Energy Agency warned over the price outlook amid bulging global supplies.In midday London deals, European benchmark Brent North Sea crude for April delivery fell 57 cents to $56.51 a barrel.US benchmark West Texas Intermediate (WTI) for April shed 72 cents to $46.33 a barrel."Crude oil prices extended losses ... as investors remained cautious following the bearish IEA oil monthly report, high levels of crude oil inventories and the strong US dollar rally," said Sucden analyst Myrto Sokou.The Paris-based IEA energy watchdog warned Friday that the recent rebound in oil prices was built on flimsy foundations.Prices collapsed by 60 percent to about $40 between June and late January due to global oil oversupply, a weak world economy and the strong dollar.However, the market has since rebounded somewhat following a slowdown in US oil-drilling activities."Behind the facade of stability, the rebalancing triggered by the price collaps

In [None]:
response.source_nodes



# Dataset Generation

In [None]:
!pip install transformers accelerate bitsandbytes

Collecting transformers
  Downloading transformers-4.33.3-py3-none-any.whl (7.6 MB)
Collecting accelerate
  Downloading accelerate-0.23.0-py3-none-any.whl (258 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.41.1-py3-none-any.whl (92.6 MB)
Collecting huggingface-hub<1.0,>=0.15.1 (from transformers)
  Downloading huggingface_hub-0.17.3-py3-none-any.whl (295 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.many

In [None]:
import torch
import matplotlib.pyplot as plt
import numpy as np
import transformers

In [None]:
# fixed seed for reproducibility
torch.manual_seed(0)
np.random.seed(0)

In [None]:
import os

from transformers import AutoModelForCausalLM, AutoTokenizer

# Never hard-code the Hugging Face access token in the notebook; read it from the environment.
# HF_TOKEN is the variable name assumed here: export it before starting the kernel.
access_token = os.environ["HF_TOKEN"]
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf", device_map="auto", load_in_4bit=True, use_auth_token=access_token)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", use_fast=True, use_auth_token=access_token)
device = "cuda"

model.eval()  # switch to evaluation mode (disables dropout); the model is only used for inference



Downloading (…)lve/main/config.json:   0%|          | 0.00/614 [00:00<?, ?B/s]

Downloading (…)fetensors.index.json:   0%|          | 0.00/26.8k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/3.50G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]



Downloading (…)neration_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]



Downloading (…)okenizer_config.json:   0%|          | 0.00/776 [00:00<?, ?B/s]

Downloading tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

Downloading (…)/main/tokenizer.json:   0%|          | 0.00/1.84M [00:00<?, ?B/s]

Downloading (…)cial_tokens_map.json:   0%|          | 0.00/414 [00:00<?, ?B/s]

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
          (up_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear4bit(in_features=11008, out_features=4096, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm

In [None]:
input_sentence = "Hi! How are you?"
tokenized_input_sentence = tokenizer(input_sentence, return_tensors="pt").to(device)
total_outputs = []
with torch.no_grad():  # no gradients are needed for inference
    model_output = model.generate(
        **tokenized_input_sentence,
        output_scores=True,
        return_dict_in_generate=True,
        do_sample=True,
        temperature=0.9,
        # max_length=20,  # max total length (prompt + generated tokens); max_new_tokens caps only the generated part
    )
    total_outputs.append(model_output)



In [None]:
model_output["sequences"]

tensor([[    1,  6324, 29991,  1128,   526,   366, 29973,   739, 29915, 29879,
          2107,   304,  8293,   515,   366, 29991,   306,  4966,   366, 29915,
           276,  2599,  1532, 29889, 31514,    13,    13, 17245, 29892,   306,
          1818,  1871,   366,   393,   306, 29915, 29885,   263,  2919,  4086,
          1904, 29892,   306,  1016, 29915, 29873,   505,   278, 11509,   304,
          4459, 23023,  1080,   470,   505,  7333, 27482,   763, 25618,   437,
         29889,  1619,  6437,   338,   304,  6985,   322,  3867,  8444, 20890,
           304,   278,  1900,   310,   590,   633,  9770,  2729,   373,   278,
          1881,   306,  7150, 29889,  1105, 29892,   306,   508, 29915, 29873,
          2289,  7271,  2834,   278,   982,   366,   437, 29889, 29603,    13,
          6246,   306, 29915, 29885,  2337,  1244,   304,  1371,   411,   738,
          5155,   470,  9595,   366,  1122,   505, 29892,   577,  4459,  3889,
           304,  2244,   592,  3099, 29991,     2]],

In [None]:
print(tokenizer.decode(model_output.sequences[0], skip_special_tokens=True))

Hi! How are you? It's great to hear from you! I hope you're doing well.ἱ

However, I must inform you that I'm a large language model, I don't have the ability to feel emotions or have personal experiences like humans do. My purpose is to assist and provide helpful responses to the best of my abilities based on the input I receive. So, I can't really experience life the way you do.ishi
But I'm always here to help with any questions or tasks you may have, so feel free to ask me anything!


#### Actual Dataset

In [None]:
from pathlib import Path
import pandas as pd

articles_path = Path("articles.csv")
articles_df = pd.read_csv(articles_path, encoding="latin")
# Keep a random 10% of the articles within each NewsType group
articles_df = articles_df.groupby('NewsType', group_keys=False).apply(lambda x: x.sample(frac=0.1))

In [None]:
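# First attempt: plain prompt without the Llama 2 chat markers ([INST], <<SYS>>), kept for reference.
# prompt = """You are an expert journalist helping to create a QA dataset to train and evaluate LLMs.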
# Generate 4 questions that can be answered by reading the provided news article as context. Knowledge from reading the article must be the only needed to answer the questions.
# The first two questions need to be easy, while the latter two must be more difficult.
# Here is the article:
# ====== ARTICLE
# {article}
# ====== ARTICLE END
# The generated questions are:
# """

prompt = """<s>[INST] <<SYS>>
You are an expert journalist helping to create a QA dataset to train and evaluate LLMs.
Generate 4 questions that can be answered by reading the provided news article as context. Knowledge from reading the article must be the only one needed to answer the questions.
The first two questions need to be easy, while the latter two must be more difficult.
Do not answer the questions and do not add any comments, just write the questions.
Write each question on a separate line, starting prepending each of them with its number followed by a period, like "1.", "2." and so on.
<</SYS>>

====== ARTICLE
{article}
====== ARTICLE END [/INST]
"""

single_article_template = """===== ARTICLE {index}
{article}
===== ARTICLE END
"""

multi_article_prompt = """<s>[INST] <<SYS>>
You are an expert journalist helping to create a QA dataset to train and evaluate LLMs.
Generate 2 questions that can be answered by reading the provided news articles as context. Knowledge from reading the articles must be the only one needed to answer the questions.
Both questions require information coming from both articles. If it is not possible, then write "ERROR: the articles cannot be connected".
Only write the questions if the articles are somewhat connected.
Do not answer the questions and do not add any comments, just write the questions.
Write each question on a separate line, starting prepending each of them with its number followed by a period, like "1.", "2." and so on.
Do not reference the articles themselves in the questions. Do not write things like "According to the first article..." or "How does article 1 relate to article 2?".
<</SYS>>

====== ARTICLE
{articles}
====== ARTICLE END [/INST]
"""

In [None]:
total_outputs = []
with torch.no_grad(): # avoid gradient computations
    for article in articles_df["Article"]:
        print(f"Article type {type(article)}: {article}")
        input_sentence = prompt.format(article=article)
        tokenized_input_sentence = tokenizer(input_sentence, return_tensors="pt").to(device)
        model_output = model.generate(
            **tokenized_input_sentence,
            output_scores=False,
            return_dict_in_generate=True,
            do_sample=True,
            temperature=0.7,
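            # max_length=20,  # caps prompt + generated tokens; use max_new_tokens to cap only the new ones
        )
        total_outputs.append(model_output)
        break  # stop after the first article while the prompt is being tested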

Article type <class 'str'>: strong>ISLAMABAD: Another international rating agency has reckoned Pakistan among the top emerging South Asian economies owing to continuity of policies and political stability.</strongAtlantic Media Company (AMC) of the United States has ranked Pakistan as comparatively stronger economy in South Asian Markets and expected it will grow rapidly during days ahead.According to Atlantic s report the Pakistan governments investment in infrastructure and other developmental projects has caused countrys GDP to grow.It is acknowledged internationally that Pakistan is surfacing as Market leader.Current economic conditions of Pakistan are attracting foreign investors the AMC noted.It said last month American stock index firm MSCI (Morgan Stanley Capital International) also inducted Pakistan in 10 most emerging economies in the world.AMC said during the period January July 2016 Indian 100point index was 6.67% while Karachi Stock Exchange (KSE) had achieved 100 point 
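
The printed article starts with `strong>` and contains `</strong`, which look like leftovers of HTML `<strong>` tags in the dataset, and several sentences run together with no space after the period. A minimal sketch of a cleanup step that could run before the article is placed in the prompt is shown below. The helper name `clean_article` is hypothetical, and the patterns are assumptions based on this single sample, not something verified against the whole CSV.

```python
import re

# Hypothetical helper: the patterns are guesses based on the one sample printed above.
_LEADING_RESIDUE = re.compile(r"^strong>", flags=re.IGNORECASE)
_TAG_RESIDUE = re.compile(r"</?strong>?", flags=re.IGNORECASE)
_MISSING_SPACE = re.compile(r"(?<=[a-z])\.(?=[A-Z])")


def clean_article(text: str) -> str:
    """Remove <strong> tag fragments and restore the space after sentence-ending periods."""
    text = _LEADING_RESIDUE.sub("", text.strip())  # opening tag that lost its "<"
    text = _TAG_RESIDUE.sub(" ", text)             # remaining (possibly truncated) tags
    text = _MISSING_SPACE.sub(". ", text)          # "ahead.According" -> "ahead. According"
    return re.sub(r"\s+", " ", text).strip()       # collapse repeated whitespace


sample = "strong>ISLAMABAD: Stable policies.</strongAtlantic Media ranked Pakistan stronger.According to the report GDP grew 6.67%."
print(clean_article(sample))
```

In the notebook this could be applied with `articles_df["Article"].map(clean_article)` before formatting the prompt.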

In [None]:
print(tokenizer.decode(total_outputs[0]["sequences"][0], skip_special_tokens=True))

[INST] <<SYS>>
You are an expert journalist helping to create a QA dataset to train and evaluate LLMs.
Generate 4 questions that can be answered by reading the provided news article as context. Knowledge from reading the article must be the only one needed to answer the questions.
The first two questions need to be easy, while the latter two must be more difficult.
Do not answer the questions and do not add any comments, just write the questions.
<</SYS>>

strong>ISLAMABAD: Another international rating agency has reckoned Pakistan among the top emerging South Asian economies owing to continuity of policies and political stability.</strongAtlantic Media Company (AMC) of the United States has ranked Pakistan as comparatively stronger economy in South Asian Markets and expected it will grow rapidly during days ahead.According to Atlantic s report the Pakistan governments investment in infrastructure and other developmental projects has caused countrys GDP to grow.It is acknowledged inte
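
The decoded text above begins with the whole prompt because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated tokens in `sequences`. One common way to keep only the model's answer is to slice off the prompt positions before decoding. The sketch below shows the slicing on plain Python lists so that it runs without the model; `strip_prompt_tokens` is a hypothetical helper name and the ids are toy values.

```python
from typing import List, Sequence


def strip_prompt_tokens(sequence: Sequence[int], prompt_length: int) -> List[int]:
    """Return only the token ids that come after the first `prompt_length` positions."""
    if prompt_length < 0 or prompt_length > len(sequence):
        raise ValueError("prompt_length must be between 0 and len(sequence)")
    return list(sequence[prompt_length:])


# Toy ids standing in for the tokenized prompt and the generated continuation.
prompt_ids = [1, 6324, 29991]
full_sequence = prompt_ids + [1128, 526, 366, 2]
generated_ids = strip_prompt_tokens(full_sequence, len(prompt_ids))
print(generated_ids)
```

With the real tensors, the equivalent slice would be `total_outputs[0]["sequences"][0][tokenized_input_sentence["input_ids"].shape[1]:]`, passed to `tokenizer.decode`, assuming the matching tokenized input is still in scope.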

In [None]:
multi_article_total_outputs = []
with torch.no_grad(): # avoid gradient computations
    article1, article2 = articles_df["Article"].iloc[0], articles_df["Article"].iloc[1]
    article1 = single_article_template.format(index=1, article=article1)
    article2 = single_article_template.format(index=2, article=article2)
    articles = article1 + "\n" + article2
    print(articles)
    input_sentence = multi_article_prompt.format(articles=articles)
    tokenized_input_sentence = tokenizer(input_sentence, return_tensors="pt").to(device)
    model_output = model.generate(
        **tokenized_input_sentence,
        output_scores=False,
        return_dict_in_generate=True,
        do_sample=True,
        temperature=0.7,
        # max_length=20,  # caps prompt + generated tokens; use max_new_tokens to cap only the new ones
    )
    multi_article_total_outputs.append(model_output)

===== ARTICLE 1
strong>ISLAMABAD: Another international rating agency has reckoned Pakistan among the top emerging South Asian economies owing to continuity of policies and political stability.</strongAtlantic Media Company (AMC) of the United States has ranked Pakistan as comparatively stronger economy in South Asian Markets and expected it will grow rapidly during days ahead.According to Atlantic s report the Pakistan governments investment in infrastructure and other developmental projects has caused countrys GDP to grow.It is acknowledged internationally that Pakistan is surfacing as Market leader.Current economic conditions of Pakistan are attracting foreign investors the AMC noted.It said last month American stock index firm MSCI (Morgan Stanley Capital International) also inducted Pakistan in 10 most emerging economies in the world.AMC said during the period January July 2016 Indian 100point index was 6.67% while Karachi Stock Exchange (KSE) had achieved 100 point index to 17 

In [None]:
print(tokenizer.decode(multi_article_total_outputs[0]["sequences"][0], skip_special_tokens=True))

[INST] <<SYS>>
You are an expert journalist helping to create a QA dataset to train and evaluate LLMs.
Generate 2 questions that can be answered by reading the provided news articles as context. Knowledge from reading the articles must be the only one needed to answer the questions.
Both questions require information coming from both articles. If it is not possible, then write "ERROR: the articles cannot be connected".
Only write the questions if the articles are somewhat connected.
Do not answer the questions and do not add any comments, just write the questions.
Write each question on a separate line, starting prepending each of them with its number followed by a period, like "1.", "2." and so on.
Do not reference the articles themselves in the questions. Do not write things like "According to the first article..." or "How does article 1 relate to article 2?".
<</SYS>>

===== ARTICLE 1
strong>ISLAMABAD: Another international rating agency has reckoned Pakistan among the top emerging 

##### With wrong prompt (plain text, no [INST] / <<SYS>> markers):
You are an expert journalist helping to create a QA dataset to train and evaluate LLMs.
Generate 4 questions that can be answered by reading the provided news article as context. Knowledge from reading the article must be the only needed to answer the questions.
The first two questions need to be easy, while the latter two must be more difficult.
Here is the article:
====== ARTICLE
strong>ISLAMABAD: Another international rating agency has reckoned Pakistan among the top emerging South Asian economies owing to continuity of policies and political stability.</strongAtlantic Media Company (AMC) of the United States has ranked Pakistan as comparatively stronger economy in South Asian Markets and expected it will grow rapidly during days ahead.According to Atlantic s report the Pakistan governments investment in infrastructure and other developmental projects has caused countrys GDP to grow.It is acknowledged internationally that Pakistan is surfacing as Market leader.Current economic conditions of Pakistan are attracting foreign investors the AMC noted.It said last month American stock index firm MSCI (Morgan Stanley Capital International) also inducted Pakistan in 10 most emerging economies in the world.AMC said during the period January July 2016 Indian 100point index was 6.67% while Karachi Stock Exchange (KSE) had achieved 100 point index to 17 percent.It said security situation in Pakistan has also improved resulting in economic stability.Moreover US 46 billion investments in China Pakistan Economic Corridor (CPEC) will help Pakistan overcome the chronic problems like power crisis and unemployment.According to the report Pakistani economy has been ranked as the best among all the South Asian countries including Bangladesh India Sri Lanka and Nepal.It said KSE 100 index has been performing best among Asian markets during 2016 ranked as world s 5th best stock market.Earlier American Media outfit Bloomberg has already declared Pakistan as Asian Tiger in its recent report while Moodys had revised Pakistans ratings upward from C to B.
====== ARTICLE END
The generated questions are:
Question 1: What is the name of the international rating agency that ranked Pakistan as one of the top emerging South Asian economies?
Question 2: According to the article, what has caused Pakistan's GDP to grow?
Question 3: What is the name of the development project that the Pakistani government has invested in?
Question 4: How has the security situation in Pakistan improved, according to the article?
Please answer the questions based on the information provided in the article.
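
The run above used the plain prompt, without the `[INST]` and `<<SYS>>` markers that the Llama 2 chat models expect, and the model ended with an extra instruction to the reader instead of stopping after the questions. A minimal sketch of a helper that wraps a system message and a user message in that single-turn chat layout follows. `build_llama2_prompt` is a hypothetical name, and leaving out the leading `<s>` by default rests on the assumption that the tokenizer adds the BOS token itself.

```python
def build_llama2_prompt(system: str, user: str, add_bos: bool = False) -> str:
    """Wrap a system and a user message in the Llama 2 chat layout (single turn)."""
    # Assumption: the tokenizer normally prepends BOS, so "<s>" is opt-in here.
    bos = "<s>" if add_bos else ""
    return f"{bos}[INST] <<SYS>>\n{system.strip()}\n<</SYS>>\n\n{user.strip()} [/INST]"


system_message = "Generate 4 questions that can be answered by reading the provided news article."
user_message = "====== ARTICLE\nSome article text.\n====== ARTICLE END"
chat_prompt = build_llama2_prompt(system_message, user_message)
print(chat_prompt)
```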

##### Test 2

[INST] <<SYS>>
You are an expert journalist helping to create a QA dataset to train and evaluate LLMs.
Generate 4 questions that can be answered by reading the provided news article as context. Knowledge from reading the article must be the only one needed to answer the questions.
The first two questions need to be easy, while the latter two must be more difficult.
Do not answer the questions and do not add any comments, just write the questions.
<</SYS>>

====== ARTICLE
strong>ISLAMABAD: Another international rating agency has reckoned Pakistan among the top emerging South Asian economies owing to continuity of policies and political stability.</strongAtlantic Media Company (AMC) of the United States has ranked Pakistan as comparatively stronger economy in South Asian Markets and expected it will grow rapidly during days ahead.According to Atlantic s report the Pakistan governments investment in infrastructure and other developmental projects has caused countrys GDP to grow.It is acknowledged internationally that Pakistan is surfacing as Market leader.Current economic conditions of Pakistan are attracting foreign investors the AMC noted.It said last month American stock index firm MSCI (Morgan Stanley Capital International) also inducted Pakistan in 10 most emerging economies in the world.AMC said during the period January July 2016 Indian 100point index was 6.67% while Karachi Stock Exchange (KSE) had achieved 100 point index to 17 percent.It said security situation in Pakistan has also improved resulting in economic stability.Moreover US 46 billion investments in China Pakistan Economic Corridor (CPEC) will help Pakistan overcome the chronic problems like power crisis and unemployment.According to the report Pakistani economy has been ranked as the best among all the South Asian countries including Bangladesh India Sri Lanka and Nepal.It said KSE 100 index has been performing best among Asian markets during 2016 ranked as world s 5th best stock market.Earlier American Media outfit Bloomberg has already declared Pakistan as Asian Tiger in its recent report while Moodys had revised Pakistans ratings upward from C to B.
====== ARTICLE END [/INST]  Sure, here are four questions that can be answered by reading the provided news article:
1. What is the main reason why Pakistan's economy is expected to grow rapidly in the future?
2. According to the article, what has been the performance of the Karachi Stock Exchange (KSE) index during 2016?
3. How has the security situation in Pakistan improved, according to the article, leading to economic stability?
4. What is the amount of investment that the United States has pledged to invest in China-Pakistan Economic Corridor (CPEC)?
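
Even with the instruction not to add comments, the reply above opens with a sentence ("Sure, here are four questions...") before the numbered list. When the questions are collected into a dataset, a small parser can keep only the lines that start with a number followed by a period. The sketch below is one possible approach; `parse_numbered_questions` is a hypothetical helper, and the regular expression assumes the "1.", "2." format seen in this output.

```python
import re
from typing import List

# A line counts as a question when it starts with digits, a period, and whitespace.
_QUESTION_LINE = re.compile(r"^\s*(\d+)\.\s+(.*\S)\s*$")


def parse_numbered_questions(reply: str) -> List[str]:
    """Return the text of every 'N. question' line, ignoring any other line."""
    questions = []
    for line in reply.splitlines():
        match = _QUESTION_LINE.match(line)
        if match:
            questions.append(match.group(2))
    return questions


reply = """Sure, here are four questions that can be answered by reading the provided news article:
1. What is the main reason why Pakistan's economy is expected to grow rapidly in the future?
2. According to the article, what has been the performance of the Karachi Stock Exchange (KSE) index during 2016?
"""
parsed = parse_numbered_questions(reply)
for question in parsed:
    print(question)
```

A reply that contains no numbered lines, such as the "ERROR: the articles cannot be connected" message requested in the multi-article prompt, yields an empty list, which can be used to skip that pair of articles.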