# Advanced RAG 01: Small to Big

### Child-Parent RecursiveRetriever and Sentence Window Retrieval with LlamaIndex

Sources:
- https://docs.llamaindex.ai/en/stable/examples/retrievers/recursive_retriever_nodes.html
- https://docs.llamaindex.ai/en/latest/examples/node_postprocessor/MetadataReplacementDemo.html

In [1]:
! pip install -U llama_hub llama_index braintrust autoevals pypdf pillow transformers torch torchvision


In [13]:
import logging
import sys
import os

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [14]:
import os
import openai
# Read the API key from the OPENAI_API_KEY environment variable
openai.api_key = os.environ["OPENAI_API_KEY"]

In [4]:
!wget --user-agent "Mozilla" "https://arxiv.org/pdf/2307.09288.pdf" -O "llama2.pdf"

--2023-11-05 02:32:41--  https://arxiv.org/pdf/2307.09288.pdf
Resolving arxiv.org (arxiv.org)... 128.84.21.199
Connecting to arxiv.org (arxiv.org)|128.84.21.199|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13661300 (13M) [application/pdf]
Saving to: ‘llama2.pdf’


2023-11-05 02:32:47 (2.68 MB/s) - ‘llama2.pdf’ saved [13661300/13661300]



# Basic RAG Review

In [1]:
from pathlib import Path
from llama_hub.file.pdf.base import PDFReader
from llama_index.response.notebook_utils import display_source_node
from llama_index.retrievers import RecursiveRetriever
from llama_index.query_engine import RetrieverQueryEngine
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.llms import OpenAI
import json

### Step 1: Loading Documents

In [2]:
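import tika
tika.initVM()  # start the Tika JVM before parsing
from tika import parser
from llama_index import Document

# Extract the raw text of the PDF and wrap it in a single LlamaIndex Document
doc_text = parser.from_file("giaithuat.pdf")['content']
docs = [Document(text=doc_text)]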

In [3]:
docs[0]

Document(id_='9d03d8ac-17f2-41d7-81f1-527b3ef8c5f4', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='f1f3044ede988e28a107f3c4b24d52a98b55289edaf1aa238d9f7b14e154d47f', text='\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nhttps://thuviensach.vn\n\nLÊ MINH HOÀNG \n  \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n \n\nBài giảng chuyên đề \n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nĐại học Sư phạm Hà Nội, 1999-2002 \n\n\n\nhttps://thuviensach.vn\n\n\n\nhttps://thuviensach.vn\n\n \n\n \n\n \n\nLời cảm ơn \n\n \n\n \nTôi muốn bày tỏ lòng biết ơn đối với những người thầy đã chỉ dạy tận tình trong những năm tháng \n\nđầy khó khăn khi tôi mới bước vào học tin học và lập trình. Sự hiểu biết và lòng nhiệt tình của các \n\nthầy không những đã cung cấp cho tôi những kiến thức quý báu mà còn là tấm gương sáng cho tôi \n\nnoi theo khi tôi đứng trên bục giảng cũng với tư cách là một người thầy. \n\n \n\nCuốn tài liệu này được v

### Step 2: Parsing Documents into Text Chunks (Nodes)

In [5]:
from llama_index.node_parser import SimpleNodeParser
from llama_index.schema import IndexNode

In [6]:
node_parser = SimpleNodeParser.from_defaults(chunk_size=1024)

In [7]:
node_parser

SimpleNodeParser(text_splitter=SentenceSplitter(chunk_size=1024, chunk_overlap=20, separator=' ', paragraph_separator='\n\n\n', secondary_chunking_regex='[^,.;。？！]+[,.;。？！]?', chunking_tokenizer_fn=<function split_by_sentence_tokenizer.<locals>.split at 0x7f6d6820fe20>, callback_manager=<llama_index.callbacks.base.CallbackManager object at 0x7f6db06d2a50>, tokenizer=functools.partial(<bound method Encoding.encode of <Encoding 'gpt2'>>, allowed_special='all')), include_metadata=True, include_prev_next_rel=True, metadata_extractor=None, callback_manager=<llama_index.callbacks.base.CallbackManager object at 0x7f6db06d2a50>)

In [8]:
base_nodes = node_parser.get_nodes_from_documents(docs)
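To make the roles of `chunk_size` and `chunk_overlap` from the parser repr above more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification under stated assumptions: it counts whitespace-separated words rather than tokenizer tokens and ignores sentence and paragraph boundaries, and `chunk_words` is a hypothetical helper, not a LlamaIndex function.

```python
def chunk_words(text, chunk_size=8, chunk_overlap=2):
    """Split text into windows of `chunk_size` words.

    Consecutive windows share `chunk_overlap` words. Hypothetical helper:
    it only illustrates the size/overlap idea, not LlamaIndex's splitter.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(20))
chunks = chunk_words(sample, chunk_size=8, chunk_overlap=2)
for chunk in chunks:
    print(chunk)
```

A larger overlap reduces the chance that a fact is cut in half at a chunk boundary, at the cost of indexing more redundant text.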


In [9]:
base_nodes[0]

TextNode(id_='50afbc33-f175-44ab-a3f9-55992ca4fbf4', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='9d03d8ac-17f2-41d7-81f1-527b3ef8c5f4', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='f1f3044ede988e28a107f3c4b24d52a98b55289edaf1aa238d9f7b14e154d47f'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='0bdb5105-60c1-429a-8757-1806d03779b5', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='31a230a9dcefdb0a283d846726eddfb4d9cc8dbb3f7bb148fdccbe9b9d2351b5')}, hash='2d0603d929bc0570fd51397877fe863de68392a56ce7703b2994fb2c81e57940', text='https://thuviensach.vn\n\nLÊ MINH HOÀNG \n  \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n \n\nBài giảng chuyên đề \n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nĐại học Sư phạm Hà Nội, 1999-2002 \n\n\n\nhttps://thuviensach.vn\n\n\n\nhttps://thuviensach.vn\n\n \n\n \n\n \n\nLời cảm ơn \n\n \n\n \nTôi muốn bày tỏ lòng biết ơn đối v

In [10]:
# assign stable, human-readable node IDs (node-0, node-1, ...) so results are reproducible
for idx, node in enumerate(base_nodes):
    node.id_ = f"node-{idx}"

In [11]:
base_nodes[0]

TextNode(id_='node-0', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='9d03d8ac-17f2-41d7-81f1-527b3ef8c5f4', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='f1f3044ede988e28a107f3c4b24d52a98b55289edaf1aa238d9f7b14e154d47f'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='0bdb5105-60c1-429a-8757-1806d03779b5', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='31a230a9dcefdb0a283d846726eddfb4d9cc8dbb3f7bb148fdccbe9b9d2351b5')}, hash='2d0603d929bc0570fd51397877fe863de68392a56ce7703b2994fb2c81e57940', text='https://thuviensach.vn\n\nLÊ MINH HOÀNG \n  \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n \n\nBài giảng chuyên đề \n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nĐại học Sư phạm Hà Nội, 1999-2002 \n\n\n\nhttps://thuviensach.vn\n\n\n\nhttps://thuviensach.vn\n\n \n\n \n\n \n\nLời cảm ơn \n\n \n\n \nTôi muốn bày tỏ lòng biết ơn đối với những người thầy đã chỉ dạy

### Step 3: Select Embedding Model and LLM

In [21]:
from llama_index.embeddings import resolve_embed_model
from llama_index.embeddings import OpenAIEmbedding

embed_model = OpenAIEmbedding()
llm = OpenAI(model="gpt-3.5-turbo")
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model=embed_model
)

### Step 4: Create Index, Retriever, and Query Engine

In [14]:
base_index = VectorStoreIndex(base_nodes, service_context=service_context)
base_retriever = base_index.as_retriever(similarity_top_k=2)
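Conceptually, the retriever embeds the query and returns the `similarity_top_k` nodes whose embeddings are closest to it. The sketch below illustrates that ranking step, assuming cosine similarity and made-up 3-dimensional vectors in place of real OpenAI embeddings; `cosine_similarity` and `top_k` are hypothetical helpers, not part of LlamaIndex.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, node_vecs, k=2):
    """Return the k (node_id, score) pairs most similar to the query, best first."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in node_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumed values, for illustration only)
toy_vectors = {
    "node-0": [1.0, 0.0, 0.0],
    "node-1": [0.8, 0.6, 0.0],
    "node-2": [0.0, 0.0, 1.0],
}
hits = top_k([1.0, 0.1, 0.0], toy_vectors, k=2)
for node_id, score in hits:
    print(node_id, round(score, 3))
```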

In [19]:
retrievals = base_retriever.retrieve(
    "How much money will i receive?"
)

In [20]:
for n in retrievals:
    display_source_node(n, source_length=1500)

**Node ID:** node-1<br>**Similarity:** 0.774154330185156<br>**Text:** to use it until after it has been remotely wiped. After you have signed this Agreement, you will
need to connect your laptop to wi-fi and it will be remotely wiped and restored to manufacturer
settings. Details regarding this process are explained in the Separation FAQ document.

5. Stipends: You have until November 30, 2023 to utilize any remaining funds in your Remote Work
Set-Up Stipend and your 2023 Growth Stipend. If you incur expenses that qualify for the Remote
Work Set-Up Stipend or the Growth Stipend, you can submit those receipts through Forma by
November 30, 2023. You will also have until November 30, 2023 to claim your Work from Home,
Health & Wellness, and/or Commuter Stipends for October and November. You must also submit
those receipts through Forma by November 30, 2023. If you fail to submit your receipts by
November 30, 2023, you will forfeit any remaining funds in your stipends.

6. Relocation: If you relocated at the request of the Company for business needs, the Company will
pay relocation expenses for you to return to your original place of residence. These efforts will be
coordinated by the Company’s Immigration team (immigration@faire.com). They will be in touch
with you directly with details regarding this option.

7. It is a condition of receiving the foregoing and, by signing below, you agree that:

a. You acknowledge and will continue to abide by your continuing obligations under your
Confidentiality and Proprietary Information Agreement (attache...<br>

**Node ID:** node-0<br>**Similarity:** 0.763560731694995<br>**Text:** Hoang Phan_Canada Separation Agreement.docx


November 1, 2023

PRIVATE AND CONFIDENTIAL

WITHOUT PREJUDICE

DELIVERED BY EMAIL AND COURIER

Hoang Phan
34 Cardill Cresent
Waterloo, ON N2L 3Y6
Canada

Dear Hoang:

This letter (“Agreement”) confirms that your, Hoang Phan (“Employee”), employment with Faire
Wholesale, Inc. (Canada) (the “Company”) terminates on November 1, 2023 (“Termination Date”).
However, in order to assist you in your transition to new employment, the Company is prepared to offer
you with the following in exchange for you signing this Agreement which includes a Release:

1. Separation Payment: The Company will pay you a lump sum payment of CAD $25,000.00,
representing 10 weeks’ base salary, less applicable deductions and withholdings (the
“Separation Payment”) as soon as administratively feasible upon the signing and receipt by the
Company of the Release enclosed with this letter. The Separation Payment is in addition to your
termination entitlements under  the Ontario  Employment Standards Act, 2000 as described in our
other letter. The Separation Payment, together with the Termination Entitlements, will provide you
with  a total of 10 weeks’ of severance pay and eight weeks of notice pay, less applicable
deductions.  

2. Health Benefits: Subject to the terms and conditions of the applicable plans and policies, the
Company will  continue your participation in only the health and dental benefit plans in which you
 presently participate until the earlier ...<br>

In [17]:
query_engine_base = RetrieverQueryEngine.from_args(
    base_retriever, service_context=service_context
)

In [22]:
response = query_engine_base.query(
    "How much money will i receive?"
)
print(str(response))

You will receive a lump sum payment of CAD $25,000.00 as the Separation Payment.


# Chunk References: Smaller Child Chunks Referring to Bigger Parent Chunk

In [15]:
sub_chunk_sizes = [128, 256, 512]
sub_node_parsers = [
    SimpleNodeParser.from_defaults(chunk_size=c) for c in sub_chunk_sizes
]

all_nodes = []
for base_node in base_nodes:
    for n in sub_node_parsers:
        sub_nodes = n.get_nodes_from_documents([base_node])
        sub_inodes = [
            IndexNode.from_text_node(sn, base_node.node_id) for sn in sub_nodes
        ]
        all_nodes.extend(sub_inodes)

    # also index the original base node itself, pointing to its own ID
    original_node = IndexNode.from_text_node(base_node, base_node.node_id)
    all_nodes.append(original_node)
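The loop above produces, for every base node, several layers of smaller chunks that all carry the parent's ID. A stripped-down sketch of the same bookkeeping, assuming plain dictionaries instead of `IndexNode` objects and a naive word-count splitter (`split_words` and `build_child_records` are hypothetical names):

```python
def split_words(text, size):
    """Naive splitter: consecutive groups of `size` words, no overlap."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_child_records(parents, sub_sizes):
    """For each parent chunk, emit child records at every sub-size.

    Each record keeps `index_id`, the ID of the parent it should resolve to.
    The parent itself is also added, pointing to its own ID.
    """
    records = []
    for parent_id, text in parents.items():
        for size in sub_sizes:
            for piece in split_words(text, size):
                records.append({"text": piece, "index_id": parent_id})
        records.append({"text": text, "index_id": parent_id})
    return records


parents = {"node-0": " ".join(f"w{i}" for i in range(8))}
records = build_child_records(parents, sub_sizes=[2, 4])
print(len(records))  # 4 two-word pieces + 2 four-word pieces + the parent
```

Only the small pieces need to match the query well; the `index_id` is what lets the retriever hand the larger parent to the LLM.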

In [16]:
all_nodes_dict = {n.node_id: n for n in all_nodes}

In [17]:
len(all_nodes_dict)

9290

In [26]:
all_nodes_dict

{'f91f23d1-6c16-45f0-aa30-a605465dfa9f': IndexNode(id_='f91f23d1-6c16-45f0-aa30-a605465dfa9f', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='node-0', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='a03a72c5fc0cf6884d2cda0fda5d03b1a8935f405a73a79b48bfad20b12568b2'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='b839d022-1cec-4cda-bcae-8ad67a665f82', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='680506121bd558f6d8589f20ed0f061da649aa07c5c1172c67db663baab52a67')}, hash='a567074cd406e86a129bc30fa3791cface3b8c9594cefe4e5fe28b650a792bc3', text='Hoang Phan_Canada Separation Agreement.docx\n\n\nNovember 1, 2023\n\nPRIVATE AND CONFIDENTIAL\n\nWITHOUT PREJUDICE\n\nDELIVERED BY EMAIL AND COURIER\n\nHoang Phan\n34 Cardill Cresent\nWaterloo, ON N2L 3Y6\nCanada\n\nDear Hoang:\n\nThis letter (“Agreement”) confirms that your, Hoang Phan (“Employee”), employment wi

In [18]:
import chromadb
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")

INFO:chromadb.telemetry.product.posthog:Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.
Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.


In [22]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
from llama_index.vector_stores import ChromaVectorStore
from llama_index.storage.storage_context import StorageContext
from llama_index.embeddings import HuggingFaceEmbedding
from IPython.display import Markdown, display
import chromadb

# set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
vector_index_chunk = VectorStoreIndex(
    all_nodes, storage_context=storage_context, service_context=service_context
)


In [30]:
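# Rebuild the index with the default in-memory vector store;
# this replaces the Chroma-backed index created in the previous cell.
vector_index_chunk = VectorStoreIndex(
    all_nodes, service_context=service_context
)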

In [36]:
vector_retriever_chunk = vector_index_chunk.as_retriever(similarity_top_k=2)

In [37]:
retriever_chunk = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": vector_retriever_chunk},
    node_dict=all_nodes_dict,
    verbose=True,
)
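The idea behind the recursive step is that a hit on a small child chunk is followed to the parent it references, so the same parent can be reached through several children. A minimal sketch of that resolution, assuming a plain child-to-parent dictionary and a "keep the best score per parent" deduplication policy (both are assumptions for illustration; `resolve_to_parents` is a hypothetical helper, not the library's implementation):

```python
def resolve_to_parents(child_hits, child_to_parent):
    """Map (child_id, score) hits to unique parents, keeping each parent's best score."""
    best = {}
    for child_id, score in child_hits:
        parent_id = child_to_parent[child_id]
        if parent_id not in best or score > best[parent_id]:
            best[parent_id] = score
    # Highest-scoring parent first
    return sorted(best.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical IDs: children reference their parent; the parent references itself
child_to_parent = {
    "node-0/128/0": "node-0",
    "node-0/128/1": "node-0",
    "node-1/256/0": "node-1",
    "node-1": "node-1",
}
child_hits = [("node-1/256/0", 0.91), ("node-0/128/1", 0.88), ("node-1", 0.80)]
parent_hits = resolve_to_parents(child_hits, child_to_parent)
print(parent_hits)
```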

In [38]:
# Query (Vietnamese): "Give me an example of the Hamilton algorithm?"
nodes = retriever_chunk.retrieve(
    "Cho tôi một ví dụ về thuật toán Hamilton?"
)
for node in nodes:
    display_source_node(node, source_length=2000)

Retrieving with query id None: Cho tôi một ví dụ về thuật toán Hamilton?
Retrieved node with id, entering: node-276
Retrieving with query id node-276: Cho tôi một ví dụ về thuật toán Hamilton?
Retrieved node with id, entering: node-419
Retrieving with query id node-419: Cho tôi một ví dụ về thuật toán Hamilton?

**Node ID:** node-276<br>**Similarity:** 0.8839771767534763<br>**Text:** https://thuviensach.vn

Quy hoạch động 

Lê Minh Hoàng 

 147 

{Hàm Find, tìm vị trí j mà nếu đem ai ghép vào đầu dãy con đơn điệu tăng dài nhất bắt đầu từ aj sẽ được dãy đơn 
điệu tăng dài nhất bắt đầu tại ai} 
function Find(i: Integer): Integer; 
var 
  inf, sup, median, j: Integer; 
begin 
  inf := 1; sup := m + 1; 
  repeat  {Thuật toán tìm kiếm nhị phân} 
    median := (inf + sup) div 2; 
    j := StartOf[median]; 
    if a[j] > a[i] then inf := median {Luôn để aStartOf[inf] > ai ≥ aStartOf[sup]} 
    else sup := median; 
  until inf + 1 = sup; 
  Find := StartOf[inf]; 
end; 
 
procedure Optimize; 
var 
  i, j, k: Integer; 
begin 
  for i := n downto 0 do 
    begin 
      j := Find(i); 
      k := L[j] + 1; 
      if k > m then 
        begin 
          m := k; 
          StartOf[k] := i; 
        end 
      else 
        if a[StartOf[k]] < a[i] then 
          StartOf[k] := i; 
      L[i] := k; 
      T[i] := j; 
    end; 
end; 
 
procedure Result; 
var 
  f: Text; 
  i: Integer; 
begin 
  Assign(f, OutputFile); Rewrite(f); 
  WriteLn(f, m - 2); 
  i := T[0]; 
  while i <> n + 1 do 
    begin 
      WriteLn(f, 'a[', i, '] = ', a[i]); 
      i := T[i]; 
    end; 
  Close(f); 
end; 
 
begin 
  Enter; 
  Init; 
  Optimize; 
  Result; 
end. 
Dễ thấy chi phí thời gian thực hiện giải thuật này cấp O(nlogn), đây là một ví dụ điển hình 

cho thấy rằng một công thức truy hồi có thể có nhiều phương pháp tính. 



https://thuviensach.vn

Chuyên đề 

Đại học Sư phạm Hà Nội, 1999-2002 

 148 

3.2.<br>

**Node ID:** node-419<br>**Similarity:** 0.8777730779727762<br>**Text:** https://thuviensach.vn

Các thuật toán trên đồ thị 

Lê Minh Hoàng 

 229 

Hamilton. Ta có thể xây dựng hành trình bằng thuật toán quay lui kết hợp với phương pháp duyệt 

ưu tiên Warnsdorff: Nếu gọi deg(x, y) là số ô kề với ô (x, y) và chưa đi qua (kề ở đây theo nghĩa 

đỉnh kề chứ không phải là ô kề cạnh) thì từ một ô ta sẽ không thử xét lần lượt các hướng đi có thể, 

mà ta sẽ ưu tiên thử hướng đi tới ô có deg nhỏ nhất trước. Trong trường hợp có tồn tại đường 

đi, phương pháp này hoạt động với tốc độ tuyệt vời: Với mọi n chẵn trong khoảng từ 6 tới 18, với 

mọi vị trí ô xuất phát, trung bình thời gian tính từ lúc bắt đầu tới lúc tìm ra một nghiệm < 1 giây. 

Tuy nhiên trong trường hợp n lẻ, có lúc không tồn tại đường đi, do phải duyệt hết mọi khả năng 

nên thời gian thực thi lại hết sức tồi tệ. (Có xét ưu tiên như trên hay xét thứ tự như trước kia thì 

cũng vậy thôi. Ta có thể thử với n lẻ: 5, 7, 9 … và ô xuất phát (1, 2), sau đó ngồi xem máy tính toát 

mồ hôi). 



https://thuviensach.vn

Chuyên đề 

Đại học Sư phạm Hà Nội, 1999-2002 

 230 

§8. BÀI TOÁN ĐƯỜNG ĐI NGẮN NHẤT 

8.1.<br>

In [39]:
query_engine_chunk = RetrieverQueryEngine.from_args(
    retriever_chunk, service_context=service_context
)

In [40]:
# Query (Vietnamese): "Give me an example of the Hamilton algorithm?"
response = query_engine_chunk.query(
    "Cho tôi một ví dụ về thuật toán Hamilton?"
)
print(str(response))

Retrieving with query id None: Cho tôi một ví dụ về thuật toán Hamilton?
Retrieved node with id, entering: node-276
Retrieving with query id node-276: Cho tôi một ví dụ về thuật toán Hamilton?
Retrieved node with id, entering: node-419
Retrieving with query id node-419: Cho tôi một ví dụ về thuật toán Hamilton?
Một ví dụ về thuật toán Hamilton là thuật toán quay lui kết hợp với phương pháp duyệt ưu tiên Warnsdorff. Thuật toán này được sử dụng để xây dựng hành trình Hamilton trên đồ thị. Nếu có một đường đi Hamilton tồn tại, phương pháp này có thể tìm ra hành trình đó với tốc độ tuyệt vời. Tuy nhiên, trong trường hợp không tồn tại đường đi Hamilton, thời gian thực thi của thuật toán này có thể rất lâu.


# Evaluation

In [None]:
from llama_index.evaluation import (
    generate_question_context_pairs,
    EmbeddingQAFinetuneDataset,
)
import nest_asyncio

nest_asyncio.apply()

In [None]:
eval_dataset = generate_question_context_pairs(base_nodes)

100%|██████████| 80/80 [02:37<00:00,  1.97s/it]


In [None]:
eval_dataset.save_json("llama2_eval_dataset.json")
# To reload the saved dataset instead of regenerating it:
# eval_dataset = EmbeddingQAFinetuneDataset.from_json("llama2_eval_dataset.json")

In [None]:
import pandas as pd
from llama_index.evaluation import RetrieverEvaluator, get_retrieval_results_df

# use a larger similarity_top_k for evaluation than for the demo queries above
top_k = 10


def display_results(names, results_arr):
    """Display results from evaluate."""

    hit_rates = []
    mrrs = []
    for name, eval_results in zip(names, results_arr):
        metric_dicts = []
        for eval_result in eval_results:
            metric_dict = eval_result.metric_vals_dict
            metric_dicts.append(metric_dict)
        results_df = pd.DataFrame(metric_dicts)

        hit_rate = results_df["hit_rate"].mean()
        mrr = results_df["mrr"].mean()
        hit_rates.append(hit_rate)
        mrrs.append(mrr)

    final_df = pd.DataFrame(
        {"retrievers": names, "hit_rate": hit_rates, "mrr": mrrs}
    )
    display(final_df)

In [None]:
# base
base_retriever = base_index.as_retriever(similarity_top_k=top_k)
retriever_evaluator = RetrieverEvaluator.from_metric_names(
    ["mrr", "hit_rate"], retriever=base_retriever
)
results_base = await retriever_evaluator.aevaluate_dataset(
    eval_dataset, show_progress=True
)




100%|██████████| 167/167 [00:15<00:00, 10.80it/s]


In [None]:
# chunk
vector_retriever_chunk = vector_index_chunk.as_retriever(
    similarity_top_k=top_k
)
retriever_chunk = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": vector_retriever_chunk},
    node_dict=all_nodes_dict,
    verbose=True,
)
retriever_evaluator = RetrieverEvaluator.from_metric_names(
    ["mrr", "hit_rate"], retriever=retriever_chunk
)

results_chunk = await retriever_evaluator.aevaluate_dataset(
    eval_dataset, show_progress=True
)

  0%|          | 0/167 [00:00<?, ?it/s]

Retrieving with query id None: In the context of language processing, what is the significance of the paper "A general language assistant as a laboratory for alignment" by Askell et al. (2021a)?
Retrieved node with id, entering: node-7
Retrieving with query id node-7: In the context of language processing, what is the significance of the paper "A general language assistant as a laboratory for alignment" by Askell et al. (2021a)?
Retrieved node with id, entering: node-41
Retrieving with query id node-41: In the context of language processing, what is the significance of the paper "A general language assistant as a laboratory for alignment" by Askell et al. (2021a)?
Retrieved node with id, entering: node-38
Retrieving with query id node-38: In the context of language processing, what is the significance of the paper "A general language assistant as a laboratory for alignment" by Askel

100%|██████████| 167/167 [00:29<00:00,  5.62it/s]

Retrieved node with id, entering: node-79
Retrieving with query id node-79: How does the study introduce Llama 2 as a new family of pretrained and fine-tuned models? Discuss the scale of parameters, competitiveness with existing models, and alignment with principles of helpfulness and safety.
Retrieved node with id, entering: node-4
Retrieving with query id node-4: How does the study introduce Llama 2 as a new family of pretrained and fine-tuned models? Discuss the scale of parameters, competitiveness with existing models, and alignment with principles of helpfulness and safety.
Retrieved node with id, entering: node-7
Retrieving with query id node-7: How does the study introduce Llama 2 as a new family of pretrained and fine-tuned models? Discuss the scale of parameters, competitiveness with existing models, and alignment with principles of helpfulness and safety.
Retrieved n




In [None]:
full_results_df = get_retrieval_results_df(
    [
        "Base Retriever",
        "Retriever (Chunk References)"
    ],
    [results_base, results_chunk],
)
display(full_results_df)

|   | retrievers | hit_rate | mrr |
|---|---|---|---|
| 0 | Base Retriever | 0.802395 | 0.603816 |
| 1 | Retriever (Chunk References) | 0.91018 | 0.768686 |
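For readers unfamiliar with the two metrics, hit rate is the fraction of queries for which an expected node appears anywhere in the retrieved list, and MRR (mean reciprocal rank) averages 1/rank of the first expected node, counting a miss as 0. A small sketch with made-up query results (the node IDs and expected sets are illustrative assumptions, and `hit_rate` and `reciprocal_rank` are hypothetical helpers rather than the evaluator's own code):

```python
def hit_rate(retrieved_ids, expected_ids):
    """1.0 if any expected node was retrieved, else 0.0."""
    return 1.0 if any(doc_id in expected_ids for doc_id in retrieved_ids) else 0.0


def reciprocal_rank(retrieved_ids, expected_ids):
    """1 / rank of the first expected node (ranks start at 1); 0.0 on a miss."""
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in expected_ids:
            return 1.0 / rank
    return 0.0


# (retrieved list in rank order, set of expected node IDs)
queries = [
    (["node-3", "node-7", "node-1"], {"node-3"}),  # found at rank 1
    (["node-2", "node-5", "node-9"], {"node-5"}),  # found at rank 2
    (["node-4", "node-6", "node-8"], {"node-0"}),  # not found
]
mean_hit_rate = sum(hit_rate(r, e) for r, e in queries) / len(queries)
mrr = sum(reciprocal_rank(r, e) for r, e in queries) / len(queries)
print(mean_hit_rate, mrr)
```

Hit rate only asks whether the right node was found, while MRR also rewards ranking it near the top, which is why the two columns in the table can move by different amounts.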


# Sentence Window Retrieval


In [41]:
from llama_index.node_parser import SentenceWindowNodeParser

In [42]:
# create the sentence window node parser w/ default settings
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
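With this parser, each node holds a single sentence, and its metadata stores that sentence together with up to `window_size` neighbouring sentences on each side. A minimal sketch of the windowing logic, assuming a naive regex sentence splitter and plain dictionaries instead of LlamaIndex nodes (`split_sentences` and `build_sentence_windows` are hypothetical helpers):

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def build_sentence_windows(text, window_size=3):
    """One record per sentence, with its surrounding window stored in metadata."""
    sentences = split_sentences(text)
    nodes = []
    for i, sentence in enumerate(sentences):
        lo = max(0, i - window_size)
        hi = min(len(sentences), i + window_size + 1)
        nodes.append(
            {
                "text": sentence,  # what gets embedded
                "metadata": {
                    "original_text": sentence,
                    "window": " ".join(sentences[lo:hi]),  # what the LLM will see
                },
            }
        )
    return nodes


window_nodes = build_sentence_windows("One. Two. Three. Four. Five.", window_size=1)
for node in window_nodes:
    print(node["text"], "->", node["metadata"]["window"])
```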

In [43]:
node_parser

SentenceWindowNodeParser(sentence_splitter=<function split_by_sentence_tokenizer.<locals>.split at 0x7f458bba5f80>, window_size=3, window_metadata_key='window', original_text_metadata_key='original_text', include_metadata=True, include_prev_next_rel=True, metadata_extractor=None, callback_manager=<llama_index.callbacks.base.CallbackManager object at 0x7f45cd398b50>)

In [44]:
sentence_nodes = node_parser.get_nodes_from_documents(docs)

In [None]:
sentence_index = VectorStoreIndex(sentence_nodes, service_context=service_context)

In [None]:
from llama_index.indices.postprocessor import MetadataReplacementPostProcessor

query_engine = sentence_index.as_query_engine(
    similarity_top_k=2,
    # the target key defaults to `window` to match the node_parser's default
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
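The postprocessor's job is to swap each retrieved node's text (a single sentence) for the larger window stored under `target_metadata_key` before the text reaches the LLM. A rough sketch of that swap on plain dictionaries, assuming a fallback to the original text when the key is missing (`replace_with_window` is a hypothetical stand-in, not the library class):

```python
def replace_with_window(retrieved_nodes, target_metadata_key="window"):
    """Return copies of the nodes whose text is replaced by the metadata window."""
    replaced = []
    for node in retrieved_nodes:
        new_text = node["metadata"].get(target_metadata_key, node["text"])
        replaced.append({**node, "text": new_text})  # copy; input is not mutated
    return replaced


retrieved = [
    {
        "text": "Two.",
        "score": 0.9,
        "metadata": {"original_text": "Two.", "window": "One. Two. Three."},
    },
    {"text": "No window here.", "score": 0.5, "metadata": {}},
]
expanded = replace_with_window(retrieved)
for node in expanded:
    print(node["score"], node["text"])
```

Retrieval scores still come from the single-sentence embeddings; only the text passed to the LLM grows.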


In [None]:
window_response = query_engine.query(
    "Can you tell me about the key concepts for safety finetuning"
)
print(window_response)

The key concepts for safety fine-tuning include supervised safety fine-tuning and safety reinforcement learning from human feedback (RLHF). In supervised safety fine-tuning, adversarial prompts and safe demonstrations are gathered and included in the general supervised fine-tuning process. This helps align the model with safety guidelines even before RLHF and lays the foundation for high-quality human preference data annotation. Safety RLHF involves integrating safety into the general RLHF pipeline. These techniques are used to mitigate safety risks and ensure the safety of the system during fine-tuning.


In [None]:
# check the original sentence that was retrieved for each node, as well as the actual window of sentences that was sent to the LLM.
window = window_response.source_nodes[0].node.metadata["window"]
sentence = window_response.source_nodes[0].node.metadata["original_text"]

print(f"Window: {window}")
print("------------------")
print(f"Original Sentence: {sentence}")

Window: Further
testing and mitigation should be done to understand bias and other social issues for the specific context
in which a system may be deployed.  For this, it may be necessary to test beyond the groups available in
the BOLD dataset (race, religion, and gender).  As LLMs are integrated and deployed, we look forward to
continuing research that will amplify their potential for positive impact on these important social issues.
 4.2 Safety Fine-Tuning
In this section, we describe our approach to safety fine-tuning, including safety categories, annotation
guidelines, and the techniques we use to mitigate safety risks.  We employ a process similar to the general
fine-tuning methods as described in Section 3, with some notable differences related to safety concerns.
 Specifically, we use the following techniques in safety fine-tuning:
1. Supervised Safety Fine-Tuning: We initialize by gathering adversarial prompts and safe demonstrations
that are then included in the general supervised fine-tuning process (Sec