# Metadata Replacement + Node Sentence Window

In this notebook, we use the `SentenceWindowNodeParser` to parse documents into single sentences per node. Each node also contains a “window” with the sentences on either side of the node sentence.

Then, during retrieval, before the retrieved sentences are passed to the LLM, each single sentence is replaced with the window containing its surrounding sentences, using the `MetadataReplacementPostProcessor`.
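The two steps described above can be sketched in plain Python, without LlamaIndex. This is only an illustration of the idea, not the library's implementation: the helper names `build_window_nodes` and `replace_with_window` are hypothetical, nodes are plain dictionaries, and the regex sentence splitter is a naive stand-in for a real one. The metadata keys `window` and `original_text` mirror the ones used in this notebook.

```python
import re


def build_window_nodes(text, window_size=5):
    """Split text into sentences and build one node per sentence.

    Each node stores its own sentence as text, plus a "window" of
    neighbouring sentences in its metadata (hypothetical helper).
    """
    # Naive splitter for illustration: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    nodes = []
    for i, sentence in enumerate(sentences):
        # Assumption: the window is clipped at the start and end of the document.
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append(
            {
                "text": sentence,
                "metadata": {
                    "window": " ".join(sentences[start:end]),
                    "original_text": sentence,
                },
            }
        )
    return nodes


def replace_with_window(node, target_metadata_key="window"):
    """Return a copy of the node whose text is swapped for a metadata value."""
    replaced = dict(node)
    replaced["text"] = node["metadata"].get(target_metadata_key, node["text"])
    return replaced


doc = "A is first. B is second. C is third. D is fourth."
nodes = build_window_nodes(doc, window_size=1)

hit = nodes[2]  # pretend retrieval matched the single sentence "C is third."
print(hit["text"])
print(replace_with_window(hit)["text"])
```

The point of the design is that the embedding is computed over a single sentence, which keeps matching precise, while the LLM still receives enough surrounding context to answer.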

This is most useful for large documents/indexes, as it helps to retrieve more fine-grained details.

In this notebook, the sentence window is set to 5 sentences on either side of the original sentence (`window_size=5`).
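As a rough illustration of what a window of 5 sentences on either side means, the sketch below computes which sentence indices would fall inside the window for a given position. The function `window_bounds` is hypothetical, and it assumes a symmetric window that is clipped at the document boundaries; the library's exact slicing may differ.

```python
def window_bounds(index, num_sentences, window_size=5):
    """Return (start, end) slice bounds of the window around sentence `index`.

    Assumption: the window is symmetric and clipped at the document edges.
    """
    start = max(0, index - window_size)
    end = min(num_sentences, index + window_size + 1)
    return start, end


# A document with 20 sentences and a window size of 5:
middle = window_bounds(10, 20)  # full window: 5 before + the sentence + 5 after
first = window_bounds(0, 20)    # clipped: nothing precedes the first sentence
last = window_bounds(19, 20)    # clipped: nothing follows the last sentence

for name, (start, end) in [("middle", middle), ("first", first), ("last", last)]:
    print(name, start, end, "->", end - start, "sentences")
```

Under these assumptions a sentence in the middle of a document carries up to 11 sentences of context, while sentences near the edges carry fewer.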

In this case, chunk size settings are ignored; the window settings determine how much text each node carries.


In [1]:
from llama_index import ServiceContext

from llama_index.node_parser import (
    SentenceWindowNodeParser,
)

node_parser = SentenceWindowNodeParser(
    window_size=5,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)


service_context = ServiceContext.from_defaults(llm="local", embed_model="local", node_parser=node_parser)

llama_model_loader: loaded meta data with 19 key-value pairs and 363 tensors from /Users/zeyuli/Library/Caches/llama_index/models/llama-2-13b-chat.Q4_0.gguf (version GGUF V2)
llama_model_loader: - tensor    0:                token_embd.weight q4_0     [  5120, 32000,     1,     1 ]
llama_model_loader: - tensor    1:           blk.0.attn_norm.weight f32      [  5120,     1,     1,     1 ]
llama_model_loader: - tensor    2:            blk.0.ffn_down.weight q4_0     [ 13824,  5120,     1,     1 ]
llama_model_loader: - tensor    3:            blk.0.ffn_gate.weight q4_0     [  5120, 13824,     1,     1 ]
llama_model_loader: - tensor    4:              blk.0.ffn_up.weight q4_0     [  5120, 13824,     1,     1 ]
llama_model_loader: - tensor    5:            blk.0.ffn_norm.weight f32      [  5120,     1,     1,     1 ]
llama_model_loader: - tensor    6:              blk.0.attn_k.weight q4_0     [  5120,  5120,     1,     1 ]
llama_model_loader: - tensor    7:         blk.0.attn_output.weight q

In [3]:
from qasper_data.qasper_dataset import QasperDataset, PaperIndex

train_paper = QasperDataset("train").as_papers()[0]


paper_index = PaperIndex(train_paper, service_context)

sentence_index = paper_index.as_index()


In [9]:
from llama_index.indices.postprocessor import MetadataReplacementPostProcessor, SentenceTransformerRerank

query_engine = sentence_index.as_query_engine(
    similarity_top_k=3,
    # the target key defaults to `window` to match the node_parser's default
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window"),
    ],
)


Llama.generate: prefix-match hit


  Based on the provided context information, the study's results are as follows:

1. The proposed method for extracting affective events from Japanese text using discourse relations performed well even with a minimal amount of supervision.
2. The method correctly learned non-compositional expressions, such as idiomatic phrases like "肩を落とす" (lit. drop one's shoulders), which express disappointed feelings.
3. The event pairs linked by discourse analysis were found to be useful, but noisy, and adding linguistically-motivated filtering rules could improve the performance.
4. The study used a dataset of about 100 million sentences extracted from Japanese websites using HTML layouts and linguistic patterns, which covered various genres.
5. The model was trained using a semi-supervised approach with the latest version of the ACP Corpus, which consists of 15 positive words and 15 negative words.
6. The objective function for supervised training was defined as the sum of the reference scores of


llama_print_timings:        load time =   15141.56 ms
llama_print_timings:      sample time =      24.29 ms /   256 runs   (    0.09 ms per token, 10540.62 tokens per second)
llama_print_timings: prompt eval time =   14134.71 ms /   989 tokens (   14.29 ms per token,    69.97 tokens per second)
llama_print_timings:        eval time =   15494.57 ms /   255 runs   (   60.76 ms per token,    16.46 tokens per second)
llama_print_timings:       total time =   30224.37 ms
