## Setup

Install the required LlamaIndex packages for embeddings and the OpenAI LLM.

In [1]:
# I commented out the following line because it's already satisfied in my env

# %pip install llama-index-embeddings-openai
# %pip install llama-index-embeddings-huggingface
# %pip install llama-index-llms-openai

In [2]:
# Enable automatic reloading of modules
%load_ext autoreload
%autoreload 2

In [3]:
# I commented out the following line because it's already satisfied in my env

# !pip install llama-index

In [1]:
import os
import openai
from dotenv import load_dotenv

# Load environment variables from a .env file
load_dotenv()

True

In [2]:
# Set the OpenAI API key from the environment variable
openai.api_key = os.getenv("OPENAI_API_KEY")
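`os.getenv` returns `None` when the variable is missing, so a missing key only surfaces later, at the first API call. The sketch below fails early instead. `require_env` is a hypothetical helper introduced here for illustration, not part of `openai` or `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

With this helper, the cell above could read `openai.api_key = require_env("OPENAI_API_KEY")`.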

In [3]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core.node_parser import SentenceSplitter

# create the sentence window node parser (window_size=5: five sentences of context on each side)
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=5,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

# base node parser is a sentence splitter
text_splitter = SentenceSplitter()
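To make the idea behind the sentence window parser concrete, here is a minimal pure-Python sketch, assuming a window means "the sentence plus up to `window_size` neighbours on each side". `sentence_windows` is a hypothetical helper, and its regex split is far cruder than the tokenizer a real parser would use. It only illustrates how `original_text` and `window` relate.

```python
import re


def sentence_windows(text: str, window_size: int = 5) -> list[dict]:
    """Pair each sentence with a window of its neighbouring sentences."""
    # Naive split on whitespace that follows ., ! or ? (illustration only).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    records = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        records.append(
            {
                "original_text": sentence,  # the text that gets embedded
                "window": " ".join(sentences[start:end]),  # the text shown to the LLM
            }
        )
    return records


records = sentence_windows("A one. B two. C three. D four.", window_size=1)
print(records[1])
# {'original_text': 'B two.', 'window': 'A one. B two. C three.'}
```

Sentences at the start or end of the text simply get a shorter window, because the slice is clamped to the list bounds.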

We'll define two crucial models:

LLM (Language Model): The OpenAI model with the configuration "gpt-3.5-turbo" and a temperature setting of 0.1. This model is responsible for generating responses based on user queries and relevant context.

Embedding Model: The HuggingFaceEmbedding model with the model name "sentence-transformers/all-mpnet-base-v2" and a maximum sequence length of 512. This model is used to create vector embeddings for individual text chunks.

In [4]:
# Select Embedding Model and LLM
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-mpnet-base-v2", max_length=512
)

from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model
Settings.text_splitter = text_splitter

## Data Loading, Index Building
In this section, we load the data and construct the vector index.

In [5]:
from llama_index.core import SimpleDirectoryReader

# List of PDF files to be read
input_files = [
    "./data/Grasslan_carbon_sequ.pdf",
    "./data/Rot_gas_serre.pdf"
]

# Create a SimpleDirectoryReader instance
documents = SimpleDirectoryReader(input_files=input_files).load_data()

## Extract Nodes

We identify and extract the set of nodes that will be stored in the VectorIndex. This set encompasses nodes processed by the sentence window parser and the "base" nodes extracted using the standard parser.

In [6]:
nodes = node_parser.get_nodes_from_documents(documents)

In [7]:
base_nodes = text_splitter.get_nodes_from_documents(documents)

for idx, node in enumerate(base_nodes):
    node.id_ = f"node-{idx}"
    print(node)

Node ID: node-0
Text: Animal (2010), 4:3, pp 334–350 &The Animal Consortium 2009
doi:10.1017/S1751731109990784animal Mitigating the greenhouse gas
balance of ruminant production systems through carbon sequestration in
grasslands J. F . Soussana1-, T. Tallec1and V. Blanfort1,2 1INRA
UR0874, UREP Grassland Ecosystem Research, 234, Avenue du Bre ´zet,
Clermont-Ferrand, ...
Node ID: node-1
Text: This will require avoidingland use changes that reduce ecosystem
soil C stocks (e.g.deforestation, ploughing up long-term grasslands)
and a cautious management of pastures, aiming at preserving
andrestoring soils and their soil organic matter content. Combinedwith
other mitigation measures, such as a reduction in the useof N
fertilisers, of foss...
Node ID: node-2
Text: and provide a variety of goods and services to support ﬂora,
fauna, and human populations worldwide. On aglobal scale, livestock
use 3.4 billion hectares of grazingland (i.e. grasslands and
rangelands), in addition to animal feed pr

We build both the sentence index, as well as the “base” index.

In [13]:
from llama_index.core import VectorStoreIndex

# Create a VectorStoreIndex instance using the 'nodes' dataset
sentence_index = VectorStoreIndex(nodes)

In [14]:
# Create another VectorStoreIndex instance using the 'base_nodes' dataset
base_index = VectorStoreIndex(base_nodes)

## Querying

#### With MetadataReplacementPostProcessor

#### Querying with window method

In [186]:
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# Create a QueryEngine instance using the window retrieval
query_engine = sentence_index.as_query_engine(
    similarity_top_k=5,
    # the target key defaults to `window` to match the node_parser's default
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
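Conceptually, the postprocessor swaps each retrieved node's text for the value stored under the target metadata key before the text reaches the LLM. The sketch below mimics that behaviour with a toy class. `ToyNode` and `replace_with_metadata` are hypothetical names for illustration, not LlamaIndex APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a retrieved node (not a LlamaIndex class)."""

    text: str
    metadata: dict = field(default_factory=dict)


def replace_with_metadata(nodes: list[ToyNode], target_key: str = "window") -> list[ToyNode]:
    """Swap each node's text for metadata[target_key]; keep the text if the key is absent."""
    return [
        ToyNode(text=node.metadata.get(target_key, node.text), metadata=node.metadata)
        for node in nodes
    ]


retrieved = [
    ToyNode("B two.", {"window": "A one. B two. C three.", "original_text": "B two."})
]
for node in replace_with_metadata(retrieved):
    print(node.text)
# A one. B two. C three.
```

Retrieval still matches on the single embedded sentence; only the text passed on for answer synthesis changes.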

#### Querying with normal VectorStoreIndex

In [187]:
# Create a QueryEngine instance using base retrieval
base_query_engine = base_index.as_query_engine(similarity_top_k=2)

If the base query engine fails to answer a question, we can increase `similarity_top_k` (for example, to 4). This will be slower and use more tokens than the sentence window index.
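As a rough illustration of what `similarity_top_k` controls, the sketch below ranks toy two-dimensional "embeddings" by cosine similarity and keeps the best `k`. The vectors and the `top_k` helper are made up for this example. A real index uses the embedding model and vector store configured earlier.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k):
    """Return the ids of the k chunks most similar to the query, best first."""
    ranked = sorted(
        chunk_vecs, key=lambda cid: cosine(query_vec, chunk_vecs[cid]), reverse=True
    )
    return ranked[:k]


# Toy embeddings (assumed values, for illustration only).
chunks = {
    "node-0": [1.0, 0.0],
    "node-1": [0.8, 0.6],
    "node-2": [0.0, 1.0],
    "node-3": [-1.0, 0.0],
}
query = [1.0, 0.1]

print(top_k(query, chunks, k=2))  # ['node-0', 'node-1']
print(top_k(query, chunks, k=4))  # ['node-0', 'node-1', 'node-2', 'node-3']
```

Every extra chunk is appended to the prompt in full, which is why a larger `k` costs more tokens with large base chunks than with short sentence windows.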

## Use Gradio UI to interact with our chatbot

In [191]:
import gradio as gr
import time


with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.ClearButton([msg, chatbot])

    def respond(message, chat_history):
        bot_message = str(base_query_engine.query(message))
        print(bot_message)
        chat_history.append((message, bot_message))
        time.sleep(0.05)
        return "", chat_history

    msg.submit(respond, [msg, chatbot], [msg, chatbot])

demo.launch()

Running on local URL:  http://127.0.0.1:7862

To create a public link, set `share=True` in `launch()`.




# Analysis

We examine both the initial sentence retrieved for each node and the specific set of sentences sent to the LLM.

In [175]:
window_response = query_engine.query("What GHG stands for?")

sentence = window_response.source_nodes[0].node.metadata["original_text"]
window = window_response.source_nodes[0].node.metadata["window"]

print(f"Original Sentence: {sentence}")
print("------------------")
print(f"Window: {window}")

Original Sentence: Advanced (Tier 3) and veriﬁable methodolo-
gies still need to be developed in order to include GHGremovals obtained by farm scale mitigation options inagriculture, forestry and land use (AFOLU sector, IPCC,2006) national GHG inventories.

------------------
Window: The balance of the farm gate
GHG ﬂuxes leads to a sink activity for four out of the sevenfarms.  When including pre-chain emissions related to inputs,all farms – but one – were found to be net sources of GHG.The total farm GHG balance varied between a sink of 270
and a source of 1310 kg CO
2equivalents per unit (GJ)
energy in animal farm products.  Byrne et al . ( 2 0 0 7 ) ,m e a -
suring C balance for two dairy farms in South West Ireland,equally considered the farm perimeter as the systemboundary for inputs and outputs of C. In the two casestudies, both farms appeared as net C sinks, sequesteringbetween 200 and 215 g C/m
2per year (Table 1).
 Farm scale mitigation options thus need to be carefully
asses

In [19]:
for source_node in window_response.source_nodes:
    print("------------------")
    print(source_node.node.metadata["original_text"])

------------------
Advanced (Tier 3) and veriﬁable methodolo-
gies still need to be developed in order to include GHGremovals obtained by farm scale mitigation options inagriculture, forestry and land use (AFOLU sector, IPCC,2006) national GHG inventories.

------------------
The GHG inventory methodology usedby IPCC (IPCC, 1996 and 2006) only includes, however,farm emissions in the agriculture sector. 
------------------
Nevertheless, sites that were intensively managed bygrazing and cutting had a negative NGHG and were
therefore estimated to be GHG sources in CO
2equivalents.

------------------
The mean on-site NGHG reached 198 g CO 2equiva-
lents/m2per year, indicating a sink for the atmosphere.

------------------
The net emissions of GHGs (methane, nitrousoxide and carbon dioxide) are related to C and nitrogen
ﬂows and to environmental conditions.



Here, the sentence window index retrieved five sentences, and each of them mentions GHG.

**N.B.:** although the embeddings are computed from the original sentence alone, the LLM receives the surrounding window, so it also reads the context around each GHG mention.

In [189]:
vector_response = base_query_engine.query("What is NCS?")

for node in vector_response.source_nodes:
    print("NCS mentioned?", "NCS" in node.node.text)
    print("--------")

NCS mentioned? True
--------
NCS mentioned? True
--------


In [190]:
print(vector_response.source_nodes[0].node.text)

The uncertainty associated to NCS can be estimated using
Gaussian error propagation rules and accounting for sitenumber in each management type. NCS uncertainty reached
25% and 80% of the mean (data not shown) for grazed and
for cut and mixed systems, respectively. Indeed, Ammann
et
al. (2007) reported that cutting and manure application
introduce further uncertainties in NCS estimates.
A literature search shows that grassland C sequestration
reaches on average 5 630 g C/m2per year according to
inventories of SOC stocks and 22 656 g C/m2per year
according to C ﬂux balance (Table 1). These two estimates
are therefore not signiﬁcantly different, although there has
not yet been any direct comparison at the same sitebetween C ﬂux and C stock change measurements.According to both ﬂux ( 2231 and 77 g C/m
2per year,
respectively, Table 1A) and inventory (Bellamy et al ., 2005)Grassland carbon sequestration
341


LLMs tend to give more weight to the text at the beginning and end of the retrieved context, while downplaying the middle.
So SOC is discussed, but it is in the middle chunk of the document.

# Evaluation

We'll do an evaluation to assess the effectiveness of the sentence window retriever in comparison to the base retriever.

We generate an evaluation dataset of questions and reference answers, then run several assessments on it.

In [193]:
from llama_index.core.evaluation import DatasetGenerator

from llama_index.llms.openai import OpenAI
import nest_asyncio
import random

nest_asyncio.apply()

In [9]:
len(base_nodes)

61

In [194]:
num_nodes_eval = 10

# there are 61 nodes in total; sample from the first 25 to generate questions
sample_eval_nodes = random.sample(base_nodes[:25], num_nodes_eval)

# generate questions from the largest chunks (1024)
dataset_generator = DatasetGenerator(
    sample_eval_nodes,
    llm=OpenAI(model="gpt-3.5-turbo"),
    show_progress=True,
    num_questions_per_chunk=2,
)
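Because `random.sample` is unseeded here, each run picks different nodes and so produces a different question set. If reproducibility matters, one option is a private seeded generator, sketched below. `sample_reproducibly` is a hypothetical helper, and the seed value is arbitrary.

```python
import random


def sample_reproducibly(items, n, seed=42):
    """Draw n items without replacement using a private, seeded generator."""
    rng = random.Random(seed)  # the seed value is an arbitrary choice
    return rng.sample(list(items), n)


# Stand-in ids; in the notebook this would be base_nodes[:25].
node_ids = [f"node-{i}" for i in range(25)]
first = sample_reproducibly(node_ids, 10)
second = sample_reproducibly(node_ids, 10)
print(first == second)  # True
```

Using `random.Random(seed)` rather than `random.seed(seed)` leaves the global random state untouched for the rest of the notebook.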


In [195]:
eval_dataset = await dataset_generator.agenerate_dataset_from_nodes()

100%|██████████| 10/10 [00:03<00:00,  3.32it/s]
100%|██████████| 2/2 [00:03<00:00,  1.51s/it]
100%|██████████| 2/2 [00:02<00:00,  1.38s/it]
100%|██████████| 2/2 [00:02<00:00,  1.25s/it]
100%|██████████| 2/2 [00:02<00:00,  1.03s/it]
100%|██████████| 2/2 [00:03<00:00,  1.66s/it]
100%|██████████| 2/2 [00:02<00:00,  1.09s/it]
100%|██████████| 2/2 [00:02<00:00,  1.14s/it]
100%|██████████| 2/2 [00:03<00:00,  1.57s/it]
100%|██████████| 2/2 [00:03<00:00,  1.55s/it]
100%|██████████| 2/2 [00:04<00:00,  2.29s/it]


In [197]:
eval_dataset.save_json("data/ipcc_eval_qr_dataset.json")

### Compare Results

In [198]:
import nest_asyncio

nest_asyncio.apply()

In [199]:
max_samples = 10

eval_qs = eval_dataset.questions
ref_response_strs = [r for (_, r) in eval_dataset.qr_pairs]

# resetup base query engine and sentence window query engine
# base query engine
base_query_engine = base_index.as_query_engine(similarity_top_k=2)

# sentence window query engine
query_engine = sentence_index.as_query_engine(
    similarity_top_k=2,
    # the target key defaults to `window` to match the node_parser's default
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)

In [200]:
import asyncio

# Define a function to get responses
async def aget_responses(questions, query_engine, show_progress=True):
    tasks = []
    for question in questions:
        tasks.append(query_engine.aquery(question))
    return await asyncio.gather(*tasks)

# Get responses using the defined function
base_pred_responses = await aget_responses(eval_qs[:max_samples], base_query_engine, show_progress=True)
pred_responses = await aget_responses(eval_qs[:max_samples], query_engine, show_progress=True)

# Convert responses to strings
pred_response_strs = [str(p) for p in pred_responses]
base_pred_response_strs = [str(p) for p in base_pred_responses]
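The code above relies on `asyncio.gather` returning results in the order the tasks were passed in, not the order in which they finish. That is why `pred_responses[i]` lines up with `eval_qs[i]`. The sketch below demonstrates this with a hypothetical `FakeQueryEngine` whose only assumption is that it exposes an async `aquery` method.

```python
import asyncio


class FakeQueryEngine:
    """Hypothetical stand-in exposing an async `aquery` method."""

    async def aquery(self, question: str) -> str:
        # Longer questions finish sooner, so completion order differs from input order.
        await asyncio.sleep(0.01 / len(question))
        return f"answer to: {question}"


async def gather_responses(questions, engine):
    """Run all queries concurrently and return the results in input order."""
    return await asyncio.gather(*(engine.aquery(q) for q in questions))


questions = ["a", "bbbb", "cc"]
responses = asyncio.run(gather_responses(questions, FakeQueryEngine()))
print(responses)
# ['answer to: a', 'answer to: bbbb', 'answer to: cc']
```

Inside a notebook, where an event loop is already running, use `await gather_responses(...)` instead of `asyncio.run(...)`.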

The first evaluation question.

In [201]:
eval_qs[0]

'How does soil carbon sequestration contribute to mitigating greenhouse gas emissions in ruminant production systems, according to the information provided in the document?'

The first reference answer.

In [202]:
ref_response_strs[0]

'Soil carbon sequestration plays a significant role in mitigating greenhouse gas emissions in ruminant production systems. It is mentioned in the document that soil carbon sequestration is the mechanism responsible for most of the greenhouse gas mitigation potential in the agriculture sector. By measuring changes in soil organic carbon stocks and the net balance of carbon fluxes, it is determined that grassland carbon sequestration can reach significant levels. Practices such as avoiding soil tillage, converting grasslands to arable use, and using light grazing instead of heavy grazing can help reduce carbon losses and increase carbon sequestration. Overall, soil carbon sequestration in grasslands has the potential to partly mitigate the greenhouse gas balance of ruminant production systems.'

The first generated answer using window retriever.

In [203]:
pred_response_strs[0]

'Soil carbon sequestration contributes significantly to mitigating greenhouse gas emissions in ruminant production systems by acting as the primary mechanism for reducing the greenhouse gas balance in the agriculture sector. It can be directly determined by measuring changes in soil organic carbon stocks and indirectly by measuring the net balance of carbon fluxes. The document indicates that grassland carbon sequestration plays a crucial role in offsetting greenhouse gas emissions, with on-site and off-site carbon sequestration activities helping to partially mitigate the overall greenhouse gas balance of ruminant production systems.'

For each question, we store the window of the top-ranked source node as its context, then show the context of the first response generated with the window retriever.

In [204]:
contexts = []

for i in range(0, max_samples):
    sublist = []
    sublist.append(pred_responses[i].source_nodes[0].node.metadata['window'])
    contexts.append(sublist)

contexts[0]

['Animal (2010), 4:3, pp 334–350 &The Animal Consortium 2009\ndoi:10.1017/S1751731109990784animal\nMitigating the greenhouse gas balance of ruminant production\nsystems through carbon sequestration in grasslands\nJ. F .  Soussana1-, T. Tallec1and V. Blanfort1,2\n1INRA UR0874, UREP Grassland Ecosystem Research, 234, Avenue du Bre ´zet, Clermont-Ferrand, F-63100, France;2CIRAD UR 8, Livestock Systems,\nCampus International de Baillarguet, Cedex 5, Montpellier, F-34398, France\n(Received 6 January 2009; Accepted 12 June 2009; First published online 22 September 2009)\nSoil carbon sequestration (enhanced sinks) is the mechanism responsible for most of the greenhouse gas (GHG) mitigation\npotential in the agriculture sector.  Carbon sequestration in grasslands can be determined directly by measuring changes in soil\norganic carbon (SOC) stocks and indirectly by measuring the net balance of C ﬂuxes.  A literature search shows that grassland\nC sequestration reaches on average 5 630 g C/m2per
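The loop above keeps only the window of the first source node, although each response was synthesized from `similarity_top_k` nodes. If the evaluation should see everything the LLM saw, a variant could collect every window, as sketched below. `collect_contexts` and `fake_response` are hypothetical helpers. The fake objects only mimic the `source_nodes[i].node.metadata` shape used in this notebook.

```python
from types import SimpleNamespace


def collect_contexts(responses, key="window"):
    """Return, for each response, the `key` metadata of every source node."""
    return [
        [sn.node.metadata[key] for sn in response.source_nodes if key in sn.node.metadata]
        for response in responses
    ]


def fake_response(*windows):
    # Hypothetical stand-in for a query response with one source node per window.
    nodes = [SimpleNamespace(node=SimpleNamespace(metadata={"window": w})) for w in windows]
    return SimpleNamespace(source_nodes=nodes)


demo = [fake_response("window A", "window B"), fake_response("window C")]
print(collect_contexts(demo))
# [['window A', 'window B'], ['window C']]
```

In the notebook this would be called as `collect_contexts(pred_responses[:max_samples])`. Note that changing the contexts changes context-dependent scores such as faithfulness.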

### RAGAS Evaluation

In [205]:
from datasets import Dataset 
from ragas import evaluate
from ragas.metrics import faithfulness, answer_correctness, answer_relevancy, answer_similarity

data_samples = {
    'question': eval_qs[:max_samples],
    'answer': pred_response_strs[:max_samples],
    'ground_truth': ref_response_strs[:max_samples],
    'contexts': contexts[:max_samples]
}

dataset = Dataset.from_dict(data_samples)

score = evaluate(
    dataset,
    metrics=[faithfulness, answer_correctness, answer_relevancy, answer_similarity],
)
score.to_pandas()

Evaluating:   0%|          | 0/40 [00:00<?, ?it/s]

| | question | answer | ground_truth | contexts | faithfulness | answer_correctness | answer_relevancy | answer_similarity |
|---|---|---|---|---|---|---|---|---|
| 0 | How does soil carbon sequestration contribute ... | Soil carbon sequestration contributes signific... | Soil carbon sequestration plays a significant ... | [Animal (2010), 4:3, pp 334–350 &The Animal Co... | 1.0 | 0.746012 | 0.923576 | 0.984045 |
| 1 | What management practices are recommended to r... | Management practices recommended to reduce car... | The management practices recommended to reduce... | [Soussana1-, T. Tallec1and V. Blanfort1,2\n1IN... | 1.0 | 0.998033 | 0.98411 | 0.99213 |
| 2 | How does the fraction of non-labile C in manur... | The fraction of non-labile C in manure (f humi... | The fraction of non-labile C in manure (f humi... | [In the barn, ruminant’s digestionof the harve... | 0.6 | 0.455902 | 0.974616 | 0.966466 |
| 3 | Compare the net carbon storage (NCS) values in... | The net carbon storage (NCS) values in grazed,... | The NCS values in grazed grassland systems rea... | [Table 1 Literature survey of net C storage (N... | 0.0 | 0.64923 | 0.94813 | 0.960547 |
| 4 | What are the key processes controlling soil or... | The key processes controlling soil organic car... | The key processes controlling soil organic car... | [Processes controlling soil organic carbon acc... | 1.0 | 0.569947 | 0.972375 | 0.994073 |
| 5 | How does land use change impact carbon sequest... | Land use change can impact carbon sequestratio... | Land use change can impact carbon sequestratio... | [In an upland grassland in France, the meanres... | 1.0 | 0.54231 | 0.933471 | 0.96924 |
| 6 | What is the average annual nitrogen applicatio... | The average annual nitrogen application rate f... | The average annual nitrogen application rate f... | [(2007) 200 kg N/ha per year\nIntensive meadow... | 0.0 | 0.99843 | 0.977981 | 0.99372 |
| 7 | Compare the carbon sequestration rates in the ... | The carbon sequestration rates in the abandone... | The carbon sequestration rates in the abandone... | [Animal (2010), 4:3, pp 334–350 &The Animal Co... | 0.5 | 0.675442 | 0.951458 | 0.987533 |
| 8 | How do future drought events impact the carbon... | Future drought events could potentially turn t... | Future drought events could turn temperate gra... | [Hence, when expressed in CO 2\nequivalents, e... | 1.0 | 0.61862 | 0.888208 | 0.974439 |
| 9 | What are the potential implications of climate... | The potential implications of climate change o... | The potential implications of climate change o... | [According to empirical niche-based models, pr... | 0.75 | 0.597507 | 0.977585 | 0.978264 |
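Per-question scores are easier to compare once they are aggregated. The sketch below copies the four metric columns from the results above and prints the mean and minimum of each. In the notebook itself, something like `score.to_pandas()[list(scores)].mean()` should give the same means, assuming the column names shown above.

```python
from statistics import mean

# Scores copied from the results above, rows 0 to 9.
scores = {
    "faithfulness": [1.0, 1.0, 0.6, 0.0, 1.0, 1.0, 0.0, 0.5, 1.0, 0.75],
    "answer_correctness": [0.746012, 0.998033, 0.455902, 0.64923, 0.569947,
                           0.54231, 0.99843, 0.675442, 0.61862, 0.597507],
    "answer_relevancy": [0.923576, 0.98411, 0.974616, 0.94813, 0.972375,
                         0.933471, 0.977981, 0.951458, 0.888208, 0.977585],
    "answer_similarity": [0.984045, 0.99213, 0.966466, 0.960547, 0.994073,
                          0.96924, 0.99372, 0.987533, 0.974439, 0.978264],
}

for metric, values in scores.items():
    print(f"{metric}: mean={mean(values):.3f} min={min(values):.3f}")
```

Faithfulness is 0.0 for questions 3 and 6 even though their other scores are high. One possible explanation, not verified here, is that `contexts` holds only the top-ranked window per question, so statements supported by other retrieved windows cannot be checked against it.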
