# PubMed QA using LlamaIndex

## Introduction
This notebook presents a RAG workflow for the [PubMed QA](https://pubmedqa.github.io/) task using [LlamaIndex](https://www.llamaindex.ai/). The code is written in a configurable fashion, giving you the flexibility to edit the RAG configuration and observe the change in output/responses.

It covers a step-by-step procedure for building the RAG workflow (Stages 1-4) and then runs the pipeline on a sample from the dataset. The notebook also covers sparse, dense, and hybrid retrieval strategies, along with a re-ranker. We have also added an optional component for RAG evaluation using the [Ragas](https://docs.ragas.io/en/stable/) library.

### <u>Requirements</u>
1. As you will be accessing the LLMs and embedding models through Vector AI Engineering's Kaleidoscope Service (Vector Inference + Autoscaling), you will need to request a KScope API Key:

      Run the following command (replace ```<user_id>``` and ```<password>```) from **within the cluster** to obtain the API Key. The ```access_token``` in the output is your KScope API Key.
  ```bash
  curl -X POST -d "grant_type=password" -d "username=<user_id>" -d "password=<password>" https://kscope.vectorinstitute.ai/token
  ```
2. After obtaining the API key, create the ```.kscope.env``` file in your home directory (```/h/<user_id>```) and set the following environment variables:
- For local models through Kaleidoscope (KScope):
    ```bash
    export OPENAI_BASE_URL="https://kscope.vectorinstitute.ai/v1"
    export OPENAI_API_KEY=<kscope_api_key>
    ```
- For OpenAI models:
   ```bash
   export OPENAI_BASE_URL="https://api.openai.com/v1"
   export OPENAI_API_KEY=<openai_api_key>
   ```
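
The `load_env_file` helper used later in this notebook lives in `utils/load_secrets.py` and is not shown here. As a rough sketch of what loading such a file involves, assuming it contains only `export KEY=value` lines like the examples above, the hypothetical `parse_env_file` below reads the pairs into a dictionary. It runs on a temporary file with a dummy key, so nothing in your home directory is touched.

```python
import tempfile
from pathlib import Path


def parse_env_file(env_path):
    """Parse `export KEY=value` lines into a dict (hypothetical helper, not the repo's loader)."""
    env_vars = {}
    # Use a context manager so the file handle is always closed
    with open(env_path, "r") as f:
        for raw_line in f:
            line = raw_line.strip()
            # Skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            # Drop an optional leading `export `
            if line.startswith("export "):
                line = line[len("export "):]
            if "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Strip surrounding quotes around the value
            env_vars[key.strip()] = value.strip().strip('"').strip("'")
    return env_vars


# Demo on a temporary file with a dummy key
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / ".kscope.env"
    demo_path.write_text(
        'export OPENAI_BASE_URL="https://kscope.vectorinstitute.ai/v1"\n'
        "export OPENAI_API_KEY=dummy_key\n"
    )
    demo_env = parse_env_file(demo_path)

print(demo_env["OPENAI_BASE_URL"])
```

To apply the parsed values to the current process, one option is `os.environ.update(parse_env_file(path))`.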

## STAGE 0 - Set up the RAG workflow environment

#### Import libraries, custom classes and functions

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
import sys
import os
import random

from pathlib import Path
from pprint import pprint

from llama_index.core import ServiceContext, Settings, set_global_handler
from llama_index.core.node_parser import SentenceSplitter

from task_dataset import PubMedQATaskDataset

from utils.hosting_utils import RAGLLM
from utils.rag_utils import (
    DocumentReader, RAGEmbedding, RAGQueryEngine, RagasEval, 
    extract_yes_no, validate_rag_cfg
    )
from utils.storage_utils import RAGIndex

#### Load config files

In [3]:
# Add root folder of the rag_bootcamp repo to PYTHONPATH
current_dir = Path().resolve()
parent_dir = current_dir.parent
sys.path.insert(0, str(parent_dir))

from utils.load_secrets import load_env_file
load_env_file()

In [4]:
GENERATOR_BASE_URL = os.environ.get("OPENAI_BASE_URL")

OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

#### Set RAG configuration

In [5]:
rag_cfg = {
    # Node parser config
    "chunk_size": 256,
    "chunk_overlap": 0,

    # Embedding model config
    "embed_model_type": "hf",
    "embed_model_name": "BAAI/bge-base-en-v1.5",

    # LLM config
    "llm_type": "kscope",
    "llm_name": "Meta-Llama-3.1-8B-Instruct",
    "max_new_tokens": 256,
    "temperature": 0.0,
    "top_p": 1.0,
    "top_k": 50,
    "do_sample": False,

    # Vector DB config
    "vector_db_type": "weaviate", # "weaviate"
    "vector_db_name": "Pubmed_QA",
    # MODIFY THIS
    "weaviate_url": "https://4akrlvlhscazehiee5ez1q.c0.us-east1.gcp.weaviate.cloud",

    # Retriever and query config
    "retriever_type": "vector_index", # "vector_index"
    "retriever_similarity_top_k": 5,
    "query_mode": "hybrid", # "default", "hybrid"
    "hybrid_search_alpha": 0.0, # float from 0.0 (sparse search - bm25) to 1.0 (vector search)
    "response_mode": "compact",
    "use_reranker": False,
    "rerank_top_k": 3,

    # Evaluation config
    "eval_llm_type": "kscope",
    "eval_llm_name": "Meta-Llama-3.1-8B-Instruct",
}

#### Read Weaviate Key

In [6]:
try:
    # Only check that the key file exists and is readable; its contents are not used in this cell
    with open(Path.home() / ".weaviate.key", "r"):
        pass
except Exception as err:
    print(f"Could not read your Weaviate key. Please make sure it is available in plain text at ~/.weaviate.key in your home directory: {err}")

#### Preliminary config checks

In [7]:
validate_rag_cfg(rag_cfg)
pprint(rag_cfg)

{'chunk_overlap': 0,
 'chunk_size': 256,
 'do_sample': False,
 'embed_model_name': 'BAAI/bge-base-en-v1.5',
 'embed_model_type': 'hf',
 'eval_llm_name': 'Meta-Llama-3.1-8B-Instruct',
 'eval_llm_type': 'kscope',
 'hybrid_search_alpha': 0.0,
 'llm_name': 'Meta-Llama-3.1-8B-Instruct',
 'llm_type': 'kscope',
 'max_new_tokens': 256,
 'query_mode': 'hybrid',
 'rerank_top_k': 3,
 'response_mode': 'compact',
 'retriever_similarity_top_k': 5,
 'retriever_type': 'vector_index',
 'temperature': 0.0,
 'top_k': 50,
 'top_p': 1.0,
 'use_reranker': False,
 'vector_db_name': 'Pubmed_QA',
 'vector_db_type': 'weaviate',
 'weaviate_url': 'https://4akrlvlhscazehiee5ez1q.c0.us-east1.gcp.weaviate.cloud'}
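
`validate_rag_cfg` is defined in `utils/rag_utils.py`, and its checks are not shown in this notebook. Below is a minimal sketch of the kind of sanity checks such a function could perform. Every rule in the hypothetical `check_rag_cfg` is an assumption chosen for illustration, not the repo's actual validation logic.

```python
def check_rag_cfg(cfg):
    """Illustrative config checks (hypothetical; not the repo's validate_rag_cfg)."""
    # Chunking: overlap must be smaller than the chunk itself
    if not 0 <= cfg["chunk_overlap"] < cfg["chunk_size"]:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Hybrid search weight: 0.0 is pure sparse (bm25), 1.0 is pure dense
    if not 0.0 <= cfg["hybrid_search_alpha"] <= 1.0:
        raise ValueError("hybrid_search_alpha must be in [0.0, 1.0]")
    # The re-ranker can only keep as many nodes as the retriever returns
    if cfg["use_reranker"] and cfg["rerank_top_k"] > cfg["retriever_similarity_top_k"]:
        raise ValueError("rerank_top_k cannot exceed retriever_similarity_top_k")
    if cfg["query_mode"] not in ("default", "hybrid"):
        raise ValueError("query_mode must be 'default' or 'hybrid'")


# A subset of the configuration used in this notebook
demo_cfg = {
    "chunk_size": 256,
    "chunk_overlap": 0,
    "hybrid_search_alpha": 0.0,
    "use_reranker": False,
    "rerank_top_k": 3,
    "retriever_similarity_top_k": 5,
    "query_mode": "hybrid",
}
check_rag_cfg(demo_cfg)  # passes silently for these values
```

Failing fast on an inconsistent configuration is cheaper than discovering it after the index has been built.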


## STAGE 1 - Load dataset and documents

#### 1. Load PubMed QA dataset
PubMedQA ([github](https://github.com/pubmedqa/pubmedqa)) is a biomedical question answering dataset. Each instance consists of a question, a context (extracted from PubMed abstracts), a long answer and a yes/no/maybe answer. We make use of the test split of [this](https://huggingface.co/datasets/bigbio/pubmed_qa) huggingface dataset for this notebook.

**The context for each instance is stored as a text file** (referred to as documents), to align the task as a standard RAG use-case.

In [8]:
print('Loading PubMed QA data ...')
pubmed_data = PubMedQATaskDataset('bigbio/pubmed_qa')
print(f"Loaded data size: {len(pubmed_data)}")
pubmed_data.mock_knowledge_base(output_dir='./data', one_file_per_sample=True)

Loading PubMed QA data ...


Preparing data: 100%|██████████| 500/500 [00:00<00:00, 1282.07it/s]


Loaded data size: 500
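
The implementation of `mock_knowledge_base` is not shown in this notebook. As a sketch of the idea described above (one text file per sample context), the hypothetical `write_docs` helper below writes each context to its own file. The file naming scheme and the sample fields are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def write_docs(samples, output_dir):
    """Write each sample's context to its own text file (hypothetical helper)."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for sample in samples:
        # The file name scheme is an assumption made for this sketch
        doc_path = out_dir / f"{sample['id']}.txt"
        doc_path.write_text(sample["context"], encoding="utf-8")
        paths.append(doc_path)
    return paths


# Two made-up samples standing in for PubMed QA instances
toy_samples = [
    {"id": "sample_0", "context": "First abstract text."},
    {"id": "sample_1", "context": "Second abstract text."},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    written = write_docs(toy_samples, tmp_dir)
    num_files = len(list(Path(tmp_dir).glob("*.txt")))
    first_text = written[0].read_text(encoding="utf-8")

print(num_files)
```

Storing contexts as separate files lets a generic directory reader ingest them, which is what makes the task resemble a standard RAG setup.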


#### 2. Load documents
All metadata is excluded by default. To include it, set the *exclude_llm_metadata_keys* and *exclude_embed_metadata_keys* flags to *False*. Please refer to [this](https://docs.llamaindex.ai/en/stable/module_guides/loading/documents_and_nodes/usage_documents.html) and the *DocumentReader* class in *rag_utils.py* for further details.

In [9]:
print('Loading documents ...')
reader = DocumentReader(input_dir="./data/pubmed_doc")
docs = reader.load_data()
print(f'No. of documents loaded: {len(docs)}')

Loading documents ...
No. of documents loaded: 500


## STAGE 2 - Load node parser, embedding, LLM and configure global settings

#### 1. Load node parser to split documents into smaller chunks

In [10]:
print('Loading node parser ...')
node_parser = SentenceSplitter(chunk_size=rag_cfg['chunk_size'], chunk_overlap=rag_cfg['chunk_overlap'])
nodes = node_parser.get_nodes_from_documents(docs)

Loading node parser ...
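
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified sliding-window splitter over whitespace-separated words. It is only an illustration: `SentenceSplitter` counts tokens rather than words and tries to respect sentence boundaries, so its chunks will differ. The `chunk_words` function is hypothetical.

```python
def chunk_words(text, chunk_size, chunk_overlap):
    """Split text into word windows of `chunk_size` that share `chunk_overlap` words."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    words = text.split()
    # Each new window starts `step` words after the previous one
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


toy_text = "a b c d e f g h i j"
no_overlap = chunk_words(toy_text, chunk_size=4, chunk_overlap=0)
with_overlap = chunk_words(toy_text, chunk_size=4, chunk_overlap=2)
print(no_overlap)    # ['a b c d', 'e f g h', 'i j']
print(with_overlap)  # ['a b c d', 'c d e f', 'e f g h', 'g h i j']
```

With an overlap of 0, as configured in this notebook, each word lands in exactly one chunk. A larger overlap repeats text across neighbouring chunks, which increases index size but reduces the chance of splitting a relevant passage in two.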


#### 2. Load embedding model
LlamaIndex supports embedding models from OpenAI, Cohere, HuggingFace, etc. Please refer to [this](https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings.html#custom-embedding-model) for building a custom embedding model.

In [11]:
embed_model = RAGEmbedding(model_type=rag_cfg['embed_model_type'], model_name=rag_cfg['embed_model_name']).load_model()

Loading hf embedding model ...


#### 3. Load LLM for generation
LlamaIndex supports LLMs from OpenAI, Cohere, HuggingFace, AI21, etc. Please refer to [this](https://docs.llamaindex.ai/en/stable/module_guides/models/llms/usage_custom.html#example-using-a-custom-llm-model-advanced) for loading a custom LLM model for generation.

In [12]:
llm = RAGLLM(
    llm_type=rag_cfg['llm_type'],
    llm_name=rag_cfg['llm_name'],
    api_base=GENERATOR_BASE_URL,
    api_key=OPENAI_API_KEY,
).load_model(**rag_cfg)

Configuring kscope LLM model ...


#### 4. Use ```Settings``` to set the node parser, embedding model, LLM, etc.

In [13]:
Settings.text_splitter = node_parser
Settings.llm = llm
Settings.embed_model = embed_model

## STAGE 3 - Create index using the appropriate vector store
All vector stores supported by LlamaIndex along with their available features are listed [here](https://docs.llamaindex.ai/en/stable/module_guides/storing/vector_stores.html).

If you are using LangChain, the supported vector stores can be found [here](https://python.langchain.com/docs/modules/data_connection/vectorstores/).

In [14]:
index = RAGIndex(
    db_type=rag_cfg['vector_db_type'],
    db_name=rag_cfg['vector_db_name'],
).create_index(docs, weaviate_url=rag_cfg["weaviate_url"])

Loading index from ./.weaviate_index_store/ ...


## STAGE 4 - Build query engine

Now build a query engine using *retriever* and *response_synthesizer*. LlamaIndex also supports different types of [retrievers](https://docs.llamaindex.ai/en/stable/api_reference/query/retrievers.html) and [response modes](https://docs.llamaindex.ai/en/stable/module_guides/querying/response_synthesizers/root.html#configuring-the-response-mode) for various use-cases.

[Weaviate hybrid search](https://weaviate.io/blog/hybrid-search-explained) explains how dense and sparse search are combined.
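
As a rough illustration of how *hybrid_search_alpha* weights the two searches, the sketch below normalizes each score list to [0, 1] and blends them linearly. This is a simplification under stated assumptions: the scores are made up, the helper names are hypothetical, and Weaviate's own fusion algorithms may normalize and combine results differently.

```python
def min_max_normalize(scores):
    """Scale a {doc_id: score} mapping to the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as a full match
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(sparse, dense, alpha):
    """Blend scores: alpha=0.0 uses sparse only, alpha=1.0 uses dense only."""
    sparse_n, dense_n = min_max_normalize(sparse), min_max_normalize(dense)
    doc_ids = set(sparse_n) | set(dense_n)
    return {
        doc_id: alpha * dense_n.get(doc_id, 0.0) + (1 - alpha) * sparse_n.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


# Made-up scores for three documents
bm25 = {"doc_a": 12.0, "doc_b": 4.0, "doc_c": 8.0}
cosine = {"doc_a": 0.2, "doc_b": 0.9, "doc_c": 0.76}

best_by_alpha = {}
for alpha in (0.0, 0.5, 1.0):
    fused = hybrid_scores(bm25, cosine, alpha)
    best_by_alpha[alpha] = max(fused, key=fused.get)

print(best_by_alpha)
```

In this toy data the top document changes with alpha: the keyword favourite wins at 0.0, the embedding favourite wins at 1.0, and a document that scores reasonably on both wins at 0.5.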

In [15]:
def set_query_engine_args(rag_cfg, docs):
    query_engine_args = {
        "similarity_top_k": rag_cfg['retriever_similarity_top_k'], 
        "response_mode": rag_cfg['response_mode'],
        "use_reranker": False,
    }
    
    if (rag_cfg["retriever_type"] == "vector_index") and (rag_cfg["vector_db_type"] == "weaviate"):
        query_engine_args.update({
            "query_mode": rag_cfg["query_mode"], 
            "hybrid_search_alpha": rag_cfg["hybrid_search_alpha"]
        })
    elif rag_cfg["retriever_type"] == "bm25":
        nodes = Settings.text_splitter.get_nodes_from_documents(docs)
        tokenizer = Settings.embed_model._tokenizer
        query_engine_args.update({"nodes": nodes, "tokenizer": tokenizer})
        
    if rag_cfg["use_reranker"]:
        query_engine_args.update({"use_reranker": True, "rerank_top_k": rag_cfg["rerank_top_k"]})

    return query_engine_args

In [16]:
query_engine_args = set_query_engine_args(rag_cfg, docs)
pprint(query_engine_args)

{'hybrid_search_alpha': 0.0,
 'query_mode': 'hybrid',
 'response_mode': 'compact',
 'similarity_top_k': 5,
 'use_reranker': False}


In [17]:
query_engine = RAGQueryEngine(
    retriever_type=rag_cfg['retriever_type'],
    vector_index=index,
).create(**query_engine_args)

## STAGE 5 - Finally, query the model!
**Note:** This first run uses keyword-based (sparse) search, since *hybrid_search_alpha* is set to 0.0 by default.

#### [TODO] Change seed to experiment with a different sample

In [18]:
random.seed(237)

In [19]:
sample_idx = random.randint(0, len(pubmed_data)-1)
sample_elm = pubmed_data[sample_idx]
pprint(sample_elm)

{'answer': ['no'],
 'context': 'Human immunodeficiency virus (HIV)-infected patients have '
            'generally been excluded from transplantation. Recent advances in '
            'the management and prognosis of these patients suggest that this '
            'policy should be reevaluated. To explore the current views of '
            'U.S. transplant centers toward transplanting asymptomatic '
            'HIV-infected patients with end-stage renal disease, a written '
            'survey was mailed to the directors of transplantation at all 248 '
            'renal transplant centers in the United States. All 148 responding '
            'centers said they require HIV testing of prospective kidney '
            'recipients, and 84% of these centers would not transplant an '
            'individual who refuses HIV testing. The vast majority of '
            'responding centers would not transplant a kidney from a cadaveric '
            '(88%) or a living donor (91%) into an asymp

In [20]:
query = sample_elm['question']

response = query_engine.query(query)

delim = "-" * 25
print(f'QUERY: {query}\n')
print(f'RESPONSE:\n{delim}\n{response.response}\n{delim}\n')
print(f'YES/NO: {extract_yes_no(response.response)}\n')
print(f'GT ANSWER: {sample_elm["answer"][0]}\n')
print(f'GT LONG ANSWER:\n{delim}\n{sample_elm["long_answer"]}\n{delim}')

QUERY: Should all human immunodeficiency virus-infected patients with end-stage renal disease be excluded from transplantation?

RESPONSE:
-------------------------
Based on the context information, the vast majority of responding centers (88% for cadaveric donors and 91% for living donors) would not transplant a kidney into an asymptomatic HIV-infected patient who is otherwise a good candidate for transplantation. This suggests that the prevailing view among U.S. transplant centers is that HIV-infected patients with end-stage renal disease should be excluded from transplantation. However, the text also mentions that recent advances in the management and prognosis of HIV-infected patients suggest that this policy should be reevaluated.

Therefore, considering the current views of U.S. transplant centers, the answer to the query is: no.
-------------------------

YES/NO: no

GT ANSWER: no

GT LONG ANSWER:
-------------------------
The great majority of U.S. renal transplant centers will
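
The `extract_yes_no` function used above is imported from `utils/rag_utils.py`, and its logic is not shown in this notebook. One simple way to pull a verdict out of free-form text, offered only as an assumption about how such a helper might work, is to take the last standalone "yes", "no" or "maybe" in the response. The `parse_decision` name is hypothetical.

```python
import re


def parse_decision(response_text):
    """Return the last standalone yes/no/maybe in the text, or None (hypothetical helper)."""
    # \b ensures whole-word matches, so "not" or "know" do not count as "no"
    matches = re.findall(r"\b(yes|no|maybe)\b", response_text.lower())
    # Take the last match because the model often states its verdict at the end
    return matches[-1] if matches else None


print(parse_decision("Therefore, the answer to the query is: no."))  # no
print(parse_decision("The evidence is mixed."))  # None
```

The last-match heuristic can misfire when the verdict comes first and a later sentence contains another of the three words, so a stricter output format in the prompt is usually more reliable than post-hoc parsing.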

#### [OPTIONAL] [Ragas](https://docs.ragas.io/en/latest/) evaluation
Following are the commonly used metrics for evaluating a RAG workflow:
* [Faithfulness](https://docs.ragas.io/en/latest/concepts/metrics/available_metrics/faithfulness/): Measures the factual consistency of the generated answer with the retrieved context. Value lies between 0 and 1. **Evaluated using an LLM.**
* [Answer Relevance](https://docs.ragas.io/en/latest/concepts/metrics/available_metrics/answer_relevance/): Measures how relevant the answer is to the given query. Value lies between 0 and 1. **Evaluated using an LLM.**
* [Context Precision](https://docs.ragas.io/en/latest/concepts/metrics/available_metrics/context_precision/): Precision of the retriever, measured by comparing the retrieved context with the ground truth context. Value lies between 0 and 1. An LLM can optionally be used for evaluation.
* [Context Recall](https://docs.ragas.io/en/latest/concepts/metrics/available_metrics/context_recall/): Recall of the retriever, measured by comparing the retrieved context with the ground truth context. Value lies between 0 and 1. An LLM can optionally be used for evaluation.
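
For intuition about what these numbers mean, the sketch below computes faithfulness as the share of answer claims supported by the context, and an order-insensitive precision and recall over passages. It does not reproduce the Ragas implementation: the word-set Jaccard similarity, the 0.5 threshold and all function names are assumptions made for this example, and Ragas uses an LLM to extract and verify claims.

```python
def faithfulness_ratio(supported_claims, total_claims):
    """Share of claims in the answer that the retrieved context supports."""
    return supported_claims / total_claims if total_claims else 0.0


def jaccard(a, b):
    """Word-set overlap between two passages, between 0 and 1."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def context_recall(retrieved, reference, threshold=0.5):
    """Fraction of reference passages matched by at least one retrieved passage."""
    if not reference:
        return 0.0
    hits = sum(any(jaccard(ref, ret) >= threshold for ret in retrieved) for ref in reference)
    return hits / len(reference)


def context_precision(retrieved, reference, threshold=0.5):
    """Fraction of retrieved passages that match at least one reference passage."""
    if not retrieved:
        return 0.0
    hits = sum(any(jaccard(ret, ref) >= threshold for ref in reference) for ret in retrieved)
    return hits / len(retrieved)


reference_ctx = ["HIV-infected patients have generally been excluded from transplantation."]
retrieved_ctx = [
    "HIV-infected patients have generally been excluded from transplantation.",
    "Aspirin is commonly used to reduce fever.",
]
print(faithfulness_ratio(4, 5))                         # 0.8
print(context_recall(retrieved_ctx, reference_ctx))     # 1.0
print(context_precision(retrieved_ctx, reference_ctx))  # 0.5
```

Here the single reference passage is found, so recall is 1.0, while only one of the two retrieved passages is relevant, so precision is 0.5.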

Note: If you are planning to use **OpenAI models as evaluation LLMs**, store your OpenAI API key in ```~/.ragas_openai.env``` using the following format:

```bash
   export RAGAS_OPENAI_BASE_URL="https://api.openai.com/v1"
   export RAGAS_OPENAI_API_KEY=<openai_api_key>
```

Once done, **uncomment the next cell** to load these environment variables.

In [21]:
# from utils.load_secrets import load_env_file_ragas
# load_env_file_ragas()

In [22]:
retrieved_nodes = query_engine.retriever.retrieve(query)

eval_data = [dict({
    "user_input": query,
    "response": response.response,
    "retrieved_contexts": [node.text for node in retrieved_nodes],
    "reference": sample_elm['long_answer'],
    "reference_contexts": [sample_elm["context"]],
})]
pprint(eval_data)

[{'reference': 'The great majority of U.S. renal transplant centers will not '
               'transplant kidneys to HIV-infected patients with end-stage '
               'renal disease, even if their infection is asymptomatic. '
               'However, advances in the management of HIV infection and a '
               'review of relevant ethical issues suggest that this approach '
               'should be reconsidered.',
  'reference_contexts': ['Human immunodeficiency virus (HIV)-infected patients '
                         'have generally been excluded from transplantation. '
                         'Recent advances in the management and prognosis of '
                         'these patients suggest that this policy should be '
                         'reevaluated. To explore the current views of U.S. '
                         'transplant centers toward transplanting asymptomatic '
                         'HIV-infected patients with end-stage renal disease, '
                

In [23]:
eval_obj = RagasEval(
    metrics=["faithfulness", "relevancy", "recall", "precision"],
    eval_llm_type=rag_cfg["eval_llm_type"],
    eval_llm_name=rag_cfg["eval_llm_name"],
    embed_model_name=rag_cfg['embed_model_name'],
    max_tokens=1024,
)

In [24]:
eval_result = eval_obj.evaluate(eval_data)
pprint(eval_result)


Evaluating:   0%|          | 0/4 [00:00<?, ?it/s]

{'faithfulness': 0.8000, 'answer_relevancy': 0.9749, 'non_llm_context_recall': 1.0000, 'non_llm_context_precision_with_reference': 1.0000}


### 5.1 - Dense Search
Set *hybrid_search_alpha* to 1.0 for dense vector search.

In [25]:
rag_cfg["hybrid_search_alpha"] = 1.0

In [26]:
# Recreate query engine
query_engine_args = set_query_engine_args(rag_cfg, docs)
pprint(query_engine_args)
query_engine = RAGQueryEngine(
    retriever_type=rag_cfg['retriever_type'],
    vector_index=index
).create(**query_engine_args)

# Get response
response = query_engine.query(query)

# Print response
print(f'\n\nQUERY: {query}\n')
print(f'RESPONSE:\n{delim}\n{response.response}\n{delim}\n')
print(f'YES/NO: {extract_yes_no(response.response)}\n')
print(f'GT ANSWER: {sample_elm["answer"][0]}\n')
print(f'GT LONG ANSWER:\n{delim}\n{sample_elm["long_answer"]}\n{delim}')

{'hybrid_search_alpha': 1.0,
 'query_mode': 'hybrid',
 'response_mode': 'compact',
 'similarity_top_k': 5,
 'use_reranker': False}


QUERY: Should all human immunodeficiency virus-infected patients with end-stage renal disease be excluded from transplantation?

RESPONSE:
-------------------------
Based on the context information, it seems that the current views of U.S. transplant centers are generally against transplanting asymptomatic HIV-infected patients with end-stage renal disease. The survey mentioned in the first text indicates that the vast majority of responding centers would not transplant a kidney from a cadaveric or living donor into an asymptomatic HIV-infected patient who is otherwise a good candidate for transplantation. This suggests that the prevailing opinion is to exclude HIV-infected patients with end-stage renal disease from transplantation, at least for now.

However, the context also mentions that recent advances in the management and prognosis of HIV-infected pa

#### [OPTIONAL] Ragas evaluation

In [27]:
retrieved_nodes = query_engine.retriever.retrieve(query)

eval_data = [dict({
    "user_input": query,
    "response": response.response,
    "retrieved_contexts": [node.text for node in retrieved_nodes],
    "reference": sample_elm['long_answer'],
    "reference_contexts": [sample_elm["context"]],
})]

eval_result = eval_obj.evaluate(eval_data)
pprint(eval_result)

Evaluating:   0%|          | 0/4 [00:00<?, ?it/s]

Exception raised in Job[0]: LLMDidNotFinishException(The LLM generation was not completed. Please increase try increasing the max_tokens and try again.)


{'faithfulness': nan, 'answer_relevancy': 0.8393, 'non_llm_context_recall': 1.0000, 'non_llm_context_precision_with_reference': 1.0000}
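
The `nan` faithfulness score above appears to be a consequence of the `LLMDidNotFinishException`: the evaluation LLM did not finish generating, so no value was produced for that metric. When comparing several runs it can help to separate failed metrics from valid ones before averaging or plotting. A minimal sketch, assuming the result can be viewed as a plain mapping from metric name to float (the `split_scores` helper is hypothetical):

```python
import math


def split_scores(scores):
    """Separate valid metric values from those that came back as NaN."""
    valid = {name: value for name, value in scores.items() if not math.isnan(value)}
    failed = sorted(name for name, value in scores.items() if math.isnan(value))
    return valid, failed


# Values copied from the evaluation output above
dense_scores = {
    "faithfulness": float("nan"),
    "answer_relevancy": 0.8393,
    "non_llm_context_recall": 1.0,
    "non_llm_context_precision_with_reference": 1.0,
}
valid, failed = split_scores(dense_scores)
print(failed)  # ['faithfulness']
```

As the exception message suggests, one possible remedy is to raise `max_tokens` (set to 1024 when `RagasEval` was created) and rerun the evaluation.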


### 5.2 - Hybrid Search
Set *hybrid_search_alpha* to 0.5 for hybrid search with equal weight given to dense and sparse (keyword-based) search.

In [28]:
rag_cfg["hybrid_search_alpha"] = 0.5

In [29]:
# Recreate query engine
query_engine_args = set_query_engine_args(rag_cfg, docs)
pprint(query_engine_args)
query_engine = RAGQueryEngine(
    retriever_type=rag_cfg['retriever_type'],
    vector_index=index
).create(**query_engine_args)

# Get response
response = query_engine.query(query)

# Print response
print(f'\n\nQUERY: {query}\n')
print(f'RESPONSE:\n{delim}\n{response.response}\n{delim}\n')
print(f'YES/NO: {extract_yes_no(response.response)}\n')
print(f'GT ANSWER: {sample_elm["answer"][0]}\n')
print(f'GT LONG ANSWER:\n{delim}\n{sample_elm["long_answer"]}\n{delim}')

{'hybrid_search_alpha': 0.5,
 'query_mode': 'hybrid',
 'response_mode': 'compact',
 'similarity_top_k': 5,
 'use_reranker': False}


QUERY: Should all human immunodeficiency virus-infected patients with end-stage renal disease be excluded from transplantation?

RESPONSE:
-------------------------
No

The context information suggests that recent advances in the management and prognosis of HIV-infected patients have led to a reevaluation of the policy of excluding them from transplantation. While the majority of transplant centers surveyed would not transplant a kidney from a cadaveric or living donor into an asymptomatic HIV-infected patient, the reasons cited are largely based on fear of harm to the individual and the potential waste of precious organs, rather than any conclusive evidence that transplantation would be contraindicated. This implies that the policy of exclusion may be overly cautious and that individual cases should be considered on a case-by-case basis.
----------------

#### [OPTIONAL] Ragas evaluation

In [30]:
retrieved_nodes = query_engine.retriever.retrieve(query)

eval_data = [dict({
    "user_input": query,
    "response": response.response,
    "retrieved_contexts": [node.text for node in retrieved_nodes],
    "reference": sample_elm['long_answer'],
    "reference_contexts": [sample_elm["context"]],
})]

eval_result = eval_obj.evaluate(eval_data)
pprint(eval_result)

Evaluating:   0%|          | 0/4 [00:00<?, ?it/s]

{'faithfulness': 0.5714, 'answer_relevancy': 0.9438, 'non_llm_context_recall': 1.0000, 'non_llm_context_precision_with_reference': 1.0000}


### 5.3 - Using Re-ranker
Set *use_reranker* to *True* to re-rank the context after retrieving it from the vector database.
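
Conceptually, re-ranking is a second pass: the retriever returns *retriever_similarity_top_k* candidates, a more careful scorer re-scores each one against the query, and only the best *rerank_top_k* are passed to the LLM. The sketch below shows that flow with a toy word-overlap scorer. The scorer and function names are assumptions made for this example; the re-ranker actually used by `RAGQueryEngine` is not shown in this notebook, and real re-rankers typically rely on a cross-encoder or a hosted model.

```python
def token_overlap(query, passage):
    """Toy relevance score: share of query words that appear in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


def rerank(query, passages, top_k, score_fn=token_overlap):
    """Re-score retrieved passages against the query and keep the best `top_k`."""
    # sorted() is stable, so ties keep their original retrieval order
    ranked = sorted(passages, key=lambda p: score_fn(query, p), reverse=True)
    return ranked[:top_k]


# Five made-up candidates, mirroring similarity_top_k=5 and rerank_top_k=3
retrieved = [
    "kidney function declines with age",
    "transplant centers survey on hiv patients",
    "hiv patients and kidney transplant outcomes",
    "guidelines for blood pressure control",
    "dialysis schedules for renal disease",
]
top = rerank("hiv kidney transplant", retrieved, top_k=3)
print(top)
```

Passing fewer, better-ordered passages to the LLM shortens the prompt and can reduce the influence of loosely related context.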

In [31]:
rag_cfg["use_reranker"] = True
rag_cfg["hybrid_search_alpha"] = 1.0 # Using dense search

In [32]:
# Recreate query engine
query_engine_args = set_query_engine_args(rag_cfg, docs)
pprint(query_engine_args)
query_engine = RAGQueryEngine(
    retriever_type=rag_cfg['retriever_type'],
    vector_index=index
).create(**query_engine_args)

# Get response
response = query_engine.query(query)

# Print response
print(f'\n\nQUERY: {query}\n')
print(f'RESPONSE:\n{delim}\n{response.response}\n{delim}\n')
print(f'YES/NO: {extract_yes_no(response.response)}\n')
print(f'GT ANSWER: {sample_elm["answer"][0]}\n')
print(f'GT LONG ANSWER:\n{delim}\n{sample_elm["long_answer"]}\n{delim}')

{'hybrid_search_alpha': 1.0,
 'query_mode': 'hybrid',
 'rerank_top_k': 3,
 'response_mode': 'compact',
 'similarity_top_k': 5,
 'use_reranker': True}


QUERY: Should all human immunodeficiency virus-infected patients with end-stage renal disease be excluded from transplantation?

RESPONSE:
-------------------------
Based on the context information, the answer to the query is no. The survey results indicate that the vast majority of responding centers would not transplant a kidney from a cadaveric (88%) or a living donor (91%) into an asymptomatic HIV-infected patient who is otherwise a good candidate for transplantation. However, this does not necessarily mean that all HIV-infected patients with end-stage renal disease should be excluded from transplantation. The survey results suggest that some centers may consider transplanting an HIV-infected patient, and the authors of the study suggest that the policy of excluding HIV-infected patients from transplantation should be reevaluated. T

#### [OPTIONAL] Ragas evaluation

In [33]:
retrieved_nodes = query_engine.retriever.retrieve(query)

eval_data = [dict({
    "user_input": query,
    "response": response.response,
    "retrieved_contexts": [node.text for node in retrieved_nodes],
    "reference": sample_elm['long_answer'],
    "reference_contexts": [sample_elm["context"]],
})]

eval_result = eval_obj.evaluate(eval_data)
pprint(eval_result)

Evaluating:   0%|          | 0/4 [00:00<?, ?it/s]

{'faithfulness': 0.8182, 'answer_relevancy': 0.9831, 'non_llm_context_recall': 1.0000, 'non_llm_context_precision_with_reference': 1.0000}

