<a href="https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2023/blob/main/sessions/nlp-text-classification/RSNA23_ACR_contrast_manual_chat.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RSNA 2023: Deep Learning Lab
## Chat with the ACR Contrast Manual
> **_Feel free to save a copy in your Google Drive before you begin._**
### Using LLMs and retrieval-augmented generation (RAG)

`Retrieval-augmented generation (RAG)` is a method of querying existing data to improve the responses of large language models (LLMs) to questions requiring factual and/or specialized knowledge.

RAG is a two-step process where relevant information is first retrieved from a special database, called a `vector database`, that stores `chunks` of text along with the embeddings of that text, typically from a Transformer or LLM.

To answer a question, the user's question or prompt is encoded with the same model that generated the embeddings stored in the vector DB. A `similarity search` then finds the most relevant chunks of text, which are presented to the LLM as part of the prompt alongside the original question.
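
To make the retrieval step concrete, here is a minimal sketch of similarity search in plain Python. The three-dimensional embeddings, the chunk labels, and the helper names `cosine_similarity` and `top_k_chunks` are made up for illustration; a real vector DB stores much longer vectors and typically uses optimized search, so treat this as a picture of the idea rather than how LlamaIndex implements it.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(query_vec, store, k=2):
    """Return the k (score, chunk) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), chunk) for chunk, vec in store]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

# Toy "vector database": (chunk, embedding) pairs with made-up 3-d embeddings.
toy_store = [
    ("Metformin and iodinated contrast", [0.9, 0.1, 0.0]),
    ("Extravasation management", [0.0, 0.2, 0.9]),
    ("Kidney function screening", [0.7, 0.6, 0.1]),
]

# Hypothetical embedding of a question about metformin.
query_embedding = [1.0, 0.2, 0.0]

results = top_k_chunks(query_embedding, toy_store, k=2)
for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

The chunks whose embeddings point in nearly the same direction as the query embedding score highest, and only those are passed on to the LLM.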

In this way, you can "chat" with any document or database you like. One popular application of this is `manual-as-a-service`.
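
"Chatting" with a document in this sense largely comes down to assembling a prompt from the retrieved chunks and the user's question. The sketch below shows one way to do that. The template wording and the function name `build_rag_prompt` are assumptions for illustration; LlamaIndex uses its own prompt templates internally.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Combine retrieved chunks and the user's question into one prompt string."""
    context = "\n\n".join(
        f"[Source {i}]\n{chunk.strip()}"
        for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Placeholder chunk texts stand in for what similarity search would return.
prompt = build_rag_prompt(
    "Should metformin be stopped before gadolinium-based contrast?",
    ["(text of retrieved chunk 1)", "(text of retrieved chunk 2)"],
)
print(prompt)
```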

## Module Overview
In this module, we will:
1. Split the [ACR Contrast Manual](https://www.acr.org/-/media/ACR/Files/Clinical-Resources/Contrast_Media.pdf) into chunks.
2. Create embeddings for the chunks with the MedCPT Article Encoder.
3. Store the `chunk:embedding` pairs in a vector DB.
4. Set up a RAG Q&A pipeline with the `LlamaIndex` framework.
5. Test out our RAG Q&A pipeline to "chat" with the ACR Contrast Manual.

## References
* https://www.promptingguide.ai/techniques/rag
* MedCPT ArXiv paper: https://arxiv.org/abs/2307.00589
    - MedCPT Article Encoder on Hugging Face: https://huggingface.co/ncbi/MedCPT-Article-Encoder
* LlamaIndex LLM Application Framework: https://docs.llamaindex.ai/en/stable/index.html
    - LlamaIndex `llama-cpp-python` Integration: https://docs.llamaindex.ai/en/stable/examples/llm/llama_2_llama_cpp.html
* Llama2-7B-Chat on Hugging Face: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF

In [None]:
# @title Installing required libraries
# @markdown This cell will take approximately 4 minutes to run.<br><br>When the cell finishes running the `Runtime` will be restarted. This will appear as an error saying that your session crashed for an unknown reason.
# @markdown <br><br>Don't worry, this is expected. After the error shows up, simply run the next cell to proceed.

%%capture
!pip uninstall numpy -y
!pip install numpy==1.25
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.11 --force-reinstall --upgrade --no-cache-dir
!pip install -U \
    llama-index==0.8.69.post2 \
    huggingface-hub==0.19.3 \
    transformers==4.35.2 \
    pypdf==3.17.1

import os
os.kill(os.getpid(), 9)

In [None]:
# @title Import the necessary libraries and functions
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.readers import PDFReader
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

from pathlib import Path

In [None]:
# @title Download the ACR Contrast Manual

!mkdir -p /content/pdfs
!wget -qP /content/pdfs https://www.acr.org/-/media/ACR/Files/Clinical-Resources/Contrast_Media.pdf
# !wget -qP /content/pdfs https://www.acr.org/-/media/ACR/Files/Radiology-Safety/MR-Safety/Manual-on-MR-Safety.pdf

In [None]:
# @title Load the PDF

pdf_folder_path = Path("/content/pdfs")
documents = PDFReader().load_data(pdf_folder_path/"Contrast_Media.pdf")
# To load every PDF in the folder into one flat list of documents:
# documents = [doc for fn in pdf_folder_path.glob('**/*.pdf') for doc in PDFReader().load_data(fn)]

In [None]:
# @title Obtain our embedding model from Hugging Face Hub

embed_model = HuggingFaceEmbedding(model_name="ncbi/MedCPT-Article-Encoder")

In [None]:
# @title We'll be using the Llama-2-7B-Chat model via the Llama.cpp integration

model_url = "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf"

llm = LlamaCPP(
    # You can pass in the URL to a GGUF model to download it automatically
    model_url=model_url,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.1,
    max_new_tokens=512,
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=2048,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    model_kwargs={"n_gpu_layers": 36},
    # transform inputs into Llama2 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True
)

Downloading url https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf to path /tmp/llama_index/models/llama-2-7b-chat.Q4_K_M.gguf
total size (MB): 4081.0


3892it [00:29, 130.81it/s]                          
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


In [None]:
# @title Create our `ServiceContext` to specify our custom embeddings and LLM
# @markdown You can experiment with the `chunk_size` parameter to determine its effect on inference speed and effective retrieval.

chunk_size = 1000 #@param {type:"integer"}

service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    chunk_size=chunk_size
)

[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.
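
To build intuition for what `chunk_size` controls, the following sketch splits text into overlapping windows of words. It is a simplified stand-in, not LlamaIndex's splitter: the function name `split_into_chunks`, the word-based counting, and the `overlap` value are all assumptions made for illustration, whereas the framework counts tokens and tries to respect sentence boundaries.

```python
def split_into_chunks(text, chunk_size=50, overlap=10):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

# Synthetic 120-word text, used only to compare chunk counts.
sample_text = " ".join(f"word{i}" for i in range(120))
for size in (30, 60):
    chunks = split_into_chunks(sample_text, chunk_size=size, overlap=5)
    print(f"chunk_size={size}: {len(chunks)} chunks")
```

Smaller chunks give more, narrower entries in the index, while larger chunks give fewer entries that each carry more surrounding context into the prompt.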


In [None]:
# @title Create our `VectorStoreIndex` Query Engine from the PDF and our `ServiceContext`
# @markdown Another hyperparameter to experiment with is `similarity_top_k`. This is the number of chunks of text that will be retrieved from the `VectorStoreIndex` for each query.
top_k = 3 #@param {type:"integer"}

index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)

query_engine = index.as_query_engine(similarity_top_k=top_k)
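
When experimenting with `chunk_size` and `similarity_top_k`, it helps to check whether the retrieved text can fit in the LLM's context window at all. The sketch below is a back-of-the-envelope check using the values set in this notebook. The helper name `fits_in_one_prompt` and the 200-token allowance for the prompt template and question are assumptions, and it treats every chunk as full-size and ignores differences between tokenizers.

```python
def fits_in_one_prompt(context_window, max_new_tokens, chunk_size, top_k,
                       prompt_overhead=200):
    """Rough check: can top_k full-size chunks share one prompt with the answer budget?

    prompt_overhead is a guessed allowance for the template and the question.
    """
    available_for_context = context_window - max_new_tokens - prompt_overhead
    needed_for_context = chunk_size * top_k
    fits = needed_for_context <= available_for_context
    return fits, available_for_context, needed_for_context

# Values taken from the cells above: context_window=2048, max_new_tokens=512,
# chunk_size=1000, similarity_top_k=3.
fits, available, needed = fits_in_one_prompt(
    context_window=2048, max_new_tokens=512, chunk_size=1000, top_k=3
)
print(f"available={available} tokens, needed={needed} tokens, fits={fits}")
```

When the retrieved chunks do not fit in a single prompt, the framework has to spread them over more than one LLM call or shorten them, so larger values of either parameter tend to slow down each query.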

In [None]:
# @title Test our RAG Q&A pipeline

response = query_engine.query("What is the GFR threshold below which IV contrast should be withheld in a patient with acute kidney injury?")
print(response)

Llama.generate: prefix-match hit


  Thank you for providing additional context. Based on the updated information, there is no specific GFR threshold mentioned in the provided references that indicates when to withhold IV contrast in patients with acute kidney injury. However, it is suggested that if a threshold for CI-AKI risk is used at all, 30 mL/min/1.73m2 seems to be the one with the greatest level of evidence [96].
It is important to note that no serum creatinine or eGFR threshold is adequate to stratify risk for patients with AKI because serum creatinine in this setting is unreliable [134, 135]. Therefore, any threshold used must be weighed on an individual patient level with the benefits of administering contrast material.
In summary, while there is no specific GFR threshold mentioned in the provided references for withholding IV contrast in patients with acute kidney injury, a threshold of 30 mL/min/1.73m2 seems to be the most commonly cited and evidence-based recommendation. However, it is important to conside

Play around with different queries and see how well the model responds.

You could even try different embedding models and LLMs available on Hugging Face.

In [None]:
#@title Test other questions
#@markdown Edit the text below and re-run the cell to try a different question.

question = "What considerations should I be aware of when giving IV contrast to a patient taking metformin?" #@param {type:"string"}
response = query_engine.query(question)
print(response)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


  Thank you for providing additional context. Based on the new information provided, here is a refined answer to your original question:
When giving IV contrast to a patient taking metformin, it is essential to consider several factors to ensure safe and effective treatment. As an honest and respectful assistant, I must inform you that I cannot provide medical advice or make recommendations without proper training and qualifications. However, I can provide general information on the considerations that should be taken into account when administering IV contrast to a patient taking metformin.
Firstly, it is important to consult with a qualified medical professional who can assess the patient's individual risks and benefits of administering IV contrast. This includes discussing potential risks such as contrast-induced nephropathy, which may be more likely in patients with chronic kidney disease or those taking medications that affect kidney function, such as metformin.
Secondly, it is im

## Inspecting the Retrieved Source Texts

Sometimes, it can be helpful to see what source texts were provided as context for the LLM to answer your query. This can help in troubleshooting and determining whether to increase the number of texts retrieved for each query.

In [None]:
response = query_engine.query("Should metformin be discontinued when giving IV gadolinium-based contrast?")
print(response)
sources = response.get_formatted_sources(length=1000)
print(sources)

Llama.generate: prefix-match hit


  Based on the provided context information, there is no clear indication to discontinue metformin before administering IV gadolinium-based contrast. The American College of Radiology (ACR) Manual on Contrast Media states that "there is no evidence to suggest that metformin should be discontined prior to administration of gadolinium" (p. 53). In fact, the ACR recommends that patients with chronic kidney disease (CKD) who are receiving metformin and require IV contrast should continue their medication unless there are contraindications or significant interactions (ACR Manual on Contrast Media, p. 53).
It is important to note that the risk of contrast-induced nephropathy (CIN) in patients with CKD is highly dependent on the degree of kidney dysfunction and the amount of contrast used. Therefore, it is crucial to carefully evaluate each patient's individual risks and benefits before administering IV contrast.
In summary, based on the provided context information, there is no clear indicat
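
Besides the formatted string, each retrieved chunk can be inspected individually, which makes it easier to compare similarity scores and spot chunks that are mostly white space. The sketch below assumes the response object exposes `source_nodes`, where each entry has a `score` and a `node` with `get_content()` and a `metadata` dictionary. The `page_label` key is an assumption about what the PDF reader records, and `summarize_sources` is a hypothetical helper name. Stand-in objects are used so the example runs without the pipeline.

```python
from types import SimpleNamespace

def summarize_sources(response, preview_chars=80):
    """Return one line per retrieved chunk: similarity score, page label, text preview."""
    lines = []
    for source in response.source_nodes:
        # Collapse runs of white space so the preview shows actual content.
        text = " ".join(source.node.get_content().split())
        page = source.node.metadata.get("page_label", "?")
        score = source.score if source.score is not None else float("nan")
        lines.append(f"score={score:.3f} page={page} {text[:preview_chars]}")
    return lines

# Stand-in objects with placeholder values; in the notebook, pass the object
# returned by query_engine.query(...) instead.
fake_node = SimpleNamespace(
    get_content=lambda: "placeholder   chunk\n text",
    metadata={"page_label": "12"},
)
fake_response = SimpleNamespace(
    source_nodes=[SimpleNamespace(node=fake_node, score=0.8123)]
)
print("\n".join(summarize_sources(fake_response)))
```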

### Troubleshooting RAG
Some of the issues that can arise with the approach implemented above include:
- Sources with a lot of white space
- Header/footer material that isn't very useful
- Irrelevant sources

Retrieval via nearest-neighbor search over embeddings is imperfect. Some additional techniques you can try to make your RAG approach more robust include:
- Data cleaning prior to embedding and input into the vector database
- Re-ranking after retrieval
- Sentence-window retrieval
- Auto-merging retrieval

To learn more about these methods, see the following free course from DeepLearning.AI: https://www.deeplearning.ai/short-courses/building-evaluating-advanced-rag/
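
As an illustration of the first technique in the list above, data cleaning, the sketch below collapses white space and drops lines that repeat across most pages, which are likely headers or footers. The helper name `clean_pages`, the 60% repetition threshold, and the toy page texts are assumptions for illustration; real PDFs often need extra rules, for example for footers that contain changing page numbers.

```python
import re
from collections import Counter

def clean_pages(pages, repeat_fraction=0.6):
    """Collapse white space and drop lines that recur on most pages."""
    split_pages = [
        [re.sub(r"\s+", " ", line).strip() for line in page.splitlines()]
        for page in pages
    ]
    # Count on how many pages each distinct non-empty line appears.
    page_counts = Counter(
        line for lines in split_pages for line in set(lines) if line
    )
    threshold = repeat_fraction * len(pages)
    repeated = {
        line for line, count in page_counts.items()
        if count > 1 and count >= threshold
    }
    # Keep only non-empty lines that are not repeated headers/footers.
    return [
        "\n".join(line for line in lines if line and line not in repeated)
        for lines in split_pages
    ]

# Toy pages with an identical header and footer on each one.
toy_pages = [
    "Example Manual Header\n\nFirst   page body text.\n\n\nFooter text",
    "Example Manual Header\nSecond page    body text.\nFooter text",
    "Example Manual Header\nThird page body text.\nFooter text",
]
cleaned = clean_pages(toy_pages)
print(cleaned)
```

Cleaning like this would be applied to the page texts before they are chunked and embedded, so that headers and blank regions do not take up space in retrieved chunks.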


## More Test Queries

In [None]:
response = query_engine.query("What are some of the concerns regarding children and IV contrast?")
print(response)

Llama.generate: prefix-match hit


  Based on the provided context, here are some concerns regarding children and IV contrast:
1. Acute reactions to contrast media in children can be severe and require immediate medical attention.
2. Children are at a higher risk for extravasation-related complications due to their smaller body size and immature circulatory system.
3. The signs and symptoms of extravasation can be subtle and may not always be obvious, making it crucial to closely monitor children during and after contrast media injection.
4. Children with underlying medical conditions or those who are severely ill or debilitated are at a higher risk for complications from contrast media extravasation.
5. The choice of injection site can affect the risk of extravasation, and certain sites (e.g., hand, wrist, foot, and ankle) may be more prone to complications.
6. Injection rates may also play a role in the risk of extravasation, with higher flow rates potentially increasing the likelihood of complications.
7. Children wh

In [None]:
response = query_engine.query("When should surgical consultation be obtained after contrast extravasation?")
print(response)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


  Thank you for providing additional context. Based on the updated information, when should surgical consultation be obtained after contrast extravasation?
Surgical consultation should be obtained urgently when there is concern for a severe extravasation injury, such as:
* Severe pain that persists or worsens over time
* Progressive swelling or pain that cannot be controlled with elevation or other measures
* Altered tissue perfusion as evidenced by decreased capillary refill, which can indicate ischemia or necrosis
* Change in sensation in the affected limb, such as numbness or tingling
* Worsening passive or active range of motion, which can indicate muscle or nerve damage
* Skin ulceration or blistering, which can indicate infection or necrosis

It is important to closely monitor the patient and seek surgical consultation if any of these signs or symptoms develop. While some interventions such as warm or cold compresses may be helpful in managing symptoms, there is no clear evidence

In [None]:
response = query_engine.query("Can I power-inject contrast into a PICC line?")
print(response)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


  Thank you for providing additional context. Based on the updated information, the answer to the query remains the same as the original answer. Power-injecting contrast into a PICC line is not recommended due to the risk of tip migrations and other complications. It is important to follow manufacturer recommendations and only use power injection through certified port sites. Mechanical injections can be performed through some pressure-injectable peripherally inserted central catheters (PICCs), but it is crucial to ensure that the port site is certified as power-injectable before using a central venous line for power injection.
Therefore, the answer to the query remains: No, you should not power-inject contrast into a PICC line.


In [None]:
response = query_engine.query("When should kidney function be checked prior to giving iodinated contrast?")
print(response)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


  Thank you for providing additional context! Based on the new information provided, the recommended interval for checking kidney function prior to giving iodinated contrast may vary depending on the individual patient's risk factors and medical history. In general, it is suggested to check kidney function in patients who have a new risk factor or heightened risk of renal dysfunction, such as those with a history of kidney disease, diabetes, or heart failure, within 30 days to 1 week prior to giving iodinated contrast. However, for patients who are taking medications that can affect kidney function, such as non-steroidal anti-inflammatory drugs (NSAIDs) or corticosteroids, it is recommended to check kidney function more frequently, ideally within 24 hours of administering these medications. Additionally, patients who have a history of contrast-induced nephropathy or acute kidney injury may require more frequent monitoring of their kidney function, typically every 1-2 days for the first

In [None]:
response = query_engine.query("In what situations should kidney function be checked prior to iodinated contrast?")
print(response)

Llama.generate: prefix-match hit
Llama.generate: prefix-match hit


  Thank you for providing additional context! Based on the updated information, it is important to check kidney function before administering iodinated contrast in situations where there are new risk factors or a new risk of renal dysfunction, such as inpatients or those with pre-existing kidney disease. It is also recommended to assess kidney function in patients taking medications that can affect the kidneys, such as non-steroidal anti-inflammatory drugs (NSAIDs).
In summary, the refined answer is:
Kidney function should be checked prior to administering iodinated contrast in the following situations:
1. New risk factors or a new risk of renal dysfunction: If a patient has recently developed new risk factors for contrast-induced nephropathy, such as diabetes, hypertension, or a history of kidney disease, it may be prudent to assess their kidney function before administering iodinated contrast.
2. Inpatients: Inpatients are at a higher risk for contrast-induced nephropathy compared to