# What is this notebook about?
A Gen AI powered question-answering application, built in a notebook, for learning about Medicare Part D medication spend.

# Motivation
In healthcare, adoption of LLM-powered applications is slow in comparison to other industries. 
A pragmatic and cautious approach to Gen AI adoption is more common in healthcare, in contrast to the 'AI-first' approach in other industries such as media. Maturity of tools that build trust, e.g. explainable AI and guardrails, is key to increased adoption. 

This notebook explores the use of NVIDIA NeMo Guardrails in a RAG application that answers questions on medication spend using publicly available data.

# Goals
Use healthcare-related public data from CMS 

Use guardrails to keep the LLM's responses on the subject

Use NVIDIA and open-source tools

# Tools
LLM : Open-source Mistral AI LLM - mistralai/mixtral-8x7b-instruct-v0.1

Access : NVIDIA endpoint access, with the chat interface integrated with LangChain

Vector Store : DocArrayInMemorySearch

Embeddings : NVIDIA Embeddings

Guardrail on chain : NeMo Guardrails 

# High Level Diagram 
The diagram below gives a high-level overview of the application.
![Highlevel Overview](HighLevelOverview.svg)

# What is demonstrated
When the question is related to medication spend, the chain returns a relevant answer both with and without guardrails.
When you ask the LLM to tell a joke, it tells one without guardrails but politely refuses with guardrails.

# Data
Source : https://data.cms.gov/summary-statistics-on-use-and-payments/medicare-medicaid-spending-by-drug/medicare-part-d-spending-by-drug/data

Downloaded and curated for the scope of this notebook.

# Author
Jayanthi Suryanarayana, MN

In [1]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [2]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
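
The NVIDIA endpoint classes used later need an API key from the environment. A minimal sketch of an early check, assuming the key is stored in `.env` under `NVIDIA_API_KEY` (the variable name `langchain_nvidia_ai_endpoints` reads by default); the helper `require_env` and the example key value are hypothetical:

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Example with an explicit mapping so the check runs without a real key.
print(require_env("NVIDIA_API_KEY", {"NVIDIA_API_KEY": "nvapi-example"}))
```

Calling `require_env("NVIDIA_API_KEY")` without the second argument checks the real process environment after `load_dotenv` has run.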


In [3]:
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser

In [4]:
from langchain.vectorstores import DocArrayInMemorySearch

In [5]:
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
embedder = NVIDIAEmbeddings(model="NV-Embed-QA")

In [6]:
from langchain.document_loaders import CSVLoader
file = 'Medicare_Part_D_Spending_mod.csv'
loader = CSVLoader(file_path=file)

In [7]:

data = loader.load()
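
Each CSV row becomes one document whose text is a list of `column: value` lines, which is the shape visible in the retriever output further down. A rough standard-library sketch of that transformation, assuming a subset of the dataset's columns; `rows_to_documents` is a hypothetical helper, not the loader's actual implementation:

```python
import csv
import io

# Two sample rows with a subset of the columns seen in the dataset.
sample_csv = """Brnd_Name,Gnrc_Name,Mftr_Name,Tot_Spndng_2021
Abacavir,Abacavir Sulfate,Overall,7036063.99
1st Tier Unifine Pentips Plus,"Pen Needle, Diabetic",Owen Mumford Us,131927.33
"""


def rows_to_documents(text, source):
    """Turn each CSV row into a dict with page_content and metadata."""
    documents = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text))):
        # One "column: value" line per field, joined with newlines.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            {"page_content": content, "metadata": {"source": source, "row": i}}
        )
    return documents


docs = rows_to_documents(sample_csv, "sample.csv")
print(docs[0]["page_content"])
```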

In [8]:
vectorstore = DocArrayInMemorySearch.from_documents(data, embedding=embedder)
retriever = vectorstore.as_retriever()
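
Conceptually, an in-memory vector store keeps one embedding per document and returns the documents whose embeddings are closest to the query embedding. A toy sketch of that idea using cosine similarity; the three-dimensional vectors are made up for illustration and stand in for real embeddings, and `top_k` is a hypothetical helper:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed, k=2):
    """Return the k texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Made-up vectors standing in for real embeddings.
indexed = [
    ("row about drug spending", [0.9, 0.1, 0.0]),
    ("row about pen needles", [0.2, 0.8, 0.1]),
    ("row about something else", [0.0, 0.1, 0.9]),
]
print(top_k([1.0, 0.2, 0.0], indexed, k=2))
```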



In [9]:
retriever.get_relevant_documents("Please list medications with maximum spend")


[Document(page_content='Brnd_Name: Abacavir\nGnrc_Name: Abacavir Sulfate\nTot_Mftr: 6\nMftr_Name: Overall\nTot_Spndng_2021: 7036063.99\nTot_Dsg_Unts_2021: 2500817\nTot_Clms_2021: 30527\nTot_Benes_2021: 4254\nAvg_Spnd_Per_Dsg_Unt_Wghtd_2021: 3.187345204\nAvg_Spnd_Per_Clm_2021: 230.4865853\nAvg_Spnd_Per_Bene_2021: 1653.987774\nOutlier_Flag_2021: 0', metadata={'source': 'Medicare_Part_D_Spending_mod.csv', 'row': 4}),
 Document(page_content='Brnd_Name: 1st Tier Unifine Pentips Plus\nGnrc_Name: Pen Needle, Diabetic\nTot_Mftr: 1\nMftr_Name: Owen Mumford Us\nTot_Spndng_2021: 131927.33\nTot_Dsg_Unts_2021: 566872\nTot_Clms_2021: 4564\nTot_Benes_2021: 1766\nAvg_Spnd_Per_Dsg_Unt_Wghtd_2021: 0.232811541\nAvg_Spnd_Per_Clm_2021: 28.90607581\nAvg_Spnd_Per_Bene_2021: 74.70403737\nOutlier_Flag_2021: 0', metadata={'source': 'Medicare_Part_D_Spending_mod.csv', 'row': 3}),
 Document(page_content='Brnd_Name: 1st Tier Unifine Pentips Plus\nGnrc_Name: Pen Needle, Diabetic\nTot_Mftr: 1\nMftr_Name: Overall\nTo

In [10]:
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
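
To see what the model receives, the same template can be filled with plain string formatting. This is only an illustration: the context rows are shortened samples, and in the notebook's chain the context is a list of `Document` objects rather than plain strings, so the exact rendered text will differ.

```python
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

# Shortened sample rows, used here in place of retrieved documents.
context_rows = [
    "Brnd_Name: Abacavir\nTot_Spndng_2021: 7036063.99",
    "Brnd_Name: 1st Tier Unifine Pentips Plus\nTot_Spndng_2021: 131927.33",
]

filled = template.format(
    context="\n\n".join(context_rows),
    question="List medications of maximum spend",
)
print(filled)
```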

In [11]:
from langchain_nvidia_ai_endpoints import ChatNVIDIA
llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")

In [12]:
output_parser = StrOutputParser()

In [13]:
from langchain.schema.runnable import RunnableMap

In [14]:
chain = RunnableMap({
    "context": lambda x: retriever.get_relevant_documents(x["question"]),
    "question": lambda x: x["question"]
}) | prompt | llm | output_parser
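
The pipe operators run the stages left to right: build the input map, fill the prompt, call the model, then extract the text. A plain-Python sketch of the same data flow, with stand-in functions (`fake_retriever`, `fake_llm` and the rest are hypothetical and make no network calls):

```python
def fake_retriever(question):
    """Stand-in for the vector store retriever; returns a fixed row."""
    return ["Brnd_Name: Abacavir\nTot_Spndng_2021: 7036063.99"]


def build_prompt(inputs):
    """Fill the prompt text from the mapped inputs."""
    return (
        "Answer the question based only on the following context:\n"
        f"{inputs['context']}\n\nQuestion: {inputs['question']}\n"
    )


def fake_llm(prompt_text):
    """Stand-in for the model call; reports the prompt length instead of an answer."""
    return {"content": f"(model reply to a {len(prompt_text)}-character prompt)"}


def parse_output(message):
    """Extract the plain text from the model message."""
    return message["content"]


def run_chain(inputs):
    # Step 1: map the question to both a context and a question field.
    mapped = {
        "context": fake_retriever(inputs["question"]),
        "question": inputs["question"],
    }
    # Steps 2 to 4: prompt, model, output parser.
    return parse_output(fake_llm(build_prompt(mapped)))


print(run_chain({"question": "List medications of maximum spend"}))
```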

In [15]:
chain.invoke({"question": "tell a joke"})

" Sure, here's a joke for you:\n\nWhy did the tomato turn red?\n\nBecause it saw the salad dressing!"

In [16]:
chain.invoke({"question":"List medications of maximum spend"})

' Based on the provided documents, the medications with the maximum spend in 2021 are:\n\n1. Abacavir Sulfate (Brand Name: Abacavir) with a total spending of $7036063.99.\n2. 1st Tier Unifine Pentips Plus (Brand Name: 1st Tier Unifine Pentips Plus) with a total spending of $131927.33, listed twice with different manufacturers.\n3. 1st Tier Unifine Pentips (Brand Name: 1st Tier Unifine Pentips) with a total spending of $102280.76.\n\nThese are the only medications listed in the documents with the "Outlier\\_Flag\\_2021" set to 0, indicating that they are not considered outliers and have the highest spending in 2021.'
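
The retriever ranks rows by similarity to the question text, not by the spend value, so the answer above only compares the few rows that were retrieved. For a true ranking, the spend column can be sorted directly. A minimal sketch on three sample rows taken from the outputs in this notebook, assuming the column name `Tot_Spndng_2021`; `top_by_spend` is a hypothetical helper:

```python
import csv
import io

# Three sample rows; the real file has many more rows and columns.
sample_csv = """Brnd_Name,Mftr_Name,Tot_Spndng_2021
1st Tier Unifine Pentips Plus,Owen Mumford Us,131927.33
Abacavir,Rising Pharm,123118.64
Abacavir,Overall,7036063.99
"""


def top_by_spend(text, n):
    """Return the n rows with the highest 2021 total spending."""
    rows = list(csv.DictReader(io.StringIO(text)))
    rows.sort(key=lambda row: float(row["Tot_Spndng_2021"]), reverse=True)
    return rows[:n]


for row in top_by_spend(sample_csv, 2):
    print(row["Brnd_Name"], row["Mftr_Name"], row["Tot_Spndng_2021"])
```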

In [17]:
#llm.get_available_models()

In [18]:
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

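# Load the guardrails configuration from the local ./config directory
config = RailsConfig.from_path("./config")
# input_key names the field of the input dict that holds the user's message
guardrails = RunnableRails(config, input_key="question")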

Fetching 5 files: 100%|██████████| 5/5 [00:00<00:00, 41282.52it/s]


In [19]:
chain_with_guardrails = guardrails | chain
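
The idea of placing a rail in front of a chain can be illustrated with a toy wrapper that refuses off-topic questions before the chain runs. This is a conceptual sketch only: the keyword list, the refusal text and all function names are hypothetical. NeMo Guardrails itself is driven by the files in `./config`, which this notebook does not show, and does not use simple keyword matching like this.

```python
ALLOWED_KEYWORDS = ("medication", "drug", "spend", "medicare", "claim")
REFUSAL = "I can only answer questions about Medicare Part D medication spend."


def is_on_topic(question):
    """Very rough topic check based on keyword matching."""
    text = question.lower()
    return any(keyword in text for keyword in ALLOWED_KEYWORDS)


def with_input_rail(chain_fn):
    """Wrap a chain so off-topic questions get a refusal instead of an answer."""
    def guarded(inputs):
        if not is_on_topic(inputs["question"]):
            return REFUSAL
        return chain_fn(inputs)
    return guarded


def echo_chain(inputs):
    """Stand-in for the real chain; no model call is made."""
    return f"answer for: {inputs['question']}"


guarded_chain = with_input_rail(echo_chain)
print(guarded_chain({"question": "tell me a funny joke"}))
print(guarded_chain({"question": "Please list medications with maximum spend"}))
```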

In [20]:
import nest_asyncio

# Patch asyncio to allow nested event loops inside the notebook's already running loop
nest_asyncio.apply()

In [21]:
await chain_with_guardrails.ainvoke({"question": "Please list medications with maximum spend"})


' Based on the provided documents, the medications with the maximum spend in 2021 are:\n\n1. "Abacavir" (brand name) manufactured by "Overall" with a total spending of 7036063.99\n2. "1st Tier Unifine Pentips Plus" (brand name) manufactured by "Owen Mumford Us" with a total spending of 131927.33\n3. "Abacavir" (brand name) manufactured by "Rising Pharm" with a total spending of 123118.64\n\nThese are the three medications with the highest total spending in 2021.'

In [22]:
await chain_with_guardrails.ainvoke({"question": "tell me a funny joke"})


' I\'m sorry for the misunderstanding, but I\'m unable to tell you a joke as I\'m here to provide information based on the given context. Here is some information about the documents you provided:\n\nThe documents contain data about two products, "1st Tier Unifine Pentips" and "1st Tier Unifine Pentips Plus", both of which are Pen Needles for Diabetic use. The manufacturer of both products is "Owen Mumford Us". The documents provide information on several metrics for the year 2021, including total spending, total dosage units, total claims, total beneficiaries, average spending per dosage unit (weighted), average spending per claim, and average spending per beneficiary. There\'s also an "Outlier Flag" for the year 2021, which is 0 for both products, indicating that there are no outliers detected.\n\nFor the same products, there are also metrics provided for "Overall" manufacturer. The metrics for the "1st Tier Unifine Pentips" and "1st Tier Unifine Pentips Plus" for "Overall" manufactu