This notebook is a modification of the notebook provided by the MLC group that is available [here](https://mlc.ai/mlc-llm/docs/get_started/try_out.html).

An obvious application of LLMs in healthcare is the simplification and/or summarization of text. A summary may be aimed at a clinician who needs to get oriented to a complex patient's case, or at a patient who is trying to understand her circumstances. It is the latter use case that we address in this tutorial.

Perhaps the most problematic texts for a patient to understand are nursing notes, because they are terse, often lacking context, and filled with cryptic abbreviations and acronyms.

This notebook makes use of a SQLite3 database (`nursing.sqlite`) that has a single table (`nursing`) with the following columns:

- `condition`: the primary diagnosis for the patient: one of `brainca`, `sepsis`, or `assult`
- `id`: the ID of the patient
- `note`: the text of the note (in addition to nursing notes, there might be notes from social workers, respiratory therapists, dieticians, etc.)
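
To make the table layout concrete, here is a minimal sketch that builds an in-memory stand-in for `nursing.sqlite` and queries it. The column types and the placeholder rows are assumptions made for illustration; they are not taken from the real database.

```python
import sqlite3

# In-memory stand-in for nursing.sqlite; the rows are synthetic placeholders.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows reading columns by name
# Column types are assumed; the real file may declare them differently.
conn.execute("CREATE TABLE nursing (condition TEXT, id INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO nursing VALUES (?, ?, ?)",
    [
        ("sepsis", 1, "placeholder note A"),
        ("sepsis", 2, "placeholder note B"),
        ("brainca", 3, "placeholder note C"),
    ],
)

# Count how many notes exist for each condition.
counts = {
    row["condition"]: row["n"]
    for row in conn.execute(
        "SELECT condition, COUNT(*) AS n FROM nursing GROUP BY condition"
    )
}

# Draw one note at random.
sample = conn.execute(
    "SELECT condition, id, note FROM nursing ORDER BY RANDOM() LIMIT 1"
).fetchone()

print(counts)
print(sample["condition"], sample["id"])
conn.close()
```

Reading columns by name through `sqlite3.Row` avoids depending on the column order of the table.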

Because of privacy laws, it would be inappropriate to send actual clinical texts to hosted tools like ChatGPT, and large LLMs, such as the publicly available 70 Gbyte Llama 2, could not be hosted by small healthcare organizations. What we want to explore here is whether smaller LLMs, which a small healthcare organization might feasibly host, can provide adequate results.

We will start with the smaller of the available Llama 2 models (`mlc-chat-Llama-2-7b-chat-hf-q4f16_1`). If you have time, you can try the larger Llama 2 model, but if you are running on Colab, beware that you might run out of GPU credits.
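
As a rough back-of-the-envelope check on why the 7b model is a reasonable starting point, the sketch below estimates weight storage only. It assumes that `7b` and `13b` in the model names mean roughly 7 and 13 billion parameters and that `q4` means roughly 4 bits per weight. It ignores quantization metadata, activations, and the attention cache, so real memory use will be higher. `weight_gigabytes` is a hypothetical helper, not part of MLC.

```python
def weight_gigabytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed parameter counts, read off the model names.
for name, n_params in [("7b", 7e9), ("13b", 13e9)]:
    print(f"{name}: ~{weight_gigabytes(n_params, 4):.1f} GB at 4 bits per weight")
```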

To save GPU credits, it may help to set "Runtime type" to CPU initially and switch to GPU only after the models are downloaded.

__Note__: There are other language models available from MLC. Feel free to explore different models.

# Getting Started with MLC-LLM using the Llama 2 Model

Here's a quick overview of how to get started with the MLC-LLM `ChatModule` in Python. In this tutorial, we will chat with the [Llama2](https://ai.meta.com/llama/) model. For the easiest setup, we recommend trying this out in a Google Colab notebook. Click the button below to get started!

<a target="_blank" href="https://colab.research.google.com/github/mlc-ai/notebooks/blob/main/mlc-llm/tutorial_chat_module_getting_started.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## Environment Setup

Let's set up your environment, so you can successfully run the `ChatModule`. First, let's set up the Conda environment which we will be running this notebook in (not required if running in Google Colab).

```bash
conda create --name mlc-llm python=3.10
conda activate mlc-llm
```

**Google Colab:** If you are running this in a Google Colab notebook, be sure to change your runtime to GPU by going to Runtime > Change runtime type and setting the Hardware accelerator to be "GPU". Select "Connect" on the top right to instantiate your GPU session.

If you are using CUDA, you can run the following command to confirm that CUDA is set up correctly, and check the version number.

In [None]:
!nvidia-smi

Next, let's download the MLC-AI and MLC-Chat nightly build packages. Go to https://mlc.ai/package/ and replace the command below with the one that is appropriate for your hardware and OS.

In [None]:
#!pip install --pre --force-reinstall mlc-ai-nightly-cu118 mlc-chat-nightly-cu118 -f https://mlc.ai/wheels
#!pip install --pre --force-reinstall mlc-ai-nightly mlc-chat-nightly -f https://mlc.ai/wheels

**Google Colab:** If in Google Colab, you may see a message warning you to restart the runtime. Simply run the following code in a new code cell to restart the runtime.

```python
import os
os.kill(os.getpid(), 9)
```

In [None]:
import os
os.kill(os.getpid(), 9)

In [None]:
from google.colab import drive
drive.mount("/content/drive")

import sqlite3 as sq

Next, let's download the model weights for the Llama2 model and the prebuilt model libraries from GitHub. In order to download the large weights, we'll have to use `git lfs`.

Note: If you are NOT running in **Google Colab** you may need to run this line `!conda install git git-lfs` to install `git` and `git-lfs` before running the following cell to fully install `git lfs`.

### Select which model you want to download


In [None]:
%%bash
git lfs install
mkdir -p dist/prebuilt
git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/prebuilt/lib
cd dist/prebuilt && git clone https://huggingface.co/mlc-ai/mlc-chat-Llama-2-7b-chat-hf-q4f16_1
#cd dist/prebuilt && git clone https://huggingface.co/mlc-ai/mlc-chat-Llama-2-13b-chat-hf-q4f16_1

These commands will download many prebuilt libraries as well as the chat configuration for Llama-2-7b that `mlc_chat` needs, which may take a long time. If in **Google Colab** you can verify that the files are being downloaded by clicking on the folder icon on the left and navigating to the `dist` and then `prebuilt` folders which should be updating as the files are being downloaded.
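
Outside Colab, or as a scripted alternative to the file browser, a small check like the one below can confirm that both clone targets exist and are non-empty. This is a sketch: `missing_downloads` is a hypothetical helper, and the directory names are taken from the `git clone` commands above.

```python
from pathlib import Path


def missing_downloads(
    root="dist/prebuilt",
    model_dir="mlc-chat-Llama-2-7b-chat-hf-q4f16_1",
):
    """Return the expected directories under root that are absent or empty."""
    expected = [Path(root) / "lib", Path(root) / model_dir]
    return [str(p) for p in expected if not p.is_dir() or not any(p.iterdir())]


missing = missing_downloads()
print("Missing or empty:", missing if missing else "none")
```

An empty list only shows that the directories contain something; it does not prove that every large file finished downloading.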

#### Make sure Runtime type is set to GPU now

## Let's Chat!

Before we can chat with the model, we must first import a library and instantiate a `ChatModule` instance. The `ChatModule` must be initialized with the appropriate model name.

In [None]:
model7b = "Llama-2-7b-chat-hf-q4f16_1"
model13b = "Llama-2-13b-chat-hf-q4f16_1"

#### If you downloaded a different model, change the model below

In [None]:
from mlc_chat import ChatModule
from mlc_chat.callback import StreamToStdout

cm = ChatModule(model=model7b)

In [None]:
import sqlite3 as sq
#conn = sq.connect("/content/drive/MyDrive/COMP90089/nursing.sqlite")
conn = sq.connect("./nursing.sqlite")

In [None]:
import ipywidgets as ipw
import markdown
import sqlite3 as sq
report = ""

In [None]:
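def update_case(change):
    # Re-render the note pane and clear the previous response.
    dr.value = markdown.markdown(dd.value)
    rsp.value = ""

def submit_sql(b):
    # Run the SQL in `dd` and display the note from the first returned row.
    global report
    conn = sq.connect("nursing.sqlite")
    try:
        data = conn.execute(dd.value).fetchone()
    finally:
        conn.close()
    if data is None:
        dr.value = "<i>Query returned no rows</i>"
        return
    report = data[2]  # with SELECT *, columns are (condition, id, note)
    dr.value = markdown.markdown(report)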

def _run_chat(prompt):
    # Send a prompt to the model, stream tokens to stdout, then render the reply.
    status.value = "<h4>Submitted query</h4>"
    print("execute query")
    response = cm.generate(
        prompt=prompt,
        progress_callback=StreamToStdout(callback_interval=2),
    )
    print("update widgets")
    rsp.value = markdown.markdown(response)
    status.value = "<h4>Awaiting query</h4>"

def submit_query(b):
    # Prepend the instruction in `to` to the note loaded by submit_sql.
    _run_chat(to.value + report)

def submit_conversation(b):
    # Send the follow-up text in `conv` as the next chat turn.
    _run_chat(conv.value)

In [None]:
#submit_sql(None)

In [None]:
# Define widgets


dd = ipw.Textarea("""SELECT * FROM nursing ORDER BY RANDOM() LIMIT 1""")
conv = ipw.Textarea("""continue generating""")

dr = ipw.HTML(markdown.markdown("Text here"))
qr = ipw.HTML()
subcon = ipw.Button(
    description = "Continue chat",
    icon = "bullseye")
subsql = ipw.Button(
    description="Run SQL",
    icon="bullseye")
submit = ipw.Button(
    description='Submit to Chat',
    disabled=False,
    button_style='', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Click me',
    icon='bullseye' # (FontAwesome names without the `fa-` prefix)
)
to = ipw.Textarea(
    value="""Identify and enumerate all the anatomic nouns in the following text. For each noun provide a brief definition in layman terms.

Text to process: """,
    placeholder='Type something',
    description='Prompt:',
    layout=ipw.Layout(height="auto", width="auto"),
    disabled=False
)
rsp = ipw.HTML()
status = ipw.HTML("<h4>Awaiting query</h4>")


# Define Observers

submit.on_click(submit_query)
subsql.on_click(submit_sql)
subcon.on_click(submit_conversation)

#dd.observe(update_case, names="value", type="change")

# Define Layout

grid = ipw.GridspecLayout(5, 2, height="512px")
grid[0,:] = status
grid[1,0] = subsql
grid[1,1] = submit
grid[2,0] = dd
grid[2,1] = to
grid[3,0] = dr
grid[3,1] = rsp
# The grid has only two columns (indices 0 and 1), so the last row uses both.
grid[4,0] = subcon
grid[4,1] = conv

In [None]:
grid

That is all that is needed to set up the `ChatModule`. You can now chat with the model by entering any prompt you'd like. Try it out below!

You can also rerun the code block below for multiple rounds to interact with the model in a chat style.
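
The cell below builds its prompt by concatenating the instruction text with the note. A slightly more defensive way to do the same thing is sketched here. `build_prompt` and its `max_chars` cutoff are hypothetical and not part of the notebook or of MLC; the cutoff is an arbitrary guard against very long notes, not a measured model limit.

```python
def build_prompt(instruction: str, note: str, max_chars: int = 4000) -> str:
    """Join an instruction and a note, truncating very long notes."""
    note = note.strip()
    if len(note) > max_chars:
        # Keep only the first max_chars characters of the note.
        note = note[:max_chars]
    return f"{instruction.rstrip()}\n\nText to process: {note}"


example = build_prompt(
    "Rewrite the following note in plain language.",
    "  placeholder note text  ",
)
print(example)
```

The returned string could be passed as the `prompt` argument of `cm.generate` in place of `to.value + report`.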

In [None]:
#prompt = input("Prompt: ")
prompt = to.value + report
output = cm.generate(prompt=prompt,  progress_callback=StreamToStdout(callback_interval=2))