### Dell Technologies Proof of Concept - CHAT, ANALYZE AND GRAPH YOUR CSV DATA WITH MISTRAL LLM
- Model:  Mistral-7B-Instruct-v0.2
- Workload:  CSV analysis using an LLM
- Orchestrator:  LangChain CSV agent
- Dataset:  hospital visits

Note: The software and sample files are provided “as is” and are to be used only in conjunction with this POC application. They should not be used in production and are provided without warranty or guarantees. Please use them at your own discretion.
    

### Hugging Face tools

You need to log in to Hugging Face at least once to access the Hub for tools and the model.  After that, you can comment this section out.

In [1]:
## code to auto login to hugging face, avoid the login prompt
# %pip install huggingface-hub==0.16.4
# %pip install --upgrade huggingface-hub

# get your account token from https://huggingface.co/settings/tokens
# this is a read-only test token

token = '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'

from huggingface_hub import login
login(token=token, add_to_git_credential=True)

Token is valid (permission: fineGrained).
Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set the 'store' credential helper as default.

git config --global credential.helper store

Read https://git-scm.com/book/en/v2/Git-Tools-Credential-Storage for more details.
Token has not been saved to git credential helper.
Your token has been saved to /home/daol/.cache/huggingface/token
Login successful
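
Hard-coding a token in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the token was exported in an `HF_TOKEN` environment variable before the kernel started. The helper names `read_hf_token` and `login_from_env` are hypothetical and not part of the original notebook:

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in env_var, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} before running this cell.")
    return token


def login_from_env(env_var: str = "HF_TOKEN") -> None:
    """Log in to the Hugging Face Hub without an interactive prompt."""
    # Imported lazily so read_hf_token() works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=read_hf_token(env_var))
```

Calling `login_from_env()` would then stand in for the `login(token=token, ...)` call above, with no token text stored in the notebook itself.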


### Install python libraries and applications

Use the %pip magic so packages are installed into this notebook's conda environment and not the OS Python.

In [2]:
# %pip install accelerate  ## for use of device map feature
# %pip install transformers
# %pip install langchain
# %pip install langchain_experimental==0.0.51
# %pip install matplotlib
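
To confirm which environment the kernel is actually using before installing anything, a small check like the following may help. It is a sketch only: `kernel_environment` is a hypothetical helper, and `CONDA_DEFAULT_ENV` is set by `conda activate`, so it may be missing or stale if the kernel was launched some other way.

```python
import os
import sys


def kernel_environment() -> dict:
    """Describe the Python environment this kernel is running in."""
    return {
        "executable": sys.executable,  # interpreter running the notebook
        "prefix": sys.prefix,          # root directory of the active environment
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # None outside conda
    }


for key, value in kernel_environment().items():
    print(f"{key}: {value}")
```

If the printed interpreter path points outside the intended conda environment, packages installed from this notebook will not end up where you expect.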

In [3]:
### Check installed GPU

In [4]:
!nvidia-smi

Mon Dec 30 08:22:03 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA L40S                    Off |   00000000:61:00.0 Off |                    0 |
| N/A   36C    P0             82W /  350W |    3927MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L40S                    Off |   00

### Assign GPU environment vars and ID order

NOTE:  To change which GPU is visible, set CUDA_VISIBLE_DEVICES to the ID of the GPU you prefer. 
Because PyTorch can only see the GPUs listed there, workloads cannot land on any other GPU.

In [5]:
## THESE VARIABLES MUST BE SET BEFORE TORCH OR CUDA IS IMPORTED
## set visible GPU devices and order of IDs to the PCI bus order
## target the L40S that is on ID 0 (use "1" for the second L40S)

import os
os.environ["CUDA_VISIBLE_DEVICES"]="0"   

## this integer corresponds to the ID of the GPU, for multiple GPU use "0,1,2,3"...
## to disable all GPUs, simply put empty quotes ""

os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"

### Investigate our GPU and CUDA environment

NOTE:  If only a single GPU is made visible in the settings above, the active CUDA device is always 0, because it is the only GPU PyTorch can see.

In [6]:
import torch
import sys
import os
from subprocess import call
print('_____Python, Pytorch, Cuda info____')
print('__Python VERSION:', sys.version)
print('__pyTorch VERSION:', torch.__version__)
print('__CUDA RUNTIME API VERSION')
#os.system('nvcc --version')
print('__CUDNN VERSION:', torch.backends.cudnn.version())
print('_____nvidia-smi GPU details____')
call(["nvidia-smi", "--format=csv", "--query-gpu=index,name,driver_version,memory.total,memory.used,memory.free"])
print('_____Device assignments____')
print('Number CUDA Devices:', torch.cuda.device_count())
print ('Current cuda device: ', torch.cuda.current_device(), ' **May not correspond to nvidia-smi ID above, check visibility parameter')
print("Device name: ", torch.cuda.get_device_name(torch.cuda.current_device()))

_____Python, Pytorch, Cuda info____
__Python VERSION: 3.12.8 (main, Dec  6 2024, 19:59:28) [Clang 18.1.8 ]
__pyTorch VERSION: 2.5.1+cu124
__CUDA RUNTIME API VERSION
__CUDNN VERSION: 90100
_____nvidia-smi GPU details____
index, name, driver_version, memory.total [MiB], memory.used [MiB], memory.free [MiB]
0, NVIDIA L40S, 550.120, 46068 MiB, 3927 MiB, 41663 MiB
1, NVIDIA L40S, 550.120, 46068 MiB, 4 MiB, 45586 MiB
_____Device assignments____
Number CUDA Devices: 1
Current cuda device:  0  **May not correspond to nvidia-smi ID above, check visibility parameter
Device name:  NVIDIA L40S
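
The renumbering described in the notes above (visible GPUs are always re-indexed from 0) can be sketched in plain Python. `visible_device_map` is a hypothetical helper for illustration. It only handles comma-separated integer IDs, not GPU UUIDs, and it assumes CUDA_DEVICE_ORDER is set to PCI_BUS_ID so that the physical IDs match nvidia-smi:

```python
def visible_device_map(cuda_visible_devices: str) -> dict[int, int]:
    """Map logical CUDA indices (what PyTorch sees) to physical GPU IDs.

    Only comma-separated integer IDs are handled; GPU UUIDs are not.
    """
    ids = [part.strip() for part in cuda_visible_devices.split(",") if part.strip()]
    return {logical: int(physical) for logical, physical in enumerate(ids)}


print(visible_device_map("0"))    # one GPU visible
print(visible_device_map("1"))    # second GPU, still logical index 0
print(visible_device_map("1,0"))  # order in the variable sets the logical order
print(visible_device_map(""))     # no GPUs visible
```

This is why `torch.cuda.current_device()` reports 0 above regardless of which physical GPU was selected.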


### Assign single GPU to device variable

This command sets the DEVICE variable to "cuda" (the current CUDA device, which is GPU 0 here) if PyTorch can reach the GPU through CUDA.  Otherwise it falls back to "cpu".

In [7]:
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
print(DEVICE)

cuda


### View csv contents and structure

In [8]:
import pandas as pd
df = pd.read_csv('csv-files/healthcare.csv')

In [9]:
df_rows = df.head(3)
print(df_rows)

  Patient Name Date of Birth  Age Reason for Admission       Procedure  \
0    Patient_3      8/6/2004   19                COVID  antiviral meds   
1   Patient_39     3/19/2004   19                COVID  oxygen therapy   
2   Patient_33     12/8/2002   21                COVID  oxygen therapy   

       Room Date of Discharge  Length of Stay   Charges  Balance Remaining  
0  Room_237        12/27/2023               0    237.34              37.34  
1  Room_440         1/12/2024              22  29980.84            3018.84  
2  Room_298        12/31/2023               6   4028.77            2681.26  


### Import CSV agent dependencies

In [10]:
import transformers
from langchain_experimental.agents.agent_toolkits import create_csv_agent
from langchain.llms import HuggingFacePipeline
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig
)

In [11]:
# from langchain.globals import set_verbose, set_debug
# set_debug(True)
# set_verbose(True)

### Set model parameters

In [12]:
model = "mistralai/Mistral-7B-Instruct-v0.2"

In [13]:
tokenizer = transformers.AutoTokenizer.from_pretrained(model)

In [14]:
### https://huggingface.co/docs/transformers/en/main_classes/quantization

compute_dtype = getattr(torch, "float16")
bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=compute_dtype,
#        bnb_4bit_use_double_quant=True,
) # NOTE: quantization added to the original code as an option
quantize = False
if quantize:
    model = AutoModelForCausalLM.from_pretrained(
            model, 
            quantization_config=bnb_config, 
            device_map="auto"
    )
#Configure the pad token in the model
# model.config.pad_token_id = tokenizer.pad_token_id
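
For a rough sense of what the optional 4-bit path buys, the weight footprint can be estimated with simple arithmetic. This is a back-of-the-envelope sketch: the parameter count of about 7.24 billion is an assumption, `weight_memory_gib` is a hypothetical helper, and the estimate ignores quantization constants, activations, and the KV cache.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 1024**3


N_PARAMS = 7.24e9  # assumption: approximate parameter count of the 7B model

for label, bits in [("bfloat16", 16), ("4-bit NF4", 4)]:
    print(f"{label}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

On a 46 GiB L40S either variant fits, which is consistent with `quantize = False` being a workable default here. The 4-bit option matters more on smaller GPUs.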

In [None]:
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    max_new_tokens=500
)


In [None]:
local_llm = HuggingFacePipeline(pipeline=pipeline)

### Initiate CSV agent

In [None]:
agent = create_csv_agent(
    local_llm,
    "csv-files/healthcare.csv",
    verbose=True,
#    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)

### Generate responses

In [None]:
res = agent.invoke("What was the procedure for patient 3?")
print(res)

##### Compare to correct answer:  antiviral meds
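
Comparing each response to its expected answer by eye gets tedious, so a small check can help. The sketch below assumes the value returned by `agent.invoke` is either a plain string or a dict with an "output" key. `answer_contains` is a hypothetical helper, and the `example` dict is a stand-in, not real model output.

```python
def answer_contains(result, expected: str) -> bool:
    """Check whether the agent's answer mentions the expected text.

    Assumes `result` is a plain string or a dict with an "output" key.
    """
    text = result.get("output", "") if isinstance(result, dict) else str(result)
    return expected.casefold() in str(text).casefold()


# Stand-in for the value returned by agent.invoke(...); not real model output.
example = {
    "input": "What was the procedure for patient 3?",
    "output": "The procedure for Patient_3 was antiviral meds.",
}
print(answer_contains(example, "antiviral meds"))
```

A substring match is deliberately loose. It tolerates extra wording from the model but will not catch an answer that mentions the right term while drawing the wrong conclusion.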

In [None]:
res = agent.invoke("What was date of discharge and length of stay for patient 21?")
print(res)

##### Compare to correct answer:  11/15/2023 and the length of stay was 18 days

In [None]:
res = agent.invoke("What are the unique reasons for admission?")
print(res)

##### Compare to correct answer:  'COVID', 'Flu', 'Pneumonia', 'Heart attack', 'Stroke'

In [None]:
res = agent.invoke("Cross reference the charges with the reason for admission to determine which reason had the highest cost.")
print(res)

##### Correct answer:  highest cost was COVID, with a total cost of 29980.84

In [None]:
res = agent.invoke("What are the top three most costly procedures?")
print(res)

##### Correct answer:  most costly procedures are 'oxygen therapy', 'bypass surgery', and 'clot removal'
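
The reference answers in this section can be reproduced directly with pandas, without the LLM. The sketch below uses only the three rows printed by `df.head(3)` earlier, so its results cover that sample and not the full file. The helper names are hypothetical, and the choice of aggregation (`max` here, `sum` or `mean` as alternatives) is an assumption, since "most costly" can be read more than one way.

```python
import pandas as pd

# The three rows shown by df.head(3) earlier in the notebook (subset of columns).
sample = pd.DataFrame({
    "Patient Name": ["Patient_3", "Patient_39", "Patient_33"],
    "Reason for Admission": ["COVID", "COVID", "COVID"],
    "Procedure": ["antiviral meds", "oxygen therapy", "oxygen therapy"],
    "Charges": [237.34, 29980.84, 4028.77],
})


def procedure_for(df: pd.DataFrame, patient: str) -> str:
    """Look up the procedure recorded for one patient."""
    return df.loc[df["Patient Name"] == patient, "Procedure"].iloc[0]


def top_by_charges(df: pd.DataFrame, column: str, n: int = 3, agg: str = "max") -> pd.Series:
    """Rank the values of `column` by aggregated charges (agg is an assumption)."""
    return df.groupby(column)["Charges"].agg(agg).sort_values(ascending=False).head(n)


print(procedure_for(sample, "Patient_3"))
print(sample["Reason for Admission"].unique().tolist())
print(top_by_charges(sample, "Procedure"))
```

Passing the full `df` loaded from the CSV in place of `sample` would give reference values for the whole dataset to compare against the agent's responses.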

### Create visualizations of data using LLM and matplotlib

In [None]:
res = agent.invoke("Plot a histogram of the age distribution of the patients.")
print(res)

In [None]:
res = agent.invoke("Plot a pie chart of the number of procedures based on percentage.")
print(res)