In [1]:
import os
import json 

# Load the environment configuration JSON data
json_path = 'env_config.json'
with open(json_path, 'r') as file:
    env_config = json.load(file)

# Set the HF_HOME environment variable
os.environ['HF_HOME'] = env_config['HF_HOME']
# Set the access token to huggingface hub
access_token = env_config['access_token']

In [28]:
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration
import torch
from PIL import Image
import requests

processor = LlavaNextProcessor.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf")

model = LlavaNextForConditionalGeneration.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf", torch_dtype=torch.float16, low_cpu_mem_usage=True) 
model.to("cuda:0")

# prepare image and text prompt, using the appropriate prompt template
# url = "https://github.com/haotian-liu/LLaVA/blob/1a91fc274d7c35a9b50b3cb29c4247ae5837ce39/images/llava_v1_5_radar.jpg?raw=true"
# image = Image.open(requests.get(url, stream=True).raw)




Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00,  9.20it/s]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


[INST]  
What is shown in this image? [/INST] The image shows a medical consultation document from a hospital. It appears to be a record of a patient's visit, with various sections filled out by a healthcare professional. The document includes the patient's personal information, medical history, and notes from the consultation. The text is in Spanish, and the document includes a section for the patient's signature, which is typically used to confirm that the information provided is accurate. The document also includes a section for the healthcare provider's signature, which


In [31]:
image = Image.open("handwritten_prescription.jpg")
prompt = "[INST] <image>\nReformat this document[/INST]"

inputs = processor(prompt, image, return_tensors="pt").to("cuda:0")

# autoregressively complete prompt
output = model.generate(**inputs, max_new_tokens=512)

print(processor.decode(output[0], skip_special_tokens=True))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


[INST]  
Reformat this document[/INST] The image you've provided appears to be a medical consultation document from a hospital in Mexico. The document is written in Spanish and includes various sections with handwritten notes. Here's a translation of the visible text:

```
HOSPITAL INFANTIL DE MÉXICO
FEDERICO GÓMEZ

CONSULTA EXTERIORA
REG. NUM. 849138

PACIENTE:
Nombre:
Edad: 5 años
Sexo: Femenino

HISTORIA CLÍNICA:

- Peso: 11,5 kg
- Talla: 134 cm
- FC: 62
- T°: 35,6°

OBSERVACIONES:

- En la cabeza se observa una lesión en la frente derecha.
- En la mano derecha se observa una lesión en la palma.
- En la mano izquierda se observa una lesión en la palma.
- En la pierna derecha se observa una lesión en la pantorrilla.
- En la pierna izquierda se observa una lesión en la pantorrilla.
- En la espalda se observa una lesión en la espalda.
- En la espalda se observa una lesión en la espalda.
- En la espalda se observa una lesión en la espalda.
- En la espalda se observa una lesión en la esp

# How to prompt Llama 3
The base models have no prompt format. Like other base models, they can be used to continue an input sequence with a plausible continuation or for zero-shot/few-shot inference. They are also a great foundation for fine-tuning your own use cases. The Instruct versions use the following conversation structure:
```bash
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_msg_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ model_answer_1 }}<|eot_id|>

```

This format has to be exactly reproduced for effective use. We’ll later show how easy it is to reproduce the instruct prompt with the chat template available in transformers.

In [2]:
import transformers 
print(transformers.__version__)

from transformers import pipeline
import torch

from accelerate import Accelerator
from transformers import AutoTokenizer, AutoModelForCausalLM

accelerator = Accelerator()
device = accelerator.device

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

  from .autonotebook import tqdm as notebook_tqdm


4.41.0


Loading checkpoint shards: 100%|██████████| 4/4 [00:06<00:00,  1.58s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [25]:
prompt = r"""
This document appears to be a medical record from a hospital visit at "Hospital Infantil de México Federico Gómez." It includes patient details and notes from an oncology consultation. Here’s a detailed explanation:
Header Information
Hospital Name: Hospital Infantil de México Federico Gómez
Department: Oncología (Oncology)
Consultation Type: Consulta Externa (Outpatient Consultation)
Date: 15 Aug 2021
Record Number: 847388
Patient Information
Age: 18 years
Weight: 110.5 kg
Height: 174 cm
Heart Rate (FC): 66 bpm
Respiratory Rate (FR): (not clearly mentioned)
Blood Pressure (TA): 119/84 mmHg
Temperature: 35.6°C
Clinical Notes
The patient is known for their age.
Currently in maintenance (sem S6, likely week 6 of a treatment regimen).
Attended an asymptomatic review.
Physical examination showed no signs of active neoplastic (cancerous) or infectious diseases.
The proposed treatment includes maintaining the current plan and additional measures (likely related to a catheter, port, and chemotherapy drugs).
Treatment Plan
Current Medications and Dosages:
Methotrexate (MTX): 90 mg IM weekly
Purineethol (Mercaptopurine): 150 mg/m² daily
Methotrexate (MTX): 90 mg IM weekly
Ara-C (Cytarabine): 1000 mg intrathecally
VCR (Vincristine): 2 mg IV
L-asparaginase (L-ASP): 173,000 UI IM daily
Dexamethasone (DEXA): 1.8 mg/m²/day for 6 days weekly
Ondansetron: 8 mg for nausea prophylaxis
TMP/SMX (Trimethoprim/Sulfamethoxazole): Prophylactic dose
Additional Notes
The next important date for the patient (likely a significant treatment or appointment) is scheduled for August 9.
Other unspecified medical or supportive measures.
Prescribing Doctors
Signatures or names of doctors involved in the treatment (Dr. Rafael Ballon).
Instructions
Use a new sheet for further progress notes.
Number the evolution sheets for easier review.
"""
messages = [
    {"role": "system", "content": prompt2},
    {"role": "user", "content": "What is the phase of the current treatment?"},
]

chat_template = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_dict = tokenizer.encode_plus(chat_template)

input_ids = torch.tensor(input_dict["input_ids"]).unsqueeze(0).to(device)
attention_mask = torch.tensor(input_dict["attention_mask"]).unsqueeze(0).to(device)
# input_ids = tokenizer.apply_chat_template(
#     messages,
#     add_generation_prompt=True,
#     return_tensors="pt"
# ).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]



In [26]:
outputs = model.generate(
    input_ids,
    attention_mask=attention_mask,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
assistant_response = tokenizer.decode(outputs[0])
print(assistant_response)

Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


<|begin_of_text|><|begin_of_text|><|start_header_id|>system<|end_header_id|>

([[611, 137], [1817, 137], [1817, 257], [611, 257]], 'HOSPITAL INFANTIL DE MEXICO', 0.6590712628819553) ([[853, 252], [1578, 252], [1578, 353], [853, 353]], 'FEDERICO GOMEZ', 0.8388057948285695) ([[1034, 380], [1206, 380], [1206, 411], [1034, 411]], '[rmkab', 0.0216942246446568) ([[188, 413], [631, 413], [631, 461], [188, 461]], 'COHSULTA EXTERnA', 0.3550443526218151) ([[1231, 391], [2058, 391], [2058, 483], [1231, 483]], 'Consultdpe EVOLUCION NUM ', 0.5406709292370818) ([[51, 511], [265, 511], [265, 555], [51, 555]], 'NOMOKE1', 0.03334764095554496) ([[1066, 516], [1192, 516], [1192, 555], [1066, 555]], 'SUY,', 0.1393755401012198) ([[1220, 470], [1680, 470], [1680, 574], [1220, 574]], 'Oncología', 0.9659505273910377) ([[1902, 493], [2119, 493], [2119, 547], [1902, 547]], 'REC, NuM.', 0.24361094081855053) ([[2102, 440], [2435, 440], [2435, 550], [2102, 550]], '841138', 0.9136440389702557) ([[135, 608], [793, 6

# Modeling explanation for Llama 3

In [None]:
import torch.nn as nn 
from models import MLP

class SimilarityMeasure(nn.Module):
    def __init__(self, input_ids, attention_mask, embed_size=512):
        super(SimilarityMeasure, self).__init__()

        self.pred_map = MLP(pred_hidden_size, 128, embed_size, num_blocks=2, bottleneck_dim=64)
        self.explain_map = MLP(explain_hidden_size, 128, embed_size, num_blocks=2, bottleneck_dim=64)

        self.logit_scale = nn.Parameter(torch.tensor(1.0))
    
    def forward(self, pred_feature, explain_features):
        """
        Forward pass of the model.

        Args:
            q (torch.Tensor): Query tensor of shape [N, pred_hidden_size].
            k (torch.Tensor): Key tensor of shape [N, L, explain_hidden_size].

        Returns:
            torch.Tensor: Similarity tensor of shape [N, L].
        """
        pred_feature = F.normalize(self.pred_map(pred_feature), p=2, dim=-1).unsqueeze(1)  # [N, 1, embed_size]
        explain_features = F.normalize(self.explain_map(explain_features), p=2, dim=-1)  # [N, L, embed_size]


        logit_scale = self.logit_scale.exp()

        similarity = torch.matmul(explain_features, pred_feature.transpose(-1, -2)).squeeze(-1) * logit_scale  # [N, L]

        return similarity  # [N, L]


class MaskGeneratingModel(nn.Module):
    def __init__(self, hidden_size):
        """ 
        hidden_size: int
            The hidden size of the output of the generative model
        """
        super().__init__()

        self.hidden_size = hidden_size
        # self.similarity_measure = SimilarityMeasure()
        self.explain_map = MLP(input_dim=hidden_size, 
                               hidden_dim=128, 
                               output_dim=1, 
                               num_blocks=2, 
                               bottleneck_dim=64) # takes [N, L, hidden_size] outputs [N, L, 1]
        
        self.similarity = MLP(input_dim=hidden_size, 
                               hidden_dim=128, 
                               output_dim=1, 
                               num_blocks=2, 
                               bottleneck_dim=64) # takes [N, hidden_size] outputs [N, 1]

        self.bce_loss = nn.BCELoss(reduction='none')
    
    def forward(self, pred_features):
        """ 
        pred_features: torch.Tensor of shape [N, L, hidden_size]
        """
        mask_logits = self.explain_map(pred_features) # [N, L, 1]
        return mask_logits 
    
    def compute_similarity(self, masked_concat_features):
        """ 
        masked_concat_features: torch.Tensor of shape [N, hidden_size]
        """
        similarity_logit = self.similarity(masked_concat_features) # [N, 1]
        return similarity_logit



In [34]:
tokenizer.encode_plus("<mask>")


{'input_ids': [128000, 27, 11508, 29], 'attention_mask': [1, 1, 1, 1]}

In [3]:
import easyocr
reader = easyocr.Reader(['es','en']) # this needs to run only once to load the model into memory
result = reader.readtext('handwritten_prescription.jpg')

In [18]:
prompt2 = ' '.join([str(result[i]) for i in range(len(result))])

In [15]:
prompt2

'HOSPITAL INFANTIL DE MEXICO FEDERICO GOMEZ [rmkab COHSULTA EXTERnA Consultdpe EVOLUCION NUM  NOMOKE1 SUY , Oncología REC, NuM. 841138 S0Lloy, Oadenes Y TRATAMIENTO Nota $ DE Evolucion 15 Af1 2021 Edad 12a sL2 1 1poñ05(mocido +wx LLQ 5 Lla a2 20i edad "na1Y En mantnimient Sm 56 FC_ Lc Oudc @IcvSIY asinlomance FR 0 Sin datos de øcg9 TID neqplisìccninteccosa SC cpauent p 35.6 piopane cCEI 4 Lehuss ~e 53 60 Plcu (mhog muknimlen mIX 90m Ixmand? 1v snauas 52-6 lunetho 1SOmXZsy Poim @DT cl MIX m xA C acoslZ Gbc ACO) Judossgic cvclas c Glame Lr Lasp 132000 UI\' Du7 Da {E Dexg Lmnyøn Quauls Bauar X TYmana @Hdaussvon Mprno clapod Imp/mx popkchco Uuncose una nueye hole de evolurlon haele haborus cerclorudo que le unlerior 8ata Mumoronao les holes 0o Dvolucion pars tochner |e rovlclon dol arpodionio Paso 6f La Tew 4k puitu 90m9 na JU 2m @ajao 8m Ilonn'