### <font color='purple'>__Process Images with bakllava__</font>

#### <font color='purple'>Run bakllava model with transformers</font>  
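The following sample code loads the `llava-hf/bakLlava-v1-hf` checkpoint with Hugging Face `transformers`, passes it an image together with a text prompt, and prints the generated reply: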

In [None]:
import requests
from PIL import Image

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

# https://huggingface.co/llava-hf/bakLlava-v1-hf
llm_model = "llava-hf/bakLlava-v1-hf"
llm_dir = "/kellogg/data/llm_models_opensource/bakLlava"

prompt = "USER: <image>\nYou are a new mom aged between 25 and 30 taking a marketing survey. Can you describe in details how the diaper packaging appeal to you?\nASSISTANT:"
image_file = "https://i5.walmartimages.com/seo/Baby-Diapers-Newborn-Size-0-10-lb-120-Count-Pampers-Swaddlers-ONEMONTH-SUPPLY-Packaging-May-Vary_95ea2d1c-769a-4ec2-8da8-7e6ea4829e55.3c890f8095b5b30ae263c996151f9575.jpeg"

model = LlavaForConditionalGeneration.from_pretrained(
    llm_model, 
    torch_dtype=torch.float16, 
    low_cpu_mem_usage=True, 
    cache_dir=llm_dir,
).to(0)

processor = AutoProcessor.from_pretrained(llm_model)

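# Download the image and open it with PIL
raw_image = Image.open(requests.get(image_file, stream=True).raw)
# Tokenize the prompt, preprocess the image, and move both to GPU 0 in half precision
inputs = processor(prompt, raw_image, return_tensors='pt').to(0, torch.float16)

# Greedy decoding (do_sample=False), capped at 400 newly generated tokens
output = model.generate(**inputs, max_new_tokens=400, do_sample=False)
# Drop the first two tokens, then decode; the echoed prompt is part of the printed text
print(processor.decode(output[0][2:], skip_special_tokens=True))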


Output:   
```
Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]
Loading checkpoint shards:  25%|██▌       | 1/4 [00:00<00:01,  1.61it/s]
Loading checkpoint shards:  75%|███████▌  | 3/4 [00:00<00:00,  4.81it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00,  5.02it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
ER:  
You are a new mom aged between 25 and 30 taking a marketing survey. Can you describe in details how the diaper packaging appeal to you?
ASSISTANT: The diaper packaging appeals to me as it features a heartwarming image of a parent holding a newborn baby, which evokes feelings of love, care, and the joy of parenthood. The baby is shown resting comfortably in a padded swaddler, which provides a sense of security and warmth for the infant. The swaddler also helps soothe colicky babies, ensuring better sleep for both the baby and the parents. The image on the packaging connects with new parents like me, as it represents the tender moments we experience with our newborns, and it makes me feel that the diaper brand understands the importance of bonding and comfort during this special time in our lives.
```
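The printed text above starts with `ER:` and repeats the prompt, apparently because the example drops the first two tokens and then decodes everything that remains, prompt included. The following is a minimal sketch of two helpers, one that builds the `USER: <image>\n...\nASSISTANT:` prompt used in the example and one that keeps only the text after the last `ASSISTANT:` marker. The names `build_llava_prompt` and `extract_answer` are hypothetical and are not part of `transformers`.

```python
def build_llava_prompt(question: str) -> str:
    """Wrap a question in the USER/ASSISTANT template used in the example above."""
    return f"USER: <image>\n{question}\nASSISTANT:"


def extract_answer(decoded: str) -> str:
    """Return only the text that follows the last 'ASSISTANT:' marker."""
    marker = "ASSISTANT:"
    _, found, answer = decoded.rpartition(marker)
    # If the marker is missing, fall back to the whole decoded string.
    return answer.strip() if found else decoded.strip()


prompt = build_llava_prompt("Describe the packaging in this image.")

# Stand-in for the string returned by processor.decode(...) in the example above.
decoded = "ER:  \nDescribe the packaging in this image.\nASSISTANT: A purple box."
print(extract_answer(decoded))
```

In the script above, `extract_answer` would be applied to the string returned by `processor.decode(...)` before printing.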


Here is a sample SLURM script for running the above Python code, assuming it has been saved as `test_bakllava.py`.

In [None]:
#!/bin/bash

#SBATCH -A your_quest_allocation_account
#SBATCH -p gengpu
#SBATCH --gres=gpu:a100:1
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 0:30:00
#SBATCH --mem=40G

module purge
module load mamba/23.1.0
source /hpc/software/mamba/23.1.0/etc/profile.d/conda.sh
source activate /kellogg/software/envs/gpu-llama2

python test_bakllava.py

#### <font color='purple'>Run bakllava model with llama_cpp_python</font>  
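Here is some sample code that loads a quantized GGUF build of the bakllava model through `LLaVACPPModel` from `llm_core` and asks it to describe an image: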

In [None]:
from llm_core.llm import LLaVACPPModel

model = "BakLLaVA-1-Q4_K_M.gguf"

llm = LLaVACPPModel(
    name=model,
    llama_cpp_kwargs={
        "logits_all": True,
        "n_ctx": 8000,
        "verbose": False,
        "n_gpu_layers": 1,  #: Set to 0 if you don't have a GPU
        "n_threads": 8,       #: Set to the number of available CPU cores
        "clip_model_path": "BakLLaVA-1-clip-model.gguf"
    }
)

llm.load_model()

history = [
    {
        'role': 'user',
        'content': [
            {'type': 'image_url', 'image_url': 'https://advanced-stack.com/assets/img/mappemonde.jpg'}
        ]
    }
]

response = llm.ask('Describe the image as accurately as possible', history=history)

print(response.choices[0].message.content)
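The comment on `n_threads` says to set it to the number of available CPU cores. Inside a SLURM job, that is the number of cores allocated to the job, which can be smaller than the node's total. A minimal sketch of one way to derive the value is shown below. It assumes the standard SLURM environment variables `SLURM_CPUS_PER_TASK` (set when `--cpus-per-task` is requested) and `SLURM_CPUS_ON_NODE`; the helper name `pick_n_threads` is hypothetical.

```python
import os


def pick_n_threads(env=None) -> int:
    """Guess a thread count from SLURM variables, falling back to os.cpu_count()."""
    env = os.environ if env is None else env
    for name in ("SLURM_CPUS_PER_TASK", "SLURM_CPUS_ON_NODE"):
        value = env.get(name, "")
        # Ignore unset or non-numeric values and try the next variable.
        if value.isdigit() and int(value) > 0:
            return int(value)
    # Outside SLURM (or if the variables are unusable), use the visible core count.
    return os.cpu_count() or 1


# Only the thread count is derived here; the other settings stay as in the example above.
llama_cpp_kwargs = {
    "n_ctx": 8000,
    "n_threads": pick_n_threads(),
}
print(llama_cpp_kwargs["n_threads"])
```

The resulting value can replace the hard-coded `8` in the `llama_cpp_kwargs` dictionary passed to `LLaVACPPModel`.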

Please save the bakllava model files (`BakLLaVA-1-Q4_K_M.gguf` and `BakLLaVA-1-clip-model.gguf`) and the code above, as `image_test.py`, to your home directory. You can then run it with the following SLURM script.

In [None]:
#!/bin/bash

#SBATCH -A your_quest_allocation_account
#SBATCH -p gengpu
#SBATCH --gres=gpu:a100:1
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 0:30:00
#SBATCH --mem=40G

module purge
module use modulefiles
module load llama_cpp/2.38
python3 image_test.py
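
Before submitting the job, it can help to confirm that the model files and the script are where the job expects them. The sketch below checks for the three file names used in this section. It assumes they sit directly in the home directory, as the instructions above suggest; the helper name `missing_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the example above; the directory is an assumption.
REQUIRED_FILES = [
    "BakLLaVA-1-Q4_K_M.gguf",
    "BakLLaVA-1-clip-model.gguf",
    "image_test.py",
]


def missing_files(directory, names=REQUIRED_FILES):
    """Return the names that do not exist as regular files in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


missing = missing_files(Path.home())
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All files found.")
```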
