## LlamaStack Vision API


## Setup

In [1]:
# Imports
import asyncio
import base64
import mimetypes
import os

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.inference.event_logger import EventLogger
from llama_stack_client.types import UserMessage
from termcolor import cprint

In [2]:
# Select model
model = 'sambanova/Llama-3.2-11B-Vision-Instruct'
# Select image path
image_path = '../../../images/SambaNova-dark-logo-1.png'

In [3]:
# Initialize client
client = LlamaStackClient(
    base_url=f"http://localhost:{os.environ['LLAMA_STACK_PORT']}",
)
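
The client cell reads the port with `os.environ[...]`, which raises `KeyError` when `LLAMA_STACK_PORT` is unset. A minimal sketch of a more forgiving lookup follows. The helper name `build_base_url` and the fallback port `8321` are assumptions made for illustration and are not part of the notebook; adjust the fallback to match your server.

```python
import os


def build_base_url(default_port: str = "8321") -> str:
    """Return the local server URL, reading the port from LLAMA_STACK_PORT."""
    # Fall back to default_port (an assumed value) when the variable is unset.
    port = os.environ.get("LLAMA_STACK_PORT", default_port)
    if not port.isdigit():
        raise ValueError(f"LLAMA_STACK_PORT must be numeric, got {port!r}")
    return f"http://localhost:{port}"
```

It could then be passed to the client as `LlamaStackClient(base_url=build_base_url())`.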

## Helper functions
A utility function that encodes a local image file as a base64 data URL so it can be embedded directly in a chat message.

In [4]:
def encode_image_to_data_url(file_path: str) -> str:
    """
    Encode an image file to a data URL.

    Args:
        file_path: Path to the image file

    Returns:
        Data URL string
    """
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None:
        raise ValueError("Could not determine MIME type of the file")

    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode("utf-8")

    return f"data:{mime_type};base64,{encoded_string}"
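
To check the encoder without a real image or a running server, the sketch below round-trips a few placeholder bytes through a temporary `.png` file. `encode_image_to_data_url` is repeated from the cell above so the block runs on its own; `decode_data_url` is a hypothetical helper added only for this check, and the payload is not a valid image.

```python
import base64
import mimetypes
import os
import tempfile


def encode_image_to_data_url(file_path: str) -> str:
    """Encode an image file to a data URL (same logic as the notebook cell)."""
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None:
        raise ValueError("Could not determine MIME type of the file")
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded_string}"


def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Hypothetical helper: split a base64 data URL into (mime_type, raw_bytes)."""
    header, encoded = data_url.split(",", 1)
    mime_type = header.removeprefix("data:").removesuffix(";base64")
    return mime_type, base64.b64decode(encoded)


# Placeholder bytes written to a temporary file with a .png suffix, so that
# mimetypes can infer the type from the extension.
payload = b"\x89PNG\r\n\x1a\n placeholder bytes"
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    data_url = encode_image_to_data_url(tmp_path)
    mime_type, raw = decode_data_url(data_url)
finally:
    os.remove(tmp_path)

print(mime_type, raw == payload)
```

Note that the MIME type is guessed from the file extension alone, so a file without a recognized extension triggers the `ValueError` branch regardless of its contents.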


## Chat with the image
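The function below sends the encoded image and a text question to the model through the Llama Stack inference API (`client.inference.chat_completion`) and prints the reply, either streamed chunk by chunk or all at once.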


In [5]:
def process_image_chat(client: LlamaStackClient, image_path: str, stream: bool = True):
    """
    Process an image through the LlamaStack Chat API.

    Args:
        client: Initialized client.
        image_path: Path to the image file.
        stream: Whether to stream the response.
    """
    data_url = encode_image_to_data_url(image_path)

    messages=[
        {
            'role': 'user', 'content': {'type': 'image', 'image': {'url': {'uri': data_url}}}
        },
        {
            'role': 'user',
            'content': 'What does this image represent?',
        },
    ]
    
    cprint("User> Sending image for analysis...", "green")
    response = client.inference.chat_completion(
        messages=messages,
        model_id=model,
        stream=stream,
    )

    if stream:
        # Print each text delta exactly once as it arrives.
        for chunk in response:
            if chunk.event is not None:
                print(chunk.event.delta.text, end='', flush=True)
        print()  # Finish the streamed output with a newline.
    else:
        print(
            f'Type: {type(response.completion_message.content)}, '
            f'Value: {response.completion_message.content}'
        )
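
The streaming loop can be exercised offline with stand-in chunk objects, which makes it easy to confirm that each delta is consumed once. This is a sketch only: the chunk shape (`chunk.event.delta.text`) is taken from the function above, while `collect_stream_text` and `fake_chunk` are hypothetical names that do not exist in `llama_stack_client`.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream_text(chunks: Iterable) -> str:
    """Concatenate the text deltas of streamed chunks, skipping empty events."""
    parts = []
    for chunk in chunks:
        if chunk.event is not None:
            parts.append(chunk.event.delta.text)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk; pass None to simulate a chunk without an event."""
    if text is None:
        return SimpleNamespace(event=None)
    return SimpleNamespace(event=SimpleNamespace(delta=SimpleNamespace(text=text)))


# Three simulated chunks, one of which carries no event.
stream = [fake_chunk("The image "), fake_chunk(None), fake_chunk("shows a logo.")]
reply = collect_stream_text(stream)
print(reply)
```

Returning the joined string instead of printing inside the loop also lets a caller reuse the reply, for example to log it or pass it to a follow-up request.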

In [6]:
# Chat with the image
process_image_chat(client=client, image_path=image_path, stream=True)

User> Sending image for analysis...

The image represents the logo of SambaNova Systems, a company that specializes in artificial intelligence (AI) and machine learning (ML) solutions. The logo features the company name "SambaNova Systems" in large, bold letters, with the word "Systems" in smaller text underneath. The logo also includes a stylized letter "S" made up of curved lines, which is likely meant to represent the company's name and brand identity.

SambaNova Systems is a technology company that was