docs: Course section — Image Input and Vision Models #15

@rdwj

Description

Summary

Teach users how to send images to agents and configure vision model support. Vision capabilities enable agents to understand screenshots, diagrams, photos, and other visual content alongside text.

Course Section Outline

  • Content block format — mixing text and image_url blocks in messages
  • Configuring vision model endpoints in agent.yaml
  • Deploying Granite Vision 3.2-2B on vLLM with the correct launch flags
  • Sending images via the API — base64 encoding and file_id references from the upload endpoint
  • UI integration for image paste, drag-and-drop, and file picker upload
  • Model capability considerations — not every model supports vision; detect missing support and degrade gracefully
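The content-block format in the first bullet can be sketched as below. This assumes the common OpenAI-style chat-completions schema with `text` and `image_url` block types; the exact field names the gateway accepts may differ, and `build_vision_message` is a hypothetical helper for illustration, not part of any template.

```python
import base64

def build_vision_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message mixing a text block with a base64 image_url block."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # When the upload endpoint / file_id path is not used, the image
            # travels inline as a data URL.
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }

msg = build_vision_message("Describe this screenshot.", b"\x89PNG\r\n\x1a\n")
print([block["type"] for block in msg["content"]])  # → ['text', 'image_url']
```

The same message shape works whether the image came from a file picker, a paste event, or a drag-and-drop handler — the UI only needs to produce bytes plus a MIME type.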

Lab Exercise

Deploy a vision model on vLLM, create an agent configured to use it, and test image understanding through several scenarios: describe a photograph, extract text from a screenshot, and interpret a simple diagram. Verify that non-vision requests still route correctly.
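A minimal sketch of the "non-vision requests still route correctly" check, under the same assumed OpenAI-style content-block message shape described in the outline: if the configured model lacks vision support, image blocks are dropped and only the text parts are forwarded. The `supports_vision` flag stands in for whatever capability field the agent configuration exposes and is an assumption, not a real template field.

```python
def strip_image_blocks(message: dict, supports_vision: bool) -> dict:
    """Drop image_url blocks when the target model cannot handle them.

    Assumes `content` is either a plain string or a list of typed blocks.
    """
    content = message.get("content")
    if supports_vision or isinstance(content, str):
        return message  # nothing to strip
    text = " ".join(b["text"] for b in content if b.get("type") == "text")
    return {**message, "content": text}

vision_msg = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    ],
}
print(strip_image_blocks(vision_msg, supports_vision=False)["content"])
# → What is in this image?
```

Rejecting the request with a clear error instead of silently dropping the image is an equally valid design; the lab should verify whichever behavior the agent documents.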

Companion Issues

Companion issues filed on fips-agents/agent-template, fips-agents/gateway-template, fips-agents/ui-template, and fips-agents/fips-agents-cli.

Size

S

Metadata

Assignees

No one assigned

Labels

documentation — Improvements or additions to documentation
