<div align="center">
<p align="center" style="width: 100%;">
    <img src="https://raw.githubusercontent.com/vlm-run/.github/refs/heads/main/profile/assets/vlm-black.svg" alt="VLM Run Logo" width="80" style="margin-bottom: -5px; color: #2e3138; vertical-align: middle; padding-right: 5px;"><br>
</p>
<p align="center"><a href="https://docs.vlm.run"><b>Website</b></a> | <a href="https://docs.vlm.run/"><b>API Docs</b></a> | <a href="https://docs.vlm.run/blog"><b>Blog</b></a> | <a href="https://discord.gg/AMApC2UzVY"><b>Discord</b></a> | <a href="https://chat.vlm.run"><b>Chat</b></a>
</p>
</div>

# VLM Run Orion - Interior Design Ergonomics Audit

This notebook demonstrates how to prompt [VLM Run Orion](https://vlm.run/orion) to act as a Certified Professional Ergonomist (CPE). You supply a photo of a home office setup, and the model critiques the ergonomics and suggests improvements.

For more details on the API, see the [Agent API docs](https://docs.vlm.run/agents/introduction).

## Prerequisites

- Python 3.10+
- VLM Run API key (get one at [app.vlm.run](https://app.vlm.run))
- VLM Run Python Client with OpenAI extra `vlmrun[openai]`

## Setup

First, install the required packages and configure the environment.

In [1]:
# Install required packages
%pip install vlmrun[openai] --upgrade --quiet
%pip install cachetools pillow requests numpy --quiet

Note: you may need to restart the kernel to use updated packages.




In [2]:
import os
import getpass

VLMRUN_API_KEY = os.getenv("VLMRUN_API_KEY", None)
if VLMRUN_API_KEY is None:
    VLMRUN_API_KEY = getpass.getpass("Enter your VLM Run API key: ")

## Initialize the VLM Run Client

We use the OpenAI-compatible chat completions interface through the VLM Run SDK.

In [3]:
from vlmrun.client import VLMRun

BASE_URL = os.getenv("VLMRUN_BASE_URL", "https://agent.vlm.run/v1")
client = VLMRun(api_key=VLMRUN_API_KEY, base_url=BASE_URL)
print("VLM Run client initialized successfully!")
print(f"Base URL: {BASE_URL}")

VLM Run client initialized successfully!
Base URL: https://agent.vlm.run/v1


## Response Models

We define Pydantic models for structured outputs. The report will be returned as a structured object containing specific sections like Visual Ergonomic Assessment, Risk Identification, and Actionable Recommendations.

In [4]:
from typing import List
from pydantic import BaseModel, Field


class ErgonomicAssessment(BaseModel):
    """Detailed evaluation of key ergonomic elements."""
    monitor_placement: str = Field(..., description="Assessment of height, tilt, and distance relative to the user")
    seating_mechanics: str = Field(..., description="Critique of lumbar support, seat pan depth, and backrest angle")
    input_device_alignment: str = Field(..., description="Height and position of keyboard/mouse relative to armrests")
    lighting_and_glare: str = Field(..., description="Identification of lighting sources and potential glare zones")


class Recommendations(BaseModel):
    """Actionable adjustments to optimize the setup."""
    immediate_fixes: List[str] = Field(..., description="Adjustments that can be made instantly")
    behavioral_adjustments: List[str] = Field(..., description="Changes in user posture or habits")
    equipment_suggestions: List[str] = Field(..., description="New gear or furniture to consider")


class ErgonomicsAudit(BaseModel):
    """Structured report for an ergonomic workspace audit."""
    visual_assessment: ErgonomicAssessment = Field(..., description="Visual evaluation of the workspace components")
    risk_identification: List[str] = Field(..., description="Specific areas that may lead to symptoms like MSDs, eye strain, etc.")
    recommendations: Recommendations = Field(..., description="Categorized list of improvements")


print("Response models defined successfully!")

Response models defined successfully!
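
The models above also act as validators: a response that omits a required field is rejected rather than silently accepted. The following is a minimal sketch of that behavior, assuming Pydantic v2. `MiniAudit` is a hypothetical, trimmed-down stand-in for `ErgonomicsAudit`, and both JSON strings are illustrative placeholders rather than real model output.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class MiniAudit(BaseModel):
    """Hypothetical, trimmed-down stand-in for ErgonomicsAudit."""
    monitor_placement: str = Field(..., description="Assessment of monitor height and distance")
    risk_identification: List[str] = Field(..., description="Risks that may lead to MSDs")


# A well-formed payload parses into a typed object.
good_json = '{"monitor_placement": "Top of screen at eye level", "risk_identification": ["Glare from window"]}'
audit = MiniAudit.model_validate_json(good_json)
print(audit.risk_identification[0])

# A payload missing a required field raises ValidationError
# instead of returning a partially filled object.
bad_json = '{"monitor_placement": "Too low"}'
try:
    MiniAudit.model_validate_json(bad_json)
    missing_fields = []
except ValidationError as exc:
    # Each error entry carries the location of the offending field.
    missing_fields = [err["loc"][0] for err in exc.errors()]
print(missing_fields)
```

Catching `ValidationError` around the parsing step is one way to decide whether to retry a request or surface the raw text to the user.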


## Helper Functions

We create helper functions to simplify making chat completion requests with structured outputs.

In [5]:
import hashlib
import json
from typing import Any, Type, TypeVar

import cachetools
from vlmrun.common.image import encode_image
from PIL import Image


T = TypeVar('T', bound=BaseModel)


def custom_key(prompt: str, image_path: str | None = None, response_model: Type[T] | None = None, model: str = "vlmrun-orion-1:auto"):
    """Custom key for caching chat_completion."""
    response_key = hashlib.sha256(json.dumps(response_model.model_json_schema(), sort_keys=True).encode()).hexdigest() if response_model else ""
    image_key = hashlib.sha256(image_path.encode()).hexdigest() if image_path else ""
    return (prompt, image_key, response_key, model)


@cachetools.cached(cache=cachetools.TTLCache(maxsize=100, ttl=3600), key=custom_key)
def chat_completion(
    prompt: str,
    image_path: str | None = None,
    response_model: Type[T] | None = None,
    model: str = "vlmrun-orion-1:auto"
) -> tuple[BaseModel | str, str]:
    """
    Make a chat completion request with structured output for Ergonomics Audit.

    Args:
        prompt: The prompt describing the audit task
        image_path: Path to the image file (Home office setup)
        response_model: Pydantic model for structured output
        model: Model to use (default: vlmrun-orion-1:auto)

    Returns:
        Tuple of (parsed response model or text, session_id)
    """
    content = [{"type": "text", "text": prompt}]
    
    # Add image if provided
    if image_path:
        image = Image.open(image_path)
        image_data = encode_image(image, format="JPEG")
        content.append({"type": "image_url", "image_url": {"url": image_data}})

    kwargs = {
        "model": model,
        "messages": [{"role": "user", "content": content}]
    }

    if response_model:
        kwargs["response_format"] = {
            "type": "json_schema",
            "schema": response_model.model_json_schema()
        }

    response = client.agent.completions.create(**kwargs)
    response_text = response.choices[0].message.content
    session_id = response.session_id

    if response_model:
        result = response_model.model_validate_json(response_text)
        return result, session_id

    return response_text, session_id

print("Helper functions defined!")

Helper functions defined!
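
One property of `custom_key` worth noting: it hashes the image path string, not the image bytes. If a file is replaced on disk under the same name within the one-hour TTL, the cached result for the old image would be returned. The sketch below shows one possible alternative that keys on file content. `file_digest` and `content_key` are hypothetical names, and the response-schema component of the original key is omitted for brevity.

```python
import hashlib
import os
import tempfile
from typing import Optional


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file's bytes (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large images are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_key(prompt: str, image_path: Optional[str] = None, model: str = "vlmrun-orion-1:auto"):
    """Cache key that depends on the image bytes rather than the path string."""
    image_key = file_digest(image_path) if image_path else ""
    return (prompt, image_key, model)


# Demonstration with a temporary file standing in for an image.
with tempfile.NamedTemporaryFile(delete=False, suffix=".jpeg") as tmp:
    tmp.write(b"first version")
    path = tmp.name

key_before = content_key("audit", path)
with open(path, "wb") as fh:
    fh.write(b"second version")  # same path, different content
key_after = content_key("audit", path)
os.remove(path)

print(key_before != key_after)
```

The trade-off is an extra file read on every call, which is usually small compared with the cost of a remote request.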


## Ergonomics Audit

Upload a photo of a home office setup and ask the VLM to critique the ergonomics.
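
The `chat_completion` helper sends the photo inline as an `image_url` content part. In OpenAI-style chat APIs, an inline image is commonly passed as a base64 data URL. Whether `encode_image` produces exactly this form is an assumption here. The sketch below builds an equivalent message using only the standard library; `to_data_url` and `build_user_message` are hypothetical helpers, and the bytes are a placeholder rather than a real JPEG.

```python
import base64
from typing import Optional


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_user_message(prompt: str, image_bytes: Optional[bytes] = None) -> dict:
    """Build a single user message with a text part and an optional image part."""
    content = [{"type": "text", "text": prompt}]
    if image_bytes is not None:
        content.append({"type": "image_url", "image_url": {"url": to_data_url(image_bytes)}})
    return {"role": "user", "content": content}


# Placeholder bytes stand in for a real file read with open(path, "rb").
placeholder = b"\xff\xd8\xff\xe0 placeholder"
message = build_user_message("Critique this desk setup.", placeholder)
url = message["content"][1]["image_url"]["url"]
print(url[:23])
```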

In [6]:
# Prompt for Ergonomics Assessment
ERGONOMICS_PROMPT = """
Task: Act as a Certified Professional Ergonomist (CPE). Analyze the attached photo of a home office setup and provide a detailed critique of the workspace layout relative to human factors and physical health.

Instructions:

Visual Ergonomic Assessment: Evaluate the following key elements visible in the image:

Monitor Placement: Assess height relative to eye level, tilt, and estimated distance from the user's seated position. Check for potential neck strain or "tech neck" indicators.

Seating Mechanics: Critique the chair's lumbar support, seat pan depth, and the angle of the backrest.

Input Device Alignment: Observe the height and position of the keyboard and mouse relative to the armrests and desk surface. Look for potential wrist extension or shoulder shrugging.

Lighting and Glare: Identify potential sources of eye strain, such as unshielded windows or poorly placed lamps reflecting off screens.

Risk Identification: Highlight specific areas that may lead to Musculoskeletal Disorders (MSDs), such as carpal tunnel syndrome, lower back pain, or cervical tension.

Actionable Recommendations: Provide a numbered list of adjustments to optimize the setup. Categorize these into:

Immediate Fixes: (e.g., "Raise monitor by 3 inches using a riser.")

Behavioral Adjustments: (e.g., "Ensure feet are flat on the floor.")

Equipment Suggestions: (e.g., "Consider an external keyboard to allow for better screen positioning.")
"""

print("Ergonomics prompt prepared!")
print(f"\nPrompt length: {len(ERGONOMICS_PROMPT)} characters")

Ergonomics prompt prepared!

Prompt length: 1441 characters


In [None]:
# Note: Place 'home_office.jpeg' in the working directory, or update the path below.
image_path = "home_office.jpeg"

# Verify if image exists to avoid errors in this example execution
if os.path.exists(image_path):
    result, session_id = chat_completion(
        prompt=ERGONOMICS_PROMPT,
        image_path=image_path,
        response_model=ErgonomicsAudit,
        model="vlmrun-orion-1:auto"
    )

    print(">> RESPONSE")
    print(result)
    print(f"\n>> SESSION ID: {session_id}")

    print("\n>> ERGONOMICS AUDIT REPORT")
    print("=" * 80)
    print("Visual Assessment:")
    print(f"- Monitor: {result.visual_assessment.monitor_placement}")
    print(f"- Seating: {result.visual_assessment.seating_mechanics}")
    print(f"- Inputs: {result.visual_assessment.input_device_alignment}")
    print(f"- Lighting: {result.visual_assessment.lighting_and_glare}\n")
    print("Risk Identification:")
    for risk in result.risk_identification:
        print(f"!! {risk}")
    print("\nRecommendations:")
    print("Immediate Fixes:")
    for fix in result.recommendations.immediate_fixes:
        print(f"- {fix}")
    print("\nBehavioral:")
    for beh in result.recommendations.behavioral_adjustments:
        print(f"- {beh}")
    print("\nEquipment:")
    for eq in result.recommendations.equipment_suggestions:
        print(f"- {eq}")
    print("=" * 80)
else:
    print(f"Image not found at {image_path}. Please add a home office image to run this example.")

>> RESPONSE
visual_assessment=ErgonomicAssessment(monitor_placement='The main monitor is well-positioned on a riser, likely keeping the top of the screen at eye level. However, the laptop screen is significantly lower, necessitating harmful downward neck flexion.', seating_mechanics='The chair provides excellent lumbar support and a contoured mesh backrest. The seat pan depth and slightly reclined backrest angle are conducive to healthy circulation and spinal alignment.', input_device_alignment='Keyboard and mouse are placed directly on the desk surface. There is a high likelihood of wrist extension because the chair armrests appear low, potentially leaving the forearms unsupported during use.', lighting_and_glare='The large window to the right is a primary source of glare. While the blinds offer some control, unshielded light reflecting off the screen can lead to significant eye fatigue.') risk_identification=["Neck Strain ('Tech Neck'): High risk if the laptop screen is used frequent
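
The default Pydantic representation printed under `>> RESPONSE` is hard to read or share. One option is to convert the result to a plain dictionary (for example with `result.model_dump()` in Pydantic v2) and render it as Markdown. The sketch below assumes a dictionary with the same three top-level keys as `ErgonomicsAudit`. `audit_to_markdown` is a hypothetical helper, and `sample` holds illustrative placeholder text, not actual model output.

```python
def audit_to_markdown(audit: dict) -> str:
    """Render an audit dictionary as a Markdown report (hypothetical helper)."""
    lines = ["## Visual Assessment"]
    for name, text in audit["visual_assessment"].items():
        lines.append(f"- **{name.replace('_', ' ').title()}**: {text}")
    lines.append("")
    lines.append("## Risk Identification")
    lines.extend(f"- {risk}" for risk in audit["risk_identification"])
    # One numbered section per recommendation category.
    for name, items in audit["recommendations"].items():
        lines.append("")
        lines.append(f"## {name.replace('_', ' ').title()}")
        lines.extend(f"{i}. {item}" for i, item in enumerate(items, start=1))
    return "\n".join(lines)


# Illustrative placeholder data, not actual model output.
sample = {
    "visual_assessment": {
        "monitor_placement": "Screen sits below eye level",
        "seating_mechanics": "Chair lacks lumbar support",
    },
    "risk_identification": ["Neck strain from downward gaze"],
    "recommendations": {
        "immediate_fixes": ["Raise the monitor on a riser"],
        "behavioral_adjustments": ["Keep feet flat on the floor"],
        "equipment_suggestions": ["Consider an external keyboard"],
    },
}

report = audit_to_markdown(sample)
print(report)
```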

---

## Conclusion

This notebook demonstrated how to use **VLM Run Orion** to perform an ergonomics audit of a workspace.

### Key Takeaways

1. **Human Factors Analysis**: The model can evaluate the physical relationship between a user and their environment using only visual cues.
2. **Preventative Healthcare**: By identifying risk factors for MSDs, this tool can serve as a first-line assessment for occupational health.
3. **Structured Advice**: Grouping recommendations into "Immediate," "Behavioral," and "Equipment" makes the advice practical and easy to implement.

### Next Steps

- Use the model to assess standing desk setups.
- Integrate with a chatbot to provide interactive ergonomic coaching.
- Explore the [VLM Run Documentation](https://docs.vlm.run) for more capabilities.

Stay healthy and comfortable! 🪑