Problem

When using the `claude` CLI with image selection, vision preprocessing is skipped even though `vision.enabled: true` is set in config.yaml. Images are passed directly to the Cursor API without OCR/vision processing, because:

- The image handling path in `src/openai-handler.ts` doesn't detect that CLI requests contain images
- No vision mode logic executes before sending to Cursor
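For reference, a minimal sketch of the relevant config.yaml section. Only the `vision.enabled` and `vision.mode` keys are named in this issue; any surrounding structure is assumed:

```yaml
# Hypothetical config.yaml fragment; key names beyond vision.enabled
# and vision.mode are illustrative, not confirmed by the issue.
vision:
  enabled: true   # set, but currently ignored for Anthropic-format requests
  mode: ocr       # or an external vision API mode
```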
Root Cause

The Anthropic Messages API flow (used by the `claude` CLI) sends images in the `content` array as `ImageBlockParam` objects. The current vision preprocessing in `converter.ts` only processes OpenAI-style image objects (with `url` or `base64` fields in specific locations), not Anthropic-style image blocks.

`src/index.ts` routes `/v1/messages` requests directly to the converter without checking for image content first. The vision check should happen before protocol conversion, but it currently happens only in `openai-handler.ts` (post-conversion).
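To illustrate the mismatch: the two protocols tag image parts differently, so a check written against the OpenAI shape never fires for Anthropic blocks. The field names below follow the public OpenAI and Anthropic API formats; the local type names and helper functions are illustrative, not the actual converter.ts code.

```typescript
// OpenAI-style image part vs. Anthropic-style image block.
type OpenAIImagePart = { type: "image_url"; image_url: { url: string } };
type AnthropicImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};

// A detector that only knows the OpenAI shape (as the current
// preprocessing reportedly does) misses Anthropic blocks entirely.
function hasOpenAIImage(parts: Array<{ type: string }>): boolean {
  return parts.some((p) => p.type === "image_url");
}

// The fix requires also matching type === "image".
function hasAnyImage(parts: Array<{ type: string }>): boolean {
  return parts.some((p) => p.type === "image_url" || p.type === "image");
}

const anthropicContent: AnthropicImageBlock[] = [
  { type: "image", source: { type: "base64", media_type: "image/png", data: "..." } },
];

console.log(hasOpenAIImage(anthropicContent)); // false — image silently skipped
console.log(hasAnyImage(anthropicContent));    // true
```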
Expected Behavior

When Claude CLI sends a request with:

```json
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "..."}},
        {"type": "text", "text": "analyze this image"}
      ]
    }
  ]
}
```

the system should:

- Detect image blocks in `messages[].content[]`
- Process them via OCR or an external vision API (per the `vision.mode` config)
- Replace image blocks with a text description in the prompt
- Send a text-only request to Cursor, injecting the vision results into the system prompt
Why This Matters
The vision feature (v2.3.0) is only functional for OpenAI clients (ChatBox, LobeChat) but broken for the primary use case: Claude CLI integration with Claude Code. This defeats the purpose of image support in a Claude-focused proxy.
Solution Scope
Add a `preprocessImages()` function in `converter.ts` that:

- Detects `ImageBlockParam` objects in the Anthropic message format
- Extracts and processes images before `cursor-client.ts` makes the API call
- Handles both OCR and external vision API modes
- Returns modified messages with vision results injected
Call this in the Anthropic message handler before converting to Cursor format.
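The proposed `preprocessImages()` could be sketched as below. This is a minimal sketch, not the actual implementation: the `describeImage` callback stands in for the real OCR / external vision API call, and all type and parameter names other than `preprocessImages` and the wire-format fields are hypothetical.

```typescript
// Hypothetical local types mirroring the Anthropic content-block shapes.
type TextBlock = { type: "text"; text: string };
type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};
type Block = TextBlock | ImageBlock;
type Message = { role: "user" | "assistant"; content: string | Block[] };

// Replace every image block with a text block holding its description,
// so the downstream request to Cursor is text-only. describeImage is a
// stand-in for the OCR / external vision call selected by vision.mode.
async function preprocessImages(
  messages: Message[],
  describeImage: (img: ImageBlock) => Promise<string>,
): Promise<Message[]> {
  return Promise.all(
    messages.map(async (msg) => {
      if (typeof msg.content === "string") return msg; // nothing to do
      const content = await Promise.all(
        msg.content.map(async (block): Promise<Block> =>
          block.type === "image"
            ? { type: "text", text: `[Image description: ${await describeImage(block)}]` }
            : block,
        ),
      );
      return { ...msg, content };
    }),
  );
}

// Example: the request from "Expected Behavior", with a stubbed vision call.
const demo: Message[] = [
  {
    role: "user",
    content: [
      { type: "image", source: { type: "base64", media_type: "image/png", data: "..." } },
      { type: "text", text: "analyze this image" },
    ],
  },
];

preprocessImages(demo, async () => "a PNG screenshot").then((out) => {
  // Logs the content array with the image block replaced by a text block.
  console.log(JSON.stringify(out[0].content));
});
```

Called from the Anthropic message handler before protocol conversion, this keeps `converter.ts` and `cursor-client.ts` unchanged downstream, since by the time they run the messages contain only text blocks.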
Contributed by Klement Gunndu