fix(browser): return MEDIA: path from screenshot for channel delivery #8

Merged
viettranx merged 1 commit into nextlevelbuilder:main from xthanhn91:fix/browser-screenshot-media-path on Feb 28, 2026

@xthanhn91
Contributor

Summary

  • handleScreenshot returned a truncated base64 string (first 100 chars of encoded data), effectively discarding the screenshot bytes
  • The media pipeline in loop.go requires a MEDIA: prefix in tool results to collect files for channel delivery (e.g. sendPhoto on Telegram). Without it, screenshots were silently lost
  • Save screenshot bytes to a temp file and return MEDIA:/path, matching the pattern used by create_image and tts tools

Root Cause

// Before: truncated base64 text, no MEDIA: prefix → never delivered
encoded := base64.StdEncoding.EncodeToString(data)
return tools.NewResult(fmt.Sprintf("Screenshot captured (%d bytes). Base64: %s",
    len(data), encoded[:min(100, len(encoded))]))

parseMediaResult() in loop.go scans tool output for the MEDIA: prefix to route files to channel delivery. Without this prefix, the screenshot reached the LLM only as a truncated, unusable text string and was never sent to the channel.
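
The prefix check can be pictured with a minimal sketch. The function name extractMediaPaths and the line-by-line matching rule are assumptions made for illustration; the real parseMediaResult() in loop.go may match differently.

```go
package main

import "strings"

// mediaPrefix is the marker named in the PR description.
const mediaPrefix = "MEDIA:"

// extractMediaPaths is a hypothetical stand-in for parseMediaResult().
// It returns the path from every line of a tool result that starts
// with "MEDIA:" and ignores all other lines.
func extractMediaPaths(toolOutput string) []string {
	var paths []string
	for _, line := range strings.Split(toolOutput, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, mediaPrefix) {
			continue
		}
		if path := strings.TrimPrefix(line, mediaPrefix); path != "" {
			paths = append(paths, path)
		}
	}
	return paths
}
```

Under this sketch, the old return value ("Screenshot captured (... bytes). Base64: ...") yields no paths at all, which is consistent with the screenshot never reaching the channel.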

Fix

// After: save to file + return MEDIA: path → delivered via sendPhoto/sendDocument
imagePath := filepath.Join(os.TempDir(), fmt.Sprintf("goclaw_screenshot_%d.png", time.Now().UnixNano()))
os.WriteFile(imagePath, data, 0644)
return &tools.Result{ForLLM: fmt.Sprintf("MEDIA:%s", imagePath)}

Test plan

  • Start browser → open URL → take screenshot → verify image delivered to Telegram/Discord
  • Verify temp file cleanup by dispatchOutbound after delivery
  • Verify fullPage: true screenshots also work

… base64

handleScreenshot returned a truncated base64 string (first 100 chars only),
which meant the screenshot data was effectively lost and never delivered
to channels (Telegram, Discord, etc.) via the media pipeline.

The media pipeline in loop.go looks for "MEDIA:" prefix in tool results
to collect files for channel delivery (e.g. sendPhoto on Telegram).
Without this prefix, screenshots were silently discarded.

Fix: save screenshot bytes to a temp file and return MEDIA:/path,
matching the pattern used by create_image and tts tools.
@xthanhn91 force-pushed the fix/browser-screenshot-media-path branch from bcfb894 to 116d2eb on February 27, 2026 15:18
@viettranx merged commit e76aabb into nextlevelbuilder:main on Feb 28, 2026
MiltonSilvaJr referenced this pull request in vellus-ai/argoclaw Mar 22, 2026
Sprint 0 — Security hardening before feature development.

HIGH fixes:
- #1: Whitelist table names in execMapUpdate() — prevents SQL injection
  via dynamic table name (store/pg/helpers.go)
- #2: Log invalid groupBy values in snapshot queries (store/pg/snapshot.go)
- #3: Validated shellEscape() — single-quote wrapping is correct;
  added PBT tests for shell injection (tools/dynamic_tool_security_test.go)

MEDIUM fixes:
- #4-5: Log security warnings for no-token and viewer-fallback auth
  (gateway/router.go)
- #6: Restrict CORS on OpenAPI endpoint — removed wildcard, allow only
  localhost origins (http/openapi.go)
- #7: Add CheckSSRFWithPinning() for DNS rebinding TOCTOU prevention
  (tools/web_shared.go)
- #8: Log warning when TLS verification is disabled
  (tracing/otelexport/exporter.go)
- #9: Pin all Python package versions in Dockerfile — prevents
  supply chain attacks via unpinned dependencies
- #10: Change HOME fallback from /tmp to /app — prevents temp dir
  abuse (tools/credentialed_exec.go)

Also fixes arargoclaw double-rename bug in 356 Go import paths.

Tests: PBT tests for table whitelist and shell escaping (testing/quick).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@viettranx mentioned this pull request on Mar 23, 2026
blackbirdzzzz365-gif pushed a commit to blackbirdzzzz365-gif/goclaw that referenced this pull request Apr 12, 2026
9 checkpoint documents covering the upgrade from 43% to ~85% pattern
matching with Claude Code's architectural patterns.

Checkpoints:
- CP-00: Current state analysis
- CP-01: Context defense 5 layers (Pattern #9)
- CP-02: Concurrency-safe partitioning (Pattern #4)
- CP-03: Streaming tool execution (Pattern #5)
- CP-04: Escalating recovery (Pattern #3)
- CP-05: Context modifier chain + fork isolation (Patterns #6, #8)
- CP-06: Permission classification pipeline (Pattern #10)
- CP-07: Skill system upgrade (Patterns #11-13)
- CP-08: Plugin ecosystem (Patterns #14-16)

Based on analysis from "Giai phau mot Agentic Operating System"
(18 patterns from 513K LOC Claude Code source).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>