feat: add export-attachment command#4

Closed
chgeuer wants to merge 1 commit into kenn-io:main from chgeuer:main

Conversation

@chgeuer
Contributor

@chgeuer chgeuer commented Feb 2, 2026

Adds an export-attachment command that exports attachment binaries by content hash.

msgvault export-attachment <hash> -o file.pdf   # to file
msgvault export-attachment <hash> -o -          # binary stdout
msgvault export-attachment <hash> --base64      # base64 stdout
msgvault export-attachment <hash> --json        # JSON with base64

For example, this lets me easily save all attachments of a given e-mail:

msgvault show-message 45 --json | \
     jq -r '.attachments[] | "\(.content_hash)\t\(.filename)"' | \
     while IFS=$'\t' read -r hash name; do
       msgvault export-attachment "$hash" -o "$name"
     done

Closes #3

Add CLI command to export attachment binaries by content hash.
Users can get the hash from 'show-message --json' and extract
attachments without parsing raw MIME/EML files.

Supports multiple output modes:
- Binary file output (-o file.pdf)
- Binary stdout (-o - or default)
- Base64 stdout (--base64)
- JSON with embedded base64 (--json)

Resolves the need for programmatic attachment access without
external MIME parsing dependencies.
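
The base64 and JSON output modes listed above can be pictured with a minimal Go sketch. It is not the code from this PR: the `attachmentJSON` type, its field names, and both helper functions are assumptions made for illustration, and the real `--json` payload may be shaped differently.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
)

// attachmentJSON is a hypothetical payload shape for a --json style output.
// The field names are assumptions, not the tool's documented schema.
type attachmentJSON struct {
	ContentHash string `json:"content_hash"`
	Size        int    `json:"size"`
	DataBase64  string `json:"data_base64"`
}

// encodeBase64 returns the standard (padded) base64 encoding of the raw bytes,
// which is what a --base64 style mode would print to stdout.
func encodeBase64(data []byte) string {
	return base64.StdEncoding.EncodeToString(data)
}

// encodeJSON wraps the attachment bytes in a JSON document with the binary
// content embedded as base64, so the result is safe to pass through text pipes.
func encodeJSON(hash string, data []byte) (string, error) {
	out, err := json.Marshal(attachmentJSON{
		ContentHash: hash,
		Size:        len(data),
		DataBase64:  encodeBase64(data),
	})
	if err != nil {
		return "", err
	}
	return string(out), nil
}
```

Embedding the bytes as base64 keeps the JSON valid for arbitrary binary content, at the cost of roughly a third more output than the raw file.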
@wesm
Member

wesm commented Feb 2, 2026

Makes sense. If you can bear with me, I'd like to holistically improve the CLI for interacting with file attachments. There is already the get_attachment MCP tool (https://www.msgvault.io/usage/chat/#available-tools), so I think it should be easy to list attachment hashes and export them via the CLI without having to wrangle JSON with jq.

@chgeuer
Contributor Author

chgeuer commented Feb 2, 2026

It would be awesome to have a 'pure' (non-LLM/MCP) CLI mechanism. I'm calling it from an Elixir application; I don't want my app to query the real Gmail endpoint, but to work from a locally cached copy.

@wesm
Member

wesm commented Feb 4, 2026

I'm back looking at this. Since I just added an export_attachment MCP tool, I'll make sure the CLI has feature parity with it.

@wesm
Member

wesm commented Feb 4, 2026

I don't want to force-push to your main branch, so I created #56 to supersede this.

@wesm wesm closed this Feb 4, 2026
wesm added a commit that referenced this pull request Feb 4, 2026
…e shared export logic (#56)

## Summary

- Adds `export-attachment` command to export single attachment binaries
by content hash (supersedes #4)
- Adds `export-attachments` command to export all attachments from a
message as individual files
- Consolidates attachment export logic into `internal/export` so TUI,
CLI, and MCP share one code path

```bash
# Single attachment by content hash
msgvault export-attachment <hash> -o file.pdf
msgvault export-attachment <hash> --base64
msgvault export-attachment <hash> --json

# All attachments from a message
msgvault export-attachments 45                  # all attachments → cwd
msgvault export-attachments 45 -o ~/Downloads   # all attachments → specific dir
msgvault export-attachments 18f0abc123def       # by Gmail ID
```

### Shared export package (`internal/export`)
- `AttachmentsToDir()` — export attachments as individual files to a
directory (streaming I/O, deduped filenames, `O_EXCL` file creation)
- `CreateExclusiveFile()` — atomic file creation with `_1`, `_2` suffix
on conflict
- `StoragePath()` — content-addressed path construction with hash
validation
- `ValidateContentHash()`, `SanitizeFilename()` — already existed, now
used by all code paths
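
The "`O_EXCL` file creation with `_1`, `_2` suffix on conflict" idea can be sketched as follows. This is not the body of `CreateExclusiveFile()`: the names `createExclusive` and `demoNames`, the `maxTries` parameter, the `0o600` mode, and the choice to place the suffix before the file extension are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// createExclusive is a hypothetical stand-in for the shared helper. It tries
// name, then name_1, name_2, ... (suffix before the extension, an assumption)
// and returns the first file it manages to create.
func createExclusive(dir, name string, maxTries int) (*os.File, string, error) {
	ext := filepath.Ext(name)
	stem := strings.TrimSuffix(name, ext)
	for i := 0; i < maxTries; i++ {
		candidate := name
		if i > 0 {
			candidate = fmt.Sprintf("%s_%d%s", stem, i, ext)
		}
		path := filepath.Join(dir, candidate)
		// O_EXCL makes the open fail if the path already exists, so the
		// existence check and the creation happen as one atomic step.
		f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
		if err == nil {
			return f, path, nil
		}
		if !errors.Is(err, fs.ErrExist) {
			return nil, "", err // a real I/O error, not a name conflict
		}
	}
	return nil, "", fmt.Errorf("no free filename for %q after %d attempts", name, maxTries)
}

// demoNames creates the same filename three times in a temporary directory
// and returns the base names that were actually used.
func demoNames() ([]string, error) {
	dir, err := os.MkdirTemp("", "export-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	var names []string
	for i := 0; i < 3; i++ {
		f, path, err := createExclusive(dir, "report.pdf", 10)
		if err != nil {
			return nil, err
		}
		f.Close()
		names = append(names, filepath.Base(path))
	}
	return names, nil
}
```

Under these assumptions, `demoNames()` yields `report.pdf`, `report_1.pdf`, and `report_2.pdf`. Relying on `O_EXCL` rather than a separate existence check avoids the race in which another process creates the file between the check and the open.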

### MCP consolidation
- Removed duplicated `sanitizeFilename`, `createExclusive`,
`pathConflict` from MCP handler (~50 lines)
- MCP now uses shared `export.SanitizeFilename`,
`export.CreateExclusiveFile`, `export.StoragePath`

Closes #3

## Test plan
- [x] `make test && make lint` pass
- [x] `internal/export`: 8 `TestAttachmentsToDir` subtests,
`TestCreateExclusiveFile` (4 subtests),
`TestAttachmentsToDir_FilePermissions`,
`TestAttachmentsToDir_DiskConflict`
- [x] `cmd/.../export_attachment_test.go`: binary, JSON, base64 output
modes; missing file; flag exclusivity; hash validation
- [x] `cmd/.../export_attachments_test.go`: full flow with real DB,
Gmail ID fallback, message not found, output dir validation,
not-a-directory
- [x] MCP `TestSanitizeFilename` updated to use shared function
- [x] Manual: `msgvault export-attachments <id>` with real message

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Christian Geuer-Pollmann <christian@geuer-pollmann.de>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
webgress added a commit to webgress/msgvault that referenced this pull request May 22, 2026
… fixes

External review batch.

BLOCKER kenn-io#9 — verify silently skipped its integrity check on PG.
- Updated cmd/msgvault/cmd/verify.go to print "Skipping database
  integrity check (PostgreSQL — use pg_amcheck out-of-band)." instead
  of silently swallowing the step. Rewrote the cobra Long description
  so both the SQLite branch ("runs PRAGMA integrity_check") and the
  PG branch ("prints a notice…") are visible to --help readers. Kept
  the IsPostgreSQL() guard inside runIntegrityCheck as belt-and-braces
  for any future caller.

IMPORTANT kenn-io#4 — legacy SearchMessages bypassed BuildFTSArg.
- internal/store/api.go SearchMessages bound the raw user query string
  straight into FTSSearchClause's placeholder. On PG that fed
  to_tsquery un-escaped input; multi-word or punctuated queries (the
  norm) errored at the parser. On SQLite raw FTS5 metacharacters in
  user input would reach MATCH the same way. Now SearchMessages
  splits on whitespace and delegates to SearchMessagesQuery so the
  dialect's BuildFTSArg sanitizes per backend and the just-landed
  FALSE-fallback handles tokenless inputs uniformly. Whitespace-only
  input short-circuits to zero hits before invoking the FTS pipeline.
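
A rough Go sketch of the flow described in this item (split on whitespace, sanitize each token, fall back when nothing is left) is shown below. It is not the project's `BuildFTSArg`: the function names, the rule of keeping only letters and digits, and the `tok:* & tok:*` output shape are assumptions chosen to illustrate a prefix-match `to_tsquery` argument.

```go
package main

import (
	"strings"
	"unicode"
)

// sanitizeTerm keeps only letters and digits, so tsquery operators such as
// & | ! : ( ) and quotes can never reach the parser. This is a deliberately
// blunt rule for the sketch; it also merges "foo-bar" into "foobar".
func sanitizeTerm(term string) string {
	var b strings.Builder
	for _, r := range term {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// buildPrefixTSQuery is a hypothetical helper that turns raw user input into
// an AND of prefix matches. ok is false when no usable token remains, which
// is where a caller would short-circuit to zero hits instead of running FTS.
func buildPrefixTSQuery(raw string) (arg string, ok bool) {
	var parts []string
	for _, term := range strings.Fields(raw) {
		if tok := sanitizeTerm(term); tok != "" {
			parts = append(parts, tok+":*")
		}
	}
	if len(parts) == 0 {
		return "", false
	}
	return strings.Join(parts, " & "), true
}
```

With this sketch, `"hello world"` becomes `hello:* & world:*`, while whitespace-only or punctuation-only input reports `ok == false` and never reaches the database.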

NIT kenn-io#1, kenn-io#14 — dialect comments drifted from current behavior.
- internal/store/dialect.go BuildFTSArg interface comment now
  describes the actual prefix-match output shape on both backends and
  the empty-fallback contract.
- internal/store/dialect_pg.go BuildFTSArg comment trimmed of the
  past-tense plainto_tsquery history; now describes only the current
  to_tsquery shape and the empty-fallback rule.

Tests
- TestSearchMessages_LegacyRawString covers multi-word, single-word,
  pure punctuation, pure dashes, whitespace-only, and mixed
  punctuation inputs to the legacy entrypoint. Runs on both SQLite
  and PostgreSQL via storetest.New / testutil.NewTestStore.

Development

Successfully merging this pull request may close these issues.

Add command to export attachments by content hash

2 participants