hir-94: encode MIME headers + bodies correctly for non-ASCII sends #10

Merged: pypesdev merged 3 commits into pypesdev:main from jaredzwick:hir-94/mime-rfc-encoding on May 3, 2026

@jaredzwick (Collaborator)

Summary

  • gmailService.createMimeMessage declared Content-Transfer-Encoding: 7bit while writing raw UTF-8 into both header values and body parts. RFC 2045 §6.2 forbids bytes > 127 in 7bit, so any subject or body with an em dash, curly quote, accented character, or emoji would either be rejected by Gmail or delivered as mojibake.
  • The shipped template library (src/lib/templates/catalog.ts) uses em dashes (— {{sender_name}}) — so this bug would silently corrupt every send made from a template.
  • This PR makes the MIME builder RFC-correct: RFC 2047 B-encoding for headers, RFC 2045 quoted-printable for non-ASCII bodies, proper address-header quoting/escaping, and CRLF discipline. ASCII inputs are emitted as 7bit so existing all-ASCII messages stay byte-equivalent.
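
The 7bit-versus-quoted-printable choice described above can be sketched as follows. This is a minimal illustration, not the code in src/lib/mime.ts: `isAscii` is named in the test list, but its body here is assumed, and `pickTransferEncoding` is a hypothetical helper name. A full 7bit check would also reject NUL bytes and lines over 998 characters, which this sketch skips.

```typescript
// Assumed implementation: true when every UTF-16 code unit is in 0..127.
function isAscii(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 127) return false;
  }
  return true;
}

type TransferEncoding = "7bit" | "quoted-printable";

// Hypothetical helper: ASCII parts stay 7bit so existing messages keep
// their bytes; anything else must be encoded before it is declared.
function pickTransferEncoding(body: string): TransferEncoding {
  return isAscii(body) ? "7bit" : "quoted-printable";
}
```

Under this sketch, a body containing only an em dash beyond ASCII would still switch the whole part to quoted-printable, which matches the per-part selection the summary describes.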

Changes

  • src/lib/mime.ts (new): pure helpers — encodeMimeHeaderValue, formatAddressHeader, encodeQuotedPrintable, buildMimeMessage, toGmailRawString. RFC 2047 multi-word splitting respects UTF-8 boundaries and the 75-char per-encoded-word cap. Quoted-printable handles equals, high-bit bytes, trailing whitespace, and soft-wraps at 76 chars per RFC 2045 §6.7.
  • src/lib/emailTracking.ts (new): tracking pixel + click rewrite extracted to a pure transform. Only rewrites http(s) hrefs (mailto/tel/anchors preserved), only inserts the open-pixel once, idempotent.
  • src/lib/gmailService.ts: createMimeMessage is now a thin wrapper around the helpers. No changes to send semantics, token refresh, quota handling, or the public signature.
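
To illustrate the RFC 2047 splitting mentioned for encodeMimeHeaderValue, here is a rough sketch under stated assumptions. The function name comes from the PR; the body, the `MAX_BYTES_PER_WORD` constant, and the choice to fold between words with CRLF plus a space are assumptions. The arithmetic: `=?UTF-8?B?` plus `?=` costs 12 characters, leaving 63 for base64, so at most 45 raw bytes (60 base64 characters) fit in one encoded word.

```typescript
// Assumption: 45 raw bytes -> 60 base64 chars + 12 framing chars = 72 <= 75.
const MAX_BYTES_PER_WORD = 45;

function bytesToBase64(bytes: number[]): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

function encodeMimeHeaderValue(value: string): string {
  // Printable ASCII passes through untouched.
  if (/^[\x20-\x7e]*$/.test(value)) return value;

  const encoder = new TextEncoder();
  const words: string[] = [];
  let chunk: number[] = [];

  // Array.from iterates by code point, so a multi-byte character is
  // never split across two encoded words.
  for (const ch of Array.from(value)) {
    const bytes = encoder.encode(ch);
    if (chunk.length + bytes.length > MAX_BYTES_PER_WORD) {
      words.push("=?UTF-8?B?" + bytesToBase64(chunk) + "?=");
      chunk = [];
    }
    for (let i = 0; i < bytes.length; i++) chunk.push(bytes[i]);
  }
  if (chunk.length > 0) words.push("=?UTF-8?B?" + bytesToBase64(chunk) + "?=");

  // Adjacent encoded words are separated by folding whitespace.
  return words.join("\r\n ");
}
```

Splitting on code points rather than bytes is what keeps each encoded word independently decodable, which RFC 2047 requires.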

Tests

  • tests/int/mime.int.spec.ts — 29 specs covering isAscii, header encoding (ASCII passthrough, multi-word, UTF-8 boundary safety, 75-char cap, emoji round-trip), formatAddressHeader (no name, ASCII, special chars, quote/backslash escaping, non-ASCII names), encodeQuotedPrintable (equals, high-bit, em-dash bytes, CRLF preservation, trailing whitespace, soft-wrap), buildMimeMessage (empty body throws, ASCII→7bit, non-ASCII→QP, RFC 2047 subject, multi/single-part, CRLF only, header order, unique boundary), toGmailRawString (base64url shape + round-trip).
  • tests/int/emailTracking.int.spec.ts — 12 specs covering pixel + click URL builders, no-tracking-id passthrough, pixel placement (with/without </body>), href rewriting (double-quoted, single-quoted, attributes-before-href, mailto/tel/anchor preservation), no-double-rewrite, idempotency, multi-link.
  • All 41 new tests pass. Full pnpm test:int: 95/96 pass; the 1 failure is the pre-existing api.int.spec.ts (missing PAYLOAD_SECRET env on main, unrelated).
  • pnpm lint clean for all changed files.
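
The formatAddressHeader cases listed above (no name, plain ASCII, special characters, quote/backslash escaping, non-ASCII names) can be illustrated with a small sketch. The function name is from the PR, but the signature and body are assumptions; in particular, the non-ASCII branch here emits a single encoded word without the 75-character splitting that the real helper is described as doing.

```typescript
// Assumed signature: (email, optional display name) -> header value.
function formatAddressHeader(email: string, name?: string): string {
  if (!name) return email;

  // Non-ASCII display name: one RFC 2047 B-encoded word (no splitting
  // in this sketch, so very long names would exceed the 75-char cap).
  if (/[^\x00-\x7f]/.test(name)) {
    const bytes = new TextEncoder().encode(name);
    let binary = "";
    for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
    return "=?UTF-8?B?" + btoa(binary) + "?= <" + email + ">";
  }

  // RFC 5322 "specials" force a quoted-string; backslash and double
  // quote must be escaped inside it.
  if (/[()<>\[\]:;@\\,."]/.test(name)) {
    const escaped = name.replace(/(["\\])/g, "\\$1");
    return '"' + escaped + '" <' + email + ">";
  }

  return name + " <" + email + ">";
}
```

For example, a name such as `Zwick, Jared` would come out quoted, because an unquoted comma would otherwise be read as a separator between two addresses.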

Regression analysis

  • createMimeMessage is the only caller affected. Its public behaviour for ASCII-only inputs is preserved (7bit encoding, identical byte sequence except for boundary suffix randomization for uniqueness). Non-ASCII inputs that previously produced an RFC-violating message now produce a valid one.
  • Tracking pixel + click-rewrite contract is unchanged for the existing endpoint paths; attribute order around href is preserved by the new regex (the old one occasionally dropped the trailing space — explicitly tested).
  • No schema, migration, queue, or auth changes.
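
A rough sketch of the tracking transform's contract (http(s)-only rewriting, quote and attribute-order preservation, single pixel, idempotency) is shown below. `injectTracking` is named in the commit message, but the signature, the regex, and both URLs (`CLICK_BASE`, `OPEN_BASE`) are placeholders assumed for illustration; the real endpoints are not given in this PR. The sketch also does not decode HTML entities inside href values.

```typescript
// Placeholder endpoints, assumed for this sketch only.
const CLICK_BASE = "https://track.example.com/click";
const OPEN_BASE = "https://track.example.com/open";

function injectTracking(html: string, trackingId?: string): string {
  if (!trackingId) return html; // no tracking id: pass through unchanged
  const id = encodeURIComponent(trackingId);

  // Capture everything up to the href value, the quote style, and the URL,
  // so attributes before href and the original quoting are kept verbatim.
  const rewritten = html.replace(
    /(<a\b[^>]*?\bhref\s*=\s*)(["'])(.*?)\2/gi,
    (match: string, prefix: string, quote: string, url: string) => {
      if (!/^https?:\/\//i.test(url)) return match; // mailto:, tel:, #anchor
      if (url.startsWith(CLICK_BASE)) return match; // already tracked
      const tracked = CLICK_BASE + "/" + id + "?url=" + encodeURIComponent(url);
      return prefix + quote + tracked + quote;
    },
  );

  const pixelUrl = OPEN_BASE + "?id=" + id;
  if (rewritten.includes(pixelUrl)) return rewritten; // pixel already present
  const pixel = '<img src="' + pixelUrl + '" width="1" height="1" alt="" />';

  // Prefer placing the pixel just before </body>; otherwise append it.
  return /<\/body>/i.test(rewritten)
    ? rewritten.replace(/<\/body>/i, (closing: string) => pixel + closing)
    : rewritten + pixel;
}
```

Because already-tracked links and an existing pixel are both detected, running the transform twice on the same HTML returns the same string as running it once.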

Test plan

  • Send a test campaign whose subject contains an em dash (Welcome — Jared) and whose body uses a curly apostrophe.
  • Verify the receiving inbox shows the subject correctly (no =?UTF-8?...?= or â artifacts).
  • Verify open-pixel and click-tracking still register events for HTML bodies.
  • Verify a plain-text-only ASCII campaign still sends and the message bytes are unchanged from the previous behaviour.
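
For the manual check on artifacts, it may help to see where the `â` comes from. The sketch below reproduces the failure by reading UTF-8 bytes as if each byte were one Latin-1 character, and adds a crude detector. Both function names are hypothetical and the detector is a heuristic, not something this PR ships.

```typescript
// Simulates a client that decodes UTF-8 bytes one byte per character.
function misreadUtf8AsLatin1(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let out = "";
  for (let i = 0; i < bytes.length; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

// Heuristic: U+00E2 followed by another non-ASCII character is the usual
// signature of a misdecoded dash or curly quote; a literal "=?UTF-8?B?"
// or "=?UTF-8?Q?" means a header was shown without being decoded.
function looksLikeMojibake(rendered: string): boolean {
  return /\u00e2[^\x00-\x7f]/.test(rendered) || /=\?UTF-8\?[BQ]\?/i.test(rendered);
}
```

An em dash is the three bytes E2 80 94 in UTF-8, so the first byte alone surfaces as `â` when the charset is misread.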

Out of scope

  • Per-recipient variable substitution from the recipient-parser (the campaign UI doesn't pass variables per recipient yet — separate increment).
  • Queue scheduling fixes (quota-rescheduling never updates scheduledFor — separate increment).

🤖 Generated with Claude Code

The previous gmailService createMimeMessage hard-coded
"Content-Transfer-Encoding: 7bit" and wrote raw UTF-8 into both header
values and body parts. This violates RFC 2045 (7bit forbids bytes >
127), so any subject or body containing an em dash, curly quote,
accented character, or emoji would either be rejected by Gmail or
delivered as mojibake. The shipped template library uses em dashes
("— {{sender_name}}"), so this would corrupt every send made from a
template.

This change:

- src/lib/mime.ts (new): pure RFC-correct helpers — encodeMimeHeaderValue
  (RFC 2047 B-encoding, multi-word splitting that respects UTF-8
  boundaries and the 75-char per-encoded-word cap), formatAddressHeader
  (quoting + escaping + non-ASCII encoding), encodeQuotedPrintable (RFC
  2045 §6.7 — equals, high-bit, trailing whitespace, soft-wrap at 76),
  buildMimeMessage (multipart/alternative, picks 7bit vs
  quoted-printable per part), toGmailRawString (base64url for the Gmail
  API raw field).
- src/lib/emailTracking.ts (new): tracking pixel + click rewrite as a
  pure transform. Only rewrites http(s) hrefs (mailto/tel/anchors are
  preserved), only inserts the open pixel once, and skips links that
  already point at our click tracker. Idempotent.
- src/lib/gmailService.ts: createMimeMessage is now a thin wrapper
  around buildMimeMessage + injectTracking + toGmailRawString. No
  changes to send semantics, token refresh, quota handling, or the
  function signature.
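
As an illustration of the quoted-printable rules listed above (equals, high-bit bytes, trailing whitespace, soft-wrap at 76), a minimal sketch follows. The function name matches the commit message; the body is assumed, and `encodeQpLine` is a hypothetical helper. Control characters other than tab are simply hex-encoded here.

```typescript
function hexByte(b: number): string {
  return (b < 16 ? "0" : "") + b.toString(16).toUpperCase();
}

// Encodes one logical line (no line terminator) of UTF-8 bytes.
function encodeQpLine(bytes: Uint8Array): string {
  let out = "";
  let lineLen = 0;
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    const isLast = i === bytes.length - 1;
    // Literal: printable ASCII except "=", plus space/tab when they are
    // not the last byte of the line (trailing whitespace gets encoded).
    const literal =
      b !== 0x3d &&
      ((b >= 0x21 && b <= 0x7e) || ((b === 0x20 || b === 0x09) && !isLast));
    const token = literal ? String.fromCharCode(b) : "=" + hexByte(b);
    // Keep 75 chars of content so the soft-break "=" makes at most 76.
    if (lineLen + token.length > 75) {
      out += "=\r\n";
      lineLen = 0;
    }
    out += token;
    lineLen += token.length;
  }
  return out;
}

function encodeQuotedPrintable(text: string): string {
  const encoder = new TextEncoder();
  // Hard line breaks are normalised and re-emitted as CRLF.
  const lines = text.replace(/\r\n|\r|\n/g, "\n").split("\n");
  return lines.map((line) => encodeQpLine(encoder.encode(line))).join("\r\n");
}
```

Tokens are never split, so an `=XX` triplet always stays on one line, although the bytes of a single multi-byte character may land on either side of a soft break, which quoted-printable permits.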

41 new vitest specs cover both modules: ASCII passthrough, multi-word
RFC 2047 splitting, UTF-8 boundary safety, header encoding, address
quoting/escaping, quoted-printable equals/high-bit/trailing-ws/soft-wrap,
buildMimeMessage 7bit-vs-QP selection, header order, CRLF discipline,
boundary uniqueness, base64url round-trip; tracking pixel placement,
href quoting preservation, attribute order preservation, non-http
skipping, idempotency, multi-link rewriting.
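
The base64url step for the Gmail API raw field can be sketched like this. The function name is from the commit message; the body is an assumption, including the decision to strip `=` padding, which this PR does not state either way.

```typescript
// Assumed implementation: UTF-8 bytes -> base64 -> URL-safe alphabet.
function toGmailRawString(mime: string): string {
  const bytes = new TextEncoder().encode(mime);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary)
    .replace(/\+/g, "-") // "+" is not URL-safe
    .replace(/\//g, "_") // "/" is not URL-safe
    .replace(/=+$/, ""); // padding stripped (assumption)
}
```

Encoding through TextEncoder first matters because `btoa` only accepts code units up to 255, so passing a string with an em dash directly would throw.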

No schema changes, no migrations, no changes to queue/processor/auth
paths.

Co-Authored-By: Paperclip <noreply@paperclip.ing>

- escape stray apostrophe in EmailAccountManagement Outlook copy
  (react/no-unescaped-entities was the only blocking eslint error)
- zod v4: z.record(z.string()) → z.record(z.string(), z.string())
- zod v4: rename ZodError.errors → ZodError.issues across 3 routes
- replace ad-hoc drizzle-orm import in email-accounts/[id] route
  with new getPendingCountForEmailAccount helper in @coldflow/db
- remove unreachable account.status === 'error' branch in
  gmailService.sendEmail (narrowed to 'connected' by earlier guard)

Verified locally: pnpm build now passes lint + typecheck and only
fails at page-data collection due to missing local DATABASE_URL /
BETTER_AUTH_SECRET / PAYLOAD_SECRET (set on Vercel).

Co-Authored-By: Paperclip <noreply@paperclip.ing>

- prefix unused payload/req params with _ in 20251201_000948
- ignore src/migrations/ in eslint config (auto-generated by payload)

Co-Authored-By: Paperclip <noreply@paperclip.ing>

pypesdev merged commit 1e9a437 into pypesdev:main on May 3, 2026
1 check failed