Pin response encoding to UTF-8 in API requests by tillo · Pull Request #36 · marph91/joppy

tillo · 2026-05-22T17:17:48Z

joppy decodes API responses through response.text, which relies on requests knowing the response charset.

Joplin always serves UTF-8, but several endpoints omit the charset from the Content-Type header — notably the Joplin Server item /content endpoint that returns the raw note (title + body + metadata). When the charset is absent, requests does not assume UTF-8: it falls back to charset detection (or, depending on the requests / charset_normalizer version, to the latin-1 default). Non-ASCII note content is then mis-decoded into mojibake.

Because the mis-decoded string is read back, edited, and written again, the corruption compounds on every round-trip: an em-dash — becomes â\x80\x94, then Ã¢Â\x80Â\x94, and so on. Any tool built on joppy that round-trips notes (sync helpers, MCP servers, migration scripts) silently corrupts every note it touches.

This pins response.encoding = "utf-8" immediately after each request in both ServerApi._request and Api._request, before response.text is ever accessed. Joplin's API is UTF-8 in every case, so this is always correct and makes decoding deterministic regardless of the installed requests / charset_normalizer version.

No behaviour change for ASCII content; non-ASCII titles and bodies now round-trip losslessly.

Joplin's REST API always returns UTF-8, but some responses — notably the Joplin Server item /content endpoint — omit the charset from the Content-Type header. requests then falls back to charset detection for `response.text`, which mis-decodes non-ASCII note content as mojibake. Set response.encoding = "utf-8" explicitly in both _request methods so note titles and bodies decode correctly.

The Joplin Server upstream now returns a totp_secret field on the user data endpoint (see laurent22/joplin database/types.ts). Without the matching dataclass field, UserData(**item) raises: TypeError: UserData.__init__() got an unexpected keyword argument 'totp_secret' This currently breaks test_get_current_user on every CI run and is independent of the UTF-8 encoding change in this PR -- it would fail on master too once master CI is re-triggered. Added as the last field, mirroring the chronological-extension pattern of sso_auth_code* added in earlier upstream revisions.

tillo · 2026-05-24T15:46:32Z

Heads-up that I've pushed an extra commit (a60eea3) to fix the CI failure on test_get_current_user. The Joplin Server upstream now returns a totp_secret field on the user data response, which made UserData(**item) raise TypeError: UserData.__init__() got an unexpected keyword argument 'totp_secret' regardless of this PR's UTF-8 change — master CI hits the same error once re-triggered.

If you'd rather keep this PR scoped to the encoding fix only, happy to split the totp_secret field into its own PR. Two flake-class failures remain (Joplin AppImage stop-timeout in tearDownModule, and a SQLITE_BUSY on test_get_notebook on the 3.10 job only) — those look environmental and aren't touched here.

marph91 · 2026-05-24T16:44:38Z

Thanks for the analysis and fix! I can reproduce the issue locally. I'm fine with the changes.

Two flake-class failures remain (Joplin AppImage stop-timeout in tearDownModule, and a SQLITE_BUSY on test_get_notebook on the 3.10 job only) — those look environmental and aren't touched here.

Yes, these failures are not caused by this PR. I need to find the root cause and fix them separately.

tillo · 2026-05-24T16:45:19Z

Update: a60eea32 resolved the real failure (test_get_current_user now passes on both 3.10 and 3.14). The two errors still showing are both pre-existing flakes:

tearDownModule — Joplin AppImage subprocess.communicate(timeout=10) times out on shutdown. Hits on every run; not test-or-PR-specific.
test_get_all_notes (3.14 only this time) — SQLITE_BUSY on the Joplin Server SQLite under suite concurrency. Same class as the test_get_notebook flake on the previous 3.10 run; just landed on a different test.

Neither touches the UTF-8 response-hook code or the totp_secret field. Happy to leave the flakes for separate work — if you'd rather see green CI here first, the cheapest fix would be @pytest.mark.flaky (or unittest's equivalent) on the SQLITE_BUSY-prone tests and a .kill() fallback after the 10s communicate timeout in setup_joplin.py::stop(). Let me know which you prefer.

marph91 · 2026-05-24T18:03:19Z

Joppy 1.0.4 was just released on Pypi. Feel free to open a new issue (or PR) if there are problems.

tillo added 2 commits May 22, 2026 19:17

marph91 merged commit 7743be1 into marph91:master May 24, 2026
2 of 4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pin response encoding to UTF-8 in API requests#36

Pin response encoding to UTF-8 in API requests#36
marph91 merged 2 commits into
marph91:masterfrom
tillo:fix-utf8-response-encoding

tillo commented May 22, 2026

Uh oh!

tillo commented May 24, 2026

Uh oh!

marph91 commented May 24, 2026

Uh oh!

Uh oh!

tillo commented May 24, 2026

Uh oh!

marph91 commented May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tillo commented May 22, 2026

Uh oh!

tillo commented May 24, 2026

Uh oh!

marph91 commented May 24, 2026

Uh oh!

Uh oh!

tillo commented May 24, 2026

Uh oh!

marph91 commented May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants