Refactor ensureUtf8 to handle exceptions and improve logic #33

Merged
ralflang merged 1 commit into FRAMEWORK_6_0 from fix/handling_8bit_unknown
May 25, 2026

@TDannhauer
Contributor

Handle unsupported charsets in ensureUtf8() without aborting sync

Summary

  • Catch RuntimeException from Horde_String::convertCharset() in Horde_ActiveSync_Utils::ensureUtf8() so charset conversion failures use the existing fallback path instead of propagating up the stack.
  • Return early when the primary conversion produces valid UTF-8.
  • Wrap alternate charset attempts (windows-1252, UTF-8) in the same exception handling so one unsupported label does not prevent later fallbacks from running.

Problem

ensureUtf8() assumed Horde_String::convertCharset() always returns a string. Since Horde Util 6.x, conversion failures throw RuntimeException (for example when the MIME part declares unknown-8bit, which PHP’s iconv/mbstring/intl stacks do not recognize).

ActiveSync calls ensureUtf8() when validating mail bodies (Horde_ActiveSync_Imap_MessageBodyData::_validateBodyData()), subjects, addresses, and iCalendar data. An uncaught exception during a Sync <Add> aborts the response; clients such as Outlook retry the same change indefinitely, which shows up as a sync loop in the logs.

The method already contained documented fallbacks for invalid UTF-8 output and for forced transliteration, but those branches were unreachable whenever the initial convertCharset() call threw.

Solution

  1. Try the declared $from_charset inside try/catch; return immediately on success with valid UTF-8.
  2. On exception or invalid UTF-8, run the existing alternate charset loop, also guarded by try/catch.
  3. If those fail, keep the existing 7-bit strip and forced conversion logic unchanged.

This restores the original intent of ensureUtf8(): best-effort UTF-8 output even when the declared charset is wrong or unsupported.
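
The three steps above can be sketched roughly as follows. This is not the merged code: convertOrThrow() is a hypothetical stand-in for Horde_String::convertCharset() built on mbstring (assumed to be installed), ensureUtf8Sketch() is an illustrative name, and the final step is simplified to a plain 7-bit strip.

```php
<?php
// Hypothetical stand-in for Horde_String::convertCharset(): it throws
// RuntimeException when the charset label is not supported.
function convertOrThrow(string $data, string $from, string $to = 'UTF-8'): string
{
    if (strcasecmp($from, $to) === 0) {
        return $data; // same charset, nothing to convert
    }
    try {
        // PHP 8 throws ValueError for an unknown encoding label;
        // PHP 7 emits a warning and returns false instead.
        $out = @mb_convert_encoding($data, $to, $from);
    } catch (\ValueError $e) {
        throw new RuntimeException("Unable to convert character set: $from", 0, $e);
    }
    if (!is_string($out)) {
        throw new RuntimeException("Unable to convert character set: $from");
    }
    return $out;
}

// Illustrative sketch of the guarded fallback chain (assumed name).
function ensureUtf8Sketch(string $data, string $fromCharset): string
{
    // Step 1: try the declared charset; return early on valid UTF-8.
    try {
        $text = convertOrThrow($data, $fromCharset);
        if (mb_check_encoding($text, 'UTF-8')) {
            return $text;
        }
    } catch (RuntimeException $e) {
        // Unsupported label: fall through to the alternates.
    }

    // Step 2: try the alternates, each guarded on its own so one
    // failure does not stop the next attempt.
    foreach (['windows-1252', 'UTF-8'] as $charset) {
        if (strcasecmp($charset, $fromCharset) === 0) {
            continue; // already tried in step 1
        }
        try {
            $text = convertOrThrow($data, $charset);
            if (mb_check_encoding($text, 'UTF-8')) {
                return $text;
            }
        } catch (RuntimeException $e) {
            // Try the next alternate.
        }
    }

    // Step 3 (simplified): last resort, drop every non 7-bit byte.
    return preg_replace('/[^\x00-\x7F]/', '', $data) ?? '';
}
```

With this sketch, a body declared as unknown-8bit that actually contains windows-1252 bytes is recovered in step 2 instead of raising, and a body declared as UTF-8 that is not valid UTF-8 takes the same path.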

Related work

Normalizing pseudo-charsets such as unknown-8bit in horde/util (CharacterSets::normalize()) addresses the most common case at the source. This change is complementary: any future or unknown charset that still makes convertCharset() throw will no longer break ActiveSync export.

Test plan

  • Sync a mailbox containing a message with Content-Type: text/plain; charset=unknown-8bit via Outlook (or another EAS client); confirm the message is added without ERR: Unable to convert character set and without a sync loop.
  • Verify a message with a valid declared charset (e.g. iso-8859-1, utf-8) still syncs with correct body text.
  • Verify a message with intentionally broken binary body data still degrades via fallbacks (7-bit strip / forced conversion) rather than killing the request.
  • Check ActiveSync logs: no RuntimeException from ensureUtf8() during Sync for the test messages above.
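
For the first test item, a fixture message could be generated along these lines. This is only a sketch: buildUnknown8bitMessage() is a hypothetical helper, the addresses are placeholders, and how the message gets into the test mailbox (for example an IMAP APPEND) is left to the test setup.

```php
<?php
// Hypothetical helper: builds a raw RFC 5322 message whose text part
// declares charset=unknown-8bit and carries raw 8-bit bytes.
function buildUnknown8bitMessage(): string
{
    // "Grüße aus Köln" encoded as single-byte ISO-8859-1/windows-1252,
    // so the body is deliberately not valid UTF-8.
    $body = "Gr\xFC\xDFe aus K\xF6ln";

    return implode("\r\n", [
        'From: sender@example.com',
        'To: recipient@example.com',
        'Subject: unknown-8bit charset test',
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=unknown-8bit',
        'Content-Transfer-Encoding: 8bit',
        '',
        $body,
        '',
    ]);
}
```

The same helper can be adapted for the second test item by changing the charset label to iso-8859-1.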

TDannhauer requested a review from ralflang on May 24, 2026, 19:30
@TDannhauer
Contributor Author

depends on horde/Util#28

ralflang merged commit ceb6337 into FRAMEWORK_6_0 on May 25, 2026
0 of 8 checks passed