Code Review
This pull request addresses a chat formatting issue involving double-encoded UTF-8 strings by detecting and decoding them. However, the current `_looksDoubleEncoded` detection logic is overly broad, producing false positives for valid strings with accented characters. These false positives force the `decodeString` getter to rely on try-catch for normal operation. The suggested modification provides a more precise detection, improving correctness and performance and reducing reliance on try-catch for non-exceptional cases.
```dart
bool _looksDoubleEncoded() {
  // Common UTF-8 leading byte patterns when misinterpreted as Latin-1:
  // - Ã (0xC3) followed by another character = 2-byte UTF-8 sequence
  // - â (0xE2) often starts 3-byte sequences (em-dash, curly quotes, etc.)
  // These patterns are very unlikely in correctly-encoded text
  for (int i = 0; i < length; i++) {
    final code = codeUnitAt(i);
    // Check for Latin-1 supplement range that looks like UTF-8 leading bytes
    if (code >= 0xC0 && code <= 0xF4) {
      // This could be a UTF-8 leading byte stored as Latin-1
      return true;
    }
  }
  return false;
}
```
The current implementation of `_looksDoubleEncoded` is too broad. The condition `code >= 0xC0 && code <= 0xF4` will incorrectly return true for valid, single-encoded strings that contain common non-ASCII characters like 'é', 'à', or 'ü'. This forces the `decodeString` getter to rely on a `try-catch` block for normal program flow with valid inputs, which is inefficient.
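The false positive is easy to reproduce. The sketch below uses Python as a stand-in for the Dart check (the `looks_double_encoded_broad` name and the test string are illustrative, not from the patch); the behavior is the same because both operate on code-unit values:

```python
# Hypothetical Python mirror of the overly broad Dart check.
def looks_double_encoded_broad(s: str) -> bool:
    # Flags any code unit in 0xC0-0xF4 as a suspected UTF-8 leading byte.
    return any(0xC0 <= ord(ch) <= 0xF4 for ch in s)

# 'é' is U+00E9, which falls inside 0xC0-0xF4, so this perfectly
# valid single-encoded string is flagged as double-encoded.
print(looks_double_encoded_broad("café"))  # True (false positive)
print(looks_double_encoded_broad("cafe"))  # False
```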
A more robust approach is to check for the specific pattern of a double-encoded character: a character that looks like a UTF-8 leading byte followed by one that looks like a UTF-8 continuation byte. This pattern is extremely unlikely to occur in correctly encoded text.
I suggest replacing this method with a more precise check to avoid these false positives.
```dart
bool _looksDoubleEncoded() {
  // A more robust check for double-encoding is to look for a potential UTF-8
  // leading byte (C2-F4) followed by a continuation byte (80-BF).
  // This pattern is very unlikely in correctly-encoded text.
  for (int i = 0; i < length - 1; i++) {
    final c1 = codeUnitAt(i);
    // Check for a potential multi-byte UTF-8 start character (excluding overlong C0/C1).
    if (c1 >= 0xC2 && c1 <= 0xF4) {
      final c2 = codeUnitAt(i + 1);
      // Check if it's followed by a continuation character.
      if (c2 >= 0x80 && c2 <= 0xBF) {
        return true;
      }
    }
  }
  return false;
}
```

References
- The current implementation forces a `try-catch` block for normal program flow, which is inefficient. Improving the detection logic will reduce the need for `try-catch` in non-exceptional cases, aligning with the principle of avoiding unnecessary code complexity for operations with a negligible chance of failure.
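To illustrate the suggested logic end to end, here is a Python sketch (a stand-in for the Dart code; the `looks_double_encoded` helper mirrors the suggested leading-byte-plus-continuation-byte check, and the mojibake string is constructed for the example) showing detection followed by the repair round-trip:

```python
# Python mirror of the suggested detection, plus the repair step.
# Scenario matches the review: UTF-8 bytes misread as Latin-1 and
# stored as individual code units in the string.

def looks_double_encoded(s: str) -> bool:
    """Flag a UTF-8 leading byte (0xC2-0xF4) followed by a continuation
    byte (0x80-0xBF) -- a pattern near-impossible in clean text."""
    for c1, c2 in zip(map(ord, s), map(ord, s[1:])):
        if 0xC2 <= c1 <= 0xF4 and 0x80 <= c2 <= 0xBF:
            return True
    return False

mojibake = "café".encode("utf-8").decode("latin-1")  # 'cafÃ©'
print(looks_double_encoded(mojibake))  # True: flagged for repair
print(looks_double_encoded("café"))    # False: no false positive now
# The repair: undo the Latin-1 misread, then decode as UTF-8.
print(mojibake.encode("latin-1").decode("utf-8"))  # 'café'
```

Note that the same check on a valid accented string returns false because a lone code point like U+00E9 is never followed by a continuation-range code unit in normal text.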