Do not write any escaped surrogate pairs to inference JSONs #888

Merged
Enkidu93 merged 2 commits into main from no_surrogate_pair_passing on Mar 9, 2026

Conversation

@Enkidu93 (Collaborator) commented Mar 9, 2026

Fixes sillsdev/machine.py#276

This is a follow-up to #744. Unbeknownst to me at the time, the solution in that PR only works as expected for characters within the Basic Multilingual Plane (BMP) 🫤. We recently received some text written in classical Chinese characters that lie outside the BMP.
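
For readers unfamiliar with the term, here is a minimal Python sketch of what an "escaped surrogate pair" looks like in JSON. It is not the code changed in this PR; it only uses the standard `json` module to show the difference between BMP and non-BMP characters. The helper names (`dump_escaped`, `dump_literal`, `has_escaped_surrogate`) and the sample characters are assumptions chosen for illustration.

```python
import json
import re

# U+4E2D lies inside the BMP; U+20000 (CJK Extension B) lies outside it.
BMP_CHAR = "\u4e2d"
NON_BMP_CHAR = "\U00020000"


def dump_escaped(text: str) -> str:
    """Serialize to ASCII-only JSON, escaping every non-ASCII character."""
    return json.dumps(text, ensure_ascii=True)


def dump_literal(text: str) -> str:
    """Serialize to JSON, keeping non-ASCII characters as literal characters."""
    return json.dumps(text, ensure_ascii=False)


def has_escaped_surrogate(serialized: str) -> bool:
    """Rough check for an escape in the surrogate range D800-DFFF."""
    return re.search(r"\\u[dD][89a-fA-F][0-9a-fA-F]{2}", serialized) is not None


# A BMP character needs a single escape; a non-BMP character needs two.
print(dump_escaped(BMP_CHAR))      # "\u4e2d"
print(dump_escaped(NON_BMP_CHAR))  # "\ud840\udc00"

# Writing the character literally avoids surrogate escapes altogether.
print(has_escaped_surrogate(dump_escaped(NON_BMP_CHAR)))  # True
print(has_escaped_surrogate(dump_literal(NON_BMP_CHAR)))  # False
```

Both forms are valid JSON and decode to the same string with a conforming parser, so the escaped form only becomes a problem when a downstream consumer does not recombine the two escapes into one character.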


@Enkidu93 requested a review from ddaspit March 9, 2026 16:46
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.01%. Comparing base (bbb6e5d) to head (53d0d8a).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #888   +/-   ##
=======================================
  Coverage   67.01%   67.01%           
=======================================
  Files         384      384           
  Lines       21036    21036           
  Branches     2734     2734           
=======================================
  Hits        14098    14098           
  Misses       5962     5962           
  Partials      976      976           

@ddaspit (Contributor) left a comment

:lgtm:

@ddaspit reviewed 4 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

@Enkidu93 (Collaborator, Author) commented Mar 9, 2026

Do you think this is worthy of a hotfix, @ddaspit?

@Enkidu93 merged commit 97b5367 into main Mar 9, 2026 (2 checks passed)
@Enkidu93 deleted the no_surrogate_pair_passing branch March 9, 2026 20:39

Development

Successfully merging this pull request may close these issues.

TypeError with TextEncodeInput

3 participants