@yedayak yedayak commented Sep 28, 2025

Some of the files have Unicode characters that can be represented in multiple ways, so we normalize them. Specifically, 'e\u0301' ('e' + COMBINING ACUTE ACCENT) doesn't compare equal to '\xe9' (LATIN SMALL LETTER E WITH ACUTE) unless both strings are normalized to the same form.
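To illustrate the mismatch described above, here is a minimal Python sketch using the standard library's `unicodedata.normalize`. It is not the code from this PR: the helper name `equal_normalized` and the choice of NFC as the default form are assumptions made for the example.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\xe9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE


def equal_normalized(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same Unicode form.

    Hypothetical helper for illustration; NFC is an assumed default.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The raw strings differ: one is two code points, the other is one.
print(decomposed == precomposed)                  # False
print(len(decomposed), len(precomposed))          # 2 1

# After normalization to a common form they compare equal.
print(equal_normalized(decomposed, precomposed))  # True
```

Either NFC or NFD works for the comparison as long as both sides use the same form.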

Cherry-picked from #1339


@akinomyoga akinomyoga left a comment

Ah, yeah. I recently read an article about the problems caused by the UTF-8-MAC/NFD conversions performed by macOS filesystems.

@yedayak yedayak merged commit bf93a39 into scop:main Oct 4, 2025
6 of 7 checks passed
@yedayak yedayak mentioned this pull request Oct 9, 2025
4 tasks