Source layer: unicode boundaries #37

Open
conartist6 opened this issue Jun 4, 2023 · 0 comments
Labels
enhancement New feature or request

In a semantically layered architecture, the source layer should be responsible for matters of Unicode. These matters are computable at the character level: they require no grammar, only access to a Unicode database. There is significant value in making the application of the Unicode database to the text explicit. The text can then be passed between different runtimes, and it will be evident whether anything has been lost in translation, for example when one parser sees a grapheme cluster where another does not. This kind of mismatch is possible even between versions of the cst-tokens core, since each core will have one and only one Unicode database, and upgrades to that database are necessarily somewhat breaking.
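
As a rough illustration of what "making the application of the Unicode database explicit" could look like, here is a minimal Python sketch. It is not the cst-tokens design: `AnnotatedSource`, `annotate`, and `check` are hypothetical names invented for this example, and it only records the database version and flags code points the consumer's database does not know. It does not attempt real grapheme cluster segmentation.

```python
import unicodedata
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AnnotatedSource:
    """Hypothetical container: text plus the Unicode database version applied to it."""
    text: str
    unicode_version: str


def annotate(text: str) -> AnnotatedSource:
    # Record which Unicode database the producing runtime used.
    return AnnotatedSource(text, unicodedata.unidata_version)


def check(source: AnnotatedSource) -> List[str]:
    """Return a list of reasons the consumer may interpret the text differently."""
    problems = []
    local = unicodedata.unidata_version
    if source.unicode_version != local:
        problems.append(
            f"Unicode version mismatch: producer {source.unicode_version}, consumer {local}"
        )
    # Category "Cn" means the local database has no assignment for the code point
    # (this also covers permanent noncharacters such as U+10FFFF).
    unknown = sorted({ch for ch in source.text if unicodedata.category(ch) == "Cn"})
    for ch in unknown:
        problems.append(f"U+{ord(ch):04X} is not assigned in the local database")
    return problems
```

A consumer would call `check` on whatever it receives and refuse, or at least warn, when the returned list is non-empty. A fuller version would presumably also carry the computed boundaries themselves so that two runtimes could compare them directly.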

conartist6 added the enhancement (New feature or request) label Jun 17, 2023