Further tests on non-ascii processing #32

Open
2 tasks
ageorgou opened this issue Feb 9, 2022 · 0 comments
Labels
data Ingestion and preprocessing of data

Comments

Contributor

ageorgou commented Feb 9, 2022

Some remaining things that were not checked when support for non-ASCII synonyms was introduced (from #27):

  • Make sure that the preprocessing does not remove any non-ASCII characters! (e.g. that Unicode sequences are understood correctly)
  • Test that other fields like cf.sort behave as before, i.e. don't use the new analyzer

See also #18 for a description of the original task.
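
For the first item, a minimal sketch of what such a check could look like. The names `example_preprocess` and `lossy_preprocess` are hypothetical stand-ins, not functions from this project; the real preprocessing step would be passed in instead. Both sides are NFC-normalized before comparing, on the assumption that composed and decomposed forms of the same character should count as equal.

```python
import unicodedata


def non_ascii_chars(text):
    """Return the non-ASCII characters of text, in order, after NFC normalization."""
    normalized = unicodedata.normalize("NFC", text)
    return [ch for ch in normalized if ord(ch) > 127]


def preserves_non_ascii(preprocess, sample):
    """True if preprocess keeps every non-ASCII character of sample."""
    return non_ascii_chars(preprocess(sample)) == non_ascii_chars(sample)


def example_preprocess(text):
    # Hypothetical stand-in for the real preprocessing: collapse whitespace only.
    return " ".join(text.split())


def lossy_preprocess(text):
    # Hypothetical example of the failure mode: silently drops non-ASCII characters.
    return text.encode("ascii", "ignore").decode("ascii")
```

A test could then run `preserves_non_ascii` over a handful of representative synonyms, including at least one written with combining characters (e.g. `"cafe\u0301"`), so that a step which strips or mangles Unicode sequences is caught.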

@ageorgou ageorgou added the data Ingestion and preprocessing of data label Feb 9, 2022