Searches with non-ASCII characters like umlauts or accented characters are always case-sensitive #386

Closed
Archaeopteryx opened this Issue Oct 22, 2014 · 2 comments

pascalchevrel Oct 22, 2014

Member

I looked into it, and one solution suggested on Stack Overflow is to normalize both the input and the data with the Transliterator extension. I'm leaving the pointers here for reference:
http://stackoverflow.com/questions/14072333/php-preg-grep-and-umlaut-accent
http://php.net/class.transliterator

Normalizing the input may be a solution but would need some big changes to how we store and retrieve data on all views, so I don't think I can provide a patch in the short term.
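To make the normalization idea concrete, here is a rough sketch of what it could look like. It assumes the intl extension is installed; the helper name `normalizeForSearch` and the rule string are illustrative choices, not anything taken from Transvision or from the linked answer.

```php
<?php
// Hypothetical helper: reduce a string to a lowercase, accent-free form
// so that two strings differing only in case or diacritics compare equal.
// Requires the intl extension (transliterator_transliterate).
function normalizeForSearch(string $text): string
{
    // The rule string is an assumption: convert to Latin script,
    // strip diacritics to ASCII, then lowercase.
    $normalized = transliterator_transliterate('Any-Latin; Latin-ASCII; Lower()', $text);

    // transliterator_transliterate returns false on failure; fall back to the input.
    return $normalized === false ? $text : $normalized;
}

// Both the query and the stored string go through the same normalization
// before comparison.
$query  = normalizeForSearch('ÄRGER');
$stored = normalizeForSearch('Kein Ärger');
var_dump(strpos($stored, $query) !== false);
```

Note that the normalized strings are only useful for matching. Positions in them do not necessarily map back to the original text, which is part of why this approach would touch both storage and display.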

flodolo added a commit to flodolo/transvision that referenced this issue Feb 2, 2015

@flodolo flodolo self-assigned this Feb 2, 2015

flodolo Feb 2, 2015

Contributor

I'm trying to wrap my head around this. Adding the /u modifier to the regexps definitely fixes the search, but then str_replace fails to highlight the results, which probably means we're mixing up charsets.

Working on this for a bit.
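For reference, a minimal sketch of the behavior described above, assuming the source file and all strings are UTF-8. The function `highlightUtf8` and the `<mark>` wrapper are hypothetical and are not Transvision's actual `markString` or `highlightString`. The sketch also assumes a match has the same number of characters as the search term.

```php
<?php
// Without /u, PCRE compares bytes and only folds case for ASCII letters,
// so "ä" (0xC3 0xA4) never matches "Ä" (0xC3 0x84).
var_dump(preg_match('/ärger/i', 'Ärger'));   // int(0)

// With /u, pattern and subject are treated as UTF-8 and /i folds non-ASCII letters too.
var_dump(preg_match('/ärger/iu', 'Ärger'));  // int(1)

// str_replace is case-sensitive, so a result found case-insensitively is not highlighted.
var_dump(str_replace('ärger', '<mark>ärger</mark>', 'Kein Ärger') === 'Kein Ärger'); // bool(true)

// Hypothetical highlighter built on multibyte functions: wraps every
// case-insensitive occurrence of $needle while keeping the original casing.
function highlightUtf8(string $needle, string $haystack): string
{
    $needleLength = mb_strlen($needle, 'UTF-8');
    if ($needleLength === 0) {
        return $haystack;
    }

    $output = '';
    $offset = 0; // position in characters, not bytes
    while (($pos = mb_stripos($haystack, $needle, $offset, 'UTF-8')) !== false) {
        // Copy the text before the match, then the match itself wrapped in <mark>.
        $output .= mb_substr($haystack, $offset, $pos - $offset, 'UTF-8');
        $output .= '<mark>' . mb_substr($haystack, $pos, $needleLength, 'UTF-8') . '</mark>';
        $offset = $pos + $needleLength;
    }

    // Append whatever remains after the last match.
    return $output . mb_substr($haystack, $offset, null, 'UTF-8');
}

echo highlightUtf8('ärger', 'Kein Ärger'), "\n"; // Kein <mark>Ärger</mark>
```

The third call suggests the highlighting failure can come from case sensitivity alone, even when both strings share the same charset.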

flodolo added a commit to flodolo/transvision that referenced this issue Feb 2, 2015

flodolo added a commit to flodolo/transvision that referenced this issue Feb 2, 2015

Add /u to regexp to fix utf-8 characters searches, use multibyte functions to highlight

Also added tests for markString and highlightString.
Fixes issue #386

flodolo added a commit to flodolo/transvision that referenced this issue Feb 2, 2015

Add /u to regexps to fix searches with uppercase/lowercase utf-8 characters, use multibyte functions to highlight

Also added tests for markString and highlightString.
Fixes issue #386