Use Unicode char ranges when processing names
Fixes #13. Use \p{...} Unicode char ranges when processing names, so that names containing characters outside [A-Z] are no longer split incorrectly. Also moved the [A-Z] regexp that splits initials (e.g. "AJ Bower") above the line that lowercases them. It previously ran after lowercasing, so initials were never split (a bug more than 6 years old).
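A minimal sketch of the two fixes described in this message, written in Python rather than the project's own language. Every function name here is hypothetical, and `str.isalpha()` stands in for a `\p{L}`-style class because Python's standard `re` module does not support `\p{...}`. It is an illustration under those assumptions, not the code from this commit.

```python
import re

# ASCII-only pattern: the kind of [A-Z]-style class the commit replaces.
ASCII_WORD = re.compile(r"[A-Za-z]+")


def split_ascii(name):
    # Non-ASCII letters are not matched, so they break a name apart.
    return ASCII_WORD.findall(name)


def split_unicode(name):
    # Group runs of Unicode letters; str.isalpha() is true for the
    # general categories Lu, Ll, Lt, Lm and Lo.
    tokens, current = [], []
    for ch in name:
        if ch.isalpha():
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def initials_then_lowercase(token):
    # Fixed order: detect an all-uppercase run first, then lowercase.
    # (Simplification: any all-caps token is treated as initials.)
    if token.isupper():
        return [ch.lower() for ch in token]
    return [token.lower()]


def lowercase_then_initials(token):
    # Old order: after lower(), the uppercase test can never succeed.
    token = token.lower()
    if token.isupper():
        return list(token)
    return [token]


def process(name, normalize):
    return [part for tok in split_unicode(name) for part in normalize(tok)]
```

With these definitions, `split_ascii("\u00c9mile Zola")` drops the leading `É` while `split_unicode` keeps the name whole, and `process("AJ Bower", initials_then_lowercase)` separates the initials where `lowercase_then_initials` leaves them joined.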
Is that the right regex at line 230? `/\p{Lowercase}/i` becomes `/\p{Cased}/`. I believe `\p{Letter}` would make more sense, especially since I doubt there'd be many names with U+2170 Small Roman Numeral One in them (which matches `\p{Lowercase}`).
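To make the U+2170 point concrete, here is a small Python sketch that inspects the character with the standard `unicodedata` module. The `describe` helper is hypothetical, and `str.islower()` is only Python's nearest built-in to `\p{Lowercase}`, so treat its result as indicative rather than as what the project's regex engine reports.

```python
import unicodedata


def describe(ch):
    # Report the properties relevant to choosing Letter over Lowercase.
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # general category code
        "is_letter": ch.isalpha(),              # true only for L* categories
        "is_lowercase": ch.islower(),           # stand-in for \p{Lowercase}
    }


roman = describe("\u2170")  # SMALL ROMAN NUMERAL ONE
plain = describe("a")       # LATIN SMALL LETTER A

for info in (roman, plain):
    print(info)
```

U+2170 has general category `Nl` (letter number), not one of the `L*` categories, so a `\p{Letter}`-style test rejects it while still accepting ordinary letters such as `a`.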
Don't forget the apostrophe.
Is author search with wildcards possible? I don't think so.