Enable lastname remapping - fixes #11
Names like "SUJAN MASTER", "JAMES J MA" and "PETER K MA" had their 'Master' or 'Ma' parts parsed as special parts (salutations, suffixes) when they should be lastnames. In "PAUL M LEWIS MR", the lastname should be 'Lewis Mr'.

This change does three things to fix this:

Firstly, it stops looking for salutations beyond the first half of the words in the given string. It also introduces a `setMaxSalutationIndex()` method to override this with a fixed maximum word index. E.g. setting it to 2 requires salutations to appear within the first two words.

Secondly, if the lastname mapper does not derive a lastname but has skipped ignored parts such as a suffix, nickname or salutation, it converts these into lastname parts.

Thirdly, the lastname mapper now maps more than one lastname part if the already mapped lastname parts are shorter than 3 characters and at least one part will be left after mapping. This effectively maps 'Lewis' in 'Paul M Lewis Mr' as a lastname, where it was previously mapped as a middlename.
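The salutation limit described in the first point can be illustrated with a minimal sketch. This is a Python illustration only, not the library's code: the names `SALUTATIONS`, `salutation_limit` and `find_salutations` are hypothetical, the salutation list is a tiny sample, and rounding the "first half" down for odd word counts is an assumption.

```python
# Hypothetical sketch of the salutation index limit; not the library's implementation.
SALUTATIONS = {"mr", "mrs", "ms", "master", "dr"}  # illustrative subset only


def salutation_limit(word_count, max_index=0):
    """Return how many leading words may be treated as salutations."""
    # A positive max_index stands in for the fixed override that the commit
    # exposes via setMaxSalutationIndex(); otherwise only the first half of
    # the words is eligible (rounded down here, which is an assumption).
    return max_index if max_index > 0 else word_count // 2


def find_salutations(name, max_index=0):
    """Return the word indexes in `name` that are recognised as salutations."""
    words = name.split()
    limit = salutation_limit(len(words), max_index)
    return [
        i
        for i, word in enumerate(words[:limit])
        if word.lower().rstrip(".") in SALUTATIONS
    ]
```

Under these assumptions, the trailing "MR" in "PAUL M LEWIS MR" and the "MASTER" in "SUJAN MASTER" fall outside the eligible range and stay available to the lastname mapper, while a leading "Mr" is still recognised.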
Showing 6 changed files with 153 additions and 8 deletions.