Minor Bug in resolving full name if suffix matches a salutation #11
Comments
Hi @VinceG. Interesting. Yes, in its current form the name parser expects the title up front. The assumption is that a title at the end would be separated by a comma. Do you have actual cases where the title is entered at the end without comma separation? Theoretically (or rather, from the mappers' perspective) this should be possible to fix, but in the overall parse order it could complicate things quite a bit, as some of the mappers have to make assumptions about positioning.
Allow having salutations in last position without comma-separation (fixes #11)
@wyrfel Thank you for the quick reply and fix. This seems to fix the first case, where the last word matches a salutation. However, it does not seem to fix the other cases. Here are our test cases:
The first one passes; the second and third fail. It's worth noting that in "PAUL M LEWIS MR" the MR at the end is not a salutation but part of his last name, the way they write it.
"SUJAN MASTER" The above all fail because the last word is treated as a salutation even though it is actually the last name. I am not entirely sure how to fix this properly without breaking changes. One solution that came to mind is checking the number of parts: if there are two parts, they would be treated as first name and last name. But that might cause problems when the name is some other combination of two words, such as salutation plus first name, or middle name plus last name. Thanks again for the help.
Names like "SUJAN MASTER", "JAMES J MA", "PETER K MA" had the 'Master' or 'Ma' parts as special parts (salutations, suffixes) where they should be lastnames. In "PAUL M LEWIS MR", the lastname should be 'Lewis Mr'. This change does three things to fix this:

Firstly, it prevents parsing for salutations beyond the first half of words in the given string. It also introduces a `setMaxSalutationIndex()` method to allow overriding this with a fixed maximum word index. E.g. setting it to 2 will require salutations to appear in the first two words.

Secondly, if the lastname mapper does not derive a lastname, but has skipped ignored parts like suffix, nickname or salutation, it will convert these into lastname parts.

Thirdly, the lastname mapper will now map more than one lastname part if the already mapped lastname parts are shorter than 3 characters and there will be at least one part left after mapping. This effectively maps 'Lewis' in 'Paul M Lewis Mr' as lastname instead of previously as middlename.
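For readers following along, the first of the three changes in the commit message can be sketched roughly as below. This is a standalone Python illustration, not the library's PHP code: the `SALUTATIONS` set, the `salutation_indexes` function and its `max_index` parameter are hypothetical stand-ins (`max_index` plays the role of the value passed to `setMaxSalutationIndex()`), and rounding the "first half" down for odd word counts is an assumption.

```python
# Illustrative subset only; the library ships its own salutation list.
SALUTATIONS = {"mr", "mrs", "ms", "dr", "master"}


def salutation_indexes(name, max_index=0):
    """Return the word positions that may be treated as salutations.

    max_index > 0 mimics a fixed maximum word index; otherwise only the
    first half of the words is scanned (assumption: rounded down).
    """
    parts = name.split()
    limit = max_index if max_index > 0 else len(parts) // 2
    return [
        i
        for i, part in enumerate(parts[:limit])
        # Compare case-insensitively and ignore a trailing period ("Mr.").
        if part.lower().rstrip(".") in SALUTATIONS
    ]


if __name__ == "__main__":
    for sample in ("SUJAN MASTER", "PAUL M LEWIS MR", "MR PAUL M LEWIS"):
        print(sample, salutation_indexes(sample))
```

Under these assumptions, "SUJAN MASTER" and "PAUL M LEWIS MR" produce no salutation candidates because the matching word sits in the second half, while "MR PAUL M LEWIS" still reports position 0.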
@wyrfel Thank you. I'll give this a try.
The following name: "PAUL M LEWIS MR"
returns the following
which breaks parsing: getFirstname and getLastName return nothing.
The issue seems to be the following line:
https://github.com/theiconic/name-parser/blob/master/src/Language/English.php#L41
This is probably going to happen with any of the salutations if they come at the end.
Edit:
More examples:
"SUJAN MASTER"
"JAMES J MA"
"PETER K MA"