Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor Bug in resolving full name if suffix matches a salutation #11

Closed
VinceG opened this issue Aug 28, 2018 · 4 comments
Closed

Minor Bug in resolving full name if suffix matches a salutation #11

VinceG opened this issue Aug 28, 2018 · 4 comments

Comments

@VinceG
Copy link

VinceG commented Aug 28, 2018

The following Name: "PAUL M LEWIS MR"

returns the following

Array
(
    [0] => PAUL
    [1] => TheIconic\NameParser\Part\Initial Object
        (
            [value:protected] => M
        )

    [2] => LEWIS
    [3] => TheIconic\NameParser\Part\Salutation Object
        (
            [normalized:protected] => Mr.
            [value:protected] => MR
        )

)

making this break and getFirstname and getLastName returns nothing.

the issue seems to be the following line
https://github.com/theiconic/name-parser/blob/master/src/Language/English.php#L41

this probably going to happen with any of the salutations if they come at the end.

Edit:

Another Example:

"SUJAN MASTER"
"JAMES J MA"
"PETER K MA"

@VinceG VinceG changed the title Minor Bug in resolving suffix if it matches a prefix Minor Bug in resolving full name if suffix matches a salutation Aug 28, 2018
@wyrfel
Copy link
Contributor

wyrfel commented Aug 28, 2018

Hi @VinceG . Interesting. Yes, in it's current form the name parser expects the title up-front. The assumption is that if it's in the end, it would be separated by a comma. Do you have actual cases where the title is entered in the end without comma-separation? Theoretically (or rather from the mappers perspective) this should be possible to fix, but in the overall parse order scheme it could complicate things quite a bit as some of the mappers have to make assumptions about positioning.

@wyrfel wyrfel closed this as completed in f7f4888 Aug 28, 2018
wyrfel added a commit that referenced this issue Aug 28, 2018
Allow having salutations in last position without comma-separation (fixes #11)
@VinceG
Copy link
Author

VinceG commented Aug 28, 2018

@wyrfel Thank You for the quick reply and fix.

This seems to fix the issue where the last word matches a salutation in the first case. However this does not seem to fix the other issues

Here are our test cases

/** @test */
    public function it_can_parse_saluation_at_end()
    {
        $parsed = (new NameParser)->parse('PAUL M LEWIS MR');
        
        $this->assertEquals('Paul', $parsed->getFirstName());
        $this->assertEquals('Lewis', $parsed->getLastName());
    }

    /** @test */
    public function it_can_parse_saluation_lastname()
    {
        $parsed = (new NameParser)->parse('SUJAN MASTER');

        $this->assertEquals('Sujan', $parsed->getFirstName());
        $this->assertEquals('Master', $parsed->getLastName());
    }

    /** @test */
    public function it_can_parse_saluation_lastname_with_middlename()
    {
        $parsed = (new NameParser)->parse('JAMES J MA');
        
        $this->assertEquals('James', $parsed->getFirstName());
        $this->assertEquals('Ma', $parsed->getLastName());
    }

the first one passes the second and third fail. It's worth noting that "PAUL M LEWIS MR" the MR at the end is not the salutation but rather part of his last name the way they write it.

1) Tests\Unit\ExampleTest::it_can_parse_saluation_lastname
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'Sujan'
+''

/Users/vincentgabriel/Sites/fhalicenses/tests/Unit/ExampleTest.php:33

2) Tests\Unit\ExampleTest::it_can_parse_saluation_lastname_with_middlename
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'Ma'
+''
TheIconic\NameParser\Name Object
(
    [parts:protected] => Array
        (
            [0] => TheIconic\NameParser\Part\Lastname Object
                (
                    [value:protected] => SUJAN
                )

            [1] => TheIconic\NameParser\Part\Salutation Object
                (
                    [normalized:protected] => Mr.
                    [value:protected] => MASTER
                )

        )

)

TheIconic\NameParser\Name Object
(
    [parts:protected] => Array
        (
            [0] => TheIconic\NameParser\Part\Firstname Object
                (
                    [value:protected] => JAMES
                )

            [1] => TheIconic\NameParser\Part\Initial Object
                (
                    [value:protected] => J
                )

            [2] => TheIconic\NameParser\Part\Suffix Object
                (
                    [normalized:protected] => MA
                    [value:protected] => MA
                )

        )

)

"SUJAN MASTER"
"JAMES J MA"
"PETER K MA"

The above all fail since the last name is considered a salutation even though it's actually the last name. I am not entirely sure how to properly fix this without having breaking changes. One solution that came to mind is checking the number of parts, if we have 2 parts that is should be considered as first and lastname but then it might cause problems when the name has any other combination of two words such as salutation firstname or middle last name etc..

Thanks again for the help.

wyrfel added a commit that referenced this issue Sep 16, 2018
Names like "SUJAN MASTER", "JAMES J MA", "PETER K MA" had the 'Master'
or 'Ma' parts as special parts (salutations, suffixes) where they
should be lastnames. In "PAUL M LEWIS MR", the lastname should be
'Lewis Mr'.

This change does three things to fix this:
Firstly, it prevents parsing for salutations beyond the first half of words in
the given string. It also introduces a `setMaxSalutationIndex()` method
to allow overriding this with a fixed maximum word index. E.g. setting
it to 2 will require salutations to appear in the first two words.

Secondly, if the lastname mapper does not derive a lastname, but has skipped
ignored parts like suffix, nickname or salutation, it will convert these into
lastname parts.

Thirdly, the lastname mapper will now map more than one lastname part if
the already mapped lastname parts are shorter than 3 characters and there
will be at least one part left after mapping. This effectively maps
'Lewis' in 'Paul M Lewis Mr' as lastname instead of previously as middlename.
wyrfel added a commit that referenced this issue Sep 16, 2018
Names like "SUJAN MASTER", "JAMES J MA", "PETER K MA" had the 'Master'
or 'Ma' parts as special parts (salutations, suffixes) where they
should be lastnames. In "PAUL M LEWIS MR", the lastname should be
'Lewis Mr'.

This change does three things to fix this:
Firstly, it prevents parsing for salutations beyond the first half of words in
the given string. It also introduces a `setMaxSalutationIndex()` method
to allow overriding this with a fixed maximum word index. E.g. setting
it to 2 will require salutations to appear in the first two words.

Secondly, if the lastname mapper does not derive a lastname, but has skipped
ignored parts like suffix, nickname or salutation, it will convert these into
lastname parts.

Thirdly, the lastname mapper will now map more than one lastname part if
the already mapped lastname parts are shorter than 3 characters and there
will be at least one part left after mapping. This effectively maps
'Lewis' in 'Paul M Lewis Mr' as lastname instead of previously as middlename.
wyrfel added a commit that referenced this issue Sep 16, 2018
@wyrfel
Copy link
Contributor

wyrfel commented Sep 16, 2018

@VinceG All these cases should now be working in version 1.2.6.
I had misunderstood the issue originally, thank you for the clarifications.
I hope this solves the problem more generally for you.
Please see #16 for details.

@VinceG
Copy link
Author

VinceG commented Sep 16, 2018

@wyrfel Thank You. I'll give this a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants