people() seems to get confused by commas #1111

Open
sandro-pasquali opened this issue Jun 3, 2024 · 1 comment

@sandro-pasquali

Great library!

Loving the various entity extraction utilities. They work great. One I use is people(). However, it seems to be unable to separate a list of comma-separated names into individual names, at least in this case.

This is what I'm seeing [ Node.js 22, macOS, "compromise": "^14.13.0" ]:

import Nlp from 'compromise';

const text = `The NAACP’s founding members included white progressives Mary White Ovington, Henry Moskowitz, William English Walling and Oswald Garrison Villard, along with such African Americans as W.E.B. Du Bois, Ida B. Wells, Archibald Grimke and Mary Church Terrell.`;

const processed = Nlp(text);
console.log(processed.people().out('array'));

// [
//     'Mary White Ovington, Henry Moskowitz, William English Walling',
//     'Oswald Garrison Villard,',
//     'Ida B. Wells, Archibald Grimke',
//     'Mary Church Terrell.'
// ]

As a side note, you can also see it isn't catching W.E.B. Du Bois, but that seems like a complex pattern, and I'm guessing the best approach there would be to add it to a custom lexicon.
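
As a stopgap, the comma-joined matches above could be split after the fact. This is only a sketch of a post-processing step, not how compromise handles it internally: `splitNames` is a hypothetical helper, and the input array is copied from the output reported above. It assumes names never contain commas, and it strips trailing punctuation, so a name ending in an abbreviation such as "Jr." would lose its final period.

```javascript
// Hypothetical helper (not part of compromise): split comma-joined
// matches into individual names and trim stray punctuation.
function splitNames(matches) {
  return matches
    // One match may hold several names separated by commas.
    .flatMap((match) => match.split(','))
    // Drop surrounding whitespace and trailing punctuation such as "," or ".".
    .map((name) => name.trim().replace(/[.,;:!?]+$/, ''))
    // A trailing comma leaves an empty piece behind; discard it.
    .filter((name) => name.length > 0);
}

// The array reported above from people().out('array').
const reported = [
  'Mary White Ovington, Henry Moskowitz, William English Walling',
  'Oswald Garrison Villard,',
  'Ida B. Wells, Archibald Grimke',
  'Mary Church Terrell.',
];

const names = splitNames(reported);
console.log(names);
```

This only repairs the grouping of names that were already matched; W.E.B. Du Bois is still missing because it never appears in the input array.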

Thanks again for compromise!

@spencermountain
Owner

hey Sandro - good catch!
Happy to fix this for the next release.
cheers
