Anaphora resolution? #116

Open
chrisspen opened this issue Jul 1, 2017 · 3 comments

Comments

@chrisspen

This is more of a general question or feature request.

The textacy.extract.subject_verb_object_triples() function is really interesting and useful, but I notice that for a lot of texts it ends up returning triples with pronouns as the subject or object. For most NLP tasks, these anaphora need to be resolved to one of the discrete nouns seen earlier. Is there anything in textacy to accomplish this?

A naive approach would be to iterate over the results, track the last non-anaphoric entity, and replace all subsequent anaphora with that entity. This will miss cases where the anaphora refers to the object or verb, but it's better than nothing.
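To make the naive approach concrete, here is a minimal sketch. It is not textacy code: the helper name `resolve_pronouns_naively`, the small `PRONOUNS` set, and the representation of triples as plain `(subject, verb, object)` string tuples are all assumptions made for illustration (textacy itself yields spaCy objects, not strings). It also reads "the last non-anaphoric entity" as the last non-pronoun subject, which is one possible interpretation.

```python
# Hypothetical helper, not part of textacy. Triples are assumed to be
# plain (subject, verb, object) string tuples, in document order.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def resolve_pronouns_naively(triples):
    """Replace pronoun subjects/objects with the last non-pronoun subject seen."""
    last_subject = None
    resolved = []
    for subj, verb, obj in triples:
        if subj.lower() in PRONOUNS:
            # Substitute only if an earlier non-pronoun subject exists.
            subj = last_subject or subj
        else:
            last_subject = subj
        if obj.lower() in PRONOUNS:
            obj = last_subject or obj
        resolved.append((subj, verb, obj))
    return resolved


triples = [("Alice", "wrote", "code"), ("She", "tested", "it")]
print(resolve_pronouns_naively(triples))
# [('Alice', 'wrote', 'code'), ('Alice', 'tested', 'Alice')]
```

Note that "it" is wrongly mapped to "Alice" rather than "code", which is exactly the kind of miss described above when the anaphora refers to an earlier object.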

@bdewilde
Collaborator

bdewilde commented Jul 2, 2017

Hey @chrisspen, thanks for the feature request. I feel your pain... I've actually tried the "naive approach" you mentioned, but found its results too poor to include in textacy. And doing anaphora resolution well is sufficiently hard that I never got around to tackling it.

So, I'll add this back into my backlog. It would be a very useful thing to have! If you have any ideas / resources, don't hesitate to post here.

@bdewilde
Collaborator

bdewilde commented Jul 7, 2017

Good news: relevant code built on spaCy was recently open-sourced. It's on my to-read list...
https://medium.com/huggingface/state-of-the-art-neural-coreference-resolution-for-chatbots-3302365dcf30

@chartbeat-labs chartbeat-labs locked and limited conversation to collaborators Jul 7, 2017
@chartbeat-labs chartbeat-labs unlocked this conversation Jul 7, 2017
@cpetroaca

I'm also interested in this. Is there a plan to integrate it into the subject_verb_object_triples() functionality?
