
1.4.0 is buggy when it comes to some dependency parsing tasks, however, 1.3.0 works correctly #1160

Open
apsyio opened this issue Dec 7, 2022 · 3 comments


apsyio commented Dec 7, 2022

I am using the dependency parser and noticed 1.4.0 has bugs that do not exist in 1.3.0. Here is an example:

If B is true and if C is false, perform D; else, perform E and perform F

In 1.3.0, 'else' is correctly detected as a child of the 'perform' coming after it; in 1.4.0, however, it is detected as a child of the 'perform' before it.
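For reference, a minimal sketch of how the heads can be inspected with Stanza's Python API (the processor list and output format below are my own choices, not taken from the report):

```python
import stanza

# Assumes the English models have already been downloaded with stanza.download("en")
nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse")

doc = nlp("If B is true and if C is false, perform D; else, perform E and perform F")
for sent in doc.sentences:
    for word in sent.words:
        # word.head is the 1-based index of the governing word (0 means root)
        head = sent.words[word.head - 1].text if word.head > 0 else "ROOT"
        print(f"{word.id}\t{word.text}\t{word.deprel}\t-> {head}")
```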

How can I force Stanza to load 1.3.0 instead of the latest version, so I can move forward with what I am doing now?

@apsyio apsyio added the bug label Dec 7, 2022
@apsyio apsyio changed the title I am using the dependency parser and noticed 1.4.0 has bugs that does do not exist in 1.3.0. Here is an examples: 1.4.0 is buggy when it comes to some dependency parsing tasks, however, 1.3.0 works correctly Dec 7, 2022
AngledLuffa (Collaborator) commented

Technically, you can just install an earlier version of Stanza; I'm not sure there's another great way to fix this. There are a couple of instances of "or else" in the EWT training data in which the "or else" has a head later in the sentence, but every other occurrence is like "everything else", where "else" depends on the previous word. You could suggest a couple of sentences with "else" used in a different way, and we can add those to the training data.
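To pin the earlier version, something like the following should work (a minimal sketch, assuming a pip-based install; the re-download step just makes sure the cached models match the pinned code):

```python
# In the shell, pin the library itself first:
#   pip install "stanza==1.3.0"
import stanza

# Re-fetch the English models; a 1.3.0 install should download resources
# matching that release by default.
stanza.download("en")
nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse")
```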

AngledLuffa (Collaborator) commented

I tried updating the dependency parser to use the pretrained character model rather than training its own, as the previous versions do. While it improved LAS from 87.8 to 88.5, it didn't help that particular sentence. If you'd be interested in brainstorming a couple of other examples of "else" used in this context, instead of the more common "anyone else", "somewhere else", etc., we can add those to the supplemental training data and build a new model next week.

AngledLuffa (Collaborator) commented

I trained a model using electra-large as the input embedding, and it gets 91.95 on the EWT test set. A significant improvement! It also gets this particular example correct. It's not the default model, because the transformer-based models are a lot more expensive in general, but you can easily load it with the package parameter when creating a Pipeline.
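For anyone who wants to try it, loading a non-default package looks roughly like this; the package identifier below is a placeholder (and whether it is passed as a plain string or a per-processor mapping may depend on the release), so check the published English resources for the actual name:

```python
import stanza

nlp = stanza.Pipeline(
    "en",
    processors="tokenize,pos,lemma,depparse",
    package="ewt_electra-large",  # placeholder name -- check the released resources
)

doc = nlp("If B is true and if C is false, perform D; else, perform E and perform F")
for word in doc.sentences[0].words:
    print(word.id, word.text, word.head, word.deprel)
```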
