Brackets (punct) are not properly tagged to its heads in show tables (english-ewt-ud-2.12-230717) #175
Comments
I confirm that the right brackets following FDA and NDA are attached to the wrong parent (i.e. not to FDA and NDA, respectively) when this very long sentence is parsed with english-ewt-ud-2.12-230717. You can use `udapy -s ud.FixPunct < in.conllu > out.conllu` to fix it. However, the output in "Show Trees" is exactly the same as in "Show Table" (and as the CoNLL-U in "Output Text"), so there is no bug in UDPipe. These GitHub issues are for reporting bugs in the software. You cannot expect 100% parsing accuracy from all models. BTW: When using e.g. the english-gum-ud-2.12-230717 model, the brackets enclosing FDA and NDA are attached correctly. This suggests GUM is better training data than EWT in this respect. Indeed, when applying
Thanks @martinpopel for your detailed answer 😊 @Shasetty UDPipe is a statistical tool, so its performance depends both on (a) its ability to train effectively on the UD training data and to generalize correctly to user inputs, and (b) the correctness of the training data. It is expected to make errors, and we cannot easily fix them one by one (so there is little use in reporting them to us); but you can definitely try improving the training data in the repository @martinpopel suggested.
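To see which brackets are affected before running `ud.FixPunct`, one option is to scan the CoNLL-U output directly. The following is a minimal sketch using only the standard library. It relies on a heuristic (an assumption, not the logic of `ud.FixPunct`): a closing bracket should have its head somewhere between the matching opening bracket and itself. The function names and the four-token `SAMPLE` are made up for illustration and are not actual UDPipe output.

```python
def read_sentence(conllu):
    """Return {token id: (form, head id, deprel)} for one CoNLL-U sentence."""
    tokens = {}
    for line in conllu.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        tokens[int(cols[0])] = (cols[1], int(cols[6]), cols[7])
    return tokens


def misattached_brackets(tokens):
    """List (id, head form) for each ")" whose head lies outside its bracket pair."""
    problems = []
    open_stack = []
    for tid in sorted(tokens):
        form, head, _deprel = tokens[tid]
        if form == "(":
            open_stack.append(tid)
        elif form == ")" and open_stack:
            start = open_stack.pop()
            if not (start < head < tid):
                head_form = tokens[head][0] if head else "ROOT"
                problems.append((tid, head_form))
    return problems


# Hypothetical toy fragment: ")" hangs on "Administration" instead of "FDA".
_ROWS = [
    (1, "Administration", 0, "root"),
    (2, "(", 3, "punct"),
    (3, "FDA", 1, "appos"),
    (4, ")", 1, "punct"),
]
SAMPLE = "\n".join(
    "\t".join([str(i), form, "_", "_", "_", "_", str(head), deprel, "_", "_"])
    for i, form, head, deprel in _ROWS
)

if __name__ == "__main__":
    for tid, head_form in misattached_brackets(read_sentence(SAMPLE)):
        print(f"token {tid}: ')' is attached to '{head_form}'")
```

Running the same check on the file before and after `ud.FixPunct` gives a quick indication of whether the bracket attachments changed.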
Text:
MALVERN, Pa., Aug. 09, 2023 (GLOBE NEWSWIRE) -- Galera Therapeutics, Inc. (Nasdaq: GRTX), a clinical-stage biopharmaceutical company focused on developing and commercializing a pipeline of novel, proprietary therapeutics that have the potential to transform radiotherapy in cancer, today announced that it has received a Complete Response Letter (CRL) from the U.S.Food and Drug Administration (FDA) regarding the Company’s New Drug Application (NDA) for avasopasem manganese (avasopasem) for radiotherapy-induced severe oral mucositis (SOM) in patients with head and neck cancer undergoing standard-of-care treatment.
Correct output in "Show Trees".
Wrong output in "Show Table" and "Output Text": (FDA), (NDA)
https://lindat.mff.cuni.cz/services/udpipe/
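To compare models on the same sentence outside the web form, the service can also be called as a REST API. The sketch below assumes the `api/process` endpoint and the `data`, `model`, `tokenizer`, `tagger` and `parser` parameters as I understand them from the service's API reference; please check them there before relying on this. The helper names `build_request` and `parse` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the LINDAT UDPipe web service.
API_URL = "https://lindat.mff.cuni.cz/services/udpipe/api/process"


def build_request(text, model):
    """Build a POST request asking for tokenization, tagging and parsing."""
    params = {
        "data": text,
        "model": model,
        "tokenizer": "",  # empty value = run the component with default options
        "tagger": "",
        "parser": "",
    }
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(API_URL, data=body)


def parse(text, model):
    """Send the request and return the CoNLL-U text from the JSON response."""
    with urllib.request.urlopen(build_request(text, model)) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["result"]


if __name__ == "__main__":
    sentence = "It received a letter from the Food and Drug Administration (FDA)."
    for model in ("english-ewt-ud-2.12-230717", "english-gum-ud-2.12-230717"):
        print(f"# model = {model}")
        print(parse(sentence, model))
```

The two CoNLL-U outputs can then be compared in the HEAD column for the bracket tokens.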