SUTime output differs with Java version and stanford-corenlp version #1137

@kwalcock

Hi,

While running unit tests on some code that uses SUTime, we noticed that all tests passed with Java 8 but one failed with Java 11. Both runs used the same stanford-corenlp 3.9.2. We tracked the failure down to SUTime, which produces different output under the two Java versions. The example SUTimeDemo program from https://nlp.stanford.edu/software/sutime.shtml produces for the sentence

The Food and Agriculture Organization of the United Nations (FAO), the United Nations Children's Fund (UNICEF) and the World Food Programme (WFP) stressed that while the deteriorating situation coincides with an unusually long and harsh annual lean season, when families have depleted their food stocks and new harvests are not expected until August, the level of food insecurity this year is unprecedented.

output with Java 8 that includes the word "annual". The Java 11 output omits that entry. The Java 8 output is:

annual [from char offset 237 to 243] --> P1Y
August [from char offset 343 to 349] --> 2013-08
this year [from char offset 380 to 389] --> 2013
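
To make the discrepancy easy to check mechanically, a minimal sketch along the following lines could compare the output captured from each run. It uses only the Java standard library and does not call SUTime itself. The class name `SUTimeOutputDiff` and the helper `missingLines` are hypothetical, and the Java 11 list assumes the two remaining lines are unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of stanford-corenlp.
class SUTimeOutputDiff {

    /** Returns the lines found in expected but not in actual, in their original order. */
    static List<String> missingLines(List<String> expected, List<String> actual) {
        Set<String> present = new HashSet<>(actual);
        List<String> missing = new ArrayList<>();
        for (String line : expected) {
            if (!present.contains(line)) {
                missing.add(line);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Output captured from the Java 8 run, as listed above.
        List<String> java8 = Arrays.asList(
            "annual [from char offset 237 to 243] --> P1Y",
            "August [from char offset 343 to 349] --> 2013-08",
            "this year [from char offset 380 to 389] --> 2013");

        // Assumption: the Java 11 run differs only by the missing "annual" line.
        List<String> java11 = Arrays.asList(
            "August [from char offset 343 to 349] --> 2013-08",
            "this year [from char offset 380 to 389] --> 2013");

        // Record which runtime performed the comparison.
        System.out.println("Running on Java " + System.getProperty("java.version"));
        System.out.println("Missing under Java 11: " + missingLines(java8, java11));
    }
}
```

In a real test, the two lists would be filled from the SUTimeDemo output of each run rather than hard-coded.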

For what it is worth, if the same test is run using stanford-corenlp 4.2.0, "annual" appears in the output of both runs. However, we are reluctant to switch to the newer version because of the other changes it brings.

Does anyone know the reason for this behavior?  Is there something we can backport to the older version so that we can get consistent output?

Thank you,
