Date parsing has changed between Java 8 and Java 11, as @yaauie described in another issue:
From JDK-8211750:
JDK 11 is updated to use the CLDR v33 (refer release notes: https://www.oracle.com/technetwork/java/javase/11-relnote-issues-5012449.html ➜ Updated Locale Data to Unicode CLDR v33). So, there are changes with respect to some locales. You can check if the abbreviated names for the months have changed in CLDR v33 for the locale that you are using (http://cldr.unicode.org/index/downloads/cldr-33).
The charts for CLDR v33 do not seem to be available, but the charts for CLDR v38 are, and there the Months - abbreviated - Formatting entry for October in de is Okt. (note the trailing period, which is part of the abbreviation). I am unsure why Joda-Time uses the Months - abbreviated - Formatting entry and not Months - abbreviated - standalone for MMM, but that appears to be the case.
Further on the JDK ticket:
If you want the behavior as was in JDK 8, then you can use the option
-Djava.locale.providers=COMPAT, this uses the JRE locale, which was used prior to addition of CLDR in JDK. As CLDR is a standard, so it has been made as default from JDK 9 onwards, but to maintain backward compatibility this option has been provided.
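To see the locale data difference in isolation, without Logstash in the way, something like the following small program can be run on each JVM. This is only a sketch: the class name MonthAbbreviationCheck and the helper shortOctober are made up for illustration, and it assumes that Joda-Time ultimately takes its month names from java.text.DateFormatSymbols, which I have not verified against the Joda-Time source here.

```java
import java.text.DateFormatSymbols;
import java.util.Calendar;
import java.util.Locale;

// Hypothetical helper class, not part of Logstash or Joda-Time.
class MonthAbbreviationCheck {

    // Returns the abbreviated name of October for the given locale, as supplied
    // by whichever locale provider the running JVM is configured to use.
    public static String shortOctober(Locale locale) {
        // getShortMonths() is indexed by the Calendar month constants.
        return DateFormatSymbols.getInstance(locale).getShortMonths()[Calendar.OCTOBER];
    }

    public static void main(String[] args) {
        System.out.println("java.version          = " + System.getProperty("java.version"));
        // Prints "null" when the flag is not set and the JVM default applies.
        System.out.println("java.locale.providers = " + System.getProperty("java.locale.providers"));
        // Quotes make a trailing period easy to spot.
        System.out.println("short October (de-DE) = '" + shortOctober(Locale.GERMANY) + "'");
    }
}
```

Running it once with the default settings and once with -Djava.locale.providers=COMPAT on the same JDK 11 should show whether the abbreviation gains a trailing period under CLDR, which would explain why "2020-Okt-29" no longer matches yyyy-MMM-dd.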
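The workaround is easy to confirm. Parsing succeeds on Java 8, fails with _dateparsefailure on Java 11, and succeeds again on Java 11 once the flag is set: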
/tmp/logstash-7.9.3
❯ export JAVA_HOME=/Users/joaoduarte/.jenv/versions/1.8 && jenv global 1.8
/tmp/logstash-7.9.3
❯ echo "2020-Okt-29"| bin/logstash -e 'filter { date { locale => "de-DE" match => [ "message", "yyyy-MMM-dd"]}}'
[2020-11-03T15:09:11,551][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.9.3", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.232-b09 on 1.8.0_232-b09 +indy +jit [darwin-x86_64]"}
{
"@version" => "1",
"@timestamp" => 2020-10-29T00:00:00.000Z,
"type" => "stdin",
"message" => "2020-Okt-29",
"host" => "joaos-mbp.lan"
}
/tmp/logstash-7.9.3
❯ export JAVA_HOME=/Users/joaoduarte/.jenv/versions/11.0 && jenv global 11.0
/tmp/logstash-7.9.3
❯ echo "2020-Okt-29"| bin/logstash -e 'filter { date { locale => "de-DE" match => [ "message", "yyyy-MMM-dd"]}}'
[2020-11-03T15:09:44,706][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.9.3", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 11.0.5+10 on 11.0.5+10 +indy +jit [darwin-x86_64]"}
{
"host" => "joaos-mbp.lan",
"type" => "stdin",
"@version" => "1",
"message" => "2020-Okt-29",
"@timestamp" => 2020-11-03T15:09:47.283Z,
"tags" => [
[0] "_dateparsefailure"
]
}
/tmp/logstash-7.9.3
❯ echo "2020-Okt-29"| LS_JAVA_OPTS="-Djava.locale.providers=COMPAT" bin/logstash -e 'filter { date { locale => "de-DE" match => [ "message", "yyyy-MMM-dd"]}}'
[2020-11-03T15:10:15,465][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.9.3", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 11.0.5+10 on 11.0.5+10 +indy +jit [darwin-x86_64]"}
{
"@timestamp" => 2020-10-29T00:00:00.000Z,
"type" => "stdin",
"host" => "joaos-mbp.lan",
"message" => "2020-Okt-29",
"@version" => "1"
}
The proposal here is to add the -Djava.locale.providers=COMPAT flag for 7.x to remove this backward-compatibility bug; for 8.0.0 we can then discuss the best course of action.