Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Migrating from joda time to java.time #27330

Open
spinscale opened this issue Nov 9, 2017 · 14 comments

Comments

Projects
None yet
7 participants
@spinscale
Copy link
Member

commented Nov 9, 2017

Objective: In order to support nanosecond resolution and to remove the joda time dependency from Elasticsearch, we need to migrate a lot of code over to java.time.

This is a meta issue detailing the steps that need to be taken to migrate away from joda time and to java.time.

There are a few things to keep in mind and discuss before outlining the single steps

Retaining our gazillion date formats

Do we want to retain the tons of our named date formats, named very well like strictOrdinalDateTimeNoMillis or strictWeekDateTimeNoMillis (also we support snake and camel case for each of those!)? I think we could built a layer that is still able to read those, but then maybe convert it to formatted dates like xxxx-'W'ww-e'T'HH:mm:ssZZ (just an example). This way we could get rid of this mapping entirely or at least just maintain it for a few examples like epoch_millis, epoch_second and dateOptionalTime as the text representation are nowhere near as readable as the date time formats, once internalized.

You can find the full list in the Joda class

BWC breaking

Some classes used by builders to create queries are using joda time classes. As we move towards the REST client and away from the Transport Client this will be less of an BWC issue.

Steps

  • Ensure time zones are parsed the same way in joda and java.time, other wise the timezone parameters i.e. from the query string query/aggregations could act differently
  • Make DateMathParser use java.time
  • Make IndexNameExpressionResolver.DateMathExpressionResolver use java.time, also remove the check for the date_math_expression_resolver.default_time_zone and date_math_expression_resolver.default_date_format settings here (they are not registered anyway). #34507
  • Ensure new java time IndexNameExpressionResolver is BWC compatible when formatters are used
  • Convert field mapper types to use ZonedDateTime #36363
  • Convert DateTimeUnit, Rounding, TimeValue classes
  • Convert XContentBuilder, StreamInput, StreamOutput classes (the serialized version of this is just a long and a timezone in our streams, so those are easy to convert. Same applies for the XContentBuilder)
  • Convert the cat API classes (super easy, mostly just formatters, that get their input as milliseconds)
  • SortingNumericDocValues and classes extending from it in the fielddata package need to be converted (or removed?) #36363
  • Replace joda with java.time in painless
  • Replace joda with java.time in ingest #38088
  • Mapping: Replace in DocValueFormat and DateFieldMapper #36363
  • Remove jodatime dependency
  • Add support to nanosecond resolution, add new date format time that supports nanoseconds and ensure that the formatter supports it properly #32601
  • Remove usages of joda in x-pack/ML package #36363 (partially)
  • Remove usages of joda in Monitoring #36297
  • Remove usages of joda in Watcher #35809
  • Refactor DateHistograms bucket aggregations to use java.time #36363
@clintongormley

This comment has been minimized.

Copy link
Member

commented Nov 16, 2017

This way we could get rid of this mapping entirely or at least just maintain it for a few examples like epoch_millis, epoch_second and dateOptionalTime as the text representation are nowhere near as readable as the date time formats, once internalized.

++

We can deprecate their use in 6.x and say ("please replace with YYYY..."), then on upgrade to 7.0, we can auto-upgrade the mappings to use the equivalent date pattern format.

spinscale added a commit to spinscale/elasticsearch that referenced this issue Dec 19, 2017

Streamable: use java.time instead of joda time in Streamables
StreamInput/Streamoutput now use java.time when a timezone is serialized
over the wire. There is one special case that needs to be taken care of
and that is when java.time is sending 'Z' instead of 'UTC', which joda
time does not understand.

Relates elastic#27330
@spinscale

This comment has been minimized.

Copy link
Member Author

commented Dec 21, 2017

some more discussion about BWC.

Painless

Right now calling doc['foo'].value returns a joda DateTime object (due to ScriptDocValues in core returning one), which makes it non-trivial to just convert to java time, as many users will have put this kind of code in their scripts to deal with dates.

A possible work around for this would be to have a doc['foo'].date_value object which returns a java time.

Client side runtime casting of bucket keys

InternalDateHistogram.Bucket.getKey() implements the Bucket interface, which returns Object for the getKey() method. In the case of the date histogram a joda DateTime is returned, which clients might just be casting to DateTime object, because it has always been like that. This might only be uncovered in runtime, and not in compile time for people using Java.

We need to check if the high level REST client is affected by this as well.

Different symbols used in date formatting

There are a couple of differences when date time formatters are used, according to the java docs. joda time has a century of era symbol, using the C character. This has been removed fully from java time. Also the YYYY formatter is year of era in joda time, but week based year in java time (need to check how much of an impact this really is).

One idea was to have a setting which allows to set which date implementation would be used, but this would mean to not have support rolling upgrades as having to have a hard move from one implementation to the other, so we need to support both time libraries at least in scripting.

@spinscale

This comment has been minimized.

Copy link
Member Author

commented Jan 19, 2018

There are some differences in TZ handling we have to cater for. Jodatime has 593 tz ids, java time 600. java time adds a few new ones (starting with SystemV which seem not to be used at all), and those can be ignored. However, java time removes EST, GMT+0, GMT-0, HST, MST, ROC.

The general recommendation seems to be to use either offsets, like UTC-08:00 or those region based names like America/New York rather than the above as those are ambiguous.

We might need to add deprecation warnings for the above timezones.

spinscale added a commit to spinscale/elasticsearch that referenced this issue Jan 22, 2018

Infra: Move to java.time in Streamable/XContent
- StreamInput/StreamOutput now use java.time to interpret date times.
- XContentBuilder uses java.time for its formatters.
- Cutover to using java time zones in MappedFieldType.rangeQuery (and
  thus RangeFieldMapper, DateFieldMapper, SimpleMappedFieldType)
- QueryStringQueryBuilder and RangeQueryBuilder uses java.time time zone
- A few tests were moved to java.time where simply any time was needed

Relates elastic#27330
@jasontedor

This comment has been minimized.

Copy link
Member

commented Aug 3, 2018

Relates #10005

@rjernst

This comment has been minimized.

Copy link
Member

commented Aug 15, 2018

I've been investigating how we might add support for a bwc layer within 6.x so that using existing joda format strings is backed by java time. This looks completely possible to do with a sysprop that controls whether a mapping layer on parsing formats is enabled, except for the lack of century-of-era support @spinscale mentioned in an earlier comment. This raises the question of whether this extremely esoteric option could be broken in a minor (since there is no migration path anyways, and IMO the likelihood of any user out there using it is almost non-existent). The alternative to this is to build an abstraction layer internally for all datetime uses, so that java/joda use can be swapped by the same sysprop, determine which underlying implementation of the abstraction to use (this would rely on replacing all uses of parsing/formatting on the abstraction). I'd like others opinions on the tradeoffs between these two approaches.

@rjernst

This comment has been minimized.

Copy link
Member

commented Aug 15, 2018

Note: A third alternative that I do not see as viable is implementing the century-of-era in java time. While it is possible, it would require quite a bit of code for something that I don't think anyone is actually using.

@spinscale

This comment has been minimized.

Copy link
Member Author

commented Aug 16, 2018

to add some more background: here is a link what century of era refers to in an ISO chronology, i.e. 19th century for all the years from 1900 till 1999: http://joda-time.sourceforge.net/field.html#CenturyOfEra_and_YearOfCentury - which basically leaves you with a super broad/inexact timestamp

My unscientific feeling also says, there are not a lot of users of this.

@MovGP0

This comment has been minimized.

Copy link

commented Aug 23, 2018

Support for JodaTime and NodaTime time formats is really important for me.

For all business applications I have worked on, it is way more important to have validateable data types and algorithmic correctness, than issues like having very „high precision“, „better performance“, or „no dependency on external libraries“.

Having additional datatypes that support higher precision is ok with me. But I would not invest the work in reinventing the wheel in cases where the JodaTime library works fine. It might make more sense to implement the high-precision time datatypes in the JodaTime library instead.

@rjernst

This comment has been minimized.

Copy link
Member

commented Aug 27, 2018

@MovGP0 The joda time project encourages users to migrate to the the java time api. From a maintainability perspective, we are moving to java time to both eliminate an external dependency and ensure bugs are fixed. From a format support perspective, joda time and java time are very similar. There are a only a handful of format specifiers that changed, and the year of century mentioned above is the only one that does not exist in java time. Do you have a use for it specifically, or are you generally concerned with needing to update your formats?

@MovGP0

This comment has been minimized.

Copy link

commented Sep 2, 2018

@rjernst I'm mostly concert about reinventing the wheel (aka. 'not invented here syndrome'). Usually it's better to just use and contribute to a library where possible, rather than developing something as complex as handling international time calculations yourself.

Not sure how far the Java Time API has evolved to handle Joda Time's use cases.

@jasontedor

This comment has been minimized.

Copy link
Member

commented Sep 2, 2018

@MovGP0 As @rjernst mentioned, Joda is deprecated, no longer developed, and users are encouraged to move to java.time:

Note that from Java SE 8 onwards, users are asked to migrate to java.time (JSR-310) - a core part of the JDK which replaces this project.

Note that Joda-Time is considered to be a largely “finished” project. No major enhancements are planned. If using Java SE 8, please migrate to java.time (JSR-310).

@rjernst

This comment has been minimized.

Copy link
Member

commented Sep 7, 2018

We have been looking carefully at how to provide a migration path and deprecation warnings for those using joda time specific behavior. There are 2 types of places within Elasticsearch APIs the can be found: scripting and anywhere date formats are used.

For scripting, we previously attempted to provide migration through #31441. However, that turned out to be problematic with rolling upgrades. The new approach agreed on is to change the scripting api for date fields to return a ZonedDateTime compatible object, but with augmentation methods for the missing joda time methods, which will trigger deprecation warnings when used. This is implemented in #31441.

For date formats, the problem is a few characters for which the identifier has changed, or no longer exists in java time. For these the plan is to provide a special prefix that may be used for date formats that want to force the use of java time format specifiers (probably 8:). For most users, formats are compatible between the two, so nothing will change. For those using specifiers that have changed meanings or no longer exist, a deprecation warning will be emitted when parsing the format. The special prefix to force the new formats will continue to exist through 7.x to provide time for users affected by this to switch back to a format string without the prefix.

@spinscale

This comment has been minimized.

Copy link
Member Author

commented Sep 25, 2018

Two more BWC things Dave mentioned to me:

  1. Java 8 and Java9/10 do have different timezone handling. When creating a parser with a time zone in Java 8, it will overwrite the timezone parsed in the text input. This is fixed in java 9/10, but may lead to different results in java8 when used in combination with formatters. We need to verify this for the ingest processors
public static void main(String[] argv) {
    DateTimeFormatter formatter = new DateTimeFormatterBuilder().appendPattern("yyyy-MM-dd'T'HH:mm:ss,SSSXX").toFormatter(Locale.ROOT);
    String date = "2018-05-15T16:14:56,374Z";

    System.out.println(formatter.withZone(ZoneId.of("Europe/Berlin")).parse(date));
    System.out.println(formatter.withZone(ZoneOffset.UTC).parse(date));
    System.out.println(formatter.parse(date));

    System.out.println(Instant.from(formatter.withZone(ZoneId.of("Europe/Berlin")).parse(date)));
    System.out.println(Instant.from(formatter.withZone(ZoneOffset.UTC).parse(date)));
    System.out.println(Instant.from(formatter.parse(date)));
  }

Under java 10 this returns

{InstantSeconds=1526400896, OffsetSeconds=0},ISO,Europe/Berlin resolved to 2018-05-15T16:14:56.374
{InstantSeconds=1526400896, OffsetSeconds=0},ISO,Z resolved to 2018-05-15T16:14:56.374
{InstantSeconds=1526400896, OffsetSeconds=0},ISO resolved to 2018-05-15T16:14:56.374
2018-05-15T16:14:56.374Z
2018-05-15T16:14:56.374Z
2018-05-15T16:14:56.374Z

Under java 8 this returns

{OffsetSeconds=0, InstantSeconds=1526393696},ISO,Europe/Berlin resolved to 2018-05-15T16:14:56.374
{OffsetSeconds=0, InstantSeconds=1526400896},ISO,Z resolved to 2018-05-15T16:14:56.374
{OffsetSeconds=0, InstantSeconds=1526400896},ISO resolved to 2018-05-15T16:14:56.374
2018-05-15T14:14:56.374Z
2018-05-15T16:14:56.374Z
2018-05-15T16:14:56.374Z

The third last line is the important one. One time it is 14:14 and one time 16:14

  1. Appending a fraction using .appendFraction(MILLI_OF_SECOND, 3, 3, true) used to work with either a dot or a localized decimal point symbol in joda time, but this requires an explicit configuration in java time. This might be an issue for ingest pipelines as well.
@robertdumitrescu

This comment has been minimized.

Copy link

commented Oct 9, 2018

What's the plan for solving this issue in terms of releases timeline?

spinscale added a commit to spinscale/elasticsearch that referenced this issue Jan 31, 2019

Replace joda time in ingest-common module
This commit fully replaces any remaining joda time time classes with
java time implementations.

Relates elastic#27330

spinscale added a commit that referenced this issue Feb 1, 2019

Replace joda time in ingest-common module (#38088)
This commit fully replaces any remaining joda time time classes with
java time implementations.

Relates #27330

spinscale added a commit that referenced this issue Feb 4, 2019

Add nanosecond field mapper (#37755)
This adds a dedicated field mapper that supports nanosecond resolution -
at the price of a reduced date range.

When using the date field mapper, the time is stored as milliseconds since the epoch
in a long in lucene. This field mapper stores the time in nanoseconds
since the epoch - which means its range is much smaller, ranging roughly from
1970 to 2262.

Note that aggregations will still be in milliseconds.
However docvalue fields will have full nanosecond resolution

Relates #27330

pgomulka added a commit that referenced this issue Feb 4, 2019

Core: Migrating from joda to java.time. Monitoring plugin (#36297)
monitoring plugin migration from joda to java.time

refers #27330

pgomulka added a commit that referenced this issue Feb 4, 2019

Migrating from joda to java.time. Watcher plugin (#35809)
part of the migrating joda time work. Migrating watcher plugin to use JDK's java-time

refers #27330

pgomulka added a commit that referenced this issue Feb 5, 2019

XPack: core/ccr/Security-cli migration to java-time (#38415)
part of the migrating joda time work.
refactoring x-pack plugins usages of joda to java-time
refers #27330

pgomulka added a commit that referenced this issue Feb 6, 2019

Fix HistoryIntegrationTests timestamp comparsion (#38505)
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates #27330

pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 7, 2019

Fix HistoryIntegrationTests timestamp comparsion (elastic#38505)
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates elastic#27330

pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 7, 2019

Fix HistoryIntegrationTests timestamp comparsion (elastic#38505)
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates elastic#27330

pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 8, 2019

Fix HistoryIntegrationTests timestamp comparsion (elastic#38505)
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates elastic#27330

pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 8, 2019

Fix HistoryIntegrationTests timestamp comparsion (elastic#38505)
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates elastic#27330

pgomulka added a commit that referenced this issue Feb 8, 2019

Fix HistoryIntegrationTests timestamp comparsion (#38566) Backport(#3…
…8505)

When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates #27330
backport#38505

pgomulka added a commit that referenced this issue Feb 11, 2019

Fix HistoryIntegrationTests timestamp comparison #38565 Backport#38505
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates #27330
BackPort #38505

@jimczi jimczi added the >enhancement label Feb 11, 2019

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this issue Feb 12, 2019

Add nanosecond field mapper (elastic#37755)
This adds a dedicated field mapper that supports nanosecond resolution -
at the price of a reduced date range.

When using the date field mapper, the time is stored as milliseconds since the epoch
in a long in lucene. This field mapper stores the time in nanoseconds
since the epoch - which means its range is much smaller, ranging roughly from
1970 to 2262.

Note that aggregations will still be in milliseconds.
However docvalue fields will have full nanosecond resolution

Relates elastic#27330

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this issue Feb 12, 2019

Core: Migrating from joda to java.time. Monitoring plugin (elastic#36297
)

monitoring plugin migration from joda to java.time

refers elastic#27330

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this issue Feb 12, 2019

Migrating from joda to java.time. Watcher plugin (elastic#35809)
part of the migrating joda time work. Migrating watcher plugin to use JDK's java-time

refers elastic#27330

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this issue Feb 12, 2019

XPack: core/ccr/Security-cli migration to java-time (elastic#38415)
part of the migrating joda time work.
refactoring x-pack plugins usages of joda to java-time
refers elastic#27330

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this issue Feb 12, 2019

Fix HistoryIntegrationTests timestamp comparsion (elastic#38505)
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates elastic#27330

@jasontedor jasontedor closed this Apr 6, 2019

@jasontedor jasontedor reopened this Apr 6, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.