Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a custom parser for parsing ISO8601 datetime variants #106486

Merged
merged 21 commits into from
May 14, 2024

Conversation

thecoop
Copy link
Member

@thecoop thecoop commented Mar 19, 2024

Create and use a custom date-time parser for predefined ISO8601 variant date formatters. This resolves #102063.

This shows a ~10% increase in ingest throughput on http_logs rally track, and ~25% increase in ingest throughput on so rally track, both tested using esbench.

For the moment, it only supports strict_ date-time variants, where it is possible to specify all datetime fields. Future work could easily extend this to date formatters where some fields are forbidden, eg date_hour or date_hour_minute_second (adding an extra Set<ChronoField> forbiddenFields parameter and checking appropriately).

I have tried very hard to keep the new parser as close as possible to the behaviour of the old java.time-based parser, however there are some edge cases that the new parser does not support. If the new parser fails to parse a string, the old parser will then be tried. A JVM option es.datetime.java_time_parsers is added to force the use of the old parser regardless, if that is desired for any reason.

@thecoop thecoop added :Core/Infra/Core Core issues without another label >refactoring labels Mar 19, 2024
@thecoop thecoop force-pushed the fast-datetime branch 4 times, most recently from f5078af to cce3e9c Compare March 25, 2024 11:56
@thecoop thecoop marked this pull request as ready for review March 26, 2024 13:30
@thecoop thecoop requested a review from a team March 26, 2024 13:30
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Mar 26, 2024
@thecoop
Copy link
Member Author

thecoop commented Mar 28, 2024

@elasticmachine update branch

Copy link
Contributor

@pgomulka pgomulka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me.
I left few questions asking for more javadocs
can we also link the original issue? I wonder if we should also comment there why this approach was taken instead of that user's solution or 3rd party library (https://github.com/ethlo/itu/ )

@thecoop you mentioned that there are some cases where this would be broken? can we have them as testcases? We likely should document them under breaking changes section and let support team know too

@elasticsearchmachine
Copy link
Collaborator

Hi @thecoop, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Hi @thecoop, I've updated the changelog YAML for you.

@thecoop
Copy link
Member Author

thecoop commented Apr 11, 2024

@pgomulka There are test cases for the invalid strings in testStrangeParses method

@thecoop thecoop requested a review from a team April 12, 2024 12:34
Copy link
Contributor

@pgomulka pgomulka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thecoop
Copy link
Member Author

thecoop commented Apr 25, 2024

@elasticmachine update branch

@thecoop thecoop requested a review from a team April 26, 2024 08:42
@thecoop
Copy link
Member Author

thecoop commented Apr 26, 2024

Updated with feedback from breaking-changes committee

@thecoop thecoop requested a review from pgomulka May 13, 2024 10:07
Copy link
Contributor

@pgomulka pgomulka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if (seconds == null) return null;
if (len == pos) return ofHoursMinutesSeconds(hours, minutes, seconds, positive);

// there's some text left over...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it be worth rephrasing to? // there's some text left over... will return error

Copy link
Contributor

@ldematte ldematte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 2! Good test coverage, nice performance improvements, and a failsafe to go back to the old one 👍

Copy link
Contributor

@mosche mosche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, LGTM 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label >enhancement release highlight Team:Core/Infra Meta label for core/infra team v8.15.0
Projects
None yet
8 participants