Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uucore: add datetime parser #4917

Closed
wants to merge 1 commit into from
Closed

uucore: add datetime parser #4917

wants to merge 1 commit into from

Conversation

Benjscho
Copy link
Contributor

Add a relaxed datetime parser to uucore for use in date utility (and potentially others). This datetime parser functions by using chronos own parsing utilities and a try/succeed approach to parsing. This commit adds date -d functionality to the date utility by enabling parsing of a number of date formats.

This implementation of the datetime parser has some drawbacks and some positives. On the positive side:

  • it was easy to implement
  • it is easy to add more datetime formats to

In order to add additionally supported formats, a developer can add the required format string to the format mod in parse_datetime.rs, and then add it as a potential format to the relevant fmts vec.

On the negative:

  • It is not easily customiseable beyond the supported chrono parsing formats. E.g., chrono does not currently support parsing offsets without trailing zeros. date -d "UTC+1" should return a valid response but chrono fails to parse this.
  • Because it is an attempt driven parser, it is likely not that performant. I have not done any performance testing as part of this change, but I would expect a custom parser to perform much better.

Add a relaxed datetime parser to uucore for use in `date` utility (and
potentially others). This datetime parser functions by using `chrono`s
own parsing utilities and a try/succeed approach to parsing. This commit
adds `date -d` functionality to the `date` utility by enabling parsing
of a number of date formats.

This implementation of the datetime parser has some drawbacks and some
positives. On the positive side:
- it was easy to implement
- it is easy to add more datetime formats to

In order to add additionally supported formats, a developer can add the
required format string to the `format` mod in `parse_datetime.rs`, and
then add it as a potential format to the relevant `fmts` vec.

On the negative:
- It is not easily customiseable beyond the supported `chrono` parsing
  formats. E.g., `chrono` does not currently support parsing offsets
without trailing zeros. `date -d "UTC+1"` should return a valid response
but `chrono` fails to parse this.
- Because it is an attempt driven parser, it is likely not that
  performant. I have not done any performance testing as part of this
change, but I would expect a custom parser to perform much better.
@cakebaker cakebaker linked an issue May 30, 2023 that may be closed by this pull request
@sylvestre
Copy link
Sponsor Contributor

Nice, what would you think about making a crate like https://github.com/uutils/humantime_to_duration/ ?

@Benjscho
Copy link
Contributor Author

That would be great! I can give that a shot

@tertsdiepraam
Copy link
Member

@sylvestre I actually think this should be part of humantime_to_datetime instead of a separate crate. That crate was supposed to be the replacement for GNU's parse_datetime wasn't it? Maybe we can take this code over to a PR on that repository?

This little test shows that all the complex modifiers should also work when reading from a file:

echo "yesterday" > dates.txt
date -f dates.txt

@Benjscho
Copy link
Contributor Author

Benjscho commented Jun 1, 2023

I think that a new humantime_to_datetime crate would be best. Duration doesn't quite fit the use case as a name, but we could use that crate to parse durations. If it then isn't that performant due to some of the issues mentioned in this PR, I'd be interested in attempting a full custom parser

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Jun 2, 2023

Oh yeah I misspelled it. I meant to say that they should be the same. Duration is indeed a bit weird because I think the main use case is to create datetimes

@sylvestre
Copy link
Sponsor Contributor

happy to rename them (we don't have any other users of humantime_to_duration)
so, we are good.

However, humantime_to_duration uses the time crate, this PR uses chrono. We should use only one :)
i don't have a strong opinion on this

@tertsdiepraam
Copy link
Member

My preference is for chrono, because it handles timezones better and is actively maintained again, but whatever works is good I suppose :)

@sylvestre
Copy link
Sponsor Contributor

ok, let me do the port, should be easy :)

@Benjscho
Copy link
Contributor Author

Benjscho commented Jun 3, 2023

Ace, I see that's been merged! Should I move this PR to the humantime crate? Seems there's a few clippy spelling warnings I need to fix too

@sylvestre
Copy link
Sponsor Contributor

It would be great, thanks

Please try to follow the same naming and structure :)

@cakebaker
Copy link
Contributor

Closing this PR as it has been moved to uutils/parse_datetime#12

@cakebaker cakebaker closed this Jun 5, 2023
@sylvestre
Copy link
Sponsor Contributor

@Benjscho @tertsdiepraam ok to rename humantime_to_duration to parse_datetime?
if you have a better name, don't hesitate

@Benjscho
Copy link
Contributor Author

Benjscho commented Jun 6, 2023

Sounds good to me!

I think when renaming the crate it will need some reorganization to make it make more sense. I think there should be a main lib function parse_str that combines the duration and datetime parsing to present one entry point that parses both in combination to present behaviour that's equivalent to GNU coreutils parse-datetime. Then we could provide modules for parse_datetime and parse_duration that each expose parse_str methods for parsing individually.

I'm happy to do some of the work for that reorg in a follow up PR after renaming if that plan sounds like it makes sense

mvo5 added a commit to mvo5/coreutils that referenced this pull request Mar 30, 2024
This commit is a trivial followup for:
uutils#4917
and
uutils/parse_datetime#12

The functionality to parse the datetime was moved into the parse_datetime
crate and the only (tiny) piece left is to call it from `date`.

It also adds the two tests from the original PR#4917.

Closes: uutils#4657

Thanks to Ben Schofield
mvo5 added a commit to mvo5/coreutils that referenced this pull request Mar 30, 2024
This commit is a trivial followup for:
uutils#4917
and
uutils/parse_datetime#12

The functionality to parse the datetime was moved into the parse_datetime
crate and the only (tiny) piece left is to call it from `date`.

It also adds the test-case from the original issue. I did not include
the two tests from PR#4917 because they appear to work even without
this change. I am happy to include them of course if prefered.

Closes: uutils#4657

Thanks to Ben Schofield
cakebaker pushed a commit that referenced this pull request Mar 30, 2024
* date: fix `date -f dates.txt is failing`

This commit is a trivial followup for:
#4917
and
uutils/parse_datetime#12

The functionality to parse the datetime was moved into the parse_datetime
crate and the only (tiny) piece left is to call it from `date`.

It also adds the test-case from the original issue. I did not include
the two tests from PR#4917 because they appear to work even without
this change. I am happy to include them of course if prefered.

Closes: #4657

Thanks to Ben Schofield

* tests: tweak changes to test_date.rs to be more idiomatic

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>

---------

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

date -f dates.txt is failing
4 participants