Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Factorize automatic generation from IANA database #10

Open
woshilapin opened this issue Apr 27, 2022 · 3 comments
Open

Factorize automatic generation from IANA database #10

woshilapin opened this issue Apr 27, 2022 · 3 comments

Comments

@woshilapin
Copy link

Sorry for the not so short text below, but I thought some context was needed before exposing what I'd like to do.

Context

In the beginning of the year 2022, reacting to CVE-2020-26235, I migrated some of the repositories I was working on, from chrono to time, until I stumbled on some functionalities that needed to parse timezones. Then, I opened chronotope/chrono-tz#93.

That said, the PR I proposed is not really satisfying:

  1. with the name chrono_tz, you would expect to have chrono in it (so making chrono an optional feature sounds dumb)
  2. it looks like the extract from the IANA Database is a functionality by itself that might live in its own repository

It's the second item that made me opened this issue (see below).

What I'd like to do?

I only found out today about your crate time-tz. I'm pretty happy about it because I'm pretty sure I'm gonna use it:smile:

I quickly parsed the code from build.rs and it seems mostly similar to chrono-tz-build. I was actually already trying to extract that part into an independent crate (when working on chronotope/chrono-tz#93). Would you be interested in such a modification? I'd be interested to actually do the work but I'd like to know first if the idea is sound and acceptable in your project?

@Yuri6037
Copy link
Owner

I'm not sure I understand what you're trying to do. In my understanding you want to avoid download and build of the entire IANA database by moving this part to it's own crate. Is that what you're trying to do?

I understand that this part can be very slow if you don't need it but currently there's no way to avoid download of the IANA database. In the current version there's already a feature which enables the IANA database (which is enabled by default).

The build.rs script is indeed very similar to chrono-tz, that's because I based my code on it except that I've included windows timezone name database (windows uses a less-precise database than IANA) and I've preferred to respect rust naming conventions.

Now if we agree on what you're trying to do that could be indeed better. However, there are several points to be aware of in the current setup:

  • You might need to move the FixedTimeSpan and FixedTimeSpanSet structs to that new crate because the build script is dependent over these (these are not dependent over the time-rs crate).
  • Do you plan on moving the windows name database as well?
  • Depending on how we chose to handle the 2 prior points what would be the shape of the API we expose from that crate?

All things considered, if we do end up extracting this to it's own crate I'd like it to be time crate agnostic, meaning it could be used with any time crate.

@woshilapin
Copy link
Author

I'm not sure I understand what you're trying to do. In my understanding you want to avoid download and build of the entire IANA database by moving this part to it's own crate. Is that what you're trying to do?

I'm sorry if I was not precise enough, I'll try to answer your questions below.

I understand that this part can be very slow if you don't need it but currently there's no way to avoid download of the IANA database. In the current version there's already a feature which enables the IANA database (which is enabled by default).

I do not want to avoid downloading the IANA database. I actually want to extract the Rust code generation (in build.rs) from the IANA database... but in an agnostic way (no dependency to time or chrono).

The build.rs script is indeed very similar to chrono-tz, that's because I based my code on it except that I've included windows timezone name database (windows uses a less-precise database than IANA) and I've preferred to respect rust naming conventions.

Nice, if I understand well, what you did is actually an improvement for Windows user over the build.rs script from chrono-tz?

Now if we agree on what you're trying to do that could be indeed better. However, there are several points to be aware of in the current setup:

* You might need to move the `FixedTimeSpan` and `FixedTimeSpanSet` structs to that new crate because the build script is dependent over these (these are not dependent over the `time-rs` crate).

Thanks for the hint, that's actually one of the thing I was not sure. If both chrono-tz and time-tz use similar structures and traits, they sure should be both part of this new crate.

That said, aren't FixedTimeSpan and FixedTimeSpanSet part of the parse_zoneinfo? Is there a need to redefine them? Are they different from the one in time-tz?

* Do you plan on moving the windows name database as well?

Yes, sure. It seems that this is an improvement over what has been done in chrono-tz, so I intend to keep it.

* Depending on how we chose to handle the 2 prior points what would be the shape of the API we expose from that crate?

I didn't yet think about the details. From chrono-tz, I thought I would expose the enum Tz and probably associated traits like TimeSpans. But I parsed a bit time-tz and it seems relatively different... which is probably good. It's gonna be harder to figure out the common needs between both, but having very different solutions might help to extract the real business logic. If you have any suggestion, I'm happy to hear them.

All things considered, if we do end up extracting this to it's own crate I'd like it to be time crate agnostic, meaning it could be used with any time crate.

Yes, that's exactly the goal: to create a new crate, agnostic from chrono or time, and that can be used by both time-tz and chrono-tz in the end.

I hope it does answer some of your questions? Since I don't know yet all the implementation details, I might still miss things and be not precise enough, but I hope to figure out the details as I do it.

@Yuri6037
Copy link
Owner

I see, I think separating IANA database generation to a different crate could indeed be better in the long term: that would avoid me having to periodically check for updates with the official IANA database.

Nice, if I understand well, what you did is actually an improvement for Windows user over the build.rs script from chrono-tz?

Indeed my build.rs includes support to convert the system timezone name which is given by WinAPI functions into an IANA timezone. Support for reading system timezone is available with the system feature of this crate and requires this windows timezone name database (actually downloaded from the unicode project).

I've thought of a possible API which could leave the FixedTimeSpan and FixedTimeSpanSet in this crate with a build script only API:

  • Create a struct like IanaDbReader which handles reading from the IANA files and returns all timezones in an iterator, this way both chrono-tz and time-tz could use the same iterator to generate different structures depending on their needs. Of course the iterator should provide a built-in way to access timespans (maybe use lazy loading, check my other crate bpx on a possible way to implement lazy loading without shared mut hell). Additionally, this IanaDbReader should also provide a member function to construct the tree of all IANA timezones (like the ModuleTree struct in my current build.rs). That tree would allow simple generation of a rust module tree if the crate needs it (currently both time-tz and chrono-tz needs it).
  • Create a struct like WindowsDbReader which handles reading from the windows timezone name database and proposes an iterator which then would be used by time-tz to generate the static map used to convert between windows timezone names and IANA names.

Such API would allow moving the tz submodule, the win_cldr_data and the update_repo.sh script to that new crate. This would also greatly simplify the build.rs script.

I also think we should create a new repository for this library.

What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants