Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mktime in a non-UTC zone? #26

Closed
scottlamb opened this issue Apr 1, 2022 · 4 comments
Closed

mktime in a non-UTC zone? #26

scottlamb opened this issue Apr 1, 2022 · 4 comments

Comments

@scottlamb
Copy link

scottlamb commented Apr 1, 2022

First, thank you for writing this crate. I hate libc's time API for reasons beyond the soundness problem, so it's great to have pure Rust implementations of time zones.

The crate's blurb says this:

A pure Rust reimplementation of libc functions localtime, gmtime and mktime.

but I can't figure out how to replace mktime when operating in a non-UTC time zone. I want to go from a calendar representation (YYYYmmddHTT:MM:SS) in a political time zone like America/Los_Angeles to seconds since epoch.

[editing to add: also in particular, my code here relies on mktime accepting a "non-normalized" time struct, as described in the man page:

The mktime() function modifies the fields of the tm structure as follows: ... if structure members are outside their valid interval, they will be normalized (so that, for example, 40 October is changed into 9 November)

to find the boundary of the next day in both seconds since epoch and in proper calendar terms. I'm not sure if you consider that in-scope for tz-rs or day math is better handled by some other more full-featured datetime crate built on top of it.]

I see how to create a UtcDateTime from this kind of spec (UtcDateTime::new is mentioned in the crate-level doc even) but not a DateTime.

Is this possible today? If not, could it be? Ideally with a way of resolving ambiguity like the Javascript TC39 Temporal API offers.

For context, I'm trying to replace Moonfire NVR's time 0.1-based code here, here, and here.

@x-hgg-x
Copy link
Owner

x-hgg-x commented Apr 2, 2022

The hard part is not the normalization of mktime, but the automatic deduction of DST when you specify tm_isdst = -1.

In fact, even glibc is not coherent with itself with the following code:

Code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

void print_time(int tm_hour, int tm_min)
{
    struct tm t = {
        .tm_sec = 0,
        .tm_min = tm_min,
        .tm_hour = tm_hour,
        .tm_mday = 31,
        .tm_mon = 9,
        .tm_year = 121,
        .tm_isdst = -1,
    };

    time_t unix_time = mktime(&t);
    char *s = asctime(&t);
    s[strlen(s) - 1] = '\0';

    printf("asctime: %s, mktime: %li, tm_gmtoff: %li\n", s, unix_time, t.tm_gmtoff);
}

int main(int argc, char *argv[])
{
    setenv("TZ", "Europe/Paris", 1);

    print_time(1, 30);
    print_time(2, 0);
    print_time(2, 30);
    print_time(3, 0);
    print_time(3, 30);
    print_time(3, 0);
    print_time(2, 30);
    print_time(2, 0);
    print_time(1, 30);

    return 0;
}

In Linux, it prints the following output (Godbolt):

Output
asctime: Sun Oct 31 01:30:00 2021, mktime: 1635636600, tm_gmtoff: 7200
asctime: Sun Oct 31 02:00:00 2021, mktime: 1635638400, tm_gmtoff: 7200
asctime: Sun Oct 31 02:30:00 2021, mktime: 1635640200, tm_gmtoff: 7200
asctime: Sun Oct 31 03:00:00 2021, mktime: 1635645600, tm_gmtoff: 3600
asctime: Sun Oct 31 03:30:00 2021, mktime: 1635647400, tm_gmtoff: 3600
asctime: Sun Oct 31 03:00:00 2021, mktime: 1635645600, tm_gmtoff: 3600
asctime: Sun Oct 31 02:30:00 2021, mktime: 1635643800, tm_gmtoff: 3600
asctime: Sun Oct 31 02:00:00 2021, mktime: 1635642000, tm_gmtoff: 3600
asctime: Sun Oct 31 01:30:00 2021, mktime: 1635636600, tm_gmtoff: 7200

You can see that we have different results of mktime for 02:00:00 and 02:30:00 after repeating the function call, because the determination of DST in glibc depends on some global variables which are modified by each call of mktime.

Related Chrono issue: chronotope/chrono#668.

I am current working on the next version of the library, where I will add the following functions:

Code
impl DateTime {
     /// Construct a date time
    ///
    /// ## Inputs
    ///
    /// * `year`: Year
    /// * `month`: Month in `[1, 12]`
    /// * `month_day`: Day of the month in `[1, 31]`
    /// * `hour`: Hours since midnight in `[0, 23]`
    /// * `minute`: Minutes in `[0, 59]`
    /// * `second`: Seconds in `[0, 60]`, with a possible leap second
    /// * `nanoseconds`: Nanoseconds in `[0, 999_999_999]`
    /// * `local_time_type`: Local time type associated to a time zone
    ///
    pub fn new(
        year: i32,
        month: u8,
        month_day: u8,
        hour: u8,
        minute: u8,
        second: u8,
        nanoseconds: u32,
        local_time_type: LocalTimeType,
    ) -> Result<Self, TzError> {
        todo!()
    }

    /// Find the date time correponding to the inputs
    ///
    /// ## Inputs
    ///
    /// * `year`: Year
    /// * `month`: Month in `[1, 12]`
    /// * `month_day`: Day of the month in `[1, 31]`
    /// * `hour`: Hours since midnight in `[0, 23]`
    /// * `minute`: Minutes in `[0, 59]`
    /// * `second`: Seconds in `[0, 60]`, with a possible leap second
    /// * `nanoseconds`: Nanoseconds in `[0, 999_999_999]`
    /// * `time_zone_ref`: Reference to a time zone
    ///
    pub fn find(
        year: i32,
        month: u8,
        month_day: u8,
        hour: u8,
        minute: u8,
        second: u8,
        nanoseconds: u32,
        time_zone_ref: TimeZoneRef,
    ) -> Result<FoundDateTimeKind, TzError> {
        todo!()
    }
}

/// Output of [`DateTime::find`] function
pub enum FoundDateTimeKind {
    /// A unique valid and unambiguous date time was found
    Unique(DateTime),
    /// Found date time is invalid or ambiguous (at least one transition includes the date time)
    InvalidOrAmbiguous(Vec<OnTransition>),
}

/// Type of transitions including the date time
pub enum OnTransition {
    /// Date time is invalid because it was skipped by a forward transition.
    ///
    /// This variant gives the two [`DateTime`] corresponding to the transition instant, just before and just after the transition.
    ///
    /// This is different from the `mktime` behavior, which allows invalid datetimes when no DST information is available (by specifying `tm_isdst = -1`).
    ///
    ForwardTransition {
        /// Date time just before the forward transition
        before_transition: DateTime,
        /// Date time just after the forward transition
        after_transition: DateTime,
    },
    /// Date time is ambiguous because it was repeated after a backward transition
    BackwardTransition {
        /// Earlier date time before the backward transition
        earlier: DateTime,
        /// Later date time after the backward transition
        later: DateTime,
    },
}

Then you can use it like this for your use case:

Code
fn bounds(year: i32, month: u8, month_day: u8, local_time_zone_ref: TimeZoneRef) -> Result<Range<i64>, TzError> {
    const SECONDS_PER_DAY: i64 = 86400;

    let naive_day_start = UtcDateTime::new(year, month, month_day, 0, 0, 0, 0)?;
    let naive_day_end = UtcDateTime::from_timespec(naive_day_start.unix_time() + SECONDS_PER_DAY, 0)?;

    let day_start = DateTime::find(naive_day_start.year(), naive_day_start.month(), naive_day_start.month_day(), 0, 0, 0, 0, local_time_zone_ref)?;
    let day_end = DateTime::find(naive_day_end.year(), naive_day_end.month(), naive_day_end.month_day(), 0, 0, 0, 0, local_time_zone_ref)?;

    // process day_start and day_end
    let chosen_day_start: DateTime = todo!();
    let chosen_day_end: DateTime = todo!();

    Ok(chosen_day_start.unix_time()..chosen_day_end.unix_time())
}

fn compute_unix_time(
    year: i32,
    month: u8,
    month_day: u8,
    hour: u8,
    minute: u8,
    second: u8,
    ut_offset: Option<i32>,
    local_time_zone_ref: TimeZoneRef,
) -> Result<i64, TzError> {
    match ut_offset {
        Some(ut_offset) => Ok(DateTime::new(year, month, month_day, hour, minute, second, 0, LocalTimeType::with_ut_offset(ut_offset)?)?.unix_time()),
        None => {
            let result_kind = DateTime::find(year, month, month_day, hour, minute, second, 0, local_time_zone_ref)?;

            // process result_kind
            let chosen_date_time: DateTime = todo!();

            Ok(chosen_date_time.unix_time())
        }
    }
}

@scottlamb
Copy link
Author

In fact, even glibc is not coherent with itself with the following code:

Yuck! That's a bug in my present code then. The call sites I pointed out don't particularly care how the ambiguity is resolved but they need it to be deterministic for a given input, or my in-memory index can effectively be corrupted. More reason to ditch the libc API.

I am current working on the next version of the library, where I will add the following functions:
...

/// Output of [`DateTime::find`] function
pub enum FoundDateTimeKind {
   /// A unique valid and unambiguous date time was found
    Unique(DateTime),
    /// Found date time is invalid or ambiguous (at least one transition includes the date time)
    InvalidOrAmbiguous(Vec<OnTransition>),
}

Awesome! I'm excited to see you're already working on this.

Question: should it be considered valid for (and is there any actual case in the current IANA database where) transition periods overlap, so there are more than two possible offsets for a given ISO-8601 calendar time? It seems crazy to me. I just checked a couple APIs that I believe are considered among the best, and they don't seem to support this:

  • the Javascript TC39 Temporal API I linked to above just supports two possible offsets. You pass in compatible (the default), earlier, later, or reject to choose.
  • likewise, Java's newer JSR-310 java.time API (which obsoletes its original java.util API and Joda-Time) likewise to support just two offsets. (search for "overlap" in ZonedDateTime. It defaults to earlier, and you can call the withEarlierOffsetAtOverlap or withLaterOffsetAtOverlap accessors to be explicit or override.

so they must reject zone definitions with overlapping transitions or at least not ever return all possible offsets if this happens.

It'd be super convenient if FoundDateTimeKind had an impl with names taken from the Temporal API, so we can handle this with one-liners:

let day_start = DateTime::find(...)?.earlier();

// or maybe even defer the `Result` conversion until after the choice,
// so there's no redundant error path for `reject`:
let day_start = DateTime::find(...).earlier()?;
let day_start = DateTime::find(...).reject()?;

@x-hgg-x
Copy link
Owner

x-hgg-x commented Apr 2, 2022

Question: should it be considered valid for (and is there any actual case in the current IANA database where) transition periods overlap, so there are more than two possible offsets for a given ISO-8601 calendar time?

Exemple of possible overlapping transitions:

  • Jump 1h forward from 1970-01-01T02:00:00Z to 1970-01-01T03:00:00+01:00 (transition_unix_time = 7200)
  • Wait 30 minutes until 1970-01-01T03:30:00+01:00
  • Jump 1h backward from 1970-01-01T03:30:00+01:00 to 1970-01-01T02:30:00Z (transition_unix_time = 9000)

The TZif file format is defined in the RFC 8536, which doesn't disallow overlapping transitions, so the case is definitely possible. However I don't know if there is such a case in the current IANA database.

I will add the following modifications to my code:

Code
/// Output of [`DateTime::find`] function
pub enum FoundDateTimeKind {
    /// A unique valid and unambiguous date time was found
    Unique(DateTime),
    /// Found date time is invalid or ambiguous (at least one transition includes the date time)
    InvalidOrAmbiguous {
        /// First transition
        first_transition: OnTransition,
        /// Additional transitions
        additional_transitions: Vec<OnTransition>,
    },
}

impl FoundDateTimeKind {
    pub fn unique(&self) -> Option<DateTime> {
        match *self {
            Self::Unique(date_time) => Some(date_time),
            _ => None,
        }
    }

    pub fn earlier(&self) -> DateTime {
        match self {
            &Self::Unique(date_time) => date_time,
            Self::InvalidOrAmbiguous { first_transition, additional_transitions } => {
                let first_earlier = match first_transition {
                    OnTransition::ForwardTransition { before_transition, .. } => before_transition,
                    OnTransition::BackwardTransition { earlier, .. } => earlier,
                };

                *additional_transitions.iter().fold(first_earlier, |current_earlier, transition| match transition {
                    OnTransition::ForwardTransition { before_transition, .. } if before_transition < current_earlier => before_transition,
                    OnTransition::BackwardTransition { earlier, .. } if earlier < current_earlier => earlier,
                    _ => current_earlier,
                })
            }
        }
    }

    pub fn later(&self) -> DateTime {
        match self {
            &Self::Unique(date_time) => date_time,
            Self::InvalidOrAmbiguous { first_transition, additional_transitions } => {
                let first_later = match first_transition {
                    OnTransition::ForwardTransition { after_transition, .. } => after_transition,
                    OnTransition::BackwardTransition { later, .. } => later,
                };

                *additional_transitions.iter().fold(first_later, |current_later, transition| match transition {
                    OnTransition::ForwardTransition { after_transition, .. } if after_transition > current_later => after_transition,
                    OnTransition::BackwardTransition { later, .. } if later > current_later => later,
                    _ => current_later,
                })
            }
        }
    }
}

There won't be a Javascript-compatible method since it is not well-defined when several transitions are overlapping.

@x-hgg-x
Copy link
Owner

x-hgg-x commented Apr 13, 2022

Implemented in 98bbeac with some API simplifications compared to above.

@x-hgg-x x-hgg-x closed this as completed Apr 13, 2022
x-hgg-x added a commit that referenced this issue Apr 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants