parsing timezone should support Unicode "MINUS SIGN" (U+2212) #835

Closed
jtmoon79 opened this issue Oct 5, 2022 · 2 comments

@jtmoon79
Contributor

jtmoon79 commented Oct 5, 2022

According to Wikipedia

To represent a negative offset, ISO 8601 specifies using a minus sign, (−). If the interchange character set is limited and does not have a minus sign character, then the hyphen-minus should be used, (-). ASCII does not have a minus sign, so its hyphen-minus character (code is 45 decimal or 2D hexadecimal) would be used. If the character set has a minus sign, then that character should be used. Unicode has a minus sign, and its character code is U+2212 (2212 hexadecimal); the HTML character entity invocation is &minus;.
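To make the distinction in that quote concrete, here is a small standard-library-only sketch that prints the code point and UTF-8 length of both characters. The `describe` helper is a name made up for this illustration; it is not part of chrono.

```rust
/// Hypothetical helper: describe a character by its code point
/// and by how many bytes it occupies when encoded as UTF-8.
fn describe(c: char) -> String {
    format!("U+{:04X}, {} UTF-8 byte(s)", c as u32, c.len_utf8())
}

fn main() {
    let hyphen_minus = '-'; // ASCII hyphen-minus
    let minus_sign = '\u{2212}'; // Unicode MINUS SIGN

    println!("hyphen-minus: {}", describe(hyphen_minus));
    println!("minus sign:   {}", describe(minus_sign));

    // The two look alike in most fonts but are different characters.
    assert_ne!(hyphen_minus, minus_sign);
}
```

The two characters render almost identically, but a parser that compares against `'-'` alone will never match the multi-byte minus sign.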

Problem

chrono parse_from_str fails to parse a Unicode "minus sign" character (U+2212).

rust playground

extern crate chrono;
use chrono::DateTime;

fn main() {
    let dt_pattern = r"%Y %b %d %H%M%S %:z";

    // use ASCII "hyphen-minus"
    let data = r"2019 Aug 30 125959 -07:00";
    let dt = DateTime::parse_from_str(data, dt_pattern);
    dbg!(&dt);

    // use Unicode "minus" character
    let data = r"2019 Aug 30 125959 −07:00";
    let dt = DateTime::parse_from_str(data, dt_pattern);
    dbg!(&dt);
}

results in output

[src/main.rs:10] &dt = Ok(
    2019-08-30T12:59:59-07:00,
)
[src/main.rs:15] &dt = Err(
    ParseError(
        Invalid,
    ),
)

(using chrono version 0.4.22).

In some circumstances, chrono users must manually replace any "minus sign" character (−) with the "hyphen-minus" character (-) before parsing.
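A minimal sketch of that manual workaround, using only the standard library. The function name `normalize_minus` is an assumption made for illustration; the normalized string would then be passed to `DateTime::parse_from_str` as in the example above.

```rust
/// Hypothetical helper: replace every Unicode MINUS SIGN (U+2212)
/// with an ASCII hyphen-minus (U+002D).
fn normalize_minus(input: &str) -> String {
    input.replace('\u{2212}', "-")
}

fn main() {
    // Same input as the failing case, with the minus sign written as an escape.
    let data = "2019 Aug 30 125959 \u{2212}07:00";
    let normalized = normalize_minus(data);

    // The normalized text now uses the ASCII character that chrono 0.4.22 accepts.
    assert_eq!(normalized, "2019 Aug 30 125959 -07:00");
    println!("{normalized}");
}
```

Note that this replaces the character everywhere in the input, not only in the offset, which may or may not be acceptable depending on what else the line contains.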

Solution

To more fully support the ISO 8601 standard, chrono should parse the Unicode "minus sign" character (U+2212) in numeric timezone offsets, e.g. −07:00.
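To illustrate the requested behavior, here is a rough standalone sketch of an offset parser that treats both characters as a negative sign. This is not chrono's implementation; `parse_offset` and its `+HH:MM` only input format are assumptions made for this example.

```rust
/// Hypothetical helper: parse "+HH:MM", "-HH:MM" or "−HH:MM"
/// into an offset in seconds east of UTC. Returns None on any other input.
fn parse_offset(s: &str) -> Option<i32> {
    let mut chars = s.chars();
    let sign = match chars.next()? {
        '+' => 1,
        // Accept ASCII hyphen-minus and Unicode MINUS SIGN alike.
        '-' | '\u{2212}' => -1,
        _ => return None,
    };

    // Everything after the sign must look like HH:MM with ASCII digits.
    let (hh, mm) = chars.as_str().split_once(':')?;
    let two_digits = |t: &str| t.len() == 2 && t.bytes().all(|b| b.is_ascii_digit());
    if !two_digits(hh) || !two_digits(mm) {
        return None;
    }

    let hours: i32 = hh.parse().ok()?;
    let minutes: i32 = mm.parse().ok()?;
    if hours > 23 || minutes > 59 {
        return None;
    }
    Some(sign * (hours * 3600 + minutes * 60))
}

fn main() {
    // Both spellings of the sign yield the same offset.
    assert_eq!(parse_offset("-07:00"), parse_offset("\u{2212}07:00"));
    println!("{:?}", parse_offset("\u{2212}07:00"));
}
```

Matching on `char` rather than on a single byte matters here, because U+2212 occupies three bytes in UTF-8.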

@djc
Contributor

djc commented Oct 6, 2022

I'm open to adding support for this, want to submit a PR?

I am curious, which components are dumping obscure Unicode characters in your syslog? 😄

jtmoon79 added a commit to jtmoon79/chrono that referenced this issue Mar 28, 2023
Timezone signage also allows MINUS SIGN (U+2212) as
specified by ISO 8601 and RFC 3339.

Not for RFC 2822 format or RFC 8536 transition string.

Issue chronotope#835
jtmoon79 added a commit to jtmoon79/chrono that referenced this issue Mar 28, 2023
Extra tests not strictly necessary for prior commit introducing
MINUS SIGN parsing.

Issue chronotope#835
@jtmoon79 jtmoon79 changed the title parsing timezone should support parsing Unicode "minus" parsing timezone should support Unicode "MINUS SIGN" (U+2212) Mar 28, 2023
jtmoon79 added a commit to jtmoon79/chrono that referenced this issue Mar 30, 2023
Simplify timezone_offset_internal loop on negative.

Issue chronotope#835
PR chronotope#1001
djc pushed a commit that referenced this issue Jun 12, 2023
Timezone signage also allows MINUS SIGN (U+2212) as
specified by ISO 8601 and RFC 3339.

Not for RFC 2822 format or RFC 8536 transition string.

Issue #835
@pitdicker
Collaborator

Fixed in #1087.
