Parse ISO 8601 basic format (hhmmss) #87

Closed
C4ptainCrunch opened this issue Dec 24, 2013 · 7 comments

@C4ptainCrunch

At the moment, arrow.get ignores the seconds if the datetime is in the ISO 8601 basic format (hhmmss) rather than the extended format (hh:mm:ss). See the Wikipedia article for more info.

In [1]: arrow.get('19980119T070100')
Out[1]: <Arrow [1998-01-19T07:01:00+00:00]>

In [2]: arrow.get('19980119T070101') # added 1 sec but output is the same
Out[2]: <Arrow [1998-01-19T07:01:00+00:00]>

If I add a fractional-seconds part, the output is correct:

In [3]: arrow.get('19980119T070101.0')
Out[3]: <Arrow [1998-01-19T07:01:01+00:00]>

Part of the problem seems to be that the has_seconds detection in arrow/parser.py:70 is wrong. (It should check len(time_parts[0]) if no : is found.)

If it's OK with you, I'll send a pull request in the next few days fixing the detection and the parsing (I haven't checked whether it needs fixing as well) and adding new tests.

Edit: After a quick test, it looks like adding has_seconds = not has_seconds and len(time_parts[0]) == 6 after line 70 fixes the problem. (But I will still investigate further and write some tests.)
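
For illustration, the detection idea described above can be sketched as a standalone function. This is not arrow's actual parser code: the name time_has_seconds is hypothetical, and it assumes the time part has already been split off after the T with any UTC offset removed.

```python
import re


def time_has_seconds(time_part: str) -> bool:
    """Return True if an ISO 8601 time string carries a seconds field.

    Hypothetical helper for illustration only. `time_part` is the text
    after the 'T', with any UTC offset already stripped.
    """
    # Drop a fractional-seconds suffix so only the hh, mm, ss digits remain.
    whole = re.split(r"[.,]", time_part, maxsplit=1)[0]
    if ":" in whole:
        # Extended format: hh:mm has one colon, hh:mm:ss has two.
        return whole.count(":") == 2
    # Basic format: hhmm is 4 digits, hhmmss is 6.
    return len(whole) == 6
```

Counting colons alone reports no seconds for 070101, which matches the symptom shown above; the length check covers the basic-format case.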

@C4ptainCrunch
Author

Hey! :)

Are you not interested, or did you not see my message (or not have time to respond)?

@crsmithdev
Collaborator

Apologies for the delay in replying... yes, this looks like a legit issue, and there are a number of similar bugs in ISO parsing that I want to address in the next update. Feel free to address it in a pull request in the meantime, though!

@moreati

moreati commented Aug 27, 2014

#108 looks like a duplicate of this

@andrewelkins
Contributor

I can no longer reproduce this. Can you, @C4ptainCrunch?

@C4ptainCrunch
Author

C4ptainCrunch commented Aug 24, 2015 via email

@C4ptainCrunch
Author

It looks like Arrow no longer parses this format at all:

>>> arrow.get('19980119T070101')
ParserError: Could not match input to any of [u'YYYY-MM-DDTHH:mm'] on '19980119T070101'

@JoelChambers

I can confirm that arrow 0.12.1 is still not parsing basic ISO-8601 date/time strings without separators, such as 20180710T230024 and 20180710T230024Z. I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "arrow/api.py", line 22, in get
    return _factory.get(*args, **kwargs)
  File "arrow/factory.py", line 174, in get
    dt = parser.DateTimeParser(locale).parse_iso(arg)
  File "arrow/parser.py", line 119, in parse_iso
    return self._parse_multiformat(string, formats)
  File "arrow/parser.py", line 283, in _parse_multiformat
    raise ParserError('Could not match input to any of {0} on \'{1}\''.format(formats, string))
arrow.parser.ParserError: Could not match input to any of ['YYYY-MM-DDTHH:mm'] on '20180710T230024Z'

It looks like the parse_iso function needs to have more formats added to parse dates and datetimes without separators. I confess to not being good enough at Python myself to submit a code recommendation.
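
As a possible workaround until arrow handles these strings, here is a minimal sketch that parses the basic format with only the standard library. The function name parse_basic_iso and the regular expression are assumptions for illustration, not part of arrow; numeric UTC offsets and fractional seconds are deliberately left out.

```python
import re
from datetime import datetime, timezone

# Basic ISO 8601: YYYYMMDD, optionally followed by Thhmm or Thhmmss and a 'Z'.
_BASIC_ISO = re.compile(
    r"(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})"
    r"(?:T(?P<hour>\d{2})(?P<minute>\d{2})(?P<second>\d{2})?(?P<utc>Z)?)?"
)


def parse_basic_iso(text: str) -> datetime:
    """Parse a separator-free ISO 8601 date or datetime (hypothetical helper)."""
    match = _BASIC_ISO.fullmatch(text)
    if match is None:
        raise ValueError(f"not a basic ISO 8601 string: {text!r}")
    fields = match.groupdict()
    # A trailing 'Z' marks UTC; without it the result is a naive datetime.
    tzinfo = timezone.utc if fields.pop("utc") else None
    # Keep only the components that were present and convert them to int.
    numbers = {name: int(value) for name, value in fields.items() if value is not None}
    return datetime(tzinfo=tzinfo, **numbers)
```

If I'm not mistaken, the resulting datetime can then be handed to arrow.get, which accepts datetime objects. The fixed-width regular expression is used instead of datetime.strptime because strptime's %H, %M and %S directives also accept single digits, which can silently misread strings such as 2359.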
