Add strict mode support when parsing date strings. #583

Open
jason-codaio opened this issue Jan 31, 2017 · 6 comments

@jason-codaio

jason-codaio commented Jan 31, 2017

I recently realized that Sugar seems to ignore an arbitrary prefix, which makes it difficult to use for sniffing things that look like dates. For example, "some random text 09/16/2016" gets parsed as Sept 16th 2016, which in some use cases is pretty neat, but in others is a bit frustrating. Some sort of strict mode option would be nice, to restrict parsing to strings that contain only date parts. I realize this might be tricky because, for the above example, I believe Sugar falls back to native JavaScript Date parsing, which honestly is something else that would be nice to prevent.

@andrewplummer
Owner

Actually, Sugar date parsing is always strict, and in fact there was previously a request to make it "non-strict" (#297), which is possible, but not given priority as a use case.

What seems to be happening is that Sugar does not recognize the format and so falls back to native date parsing. There could be a flag to prevent this, as it would be hard to do otherwise; however, I'm hard pressed to think of a use case where your application would not want to filter out certain content beforehand.

@jason-codaio
Author

Filtering out certain content beforehand seems very difficult without writing a date parser, because you need to be able to distinguish where the garbage text starts and ends versus where the date portion starts and ends.

If you mean not parsing data that isn't a date, consider the weakly typed data case, where you want to use a date parser to sniff various fields to figure out whether they are dates. While a date can be found in this string, I wouldn't consider the string itself a date.

@andrewplummer
Owner

andrewplummer commented Feb 1, 2017

To be honest, I'm not really sure what any of this means in the context of this discussion. In any case, let us define "strict mode" as a concept which means "accept string input that directly maps to a date and nothing else". As an example, this means "Thursday at 3pm" would parse, whereas "pick up the kids on Thursday at 3pm" would not. Or, even more concretely, it effectively means anchoring the parsing regexes with ^ and $. So, to be clear, this is how Sugar behaves currently. However, it seems that browser parsing may fall back to "greedy" or "non-strict" parsing, and this is what you're seeing currently.
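
A minimal sketch of what anchoring with ^ and $ changes, assuming a hypothetical MM/DD/YYYY pattern (this is an illustration, not one of Sugar's actual format definitions):

```javascript
// Hypothetical date pattern for MM/DD/YYYY; illustration only, not Sugar's source.
const DATE_PART = '(\\d{1,2})/(\\d{1,2})/(\\d{4})';

// Strict: ^ and $ require the entire input to be the date.
const strictRe = new RegExp('^' + DATE_PART + '$');

// Non-strict: without anchors the date may appear anywhere in the input.
const looseRe = new RegExp(DATE_PART);

function isStrictDate(str) {
  return strictRe.test(str);
}

function containsDate(str) {
  return looseRe.test(str);
}
```

With these definitions, `isStrictDate('09/16/2016')` is true, while `isStrictDate('some random text 09/16/2016')` is false even though `containsDate` still finds the date inside it.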

To make this simple, if you are interested in parsing out text in "non-strict" mode, then please respond to #297 and I will close this as it is a duplicate.

If however, you want the opposite, or basically parsing that is even more strict than usual (by not falling back to non-strict browser parsing), then let's discuss your use cases here.

Although this is outside the core use case of Sugar, which is to parse out whatever dates it is able to recognize, it would be possible to turn off browser parsing fallbacks. However, I would like to discuss use cases first.

@jason-codaio
Author

I would like true strict mode. Since Sugar falls back to an API that is not strict, Sugar is not itself strict, because the caller has no idea that it has fallen back. There is another danger in falling back to browser parsing: it is not consistent across browsers, which is dangerous in many contexts.

Use cases:

  • Consistent date parsing behavior. Falling back to the browser leads to inconsistent results across different browsers.
  • Weakly typed data. In many data science scenarios your information schema is not strongly defined and you have to infer it by sniffing data fields. You would use a date parser to see whether a field matches any common date format and is therefore a date. In this case you would not want "pick up the kids on Thursday at 3pm" to come back as a date.
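
The sniffing use case could look roughly like the sketch below. The pattern and the function name `looksLikeDateColumn` are hypothetical, chosen only to show why a strict check matters when inferring a column type:

```javascript
// Hypothetical strict check: the whole trimmed value must be MM/DD/YYYY.
const STRICT_DATE = /^\d{1,2}\/\d{1,2}\/\d{4}$/;

// Treat a column as a date column only if it has at least one non-empty
// value and every non-empty value is a date and nothing else.
function looksLikeDateColumn(values) {
  const nonEmpty = values
    .map((v) => v.trim())
    .filter((v) => v !== '');
  return nonEmpty.length > 0 && nonEmpty.every((v) => STRICT_DATE.test(v));
}
```

A column such as `['09/16/2016', '1/2/2017', '']` would be classified as dates, while a free-text column that merely mentions a date would not.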

@andrewplummer
Owner

OK, it's taken me a while to respond to this, but I'm circling back around to it.
Bottom line, I can see a use case for this. The only way I can think of to support "true strict mode" is to simply add a fallback option that switches the fallback to parsing with new Date(...) on or off. If this works for you, it should be simple to add.
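
A rough sketch of how such an option might behave. The `parseDate` function, the `fallback` option name, and the single ISO-style format are all assumptions made for illustration; none of them is a confirmed Sugar API:

```javascript
// Hypothetical list of strict (anchored) formats; a real parser would have many more.
const FORMATS = [
  {
    re: /^(\d{4})-(\d{2})-(\d{2})$/,
    // Build a local-time date from the captured year, month, and day.
    toDate: (m) => new Date(Number(m[1]), Number(m[2]) - 1, Number(m[3])),
  },
];

// `fallback` is an assumed option name. It defaults to true so that
// existing behavior (falling back to native parsing) is preserved.
function parseDate(str, options = {}) {
  const fallback = options.fallback !== false;
  for (const { re, toDate } of FORMATS) {
    const match = re.exec(str);
    if (match) {
      return toDate(match);
    }
  }
  if (fallback) {
    // Non-strict path: the result depends on the JavaScript engine.
    return new Date(str);
  }
  // Strict path: report failure as an invalid date.
  return new Date(NaN);
}
```

Calling `parseDate('some random text 09/16/2016', { fallback: false })` returns an invalid date in every engine, whereas the same call with the fallback left on hands the string to `new Date(...)`, where the outcome varies by browser.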

One notable thing is that from what I've seen Firefox is already "strict", and Chrome seems to truncate junk characters on the front only (in other words, "pick up the kids on 2/5/1999" parses but "on 2/5/1999 pick up the kids" does not). In any case the bottom line is that there is inconsistency there and I can see a use case for removing it.

andrewplummer added this to the Minor/Major milestone Apr 5, 2017
@jason-codaio
Author

Hey @andrewplummer, yeah, I think that should be sufficient and help address those cross-browser inconsistencies.

It probably isn't required, but might be worth looking at what common formats were being handled by the browser fallback and consider adding them, so that they would still work with the fallback shut off. The library user can always add them directly though.
