Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dates can be anywhere #5

Merged
merged 5 commits into from
Mar 22, 2013
Merged

Dates can be anywhere #5

merged 5 commits into from
Mar 22, 2013

Conversation

sd
Copy link
Contributor

@sd sd commented Mar 18, 2013

The standard DateTime.parse allows for dates (and times) anywhere in the string, as well as extraneous sequences (like weekdays).

I've modified the regexp to no longer limit the search to the first part of the string to parse.

DateTime.parse("Thu 13/1/2013") no works as expected, as well as DateTime.parse("02:45:30 PM 13/1/2013")

…rse behavior, allowing for weekdays and dates after times
@jeremyevans
Copy link
Owner

The reason I didn't do this originally is that I wasn't sure it was safe. There may be corner cases where this changes how certain dates are parsed. You are using \b to separate the regexp. This breaks certain cases currently handled correctly such as '1/2/2012Thu'.

I appreciate the improvement you are attempting to make, and would optimally like such behavior to be part of american_date. Can you think of a safer way to implement this? Ideally, I would like to preserve the behavior of ruby 1.8, so that ruby 1.9+ with american_date operates like ruby 1.8 in respect to m/d/y handling.

@sd
Copy link
Contributor Author

sd commented Mar 18, 2013

I'll guess I'll do the right thing and look at the source code for the original function and see what the actual behavior is... Yak shaving can be fun!

@sd
Copy link
Contributor Author

sd commented Mar 18, 2013

1.8 (and 1.9) are very flexible when it comes to datetime strings... anything that matches a date and or time is used, and anything else is ignored.

So removing any anchors around the date regexp is the best way to mimic this behavior, even if it's a bit riskier.

In any case, the original regexp was anchoring the date at the beginning of the string, which breaks round-tripping for many common date formats like "Thu 2/1/2013" and "3:45pm 2/1/2013", hence the need to remove the anchors.

@jeremyevans
Copy link
Owner

The problem with not having any anchors at all is it changes the meaning of lots of date strings (much more so than having \b anchors). Consider '1001/12/20'.

I'm very risk averse in this case. I need to be fairly confident that such a change can be made without breakage before I'll merge.

I'm not sure if you've read the other pull requests, but my recommendation to other people who wanted different parsing is to just replace the regexp. I'll consider not freezing the existing regexp to make it easier to do that.

@clonezone
Copy link

Given the desire to not have the regex end in «\b», that it should end with «(?!\d)» so that it does not muck with, say, “10/10/1010101010”.

Also, the regex should change to allow for «.» and «-» as delimiters since the standard implementation does. (See ext/date/date_parse.c.)

@sd
Copy link
Contributor Author

sd commented Mar 22, 2013

Jeremy, I understand your concerns.

I see four simple options to move forward:

  • You don't need to do anything. We're happy keeping our own branch.
  • You unfreeze the regexp so that local installs can use your gem without forking, but customize the regexp. This is simpler, but makes it (slightly) harder to test the local regexp.
  • You either implement or accept a patch implementing predefined regexps that are tested as part of the gem. Something like AmericanDate.be_conservative or AmericanDate.match_anything_that_barely_resembles_a_date.
  • We keep trying to make a single regexp that covers most cases, and add all relevant specs from https://github.com/rubyspec/rubyspec/blob/master/library/date/parse_spec.rb to make sure it doesn't break backwards compatibility.

As I said, we're happy with our branch at this point, so I'll leave it up to you to decide how you want to proceed.

And thanks for creating this gem in the first place!

@jeremyevans
Copy link
Owner

@clonezone Ending with negative lookahead for digit seems reasonable and I think will be safe. We would also want to start with a negative lookbehind for a digit for the same reason. Negative lookbehind in a regexp is not supported on ruby 1.8 (SyntaxError), but we can probably use eval to get around that since on 1.8 the regexp isn't actually used. Here's what I'm talking about: eval('%r_(?<!\d)(\d{1,2})/(\d{1,2})/(\d{4}|\d{2})(?!\d)_')

I'm against accepting . and - as delimeters (at least by default). The point of this library is to get back the ruby 1.8.7 behavior as much as possible, and on ruby 1.8.7:

irb(main):001:0> Date.parse('10/11/2012').to_s
=> "2012-10-11"
irb(main):002:0> Date.parse('10-11-2012').to_s
=> "2012-11-10"
irb(main):003:0> Date.parse('10.11.2012').to_s
=> "2012-11-10"

@sd The regexp actually is unfrozen now. I'm not sure how much that buys you, since using remove_const and const_set allows pretty much the same thing. So the second option has pretty much always been available.

I'm not keen on adding additional configuration options or methods that change the behavior, so I'm against the third option.

Does modiyfing the regexp to use negative lookbehind and negative lookahead for a digit satisfy your fourth option? If so, I think that option is best, so if you could please update the pull request with that code and the additional spec examples we've discussed, I'll pull and release a new gem.

@clonezone
Copy link

Wow, so +1 to 1.9 for consistent behavior.

@jeremyevans
Copy link
Owner

@sd Should I try merging and testing now, or are there still some commits pending? Thanks!

@sd
Copy link
Contributor Author

sd commented Mar 22, 2013

I think that's all... let me know if any issues come up

On Fri, Mar 22, 2013 at 5:03 PM, Jeremy Evans notifications@github.comwrote:

@sd https://github.com/sd Should I try merging and testing now, or are
there still some commits pending? Thanks!


Reply to this email directly or view it on GitHubhttps://github.com//pull/5#issuecomment-15321372
.

@jeremyevans jeremyevans merged commit 2ffa50c into jeremyevans:master Mar 22, 2013
@jeremyevans
Copy link
Owner

Everything looks and tests fine. I've pushed out a 1.1.0 gem release with these fixes. Thanks to both of you for your help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants