Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Time.at loseless with ActiveSupport::TimeWithZone argument #40137

Closed
wants to merge 3 commits into from
Closed

Make Time.at loseless with ActiveSupport::TimeWithZone argument #40137

wants to merge 3 commits into from

Conversation

pirj
Copy link
Contributor

@pirj pirj commented Aug 29, 2020

Summary

Float representation is not sufficient to preserve nanosecond precision. Previously, to_f conversion was needed because TimeWithZone failed to conform to Time.at protocol that requires the object to respond to to_int in addition to to_r.

https://bugs.ruby-lang.org/issues/17131#note-1

To use the precise Rational representation of time, Time.at requires to_int to be defined in addition to to_r to make a distinction and avoid using to_r with objects that are not time objects, e.g. a String "123.456" (Rational(15432/125)).

ruby/ruby@7e1fddb
https://github.com/ruby/ruby/blob/7e1fddba4a609cb7bf4a696eccd892e68753bb21/spec/ruby/core/time/at_spec.rb#L96
https://bugs.ruby-lang.org/issues/17131#note-1

Time.at performs a check if the first argument is kind of a ::Time, however it can't be tricked by overriding kind_of?, and it wouldn't be sufficient, since it uses the internal structure of the Time class to copy over the time data.

In case when to_r is defined and to_int is not, it fails fast with a TypeError. This was previously considered in a test (initially introduced by b7f9de2). The nature of origin of the exception was not determined at the moment.

fixes #38831
improves #9403

Other Information

For DateTime, the coercion to Time is expensive, but most probably comparable to the effort needed to coerce it to floating point number and back to Time that Time.at does internally.

@pirj
Copy link
Contributor Author

pirj commented Aug 29, 2020

Test failure ActionView::Template::Error: wrong number of arguments (given 2, expected 1) seems unrelated.

Comment on lines +469 to +476
def to_int
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method really supposed to be blank? I figured it'd be an alias for to_i, and return a number.

Copy link
Contributor Author

@pirj pirj Aug 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a scenario where its return value is used.

Please correct me if I'm mistaken, there's this convention for to_ methods, there are e.g. to_s and to_str (the same applies to to_i/to_int, to_h/to_hash). The latter works if the object is of a target class. So, a string-like object would respond to to_str, and that would be used e.g. in "a" + object, a to_str would be called on object. Not to_s to avoid the implicit coercion, because many classes define to_s to provide their string representation. So, the short versions are for explicit coercion and the longer for implicit.

The same, I believe applies in this case. I don't want to make ActiveSupport::TimeWithZone implicitly coercible to an integer, that would make 1 + Time.current possible, but that doesn't make any sense. I rather prefer it to return nil and fail early.

Copy link
Contributor

@danielricecodes danielricecodes Sep 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then perhaps the method should return nil as opposed to being empty? And add some comments explaining why since not everyone is going to read this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test for this? I'm not seeing one. If not, there definitely should be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test that covers the new behaviour is:

- assert_raise(TypeError) { assert_equal(Time.utc(2000, 1, 1, 0, 0, 0), Time.at(ActiveSupport::TimeWithZone.new(Time.utc(2000, 1, 1, 0, 0, 0), ActiveSupport::TimeZone["UTC"]), 0)) }
+ time_with_zone = ActiveSupport::TimeWithZone.new(time_with_nsec, ActiveSupport::TimeZone["UTC"])
+ assert_equal time_with_zone, Time.at(time_with_zone)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should return nil as opposed to being empty

This may lead to confusion since nil is a specific value, but actually it can equally be "hello", or Float::Infinity, or even raise "ooops". This method is never called, so its return value or its behaviour doesn't really matter.

add some comments explaining why

Sure, that makes total sense.

This comment was marked as outdated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added even more details in the comment. Please take another look.

@@ -47,8 +47,10 @@ def at_with_coercion(*args)
# Time.at can be called with a time or numerical value
time_or_number = args.first

if time_or_number.is_a?(ActiveSupport::TimeWithZone) || time_or_number.is_a?(DateTime)
at_without_coercion(time_or_number.to_f).getlocal
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gist of this pull request to get rid of this to_f that causes the loss of nanosecond precision.

This could have probably be achieved by using time_or_number.to_r for ActiveSupport::TimeWithZone, but that would make it the whole trickery around Time.at even harder to understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think calling to_r ourselves is preferable to implementing to_int just so that Ruby will call to_r internally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would only solve the problem with precision loss in Time.at.
I'm not 100% certain that's the only use case that Time uses to distinguish a String from a Time-like objects.
num_exact which performs respond_to?(:to_int) check, seems to be used outside of Time.at. Can't tell for sure if that could lead to loss of the precision, but why risk and work around instead of following the protocol for Time-like object that Ruby dictates?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several other core methods also check whether their arguments respond to to_int: https://github.com/ruby/spec/search?q=%22should_receive+to_int%22

Allowing those methods to accept ActiveSupport::TimeWithZone when they couldn't before seems risky. Applications may also be using respond_to?(:to_int) to detect numeric values, which would also break with this change.

Calling to_r explicitly resolves the known problem (Time.at losing precision), with minimal risk of unwanted side-effects. Implementing to_int may solve other (hypothetical) problems, but the higher risk isn't worth it, IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that Time itself doesn't respond to to_int (it responds to_i), and this makes this Ruby interface a bit awkward. However, what concerns you exactly?
Let's see those to_int usages one by one (it would probably be easier to check all ActiveSupport::TimeWithZone usages in Rails):

  1. Integer's << and >> shifts.
    If a Time object is passed as an argument, it's an undefined method 'to_int', but if it's AS::TWZ, the exception is different, TypeError: can't convert ActiveSupport::TimeWithZone to Integer (ActiveSupport::TimeWithZone#to_int gives NilClass). For me, the second is even more descriptive than the first, WDYT?
  2. For usages as Array's element reference index argument, it's somewhat similar: TypeError: no implicit conversion of Time into Integer vs TypeError: can't convert ActiveSupport::TimeWithZone to Integer (ActiveSupport::TimeWithZone#to_int gives NilClass).
    3... The same for Process.getrlimit, String#[]=, Array#drop, Array#sample, String#truncate, Process.setrlimit, Array#rotate, String#setbyte, Array#cycle, String#sum, Array.new
  3. Array#pack with Time/AS::TWZ arguments seem to behave similarly to the above given an integer directive pack format, e.g. 'Q'.
  4. Doesn't really apply to Array#hash, since it delegates to hash and expects to_int to be called on that hash, and Time-like is unlikely to have a chance to be returned there.

The list above makes the results from the first two pages of the search you've referenced.
So, from what I can see so far, there is only a difference in the exception message where AS::TWZ is used instead of Time, and it's not in a way more cryptic with AS::TWZ than it is with Time.

That makes me think that if a Time can't be used as an argument somewhere, the addition of to_int to AS::TWZ doesn't make a culprit of using AS::TWZ there, there's an exception clearly saying that it should not be used as such.

Do you have some real cases (in a way that AS::TWZ can't be used in place of Time, or where AS::TWZ can be used somewhere where the use of Time would raise an error)?
I can check for the usages of AS::TWZ in search for places where to_r should be called and it isn't.
This way, we'll have more information on hand to be able to make a more weighted decision if we want to locally work around by calling to_r in Rails and recommend doing so for other projects in Rails ecosystem, or to follow the given Ruby Time-like protocol and implement an empty to_int flag-method.

If there's any real problem, and Time-like objects that adhere to Time-like Ruby object protocol involve in a risk of weird behavioral difference between Time and Time-like (AS::TWZ in our case), we can open a discussion in Ruby issue tracker, and find a way how to avoid such different behavior.

Does it work for you, @eugeneius ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that ActiveSupport::TimeWithZone isn't just used internally, it's part of Rails' public API. Users interact with it pretty frequently, e.g. any time they call created_at on a record. I don't want to change its API in a way that's nonsensical for end users when there's an alternative way to get the same result without doing so.

Also, this may be splitting hairs, but I don't believe Ruby actually has a protocol for Time-like objects: rather, respond_to?(:to_int) is a protocol for numeric objects, and Time.at has some special behaviour for those. Implementing that protocol for ActiveSupport::TimeWithZone—which is not numeric—is undesirable, even if it results in the behaviour we want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that.
At the same time, the documentation explicitly states that:

TimeWithZone instances implement the same API as Ruby Time instances, so that Time and TimeWithZone instances are interchangeable.

This boils down to our dilemma since Time doesn't implement to_int, but in order to be interchangeable, TimeWithZone should implement it.

I might be exagerrating when mentioning "protocol", let me know what conclusion you make keeping the following in mind:

  1. https://bugs.ruby-lang.org/issues/17131#note-1

Assuming that the value passed to Time#at responds to to_r and returns a rational and responds to to_int, the precision specified by the rational should be kept.

(this doesn't explicitly states "and only if it responds to to_int", but I assume this is the case).

https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/7e1fddba4a609cb7bf4a696eccd892e68753bb21

States Time.at expects rational-like argument to respond to #to_int

https://github.com/ruby/ruby/blob/520a734ad9c7348f4e4858ee24640f42c88fd389/time.c#L525

            /* test to_int method availability to reject non-Numeric
             * objects such as String, Time, etc which have to_r method. */

The reason it lists Time along with String here is that there's a special case for Time, at least in Time.at https://github.com/ruby/ruby/blob/520a734ad9c7348f4e4858ee24640f42c88fd389/time.c#L2849

else if (IsTimeval(time)) {
 // COPY the underlying time object data

As you can see, the following is not exactly true:

respond_to?(:to_int) is a protocol for numeric objects

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My position hasn't changed from:

Calling to_r explicitly resolves the known problem (Time.at losing precision), with minimal risk of unwanted side-effects. Implementing to_int may solve other (hypothetical) problems, but the higher risk isn't worth it, IMO.

If there are other problems that would be solved by implementing to_int, let's discuss them concretely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just merged #40448, which adds the explicit to_r conversion, so that #38831 will be fixed in the next release.

If you rebase this to add the DateTime fix on top, I'd be happy to merge it too.

# Only test this if the underlying Time.at raises a TypeError
begin
Time.at_without_coercion(Time.now, 0)
rescue TypeError
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With to_int, ActiveSupport::TimeWithZone complies with the protocol that Time.at requires, and TypeError is no more.

@danielricecodes
Copy link
Contributor

My biggest question is why? ActiveSupport::TimeWithZone#to_time returns a Time object so why do you need to pass it as an argument to Time.at? Can't you just call .to_time and not lose any precision? Or is the Time.at interface itself not fully defined until this issue has been resolved?

@pirj
Copy link
Contributor Author

pirj commented Sep 8, 2020

My biggest question is why?

Good question. Mostly because of this and this:

new_job[:at] = Time.at(new_job[:at]) if new_job[:at]
job.scheduled_at = Time.at(payload[:at]) if payload.key?(:at)

but I believe there can be other usages I haven't yet stumbled upon.

Making it a rule to call to_time on ActiveSupport::TimeWithZone is possible, but that spreads out the responsibility and is error-prone and quite unintuitive.

Since ActiveSupport::TimeWithZone pretends to be a ::Time:

    # Say we're a Time to thwart type checking.
    def is_a?(klass)
      klass == ::Time || super
    end
    alias_method :kind_of?, :is_a?

it should better behave like one.

And, the main reason for this pull request, it should maintain the nanosecond precision when passed through Time.at just the ordinary ::Time does.

By the way, if you check on Linux, Time.current actually returns non-zero nanoseconds (as opposed to macOS).

To use the precise Rational representation of time, Time.at requires
`to_int` to be defined in addition to `to_r` to make a distinction and
avoid using `to_r` with objects that are not time objects, e.g. a
String "123.456" (Rational(15432/125)).

ruby/ruby@7e1fddb
https://github.com/ruby/ruby/blob/7e1fddba4a609cb7bf4a696eccd892e68753bb21/spec/ruby/core/time/at_spec.rb#L96
https://bugs.ruby-lang.org/issues/17131#note-1

Time.at performs a check if the first argument is kind of a `::Time`,
however it can't be tricked by overriding `kind_of?`, and it wouldn't be
sufficient, since it uses the internal structure of the `Time` class to
copy over the time data.

In case when `to_r` is defined and `to_int` is not, it fails fast with a
TypeError. This was previously considered in a test (initially
introduced by
b7f9de2.
The nature of origin of the exception was not determined at the moment.

This commit extends the change made in #9403
Float representation is not sufficient to preserve nanosecond precision.
Previously, `to_f` conversion was needed because TimeWithZone failed to
conform to `Time.at` protocol that requires the object to respond to
`to_int` in addition to `to_r`.
https://bugs.ruby-lang.org/issues/17131#note-1

For `DateTime`, the coercion to `Time` is expensive, but most probably
comparable to the effort needed to coerce it to floating point number
and back to `Time` that `Time.at` does internally.

fixes #38831
@pirj
Copy link
Contributor Author

pirj commented Sep 17, 2020

@danielricecodes, @natematykiewicz please take another look. Replied to your questions and added clarifications directly to the code.

@@ -47,8 +47,10 @@ def at_with_coercion(*args)
# Time.at can be called with a time or numerical value
time_or_number = args.first

if time_or_number.is_a?(ActiveSupport::TimeWithZone) || time_or_number.is_a?(DateTime)
at_without_coercion(time_or_number.to_f).getlocal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think calling to_r ourselves is preferable to implementing to_int just so that Ruby will call to_r internally.

activesupport/test/core_ext/time_ext_test.rb Outdated Show resolved Hide resolved
assert_equal time_with_zone, Time.at(time_with_zone)
assert_equal time_with_zone.to_r, Time.at(time_with_zone).to_r
assert_equal time_with_zone.to_f, Time.at(time_with_zone).to_f
assert_equal time_with_zone.nsec, Time.at(time_with_zone).nsec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we missing an equivalent test for DateTime? Its precision is also improved by these changes.

Co-authored-by: Eugene Kenny <elkenny@gmail.com>
@pirj
Copy link
Contributor Author

pirj commented Dec 6, 2020

@eugeneius @BKSpurgeon Thanks for that other fix. I guess it does the job.
Do you think anything from this pull request might still be useful, like tests? I will just close otherwise.

@benkoshy
Copy link
Contributor

benkoshy commented Dec 6, 2020

Do you think anything from this pull request might still be useful, like tests? I will just close otherwise.

I haven't scrutinised the PR too closely -- so I'm not in good position to give you an educated opinion. What I can say is that I am very appreciative of the work / thought that has gone into it; and if I can be so bold and presumptuous as to speak for the entire Rails community, I would say that they are too.

@pirj pirj closed this Dec 6, 2020
@pirj pirj deleted the fix-active_support-time_with_zone-to-work-with-time_at branch December 6, 2020 23:11
@pirj
Copy link
Contributor Author

pirj commented Dec 7, 2020

Thanks for the warm words!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Time.at(time) != time
5 participants