Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New TimeSpan.From overloads which take integers. #93890

Closed
tannergooding opened this issue Oct 23, 2023 · 24 comments · Fixed by #98633
Closed

New TimeSpan.From overloads which take integers. #93890

tannergooding opened this issue Oct 23, 2023 · 24 comments · Fixed by #98633
Labels
api-approved API was approved in API review, it can be implemented area-System.DateTime
Milestone

Comments

@tannergooding
Copy link
Member

tannergooding commented Oct 23, 2023

Summary

TimeSpan exposes several From* methods that allow users to create a timespan using a double. However, double is a binary based floating-point format and thus has natural imprecision that can introduce error. For example, TimeSpan.FromSeconds(101.832) is not 101 seconds, 832 milliseconds, but rather is 101 seconds, 831.9999999999936335370875895023345947265625 milliseconds.

This has, over time, led to repeated user confusion and bugs in the API surface that users have had to rationalize and deal with. It is also one of the less efficient ways to represent this data and makes it problematic for users wanting or expecting a particular behavior

As such, it is proposed that new overloads be provided, allowing users to pass in integers so they can get the desired and intended behavior. It may be worth considering whether or not the double overloads should be "deprecated" (either via obsoletion or some analyzer) to help direct users towards success.

API Proposal

namespace System;

public partial struct TimeSpan
{
    public static TimeSpan FromDays(int days, int hours = 0, int minutes = 0, int seconds = 0, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromHours(int hours, int minutes = 0, int seconds = 0, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromMinutes(int minutes, int seconds = 0, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromSeconds(int seconds, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromMilliseconds(int milliseconds, int microseconds = 0);
    public static TimeSpan FromMicroseconds(int microseconds);
}

Additional Considerations

It may be desirable to not use default parameters. However, this provides a benefit that a user can do something like FromDays(5, seconds: 30) if that were appropriate for their use case.

After these APIs are exposed, calling From(x) where x was an integer may produce a different result in various edge cases.

APIs where the user specifies a uint, long, or ulong would still end up calling the double overload due to the implicit conversion that exists in C#. Thus, an analyzer or deprecation of the double overloads may be desirable to help users achieve success.

Exposing overloads that take decimal could also be desirable and would avoid the issues around precision loss, but would be much less performant than breaking it apart like this.

@tannergooding tannergooding added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.DateTime labels Oct 23, 2023
@ghost
Copy link

ghost commented Oct 23, 2023

Tagging subscribers to this area: @dotnet/area-system-datetime
See info in area-owners.md if you want to be subscribed.

Issue Details

Summary

TimeSpan exposes several From* methods that allow users to create a timespan using a double. However, double is a binary based floating-point format and thus has natural imprecision that can introduce error. For example, TimeSpan.FromSeconds(101.832) is not 101 seconds, 832 milliseconds, but rather is 101 seconds, 831.9999999999936335370875895023345947265625 milliseconds.

This has, over time, led to repeated user confusion and bugs in the API surface that users have had to rationalize and deal with. It is also one of the less efficient ways to represent this data and makes it problematic for users wanting or expecting a particular behavior

As such, it is proposed that new overloads be provided, allowing users to pass in integers so they can get the desired and intended behavior. It may be worth considering whether or not the double overloads should be "deprecated" (either via obsoletion or some analyzer) to help direct users towards success.

API Proposal

namespace System;

public partial struct TimeSpan
{
    public static TimeSpan FromDays(int days, int hours = 0, int minutes = 0, int seconds = 0, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromHours(int hours = 0, int minutes = 0, int seconds = 0, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromMinutes(int minutes, int seconds = 0, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromSeconds(int seconds, int milliseconds = 0, int microseconds = 0);
    public static TimeSpan FromMilliseconds(int milliseconds, int microseconds = 0);
    public static TimeSpan FromMicroseconds(int microseconds);
}

Additional Considerations

It may be desirable to not use default parameters. However, this provides a benefit that a user can do something like FromDays(5, seconds: 30) if that were appropriate for their use case.

After these APIs are exposed, calling From(x) where x was an integer may produce a different result in various edge cases.

APIs where the user specifies a uint, long, or ulong would still end up calling the double overload due to the implicit conversion that exists in C#. Thus, an analyzer or deprecation of the double overloads may be desirable to help users achieve success.

Exposing overloads that take decimal could also be desirable and would avoid the issues around precision loss, but would be much less performant than breaking it apart like this.

Author: tannergooding
Assignees: -
Labels:

api-suggestion, area-System.DateTime

Milestone: -

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Oct 23, 2023
@tannergooding
Copy link
Member Author

Opened in response to #93831 and the numerous other issues where the various TimeSpan APIs which take double have shown themselves to be problematic.

CC. @tarekgh

@vcsjones
Copy link
Member

vcsjones commented Oct 23, 2023

TimeSpan has constructors that largely do the same thing, unless I am misunderstanding. There was also the recent addition of constructors that take microseconds in #23799.

Are the constructors reasonable enough? If not, should the focus instead be on improving the constructors?

@tarekgh
Copy link
Member

tarekgh commented Oct 23, 2023

public static TimeSpan FromHours(int hours = 0, int minutes = 0, int seconds = 0, int milliseconds = 0, int microseconds = 0);

@tannergooding did you mean to have the hours parameter with default value?

@vcsjones the idea here is having the users not providing a lot of value as the constructor requires. Having default parameter values will help.

@tarekgh tarekgh added this to the 9.0.0 milestone Oct 23, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Oct 23, 2023
@tarekgh tarekgh modified the milestones: 9.0.0, Future Oct 23, 2023
@tannergooding
Copy link
Member Author

did you mean to have the hours parameter with default value?

Nope. The first parameter should be non-defaultable. Fixed.

Please feel free to mark api-ready-for-review if you think this is in the right shape and good for API review.

Are the constructors reasonable enough? If not, should the focus instead be on improving the constructors?

I think the constructors are worse in a lot of ways over static methods. For example, new TimeSpan(1, 2, 3) and new TimeSpan(1, 2, 3, 4) having arguments not line up (one is hours first the other is days first). You can't expose defaultable parameters without causing ambiguities. new TimeSpan(...) in general doesn't help provide clarity or intent, at a glance, etc.

The explicit From* APIs, on the other hand, avoid most of these issues and provide additional clarity to the code in question (as a personal preference, TimeSpan.FromHours(5, 1, 30) also looks a lot cleaner than new TimeSpan(5, 1, 30) and is more visually appealing in code).

@tarekgh tarekgh added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Oct 23, 2023
@skyoxZ
Copy link
Contributor

skyoxZ commented Oct 24, 2023

After these APIs are exposed, calling From(x) where x was an integer may produce a different result in various edge cases.

Really? I thought double deals Integer without precision loss. Then these APIs seem useless.

@tannergooding
Copy link
Member Author

I thought double deals Integer without precision loss

No. double can only accurately represent integers up to 2^53, after this point it is lossy. It introduces "loss" for fractional data far before this. This likewise means a loss of data when representing things like FromDays() where the scale is large, because they become a large integral value that can exceed this boundary. -- For float, this boundary is 2^24

The exclusively integer based approach as proposed here is more accurate and likely to be faster for several scenarios. It likewise doesn't need to handle edge cases such as NaN or Infinity. The difference in result is that it will return a precise answer matching exactly what the user intended and will not introduce minor inaccuracies due to underlying limitations of a binary based floating-point representation. -- There are notably ways to improve the accuracy of the double based overloads, and we've done several of those. But there is fundamentally still some level of error since what the user types as a literal is not always going to be the actual value represented, especially when fractional data is involved.

@Clockwork-Muse
Copy link
Contributor

I'm pretty sure @tarekgh meant int specifically here, and as a 32-bit type would be less than the precision issues of double.

@tannergooding
Copy link
Member Author

The consideration is you have 864_000_000_000 ticks per day. This means trying to represent more than 10424 days (28.55 years) puts you at risk of data loss because it becomes a double in the general range of 2^53 once scaled to ticks

Using int ensures there is no loss of data as you can simply upcast to long and then scale, since it will be in range of 2^63, you’re ultimately fine.

You could, notably, avoid much of this data loss for the double overload by breaking the double into integral and fractional parts, casting the integral part to an integer, then scaling, then separately scaling by the fractional part and adding them together. However, this makes the code more expensive and results in a non trivial performance regression for which users don’t have an easy workaround. It likewise doesn’t solve the issue around dealing with the fractional data that is likely not as precise as what the user likely typed as a literal (given only powers of 2 and certain multiples of powers of 2 can be represented exactly)

It’s an altogether tricky business and a good example of where the convenience factor doesn’t necessarily make something good. It shows how making a fix can then also be problematic due to people depending on the existing behavior or perf characteristics.

@Clockwork-Muse
Copy link
Contributor

... Sure, but all the internal representation is integer based, so that only applies to the constructor.

@tannergooding
Copy link
Member Author

@Clockwork-Muse, I'm unsure what you're trying to say or point you're making. Could you please clarify?

What I'm stating is that TimeSpan is backed by a 64-bit data type that tracks 100ns ticks. The constructors are setup in a way that makes them confusing (potentially violates some framework design guidelines) and which cannot allow them to use things like defaultable parameters, so they cannot be used here. The From* methods currently take a double, this introduces error with fractional data and due to the current implementation error with integer data when it is scaled by the TicksPer* value, causing it to pass 2^53. It cannot be trivially changed due to it the significant and negative impact it would have on performance.

These new functions address those various concerns and give a much friendlier way to express the data, both from a performance and usability perspective.

@Clockwork-Muse
Copy link
Contributor

Sorry, yes, I'm agreeing with you here.

@bartonjs
Copy link
Member

bartonjs commented Oct 24, 2023

Video

  • We discussed the edge cases of when FromDays(literal) might give a different value on a recompile (versus that number treated as a double). Unless Jeremy did bad math, that is approximately at FromDays(67_108_864), which seems big enough to not be a concern.
  • The methods should be overloaded to have no default parameters per our normal default parameters usability guideline
  • We changed minutes and smaller to be long because their values can mathematically exceed int.MaxValue without exceeding DateTime.MaxValue
namespace System;

public partial struct TimeSpan
{
    public static TimeSpan FromDays(int days);
    public static TimeSpan FromDays(int days, int hours = 0, long minutes = 0, long seconds = 0, long milliseconds = 0, long  microseconds = 0);
    public static TimeSpan FromHours(int hours);
    public static TimeSpan FromHours(int hours, long minutes = 0, long seconds = 0, long milliseconds = 0, long microseconds = 0);
    public static TimeSpan FromMinutes(long minutes);
    public static TimeSpan FromMinutes(long minutes, long seconds = 0, long milliseconds = 0, long microseconds = 0);
    public static TimeSpan FromSeconds(long seconds);
    public static TimeSpan FromSeconds(long seconds, long milliseconds = 0, long microseconds = 0);
    public static TimeSpan FromMilliseconds(long milliseconds, long microseconds = 0);
    public static TimeSpan FromMicroseconds(long microseconds);
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Oct 24, 2023
@tannergooding tannergooding added the help wanted [up-for-grabs] Good issue for external contributors label Oct 24, 2023
@tcortega
Copy link
Contributor

@bartonjs @tannergooding I'd be interested in working on this issue if it's still available. I have a few clarifying questions:

  1. What's the plan for the double overloads? Will they remain unchanged or is there an intent to deprecate them?
  2. I'm assuming the TimeSpan constructors will remain as they are. Can you confirm this?
  3. The original issue mentioned the addition of analyzers. Is integrating an analyzer a requisite for resolving this issue?

@tarekgh
Copy link
Member

tarekgh commented Oct 24, 2023

@tcortega thanks for willing to help with this one. You can work on it.

What's the plan for the double overloads? Will they remain unchanged or is there an intent to deprecate them?

No, we are not touching these overloads nor deprecating it.

I'm assuming the TimeSpan constructors will remain as they are. Can you confirm this?

Yes, we are not touching the constructors.

The original issue mentioned the addition of analyzers. Is integrating an analyzer a requisite for resolving this issue?

No. analyzer work shouldn't block this issue.

@tarekgh
Copy link
Member

tarekgh commented Oct 28, 2023

@tcortega are you going to work in this one?

@tcortega
Copy link
Contributor

@tcortega are you going to work in this one?

Yes, can you assign it to me please?

tcortega added a commit to tcortega/runtime that referenced this issue Oct 28, 2023
@tcortega
Copy link
Contributor

@tarekgh I've submitted a preliminary pull request with my current modifications. It'd be great if you could review it and provide feedback on any issues. In the meantime, I'll be working on the test cases for the new methods.

@colejohnson66
Copy link

As an outsider: what is the purpose of special casing long.MinValue and long.MaxValue with checks to return the static TimeSpan.MinValue and TimeSpan.MaxValue? new TimeSpan(long.MinValue) is identical to TimeSpan.MinValue. And because it’s a struct, there’s no allocations to worry about.

Is this a runtime optimization/inlining thing? If it is, I’d argue it should be called out in the source for future readers.

@tcortega
Copy link
Contributor

As an outsider: what is the purpose of special casing long.MinValue and long.MaxValue with checks to return the static TimeSpan.MinValue and TimeSpan.MaxValue? new TimeSpan(long.MinValue) is identical to TimeSpan.MinValue. And because it’s a struct, there’s no allocations to worry about.

Is this a runtime optimization/inlining thing? If it is, I’d argue it should be called out in the source for future readers.

I'd argue that it is due to allocations purpose but I simply attempted to copy the current structure of the IntervalFromDoubleTicks that performs an equivalent check:

private static TimeSpan IntervalFromDoubleTicks(double ticks)
{
    if ((ticks > long.MaxValue) || (ticks < long.MinValue) || double.IsNaN(ticks))
        ThrowHelper.ThrowOverflowException_TimeSpanTooLong();
    if (ticks == long.MaxValue)
        return MaxValue;
    return new TimeSpan((long)ticks);
}

@tcortega
Copy link
Contributor

@colejohnson66 You were right anyways, I refactored the code.

@tcortega
Copy link
Contributor

@tarekgh In our prior conversation on PR #94143, you mentioned I should align the overflow check with the existing one in the constructor found at

public TimeSpan(int days, int hours, int minutes, int seconds, int milliseconds, int microseconds)

However, it appears that the current validation is flawed, as demonstrated by the following inconsistency when using csharprepl:

> new TimeSpan(2142483647, 0, 0, 0, 0, 0)
7443823.15:41:44.4838400

> TimeSpan.FromDays(2142483647)
┌───────────────────────Exception───────────────────────┐
│ TimeSpan overflowed because the duration is too long. │
└───────────────────────────────────────────────────────┘

@tarekgh
Copy link
Member

tarekgh commented Oct 30, 2023

@tcortega this is a bug we can fix. It doesn't mean we should follow this wrong behavior.

@tcortega
Copy link
Contributor

tcortega commented Nov 1, 2023

I had a lot on my plate today, so I'm planning to knock this out tomorrow or the day after.

@tarekgh tarekgh modified the milestones: Future, 9.0.0 Nov 20, 2023
stefannikolei added a commit to stefannikolei/runtime that referenced this issue Dec 6, 2023
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Dec 8, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 10, 2024
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 18, 2024
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Mar 2, 2024
@tarekgh tarekgh removed the help wanted [up-for-grabs] Good issue for external contributors label Mar 2, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Apr 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.DateTime
Projects
None yet
8 participants