
Performance considerations #567

Closed · filmor opened this issue Aug 17, 2016 · 12 comments

@filmor commented Aug 17, 2016

Why does Noda Time 2 use (nanoOfDay, day) as the data model? Wouldn't it be nicer to use (overflow, unixTimestampNanos)?

My concern is basically that Noda Time 2 introduces a huge overhead when converting a 64-bit integer to an Instant object, purely because of this convention (compared to Noda Time 1, where instead of multiple arithmetic operations it was a straight assignment).

Nanosecond precision in a signed 64-bit unix timestamp is enough to express roughly the interval [1677, 2262], so within that whole range you could ignore the overflow part completely.
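
For concreteness, here is a minimal sketch of the two layouts being compared; the type and field names are illustrative, not Noda Time's actual source:

  // Noda Time 2.x model: a day number plus a nanosecond-of-day.
  struct DayBasedInstant
  {
      public readonly int Days;        // days since the epoch
      public readonly long NanoOfDay;  // 0 <= NanoOfDay < 86,400,000,000,000

      public DayBasedInstant(int days, long nanoOfDay)
      {
          Days = days;
          NanoOfDay = nanoOfDay;
      }
  }

  // Proposed model: a plain nanosecond count from the unix epoch, plus a
  // small overflow component for values outside the 64-bit range.
  struct EpochBasedInstant
  {
      public readonly int Overflow;            // 0 for all "ordinary" dates
      public readonly long UnixTimestampNanos; // straight copy of an int64 timestamp

      public EpochBasedInstant(int overflow, long unixTimestampNanos)
      {
          Overflow = overflow;
          UnixTimestampNanos = unixTimestampNanos;
      }
  }

In the second layout, ingesting an int64 nanosecond timestamp is a single field assignment with Overflow = 0.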

@jskeet (Member) commented Aug 17, 2016

I assume you mean the data model for Instant? Aside from anything else, this model makes it very easy to convert from an OffsetDateTime (etc.) to an Instant, because you just need to apply the offset, potentially adding or removing a day if it overflows.
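
As an illustration of that carry (a hypothetical helper, not Noda Time's internals): applying an offset in the (days, nanoOfDay) model only touches the nanosecond-of-day field, with at most one day of carry, because offsets are always smaller than a day:

  const long NanosecondsPerDay = 24L * 60 * 60 * 1000 * 1000 * 1000;

  // Shift a (days, nanoOfDay) pair by an offset of less than one day,
  // carrying or borrowing a single day on overflow.
  static (int days, long nanoOfDay) PlusOffset(int days, long nanoOfDay, long offsetNanos)
  {
      long adjusted = nanoOfDay + offsetNanos;
      if (adjusted < 0)
      {
          adjusted += NanosecondsPerDay;
          days--;
      }
      else if (adjusted >= NanosecondsPerDay)
      {
          adjusted -= NanosecondsPerDay;
          days++;
      }
      return (days, adjusted);
  }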

Can you give an example of a piece of code which performs well in Noda Time 1.x but badly under 2.0? I'm very happy to work on improving a concrete use case, but it's hard to reason about performance in the abstract.

(The benchmarks aren't currently running automatically - I still need to do some work on pushing those out regularly - but early in the 2.0 cycle I believe I compared everything and didn't see the huge overhead you're talking about.)

@filmor (Author) commented Aug 17, 2016

Yes, I meant Instant. How does that behaviour change if you use the whole range of a 64-bit integer instead of just a single day? You can still do additions in exactly the same way.

A piece of code like this (where arr is a long[] of tick values, and the loop just constructs each value and discards it):

  for (var i = 0; i < arr.Length; ++i)
  {
      var val = Instant.FromUnixTimeTicks(arr[i]);          // 2.x API
      // var val = Instant.FromTicksSinceUnixEpoch(arr[i]); // 1.x equivalent
  }

was essentially immediate with version 1.3, as it's a straight assignment, while the new version performs orders of magnitude worse (40ms vs 3s on average for an array of 2^25 long entries).

jskeet added this to the 2.0-consider milestone Aug 17, 2016
@jskeet (Member) commented Aug 17, 2016

> How does that behaviour change if you use the whole range of a 64-bit integer instead of just a single day? You can still do additions in exactly the same way.

That changes because I'd have to convert from the date-based LocalDateTime format to the unix offset to start with.

All of this is possible, but I strongly suspect it would make quite a bit of the code more complicated - and would certainly trade off performance for some other set of operations.

In fact it wouldn't be a change to Instant so much as to Duration, which is the basis of most things now. I'll try to have a play when I get the chance...

@jskeet (Member) commented Aug 18, 2016

Out of interest, what do you later do with the Instant? I suspect that any improvement here would be negated by that, but if your use case is really just to preserve it as an Instant (perhaps only converting one in a million values to something else) then that wouldn't be particularly fair.

@jskeet (Member) commented Aug 19, 2016

I've had a quick look at this, and as soon as you use Instant.FromUnixTimeTicks(...).WithOffset(...).Year (for example), v2.0 wins on performance. Obviously the bare FromUnixTimeTicks call wins hands-down in v1.x, as it does basically nothing. Even with a change of representation, it's still going to be slower in v2.0 due to extra validation and the multiplication by NanosecondsPerTick. I'll try to do some more benchmarking to see how much we'd save if - for in-range values - we only had to do that multiplication.
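
To make the cost concrete, here's a rough sketch of the work the 2.x conversion has to do; the names and range constants are placeholders, not the real implementation:

  const long NanosecondsPerTick = 100;
  const long TicksPerDay = 864000000000;  // 24h * 60m * 60s * 10,000,000 ticks
  const int MinDays = -4000000;           // placeholder limits, not Noda Time's real ones
  const int MaxDays = 3000000;

  static (int days, long nanoOfDay) FromUnixTimeTicksSketch(long ticks)
  {
      // Floored division/modulo to split the tick count into day and tick-of-day.
      long days = ticks >= 0 ? ticks / TicksPerDay : (ticks + 1) / TicksPerDay - 1;
      long tickOfDay = ticks - days * TicksPerDay;

      // Argument validation: the part that turned out to cost about half the time.
      if (days < MinDays || days > MaxDays)
          throw new ArgumentOutOfRangeException(nameof(ticks));

      // Scale tick-of-day up to nanosecond-of-day.
      return ((int) days, tickOfDay * NanosecondsPerTick);
  }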

@jskeet (Member) commented Aug 19, 2016

Okay, a tiny bit more testing: removing the argument validation halves the time taken, and inlining the validation still improves things massively, which suggests I might want some sort of automatic inlining of preconditions. I'll raise a separate issue for that... Of course, while a 50% improvement is wonderful, it doesn't get us back to where you'd like to be. I don't think we'll ever quite get there, but I'll continue to hack at this a bit... and see whether an alternative Duration implementation can be universally better.
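
For reference, this is one way such automatic inlining of preconditions could look - a sketch of the idea, not Noda Time's actual Preconditions class:

  using System;
  using System.Runtime.CompilerServices;

  internal static class Preconditions
  {
      // AggressiveInlining lets the JIT fold the comparison into the caller,
      // so the hot path pays for a branch rather than a method call.
      [MethodImpl(MethodImplOptions.AggressiveInlining)]
      internal static void CheckArgumentRange(string paramName, long value, long min, long max)
      {
          if (value < min || value > max)
          {
              ThrowOutOfRange(paramName, value, min, max);
          }
      }

      // Throwing from a separate, non-inlined method keeps the inlined
      // fast path small enough for the JIT to accept it.
      [MethodImpl(MethodImplOptions.NoInlining)]
      private static void ThrowOutOfRange(string paramName, long value, long min, long max)
      {
          throw new ArgumentOutOfRangeException(
              paramName, value, $"Value should be in range [{min}, {max}]");
      }
  }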

@filmor (Author) commented Aug 19, 2016

Thanks for all your work already :)

The idea was basically to be fast at interchanging data. I don't always have to actually do something with the instants - only when they're "off" - so I'd like a fast path from int64 arrays.

@jskeet (Member) commented Aug 19, 2016

Thank you for bringing it to my attention :) If nothing else, I can definitely improve it by nearly 50%... and the way I do that may well improve all preconditions. I'm now going to run through the full benchmark set twice, once with the tweak to preconditions and once without. This could be a huge win :)

@jskeet (Member) commented Aug 24, 2016

Just tried this again with milliseconds, and it's an order of magnitude worse than with ticks. Ick. More work to do...

@jskeet (Member) commented Aug 30, 2016

I've got a first pass at a change in https://github.com/jskeet/nodatime/tree/duration-refactoring. Unfortunately, it doesn't affect the performance of FromUnixTimeTicks much right now - it makes things a little better, but not much. More work is required to see whether a) we can improve it further in the refactored version; b) the same work could improve things in the original 2.0 code.

Having done the refactoring (to blocks of ~180 years), it does feel slightly out of kilter with the rest of the code - in 2.0, Noda Time is consistently "day vs part-of-day" based, and this change alters that significantly. It feels like it should be faster for the very common cases (within ~180 years of the epoch), but that hasn't been proven yet.

Basically: I'm still looking, but I'm not committing to the bigger change yet. We can definitely optimize FromUnixTime* further in the current master code though.
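
As a purely hypothetical illustration of the block-based idea (using 2^62 ns ≈ 146-year blocks for a cheap shift-based split, rather than the branch's ~180 years; constants and names are made up):

  struct BlockDuration
  {
      private const long NanosecondsPerBlock = 1L << 62;  // about 146 years

      public readonly int Block;         // which block we're in; 0 near the epoch
      public readonly long NanoOfBlock;  // 0 <= NanoOfBlock < NanosecondsPerBlock

      public static BlockDuration FromUnixNanoseconds(long nanos)
      {
          // Fast path: anything in [0, 2^62) ns (roughly 1970 to 2116) is a
          // plain assignment - the cheap construction the issue is asking for.
          if (nanos >= 0 && nanos < NanosecondsPerBlock)
          {
              return new BlockDuration(0, nanos);
          }
          // Slow path: floored division into (block, nano-of-block);
          // an arithmetic shift right by 62 is a floored division by 2^62.
          long block = nanos >> 62;
          return new BlockDuration((int) block, nanos - (block << 62));
      }

      private BlockDuration(int block, long nanoOfBlock)
      {
          Block = block;
          NanoOfBlock = nanoOfBlock;
      }
  }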

@jskeet (Member) commented Sep 1, 2016

Okay, I've done a bit more work on this... in a small test rig, I get the following results for constructing instants from 100 million values:

  • Noda Time 1.x: 42ms
  • Noda Time 2.x: 1100ms after some more tweaking
  • Noda Time 2.x with large chunks: 800ms

I can get it down to about 200ms by inlining a load of code in the test rig, but it looks like as soon as there are any method calls involved, performance lands back in the 800ms area.
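
For anyone wanting to reproduce numbers in this ballpark, here's a minimal Stopwatch-based rig along these lines - not the actual harness used above, and results will vary by machine:

  using System;
  using System.Diagnostics;
  using NodaTime;

  class Program
  {
      static Instant sink;  // written to so the JIT can't elide the loop body

      static void Main()
      {
          const int Count = 100000000;  // 100 million conversions
          var sw = Stopwatch.StartNew();
          for (int i = 0; i < Count; i++)
          {
              // Arbitrary valid tick values spread over a small range.
              sink = Instant.FromUnixTimeTicks(i * 315L);
          }
          sw.Stop();
          Console.WriteLine($"{Count} conversions: {sw.ElapsedMilliseconds}ms");
      }
  }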

There's no way we'll get it to the Noda Time 1.x performance, simply because:

  • Instant is bigger now (12 / 16 bytes instead of 8) due to using nanosecond precision (which I still think is the right call, particularly as .NET moves away from being restricted to Windows)
  • There's more arithmetic to do

Basically, you happen to have a use case where the 1.x implementation is utterly trivial and the 2.x implementation isn't... so a huge performance gap is pretty inevitable. That said, I can completely see that it sucks for you :(

Given the relatively meager benefit of using large blocks here, and the significant implementation complexity, I think I'll stick with the existing model, but obviously tweak it as far as I can. (It's a shame - I'd really hoped that using large blocks would help, but it looks like it's not the maths that's taking the time...) Do push back if you can think of a solution that doesn't involve moving back to tick-level precision...

jskeet modified the milestones: 2.0.0, 2.0-consider Nov 15, 2016
@jskeet (Member) commented Nov 15, 2016

Setting this as 2.0, as I know there are a few bits I haven't merged yet. We won't be doing the big refactor, though.
