
Performance considerations #567

Closed · filmor opened this issue Aug 17, 2016 · 12 comments

@filmor commented Aug 17, 2016

Why does Noda Time 2 use (nanoOfDay, day) as the data model? Wouldn't it be nicer to use (overflow, unixTimestampNanos)?

My concern is basically that Noda Time 2 introduces a huge overhead when converting a 64-bit integer to an Instant object, purely because of this convention (compared to Noda Time 1, where instead of multiple arithmetic operations it was a straight assignment).

Nanosecond precision in a signed 64-bit unix timestamp is enough to express roughly the interval [1677, 2262], so within that whole range you could ignore the overflow part completely.
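
For concreteness, here is a minimal sketch of the two layouts being compared; the type and field names are illustrative, not Noda Time's actual source:

  // Noda Time 2.x model: a day number plus a nanosecond-of-day.
  struct DayBasedInstant
  {
      public readonly int Days;        // days since the epoch
      public readonly long NanoOfDay;  // 0 <= NanoOfDay < 86,400,000,000,000

      public DayBasedInstant(int days, long nanoOfDay)
      {
          Days = days;
          NanoOfDay = nanoOfDay;
      }
  }

  // Proposed model: a plain nanosecond count from the unix epoch, plus a
  // small overflow component for values outside the 64-bit range.
  struct EpochBasedInstant
  {
      public readonly int Overflow;            // 0 for all "ordinary" dates
      public readonly long UnixTimestampNanos; // straight copy of an int64 timestamp

      public EpochBasedInstant(int overflow, long unixTimestampNanos)
      {
          Overflow = overflow;
          UnixTimestampNanos = unixTimestampNanos;
      }
  }

In the second layout, ingesting an int64 nanosecond timestamp is a single field assignment with Overflow = 0.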

@jskeet (Member) commented Aug 17, 2016

I assume you mean the data model for Instant? Aside from anything else, this model makes it very easy to convert from an OffsetDateTime (etc.) to an Instant, because you just need to apply the offset, potentially adding or removing a day if it overflows.
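
As an illustration of that carry (a hypothetical helper, not Noda Time's internals): applying an offset in the (days, nanoOfDay) model only touches the nanosecond-of-day field, with at most one day of carry, because offsets are always smaller than a day:

  const long NanosecondsPerDay = 24L * 60 * 60 * 1000 * 1000 * 1000;

  // Shift a (days, nanoOfDay) pair by an offset of less than one day,
  // carrying or borrowing a single day on overflow.
  static (int days, long nanoOfDay) PlusOffset(int days, long nanoOfDay, long offsetNanos)
  {
      long adjusted = nanoOfDay + offsetNanos;
      if (adjusted < 0)
      {
          adjusted += NanosecondsPerDay;
          days--;
      }
      else if (adjusted >= NanosecondsPerDay)
      {
          adjusted -= NanosecondsPerDay;
          days++;
      }
      return (days, adjusted);
  }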

Can you give an example of a piece of code which performs well in Noda Time 1.x but badly under 2.0? I'm very happy to work on improving a concrete use case, but it's hard to reason about performance in the abstract.

(The benchmarks aren't currently running automatically - I still need to do some work on pushing those out regularly - but early in the 2.0 cycle I believe I compared everything and didn't see the huge overhead you're talking about.)

@filmor (Author) commented Aug 17, 2016

Yes, I meant Instant. How does that behaviour change if you use the whole range of a 64-bit integer instead of just a single day? You can still do additions in exactly the same way.

A piece of code like this (where arr is a long[] of tick values, and the loop just constructs each value and discards it):

  for (var i = 0; i < arr.Length; ++i)
  {
      var val = Instant.FromUnixTimeTicks(arr[i]);          // 2.x API
      // var val = Instant.FromTicksSinceUnixEpoch(arr[i]); // 1.x equivalent
  }

was essentially immediate with version 1.3, as it's a straight assignment, while the new version performs orders of magnitude worse (40ms vs 3s on average for an array of 2^25 long entries).

jskeet added this to the 2.0-consider milestone Aug 17, 2016
@jskeet (Member) commented Aug 17, 2016

> How does that behaviour change if you use the whole range of a 64-bit integer instead of just a single day? You can still do additions in exactly the same way.

That changes because I'd have to convert from the date-based LocalDateTime format to the unix offset to start with.

All of this is possible, but I strongly suspect it would make quite a bit of the code more complicated - and would certainly trade off performance for some other set of operations.

In fact it wouldn't be a change to Instant so much as to Duration, which is the basis of most things now. I'll try to have a play when I get the chance...

@jskeet (Member) commented Aug 18, 2016

Out of interest, what do you later do with the Instant? I suspect that any improvement here would be negated by that, but if your use case is really just to preserve it as an Instant (perhaps only converting one in a million values to something else) then that wouldn't be particularly fair.

@jskeet (Member) commented Aug 19, 2016

I've had a quick look at this, and as soon as you use Instant.FromUnixTimeTicks(...).WithOffset(...).Year (for example), v2.0 wins on performance. Obviously the bare FromUnixTimeTicks call wins hands-down in v1.x, as it does basically nothing. Even with a change of representation, it's still going to be slower in v2.0 due to extra validation and the multiplication by NanosecondsPerTick. I'll try to do some more benchmarking to see how much we'd save if - for in-range values - we only had to do that multiplication.
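
To make the cost concrete, here's a rough sketch of the work the 2.x conversion has to do; the names and range constants are placeholders, not the real implementation:

  const long NanosecondsPerTick = 100;
  const long TicksPerDay = 864000000000;  // 24h * 60m * 60s * 10,000,000 ticks
  const int MinDays = -4000000;           // placeholder limits, not Noda Time's real ones
  const int MaxDays = 3000000;

  static (int days, long nanoOfDay) FromUnixTimeTicksSketch(long ticks)
  {
      // Floored division/modulo to split the tick count into day and tick-of-day.
      long days = ticks >= 0 ? ticks / TicksPerDay : (ticks + 1) / TicksPerDay - 1;
      long tickOfDay = ticks - days * TicksPerDay;

      // Argument validation: the part that turned out to cost about half the time.
      if (days < MinDays || days > MaxDays)
          throw new ArgumentOutOfRangeException(nameof(ticks));

      // Scale tick-of-day up to nanosecond-of-day.
      return ((int) days, tickOfDay * NanosecondsPerTick);
  }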

@jskeet (Member) commented Aug 19, 2016

Okay, a tiny bit more testing: removing the argument validation halves the time taken, and inlining the validation still improves things massively, which suggests I might want some sort of automatic inlining of preconditions. I'll raise a separate issue for that... Of course, while a 50% improvement is wonderful, it doesn't get us back to where you'd like to be. I don't think we'll ever quite get there, but I'll continue to hack at this a bit... and see whether an alternative Duration implementation can be universally better.
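
For reference, this is one way such automatic inlining of preconditions could look - a sketch of the idea, not Noda Time's actual Preconditions class:

  using System;
  using System.Runtime.CompilerServices;

  internal static class Preconditions
  {
      // AggressiveInlining lets the JIT fold the comparison into the caller,
      // so the hot path pays for a branch rather than a method call.
      [MethodImpl(MethodImplOptions.AggressiveInlining)]
      internal static void CheckArgumentRange(string paramName, long value, long min, long max)
      {
          if (value < min || value > max)
          {
              ThrowOutOfRange(paramName, value, min, max);
          }
      }

      // Throwing from a separate, non-inlined method keeps the inlined
      // fast path small enough for the JIT to accept it.
      [MethodImpl(MethodImplOptions.NoInlining)]
      private static void ThrowOutOfRange(string paramName, long value, long min, long max)
      {
          throw new ArgumentOutOfRangeException(
              paramName, value, $"Value should be in range [{min}, {max}]");
      }
  }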

@filmor (Author) commented Aug 19, 2016

Thanks for all your work already :)

The idea was basically to be fast at interchanging data. I don't always have to actually do something with the instants - only when they're "off" - so I'd like a fast path from int64 arrays.

@jskeet (Member) commented Aug 19, 2016

Thank you for bringing it to my attention :) If nothing else, I can definitely improve it by nearly 50%... and the way I do that may well improve all preconditions. I'm now going to run through the full benchmark set twice, once with the tweak to preconditions and once without. This could be a huge win :)

@jskeet (Member) commented Aug 24, 2016

Just tried this again with milliseconds, and it's an order of magnitude worse than with ticks. Ick. More work to do...

@jskeet (Member) commented Aug 30, 2016

I've got a first pass at a change in https://github.com/jskeet/nodatime/tree/duration-refactoring. Unfortunately, it doesn't affect the performance of FromUnixTimeTicks much right now - it makes things a little better, but not much. More work is required to see whether a) we can improve it further in the refactored version; b) the same work could improve things in the original 2.0 code.

Having done the refactoring (to blocks of ~180 years), it does feel slightly out of kilter with the rest of the code - in 2.0, Noda Time is consistently "day vs part-of-day" based, and this change alters that significantly. It feels like it should be faster for the very common cases (within ~180 years of the epoch), but that hasn't been proven yet.

Basically: I'm still looking, but I'm not committing to the bigger change yet. We can definitely optimize FromUnixTime* further in the current master code though.
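
As a purely hypothetical illustration of the block-based idea (using 2^62 ns ≈ 146-year blocks for a cheap shift-based split, rather than the branch's ~180 years; constants and names are made up):

  struct BlockDuration
  {
      private const long NanosecondsPerBlock = 1L << 62;  // about 146 years

      public readonly int Block;         // which block we're in; 0 near the epoch
      public readonly long NanoOfBlock;  // 0 <= NanoOfBlock < NanosecondsPerBlock

      public static BlockDuration FromUnixNanoseconds(long nanos)
      {
          // Fast path: anything in [0, 2^62) ns (roughly 1970 to 2116) is a
          // plain assignment - the cheap construction the issue is asking for.
          if (nanos >= 0 && nanos < NanosecondsPerBlock)
          {
              return new BlockDuration(0, nanos);
          }
          // Slow path: floored division into (block, nano-of-block);
          // an arithmetic shift right by 62 is a floored division by 2^62.
          long block = nanos >> 62;
          return new BlockDuration((int) block, nanos - (block << 62));
      }

      private BlockDuration(int block, long nanoOfBlock)
      {
          Block = block;
          NanoOfBlock = nanoOfBlock;
      }
  }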

@jskeet (Member) commented Sep 1, 2016

Okay, I've done a bit more work on this... in a small test rig, I get the following results for constructing instants from 100 million values:

  • Noda Time 1.x: 42ms
  • Noda Time 2.x: 1100ms after some more tweaking
  • Noda Time 2.x with large chunks: 800ms

I can get it down to about 200ms by inlining a load of code in the test rig, but it looks like as soon as there are any method calls involved, performance lands back in the 800ms area.
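
For anyone wanting to reproduce numbers in this ballpark, here's a minimal Stopwatch-based rig along these lines - not the actual harness used above, and results will vary by machine:

  using System;
  using System.Diagnostics;
  using NodaTime;

  class Program
  {
      static Instant sink;  // written to so the JIT can't elide the loop body

      static void Main()
      {
          const int Count = 100000000;  // 100 million conversions
          var sw = Stopwatch.StartNew();
          for (int i = 0; i < Count; i++)
          {
              // Arbitrary valid tick values spread over a small range.
              sink = Instant.FromUnixTimeTicks(i * 315L);
          }
          sw.Stop();
          Console.WriteLine($"{Count} conversions: {sw.ElapsedMilliseconds}ms");
      }
  }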

There's no way we'll get it to the Noda Time 1.x performance, simply because:

  • Instant is bigger now (12 / 16 bytes instead of 8) due to using nanosecond precision (which I still think is the right call, particularly as .NET moves away from being restricted to Windows)
  • There's more arithmetic to do

Basically, you happen to have a use case where the 1.x implementation is utterly trivial and the 2.x implementation isn't... so a huge performance gap is pretty inevitable. That said, I can completely see that it sucks for you :(

Given the relatively meager benefit of using large blocks here, and the significant implementation complexity, I think I'll stick with the existing model, but obviously tweak it as far as I can. (It's a shame - I'd really hoped that using large blocks would help, but it looks like it's not the maths that's taking the time...) Do push back if you can think of a solution that doesn't involve moving back to tick-level precision...

jskeet modified the milestones: 2.0.0, 2.0-consider Nov 15, 2016
@jskeet (Member) commented Nov 15, 2016

Setting this as 2.0, as I know there are a few bits I haven't merged yet. We won't be doing the big refactor, though.
