Performance of Master has degraded #2325

Closed

clinct opened this issue Aug 23, 2016 · 20 comments

@clinct
Contributor

clinct commented Aug 23, 2016

At Plannerstack we wanted to update to the latest version of Master. However, we found that the performance of the latest Master has degraded quite a bit, in some cases taking 4 times longer to return a result.

  • Deployed version of OTP, OpenTripPlanner 0.20.0-SNAPSHOT: b34e4c2
  • Tested against the current Master branch: 13763e9

Of the 15 tests, 9 now take much longer to complete:

Test       0.20.0 b34e4c2 (ms)   Master 13763e9 (ms)
Test 3     241                   372
Test 4     394                   910
Test 5     994                   1810
Test 6     441                   1184
Test 7     137                   321
Test 8     422                   984
Test 10    189                   761
Test 11    167                   331
Test 14    253                   487

These 15 tests cover special cases within the Dutch transit network; they can be found here: https://github.com/plannerstack/mmri/blob/master/tests/static-tests.json

I haven't had time yet to look into it any further (e.g. by bisecting), but it seems like quite a big issue, so I was wondering whether anyone else has seen this.

@kalon33
Contributor

kalon33 commented Aug 23, 2016

I also have the impression that I run into more timeouts than before in my OTP instance, and that it is slower, but I haven't quantified it.

Apparently, that's not only a feeling... Thanks for reporting this precisely.

@abyrd
Member

abyrd commented Aug 23, 2016

Thanks @clinct for reporting this slowdown. I am very surprised to see a 3-4x slowdown; I didn't think anything had changed recently that would have such an effect. This definitely must be addressed before the 1.0 release.

@abyrd abyrd added this to the 1.0.0 release milestone Aug 23, 2016
@abyrd
Member

abyrd commented Aug 23, 2016

@clinct what branch or fork is your deployed version b34e4c2 on? It looks like that branch diverged from master at 02a667a, and I get no response from a git branch -r --contains b34e4c26751ce87ae7997a7e3357fa33ac4b03a0.

@abyrd
Member

abyrd commented Aug 23, 2016

It looks like a branch that only has a few commits to implement the Dutch fare service. I'll examine changes on master since 02a667a.

@clinct
Contributor Author

clinct commented Aug 23, 2016

Hi @abyrd, thanks for looking into it. It was indeed a branch with work on the Dutch Fares.

@abyrd
Member

abyrd commented Aug 23, 2016

Between 02a667a and current master, the only changes I would think could have such a large impact are the changes to the origin/destination linking, and even then I don't see immediately how this could have an impact.

@abyrd
Member

abyrd commented Aug 23, 2016

Unfortunately I think we need to bisect this.

@abyrd
Member

abyrd commented Aug 23, 2016

@clinct are you sure the configuration of the two deployments is identical? Are you seeing only slowdowns, or also some speed-ups (i.e. different results and timings, or only worse timings)?

@clinct
Contributor Author

clinct commented Aug 23, 2016

@abyrd Are you able to confirm the slowdown on your end as well?

@abyrd
Member

abyrd commented Aug 23, 2016

@clinct no, I don't have any instances of OTP set up to check this right now.

@clinct
Contributor Author

clinct commented Aug 23, 2016

I only see similar or worse times; the actual results seem to be the same (based on graphing the durations of the travel times).

All I did was merge Master and rebuild (which also automatically runs the tests).

@abyrd
Member

abyrd commented Sep 7, 2016

We discussed this at the OTP steering committee meeting. If we can't find the offending commit by bisection soon, we will go ahead with the 1.0 release, because we see only slowdowns, not incorrect routes. The slowdown would then be addressed in 1.1.

@buma
Contributor

buma commented Sep 7, 2016

This could definitely be SimpleStreetSplitter. I did some tests on e359bc7, since it's the last commit where the old linker still exists and switching between them is easy. On my small part of Washington, simple car routing takes 24 ms, of which 2*8 ms is linking (origin and destination). With the old linker the trip takes 9 ms and linking takes 2+1 ms.

@clinct: You can take the same commit and switch between the linkers in StreetVertexIndexServiceImpl, around line 577: the call to the old linker is commented out, so just comment out the call to simpleStreetSplitter and uncomment the other one (a rough illustration of the toggle follows below).

I'll check why it is so slow; maybe #2271 will speed it up.
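
For illustration, a self-contained sketch of the kind of toggle described above; this is not the actual OTP code, and the Linker interface, class names, and flag are hypothetical stand-ins for the two linking code paths in StreetVertexIndexServiceImpl.

```java
// Illustrative sketch only -- not the OTP source. It models switching between the
// two linkers so that identical requests can be timed against each implementation.
public class LinkerToggleSketch {

    interface Linker {
        void link(double lat, double lon);
    }

    static class SimpleStreetSplitterLinker implements Linker {
        public void link(double lat, double lon) { /* new linking strategy */ }
    }

    static class LegacyLinker implements Linker {
        public void link(double lat, double lon) { /* old linking strategy */ }
    }

    // In OTP at e359bc7 the switch is done by commenting/uncommenting the two calls
    // around line 577 of StreetVertexIndexServiceImpl; this flag plays the same role here.
    static final boolean USE_SIMPLE_STREET_SPLITTER = true;

    public static void main(String[] args) {
        Linker linker = USE_SIMPLE_STREET_SPLITTER
                ? new SimpleStreetSplitterLinker()
                : new LegacyLinker();
        linker.link(52.37, 4.89);   // link origin
        linker.link(52.09, 5.12);   // link destination
    }
}
```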

@abyrd
Member

abyrd commented Sep 7, 2016

Hi @buma, thanks for the input. I'm guessing you mean 2+8 ms rather than 2*8 for linking, which would give us 16 msec linking + 8 msec routing for the new linker vs. 3 msec linking + 6 msec routing with the old one. But in @clinct 's tests, number 10 increased from 189 to 761 msec and number 6 increased from 441 to 1184 msec. So we're looking at 500-700 msec slowdowns and it would be strange if those were entirely attributable to the linking process.

@abyrd
Member

abyrd commented Sep 7, 2016

@buma after looking back at #2271 I now remember that the simple street splitter does not use an iteratively increasing search radius. I recently massively sped up the linking in R5 using a two-stage search (first 150 meters, then the maximum radius), and we should probably do that in OTP. But in R5 we were linking hundreds of thousands of points; here we're linking only two, so I don't see how this could cause a 500-700 msec slowdown.
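
A minimal sketch of that two-stage idea, assuming a generic spatial index with a radius query; the interface, radii, and names are assumptions for illustration, not R5's or OTP's actual API.

```java
import java.util.List;

public class TwoStageLinkSketch {

    // Hypothetical stand-in for a spatial index over street edges.
    interface SpatialIndex<T> {
        List<T> query(double lat, double lon, double radiusMeters);
    }

    static final double INITIAL_RADIUS_METERS = 150;  // cheap first pass
    static final double MAX_RADIUS_METERS = 1000;     // assumed maximum radius

    // Try a small radius first and only widen to the maximum when nothing is found,
    // so most points never pay for the expensive wide search.
    static <T> List<T> findCandidates(SpatialIndex<T> index, double lat, double lon) {
        List<T> nearby = index.query(lat, lon, INITIAL_RADIUS_METERS);
        if (!nearby.isEmpty()) {
            return nearby;
        }
        return index.query(lat, lon, MAX_RADIUS_METERS);
    }
}
```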

@abyrd
Member

abyrd commented Sep 7, 2016

Our hashgrid spatial index is not particularly efficient in the first place (see #2336). We should probably change it. As noted on #2271, we should probably re-implement the iterative search radius increase so that it only increases once, and so that it cooperates with a finer hashgrid. There are certain locations that currently hit a full 9-cell, 1.5 km x 1.5 km area, which could include a very large number of street geometries.
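
To make the 9-cell arithmetic concrete, here is a small worked sketch assuming 500 m grid cells (as implied by the 1.5 km x 1.5 km figure); the cell size, radius, and class are illustrative, not the actual HashGridSpatialIndex code.

```java
public class HashGridCellCountSketch {

    static final double CELL_SIZE_METERS = 500;  // assumed cell width

    // Worst-case number of cells a radius query touches when the query point sits
    // near a cell corner: the query window spills into the neighbouring cell on
    // every side, so each axis covers 1 + 2 * ceil(radius / cellSize) cells.
    static int cellsTouched(double radiusMeters) {
        int cellsPerAxis = 1 + 2 * (int) Math.ceil(radiusMeters / CELL_SIZE_METERS);
        return cellsPerAxis * cellsPerAxis;
    }

    public static void main(String[] args) {
        // Any radius up to one cell width already hits 3 x 3 = 9 cells in the worst
        // case, i.e. a 1.5 km x 1.5 km area whose street geometries must all be scanned.
        System.out.println(cellsTouched(150));  // 9
        System.out.println(cellsTouched(500));  // 9
    }
}
```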

@abyrd
Member

abyrd commented Sep 7, 2016

Also relevant is issue #2170

@buma
Contributor

buma commented Sep 7, 2016

@abyrd: I meant 2*8, since both the origin and the destination search take 8 ms each. I would like to try this on @clinct's tests too, but I don't know which data the tests are run with, so I can't yet see whether this contributes to the slowdowns. It's probably not just that.

@abyrd
Member

abyrd commented Sep 7, 2016

Actually I just heard from @clinct and in his latest run he no longer sees the difference. While the iterative link-radius increase is still a good idea, I'm going to put it off for 1.1 since it's just a moderate efficiency improvement on large graphs.

@abyrd
Member

abyrd commented Oct 14, 2016

Before we released 1.0, @clinct confirmed that the slowdown was no longer observable. Apparently the problem was caused by something else.

@abyrd abyrd closed this as completed Oct 14, 2016