New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rapid, subsequent requests will return unexpected 500's #1611

Closed
clinct opened this Issue Nov 12, 2014 · 7 comments

Comments

Projects
None yet
4 participants
@clinct
Collaborator

clinct commented Nov 12, 2014

I did some stress testing of OTP master branch and found out that if I rapidly fire different requests (1 per 0.5 seconds, around 20), some of them that normally return a 200 will suddenly start returning 500’s.

The performed requests were all send with similar parameters with only varying fromPlaces:

?maxTransfers=25&_dc=1358423838102&arriveBy=false&mode=TRANSIT,WALK&optimize=QUICK&maxWalkDistance=3000&walkSpeed=1.3888&hst=true&date=2014-11-11&time=16:30&toPlace=52.368104267594056,4.856208655327167&fromPlace=52.3681107,4.8562698

@hannesj

This comment has been minimized.

Contributor

hannesj commented Nov 12, 2014

I've also seen this. It seems that the current usage of having runState as a field in GenericAStar is not thread-safe. I solved this by having a new GenericAStar per request and haven't had issues since.

See https://github.com/hannesj/OpenTripPlanner/blob/master/src/main/java/org/opentripplanner/routing/impl/SimpleAStarPathServiceImpl.java

@abyrd

This comment has been minimized.

Member

abyrd commented Nov 12, 2014

Thanks @clinct for the report, and thank you @hannesj for pointing out the source of the problem. These state variables were originally function-scoped, but were pulled out to fields recently to make it possible to pause the routing process and observe its behavior. It's not immediately obvious to someone jumping into OTP that GenericAStar is not instantiated on a per-request basis, so yes we ended up with a non-thread-safe situation. OTP behavior is currently undefined under concurrent requests, so we'll make it a priority to fix this.

@abyrd

This comment has been minimized.

Member

abyrd commented Nov 12, 2014

By the way @hannesj, if you spot and diagnose things like this, definitely feel free to ticket/report them in the future.

@hannesj

This comment has been minimized.

Contributor

hannesj commented Nov 12, 2014

@abyrd I've always tried to report them when they come up, but this one slipped my mind, as I was working on a separate path service implementation at that time. (even tough it is actually a workaround for another bug in OTP)

@bmander

This comment has been minimized.

Contributor

bmander commented Nov 13, 2014

PathService implementations no longer take SPTServices in the constructor; rather take an SPTServiceFactory and instantiate their own SPTService instances as needed.

84c9e0a

@abyrd Could you sign off on this before closing it?

@clinct

This comment has been minimized.

Collaborator

clinct commented Nov 16, 2014

Repeated the same tests and the issue is no longer there; no unexpected 500's were returned. Thanks!

@abyrd

This comment has been minimized.

Member

abyrd commented Nov 17, 2014

Some line notes added directly in 84c9e0a, but as that is not a pull request they did not carry over to this ticket. Most of that commentary is now ticket #1618.

@abyrd abyrd closed this Nov 17, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment