route service concurrency limit / deadlock? #798

Closed

slvlirnoff opened this issue Jun 29, 2017 · 3 comments

Comments

slvlirnoff (Contributor) commented Jun 29, 2017

When I run the route service with a concurrency higher than 1 and stress test it with siege, after a while the server fully locks up: it still accepts requests but doesn't return any routes (they time out after a while).

There are no specific logs from the service, and the likelihood increases with the concurrency.

With a concurrency of 8 (on an 8-core server) it happens after a few hundred requests; with a lower concurrency of 3 it almost never happens over ~20k requests, but it still occurs from time to time.

Once it starts, it doesn't recover even after a long period of inactivity, unless I restart the service.
When I make a request to the route service, it produces the following output, which suggests it still computes the route but just doesn't answer the request.

2017/06/29 09:38:51.238528 GET /route?json=%7B%22locations%22%3A%5B%7B%22lon%22%3A-0.55521%2C%22lat%22%3A44.82983%2C%22type%22%3A%22break%22%7D%2C%7B%22lon%22%3A-1.67062%2C%22lat%22%3A48.1039%2C%22type%22%3A%22break%22%7D%5D%2C%22costing%22%3A%22auto%22%2C%22costing_options%22%3A%7B%22transit%22%3A%7B%22use_bus%22%3A0.3%2C%22use_rail%22%3A0.6%2C%22use_transfers%22%3A0.3%7D%7D%2C%22date_time%22%3A%7B%22type%22%3A1%2C%22value%22%3A%222017-06-29T07%3A25%22%7D%7D&api_key= HTTP/1.0
2017/06/29 09:38:51.238935 [INFO] Got Loki Request 29738
2017/06/29 09:38:51.239061 [ANALYTICS] locations_count::2
2017/06/29 09:38:51.239076 [ANALYTICS] costing_type::auto
2017/06/29 09:38:51.239192 [ANALYTICS] location_distance::374.357178km
2017/06/29 09:38:51.256769 [INFO] Got Thor Request 29738
2017/06/29 09:38:51.257295 [ANALYTICS] travel_mode::0
2017/06/29 09:38:51.291222 [ANALYTICS] admin_state_iso::
2017/06/29 09:38:51.291242 [ANALYTICS] admin_country_iso::
2017/06/29 09:38:51.291798 [INFO] Got Odin Request 29738
2017/06/29 09:38:51.292117 [INFO] trip_path_->node_size()=246
2017/06/29 09:38:51.292768 [INFO] maneuver_count::41
2017/06/29 09:38:51.293149 [INFO] Got Tyr Request 29738
2017/06/29 09:38:51.293259 [ANALYTICS] language::en-US
2017/06/29 09:38:51.294089 [ANALYTICS] trip_length::458.513519km
2017/06/29 09:38:51.294231 200 39133

Any ideas?

Best,
Cyprien

Some details:

  • planet-level import
  • some transit
  • 64 GB RAM / 8-core server
  • the test uses a fixed list of routes that are picked randomly and all return results (a minimal reproduction sketch follows this list)
  • the service is behind an nginx proxy
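
A minimal Python stand-in for the siege run described above (a sketch, not the exact command used): the host/port, timeout, and request counts are assumptions, and the one example route is decoded from the logged request earlier in this comment; the rest of the fixed list would be appended the same way.

```python
# Sketch of the stress test: fire TOTAL_REQUESTS route requests with
# CONCURRENCY workers, picking routes at random from a fixed list.
# Assumes the route service listens on http://localhost:8002; adjust the
# host/port (and api_key handling) to match your setup.
import concurrent.futures
import json
import random
import urllib.error
import urllib.parse
import urllib.request

BASE = "http://localhost:8002"   # assumed service endpoint
CONCURRENCY = 8                  # match the service's configured concurrency
TOTAL_REQUESTS = 1000

def route_path(locations):
    """Build a /route?json=... query string (trimmed to locations + costing)."""
    return "/route?json=" + urllib.parse.quote(
        json.dumps({"locations": locations, "costing": "auto"}))

routes = [
    route_path([{"lon": -0.55521, "lat": 44.82983, "type": "break"},
                {"lon": -1.67062, "lat": 48.1039, "type": "break"}]),
    # ...append the rest of the fixed route list here
]

def fetch(path):
    """One request with a timeout, so a hung server shows up as a failure."""
    try:
        with urllib.request.urlopen(BASE + path, timeout=30) as resp:
            return resp.status
    except (urllib.error.URLError, OSError) as exc:
        return f"failed: {exc}"

with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    futures = [pool.submit(fetch, random.choice(routes))
               for _ in range(TOTAL_REQUESTS)]
    for done in concurrent.futures.as_completed(futures):
        print(done.result())
```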
slvlirnoff (Contributor, Author) commented Jun 29, 2017

In fact, I had an issue in my list of test routes: one route was duplicated many times.
It seems that with enough different routes picked randomly it almost never happens, but if I use only one route for the stress test, for instance this one:

/route?json=%7B"locations"%3A%5B%7B"lon"%3A7.741244%2C"lat"%3A47.536451%2C"type"%3A"break"%7D%2C%7B"lon"%3A7.606642%2C"lat"%3A47.566794%2C"type"%3A"break"%7D%5D%2C"costing"%3A"auto"%2C"costing_options"%3A%7B"transit"%3A%7B"use_bus"%3A0.3%2C"use_rail"%3A0.6%2C"use_transfers"%3A0.3%7D%7D%2C"date_time"%3A%7B"type"%3A1%2C"value"%3A"2017-06-29T07%3A25"%7D%7D&api_key=

then it occurs very quickly.
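
For reference, a short variant of the sketch from the first comment that hammers only this route (coordinates decoded from the URL above, request trimmed to locations + costing; host/port still assumed) is enough to trigger the hang quickly:

```python
# Plug this into the stress loop from the earlier sketch: every worker
# now requests the same single route instead of a random one.
import json
import urllib.parse

single_route = "/route?json=" + urllib.parse.quote(json.dumps({
    "locations": [{"lon": 7.741244, "lat": 47.536451, "type": "break"},
                  {"lon": 7.606642, "lat": 47.566794, "type": "break"}],
    "costing": "auto",
}))
routes = [single_route]   # replaces the randomly sampled fixed list
```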

kevinkreiser (Member):
I have seen this exact behavior before when testing prime_server as a file server. The fact that you are running into it is just more evidence that it's a proper bug. I'm going to open an issue over there to track this.

kevinkreiser (Member):
We're tracking this bug upstream, where it's occurring.
