When running a custom Overpass API instance with no slot management, the check in downloader._get_pause for "Currently" ends up in an infinite loop. #697

Closed
jannefleischer opened this issue Apr 26, 2021 · 9 comments · Fixed by #704

@jannefleischer

When running a custom Overpass API instance with no slot management, the check in downloader._get_pause for "Currently" ends up in an infinite loop.

The Overpass /api/status endpoint returns "Currently running queries (pid, space limit, time limit, start time):" but with no slots listed beneath it, so checking for "Currently" isn't sufficient.

I worked around it by hardcoding the pause parameter in _osm_network_download to 0, but that is not really a solution...
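
The failure mode described above can be illustrated with a small parser. This is only a sketch, not OSMnx's actual `_get_pause` code: the function name `pause_from_status` and its `default_pause` parameter are hypothetical, and the "slots available now" and "Slot available after" line formats are assumptions about what a rate-limited instance reports. Only the "Currently running queries" header comes from this report. The point is that the function returns a default when no slot lines are present, rather than polling again.

```python
import re


def pause_from_status(status_text, default_pause=0):
    """Return the number of seconds to wait, parsed from /api/status text.

    Hypothetical helper: the slot line formats matched here are assumed.
    """
    waits = []
    for line in status_text.splitlines():
        # Assumed format: "2 slots available now."
        if re.search(r"\d+ slots? available now", line):
            return 0
        # Assumed format: "Slot available after: <timestamp>, in 5 seconds."
        match = re.search(r"Slot available after: .*?, in (\d+) seconds", line)
        if match:
            waits.append(int(match.group(1)))
    if waits:
        # The soonest-freed slot determines how long to wait.
        return min(waits)
    # No slot lines at all (e.g. an instance without slot management):
    # fall back to a fixed pause instead of re-checking the status forever.
    return default_pause
```

With a response that contains only the "Currently running queries" header, this returns `default_pause` immediately.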

@gboeing
Owner

gboeing commented Apr 26, 2021

Thanks @jannefleischer. One other person brought this up (on StackOverflow) a while back. I'm not sure what to suggest about this. Here's what I said to them at the time:

> OSMnx is designed to work with the response format of the main Overpass API instance. Any idea why your status endpoint response differs from the standard? Is it due to your rate limiting configuration? Can you change that to generate the same response format?

My understanding of those questions is that the main Overpass API instance has rate limiting (slot management) enabled. If an instance has it disabled, then the status endpoint generates a different format. The key issue here is that the status endpoint is difficult to work with: its format is inconsistent and semistructured, so extracting information from it requires text parsing.

I'm reluctant to add a new feature to OSMnx's codebase to handle custom Overpass API instances for a couple of reasons. The primary one is testing: without running a custom instance in CI, there's no way to conduct ongoing unit tests. It also adds code complexity to handle an edge use case, and code maintenance becomes difficult given the significant added dependency of installing and running a custom instance to be able to execute those new branches of code.

If there's a minimal solution for this that doesn't add new untestable code branches, I'm open to considering it. But I don't run a custom API instance myself, so development, ongoing code maintenance, and CI must be considered in the decision.

@mmd-osm

mmd-osm commented Apr 27, 2021

How about providing a config option in osmnx to turn off the parsing logic and rate limiting altogether? Obviously, that's only meaningful for people hosting their own private, unrestricted Overpass instance.

Coping with Overpass API /api/status variations seems too fragile, as the format is entirely undocumented, so I wouldn't recommend it.

Besides, /api/status may show incorrect results for overpass-api.de, as there are really two independent servers behind this URL. However, /api/status always reflects the status of a single server only. There's currently no workaround for it, unless you're specifically targeting one of the two machines.

@jannefleischer
Author

@mmd-osm I thought an option to disable pausing altogether, by adding a setting, would be the right choice, but I wanted to ask first before opening a pull request going that route. Adding yet another option doesn't seem like the best choice if an automatic solution were possible...
Thanks for your feedback!

@gboeing
Owner

gboeing commented Apr 27, 2021

@mmd-osm thanks for this background:

> Besides, /api/status may show incorrect results for overpass-api.de, as there are really two independent servers behind this URL. However, /api/status always reflects the status of a single server only. There's currently no workaround for it, unless you're specifically targeting one of the two machines.

That's interesting. Currently OSMnx uses overpass-api.de as its (configurable) default. What is the best practice now? Should it instead default to a specific one of the z or lz4 subdomains listed here?

I assumed pointing at overpass-api.de directly was best, to allow for any load balancing/redirecting among the servers. But if I programmatically check the status endpoint to determine when to make the next request, I want to make sure I'm seeing the correct results.

@gboeing
Owner

gboeing commented Apr 27, 2021

@jannefleischer it sounds like adding a new setting to the settings module to disable pausing may be the best option, while adding negligible (untestable) code complexity. Would you like to open a pull request?
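
A minimal sketch of what such a setting could look like, assuming a boolean flag that short-circuits the status check. The names `Settings`, `rate_limit_enabled`, and `wait_for_slot` are hypothetical and chosen for illustration only; they are not taken from OSMnx or from the eventual fix.

```python
import time
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical flag name, not OSMnx's actual setting.
    rate_limit_enabled: bool = True


def wait_for_slot(settings, get_pause, sleep=time.sleep):
    """Sleep for the server-reported pause unless rate limiting is disabled.

    `get_pause` is any callable that queries the status endpoint and
    returns the number of seconds to wait.
    """
    if not settings.rate_limit_enabled:
        # Skip the status request and its parsing entirely.
        return 0
    pause = get_pause()
    if pause > 0:
        sleep(pause)
    return pause
```

Because the disabled branch is a single early return, it adds very little code that would need a custom Overpass instance to exercise.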

@mmd-osm

mmd-osm commented Apr 27, 2021

It's a bit unfortunate that /api/status hasn't been designed in a way to handle multiple servers in a consistent way. Due to this design issue, it's hard to recommend some best practice. At this time, I wouldn't hardcode any of the subdomains, as they might change over time: new servers might get added, or existing ones decommissioned and replaced by bigger machines.

I don't know if it's feasible for osmnx to decouple the "overpass-api.de" name server lookup from sending the actual request to the server. As an example, curl offers the command line option --resolve to influence name resolution, thereby directing the request to one specific IP address. By using the same IP address for both queries and /api/status, you should get a consistent picture.

The clear downside of this approach is that you would essentially need to re-implement DNS round robin, to make sure a request uses another server in case of an issue.

curl -v http://overpass-api.de/api/status --resolve "overpass-api.de:80:178.63.11.215"

based on:

> nslookup overpass-api.de 


Non-authoritative answer:
Name:	overpass-api.de
Address: 178.63.11.215
Name:	overpass-api.de
Address: 178.63.48.217
Name:	overpass-api.de
Address: 2a01:4f8:120:6464::2
Name:	overpass-api.de
Address: 2a01:4f8:110:502c::2
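
For comparison, a rough Python analogue of the curl `--resolve` idea: resolve the name once, then address that IP directly while sending the original hostname in the `Host` header. This is a sketch under assumptions, not something OSMnx does; the helper names `resolve_one` and `pinned_request` are hypothetical, and it only covers plain HTTP as in the curl example (HTTPS would additionally need certificate and SNI handling).

```python
import ipaddress
import socket


def resolve_one(hostname, port=80):
    """Return the first IP address the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return infos[0][4][0]


def pinned_request(ip, hostname, path):
    """Build (url, headers) that target one IP but keep the original Host."""
    addr = ipaddress.ip_address(ip)
    # IPv6 literals must be wrapped in brackets inside a URL.
    host_part = f"[{addr}]" if addr.version == 6 else str(addr)
    return f"http://{host_part}{path}", {"Host": hostname}
```

Calling `pinned_request` with the same IP for both `/api/status` and the query endpoint, and passing the resulting URL and headers to an HTTP client, should direct both requests to the same machine. As noted above, the caller then becomes responsible for falling back to another address if that server has a problem.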

@gboeing
Owner

gboeing commented Apr 28, 2021

Thanks @mmd-osm. I'm opening that "subdomain" challenge as a separate issue, #698, so that this issue can focus on @jannefleischer's original idea of disabling pausing.

@gboeing
Owner

gboeing commented May 9, 2021

@jannefleischer see proposed fix in #704

@gboeing
Owner

gboeing commented Jun 6, 2021

This feature has been released in v1.1.1
