Function to download osm data #8

Closed
Robinlovelace opened this issue Nov 1, 2016 · 17 comments

@Robinlovelace
Member

Sometimes you just want to download osm data and save it as a .osm file, e.g. to open in QGIS or JOSM. Is this possible?

@Robinlovelace
Member Author

Robinlovelace commented Nov 1, 2016

Also I think it would be great to enable printing of overpass-turbo queries for testing, e.g.: http://overpass-turbo.eu/s/jM0

[out:json][timeout:25];
// gather results
(
  // query part for: “leisure=park”
  node["leisure"="park"]({{bbox}});
  way["leisure"="park"]({{bbox}});
  relation["leisure"="park"]({{bbox}});
);
// print results
out body;
>;
out skel qt;
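
As a rough illustration of what printing a query for testing could look like, here is a minimal sketch in Python (not the package's R code). The function name `build_overpass_query` and its parameters are hypothetical. Note that `{{bbox}}` is an overpass-turbo placeholder, so the sketch substitutes explicit coordinates in Overpass order (south, west, north, east).

```python
def build_overpass_query(key, value, bbox, timeout=25):
    """Return an Overpass QL string for key=value inside bbox.

    bbox is (south, west, north, east). This helper is hypothetical and
    only mirrors the structure of the overpass-turbo query above.
    """
    south, west, north, east = bbox
    area = f"({south},{west},{north},{east})"
    # One selector line per OSM element type, as in the original query.
    selectors = "\n".join(
        f'  {element}["{key}"="{value}"]{area};'
        for element in ("node", "way", "relation")
    )
    return (
        f"[out:json][timeout:{timeout}];\n"
        "(\n"
        f"{selectors}\n"
        ");\n"
        "out body;\n"
        ">;\n"
        "out skel qt;"
    )


# Print the query text so it can be pasted into overpass-turbo for testing.
query = build_overpass_query("leisure", "park", (53.79, -1.56, 53.81, -1.53))
print(query)
```

Printing the string before sending it makes it easy to check the query by hand in overpass-turbo.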

@mpadge
Member

mpadge commented Nov 2, 2016

Second Q: Already implemented at line#118 here using this class def.

First Q: Not at the moment, but it was possible in osmdatar, and I'll make sure to reinstate it.

@Robinlovelace
Member Author

Cheers for the feedback, and let me know how I can help, although I won't have real time until December. I'll leave this open as a reminder. On Q2, how would you tell it to output that? I'll aim to put an @example into the overpass_query docs when I find out.

@mpadge
Member

mpadge commented Nov 3, 2016

Re Q2 output: Former approach can be seen here - I used a raw_data parameter to optionally return the query output directly. This is of course massively against the whole discussion on type stability in sfr issue#39, and so maybe ought be rethunk ...

@Robinlovelace
Member Author

Yes indeed. I think we should aim for type stability also.

@mpadge
Member

mpadge commented Dec 6, 2016

Robin, what do you think of naming primary functions like this:

> osm_data_xml ()
> osm_data_csv ()
> osm_data_sp ()
> osm_data_sf ()

?
It would make downloading 'raw' (xml/csv) data pretty transparent, and solve #13 to boot. The results of osm_data_xml could optionally be fed to osm_data_sp/sf to avoid repeated downloading, and to enable extraction of different objects from the same raw data. Maybe most importantly: perfect type stability!

Or, given the package name, should these rather be called

> osmdata_xml ()
> osmdata_csv ()
> osmdata_sp ()
> osmdata_sf ()

? (And where osmdata_xml() would return your desired .osm object.)

@maelle
Member

maelle commented Dec 18, 2016

(I'm slowly reading all the issues :-)) Just a question regarding "to avoid repeated downloading", do you want to use the memoise package? Do you think it'll happen often that the same query is sent twice to the API?

@mpadge
Member

mpadge commented Dec 19, 2016

Thanks @masalmon - this issue has been resolved with #1 - there can be no repeated downloading.

@mpadge
Member

mpadge commented Dec 19, 2016

Type consistency can be assured (and #13 resolved) by the four (and in future maybe more) primary functions:

osmdata_xml(x)
osmdata_csv(x)
osmdata_sp(x)
osmdata_sf(x)

The first two of these would enable raw data to be downloaded (enabling it to be saved as .csv, .xml, .osm, or whatever), and x in these cases could only be an overpass_query object. x in the final two could be either an overpass_query object or the result of osm_to_xml. This is important to enable a single query to be converted both to sp and sf objects, but submitting just one raw XML doc to these functions prevents them being able to return a 'proper' osmdata object because there's no way to add timestamp, overpass_call or bbox (only the first of these could conceivably be extracted from the XML).

This suggests that the functions actually need to be:

osmdata_xml(q)
osmdata_csv(q)
osmdata_sp(q, doc)
osmdata_sf(q, doc)

where q is an overpass_query object, and doc in the latter two is the optional result of osmdata_xml/csv(q) calls. This makes the four functions not quite consistent in structure, but still pretty clear. Absent insights from @Robinlovelace and @masalmon I'll go ahead and code these up ...
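
The shape of this proposal can be sketched as follows, in Python purely for illustration (the package itself is R). Every name here is a hypothetical stand-in: `OverpassQuery`, `OsmData`, and the injected `fetch` callable are assumptions, not the package's API. The point is only that each function returns exactly one type, and that passing `doc` skips the download while `q` still supplies the bbox and overpass call.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class OverpassQuery:
    """Hypothetical stand-in for an overpass_query object."""
    bbox: Tuple[float, float, float, float]
    call: str


@dataclass(frozen=True)
class OsmData:
    """Hypothetical stand-in for a 'proper' osmdata object."""
    bbox: Tuple[float, float, float, float]
    overpass_call: str
    timestamp: datetime
    doc: str


def osmdata_xml(q: OverpassQuery, fetch: Callable[[str], str]) -> str:
    """Always returns the raw document as a string."""
    return fetch(q.call)


def osmdata_sp(q: OverpassQuery, doc: Optional[str] = None,
               fetch: Optional[Callable[[str], str]] = None) -> OsmData:
    """Always returns an OsmData; downloads only when doc is not supplied."""
    if doc is None:
        if fetch is None:
            raise ValueError("either doc or fetch must be supplied")
        doc = osmdata_xml(q, fetch)
    # Simplification: the timestamp is taken at conversion time here.
    return OsmData(bbox=q.bbox, overpass_call=q.call,
                   timestamp=datetime.now(timezone.utc), doc=doc)


# Usage with a fake downloader, so the sketch runs without network access.
calls = []

def fake_fetch(call: str) -> str:
    calls.append(call)
    return "<osm></osm>"

q = OverpassQuery(bbox=(53.79, -1.56, 53.81, -1.53),
                  call='node["leisure"="park"](53.79,-1.56,53.81,-1.53);out;')
doc = osmdata_xml(q, fake_fetch)   # one download
result = osmdata_sp(q, doc)        # reuses doc, no second download
print(type(doc).__name__, type(result).__name__, len(calls))
```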

@mpadge
Member

mpadge commented Dec 19, 2016

Former read_osm now implemented as osmdata_xml and osmdata_sp, with the other two yet to come.

@mpadge
Member

mpadge commented Dec 19, 2016

A question for @Robinlovelace @masalmon:

  • Do we really need osmdata_csv()?

Requesting csv output from overpass requires specifying precisely which fields one desires, noting the abiding importance of the first statement there: "Besides normal OSM field names ...". Requesting csv therefore requires users to know about OSM structure, in stark contrast to all other functions, which require little or no knowledge. I suspect @hrbrmstr only put this there because that exact example happens to be in the overpass wiki, but I doubt such functionality would ever really be used.

My suggestion would be to implement sf and then use that as a way to get a straight data.frame which can then be exported as csv if desired. Thoughts?

(And once we've answered that, I'll close this issue and deal with osmdata_sf() in #17)

@maelle
Member

maelle commented Dec 19, 2016

Sounds good. I've only ever exported summary measures as csv (e.g. total distance) so I don't know how I would use this. If I were to export the data corresponding to a spatial object, I think I'd first wrangle the data.frame anyway.

@Robinlovelace
Member Author

  • Regarding osmdata_csv() I agree: let's wait until osmdata_sf() is done and then just use that.
  • Regarding the general approach, I think it's good. I'm using osmdata_sp() now and like the feel and extensibility of it - can't wait to try out osmdata_sf().
  • I would add osmdata_pbf() as a binary format option, so people can save a compressed version of the raw osm data. I think that format is very commonly used by OSM core users, and it would mean our work is useful for people wanting to go straight into a PostGIS db.

@mpadge
Member

mpadge commented Dec 21, 2016

osmdata_pbf() is a great idea, but I don't know how we'd easily achieve that in a platform-independent way. The only simple way I know of is osmconvert, but that's a no-go for OS X. Do you know of any other ways to generate or convert pbfs?

@Robinlovelace
Member Author

Robinlovelace commented Dec 21, 2016

osmosis looks pretty good: https://github.com/openstreetmap/osmosis. I don't think it's an issue: it could be the type of thing where the function gives an error message like "Error: osmosis must be available to your system for this function to work", and then it's the user's responsibility to worry about OS specifics etc.

One for later I guess though...
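
A minimal sketch of that "fail with a clear message" pattern, in Python for illustration only (the package is R). The helper name `require_tool` is hypothetical; `shutil.which` is the standard-library lookup for an executable on the PATH.

```python
import shutil


def require_tool(name: str) -> str:
    """Return the path to an external executable, or raise a clear error.

    Hypothetical helper: it only checks that the tool is on the PATH and
    leaves installation and OS specifics to the user.
    """
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"{name} must be available to your system for this function to work"
        )
    return path


# The check runs before any conversion is attempted.
try:
    print("found osmosis at", require_tool("osmosis"))
except RuntimeError as err:
    print("Error:", err)
```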

@mpadge mpadge added this to the 0.0.0 milestone Jan 5, 2017
@mpadge
Member

mpadge commented Jan 12, 2017

get-osmdata.R now has osmdata_xml, osmdata_sp and osmdata_sf, which closes this issue. (osmdata_sf isn't finished, but will be done with #17).

@mpadge mpadge closed this as completed Jan 12, 2017
@Robinlovelace
Member Author

This is awesome, great work @mpadge.
