Basic Export #1

Closed
manuelroth opened this issue Feb 22, 2016 · 19 comments

@manuelroth

  • Basic Export with bounding box, Name, Point(Centroid)
  • Without ranking or hierarchy
@lukasmartinelli
Contributor

We will create a Docker container that attaches to our database and, to
begin with, just queries the countries and places and dumps them in CSV format.

For the following features

  • Countries (Admin level 0)
  • Places

@manuelroth manuelroth added this to the Sync Meeting milestone Feb 22, 2016
@sfkeller
Collaborator

Regarding the export format spec, I propose using GeoCSV:
http://giswiki.hsr.ch/GeoCSV#CSV_file_format_specification
The specific fields needed and their properties (mandatory, data type) are still to be defined.

@manuelroth
Author

We decided to prepare an export container that takes the table osm_places as an example and exports as much information from the following list as possible:

*osm_id - MUST BE UNIQUE "DOCUMENT ID" across the complete database

display_name - exactly as in Nominatim (may be improved later)

*name (=utf-8)
name_en
name_de
name_es
name_fr
name_ru
name_zh

*class
*type

*north (=boundingbox)
*south
*east
*west

*lat
*lon

scalerank - we have it
place_rank - nominatim has it

importance - exactly as in nominatim calculated

country (=country code, ISO-3166 2-letter country code)

street=<housenumber> <streetname>
city=<city>
county=<county>
state=<state>
country=<country>
postalcode=<postalcode>

(= a la nominatim http://wiki.openstreetmap.org/wiki/Nominatim)

? timestamp - osm modification?
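
A minimal sketch of how the starred fields in the list above could be written to a delimited file. It assumes the asterisk marks the fields that must be present (the thread does not say so explicitly), and the sample row is hypothetical, not real export output. The delimiter is a parameter, so the same function covers comma-separated and tab-separated output.

```python
import csv
import io

# Fields marked with * in the list above. Treating them as the mandatory
# columns is an assumption; the thread does not define the asterisk.
COLUMNS = ["osm_id", "name", "class", "type",
           "north", "south", "east", "west", "lat", "lon"]


def write_rows(rows, out, delimiter=","):
    """Write an iterable of dicts as delimited text with a header row."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, delimiter=delimiter,
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Hypothetical sample row; the values are illustrative only.
sample = [{
    "osm_id": 1, "name": "Example Place", "class": "place", "type": "city",
    "north": 47.30, "south": 47.20, "east": 8.90, "west": 8.80,
    "lat": 47.25, "lon": 8.85,
}]

buffer = io.StringIO()
write_rows(sample, buffer, delimiter="\t")
exported_text = buffer.getvalue()
```

Keys that are not in `COLUMNS` are silently dropped (`extrasaction="ignore"`), so optional fields can be added to the rows before the column list is finalised.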

@sfkeller
Collaborator

sfkeller commented Mar 3, 2016

Following additions:

[Update - comments moved to #3]

@klokan
Member

klokan commented Mar 3, 2016

@lukasmartinelli @manuelroth:

Agree to generate only "osm_places" for now. Basic export should be .tsv with columns which you have already available.
@klokantech team will index it with: https://github.com/klokantech/osmnames-sphinxsearch (docker is now ready).
A basic sample data: https://github.com/klokantech/osmnames-sphinxsearch/blob/master/sample.tsv
Please include as many columns from early spec mentioned above as possible.

This is fine for the Basic export - and once you have docker for this we can close this ticket.

@sfkeller:
Please let the initial basic export be as discussed above, without additions at this point in time, so as not to block development of osmnames-sphinxsearch.

To discuss details of the data format I have created a ticket #3. I am moving your comments from this ticket to #3. Hopefully you agree.

@lukasmartinelli
Contributor

How can we join, for example, a place label of Rapperswil with the boundaries of Rapperswil?

Just looked at the data and in OSM the administrative boundaries are not polygons but linestrings which are not linked together - so can we extract administrative area polygons out of OSM data?

There is also no osm_id of the administrative area that matches the osm_id of the place label or other linking data.

Perhaps @sfkeller knows more about how to link administrative boundaries to place labels?

For example Rapperswil is just a point.

[Screenshot from 2016-03-04 14-56-08]

And this is a border of my hometown.

[Screenshot from 2016-03-04 15-00-20]

@lukasmartinelli
Contributor

@sfkeller
Collaborator

sfkeller commented Mar 4, 2016

How can we join, for example, a place label of Rapperswil with the boundaries of Rapperswil?
...
Perhaps @sfkeller knows more about how to link administrative boundaries to place labels?

In OSM this is done in the data with relations: Relations aggregate admin. boundaries and the same relation usually also contains a node with the center of the admin. unit (i.e. "the place label"). See e.g. City (= county of) Zürich: http://www.openstreetmap.org/relation/1682248 . Unfortunately that's not yet in the OSM data for Uznach http://www.openstreetmap.org/relation/1683953 nor Rapperswil-Jona.

AFAIK that reference from a place or building (having point coordinates) to the next outer admin. boundary is calculated in Nominatim and e.g. also in the "?" function in osm.org which uses Overpass API.

In either case it's probably simply an ST_Within function ordered by area size. Now I don't know if you can calculate this?

@lukasmartinelli
Contributor

In OSM this is done in the data with relations: Relations aggregate admin. boundaries and the same relation usually also contains a node with the center of the admin. unit (i.e. "the place label"). See e.g. City (= county of) Zürich: http://www.openstreetmap.org/relation/1682248 . Unfortunately that's not yet in the OSM data for Uznach http://www.openstreetmap.org/relation/1683953 nor Rapperswil-Jona.

Thanks a lot. Forgot about the relations concept.
We just looked at this in more detail. It is definitely quite some work to relate the labels with the polygons.

We need to make polygons of all those relations for the boundaries in imposm3.
This requires a new imposm3 mapping - which means kind of forking osm2vectortiles and changing the mapping. We also need a new mapping anyway because we don't have any data for some fields specified in the export specification (buildings have no addresses mapped, for example).

In either case it's probably simply an ST_Within function ordered by area size. Now I don't know if you can calculate this?

Once we have the polygons we can start trying to relate the labels with the polygons.
And yes, the only method I currently see is to guess which administrative boundary a label is in, using a function like ST_Within and an admin level that corresponds to e.g. place=city.
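
The "within, ordered by area" idea can be sketched without a database. The following is a plain-Python illustration, not the PostGIS query itself: it uses a ray-casting containment test and the shoelace formula on planar coordinates, ignores admin levels, and the boundary names and shapes are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def ring_area(ring):
    """Absolute planar area of a simple polygon via the shoelace formula."""
    n = len(ring)
    total = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def smallest_containing(point, boundaries):
    """Return the name of the smallest boundary containing the point, or None."""
    x, y = point
    hits = [(ring_area(ring), name) for name, ring in boundaries.items()
            if point_in_polygon(x, y, ring)]
    return min(hits)[1] if hits else None


# Hypothetical boundaries: a large square containing a small square.
boundaries = {
    "canton": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "town": [(2, 2), (4, 2), (4, 4), (2, 4)],
}
label_in_town = smallest_containing((3, 3), boundaries)
label_in_canton = smallest_containing((8, 8), boundaries)
label_outside = smallest_containing((20, 20), boundaries)
```

Picking the smallest containing area is only a guess at the right boundary; filtering the candidates by admin level first, as suggested above, would narrow it further.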

@manuelroth
Author

We decided to make the basic export for the roads layer, because this layer does not need any joining with another table and we don't have to rewrite the mapping for now.

The following fields can be extracted:
display_name
*class
*type

*north (=boundingbox)
*south
*east
*west

*lat
*lon

@manuelroth
Author

What exactly is the reason we need the lat/lon value of the geometry? Do we need the lat/lon value to highlight the geometry, as in the picture below?
[Screenshot 2016-03-07 at 13 14 36]

I understand the reason for the bounding box: zooming in until the bounding box fits the viewport.

Would appreciate help @klokan @sfkeller

@lukasmartinelli
Contributor

What exactly is the reason we need the lat/lon value of the geometry? Do we need the lat/lon value to highlight the geometry, as in the picture below?

This matters to decide whether to use st_centroid or st_pointonsurface to convert linestrings and polygons to points.
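
To illustrate why the choice matters: the centroid of a concave shape can fall outside the shape, whereas a point-on-surface function is meant to return a point that lies on the geometry. The sketch below is plain Python rather than PostGIS; it computes the area-weighted centroid of a hypothetical U-shaped polygon and checks it with a ray-casting test.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    n = len(ring)
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    # Standard formula: C = sum((p_i + p_{i+1}) * cross_i) / (6 * area)
    return cx / (3.0 * area2), cy / (3.0 * area2)


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical U-shaped polygon: a 3x3 square with a notch cut from the top.
u_shape = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]

centroid = polygon_centroid(u_shape)
centroid_is_inside = point_in_polygon(centroid[0], centroid[1], u_shape)
```

Here the centroid lands inside the notch, so `centroid_is_inside` is `False`. A label or search hit placed there would sit outside the feature it names.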

@sfkeller
Collaborator

sfkeller commented Mar 7, 2016

We really have to be aware that this is about geographic names (not building addresses).
P.S. IMHO streets and housenumbers (and their coordinates) have no place here.

Geographic names can be represented either as point (landmark plus some "surrounding", "Bundesplatz"), linear ("Albiskette") or areal geometry (city of Zurich).

The bounding box can be mandatory field(s), but it is derived or approximated information from the "real" geometries.

The lat/lon field in this context can only mean that it is the (center) location of a geographic name.
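
As a small illustration of the bounding box being derived from the geometry, here is a sketch that computes north/south/east/west from a list of (lat, lon) vertices. The coordinates are hypothetical, and the sketch assumes the geometry does not cross the antimeridian.

```python
def bounding_box(coords):
    """Return (north, south, east, west) for a sequence of (lat, lon) pairs.

    Assumes the geometry does not cross the antimeridian (180 degrees).
    """
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return max(lats), min(lats), max(lons), min(lons)


# Hypothetical linear feature given as (lat, lon) vertices.
line = [(47.20, 8.80), (47.25, 8.90), (47.30, 8.85)]
north, south, east, west = bounding_box(line)
```

The same function works for points, lines, and polygon rings, which is why the box can be required for every record regardless of geometry type.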

@klokan
Member

klokan commented Mar 7, 2016

Ideally the point should represent where the label typically is located.

The real reason for a point in a Search API is the cheap distance calculation. Typically, fulltext engines support scoring based on distance only for a point index (as calculations and indexes on a bbox are significantly more expensive, and distance is only one part of the final rank for a given query).
For this reason, points are also typically used for filtering visible features.
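
A rough sketch of why a single point keeps distance scoring cheap: one great-circle distance per candidate is enough to order results. The haversine formula below is a common choice, not necessarily what any particular search engine uses, and the coordinates are approximate and for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rank_by_distance(query_point, candidates):
    """Sort (name, lat, lon) candidates by distance from the query point."""
    qlat, qlon = query_point
    return sorted(candidates,
                  key=lambda c: haversine_km(qlat, qlon, c[1], c[2]))


# Approximate, illustrative coordinates.
candidates = [("Bern", 46.95, 7.45), ("Zürich", 47.38, 8.54)]
query = (47.23, 8.82)  # roughly Rapperswil
ranked = rank_by_distance(query, candidates)
nearest_name = ranked[0][0]
```

In a real ranking the distance would be combined with other signals, since it is only one part of the final score.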

@klokan
Member

klokan commented Mar 7, 2016

We are looking forward to getting the early first export, at least for roads, to test the server side on a larger, close-to-real database.

In case any other table can be exported directly or with easy-to-do joins, please add it to the initial export too. We hope to get the export from you soon.

Important for the meeting on Monday is having an analysis of the data tables described in osm2vectortiles/osm2vectortiles#156

@klokan
Member

klokan commented Mar 8, 2016

Thanks for the basic export with download link at osm2vectortiles/osm2vectortiles#157 @manuelroth @lukasmartinelli

We are working on a first deploy of osmnames search indexing the roads to test the performance of the search engine powered by sphinx search: klokantech/osmnames-sphinxsearch#3

@klokan klokan closed this as completed Mar 8, 2016
@klokan
Member

klokan commented Mar 9, 2016

Basic data deployed in a very early version of the search at: http://osmnames.klokantech.com/.

Thanks @manuelroth @lukasmartinelli

@klokan
Member

klokan commented Mar 14, 2016

Tables with/without geometry documented: osm2vectortiles/osm2vectortiles#156

@lukasmartinelli
Contributor

Mapzen also offers a similar service: http://whosonfirst.mapzen.com/

philippks pushed a commit that referenced this issue Mar 17, 2017
kharesimran added a commit that referenced this issue May 1, 2017