osmWebWizard.py: allow filtering road types in OSM API query to reduce download size #7585

Closed
namdre opened this issue Sep 23, 2020 · 21 comments

Comments

@namdre
Contributor

namdre commented Sep 23, 2020

would require a new input field (road-type-filter) and corresponding new osmGet.py option

@namdre
Contributor Author

namdre commented Jan 16, 2022

Suggested approach:

@namdre
Contributor Author

namdre commented Jan 16, 2022

Another thing that might reduce the OSM query by a lot: if the 'Add polygons' checkbox is disabled, modify the osm query so that it doesn't retrieve building shapes

@sab-inf
Contributor

sab-inf commented Jan 20, 2022

I will work through the approach 👍
Thanks @namdre

@sab-inf
Contributor

sab-inf commented Jan 23, 2022

Current Progress:

  • add a new road-types tab (in addition to the tabs 'settings', 'demand' and 'copyright')
  • get to know the overpass api
  • pass the list of checked types to osmGet.py
  • modify the osm query in readCompressed to restrict the list of retrieved osm-way-entities
  • add polygon option to osmGet.py

@sab-inf
Contributor

sab-inf commented Jan 23, 2022

The road-type-tab looks like this:

road-types-tab

@sab-inf
Contributor

sab-inf commented Jan 31, 2022

    if options.area:
        if options.area < 3600000000:
            options.area += 3600000000
        readCompressed(conn, url.path, '<area-query ref="%s"/>' %
                       options.area, options.prefix + "_city.osm.xml")
    if options.bbox or options.polygon:
        if options.tiles == 1:
            readCompressed(conn, url.path, '<bbox-query n="%s" s="%s" w="%s" e="%s"/>' %
                           (north, south, west, east), options.prefix + "_bbox.osm.xml")
        else:
            num = options.tiles
            b = west
            for i in range(num):
                e = b + (east - west) / float(num)
                readCompressed(conn, url.path, '<bbox-query n="%s" s="%s" w="%s" e="%s"/>' % (
                    north, south, b, e), "%s%s_%s.osm.xml" % (options.prefix, i, num))
                b = e

I will replace the query parameter in every readCompressed call. Is this a good approach?

@namdre
Contributor Author

namdre commented Jan 31, 2022

In principle, yes. I think all code paths should support the new filtering option.
The code is probably more readable if you add a new roadTypes parameter to readCompressed and construct the whole query string within that function. This makes it easier to see what's happening wherever readCompressed is called.

@sab-inf
Contributor

sab-inf commented Feb 1, 2022

I was trying queries of the type:

<has-kv k="highway" modv="" regv="motorway|motorway_link|trunk|trunk_link|primary|primary_link|
secondary|secondary_link|tertiary|tertiary_link|unclassified|residential|living_street|unsurfaced|
service|raceway|bus_guideway|track|footway|pedestrian|path|bridleway|cycleway|step|steps|stairs"/>

It shortened the query but increased the complexity.
For small regions it worked fine, but for larger ones it returned a 504 Gateway Timeout error. So it was too complex for the Overpass API (I think because of the regex pattern matching).
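
A minimal sketch of how a single regex filter like the one above could be assembled from a list of ticked road types. The function name `highway_regex_filter` is hypothetical and not taken from osmGet.py; the output format simply mirrors the query shown above.

```python
def highway_regex_filter(road_types):
    """Build one <has-kv> element that matches any of the given highway values.

    Hypothetical helper for illustration only; it reproduces the
    regex-based filter form quoted in the comment above.
    """
    # Join all selected values into a single alternation pattern.
    pattern = "|".join(road_types)
    return '<has-kv k="highway" modv="" regv="%s"/>' % pattern


print(highway_regex_filter(["motorway", "motorway_link", "trunk"]))
```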

Now I am using queries of the type:

    <query type="wr">
        <has-kv k="highway" modv="" v="motorway"/>  <!-- placeholder: key value query-string -->
        <bbox-query n="52.68309745423826" s="52.35635375951427" w="12.979457521439901" e="13.818092012406387"/>  <!-- placeholder: area, bbox, polygon query-string -->
    </query>

    <query type="wr">
        <has-kv k="highway" modv="" v="motorway_link"/>  <!-- placeholder: key value query-string -->
        <bbox-query n="52.68309745423826" s="52.35635375951427" w="12.979457521439901" e="13.818092012406387"/>  <!-- placeholder: area, bbox, polygon query-string -->
    </query>

    <query type="wr">
        <has-kv k="highway" modv="" v="trunk"/>  <!-- placeholder: key value query-string -->
        <bbox-query n="52.68309745423826" s="52.35635375951427" w="12.979457521439901" e="13.818092012406387"/>  <!-- placeholder: area, bbox, polygon query-string -->
    </query>
...

This is much longer but more precise for the Overpass API.
In the code, the query strings are built up dynamically (so there is no redundancy here).
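
Building the blocks dynamically could look roughly like the sketch below. It follows the earlier suggestion of passing the road types as a parameter and constructing the query text in one place. The function name `build_road_type_queries` and its parameters are assumptions for illustration, not the code from the PR.

```python
def build_road_type_queries(road_types, region_query, element_type="wr"):
    """Return one <query> block per road type, all restricted to the same region.

    Hypothetical helper: `region_query` is the area, bbox or polygon
    element as a string, e.g. '<bbox-query n=... s=... w=... e=.../>'.
    """
    blocks = []
    for value in road_types:
        blocks.append(
            '<query type="%s">\n'
            '    <has-kv k="highway" modv="" v="%s"/>\n'
            '    %s\n'
            '</query>' % (element_type, value, region_query))
    return "\n".join(blocks)


# Example bounding box with shortened, made-up coordinates.
bbox = '<bbox-query n="52.68" s="52.35" w="12.97" e="13.82"/>'
print(build_road_type_queries(["motorway", "motorway_link", "trunk"], bbox))
```

Each exact-match block is cheap for the server to evaluate, which fits the observation above that the regex variant timed out on larger regions.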

@namdre
Contributor Author

namdre commented Feb 1, 2022

Sounds good!

@sab-inf
Contributor

sab-inf commented Feb 1, 2022

By filtering as described in my previous comment, we only get streets but no buildings, green areas, seas, etc., so the result looks like this (it's like turning off polygons in the master version):

road_types

As it is now, we reduce the osm_bbox.osm.xml file from 14.3 MB to 2.5 MB for the same area.
I can negate this filtering technique and exclude the road types that are not ticked in the GUI from the bbox area. This won't reduce the download size as much, but I hope we can generate polygons again this way.

Update: After negating the query we get a file size of 12.4 MB for the same area with all road types enabled.
With all road types disabled we get 10 MB.
With only primary, secondary and tertiary enabled we get 10.4 MB.

sab-inf-road-types

What do you think?

@namdre
Contributor Author

namdre commented Feb 1, 2022

Webwizard already has a checkbox that selects whether polygons are built. I think the best way forward would be to pass this information as another option to osmGet and use it to control whether non-road data is downloaded or not.
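
A minimal sketch of what such options might look like on the command line of a download script. The option names `--road-types` and `--with-polygons` and the function `parse_args` are hypothetical placeholders; the names actually used in osmGet.py may differ.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical download options (illustration only)."""
    parser = argparse.ArgumentParser(description="sketch of an OSM download script")
    # Hypothetical option: comma-separated highway values ticked in the GUI.
    parser.add_argument("--road-types", default="",
                        help="comma-separated highway values; empty means no road filter")
    # Hypothetical option: mirrors the state of the 'Add polygons' checkbox.
    parser.add_argument("--with-polygons", action="store_true",
                        help="also download non-road data needed to build polygons")
    options = parser.parse_args(argv)
    # Turn the comma-separated string into a clean list of values.
    options.road_types = [t for t in options.road_types.split(",") if t]
    return options


opts = parse_args(["--road-types", "motorway,primary"])
print(opts.road_types, opts.with_polygons)
```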

@sab-inf
Contributor

sab-inf commented Feb 1, 2022

Webwizard already has a checkbox that selects whether polygons are built. I think the best way forward would be to pass this information as another option to osmGet and use it to control whether non-road data is downloaded or not.

Yes, that is what I meant by "turning off polygons in the master version". Is there a shortcut for non-road data in the Overpass API, instead of filtering for every key-value pair containing "building", "sea", etc.?
If not, could you tell me where to find which types (buildings, green areas, ...) are rendered by SUMO, so I can try to filter them?
Thanks.

@namdre
Contributor Author

namdre commented Feb 2, 2022

I don't think Overpass has special functions for road vs non-road stuff.
The list of things imported by polyconvert is in the typemaps:

In my experience, one of the biggest things data-wise are the building shapes. Even filtering just those depending on the 'polygon' checkbox should achieve big savings.

@sab-inf
Contributor

sab-inf commented Feb 2, 2022

Okay thanks, I will try to add the polygon parameter to the process to reduce the download size.

@sab-inf
Contributor

sab-inf commented Feb 4, 2022

Final Status:

  • working queries for road-types with and without "add polygon"
  • minimizing download size of OSM data

OSM File Sizes (all roads):

  1. Original (current master) WebWizard: 14.4 MB (with and without polygons)
  2. New WebWizard: 7.3 MB (with polygons) and 2.5 MB (without polygons)

I think this is a pretty good result :)
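
For reference, the relative savings implied by these numbers can be checked with a few lines of arithmetic (the helper name `reduction_percent` is only for illustration). They work out to roughly 49% with polygons and roughly 83% without.

```python
def reduction_percent(before_mb, after_mb):
    """Relative size reduction in percent."""
    return (1.0 - after_mb / before_mb) * 100.0


# File sizes in MB as reported in the comment above.
original = 14.4
for label, size in (("with polygons", 7.3), ("without polygons", 2.5)):
    print("%s: %.1f%% smaller" % (label, reduction_percent(original, size)))
```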

@namdre
Contributor Author

namdre commented Feb 5, 2022

We try to limit the number of dependencies on non-default Python libraries, and I'd prefer to merge your commit without the bs4 dependency. For idiomatic XML parsing, sumolib already comes with a small parsing wrapper. Here is a simple replacement guide for the bs4 lines:

        data = osmPolyconvert.read()
        bs_data = BeautifulSoup(data, "xml")
        b_polygon = bs_data.find_all("polygonType")
        for polygon in b_polygon:
            id = (polygon.get('id'))
            keyValue = id.split('.')

replace with

    for polygon in sumolib.xml.parse(osmPolyconvert, 'polygonType'):
        keyValue = polygon.id.split('.')
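
To show the same idea in a form that runs without SUMO installed, here is a sketch that uses only the standard library's `xml.etree.ElementTree` in place of the sumolib wrapper. The sample XML and the function name `read_key_values` are made up for illustration; only the `polygonType` element and its `id` attribute come from the snippets above.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample in the shape suggested by the snippets above; the real
# typemap file contains more entries and attributes.
SAMPLE_TYPEMAP = """<polygonTypes>
    <polygonType id="building" name="building"/>
    <polygonType id="landuse.forest" name="forest"/>
    <polygonType id="natural.water" name="water"/>
</polygonTypes>"""


def read_key_values(source):
    """Return (key, value) pairs derived from each polygonType id.

    An id like 'landuse.forest' becomes ('landuse', 'forest');
    an id without a dot, like 'building', becomes ('building', None).
    """
    pairs = []
    for elem in ET.parse(source).getroot().iter("polygonType"):
        key, _, value = elem.get("id").partition(".")
        pairs.append((key, value or None))
    return pairs


print(read_key_values(io.StringIO(SAMPLE_TYPEMAP)))
```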

@sab-inf
Contributor

sab-inf commented Feb 5, 2022

If the PR gets merged, I can also update the tutorial and extend it with the new road-type feature 👍

@namdre
Contributor Author

namdre commented Feb 5, 2022

Can you add 'road.png' to the PR?

namdre pushed a commit that referenced this issue Feb 5, 2022
…e download size eclipse#7585 (#10098)

* [#7585] added new road-types tab eclipse#7585

* Pass Road-Type Parameter through script.js to osmGet.py, added new Argument to osmWebWizard.py eclipse#7585

* modify the osm query in readCompressed to restrict the list of retrieved osm-way-entities eclipse#7585

* added "add Polygon" parameter to the download process of the OSM data; getting data from XML files eclipse#7585

* finalize queries; minimizing download size of OSM data eclipse#7585

* replace bs4 with sumolibs xml parser eclipse#7585

* added road.png eclipse#7585
@namdre
Contributor Author

namdre commented Feb 5, 2022

Works great!

namdre added a commit that referenced this issue Feb 5, 2022
@namdre
Contributor Author

namdre commented Feb 5, 2022

The relations add a significant amount of data to the OSM download. Some of them are mandatory (e.g. turn restrictions).
However, some are optional and could be tied to options. I think the biggest optional chunk is related to public transport.
If you can figure out a good query restriction and tie it to the 'public transport' checkbox that could also be useful.

namdre added a commit that referenced this issue Feb 7, 2022
namdre added a commit that referenced this issue Feb 7, 2022
@sab-inf
Contributor

sab-inf commented Feb 8, 2022

I will try to implement your public transport suggestion in the near future, but for now I have to prepare for some exams :/
Currently I am adapting/extending the documentation for the OsmWebWizard on my fork ...
