
Parser optimization: Separate filtering and property handling and the two passes in OSMReader #49

Closed
NopMap opened this issue May 25, 2013 · 3 comments

Comments

@NopMap
Contributor

NopMap commented May 25, 2013

Currently, the filtering for routable ways and the handling of more detailed properties are mixed together in AcceptWays. The code is also executed twice, once during each pass. This will get worse as the parsing becomes more detailed.

Suggestion:

  • separate filtering and property analysis into two methods
  • run filtering only in pass 1
  • run property analysis only in pass 2
  • store the result of filtering from pass 1 and re-use it in pass 2

The result of the filtering should be stored per way, with one bit for each allowed mode of travel, and re-used in pass 2 to skip the redundant filtering. This would speed up parsing at the expense of memory. (optional switch?)
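A minimal sketch of the per-way bitmask idea, assuming an in-memory map from way ID to a mode mask (the class and mode names here are illustrative, not actual GraphHopper API):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: pass 1 records one bit per allowed travel mode for each
// accepted way; pass 2 reads the cached mask instead of re-running
// the filter. Names (WayAcceptanceCache, CAR/BIKE/FOOT) are invented
// for illustration.
public class WayAcceptanceCache {
    // bit positions for the modes of travel (illustrative)
    public static final int CAR  = 1;       // bit 0
    public static final int BIKE = 1 << 1;  // bit 1
    public static final int FOOT = 1 << 2;  // bit 2

    private final Map<Long, Byte> acceptedWays = new HashMap<>();

    // pass 1: store the filtering result for a way (rejected ways are
    // simply not stored, which keeps the map small)
    public void accept(long wayId, int modeMask) {
        if (modeMask != 0)
            acceptedWays.put(wayId, (byte) modeMask);
    }

    // pass 2: look up the cached result instead of re-filtering
    public int acceptedModes(long wayId) {
        Byte mask = acceptedWays.get(wayId);
        return mask == null ? 0 : mask & 0xFF;
    }

    public boolean isAccepted(long wayId, int mode) {
        return (acceptedModes(wayId) & mode) != 0;
    }
}
```

Storing the mask as a single byte per accepted way keeps the memory overhead small, though on a planet-sized import even one entry per way adds up, which is the trade-off discussed below.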

@karussell
Member

This would speed up parsing at the expense of memory.

I don't like this idea. For me, RAM is more important than speed: e.g. if you want to import planet.osm you already need 20 GB, and this is increasing by 40% per year due to OSM contributions! So before increasing the RAM usage, we should make a fast on-disk solution possible for GHLongIntBTree. But I didn't find a good solution; MapDB, for example, was too slow. I'll experiment with Lucene again, as we probably need it elsewhere anyway.

Of course, you can make this speed-up optional, e.g. in a new OSMReaderHelper subclass or whatever.

@NopMap
Contributor Author

NopMap commented May 25, 2013

It would probably be worthwhile to have a technology that runs in limited memory for any size of file. In my map compiler I had a similar problem caching the nodes. I wrote my own storage class that is organized in pages loaded from disk, with a cap on how many pages can be loaded at one time.

Until then, it can't hurt to make this optional, controlled by a property. If no cached data is available, the ways are parsed again just as happens now, so nothing is lost. If you have enough memory, it should run faster.
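The paged storage class described above could be sketched roughly like this, assuming fixed-size pages and LRU eviction of the least-recently-used page; the names are illustrative and the disk I/O is stubbed out with comments:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: node values live in fixed-size pages; at most maxLoadedPages
// are kept in memory, and the least-recently-used page is evicted when
// the limit is exceeded. A real implementation would write evicted
// pages to disk and read them back on demand; this stub drops them.
public class PagedNodeStore {
    private final int pageSize;
    private final LinkedHashMap<Long, long[]> pages;

    public PagedNodeStore(int pageSize, final int maxLoadedPages) {
        this.pageSize = pageSize;
        // an access-ordered LinkedHashMap gives simple LRU eviction
        this.pages = new LinkedHashMap<Long, long[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, long[]> eldest) {
                // real implementation: flush eldest.getValue() to disk here
                return size() > maxLoadedPages;
            }
        };
    }

    private long[] page(long nodeId) {
        long pageId = nodeId / pageSize;
        // real implementation: re-read the page from disk if it was evicted
        return pages.computeIfAbsent(pageId, id -> new long[pageSize]);
    }

    public void put(long nodeId, long value) {
        page(nodeId)[(int) (nodeId % pageSize)] = value;
    }

    public long get(long nodeId) {
        return page(nodeId)[(int) (nodeId % pageSize)];
    }
}
```

With the page limit as a configurable property, the same store could scale from an all-in-memory fast path down to a mostly-on-disk mode for planet-sized imports.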

@NopMap
Contributor Author

NopMap commented May 27, 2013

This can be closed.
