[FEATURE] Lower memory requirements for sparse profiles #99

Open
wipfli opened this issue Feb 27, 2022 · 4 comments

wipfli commented Feb 27, 2022

Is your feature request related to a problem? Please describe.
I want to render a profile on planet.osm.pbf that produces a small output mbtiles file, maybe only 100 MB. Currently, I need a machine with 128 GB of RAM for this.

Describe the solution you'd like
Being able to render profiles that produce small mbtiles files on machines with less RAM, e.g., 8 GB.

Describe alternatives you've considered
First, run osmium on planet.osm.pbf to extract only the relevant features, which results in a smaller .osm.pbf file. Then, run planetiler on that smaller file.

Additional context
The Toilets example can do this to some extent:

Osmium: https://docs.osmcode.org/osmium/latest/osmium-tags-filter.html

Relevant planetiler code:

// TODO allow limiting node storage to only ones that profile cares about

@farfromrefug

@wipfli Side question: could you share your Swiss map GL style?

wipfli commented Jun 6, 2022

Swisstopo made the light base map. Let me find the link...

wipfli commented Jun 6, 2022

https://www.swisstopo.admin.ch/en/geodata/maps/smw/smw_lightbase.html

wipfli commented Jun 6, 2022

They even made a schema similar to OpenMapTiles, which they call SwissMapTiles :) It has glacier lines, for example...

cldellow added a commit to cldellow/tilemaker that referenced this issue Dec 29, 2023
This PR generalizes the idea of `node_keys`, adds `way_keys`, and fixes systemed#402.

I'm not too sure if this is generally useful - it's useful for one of my
use cases, and I see someone asking about it in systemed#190
and, elsewhere, in onthegomap/planetiler#99

If you feel it complicates the maintainer story too much, please reject.

The goal is to reduce memory usage for users doing thematic extracts by
not indexing nodes that are only used by uninteresting ways.

For example, North America has ~1.8B nodes, needing 9.7GB of RAM for its node
store. By contrast, if your interest is only to build a railway map, you
require only ~8M nodes, needing 70MB of RAM.

Currently, a user can achieve this by pre-filtering their PBF using
osmium-tool. If you know exactly what you want, this is a good
long-term solution. But for one-offs and experimenting, it's a bit
cumbersome to iterate.

Sample use cases:

```lua
-- Building a map without building polygons - exclude them
way_keys = {"~building"}
```

```lua
-- Building a railway map
way_keys = {"railway"}
```

```lua
-- Building a map of major roads
way_keys = {"highway=motorway", "highway=trunk", "highway=primary", "highway=secondary"}
```

Nodes used in ways which are used in relations (as identified by
`relation_scan_function`) will always be indexed, regardless of
`node_keys` and `way_keys` settings that might exclude them.

Notes:

1. This is based on `lua-interop-3`, as it interacts with files that are
   changed by that. I can rebase against master after lua-interop-3 is
   merged.

2. The names `node_keys` and `way_keys` are perhaps out of date, as they
   can now express conditions on the values of tags in addition to their
   keys. Leaving them as-is is nice, as it's not a breaking change.
   But if breaking changes are OK, maybe these should be
   `node_filters` and `way_filters` ?
cldellow added a commit to cldellow/tilemaker that referenced this issue Dec 29, 2023
This PR generalizes the idea of `node_keys`, adds `way_keys`, and fixes systemed#402.

I'm not too sure if this is generally useful - it's useful for one of my
use cases, and I see someone asking about it in systemed#190
and, elsewhere, in onthegomap/planetiler#99

If you feel it complicates the maintainer story too much, please reject.

The goal is to reduce memory usage for users doing thematic extracts by
not indexing nodes that are only used by uninteresting ways.

For example, North America has ~1.8B nodes, needing 9.7GB of RAM for its node
store. By contrast, if your interest is only to build a railway map, you
require only ~8M nodes, needing 70MB of RAM. Or, to build a map of
national/provincial parks, 12M nodes and ~120MB of RAM.
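As a rough sanity check on the figures above, the per-node cost and the overall saving can be worked out directly. This is only back-of-the-envelope arithmetic on the quoted numbers, assuming decimal units (1 GB = 1e9 bytes); the names `cases`, `bytes_per_node` and `savings_factor` are made up for this sketch and are not part of tilemaker.

```lua
-- Figures quoted in the text above; decimal units are an assumption.
local cases = {
  { name = "all of North America", nodes = 1.8e9, bytes = 9.7e9 },
  { name = "railways only",        nodes = 8e6,   bytes = 70e6 },
  { name = "parks only",           nodes = 12e6,  bytes = 120e6 },
}

-- Average node-store cost per indexed node, in bytes.
local function bytes_per_node(case)
  return case.bytes / case.nodes
end

-- How many times smaller the filtered node store is than the full one.
local function savings_factor(full, filtered)
  return full.bytes / filtered.bytes
end

for _, case in ipairs(cases) do
  print(string.format("%-22s %.1f bytes/node", case.name, bytes_per_node(case)))
end
print(string.format("railway filter: node store is ~%.0fx smaller",
  savings_factor(cases[1], cases[2])))
```

The per-node cost is not the same across the three cases, so the quoted sizes are best treated as approximate measurements rather than a fixed cost per node.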

Currently, a user can achieve this by pre-filtering their PBF using
osmium-tool. If you know exactly what you want, this is a good
long-term solution. But if you're me, flailing about in the OSM data
model, it's convenient to be able to tweak something in the Lua script
and observe the results without having to re-filter the PBF and update
your tilemaker command to use the new PBF.

Sample use cases:

```lua
-- Building a map without building polygons, ~ excludes ways whose
-- only tags are matched by the filter.
way_keys = {"~building"}
```

```lua
-- Building a railway map
way_keys = {"railway"}
```

```lua
-- Building a map of major roads
way_keys = {"highway=motorway", "highway=trunk", "highway=primary", "highway=secondary"}
```

Nodes used in ways which are used in relations (as identified by
`relation_scan_function`) will always be indexed, regardless of
`node_keys` and `way_keys` settings that might exclude them.
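One possible reading of the three filter forms used in the samples above (`"key"`, `"key=value"` and `"~key"`) is sketched below in plain Lua. It is an illustration only: `tag_matches` and `way_is_interesting` are hypothetical helpers, not tilemaker functions, the real matching happens inside tilemaker and may differ (especially for `~`), and the rule about ways used in relations is not modelled here.

```lua
-- Returns true if one tag (key, value) matches a pattern written
-- either as "key" or as "key=value".
local function tag_matches(pattern, key, value)
  local pkey, pvalue = pattern:match("^([^=]+)=(.*)$")
  if pkey then
    return key == pkey and value == pvalue
  end
  return key == pattern
end

-- Returns true if a way with the given tags would be kept by the filters.
-- Assumed semantics:
--   "railway"          keep ways that have a railway tag
--   "highway=trunk"    keep ways tagged exactly highway=trunk
--   "~building"        keep ways that have at least one tag other than
--                      building, i.e. drop ways whose only tags match
local function way_is_interesting(filters, tags)
  for _, filter in ipairs(filters) do
    if filter:sub(1, 1) == "~" then
      local pattern = filter:sub(2)
      for key, value in pairs(tags) do
        if not tag_matches(pattern, key, value) then return true end
      end
    else
      for key, value in pairs(tags) do
        if tag_matches(filter, key, value) then return true end
      end
    end
  end
  return false
end

-- Usage: a building-only way is dropped, a building on a service road is kept.
print(way_is_interesting({"~building"}, { building = "yes" }))
print(way_is_interesting({"~building"}, { building = "yes", highway = "service" }))
```

Under this reading, only the nodes of ways for which `way_is_interesting` returns true would need to be placed in the node store, which is where the memory saving would come from.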

A concrete example, given a Lua script like:

```lua
function way_function()
  if Find("railway") ~= "" then
    Layer("lines", false)
  end
end
```

it takes 13GB of RAM and 100 seconds to process North America.

If you add:

```lua
way_keys = {"railway"}
```

It takes 2GB of RAM and 47 seconds.

Notes:

1. This is based on `lua-interop-3`, as it interacts with files that are
   changed by that. I can rebase against master after lua-interop-3 is
   merged.

2. The names `node_keys` and `way_keys` are perhaps out of date, as they
   can now express conditions on the values of tags in addition to their
   keys. Leaving them as-is is nice, as it's not a breaking change.
   But if breaking changes are OK, maybe these should be
   `node_filters` and `way_filters` ?

3. Maybe the value for `node_keys` in the OMT profile should be
   expressed in terms of a negation, e.g. `node_keys = {"~created_by"}`?
   This would avoid issues like systemed#337

4. This also adds a SIGUSR1 handler during OSM processing, which prints
   the ID of the object currently being processed. This is helpful for
   tracking down slow geometries.