
odjitter crashing with big OD data and --max-per-od 1 #5

Closed
lucasccdias opened this issue Jan 13, 2022 · 3 comments

@lucasccdias

Hi, @dabreegster.

I am trying to use odjitter with a subset of the São Paulo OD data, and it crashes when I set --max-per-od 1. It works fine with --max-per-od 100 and --max-per-od 10. My PC freezes during the run, so it is probably a RAM usage problem -- I have a 6th-gen Core i5 with 8 GB of RAM, running Ubuntu 20.04.3 LTS.

Here is a reproducible example (using R):

piggyback::pb_download(file = "zones_sp_center.geojson", 
                       repo = "spstreets/OD2017"
                       )

piggyback::pb_download(file = "od_sp_center.csv",
                       repo = "spstreets/OD2017"
                       )

system("odjitter --od-csv-path ./od_sp_center.csv --zones-path ./zones_sp_center.geojson --max-per-od 1 --output-path result.geojson")

# Scraped 114 zones from ./zones_sp_center.geojson
# Disaggregating OD data
# Killed
@dabreegster (Owner)

Thanks for the bug report! I can confirm the problem -- I managed to run it, getting a 2 GB output file, but it took 30 GB of RAM, which was right at my laptop's limit. :) The problem is that the tool buffers the entire GeoJSON representation in memory and writes it all at once. There's no reason to do this; I'll work on a fix.
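
For context, the fix amounts to streaming the output: emit the FeatureCollection header once, then serialize and write one feature at a time, so only a single feature is ever held in memory. Here is a minimal sketch of that idea in Rust -- not odjitter's actual code; the hand-rolled JSON and point features are purely for illustration:

use std::fs::File;
use std::io::{BufWriter, Write};

fn main() -> std::io::Result<()> {
    let mut out = BufWriter::new(File::create("result.geojson")?);

    // Emit the FeatureCollection header once, up front.
    write!(out, "{{\"type\":\"FeatureCollection\",\"features\":[")?;

    // Stand-in for looping over the disaggregated OD rows: each feature is
    // serialized and written immediately, so memory use stays flat no matter
    // how many rows there are.
    for i in 0..1_000_000 {
        if i > 0 {
            write!(out, ",")?;
        }
        write!(
            out,
            "{{\"type\":\"Feature\",\"properties\":{{\"id\":{}}},\"geometry\":{{\"type\":\"Point\",\"coordinates\":[0.0,0.0]}}}}",
            i
        )?;
    }

    // Close the collection and flush anything still in the write buffer.
    write!(out, "]}}")?;
    out.flush()
}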

@dabreegster (Owner)

If you rebuild it from the latest git, it should consume very little memory. It took me about a minute to convert that whole file.
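
For anyone following along, rebuilding from git looks roughly like this (a hedged example, assuming the repo lives at github.com/dabreegster/odjitter and a Rust toolchain is installed):

cargo install --git https://github.com/dabreegster/odjitter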

As someone on the GeoRust Discord pointed out, we should consider https://flatgeobuf.org/ for working with larger datasets like this!

@lucasccdias (Author)

I just did it and it worked flawlessly; it took about a minute here. Thanks!
