Error for building 'big' network with germany-latest.osm.pbf #141

Closed
SRN1973 opened this issue Jan 25, 2021 · 22 comments

Comments

@SRN1973

SRN1973 commented Jan 25, 2021

I realized that r5r does not write back a calculated network.dat, or rather fails during the build process, when the input files cover a whole country. (If I try the same for a small subset, e.g. the city of Hamburg, then r5r works as described and expected.)

Operating System:
Ubuntu 20.04.1 LTS
RAM: 1 TB
120 cores

Input files:

germany-latest.osm.pbf (https://download.geofabrik.de/europe/germany-latest.osm.pbf)
GTFS files (whole public transport for Germany):
https://download.gtfs.de/germany/fv_free/latest.zip
https://download.gtfs.de/germany/nv_free/latest.zip
https://download.gtfs.de/germany/rv_free/latest.zip

Command to build and save the Graph:

options(java.parameters = "-Xmx500G")
library(r5r)
r5r_core <- setup_r5(data_path = , verbose = TRUE)

Error Message:
...
14:13:13.159 [main] WARN com.conveyal.r5.streets.StreetLayer - Vertex 2590721757 not found to use as via node for restriction 4429938, skipping this restriction
14:13:13.159 [main] WARN com.conveyal.r5.streets.StreetLayer - Did not find from/to edges for restriction 4429939, skipping
14:13:13.159 [main] WARN com.conveyal.r5.streets.StreetLayer - Did not find from/to edges for restriction 4429940, skipping
14:13:13.159 [main] WARN com.conveyal.r5.streets.StreetLayer - Did not find from/to edges for restriction 4429941, skipping
14:13:13.159 [main] WARN com.conveyal.r5.streets.StreetLayer - Did not find from/to edges for restriction 4429942, skipping
14:13:13.199 [main] ERROR com.conveyal.r5.streets.StreetLayer - Invalid turn restriction 4437683, does not have from, to and via, skipping
14:13:13.201 [main] ERROR com.conveyal.r5.streets.StreetLayer - Turn restriction 4437852 has multiple 'from' members, skipping.
14:13:13.201 [main] ERROR com.conveyal.r5.streets.StreetLayer - Turn restriction 4437855 has multiple 'from' members, skipping.
Error in rJava::.jnew("org.ipea.r5r.R5RCore", data_path, verbose) :
java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 19950683

@mvpsaraiva
Collaborator

Hi @SRN1973. We've never tested r5r on such a large area. I'll try it out on our server and see if there's something we can do to optimize it for very large networks. I'll keep you posted.

@mattwigway
Contributor

I suspect this has something to do with the input data and not the size of the region - I'd expect errors due to large regions to be OutOfMemoryErrors. @mvpsaraiva is there any way to surface the full error message from R5, including line numbers? Without that this will be very difficult to debug.

@mattwigway
Contributor

And if @SRN1973 has 1TB of RAM (hey, wanna share?), I would not expect any problems with something like Germany. Might need to raise the Java memory limit though with options(java.parameters = "-Xmx8g") before loading r5r. Replace 8g (gigabytes) with as much as you can spare.

@SRN1973
Author

SRN1973 commented Jan 26, 2021 via email

@mattwigway
Contributor

The full error message was more a question for @mvpsaraiva. R5 will be printing a full traceback of where the error occurred, but somehow that's not surfacing through R5R. A workaround would be to try building the graph from the command line, without going through R5R. From a shell prompt in the directory where your data is, you'd run java -Xmx500g -jar /usr/local/lib/R/site-library/r5r/jar/r5-v6.0.1-2-gf44e585-all.jar point --build . You may need to replace /usr/local/lib/R/ with the path of your R installation. You can find the full path to the JAR file from the R prompt by running system.file("jar/r5-v6.0.1-2-gf44e585-all.jar", package='r5r'). I wouldn't expect this to succeed, but we will likely get a more useful error message.
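The command-line build described above could be wrapped in a small script along these lines. This is only a sketch: the JAR path is the one quoted in this thread and will differ between installations, and the variable names (R5_JAR, DATA_DIR, JAVA_HEAP), the function names, and the r5_build.log file name are made up here for illustration.

```shell
#!/bin/sh
# Placeholders (assumptions): override them via the environment for your setup.
R5_JAR="${R5_JAR:-/usr/local/lib/R/site-library/r5r/jar/r5-v6.0.1-2-gf44e585-all.jar}"
DATA_DIR="${DATA_DIR:-.}"        # directory holding the .osm.pbf and GTFS .zip files
JAVA_HEAP="${JAVA_HEAP:-500g}"   # value passed to -Xmx

# Print the build command so it can be checked before running it.
r5_build_command() {
    printf 'java -Xmx%s -jar %s point --build %s\n' "$JAVA_HEAP" "$R5_JAR" "$DATA_DIR"
}

# Run the build and keep a copy of everything R5 prints, including any traceback.
r5_build() {
    java "-Xmx${JAVA_HEAP}" -jar "$R5_JAR" point --build "$DATA_DIR" 2>&1 | tee r5_build.log
}

# Only start the build when both the JAR and a java binary are available.
if [ -f "$R5_JAR" ] && command -v java >/dev/null 2>&1; then
    r5_build
else
    echo "JAR or java not found; the command would be:"
    r5_build_command
fi
```

Saving the output to a log file makes it easier to attach the full traceback to a bug report.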

@SRN1973
Author

SRN1973 commented Jan 26, 2021 via email

@mattwigway
Contributor

This is definitely an upstream bug in R5, and it has to do with turn restrictions—I'll dig in a bit more later, but it might be a week before I can get to it. If you don't care about turn restrictions (which really only matter for driving, and to a lesser extent biking, but you can always walk your bike), you can use osmconvert with the --drop-relations option to remove them from your PBF. Then R5R should build the network, albeit without information on turn restrictions.
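A minimal sketch of that osmconvert step, assuming osmconvert is installed and on the PATH. The input and output file names, the variable names, and the function names are illustrative assumptions. Note that --drop-relations removes all relations from the file, not only turn restrictions.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust the file names to your data.
IN_PBF="${IN_PBF:-germany-latest.osm.pbf}"
OUT_PBF="${OUT_PBF:-germany-latest-norelations.osm.pbf}"

# Print the conversion command so it can be checked before running it.
drop_relations_command() {
    printf 'osmconvert %s --drop-relations --out-pbf -o=%s\n' "$IN_PBF" "$OUT_PBF"
}

# Write a copy of the input PBF without any relations.
drop_relations() {
    osmconvert "$IN_PBF" --drop-relations --out-pbf -o="$OUT_PBF"
}

# Only convert when the input file and osmconvert are both available.
if [ -f "$IN_PBF" ] && command -v osmconvert >/dev/null 2>&1; then
    drop_relations
else
    echo "Input file or osmconvert not found; the command would be:"
    drop_relations_command
fi
```

The network would then be built from a data directory that contains the stripped file in place of the original PBF.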

@SRN1973
Author

SRN1973 commented Jan 26, 2021 via email

@SRN1973
Author

SRN1973 commented Jan 27, 2021 via email

@rafapereirabr
Member

@SRN1973 and @mattwigway, thank you both for this really quick exchange! Since this is an upstream bug in R5, I will be closing this issue for now. Nonetheless, I would encourage @SRN1973 to open an issue about this on R5's GitHub page.

@mvpsaraiva
Collaborator

@mattwigway thanks for helping with this one. I'm checking how to make r5r's verbose mode more helpful. The way it is now, r5r outputs all ERROR, WARNING, and INFO messages when verbose = TRUE, and only ERROR messages when verbose = FALSE. Perhaps a debug mode could be useful in situations like this.

@mvpsaraiva
Collaborator

Just a heads up before closing this issue: r5r is much more 'verbose' now when verbose = TRUE. It passes through all log messages generated by R5, not just errors and info messages.
It should help debug problems like this in the future.

@rafapereirabr
Member

Thanks for looking into this, @mvpsaraiva. Please don't forget to update the NEWS files ;)

@iamadouhassane

iamadouhassane commented Aug 18, 2023

Hello @mvpsaraiva @mattwigway @rafapereirabr @stupidpupil @SRN1973

I hope you are all well.
I have the same issue in Python code with r5py, and I would like to know whether you finally resolved this problem.
I am currently working with the France data, which is 4.10 GB in size (see http://download.geofabrik.de/europe/france.html).
I am open to all suggestions (libraries, methodologies, or the like).
Also, for small data sets I can read the data and build the transport network without difficulty, but the resulting travel time dataframe contains NaN values.

Thanks a lot

@SRN1973
Author

SRN1973 commented Aug 18, 2023 via email

@iamadouhassane

For my analyses I decided to implement a workaround. I create several overlapping networks (three in my case) ...

-- Dr. Stefan Neumeier, Thünen-Institute of Rural Studies (Thünen-Institut für Lebensverhältnisse in ländlichen Räumen), Braunschweig

Thank you for your quick response, @SRN1973.
That is my alternative solution too, but I lose 30% of my data because I work with data for the whole of France.
Is it normal to get NaN values when building a transport network?

Thanks

@SRN1973
Author

SRN1973 commented Aug 18, 2023 via email

@rafapereirabr
Member

This discussion is related to Question 4 in our FAQ, where we discuss alternative solutions to the problem.

@SRN1973, what exactly do you mean by creating "several overlapping networks"?

@iamadouhassane

Thank you for your answers, @rafapereirabr and @SRN1973.

@SRN1973, by several overlapping networks, do you mean that you split your region or country of study into several sub-regions? For example, for France (http://download.geofabrik.de/europe/france.html), taking the data from the 27 sub-regions (Alsace, Aquitaine, Auvergne, ...). Is that what you mean?

In my project, I calculate the travel time between two given points using all available modes of transport, so by default I set transport_modes = [TransitMode.TRANSIT, LegMode.WALK], and the results feed a machine learning algorithm. Unlike your project, I do not focus on the services available in the area of the two places of residence but on the two points themselves, to get something more precise. Since I did not use public transport alone, that is, I combined TransitMode.TRANSIT and LegMode.WALK, I should get a correct result even where there is no public transport (this is what I understood from the documentation, but I may be wrong). By NaN value, I mean that the calculated travel time is null even though I made sure to pick points that are in the same sub-region.

To illustrate, the image r5py_input.png shows a sample of the data comprising 2457 points (start coordinates = (ORIGIN_LON, ORIGIN_LAT), arrival coordinates = (DESTINATION_LON, DESTINATION_LAT)). Once I apply the algorithm and calculate the travel times, I get the results shown in r5py_output.png, where the travel_time column gives the travel time between the two points. The image r5py_output_freq.png shows a simple count on the travel_time column: for 60% of my input data the travel time is null.
I hope I have explained the problem well.

@rafapereirabr
Thank you for your answer. I'll see whether it is applicable in Python too; if not, I'll look at how to implement it in R.

[Images: r5py_input.png, r5py_output.png, r5py_output_freq.png]

@SRN1973
Author

SRN1973 commented Aug 21, 2023 via email

@rafapereirabr
Member

Thanks for the clarification, @SRN1973!

@iamadouhassane

Thanks to all of you, @rafapereirabr @SRN1973
