-
Notifications
You must be signed in to change notification settings - Fork 665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GeoTIFF segfaulting on Mac ARM python bindings #4623
Comments
@nilsnolde would the bindings produced in valhalla be able to receive the GeoTIFF format? is it just a string in the response for isochrone instead of the contours? EDIT: Upon looking at it, yeh it should be fine as it's literally just a json string |
I confirmed that the error can be avoided by commenting out this. Wish I had a better way to debug this than rebuilding it all to a static library for python though.
EDIT 1: I see this should activate serializeGeoTIFF so I'll try and work out what's going on in there |
I believe the segfault occurs in this line: I added this in for my debugging. It hasn't found a gdal driver it seems. throw valhalla_exception_t{599, "found the needle in the haystack"}; I'm trying to manually make sure it links in CMakeList I outputted some things. I'm unsure if that link libraries ones is meant to find something or not.
|
given what you've shown it seems like we continue with the build and supporting gdal even if we dont actually have full support on your system. seems like we just need to add a bit more protection/checking before continuing to enable it. i am confused about how it even links and runs! also, excellent sleuthing! |
I added this in: (the exception number is random, just copied from another valhalla_exception)
When it can't find the driver, it returns a null pointer as per: https://gdal.org/doxygen/classGDALDriverManager.html#ac26308c182440c8abd584040fa89bd4f |
Makes sense. Then if someone is able to get it working, it can just start working in the future I guess |
This is what I used in CMakeList. I don't know whether the GDAL::GDAL INTERFACE_LINK_LIBRARIES is actually meant to return something other than "gdal_link_libraries-NOTFOUND" as I really don't understand this process at all.
|
maybe gdal loads the drivers (separate libs) at runtime (which i was initially worried about in our implementation because its probably not too performant to do it on every request). your suggestion for checking at runtime is good but as you started doing further down, id rather check at configuration time and disable it then. if the gdal bin package is installed we can simply ask |
I've studied the implementation further and tested some more things. So, in the initial build stages, we're basically just finding the location of GDAL? and then it's not until we try and use it that it links it? Could we attempt to link it in the CMakeList.txt stage (as we seem to do with other binaries?). I think this is what you were suggesting more or less perhaps, but this way we could try and address problems linking it to various OS at that build stage (an otherwise disable it as you suggested) EDIT: Or does it actually get linked elsewhere in the code? The docs mention a global singleton I think somewhere |
I've found this info on it. So it should be linked in thor/worker? and it's just a pointer to that in serializeGeoTIFF? |
yeah that was my suggestion to @chrstnbwnkl for keeping the driver loaded and around but again it would still have to happen at run time. imho during cmake we should find gdal and make sure that it can load the geotiff driver. it might be as easy as checking some cmake variables. let me take a look here...nope struck out on that. looking here: https://formulae.brew.sh/api/formula/gdal.json and also checking on linux i see gdal depends on libgeotiff. so now im wondering how did you get into this situation! apart from us detecting that you have gdal but its not been built with geotiff support, how did you get it without geotiff support!? what does |
It looks like it isn't finding it, or it is failing to link it (in Thor I guess?), as my GDAL does appear to have 'GTiff' |
I just tried on linux with the bindings and I get the same segfault.. I know where the issue is now: we don't register the GTiff Driver properly, we only do it in that means currently anything but not sure where to exactly register the gtiff driver.. directly in |
Are we able to have it as some kind of global parameter? But for the short term, is there a line or two of code we could insert into isochrone_serializer? @nilsnolde |
you can try adding
in fwiw, I don't think it's a problem to call |
I'll give it a try and report back here. Ok, that doesn't sound too bad then. For the python bindings, will it also just load once for when actor is first initialised? |
|
anyway, for now I put it at the top of serializeGeoTIFF() just to get it working I get this error with the python bindings.
|
I get this error for "format": "geotiff" or "pbf" for the python bindings EDIT 1: Oh, pbf is not implemented. I suspect the issue is that geotiff is not working properly. it returns "Unknown serialization error: II*" when I return the "data" variable from serializeGeoTIFF() EDIT 2: oh again, it is? but the case for pbf itself is empty. it seems to be under a conditional for the json one.. |
So to clarify, it returns "II*" as a std::string in valhalla, instead of the json EDIT: a little research suggested that it was just some escape characters causing the string to cutoff when I was outputting it in valhalla, as II is the start of a geoTIFF apparently |
@nilsnolde ok, I got it working by converting the output into base64 and making a json |
Yeah bindings were neglected. We should def fix that before releasing. We'll probably have to do some stuff in the actor first.. |
i slapped together (hopefully) a fix for this: #4628 regarding the second issue imho there are 2 problems there:
imho the bindings should just return bytes and we can let the caller decide if what they expected was json pbf or geotiff. however we have a problem with error handlign where if you give it input that it cant act on it will either return json or pbf. if you asked for geotiff and something went wrogn you'll get back json. im not sure we want to spend the time adding error messages to a geotiffs metadata hahahha so have to think of a way around this. only othe roption would be to allow the actor interface to also return a mime type |
yeah I didn't think about the PBF format at all when I added those.. I'm a bit used to python's convenience stuff, it's just feels more right in python to not return bytes but the expected type, bytes or string or dict. the caller will have to decide either way as well and has the same problem with errors being JSON instead of bytes/string, as expected. since it's already prescribed what has to happen, i.e. each request will have to be handled in exactly one single way, it's not up to the caller to decide what to convert the bytes into to make sense of the output, I'd rather have that decision handled in the bindings so it comes for free to any caller. I didn't look into yet, but I'm sure it's pretty manageable. is that ok for you @kevinkreiser ? I'd add that soon-ish if so. |
yeah im just thinking its more work to manage at runtime. in either case i think you need to change the pybind11 code to return py::bytes. then yeah you can do the work in the actor.py wrapper to first look at the format thats expected and then if it was geotiff first check if the bytes look like a json error isntead of a tiff header. as you said manageable but its just another thing to have to remember to do rather than it being automatic is all. i forgot to say feel free to do it though! |
ah we don't expose the PBF input stuff in the python bindings. though technically you can still use via the hidden and FWIW, that json.loads thing is basically an overload: if you provide a string as input to the bindings functions, then you get the raw response from valhalla, so the issue can even be prevented today if you just call with e.g. |
yeah you're probably right |
very much sounds like it: https://pybind11.readthedocs.io/en/stable/advanced/cast/strings.html#returning-c-strings-to-python
|
yeah sorry i should have said it with more conviction, i knew this stuff as a fact but wanted my suggestion to still come off as a suggestion rather than a demand 😄 communication is hard in text |
haha wasn't sure. but your "I think" is more often than not actually "I know" I think 😄 just returning from the c++ interface a py::bytes would be a fine solution to me, and then do the bit of python wrapper to return it to the caller "properly" and wrap it into the expected (or unexpected) type. I'll see how to best communicate in case of errors. |
This issue was closed, but have the python bindings actually been changed yet to handle what is returned by geoTIFF? It doesn't run currently with latest 'master' in python. |
No you’re right, it was automagically closed by the PR, rather unintended |
I hacked in a fix by applying Is there a more efficient way of doing this do you think? @nilsnolde |
Yeah no need to make it base64, we just want to return raw bytes, see above |
I did read this but would like to confirm, if we send raw bytes, we won't be able to send any info other than the geoTIFF? (ie. no meta data such as the request id etc) |
technically we can put meta data in the tiff but it's not yet implemented. |
geoTIFF seems cool, but it doesn't seem that well supported to me. I've noticed GDAL can convert geoTIFF into PNG (so you can still get the map projections and stuff done before conversion), which is probably (from what I've seen at least) the most common format for displaying raster images on an interactive map (ie. leaflett), and then the bbox would need to be sent separately too. This could be another format type perhaps? Another thing is it'd be good to support a number of different map projections, such as Web Mercator / Pseudo Mercator (EPSG:3857). This is a totally different issue though, so I should probably make this as feature request issue. Currently valhalla is using EPSG:4326 by the looks of it, which is not even the one used by most online maps right? |
the reason geotiff makes sense is that in contains the spatial referencing needed to display the image over a map. yes for raster imagery png and jpg are more common in tile-based systems however how could we make the isochrone endpoint tile-based? just thinking about it in a tiled system brings up a bunch of difficult questions like what: what zoom level there is no practical way to support a slippy map with valhalla for this use-case because we'd have to keep state about what your isochrone was to return the specific tile you are requesting out of it and that is just out of the scope of the project at the moment (the service is stateless/sessionless from the perspective of the caller). yes valhallas data is unprojected (epsg 4326) this is because valhalla is not really focused on the display use-case it is focused on routing applications. why would we spend the time and CPU on tons of trig functions to convert back to lat lon to do distance calculations (crucial for routing but not for display) when we could just keep the raw data in its raw form? it is true that since google embraced it, spherical mercator has taken over webmaps (mapquest used to use equirectangular projection before google was a thing though). but this has many drawbacks including cutting off the poles (how are santa claus or the penguins going to route?) but also distorts distance as it approaches the pole. also with modern tiled maps, the coordinates for each tiles contents are tile relative mercator with varying resolution by zoom level. we could return geometry in global mercator coordinates but you'd still have to translate it to map it to a pixel in a tile. long story short, in my opinion it is ok and actually preferred for us to offer up rasters in unprojected form inside geotiff containers. it is definitely the most natural way for us to deliver the data. after that point it would be totally cool if the end user would use |
These are some interesting points. EDIT: gdal_warp and gdal_retile.py are especially cool ideas |
And GeoTIFF is very widely supported, probably the widest supported geographical raster format. Leaflet is not a good reference anymore IMO, I’d recommend looking into openlayers, it’s (very) actively maintained. |
Thought I'd try out this new GeoTIFF feature on Mac ARM.
I'm running it through python bindings so I can't get a lot of debugging info.
Has anyone successfully ran this on an ARM build of any kind?
I'm doing regular isochrone request (which works), with "format":"geotiff" added
The text was updated successfully, but these errors were encountered: