Link to libkml for improved KML support and compatibility with Google Earth #2

Closed
thclark opened this issue Oct 2, 2019 · 7 comments


thclark commented Oct 2, 2019

Hi @KevinBrolly great work!

About this new buildpack: Thank you! If we can get this issue solved, it'll keep me on Heroku (I'd hate to leave, but I have been really struggling to get GDAL installed with libkml linked, so I have been dockerizing and moving to Google Cloud, which is a nightmare to set up!).

WHAT I'D LOVE
If the version of GDAL vendored into this buildpack was linked against Google's libkml library.

WHY
GDAL already has a default KML file parser, so why build it with an additional engine for parsing KML files?

  • libkml is a reference implementation of the KML standard
  • The default implementation only supports vector layers, not raster layers
  • The default implementation drops 'description' tags on folders
  • libkml gives more reliable (in my experience) flattening of nested feature sets
  • libkml supports *.kmz files too, allowing persistence of icons and rasters
  • libkml supports Google Earth extensions, which are pretty common among people using KML files (I don't use them, and the other reasons are motivation enough for me, but I suspect many do).

HOW
If it helps, here are relevant parts of a Dockerfile that adds a version of GDAL with libkml included to the heroku stack (it'll obviously be a bit different for you, since you'll build proj and geos etc, but hopefully is useful):

FROM heroku/heroku:18-build as build

# ...  stuff ...

# These libraries should always be done first. GDAL takes *ages* to build, so if done later in the Dockerfile, then any
# change that invalidates the cache will trigger an extremely long rebuild.

ARG GDAL_VERSION=v2.4.1

# Install proj, geos, libkml and build tools needed to compile gdal
RUN apt-get update -y && apt-get install -y --fix-missing --no-install-recommends \
        libkml-dev libproj-dev libgeos-dev \
        curl autoconf automake bash-completion

RUN ldconfig

# GDAL has a range of install options. Most of them specialized. Some add a lot of size to the slug, so be careful! Some options (may not be exhaustive):
#       python3-dev python3-numpy libboost-dev  libpng-dev libjpeg-dev libgif-dev \
#       libcharls-dev libopenjp2-7-dev libcairo2-dev \
#       liblzma-dev curl libcurl4-gnutls-dev libxml2-dev libexpat-dev libxerces-c-dev \
#       libnetcdf-dev libpoppler-dev libpoppler-private-dev \
#       libspatialite-dev swig libhdf4-alt-dev libhdf5-serial-dev \
#       libfreexl-dev unixodbc-dev libwebp-dev libepsilon-dev \
#       liblcms2-2 libpcre3-dev libcrypto++-dev libdap-dev libfyba-dev \
#       libmysqlclient-dev libogdi3.2-dev \
#       libcfitsio-dev openjdk-8-jdk libzstd1-dev \
#       libpq-dev libssl-dev

# Build GDAL
RUN mkdir gdal \
    && curl -L https://github.com/OSGeo/gdal/archive/${GDAL_VERSION}.tar.gz | tar xz -C gdal --strip-components=1 \
    && cd gdal/gdal \
    && ./configure \
        --without-libtool \
        --with-geos=yes \
        --with-libkml \
        --with-proj \
    && make \
    && make install \
    && cd ../.. \
    && rm -rf gdal

RUN ldconfig
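
Once the image is built, one way to confirm that the libkml driver was actually compiled in is to look for it in the OGR driver list. This is a minimal sketch, assuming `ogrinfo` from the build above is on the PATH; the helper name `has_libkml_driver` is made up for illustration:

```shell
# Hypothetical helper: reads an OGR driver listing on stdin and succeeds
# only if the LIBKML driver appears in it.
has_libkml_driver() {
    grep -q 'LIBKML'
}

# Usage inside the container (commented out so the snippet runs without GDAL):
# ogrinfo --formats | has_libkml_driver && echo "libkml driver available"
```

If the check fails, GDAL has probably fallen back to its default KML driver, which usually means `./configure` did not find the libkml headers.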

AN OFFER
I'm not that familiar with heroku buildpacks, but have ahem "become" familiar with building GDAL. If it'll save me porting everything to google cloud, I'll happily spend a day of dev time helping with this in any way I can. Just ping me.

@thclark thclark changed the title Link to libkml for compatibility with Google Earth and maps files Link to libkml for improved KML support and compatibility with Google Earth Oct 2, 2019
@CaseyFaist

😂 "become" familiar, I know a little about what that feels like

Your work looks more thorough than what I found after a quick google; we'd probably want to move the steps into formulas and trigger files but the steps will likely look similar. If you want to take a crack at formulize-ing it, go for it 💯

@KevinBrolly
Contributor

Hey @thclark I have added libkml support in this commit - 7330193

Could you give it a test and let me know how you get on? If you are already using this buildpack in an application the new binary should get pulled down the next time you do a build.

I have done it slightly differently from your PR (#4), as it is faster and easier to compile libkml and its dependencies to a custom prefix and ship that over with the gdal binary than to apt-get during the build phase.
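
For context on how libraries shipped under a custom prefix can be found at runtime: Heroku sources any `.profile.d/*.sh` scripts that a buildpack leaves in the app when a dyno starts, so a buildpack can extend `LD_LIBRARY_PATH` there. The following is only a rough sketch of that idea; the directory `/app/.vendor/lib` and the helper name are assumptions, not necessarily what this buildpack does:

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH, avoiding a
# stray ':' when the variable is empty or unset.
prepend_library_path() {
    dir="$1"
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="$dir:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$dir"
    fi
    export LD_LIBRARY_PATH
}

# Assumed vendor location; substitute wherever the libraries are unpacked.
prepend_library_path "/app/.vendor/lib"
```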

@thclark
Author

thclark commented Oct 9, 2019

Getting there! Great work @KevinBrolly - thanks for your efforts on this.

This builds and releases, but unfortunately, the libraries aren't found at runtime... I have an error that looks like:

~ $ gdalinfo
gdalinfo: error while loading shared libraries: libkmlbase.so.1: cannot open shared object file: No such file or directory

I've just made my test app public which deploys successfully to heroku (use this buildpack and heroku/python), so you're welcome to use that to help.

I shelled in to that deployed app, and found the other libraries had been deployed by the buildpack, but not libkml. Perhaps there's a build cache somewhere that needs to be flushed?
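
A quick way to list every library the binary cannot resolve, rather than hitting them one error at a time, is to filter the output of `ldd`. This is a small sketch; the helper name `missing_libs` is made up for illustration:

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints the name of
# every shared library the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage on the dyno (commented out so the snippet runs without GDAL installed):
# ldd "$(command -v gdalinfo)" | missing_libs
```

An empty result means all of the binary's direct dependencies were found on the dyno.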

@KevinBrolly
Contributor

Hey @thclark - Sorry, looks like I forgot to include the libkml files in the gdal package at runtime 🤦‍♂

Thanks for the test app, that's really helpful. I will let you know when libkml is properly deployed.

@KevinBrolly
Contributor

Hey @thclark I have updated the gdal packages for the heroku-18 stack (gdal takes an age to build!). I spun up a dyno with your test app and it looks good:

~ $ gdalinfo
Usage: gdalinfo [--help-general] [-json] [-mm] [-stats] [-hist] [-nogcp] [-nomd]
                [-norat] [-noct] [-nofl] [-checksum] [-proj4]
                [-listmdd] [-mdd domain|`all`]*
                [-sd subdataset] [-oo NAME=VALUE]* datasetname

Let me know if that is working for you now.

@thclark
Author

thclark commented Oct 11, 2019

Woohooo! This is great, working well. Thank you so much @KevinBrolly - this issue has been completely killing my project.

Perhaps the following will be useful to add to the README (maybe edited slightly ;) ):

Notes for switching over from using the GDAL that gets vendored with heroku/python

You have to completely remove the BUILD_WITH_GEO_LIBRARIES environment variable.

You have to flush your application build cache.

You have to monkey around with the Heroku CI cache:

  • Heroku CI is a total PITA when it comes to cache; AFAICT it uses a separate cache from the build cache, and there's no documented way of flushing it.
  • Running it in debug mode from the cli, you can run with no cache. This doesn't purge the cache, though, so everything working nicely here will still pick up outdated libraries when rerun or automated.
  • There's an extremely well hidden button in the UI, to the top right of the test console (see screenshot). It seems like the "run again with no cache" option does actually purge the cache, but only if you wait for all other tests to finish (failing, on the old cached GDAL library), then click this button from the UI, wait for that test to finish (now passing) then repeat for every active branch.
    [Image: Screenshot 2019-10-11 at 18 59 52]
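
The first two steps above (removing the variable and flushing the build cache) can be sketched with the Heroku CLI. This assumes the CLI is installed and that the `heroku-builds` plugin is used for cache purging; the `run` and `switch_to_buildpack_gdal` helpers and the app name `my-app` are placeholders for illustration. The Heroku CI cache is not covered, since it has to be handled from the UI as described above.

```shell
# Hypothetical wrapper: print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper bundling the two CLI steps for a single app.
switch_to_buildpack_gdal() {
    app="$1"
    # 1. Remove the variable that makes heroku/python vendor its own GDAL.
    run heroku config:unset BUILD_WITH_GEO_LIBRARIES --app "$app"
    # 2. Purge the application build cache (uses the heroku-builds CLI plugin).
    run heroku plugins:install heroku-builds
    run heroku builds:cache:purge --app "$app"
}

# Preview the commands without touching a real app ("my-app" is a placeholder):
DRY_RUN=1 switch_to_buildpack_gdal my-app
```

Run it without `DRY_RUN` to execute the commands for real, then trigger a fresh deploy so the buildpack's GDAL is pulled in.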

Phew! Job done! Thank you @KevinBrolly and @CaseyFaist you're my heroes of the month, and I'll allow you the honour of closing this issue ;)

@KevinBrolly
Contributor

Thanks for the README suggestions @thclark, they will be really useful for people switching over. I will get them into the README now.

You are right that the Heroku CI cache functionality could be documented better.

Heroku CI does have a separate build cache for CI builds. "Run Again Without Cache" works like this: there is a single cache for your test runs, and if you click "Run Again Without Cache", then at the end of that run its build cache replaces the previous one.

This is why you have to wait for the tests to fail, then run again without cache, then after that run the new cache from that build will be used and your tests then pass. I will make a note to document this better.
