Updated Dockerfile with SpatiaLite version 5.0 #1249
Comments
Worth noting that the Docker image used by datasette/datasette/utils/__init__.py Lines 349 to 353 in d0fd833
Where the apt extras for SpatiaLite are: datasette/datasette/utils/__init__.py Lines 344 to 345 in d0fd833
|
I tried this patch against
diff --git a/Dockerfile b/Dockerfile
index f4b1414..dd659e1 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,25 +1,26 @@
-FROM python:3.7.10-slim-stretch as build
+FROM python:3.9.2-slim-buster as build
# Setup build dependencies
RUN apt update \
-&& apt install -y python3-dev build-essential wget libxml2-dev libproj-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \
- && apt clean
+ && apt install -y python3-dev build-essential wget libxml2-dev libproj-dev \
+ libminizip-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \
+ && apt clean
-
-RUN wget "https://www.sqlite.org/2020/sqlite-autoconf-3310100.tar.gz" && tar xzf sqlite-autoconf-3310100.tar.gz \
- && cd sqlite-autoconf-3310100 && ./configure --disable-static --enable-fts5 --enable-json1 CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \
+RUN wget "https://www.sqlite.org/2021/sqlite-autoconf-3340100.tar.gz" && tar xzf sqlite-autoconf-3340100.tar.gz \
+ && cd sqlite-autoconf-3340100 && ./configure --disable-static --enable-fts5 --enable-json1 \
+ CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \
&& make && make install
-RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-sources/freexl-1.0.5.tar.gz" && tar zxf freexl-1.0.5.tar.gz \
- && cd freexl-1.0.5 && ./configure && make && make install
+RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-1.0.6.tar.gz" && tar zxf freexl-1.0.6.tar.gz \
+ && cd freexl-1.0.6 && ./configure && make && make install
-RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-sources/libspatialite-4.4.0-RC0.tar.gz" && tar zxf libspatialite-4.4.0-RC0.tar.gz \
- && cd libspatialite-4.4.0-RC0 && ./configure && make && make install
+RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-5.0.1.tar.gz" && tar zxf libspatialite-5.0.1.tar.gz \
+ && cd libspatialite-5.0.1 && ./configure --disable-rttopo && make && make install
RUN wget "http://www.gaia-gis.it/gaia-sins/readosm-sources/readosm-1.1.0.tar.gz" && tar zxf readosm-1.1.0.tar.gz && cd readosm-1.1.0 && ./configure && make && make install
-RUN wget "http://www.gaia-gis.it/gaia-sins/spatialite-tools-sources/spatialite-tools-4.4.0-RC0.tar.gz" && tar zxf spatialite-tools-4.4.0-RC0.tar.gz \
- && cd spatialite-tools-4.4.0-RC0 && ./configure && make && make install
+RUN wget "http://www.gaia-gis.it/gaia-sins/spatialite-tools-5.0.0.tar.gz" && tar zxf spatialite-tools-5.0.0.tar.gz \
+ && cd spatialite-tools-5.0.0 && ./configure --disable-rttopo && make && make install
# Add local code to the image instead of fetching from pypi.
@@ -27,7 +28,7 @@ COPY . /datasette
RUN pip install /datasette
-FROM python:3.7.10-slim-stretch
+FROM python:3.9.2-slim-buster
# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
I had to use
This works, sort of... I'm getting a weird issue where the |
One reason to prioritize this issue: Homebrew upgraded to SpatiaLite 5.0 recently (https://formulae.brew.sh/formula/spatialite-tools), and as a result SpatiaLite databases created on my laptop don't appear to be compatible with Datasette when published using |
Here's something odd: when I run But when I tried compiling SpatiaLite inside the Docker container I had hanging errors on some pages. This is using https://www.gaia-gis.it/gaia-sins/knn/tuscany_housenumbers.7z from the SpatiaLite KNN tutorial at https://www.gaia-gis.it/fossil/libspatialite/wiki?name=KNN |
Ubuntu Groovy has a package for SpatiaLite 5 - I could try using that instead: https://packages.ubuntu.com/groovy/libspatialite-dev |
Debian sid has it too: https://packages.debian.org/sid/libspatialite-dev |
https://pythonspeed.com/articles/base-image-python-docker-images/ suggests using |
So I'm going to try |
Actually for the loadable module I think I need https://packages.ubuntu.com/groovy/libsqlite3-mod-spatialite |
I'm messing around with the
And then:
|
To debug I'm running:
This gets me a shell I can use. |
This is the shortest Dockerfile that appeared to give me a working SpatiaLite module:
FROM ubuntu:20.10
# Setup build dependencies
RUN apt update && apt install -y python3-pip libsqlite3-mod-spatialite && apt clean
# Add local code to the image instead of fetching from pypi.
COPY . /datasette
RUN pip install /datasette
RUN rm -rf /datasette
EXPOSE 8001
CMD ["datasette"] |
It's pretty big though. I tried this version, which avoids copying in junk from my laptop:
FROM ubuntu:20.10
# Setup build dependencies
RUN apt update && apt install -y python3-pip libsqlite3-mod-spatialite && apt clean
RUN pip install datasette
EXPOSE 8001
CMD ["datasette"]
And got this:
|
Building a Dockerfile containing just
Building this one:
FROM ubuntu:20.10
# Setup build dependencies
RUN apt update && \
apt install -y python3-pip libsqlite3-mod-spatialite && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
Resulted in a 515MB image. |
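An editorial aside on the cleanup line above: `{apt,dpkg,cache,log}` is brace expansion, which is a bash feature. Docker's shell-form `RUN` uses `/bin/sh -c` by default, and on Debian and Ubuntu images `/bin/sh` is dash, which does not expand braces. Assuming that default shell, the `rm -rf` would be handed one literal path and would silently remove nothing. The Python sketch below uses a hypothetical helper, `expand_braces` (not part of Docker or Datasette), to list the paths the pattern was presumably meant to cover and to build an equivalent command that does not rely on brace expansion.

```python
import re
import shlex
from typing import List


def expand_braces(pattern: str) -> List[str]:
    """Expand one non-nested {a,b,c} group, roughly the way bash would.

    Hypothetical helper for illustration only; it handles a single group
    and leaves the pattern untouched when there is nothing to expand.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None or "," not in match.group(1):
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    return [head + option + tail for option in match.group(1).split(",")]


# The pattern used in the Dockerfile above.
paths = expand_braces("/var/lib/{apt,dpkg,cache,log}/")

# An explicit command that behaves the same under dash and bash.
command = "rm -rf " + " ".join(shlex.quote(p) for p in paths)
print(command)
```

The later Dockerfiles in this thread spell out `rm -rf /var/lib/apt` and `rm -rf /var/lib/dpkg` separately, which avoids the problem.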
I tried that with just |
Here's a trick: install SpatiaLite in
FROM ubuntu:20.10 as install_spatialite
RUN apt update && \
apt install -y libsqlite3-mod-spatialite && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
FROM python:3.9.2-slim as build
RUN pip install datasette
#COPY . /datasette
#RUN pip install /datasette
FROM python:3.9.2-slim
# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu
ENV LD_LIBRARY_PATH=/usr/local/lib
EXPOSE 8001
CMD ["datasette"]
That produced a 265MB image. |
Which is great, because the image on Docker Hub right now is 383MB. |
... except my clever image using SpatiaLite installed for Ubuntu doesn't actually work:
|
I tried copying just the
FROM ubuntu:20.10 as install_spatialite
RUN apt update && \
apt install -y libsqlite3-mod-spatialite && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
FROM python:3.9.2-slim as build
RUN pip install datasette
#COPY . /datasette
#RUN pip install /datasette
FROM python:3.9.2-slim
# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu/mod_spatialite.so /usr/lib/x86_64-linux-gnu/mod_spatialite.so
ENV LD_LIBRARY_PATH=/usr/local/lib
EXPOSE 8001
CMD ["datasette"]
But when I ran Datasette with
|
I'll try using
FROM ubuntu:20.10 as install_spatialite
RUN apt update && \
apt install -y libsqlite3-mod-spatialite && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
FROM ubuntu:20.10 as build
RUN apt update && \
apt install -y python3-pip && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
RUN pip install datasette
#COPY . /datasette
#RUN pip install /datasette
FROM ubuntu:20.10
# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu/mod_spatialite.so /usr/lib/x86_64-linux-gnu/mod_spatialite.so
ENV LD_LIBRARY_PATH=/usr/local/lib
EXPOSE 8001
CMD ["datasette"] |
Got this error attempting to run Datasette (with or without SpatiaLite):
|
Here's why:
|
Well, this is frustrating. I finally found a Dockerfile that worked and installed an Ubuntu pre-compiled SpatiaLite module that would load...
FROM ubuntu:20.10 as install_spatialite
RUN apt update && \
apt install -y libsqlite3-mod-spatialite && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
FROM ubuntu:20.10
RUN apt update && \
apt install -y python3-pip && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
RUN pip install datasette
# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/
ENV LD_LIBRARY_PATH=/usr/local/lib
EXPOSE 8001
CMD ["datasette"]
(Which produced a 550MB image.) And when I ran Datasette I got that same error where the database listing page hangs!
|
I'll spin off a separate ticket to investigate the hang. |
Ideally I'd like to use the Debian stable This pattern might let me do that: https://github.com/helmesjo/cpp_bash_utils/blob/f031e926249f8e2d7f260f22dc8974c6d5be11fe/docker/images/linux-gcc.dockerfile#L20-L24 |
This Dockerfile:
FROM python:3.9.2-slim-buster as build
# Setup build dependencies
RUN apt update \
&& apt install -y python3-dev build-essential wget libxml2-dev libproj-dev \
libminizip-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \
&& apt clean
RUN wget "https://www.sqlite.org/2021/sqlite-autoconf-3340100.tar.gz" && tar xzf sqlite-autoconf-3340100.tar.gz \
&& cd sqlite-autoconf-3340100 && ./configure --disable-static --enable-fts5 --enable-json1 \
CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \
&& make && make install
RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-1.0.6.tar.gz" && tar zxf freexl-1.0.6.tar.gz \
&& cd freexl-1.0.6 && ./configure && make && make install
RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-5.0.1.tar.gz" && tar zxf libspatialite-5.0.1.tar.gz \
&& cd libspatialite-5.0.1 && ./configure --disable-rttopo && make && make install
RUN wget "http://www.gaia-gis.it/gaia-sins/readosm-sources/readosm-1.1.0.tar.gz" && tar zxf readosm-1.1.0.tar.gz && cd readosm-1.1.0 && ./configure && make && make install
RUN wget "http://www.gaia-gis.it/gaia-sins/spatialite-tools-5.0.0.tar.gz" && tar zxf spatialite-tools-5.0.0.tar.gz \
&& cd spatialite-tools-5.0.0 && ./configure --disable-rttopo && make && make install
# Add local code to the image instead of fetching from pypi.
#COPY . /datasette
#RUN pip install /datasette
RUN pip install datasette
FROM python:3.9.2-slim-buster
# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=build /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu
ENV LD_LIBRARY_PATH=/usr/local/lib
EXPOSE 8001
CMD ["datasette"]
Produced a 448MB image. |
This Dockerfile:
FROM python:3.9.2-slim-buster as build
# software-properties-common provides add-apt-repository
RUN apt-get update && \
apt-get -y install software-properties-common && \
add-apt-repository "deb http://httpredir.debian.org/debian sid main" && \
apt-get update && \
apt-get -t sid install -y libsqlite3-mod-spatialite && \
apt clean && \
rm -rf /var/lib/{apt,dpkg,cache,log}/
RUN pip install datasette
EXPOSE 8001
CMD ["datasette"]
Produces a 344MB image that includes a working SpatiaLite 5.0 module. And weirdly... it doesn't exhibit the hanging bug! |
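One way to confirm that an image like this really contains a loadable SpatiaLite module is to load it from Python's `sqlite3` module and ask for its version. This is a minimal sketch, not something taken from the thread. The function name `spatialite_version` is hypothetical, and the module path is the Debian/Ubuntu x86_64 location that appears in the `COPY` lines earlier in this issue, so it is an assumption about where the package installs the file.

```python
import sqlite3
from typing import Optional


def spatialite_version(module_path: str) -> Optional[str]:
    """Return SpatiaLite's version string, or None if the module cannot be loaded."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.enable_load_extension(True)
        conn.load_extension(module_path)
        # spatialite_version() is an SQL function provided by SpatiaLite itself.
        return conn.execute("SELECT spatialite_version()").fetchone()[0]
    except (sqlite3.OperationalError, AttributeError):
        # OperationalError: the module is missing or failed to load.
        # AttributeError: this Python build lacks extension-loading support.
        return None
    finally:
        conn.close()


# Assumed path: the location used in the Dockerfiles in this thread.
print(spatialite_version("/usr/lib/x86_64-linux-gnu/mod_spatialite.so"))
```

Run inside the container, this should print a version string when the module loads and `None` otherwise. It only checks that the module loads, so it would not detect the page-hang problem described above.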
Considering the image on Docker Hub right now is |
Replacing
Got the size down to 305MB. |
Does |
I wrote a bunch of tips on creating smaller Docker images here: https://simonwillison.net/2018/Nov/19/smaller-python-docker-images/ |
Adding |
That dropped it to 265MB. |
FROM python:3.9.2-slim-buster as build
# software-properties-common provides add-apt-repository
RUN apt-get update && \
apt-get -y --no-install-recommends install software-properties-common && \
add-apt-repository "deb http://httpredir.debian.org/debian sid main" && \
apt-get update && \
apt-get -t sid install -y --no-install-recommends libsqlite3-mod-spatialite && \
apt clean && \
rm -rf /var/lib/apt && \
rm -rf /var/lib/dpkg
RUN pip install datasette && \
find /usr/local/lib -name '__pycache__' | xargs rm -r && \
rm -rf /root/.cache/pip
EXPOSE 8001
CMD ["datasette"]
262 MB |
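The `find /usr/local/lib -name '__pycache__' | xargs rm -r` step in the Dockerfile above deletes compiled bytecode caches after `pip install`. For readers less familiar with that shell idiom, here is a rough Python equivalent, demonstrated on a throwaway directory. The helper name `remove_pycache` is hypothetical, and unlike `find -name` it only matches directories.

```python
import shutil
import tempfile
from pathlib import Path


def remove_pycache(root: Path) -> int:
    """Delete every __pycache__ directory under root; return how many were found."""
    # Collect the targets first so the tree is not modified while it is being walked.
    targets = [p for p in root.rglob("__pycache__") if p.is_dir()]
    for target in targets:
        shutil.rmtree(target, ignore_errors=True)
    return len(targets)


# Demonstrate on a temporary directory rather than a real site-packages tree.
with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "pkg" / "__pycache__"
    cache_dir.mkdir(parents=True)
    (cache_dir / "mod.cpython-39.pyc").write_bytes(b"")
    removed = remove_pycache(Path(tmp))
    still_there = cache_dir.exists()

print(removed, still_there)
```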
I tried copying just the
|
I tried adding |
Considering the image on Docker Hub is 383MB, I'm happy with getting that down to 262MB. I'm going to stop looking for new optimizations here. |
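To put that saving in relative terms, here is a small arithmetic sketch using the two sizes quoted in the comment above (383MB for the image on Docker Hub at the time, 262MB for the new build). The function name `reduction_percent` is just illustrative.

```python
def reduction_percent(before_mb: float, after_mb: float) -> float:
    """Percentage by which the image shrank relative to its original size."""
    return (before_mb - after_mb) / before_mb * 100


# Sizes quoted in the thread: 383MB on Docker Hub, 262MB for the new build.
saving = reduction_percent(383, 262)
print(f"{383 - 262}MB saved, {saving:.1f}% smaller")
```

That is 121MB, a little under a third of the original image.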
I think part of the reason it's smaller is that I ran |
Final version of Dockerfile which installs the specified version from GitHub:
FROM python:3.9.2-slim-buster as build
# Version of Datasette to install, e.g. 0.55
# docker build . -t datasette --build-arg VERSION=0.55
ARG VERSION
# software-properties-common provides add-apt-repository
# which we need in order to install a more recent release
# of libsqlite3-mod-spatialite from the sid distribution
RUN apt-get update && \
apt-get -y --no-install-recommends install software-properties-common && \
add-apt-repository "deb http://httpredir.debian.org/debian sid main" && \
apt-get update && \
apt-get -t sid install -y --no-install-recommends libsqlite3-mod-spatialite && \
apt-get remove -y software-properties-common && \
apt clean && \
rm -rf /var/lib/apt && \
rm -rf /var/lib/dpkg
RUN pip install https://github.com/simonw/datasette/archive/refs/tags/${VERSION}.zip && \
find /usr/local/lib -name '__pycache__' | xargs rm -r && \
rm -rf /root/.cache/pip
EXPOSE 8001
CMD ["datasette"]
Run against 0.55 this produces an image of 262MB |
(Without the |
Don't forget to update this bit of the docs: https://docs.datasette.io/en/0.55/spatialite.html#building-spatialite-from-source
See also #1273 |
One last test of that Dockerfile:
|
I'll close this issue after I ship Datasette 0.56 and confirm that the Dockerfile was correctly built and published to Docker Hub. |
I just shipped Datasette 0.56 - here's the CI run: https://github.com/simonw/datasette/runs/2214701802?check_suite_focus=true It pushed a new
And then:
Outputs:
{
"version": "3.27.2",
"fts_versions": [
"FTS5",
"FTS4",
"FTS3"
],
"extensions": {
"json1": null,
"spatialite": "5.0.1"
},
"compile_options": [
"COMPILER=gcc-8.3.0",
"ENABLE_COLUMN_METADATA",
"ENABLE_DBSTAT_VTAB",
"ENABLE_FTS3",
"ENABLE_FTS3_PARENTHESIS",
"ENABLE_FTS3_TOKENIZER",
"ENABLE_FTS4",
"ENABLE_FTS5",
"ENABLE_JSON1",
"ENABLE_LOAD_EXTENSION",
"ENABLE_PREUPDATE_HOOK",
"ENABLE_RTREE",
"ENABLE_SESSION",
"ENABLE_STMTVTAB",
"ENABLE_UNLOCK_NOTIFY",
"ENABLE_UPDATE_DELETE_LIMIT",
"HAVE_ISNAN",
"LIKE_DOESNT_MATCH_BLOBS",
"MAX_SCHEMA_RETRY=25",
"MAX_VARIABLE_NUMBER=250000",
"OMIT_LOOKASIDE",
"SECURE_DELETE",
"SOUNDEX",
"TEMP_STORE=1",
"THREADSAFE=1",
"USE_URI"
]
} |
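A check like this could be automated. The sketch below assumes the JSON above has already been captured by whatever means and simply inspects the `extensions.spatialite` field. The function name `spatialite_tuple` is hypothetical, and the sample is trimmed to the keys that matter here.

```python
import json
import re
from typing import Optional, Tuple


def spatialite_tuple(versions: dict) -> Optional[Tuple[int, int, int]]:
    """Extract the SpatiaLite version as a comparable (major, minor, patch) tuple."""
    raw = versions.get("extensions", {}).get("spatialite")
    if raw is None:
        return None
    # Tolerate suffixes such as "-RC0" by matching only the leading numbers.
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", raw)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# Trimmed copy of the output shown above.
sample = json.loads(
    '{"version": "3.27.2", "extensions": {"json1": null, "spatialite": "5.0.1"}}'
)
found = spatialite_tuple(sample)
print(found, found is not None and found >= (5, 0, 0))
```

The same helper reports `(4, 4, 0)` for the previously bundled `4.4.0-RC0`, so comparing against `(5, 0, 0)` distinguishes the old image from the new one.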
The version bundled in Datasette's Docker image right now is 4.4.0-RC0 (datasette/Dockerfile, lines 16 to 17 in d0fd833).
SpatiaLite 5 has been out for a couple of months and has a bunch of big improvements, most notably stable KNN support.