Updated Dockerfile with SpatiaLite version 5.0 #1249

Closed
simonw opened this issue Mar 8, 2021 · 45 comments

@simonw
Owner

simonw commented Mar 8, 2021

The version bundled in Datasette's Docker image right now is 4.4.0-RC0:

datasette/Dockerfile

Lines 16 to 17 in d0fd833

RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-sources/libspatialite-4.4.0-RC0.tar.gz" && tar zxf libspatialite-4.4.0-RC0.tar.gz \
&& cd libspatialite-4.4.0-RC0 && ./configure && make && make install

SpatiaLite 5 has been out for a couple of months and has a bunch of big improvements, most notably stable KNN support.

@simonw
Owner Author

simonw commented Mar 8, 2021

Worth noting that the Docker image used by datasette publish cloudrun doesn't actually use that Datasette docker image - it does this:

return """
FROM python:3.8
COPY . /app
WORKDIR /app
{apt_get_extras}

Where the apt extras for SpatiaLite are:

if spatialite:
    apt_get_extras.extend(["python3-dev", "gcc", "libsqlite3-mod-spatialite"])

The libsqlite3-mod-spatialite package installed into that official python:3.8 image doesn't appear to provide SpatiaLite 5.0.
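
To make the mechanism concrete, here is a minimal sketch of how a template like this could turn a list of apt packages into a Dockerfile line. The function name `render_dockerfile` and the exact shape of the generated `RUN` line are assumptions for illustration, not Datasette's actual implementation. Only the package list and the `FROM`/`COPY`/`WORKDIR` lines come from the snippets above.

```python
def render_dockerfile(spatialite: bool = False) -> str:
    """Render a Dockerfile string (hypothetical helper, not Datasette's code)."""
    apt_get_extras = []
    if spatialite:
        # Package list taken from the snippet quoted above.
        apt_get_extras.extend(["python3-dev", "gcc", "libsqlite3-mod-spatialite"])

    lines = ["FROM python:3.8", "COPY . /app", "WORKDIR /app"]
    if apt_get_extras:
        # Assumed shape of the generated line; the real template may differ.
        lines.append(
            "RUN apt-get update && apt-get install -y "
            + " ".join(apt_get_extras)
            + " && rm -rf /var/lib/apt/lists/*"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(spatialite=True))
```

Because the package comes from the base image's distribution, the SpatiaLite version is whatever that distribution ships, which would explain why it lags behind 5.0.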

@simonw
Owner Author

simonw commented Mar 8, 2021

I tried this patch against Dockerfile:

diff --git a/Dockerfile b/Dockerfile
index f4b1414..dd659e1 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,25 +1,26 @@
-FROM python:3.7.10-slim-stretch as build
+FROM python:3.9.2-slim-buster as build
 
 # Setup build dependencies
 RUN apt update \
-&& apt install -y python3-dev build-essential wget libxml2-dev libproj-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \
- && apt clean
+  && apt install -y python3-dev build-essential wget libxml2-dev libproj-dev \
+  libminizip-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \
+  && apt clean
 
-
-RUN wget "https://www.sqlite.org/2020/sqlite-autoconf-3310100.tar.gz" && tar xzf sqlite-autoconf-3310100.tar.gz \
-    && cd sqlite-autoconf-3310100 && ./configure --disable-static --enable-fts5 --enable-json1 CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \
+RUN wget "https://www.sqlite.org/2021/sqlite-autoconf-3340100.tar.gz" && tar xzf sqlite-autoconf-3340100.tar.gz \
+    && cd sqlite-autoconf-3340100 && ./configure --disable-static --enable-fts5 --enable-json1 \
+    CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \
     && make && make install
 
-RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-sources/freexl-1.0.5.tar.gz" && tar zxf freexl-1.0.5.tar.gz \
-    && cd freexl-1.0.5 && ./configure && make && make install
+RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-1.0.6.tar.gz" && tar zxf freexl-1.0.6.tar.gz \
+    && cd freexl-1.0.6 && ./configure && make && make install
 
-RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-sources/libspatialite-4.4.0-RC0.tar.gz" && tar zxf libspatialite-4.4.0-RC0.tar.gz \
-    && cd libspatialite-4.4.0-RC0 && ./configure && make && make install
+RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-5.0.1.tar.gz" && tar zxf libspatialite-5.0.1.tar.gz \
+    && cd libspatialite-5.0.1 && ./configure --disable-rttopo && make && make install
 
 RUN wget "http://www.gaia-gis.it/gaia-sins/readosm-sources/readosm-1.1.0.tar.gz" && tar zxf readosm-1.1.0.tar.gz && cd readosm-1.1.0 && ./configure && make && make install
 
-RUN wget "http://www.gaia-gis.it/gaia-sins/spatialite-tools-sources/spatialite-tools-4.4.0-RC0.tar.gz" && tar zxf spatialite-tools-4.4.0-RC0.tar.gz \
-    && cd spatialite-tools-4.4.0-RC0 && ./configure && make && make install
+RUN wget "http://www.gaia-gis.it/gaia-sins/spatialite-tools-5.0.0.tar.gz" && tar zxf spatialite-tools-5.0.0.tar.gz \
+    && cd spatialite-tools-5.0.0 && ./configure --disable-rttopo && make && make install
 
 
 # Add local code to the image instead of fetching from pypi.
@@ -27,7 +28,7 @@ COPY . /datasette
 
 RUN pip install /datasette
 
-FROM python:3.7.10-slim-stretch
+FROM python:3.9.2-slim-buster
 
 # Copy python dependencies and spatialite libraries
 COPY --from=build /usr/local/lib/ /usr/local/lib/

I had to use --disable-rttopo from the tip in OSGeo/gdal#3443 and also needed to install libminizip-dev.

This works, sort of... I'm getting a weird issue where the /dbname page is hanging some of the time instead of loading correctly. Other than that it seems to work, but a hanging page is bad!

@simonw simonw changed the title Upgrade SpatiaLite to version 5 Upgrade SpatiaLite to version 5.0 Mar 8, 2021
@simonw
Owner Author

simonw commented Mar 8, 2021

One reason to prioritize this issue: Homebrew upgraded to SpatiaLite 5.0 recently (https://formulae.brew.sh/formula/spatialite-tools), and as a result SpatiaLite databases created on my laptop don't appear to be compatible with Datasette when published using datasette publish.

@simonw
Owner Author

simonw commented Mar 22, 2021

Here's something odd: when I run datasette tuscany_housenumbers.sqlite --load-extension=spatialite on macOS against SpatiaLite installed using Homebrew (which reports "spatialite": "5.0.0" on the /-/versions page) I don't get any weird errors at all, everything works fine.

But when I tried compiling SpatiaLite inside the Docker container I had hanging errors on some pages.

This is using https://www.gaia-gis.it/gaia-sins/knn/tuscany_housenumbers.7z from the SpatiaLite KNN tutorial at https://www.gaia-gis.it/fossil/libspatialite/wiki?name=KNN
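
The version that /-/versions reports can also be checked directly from Python's sqlite3 module. This is a minimal sketch, assuming the module can be found under the name `mod_spatialite` (an assumption: the name or path differs between Homebrew and Debian-style installs). The helper name is hypothetical; `spatialite_version()` is SpatiaLite's own SQL function.

```python
import sqlite3
from typing import Optional


def spatialite_version(extension: str = "mod_spatialite") -> Optional[str]:
    """Return the SpatiaLite version string, or None if the module can't be loaded."""
    conn = sqlite3.connect(":memory:")
    try:
        # Some Python builds are compiled without extension loading support.
        if not hasattr(conn, "enable_load_extension"):
            return None
        try:
            conn.enable_load_extension(True)
            conn.load_extension(extension)
        except sqlite3.Error:
            return None
        return conn.execute("select spatialite_version()").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(spatialite_version())
```

Running the same script on the laptop and inside the container gives a quick way to confirm which SpatiaLite build each environment actually loads.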

@simonw
Owner Author

simonw commented Mar 22, 2021

Ubuntu Groovy has a package for SpatiaLite 5 - I could try using that instead: https://packages.ubuntu.com/groovy/libspatialite-dev

@simonw
Owner Author

simonw commented Mar 22, 2021

Debian sid has it too: https://packages.debian.org/sid/libspatialite-dev

@simonw
Owner Author

simonw commented Mar 22, 2021

https://pythonspeed.com/articles/base-image-python-docker-images/ suggests using python:3.9-slim-buster or ubuntu:20.04, but 20.04 is focal, which still has SpatiaLite 4.3.0a-6build1. It's 20.10 that has 5.0: https://packages.ubuntu.com/groovy/libspatialite-dev

@simonw
Owner Author

simonw commented Mar 22, 2021

So I'm going to try 20.10 and see where that gets me.

@simonw
Owner Author

simonw commented Mar 22, 2021

Actually for the loadable module I think I need https://packages.ubuntu.com/groovy/libsqlite3-mod-spatialite

@simonw
Owner Author

simonw commented Mar 22, 2021

I'm messing around with the Dockerfile and after each change I'm running:

docker build . -t datasette-spatialite

And then:

docker run -p 8001:8001 -v `pwd`:/mnt datasette-spatialite:latest datasette -p 8001 -h 0.0.0.0 /mnt/fixtures.db

@simonw
Owner Author

simonw commented Mar 22, 2021

To debug I'm running:

docker run -it -p 8001:8001 -v `pwd`:/mnt datasette-spatialite:latest bash

This gets me a shell I can use.

@simonw
Owner Author

simonw commented Mar 22, 2021

This is the shortest Dockerfile that appeared to give me a working SpatiaLite module:

FROM ubuntu:20.10

# Setup build dependencies
RUN apt update && apt install -y python3-pip libsqlite3-mod-spatialite && apt clean

# Add local code to the image instead of fetching from pypi.
COPY . /datasette

RUN pip install /datasette

RUN rm -rf /datasette

EXPOSE 8001
CMD ["datasette"]

@simonw
Owner Author

simonw commented Mar 22, 2021

It's pretty big though. I tried this version which avoids copying junk from my laptop in:

FROM ubuntu:20.10

# Setup build dependencies
RUN apt update && apt install -y python3-pip libsqlite3-mod-spatialite && apt clean

RUN pip install datasette

EXPOSE 8001
CMD ["datasette"]

And got this:

datasette % docker images datasette-spatialite    
REPOSITORY             TAG       IMAGE ID       CREATED         SIZE
datasette-spatialite   latest    0796950653c2   2 seconds ago   528MB

@simonw
Owner Author

simonw commented Mar 22, 2021

Building a Dockerfile containing just FROM ubuntu:20.10 gave me 79.5MB.

Building this one:

FROM ubuntu:20.10

# Setup build dependencies
RUN apt update && \
    apt install -y python3-pip libsqlite3-mod-spatialite && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

Resulted in a 515MB image.

@simonw
Owner Author

simonw commented Mar 22, 2021

I tried that with just python3-pip (removing libsqlite3-mod-spatialite) and got 435MB.

@simonw
Owner Author

simonw commented Mar 22, 2021

Here's a trick: install SpatiaLite in ubuntu:20.10 and then copy it into the final python:3.9.2-slim image.

FROM ubuntu:20.10 as install_spatialite

RUN apt update && \
    apt install -y libsqlite3-mod-spatialite && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

FROM python:3.9.2-slim as build

RUN pip install datasette

#COPY . /datasette
#RUN pip install /datasette

FROM python:3.9.2-slim

# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu

ENV LD_LIBRARY_PATH=/usr/local/lib

EXPOSE 8001
CMD ["datasette"]

That produced a 265MB image.

@simonw
Owner Author

simonw commented Mar 22, 2021

Which is great, because the image on Docker Hub right now is 383MB.

@simonw
Owner Author

simonw commented Mar 22, 2021

... except my clever image using SpatiaLite installed for Ubuntu doesn't actually work:

datasette % docker run -p 8001:8001 -v `pwd`:/mnt datasette-spatialite:latest datasette -p 8001 -h 0.0.0.0 /mnt/fixtures.db
  File "/usr/local/lib/python3.9/sqlite3/dbapi2.py", line 27, in <module>
    from _sqlite3 import *
ImportError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /usr/lib/x86_64-linux-gnu/libsqlite3.so.0)
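
The traceback suggests that the libsqlite3.so.0 copied over from ubuntu:20.10 was linked against a newer glibc than the one in the python:3.9.2-slim image. A minimal sketch of that version comparison follows. The helper names are hypothetical, and `platform.libc_ver()` only reports something useful on glibc-based systems.

```python
import platform


def parse_version(text: str) -> tuple:
    """Turn '2.29' into (2, 29) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(required: str, available: str) -> bool:
    """True if the available glibc is at least the required version."""
    return parse_version(available) >= parse_version(required)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    # platform.libc_ver() returns empty strings when it can't detect a libc.
    if lib == "glibc" and version:
        print("glibc", version, "satisfies 2.29:", glibc_satisfies("2.29", version))
    else:
        print("glibc not detected on this platform")
```

Running this inside each image shows whether a library built on one can be expected to load on the other. Copying the whole /usr/lib/x86_64-linux-gnu directory also replaces the base image's own libsqlite3, which is why even a plain `import sqlite3` failed here.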

@simonw
Owner Author

simonw commented Mar 22, 2021

I tried copying just the mod_spatialite.so file:

FROM ubuntu:20.10 as install_spatialite

RUN apt update && \
    apt install -y libsqlite3-mod-spatialite && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

FROM python:3.9.2-slim as build

RUN pip install datasette

#COPY . /datasette
#RUN pip install /datasette

FROM python:3.9.2-slim

# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu/mod_spatialite.so /usr/lib/x86_64-linux-gnu/mod_spatialite.so

ENV LD_LIBRARY_PATH=/usr/local/lib

EXPOSE 8001
CMD ["datasette"]

But when I ran Datasette with --load-extension=spatialite I got this:

  File "/usr/local/lib/python3.9/site-packages/datasette/database.py", line 151, in in_thread
    self.ds._prepare_connection(conn, self.name)
  File "/usr/local/lib/python3.9/site-packages/datasette/app.py", line 502, in _prepare_connection
    conn.execute(f"SELECT load_extension('{extension}')")
sqlite3.OperationalError: /usr/lib/x86_64-linux-gnu/mod_spatialite.so.so: cannot open shared object file: No such file or directory
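
A note on the doubled `.so.so` suffix: SQLite's extension loader tries the filename as given and, if that fails, retries with the platform's shared library suffix appended, so the error names the second attempt. The file itself can therefore exist and still fail to load, for example because its own dependencies are missing. A minimal sketch for telling those cases apart, using a hypothetical helper `diagnose_extension`:

```python
import ctypes
import os


def diagnose_extension(path: str) -> str:
    """Report whether a shared library is missing, unloadable, or loadable."""
    if not os.path.exists(path):
        return f"missing: {path}"
    try:
        # dlopen the file directly so the dynamic linker's own error is visible.
        ctypes.CDLL(path)
    except OSError as exc:
        return f"exists but failed to load: {exc}"
    return f"loadable: {path}"


if __name__ == "__main__":
    print(diagnose_extension("/usr/lib/x86_64-linux-gnu/mod_spatialite.so"))
```

If the result is "exists but failed to load", the message from the dynamic linker usually names the dependency it could not find.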

@simonw
Owner Author

simonw commented Mar 22, 2021

I'll try using ubuntu:20.10 for everything:

FROM ubuntu:20.10 as install_spatialite

RUN apt update && \
    apt install -y libsqlite3-mod-spatialite && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

FROM ubuntu:20.10 as build

RUN apt update && \
    apt install -y python3-pip && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

RUN pip install datasette

#COPY . /datasette
#RUN pip install /datasette

FROM ubuntu:20.10

# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu/mod_spatialite.so /usr/lib/x86_64-linux-gnu/mod_spatialite.so

ENV LD_LIBRARY_PATH=/usr/local/lib

EXPOSE 8001
CMD ["datasette"]

@simonw
Owner Author

simonw commented Mar 22, 2021

Got this error attempting to run Datasette (with or without SpatiaLite):

standard_init_linux.go:219: exec user process caused: no such file or directory

@simonw
Owner Author

simonw commented Mar 22, 2021

Here's why:

datasette % docker run -it -p 8001:8001 -v `pwd`:/mnt datasette-spatialite:latest bash                                                      
root@3430352ff378:/# datasette
bash: /usr/local/bin/datasette: /usr/bin/python3: bad interpreter: No such file or directory
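
pip writes the interpreter path into the first line of the console script when it installs it, and the final stage here copies the script without installing Python, so the `#!` line presumably points at a file that no longer exists. A minimal sketch for checking that, with hypothetical helper names:

```python
import os
from typing import Optional


def shebang_interpreter(script_path: str) -> Optional[str]:
    """Return the interpreter named on a script's #! line, or None if absent."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def interpreter_missing(script_path: str) -> bool:
    """True if the script has a shebang pointing at a file that doesn't exist."""
    interpreter = shebang_interpreter(script_path)
    return interpreter is not None and not os.path.exists(interpreter)
```

Calling `interpreter_missing("/usr/local/bin/datasette")` inside the broken image would be expected to return True, matching the "bad interpreter" message.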

@simonw
Owner Author

simonw commented Mar 22, 2021

Well this is frustrating. I finally found a Dockerfile that worked and installed an Ubuntu pre-compiled SpatiaLite module that would load...

FROM ubuntu:20.10 as install_spatialite

RUN apt update && \
    apt install -y libsqlite3-mod-spatialite && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

FROM ubuntu:20.10

RUN apt update && \
    apt install -y python3-pip && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

RUN pip install datasette

# Copy spatial extensions
COPY --from=install_spatialite /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/

ENV LD_LIBRARY_PATH=/usr/local/lib

EXPOSE 8001
CMD ["datasette"]

(Which produced a 550MB image)

And when I ran Datasette I got that same error where the database listing page hangs!

docker run -p 8001:8001 -v `pwd`:/mnt datasette-spatialite:latest datasette -p 8001 -h 0.0.0.0 /mnt/tuscany_housenumbers.sqlite --load-extension=spatialite

@simonw
Owner Author

simonw commented Mar 22, 2021

I'll spin off a separate ticket to investigate the hang.

@simonw
Owner Author

simonw commented Mar 22, 2021

Ideally I'd like to use the Debian stable python:3.9.2-slim-buster base image but install SpatiaLite from Debian unstable here: https://packages.debian.org/sid/libspatialite7

This pattern might let me do that: https://github.com/helmesjo/cpp_bash_utils/blob/f031e926249f8e2d7f260f22dc8974c6d5be11fe/docker/images/linux-gcc.dockerfile#L20-L24

@simonw
Owner Author

simonw commented Mar 22, 2021

This Dockerfile:

FROM python:3.9.2-slim-buster as build

# Setup build dependencies
RUN apt update \
    && apt install -y python3-dev build-essential wget libxml2-dev libproj-dev \
    libminizip-dev libgeos-dev libsqlite3-dev zlib1g-dev pkg-config git \
    && apt clean

RUN wget "https://www.sqlite.org/2021/sqlite-autoconf-3340100.tar.gz" && tar xzf sqlite-autoconf-3340100.tar.gz \
    && cd sqlite-autoconf-3340100 && ./configure --disable-static --enable-fts5 --enable-json1 \
    CFLAGS="-g -O2 -DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS -DSQLITE_ENABLE_FTS4=1 -DSQLITE_ENABLE_RTREE=1 -DSQLITE_ENABLE_JSON1" \
    && make && make install

RUN wget "http://www.gaia-gis.it/gaia-sins/freexl-1.0.6.tar.gz" && tar zxf freexl-1.0.6.tar.gz \
    && cd freexl-1.0.6 && ./configure && make && make install

RUN wget "http://www.gaia-gis.it/gaia-sins/libspatialite-5.0.1.tar.gz" && tar zxf libspatialite-5.0.1.tar.gz \
    && cd libspatialite-5.0.1 && ./configure --disable-rttopo && make && make install

RUN wget "http://www.gaia-gis.it/gaia-sins/readosm-sources/readosm-1.1.0.tar.gz" && tar zxf readosm-1.1.0.tar.gz && cd readosm-1.1.0 && ./configure && make && make install

RUN wget "http://www.gaia-gis.it/gaia-sins/spatialite-tools-5.0.0.tar.gz" && tar zxf spatialite-tools-5.0.0.tar.gz \
    && cd spatialite-tools-5.0.0 && ./configure --disable-rttopo && make && make install

# Add local code to the image instead of fetching from pypi.
#COPY . /datasette
#RUN pip install /datasette
RUN pip install datasette

FROM python:3.9.2-slim-buster

# Copy python dependencies and spatialite libraries
COPY --from=build /usr/local/lib/ /usr/local/lib/
# Copy executables
COPY --from=build /usr/local/bin /usr/local/bin
# Copy spatial extensions
COPY --from=build /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu

ENV LD_LIBRARY_PATH=/usr/local/lib

EXPOSE 8001
CMD ["datasette"]

Produced a 448MB image.

@simonw
Owner Author

simonw commented Mar 22, 2021

This Dockerfile:

FROM python:3.9.2-slim-buster as build

# software-properties-common provides add-apt-repository
RUN apt-get update && \
    apt-get -y install software-properties-common && \
    add-apt-repository "deb http://httpredir.debian.org/debian sid main" && \
    apt-get update && \
    apt-get -t sid install -y libsqlite3-mod-spatialite && \
    apt clean && \
    rm -rf /var/lib/{apt,dpkg,cache,log}/

RUN pip install datasette

EXPOSE 8001
CMD ["datasette"]

Produces a 344MB image that includes a working SpatiaLite 5.0 module. And weirdly... it doesn't exhibit the hanging bug!

@simonw
Owner Author

simonw commented Mar 22, 2021

Considering the image on Docker Hub right now is 383MB this is actually an improvement.

@simonw
Owner Author

simonw commented Mar 22, 2021

Replacing rm -rf /var/lib/{apt,dpkg,cache,log}/ with

    rm -rf /var/lib/apt && \
    rm -rf /var/lib/dpkg

Got the size down to 305MB.
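
A likely explanation for the difference, though nothing in this thread confirms it: a shell-form RUN executes through /bin/sh -c, and on Debian-based images /bin/sh is normally dash, which does not perform brace expansion. The braced path would then be treated as one literal name that doesn't exist, and rm -rf would exit quietly without deleting anything. A minimal sketch of generating the explicit form instead, using a hypothetical helper:

```python
def explicit_rm_command(base: str, names: list) -> str:
    """Build an rm command with one explicit path per directory, no braces."""
    return " && ".join(f"rm -rf {base}/{name}" for name in names)


if __name__ == "__main__":
    # Prints a command that behaves the same in any POSIX shell.
    print(explicit_rm_command("/var/lib", ["apt", "dpkg"]))
```

Spelling out each path keeps the cleanup independent of which shell the base image uses for RUN.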

@simonw
Owner Author

simonw commented Mar 22, 2021

Does --no-install-recommends make a difference?

@simonw
Owner Author

simonw commented Mar 22, 2021

I wrote a bunch of tips on creating smaller Docker images here: https://simonwillison.net/2018/Nov/19/smaller-python-docker-images/

@simonw
Owner Author

simonw commented Mar 22, 2021

Adding --no-install-recommends dropped it to 275MB

@simonw
Owner Author

simonw commented Mar 22, 2021

RUN pip install datasette && \
    find /usr/local/lib -name '__pycache__' | xargs rm -r

That dropped it to 265MB.

@simonw
Owner Author

simonw commented Mar 22, 2021

FROM python:3.9.2-slim-buster as build

# software-properties-common provides add-apt-repository
RUN apt-get update && \
    apt-get -y --no-install-recommends install software-properties-common && \
    add-apt-repository "deb http://httpredir.debian.org/debian sid main" && \
    apt-get update && \
    apt-get -t sid install -y --no-install-recommends libsqlite3-mod-spatialite && \
    apt clean && \
    rm -rf /var/lib/apt && \
    rm -rf /var/lib/dpkg

RUN pip install datasette && \
    find /usr/local/lib -name '__pycache__' | xargs rm -r && \
    rm -rf /root/.cache/pip

EXPOSE 8001
CMD ["datasette"]

That produced a 262MB image.

@simonw
Owner Author

simonw commented Mar 22, 2021

I tried copying just the mod_spatialite.so file into a second stage build but it failed. So I ran bash in a working image and used ldd to figure out what it was linked to:

root@39683f91e588:/usr/lib/x86_64-linux-gnu# ldd mod_spatialite.so
	linux-vdso.so.1 (0x00007ffd021f4000)
	libxml2.so.2 => /usr/lib/x86_64-linux-gnu/libxml2.so.2 (0x00007f5c75412000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f5c753f0000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5c752ac000)
	libminizip.so.1 => /usr/lib/x86_64-linux-gnu/libminizip.so.1 (0x00007f5c750a0000)
	librttopo.so.1 => /usr/lib/x86_64-linux-gnu/librttopo.so.1 (0x00007f5c75028000)
	libfreexl.so.1 => /usr/lib/x86_64-linux-gnu/libfreexl.so.1 (0x00007f5c7501c000)
	libproj.so.19 => /usr/lib/x86_64-linux-gnu/libproj.so.19 (0x00007f5c74ca7000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f5c74a89000)
	libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f5c74967000)
	libgeos_c.so.1 => /usr/lib/x86_64-linux-gnu/libgeos_c.so.1 (0x00007f5c7492b000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5c74766000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f5c74760000)
	libicuuc.so.67 => /usr/lib/x86_64-linux-gnu/libicuuc.so.67 (0x00007f5c74575000)
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f5c7454d000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f5c75d49000)
	libtiff.so.5 => /usr/lib/x86_64-linux-gnu/libtiff.so.5 (0x00007f5c744c7000)
	libcurl-gnutls.so.4 => /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007f5c74439000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f5c7426c000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f5c74250000)
	libgeos-3.9.0.so => /usr/lib/x86_64-linux-gnu/libgeos-3.9.0.so (0x00007f5c74040000)
	libicudata.so.67 => /usr/lib/x86_64-linux-gnu/libicudata.so.67 (0x00007f5c72527000)
	libwebp.so.6 => /usr/lib/x86_64-linux-gnu/libwebp.so.6 (0x00007f5c724bc000)
	libzstd.so.1 => /usr/lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f5c7241c000)
	libjbig.so.0 => /usr/lib/x86_64-linux-gnu/libjbig.so.0 (0x00007f5c7220e000)
	libjpeg.so.62 => /usr/lib/x86_64-linux-gnu/libjpeg.so.62 (0x00007f5c72188000)
	libdeflate.so.0 => /usr/lib/x86_64-linux-gnu/libdeflate.so.0 (0x00007f5c7216c000)
	libnghttp2.so.14 => /usr/lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f5c72144000)
	libidn2.so.0 => /usr/lib/x86_64-linux-gnu/libidn2.so.0 (0x00007f5c72125000)
	librtmp.so.1 => /usr/lib/x86_64-linux-gnu/librtmp.so.1 (0x00007f5c71f08000)
	libssh2.so.1 => /usr/lib/x86_64-linux-gnu/libssh2.so.1 (0x00007f5c71eda000)
	libpsl.so.5 => /usr/lib/x86_64-linux-gnu/libpsl.so.5 (0x00007f5c71ec5000)
	libnettle.so.6 => /usr/lib/x86_64-linux-gnu/libnettle.so.6 (0x00007f5c71e8d000)
	libgnutls.so.30 => /usr/lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f5c71ce0000)
	libgssapi_krb5.so.2 => /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f5c71c93000)
	libkrb5.so.3 => /usr/lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f5c71bb3000)
	libk5crypto.so.3 => /usr/lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f5c71b7f000)
	libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f5c71b77000)
	libldap_r-2.4.so.2 => /usr/lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007f5c71b23000)
	liblber-2.4.so.2 => /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007f5c71b12000)
	libunistring.so.2 => /usr/lib/x86_64-linux-gnu/libunistring.so.2 (0x00007f5c7198e000)
	libhogweed.so.4 => /usr/lib/x86_64-linux-gnu/libhogweed.so.4 (0x00007f5c71955000)
	libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f5c718d0000)
	libgcrypt.so.20 => /lib/x86_64-linux-gnu/libgcrypt.so.20 (0x00007f5c717b2000)
	libp11-kit.so.0 => /usr/lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f5c71683000)
	libtasn1.so.6 => /usr/lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007f5c71470000)
	libkrb5support.so.0 => /usr/lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f5c71461000)
	libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f5c71458000)
	libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f5c7143e000)
	libsasl2.so.2 => /usr/lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f5c71421000)
	libgpg-error.so.0 => /lib/x86_64-linux-gnu/libgpg-error.so.0 (0x00007f5c713fe000)
	libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007f5c713f4000)
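
With this many dependencies, copying a single .so file between stages is unlikely to work. A minimal sketch of turning `ldd` output into a list of files that would have to travel with the module follows. The function names are hypothetical, and the parser only handles the `name => path (address)` and `name => not found` line shapes.

```python
def parse_ldd(output: str) -> dict:
    """Map each library name to its resolved path, or None when not found."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skips linux-vdso and the dynamic loader line
        name, _, rest = line.partition("=>")
        rest = rest.strip()
        if rest.startswith("not found"):
            resolved[name.strip()] = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f5c75412000)".
            resolved[name.strip()] = rest.split(" (")[0]
    return resolved


def missing_libraries(output: str) -> list:
    """Names of libraries that ldd could not resolve, sorted."""
    return sorted(name for name, path in parse_ldd(output).items() if path is None)
```

Running `ldd` on the copied module inside the second-stage image and passing the output to `missing_libraries` would show which dependencies did not come along.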

@simonw
Owner Author

simonw commented Mar 22, 2021

I tried adding apt-get remove -y software-properties-common && to remove software-properties-common but it made no difference to the image size.

@simonw
Owner Author

simonw commented Mar 22, 2021

Considering the image on Docker Hub is 383MB, I'm happy with getting that down to 262MB. I'm going to stop looking for new optimizations here.

@simonw
Owner Author

simonw commented Mar 22, 2021

I think part of the reason it's smaller is that I ran pip install datasette instead of using COPY . /datasette followed by pip install /datasette.

@simonw simonw changed the title Upgrade SpatiaLite to version 5.0 Updated Dockerfile with SpatiaLite version 5.0 Mar 22, 2021
@simonw
Owner Author

simonw commented Mar 22, 2021

Final version of the Dockerfile, which installs the specified version from GitHub. Build it with:

docker build . -t datasette-spatialite --build-arg VERSION=0.55

The Dockerfile:

FROM python:3.9.2-slim-buster as build

# Version of Datasette to install, e.g. 0.55
#   docker build . -t datasette --build-arg VERSION=0.55
ARG VERSION

# software-properties-common provides add-apt-repository
# which we need in order to install a more recent release
# of libsqlite3-mod-spatialite from the sid distribution
RUN apt-get update && \
    apt-get -y --no-install-recommends install software-properties-common && \
    add-apt-repository "deb http://httpredir.debian.org/debian sid main" && \
    apt-get update && \
    apt-get -t sid install -y --no-install-recommends libsqlite3-mod-spatialite && \
    apt-get remove -y software-properties-common && \
    apt clean && \
    rm -rf /var/lib/apt && \
    rm -rf /var/lib/dpkg

RUN pip install https://github.com/simonw/datasette/archive/refs/tags/${VERSION}.zip && \
    find /usr/local/lib -name '__pycache__' | xargs rm -r && \
    rm -rf /root/.cache/pip

EXPOSE 8001
CMD ["datasette"]

Run against 0.55, this produces a 262MB image.

@simonw
Owner Author

simonw commented Mar 22, 2021

(Without the apt-get update ... SpatiaLite line it's 125MB)

@simonw
Owner Author

simonw commented Mar 23, 2021

Don't forget to update this bit of the docs: https://docs.datasette.io/en/0.55/spatialite.html#building-spatialite-from-source

The docs currently say: "The packaged versions of SpatiaLite usually provide SpatiaLite 4.3.0a. For an example of how to build the most recent unstable version, 4.4.0-RC0 (which includes the powerful VirtualKNN module), take a look at the Datasette Dockerfile."

See also #1273

@simonw
Owner Author

simonw commented Mar 27, 2021

One last test of that Dockerfile:

(datasette) datasette % docker build -f Dockerfile -t datasetteproject/datasette:0.55a --build-arg VERSION=0.55 .
(datasette) datasette % docker run datasetteproject/datasette:0.55a datasette --get '/-/versions.json' | jq
{
  "python": {
    "version": "3.9.2",
    "full": "3.9.2 (default, Feb 19 2021, 17:23:45) \n[GCC 8.3.0]"
  },
  "datasette": {
    "version": "0.55"
  },
  "asgi": "3.0",
  "uvicorn": "0.13.4",
  "sqlite": {
    "version": "3.27.2",
    "fts_versions": [
      "FTS5",
      "FTS4",
      "FTS3"
    ],
    "extensions": {
      "json1": null
    },
    "compile_options": [
      "COMPILER=gcc-8.3.0",
      "ENABLE_COLUMN_METADATA",
      "ENABLE_DBSTAT_VTAB",
      "ENABLE_FTS3",
      "ENABLE_FTS3_PARENTHESIS",
      "ENABLE_FTS3_TOKENIZER",
      "ENABLE_FTS4",
      "ENABLE_FTS5",
      "ENABLE_JSON1",
      "ENABLE_LOAD_EXTENSION",
      "ENABLE_PREUPDATE_HOOK",
      "ENABLE_RTREE",
      "ENABLE_SESSION",
      "ENABLE_STMTVTAB",
      "ENABLE_UNLOCK_NOTIFY",
      "ENABLE_UPDATE_DELETE_LIMIT",
      "HAVE_ISNAN",
      "LIKE_DOESNT_MATCH_BLOBS",
      "MAX_SCHEMA_RETRY=25",
      "MAX_VARIABLE_NUMBER=250000",
      "OMIT_LOOKASIDE",
      "SECURE_DELETE",
      "SOUNDEX",
      "TEMP_STORE=1",
      "THREADSAFE=1",
      "USE_URI"
    ]
  }
}
(datasette) datasette % docker run datasetteproject/datasette:0.55a datasette --get '/-/versions.json' --load-extension=spatialite | jq
{
  "python": {
    "version": "3.9.2",
    "full": "3.9.2 (default, Feb 19 2021, 17:23:45) \n[GCC 8.3.0]"
  },
  "datasette": {
    "version": "0.55"
  },
  "asgi": "3.0",
  "uvicorn": "0.13.4",
  "sqlite": {
    "version": "3.27.2",
    "fts_versions": [
      "FTS5",
      "FTS4",
      "FTS3"
    ],
    "extensions": {
      "json1": null,
      "spatialite": "5.0.1"
    },
    "compile_options": [
      "COMPILER=gcc-8.3.0",
      "ENABLE_COLUMN_METADATA",
      "ENABLE_DBSTAT_VTAB",
      "ENABLE_FTS3",
      "ENABLE_FTS3_PARENTHESIS",
      "ENABLE_FTS3_TOKENIZER",
      "ENABLE_FTS4",
      "ENABLE_FTS5",
      "ENABLE_JSON1",
      "ENABLE_LOAD_EXTENSION",
      "ENABLE_PREUPDATE_HOOK",
      "ENABLE_RTREE",
      "ENABLE_SESSION",
      "ENABLE_STMTVTAB",
      "ENABLE_UNLOCK_NOTIFY",
      "ENABLE_UPDATE_DELETE_LIMIT",
      "HAVE_ISNAN",
      "LIKE_DOESNT_MATCH_BLOBS",
      "MAX_SCHEMA_RETRY=25",
      "MAX_VARIABLE_NUMBER=250000",
      "OMIT_LOOKASIDE",
      "SECURE_DELETE",
      "SOUNDEX",
      "TEMP_STORE=1",
      "THREADSAFE=1",
      "USE_URI"
    ]
  }
}
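
The two outputs above differ only in the "extensions" object. A minimal sketch of checking that programmatically, assuming the JSON has been captured as a string (the function name is hypothetical):

```python
import json
from typing import Optional


def spatialite_from_versions(versions_json: str) -> Optional[str]:
    """Pull sqlite.extensions.spatialite out of a /-/versions.json document."""
    data = json.loads(versions_json)
    return data.get("sqlite", {}).get("extensions", {}).get("spatialite")


if __name__ == "__main__":
    # Trimmed-down stand-in for the second output above.
    sample = '{"sqlite": {"extensions": {"json1": null, "spatialite": "5.0.1"}}}'
    print(spatialite_from_versions(sample))
```

This returns None for the first output, where the extension was not loaded, and the version string for the second.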

@simonw
Owner Author

simonw commented Mar 27, 2021

I'll close this issue after I ship Datasette 0.56 and confirm that the image built from this Dockerfile was correctly published to Docker Hub.

@simonw
Owner Author

simonw commented Mar 29, 2021

I just shipped Datasette 0.56 - here's the CI run: https://github.com/simonw/datasette/runs/2214701802?check_suite_focus=true

It pushed a new latest tag to https://hub.docker.com/r/datasetteproject/datasette/tags?page=1&ordering=last_updated

docker pull datasetteproject/datasette:latest

And then:

docker run datasetteproject/datasette:latest datasette \
  --load-extension=spatialite \
  --get /-/versions.json | jq .sqlite

Outputs:

{
  "version": "3.27.2",
  "fts_versions": [
    "FTS5",
    "FTS4",
    "FTS3"
  ],
  "extensions": {
    "json1": null,
    "spatialite": "5.0.1"
  },
  "compile_options": [
    "COMPILER=gcc-8.3.0",
    "ENABLE_COLUMN_METADATA",
    "ENABLE_DBSTAT_VTAB",
    "ENABLE_FTS3",
    "ENABLE_FTS3_PARENTHESIS",
    "ENABLE_FTS3_TOKENIZER",
    "ENABLE_FTS4",
    "ENABLE_FTS5",
    "ENABLE_JSON1",
    "ENABLE_LOAD_EXTENSION",
    "ENABLE_PREUPDATE_HOOK",
    "ENABLE_RTREE",
    "ENABLE_SESSION",
    "ENABLE_STMTVTAB",
    "ENABLE_UNLOCK_NOTIFY",
    "ENABLE_UPDATE_DELETE_LIMIT",
    "HAVE_ISNAN",
    "LIKE_DOESNT_MATCH_BLOBS",
    "MAX_SCHEMA_RETRY=25",
    "MAX_VARIABLE_NUMBER=250000",
    "OMIT_LOOKASIDE",
    "SECURE_DELETE",
    "SOUNDEX",
    "TEMP_STORE=1",
    "THREADSAFE=1",
    "USE_URI"
  ]
}
