docker: upgrade ubuntu version #1093
Upgrade the ubuntu version used in the main Dockerfile to latest stable version `20.04`
20.04 is not the latest stable ubuntu release. Please use jammy. |
amended ubuntu version 20 -> 22 but then the following error:
```
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d... done.
Processing triggers for dbus (1.12.20-2ubuntu4.1) ...
gpg: error running '/usr/bin/gpg-agent': probably not installed
gpg: failed to start agent '/usr/bin/gpg-agent': Configuration error
gpg: can't connect to the agent: Configuration error
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 423, in add_key
PPA publishes dbgsym, you may need to include 'main/debug' component
Repository: 'deb https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu/ jammy main'
Description: Unstable releases of Ubuntu GIS packages. These releases are more bleeding edge and while generally they should work well, they dont receive the same amount of quality assurance as our stable releases do.
More info: https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-unstable
Adding repository.
Adding deb entry to /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list
Adding disabled deb-src entry to /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list
Adding key to /etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg with fingerprint 6B827C12C2D425E227EDCA75089EBE08314DF160
    subprocess.run(cmd.split(), check=True, input=keys)
  File "/usr/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['gpg', '-q', '--no-options', '--no-default-keyring', '--batch', '--keyring', '/etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg', '--homedir', '/tmp/tmpjlr98185', '--import']' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/add-apt-repository", line 364, in <module>
    sys.exit(0 if addaptrepo.main() else 1)
  File "/usr/bin/add-apt-repository", line 357, in main
    shortcut.add()
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 222, in add
    self.add_key()
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 425, in add_key
    raise ShortcutException(e)
softwareproperties.shortcuthandler.ShortcutException: Command '['gpg', '-q', '--no-options', '--no-default-keyring', '--batch', '--keyring', '/etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg', '--homedir', '/tmp/tmpjlr98185', '--import']' returned non-zero exit status 2.
```
Removing the ubuntugis PPA, the build completes without error. Is ubuntugis a requirement? Jammy probably has up-to-date packages for the pygeoapi dependencies.
@kalxas it seems that 22.04 introduces an issue with the ubuntugis PPA - it returns the following error:
Using Jammy without the extra PPA builds without errors.
Which packages are fetched from the PPA? What about removing the ubuntugis PPA and just installing software using pip?
Using a ppa for GDAL and other dependencies is far more stable than pip. Plus the build time is smaller. |
removing ubuntugis PPA
I do not understand what you mean by "more stable". A PPA means low reproducibility, since packages are updated and versions cannot be pinned, whereas pinning is possible with pip. The current Dockerfile is a mix of Python deb packages (some from the official repository, some from a PPA) and pip packages; some are even built from source using GCC. Versions are not pinned. This looks anything but stable.
That can happen when there are no wheels available. Caching can be used, by caching Docker layers (default) and using buildx capability of mounting cache volumes for pip. So you are right, you would lose some time the first time you build the image, but you would gain greater reproducibility and align Docker and non-Docker setups |
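As a rough sketch of the cache-mount idea mentioned above (not the project's actual Dockerfile): the shell function below writes a small demo Dockerfile that uses a BuildKit cache mount, so pip's cache can be reused between builds without ending up in an image layer. The function name `write_cached_pip_dockerfile`, the output file name and the `requirements.txt` path are hypothetical.

```shell
# Write a demo Dockerfile that mounts a persistent pip cache during the build.
# Assumption: BuildKit is enabled (the default builder in recent Docker versions).
write_cached_pip_dockerfile() {
    # $1: path of the Dockerfile to write (hypothetical name, chosen by the caller)
    cat > "$1" <<'EOF'
# syntax=docker/dockerfile:1
FROM ubuntu:jammy
RUN apt-get update \
    && apt-get --no-install-recommends install -y python3-pip \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt /tmp/requirements.txt
# The cache mount keeps pip's download/wheel cache between builds;
# it is not stored in any image layer.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip3 install -r /tmp/requirements.txt
EOF
}
```

A possible use would be `write_cached_pip_dockerfile Dockerfile.cache-demo` followed by `docker buildx build -f Dockerfile.cache-demo -t cache-demo .`; the second build should then reuse previously downloaded or built wheels where pip's cache allows it.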
Versions in distributions like Debian/Ubuntu/RedHat/Fedora/SUSE etc. are pinned by default. Reproducibility is 100% in this case (see https://wiki.debian.org/ReproducibleBuilds or https://en.opensuse.org/openSUSE:Reproducible_Builds). Core package versions never change. This is why pip (or similar packaging solutions) is never used by GNU/Linux distributions. In PPAs things are a bit more relaxed: packages can be updated, but those changes are slow and depend on the PPA policy. In UbuntuGIS we update packages from Debian once or twice a year based on stability. Furthermore, the packaging quality of Debian or RPM packages is higher than that of any wheel out there... this is the reason many organizations require installations only from Debian/RPM packages in production environments.
Yes, that is technically possible. |
Dockerfile
Outdated
&& apt-get --no-install-recommends install -y ${DEB_PACKAGES} \
&& update-locale LANG=${LANG} \
&& echo "For ${TZ} date=$(date)" && echo "Locale=$(locale)"

# temporary remode
Just remove, no need to keep in comment
Dockerfile
Outdated
@@ -33,7 +33,7 @@
#
# =================================================================

-FROM ubuntu:focal
+FROM ubuntu:22.04
I would prefer "jammy"
The upstream version is pinned in stable distributions (not in Fedora, actually), but the package itself is not pinned, as security patches and critical fixes are addressed. That means that by installing
The reproducibility of a Docker image is not related to the reproducibility of packages built from source. For example, a Docker image could be rebuilt with 100% reproducibility and just include closed source software, or a Docker image rebuilt on the same distribution, just with some updates, could be quite different from the very same image built a month ago, even if the packages are fully reproducible. Distributions offer Python packages inside their repository because they have their own policy, requirements, and such, not because of reproducibility, which is a discussion that became interesting to a wider audience way after the creation of Python packages in distributions :)
Repackaging a Python library into a deb package does not automatically make it higher quality. What is happening in this pull request is that many Python dependencies are being updated because the base system is changing, thus the final result would be quite different. Using pinned Python dependencies is a good solution to avoid that.
I guess we could have proposals for the improvement of the Dockerfile in different discussions :) |
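The point about pinned Python dependencies can be illustrated with a small check. This is only a sketch under a simple assumption: a requirement counts as "pinned" when its line contains `==`. The helper name `unpinned_requirements` is hypothetical, and lines such as `-r other.txt` would also be reported.

```shell
# List requirement lines that are not pinned to an exact version.
# Assumption: "pinned" means the line contains '=='.
unpinned_requirements() {
    # $1: path to a pip requirements file
    # First drop comments and blank lines, then drop lines that carry '=='.
    # '|| true' keeps the exit status at 0 when nothing is left to print.
    grep -vE '^[[:space:]]*(#|$)' "$1" | grep -v '==' || true
}
```

Running it as `unpinned_requirements requirements-provider.txt` would print the entries whose versions can drift between two builds of the same Dockerfile.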
If we can live without UbuntuGIS-PPA, fine with me.
It remains a huge build; it is still building here locally, and it seems that even Python 3.10 is installed.
My main concern now is line 143: `pip3 install -r requirements-provider.txt`. This installs similar packages (the build log shows e.g. `Building wheel for pandas`) that were already installed as `python3-*` .deb packages above (or am I missing something?). I don't think we can leave this requirements spec out, but the current Dockerfile here may lead to (versioning) issues.
Plus the Image is now almost 2GB uncompressed.
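One way to spot the apt/pip overlap described above is to compare the `python3-*` deb names against the requirements file. The sketch below is a heuristic only: it assumes the deb package `python3-NAME` corresponds to the PyPI requirement `NAME`, which does not hold for every package. The function name `overlapping_packages` is hypothetical.

```shell
# Print requirement names that also appear as python3-* deb packages.
# Heuristic assumption: deb "python3-NAME" maps to PyPI requirement "NAME".
overlapping_packages() {
    # $1: requirements file; remaining arguments: deb package names
    req_file=$1
    shift
    for deb in "$@"; do
        case $deb in
            python3-*) name=${deb#python3-} ;;
            *) continue ;;   # not a Python deb package, skip it
        esac
        # Match NAME at the start of a line, followed by the end of the line
        # or a character that cannot be part of a package name.
        if grep -qiE "^${name}([^A-Za-z0-9_.-]|\$)" "$req_file"; then
            echo "$name"
        fi
    done
}
```

For example, `overlapping_packages requirements-provider.txt $DEB_PACKAGES` (left unquoted on purpose so the package list splits into words) would list candidates that pip may install a second time.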
Yes, we can live without the UbuntuGIS ppa in Jammy, the packages are fairly new. |
So I am trying to reduce the image size and sort out the apt vs pip Python packages. There are many places in the Dockerfile which lead to an almost 2GB Image:
I now got the Image down to just over 1GB, but there is still quite some "cruft" within the Image. You can run and snoop inside the running Container:
etc. Image runs Python 3.10. Also built and ran the test-enabled Image: and ran tests: Will be quite an undertaking to get it all in shape. I am attaching my WIP here (3 files .txt extensions to allow upload in GH). The dockerignore.txt I leave this for now, maybe you folks can build further on this. |
Also for the tests we don't need a separate Docker Image. I see running tests is commented out in
Or just also install Only |
I can try to add pytest and then see how to decrease the size of the whole image |
@epifanio good to hear. Let me/us know here where we can help. This is often a time-consuming task, waiting for builds, streams of output. For one thing some unit tests will fail anyway, from what I've seen:
Updates (jan 12, 2023, 19:18 CEST):
adding `python3-pytest` and `python3-pyld` dependencies
I added |
Most, if not all, of the packages in the several requirements.txt files can be installed via apt-get.
There will be some version issues I guess, as I see some of the packages have pinned versions ... the only missing ones are:
Should I try replacing the |
@epifanio IMO you don't need all these |
@epifanio pls see my latest Dockerfile, attached here (and supporting files like Just a single |
@justb4 thanks! now building a test image with your suggested changes - I got 1.23 GB vs 1.91 |
.dockerignore
Outdated
.pytest_cache
build
dist
**/*.zip
Sorry, this line should be taken out: some (Shapefile) tests need .zip files...
Dockerfile
Outdated
if [ "$BUILD_DEV_IMAGE" = "true" ] ; then pip3 install -r requirements-dev.txt; fi \
# Install pygeoapi providers
&& pip3 install -r requirements-provider.txt \
&& if [ "$BUILD_DEV_IMAGE" = "true" ] ; then pip3 install -r requirements-dev.txt; fi \
This can be taken out: we don't need the DEV Image anymore (as we have pytest+pyld now).
Dockerfile
Outdated
@@ -63,6 +63,7 @@ ARG TZ="Etc/UTC"
ARG LANG="en_US.UTF-8"
ARG BUILD_DEV_IMAGE="false"
Can be removed: no DEV Image anymore.
@epifanio This looks much cleaner now! I've added some single-line comments above.
I removed two comment-lines for the DEV build ARG. It may be good to know that folks can build a smaller Image by populating ADD_DEB_PACKAGES with their own required Providers. Only flaw, was my additions, in the Dockerfile is that |
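A minimal sketch of building such a smaller image, assuming `ADD_DEB_PACKAGES` is exposed as a build `ARG` in the Dockerfile. The function name `build_slim_image`, the `DOCKER` override variable, the package names and the image tag are illustrative assumptions, not taken from the repository.

```shell
# Build an image with a caller-chosen set of extra deb packages.
# Assumption: the Dockerfile declares "ARG ADD_DEB_PACKAGES".
build_slim_image() {
    # $1: space-separated deb packages to add; $2: image tag
    # DOCKER can be overridden (e.g. DOCKER=echo) to preview the command.
    ${DOCKER:-docker} build \
        --build-arg ADD_DEB_PACKAGES="$1" \
        -t "$2" .
}
```

For example, `build_slim_image "python3-gdal python3-psycopg2" my-pygeoapi:slim` would pass only those two packages, and `DOCKER=echo build_slim_image ...` prints the arguments without building anything.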
Sorry I am on a slow internet (4G in Spain), building the current Dockerfile takes ages...Yesterday was better, couple of minutes. |
skip the SensorThings and other tests that require a local server
I edited the entrypoint to skip the sensor test |
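The skipping logic could look roughly like the sketch below, which is not the actual `entrypoint.sh` change. It relies on pytest's `--ignore` option to leave out test files that need a local server; the function name `run_offline_tests`, the `PYTEST` override variable and the `tests` directory name are assumptions.

```shell
# Run the test suite while ignoring test files that need a local server.
# Assumption: tests live in ./tests and ignored paths contain no spaces.
run_offline_tests() {
    # Each argument is a test file to skip; build one --ignore flag per file.
    ignore_flags=""
    for f in "$@"; do
        ignore_flags="$ignore_flags --ignore=$f"
    done
    # Left unquoted on purpose so the flags split into separate words.
    # PYTEST can be overridden (e.g. PYTEST=echo) to preview the arguments.
    ${PYTEST:-python3 -m pytest} tests $ignore_flags
}
```

Calling `run_offline_tests` with the path of the SensorThings test module (file name to be filled in from the repository) would run everything else under `tests`.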
@epifanio Think we're in good shape now. When I run the Container I see some errors. Not fatal, WFS-related. See below.
I think main point is that so many versions have changed as we moved to a new base Image, skip Ubuntu-GIS and have most packages as Edit: I remember the error cause above now: it is usually from a stale WFS endpoint in the OGR Provider, but the three WFS-es in But I see it comes from the
Maybe this is a subtle error as we may not have the latest/greatest GDAL version in the Image? We have This may be an example of subtle errors because of going back in versions (pip/PyPi and Ubuntu-GIS usually have higher versions....)Maybe @kalxas @frafra @tomkralidis can comment? But I think we should go ahead and fix stuff in upcoming issues/PRs.. |
@justb4 I built an image without using UbuntuPPA dependencies, based on Ubuntu 22.04, and still get that error. I guess it could be a server issue. I wonder why gdal is one of the few pinned dependencies, set to |
@frafra like I stated above the error is not fatal. It is almost an edge use case: a particular driver (WFS) in the pygeoapi OGR backend Provider for a particular (version) of the Ideally, and I did several tries, but failed on key errors (@kalxas does UbuntuGIS have a key?) would like to give Ubuntu-GIS again a try. But without
Then adding the PPA key (how?). Think older images forgot to run Yes, the PyPi (pip, conda, poetry) we may have more control over versioning and having latest versions. But at the price of a more complex Dockerfile (will need binaries for GDAL and the like anyway), very long (wheel) builds, maybe even larger image size. Reproducible builds are nice, but we don't bring out that new versions of pygeoapi, these are frozen on DockerHub, that frequently and the |
Steps to add the ppa manually:
@kalxas still get errors, will not accept PPA:
This is also a bit beyond my grasp: installed gnupg, gnupg2. Was confused in GDAL versions, but still get Jammy's version: |
So maybe forget about the PPA and approve the current version. Then fix all unit tests, maybe by skipping some. Fix the Docker build push in
For the sake of progress: let's move ahead, i.e. merge. OK, GDAL will be a downgrade from 3.4.3 to 3.4.1. Can't tell for all the other N package versions. But in the end: this is a "Latest" Docker Image to be published, not a pygeoapi version. As a US company I once worked for phrased it: "We'll deal with the problem when it comes along" :-). Also to provide a (re)base for @frafra in #1105 on which we can then all focus.
Pff, the Docker build on the GH Workflow has been underway for 1.5 hours and is still building...
@@ -75,8 +75,10 @@ case ${entry_cmd} in
case ${test_py} in
Aside: is there any reason why we run tests in our Docker entrypoint? Is this needed (given we have CI)? Perhaps we should consider removing @francbartoli @justb4
I am already working on it in the other PR, as the entrypoint should not have commands, but just prepare the environment.
I think we should be able to run (most of) the Unit-tests within a Docker Container. Especially currently, the entire installation with OS, Python and package-versions will surely differ from a custom install. Hopefully this will largely go away with PDM and #1105, but still, just relying on `main.yml` is fragile or we should somehow invoke there. Ok, maybe via `entrypoint.sh` is not the cleanest way but we will also need some logic to skip tests.
And btw the above Docker build does not even invoke Unit tests. There is some larger problem there. After 6 hours the above build just was canceled, think by GH, no logs.
Also containers.yml is now running for 5 hours after PR #1109 (unrelated to Docker build). At least the image is still built twice so that should be removed ASAP.
Takes around 8 minutes on my system.
Sure, I'll do it
And then ran the tests: `docker run -p 5000:80 --rm --name pygeoapi geopython/pygeoapi:latest test > test.txt 2>&1`. Output attached: test.txt
Actually only 3 main failure-causes, most can be explained:
- several tests in `test_api.py`: Many tests IMO should not be here! Like tests for Postgresql CQL, Coverage, EDR. They should be in their Provider (PG) or Service-specific testfile. `test_api.py` should perform generic API tests, not specific, and relies on unavailable local servers.
- WFS tests: still the wrong GeoSolutions URL is in, thought we fixed that ages ago...
- `test_async_hello_world_process_parallel` fails, cannot explain

All other tests succeed, that is the good news, but still we should open an issue to cleanup tests.
Takes less than 3 minutes on mine
Opened issue: #1111.
Upgrade the ubuntu version used in the main Dockerfile to latest stable version
20.04
Overview
Upgrade the Ubuntu version used to build the pygeoapi Docker image to the latest stable version (Ubuntu 22.04 Jammy)
Related Issue / Discussion
Additional Information
Contributions and Licensing
(as per https://github.com/geopython/pygeoapi/blob/master/CONTRIBUTING.md#contributions-and-licensing)