Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker: upgrade ubuntu version #1093

Merged
merged 13 commits into from
Jan 18, 2023
Merged

Conversation

epifanio
Copy link
Contributor

Upgrade the ubuntu version used in the main Dockerfile to latest stable version 20.04

Overview

Upgrade ubuntu version used to build pygeoapi docker image to use latest stable version (ubuntu 20.04 Jammy)

Related Issue / Discussion

Additional Information

Contributions and Licensing

(as per https://github.com/geopython/pygeoapi/blob/master/CONTRIBUTING.md#contributions-and-licensing)

  • I'd like to contribute [feature X|bugfix Y|docs|something else] to pygeoapi. I confirm that my contributions to pygeoapi will be compatible with the pygeoapi license guidelines at the time of contribution.
  • I have already previously agreed to the pygeoapi Contributions and Licensing Guidelines

Upgrade rhe ubuntu version used in the main Dockerfile to latest stable version `20.04`
@kalxas
Copy link
Member

kalxas commented Jan 10, 2023

20.04 is not the latest stable ubuntu release. Please use jammy.

amended ubuntu version 20 -> 22

but then the following error:

```
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Processing triggers for dbus (1.12.20-2ubuntu4.1) ...
gpg: error running '/usr/bin/gpg-agent': probably not installed
gpg: failed to start agent '/usr/bin/gpg-agent': Configuration error
gpg: can't connect to the agent: Configuration error
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 423, in add_key
PPA publishes dbgsym, you may need to include 'main/debug' component
Repository: 'deb https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu/ jammy main'
Description:
Unstable releases of Ubuntu GIS packages. These releases are more bleeding edge and while generally they should work well, they dont receive the same amount of quality assurance as our stable releases do.
More info: https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-unstable
Adding repository.
Adding deb entry to /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list
Adding disabled deb-src entry to /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list
Adding key to /etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg with fingerprint 6B827C12C2D425E227EDCA75089EBE08314DF160
    subprocess.run(cmd.split(), check=True, input=keys)
  File "/usr/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['gpg', '-q', '--no-options', '--no-default-keyring', '--batch', '--keyring', '/etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg', '--homedir', '/tmp/tmpjlr98185', '--import']' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/add-apt-repository", line 364, in <module>
    sys.exit(0 if addaptrepo.main() else 1)
  File "/usr/bin/add-apt-repository", line 357, in main
    shortcut.add()
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 222, in add
    self.add_key()
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 425, in add_key
    raise ShortcutException(e)
softwareproperties.shortcuthandler.ShortcutException: Command '['gpg', '-q', '--no-options', '--no-default-keyring', '--batch', '--keyring', '/etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg', '--homedir', '/tmp/tmpjlr98185', '--import']' returned non-zero exit status 2.
````

Removing the ubuntugis PPA  build without error - is ubuntugis a requirements? probably Jammy has up-to-date packages for pygeoapi dependencies
@epifanio
Copy link
Contributor Author

@kalxas it seems that 22.04 introduce an issue with the ubutnugis PPA - it returns the following error:

Processing triggers for ca-certificates (20211016ubuntu0.22.04.1) ...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
Processing triggers for dbus (1.12.20-2ubuntu4.1) ...
gpg: error running '/usr/bin/gpg-agent': probably not installed
gpg: failed to start agent '/usr/bin/gpg-agent': Configuration error
gpg: can't connect to the agent: Configuration error
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 423, in add_key
PPA publishes dbgsym, you may need to include 'main/debug' component
Repository: 'deb https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu/ jammy main'
Description:
Unstable releases of Ubuntu GIS packages. These releases are more bleeding edge and while generally they should work well, they dont receive the same amount of quality assurance as our stable releases do.
More info: https://launchpad.net/~ubuntugis/+archive/ubuntu/ubuntugis-unstable
Adding repository.
Adding deb entry to /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list
Adding disabled deb-src entry to /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list
Adding key to /etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg with fingerprint 6B827C12C2D425E227EDCA75089EBE08314DF160
    subprocess.run(cmd.split(), check=True, input=keys)
  File "/usr/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['gpg', '-q', '--no-options', '--no-default-keyring', '--batch', '--keyring', '/etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg', '--homedir', '/tmp/tmpg_0hlu07', '--import']' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/add-apt-repository", line 364, in <module>
    sys.exit(0 if addaptrepo.main() else 1)
  File "/usr/bin/add-apt-repository", line 357, in main
    shortcut.add()
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 222, in add
    self.add_key()
  File "/usr/lib/python3/dist-packages/softwareproperties/shortcuthandler.py", line 425, in add_key
    raise ShortcutException(e)
softwareproperties.shortcuthandler.ShortcutException: Command '['gpg', '-q', '--no-options', '--no-default-keyring', '--batch', '--keyring', '/etc/apt/trusted.gpg.d/ubuntugis-ubuntu-ubuntugis-unstable.gpg', '--homedir', '/tmp/tmpg_0hlu07', '--import']' returned non-zero exit status 2.
The command '/bin/sh -c apt-get update -y     && apt-get upgrade -y     && apt-get install -y --fix-missing --no-install-recommends ${DEB_BUILD_DEPS}      && add-apt-repository ppa:ubuntugis/ubuntugis-unstable     && apt-get --no-install-recommends install -y ${DEB_PACKAGES}     && update-locale LANG=${LANG}     && echo "For ${TZ} date=$(date)" && echo "Locale=$(locale)"' returned a non-zero code: 1

using Jammy without extra PPA builds without errors

@frafra
Copy link
Contributor

frafra commented Jan 10, 2023

Which packages are fetched from the PPA?

What about removing ubuntugis PPA and just install software using pip?
A multistage Dockerfile can be used, so that build dependencies are installed in the first stage, but they would not end up in the final image. Something like python3 -m pip wheel -r requirements.txt -w wheels in the first stage and python3 -m pip install wheels/*.whl in the second stage.

@kalxas
Copy link
Member

kalxas commented Jan 10, 2023

Using a ppa for GDAL and other dependencies is far more stable than pip. Plus the build time is smaller.

removing ubuntugis PPA
@frafra
Copy link
Contributor

frafra commented Jan 11, 2023

Using a ppa for GDAL and other dependencies is far more stable than pip.

I do not understand what you mean with "more stable". PPA means low reproducibility, since packages are updated and versions cannot be pinned, which instead is possible with pip. The current Dockerfile is a mix of Python deb packages, some from the official repository some from a PPA, and pip packages; some are even built from source using GCC. Version are not pinned. This looks everything but stable.

Plus the build time is smaller.

That can happen when there are no wheels available. Caching can be used, by caching Docker layers (default) and using buildx capability of mounting cache volumes for pip. So you are right, you would lose some time the first time you build the image, but you would gain greater reproducibility and align Docker and non-Docker setups

@kalxas
Copy link
Member

kalxas commented Jan 11, 2023

Using a ppa for GDAL and other dependencies is far more stable than pip.

I do not understand what you mean with "more stable". PPA means low reproducibility, since packages are updated and versions cannot be pinned, which instead is possible with pip. The current Dockerfile is a mix of Python deb packages, some from the official repository some from a PPA, and pip packages; some are even built from source using GCC. Version are not pinned. This looks everything but stable.

Versions in distributions like Debian/Ubuntu/RedHat/Fedora/SUSE etc are pinned by default. Reproducibility is 100% in this case (see https://wiki.debian.org/ReproducibleBuilds or https://en.opensuse.org/openSUSE:Reproducible_Builds). Core package versions never change. This is why pip (or similar packaging solutions) are never used by GNU/Linux distributions. In ppas things are a bit more relaxed, packages can be updated, but those changes are slow and depend on the ppa policy. In UbuntuGIS we update packages from Debian once or twice a year based on stability.

Furthermore the packaging quality of debian or rpm packages are of higher quality than any wheel out there... this is the reason many organizations require installations only from debian/rpm packages in production environments.

Plus the build time is smaller.

That can happen when there are no wheels available. Caching can be used, by caching Docker layers (default) and using buildx capability of mounting cache volumes for pip. So you are right, you would lose some time the first time you build the image, but you would gain greater reproducibility and align Docker and non-Docker setups

Yes, that is technically possible.

@kalxas kalxas requested a review from justb4 January 11, 2023 10:36
Dockerfile Outdated
&& apt-get --no-install-recommends install -y ${DEB_PACKAGES} \
&& update-locale LANG=${LANG} \
&& echo "For ${TZ} date=$(date)" && echo "Locale=$(locale)"

# temporary remode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remove, no need to keep in comment

Dockerfile Outdated
@@ -33,7 +33,7 @@
#
# =================================================================

FROM ubuntu:focal
FROM ubuntu:22.04
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer "jammy"

@frafra
Copy link
Contributor

frafra commented Jan 11, 2023

Versions in distributions like Debian/Ubuntu/RedHat/Fedora/SUSE etc are pinned by default.

The upstream version is pinned in stable distributions (not in Fedora, actually), but the package itself is not pinned, as security patches and critical fixes are addressed. That means that by installing python3-jinja on the latest stable Ubuntu could give you a different result. There is the hope that this does not affect other software, but this is not always the case.

Reproducibility is 100% in this case (see https://wiki.debian.org/ReproducibleBuilds or https://en.opensuse.org/openSUSE:Reproducible_Builds). Core package versions never change. This is why pip (or similar packaging solutions) are never used by GNU/Linux distributions. In ppas things are a bit more relaxed, packages can be updated, but those changes are slow and depend on the ppa policy. In UbuntuGIS we update packages from Debian once or twice a year based on stability.

The reproducibility of a Docker image is not related to the reproducibility of packages built from source. For example, a Docker image could be rebuilt with an 100 % reproducibility and just include closed source software, or a Docker image rebuilt on the same distribution, just with some updates, could be quite different from the very same image built a month ago, even if the packages are fully reproducible.

Distributions offer Python packages inside their repository because they have their own policy, requirements, and such, not because of reproducibility, which is a discussion that became interesting to a wider audience way after the creation of Python packages in distributions :)

Furthermore the packaging quality of debian or rpm packages are of higher quality than any wheel out there... this is the reason many organizations require installations only from debian/rpm packages in production environments.

Repackaging a Python library into a deb package does not make it higher quality automatically.
"Many organizations" seems a bit vague. Container works a bit differently, and the underlying distribution is usually not so relevant. Binary dependencies are added, additional dependencies are built, the software is shipped. Versions are pinned, which gives a pretty similar resulting image and consistent results.

What is happening in this pull requests is that many Python dependencies are being updated because the base system is changing, thus the final result would be quite different. Using pinned Python dependencies is a good solution to avoid that.

Caching can be used, by caching Docker layers (default) and using buildx capability of mounting cache volumes for pip. So you are right, you would lose some time the first time you build the image, but you would gain greater reproducibility and align Docker and non-Docker setups

Yes, that is technically possible.

I guess we could have proposals for the improvement of the Dockerfile in different discussions :)

Copy link
Member

@justb4 justb4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can live without UbuntuGIS-PPA, fine with me.
It remains a huge build, it is still building here locally, it seems that Python 3.10 is installed even.

My main concern now is line 143:
pip3 install -r requirements-provider.txt. This installs similar packages like Building wheel for pandas, that were installed as python3-* .deb packages above (or am I missing something?). I don't think we can leave this requirements spec out, but the current Dockerfile here may lead to (versioning) issues.

Plus the Image is now almost 2GB uncompressed.

@kalxas
Copy link
Member

kalxas commented Jan 11, 2023

Yes, we can live without the UbuntuGIS ppa in Jammy, the packages are fairly new.

@justb4
Copy link
Member

justb4 commented Jan 11, 2023

So I am trying to reduce the image size and apt vs pip Python package. There are many places in the Dockerfile which leads to an almost 2GB Image:

  • double installed deps: created a requirements-docker.txt from requirements-provider.txt with only required pure Python packages. Now even Wheel is not required anymore.
  • too many Docker Image Layers (RUNs)
  • no .dockerignore (e.g. ADD . /pygeoapi will adds entire .git dir)
  • Image includes packages not required to run pygeopapi like gcc

I now got the Image down to just over 1GB, but there is still quite some "cruft" within the Image. /usr/ is like 1GB. In other projects I use Staged builds but is somewhat tricky here.

You can run and snoop inside the running Container:

docker run -p 5000:80 geopython/pygeoapi:latest
docker exec -it <container name> /bin/bash

$ du -sh /*
.
43M	/pygeoapi
8.9M	/root
4.7M	/schemas.opengis.net
1.2G	/usr
19M	/var

etc. Image runs Python 3.10.

Also built and ran the test-enabled Image:
docker build --build-arg BUILD_DEV_IMAGE=true -t geopython/pygeoapi:latest .

and ran tests:
docker run -p 5000:80 geopython/pygeoapi:latest test. Many tests succeed but also failures. Not sure if before we had 100%.

Will be quite an undertaking to get it all in shape. I am attaching my WIP here (3 files .txt extensions to allow upload in GH). The Dockerfile is IMO more compact to work from there. e.g. All RUN in single Layer.

dockerignore.txt
Dockerfile.txt
requirements-docker.txt

I leave this for now, maybe you folks can build further on this.

@justb4
Copy link
Member

justb4 commented Jan 12, 2023

Also for the tests we don't need a separate Docker Image. I see running tests is commented out in .github/workflows/containers.yml. We can simply use the pygeoapi Docker Image and in the docker/entrypoint.sh do:

.
case ${entry_cmd} in
	# Run Unit tests
	test)
	  pip3 install pytest
	  for test_py in $(ls tests/test_*.py)
.
.

Or just also install pytest as this would hardly contribute to the Image size.

Only pytest is needed, not all the packages as in requirements-dev.txt. Saves an enormous amount of time (half) for the build. Possibly best to do this in separate issue/PR "Cleanup Docker build and workflows". Also, any failing tests would not be signalled to the Workflow.

@epifanio
Copy link
Contributor Author

I can try to add pytest and then see how to decrease the size of the whole image

@justb4
Copy link
Member

justb4 commented Jan 12, 2023

@epifanio good to hear. Let me/us know here where we can help. This is often a time-consuming task, waiting for builds, streams of output.

For one thing some unit tests will fail anyway, from what I've seen:

  • newer tests like SensorThings API expect a local server to be running, so should be commented out like ES-tests, (see entrypoint.sh)
  • some OGR WFS tests fail as a (GeoSolutions) WFS is not available (or URL changed)

Updates (jan 12, 2023, 19:18 CEST):

  • in addition to pytest package pyld is required for running tests possibly as python3-pyld
  • indeed the GeoSolutions WFS URL was changed to https://gs-stable.geosolutionsgroup.com/geoserver/wfs but there are also some Proj problems so best to exclude tests/test_ogr_wfs_provider.py until we have fixed

adding  `python3-pytest` and `python3-pyld`  dependencies
@epifanio
Copy link
Contributor Author

I added pytest and pyld as debian packages - I will test the build locally and see if/how decrease the image size - I guess some cleanup of building dependencies could be done after the installation is completed.

@epifanio
Copy link
Contributor Author

epifanio commented Jan 12, 2023

Most, if not all, the packages in the several requirments.txt files can be installed via apt-get.
The following are available:

    python3-flake8 \
    twine \
    python3-wheel \
    python3-openssl \
    python3-ndg-httpsclient \
    python3-pyasn1 \
    python3-sphinx \
    python3-sphinx-rtd-theme \
    python3-flask-cors \
    python3-pytest \
    python3-pytest-cov \
    python3-coverage \
    python3-pyld \
    python3-django \
    python3-pydantic \
    python3-elasticsearch \
    fiona \
    python3-fiona \
    python3-geoalchemy2 \
    python3-netcdf4 \
    python3-pandas \
    python3-psycopg2 \
    pygeoif \
    python3-pymongo \
    python3-scipy \
    python3-xarray \
    python3-zarr \
    python3-aiofiles \
    python3-starlette \
    python3-uvicorn \
    python3-nest-asyncio

There will be some version issue I guess, I see some of the packages have pinned version ... the only missing are:

    sodapy \
   pygeofilter \
   pygeometa 

Should I try replacing the pip install instructions with the apt-get alternative ?

@justb4
Copy link
Member

justb4 commented Jan 12, 2023

@epifanio IMO you don't need all these -dev packages within the Docker Image for testing. AFAICS just pytest and pyld. e.g. Sphinx is not used, docs are published elsewhere. Flake8 is already in de main.yml CI/CD.

@justb4
Copy link
Member

justb4 commented Jan 12, 2023

@epifanio pls see my latest Dockerfile, attached here (and supporting files like .dockerignore and requirements-docker.txt above.)
Dockerfile.txt
above).

Just a single RUN and more. That reduced all with about 1GB.

@epifanio
Copy link
Contributor Author

epifanio commented Jan 12, 2023

@justb4 thanks! now building a test image with your suggested changes - I got 1.23 GB vs 1.91

@epifanio
Copy link
Contributor Author

Running test - @justb4 I got the following log I left the test unchanged

.dockerignore Outdated
.pytest_cache
build
dist
**/*.zip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this line should be taken out: some (Shapefile) tests need .zip files...

Dockerfile Outdated
if [ "$BUILD_DEV_IMAGE" = "true" ] ; then pip3 install -r requirements-dev.txt; fi \
# Install pygeoapi providers
&& pip3 install -r requirements-provider.txt \
&& if [ "$BUILD_DEV_IMAGE" = "true" ] ; then pip3 install -r requirements-dev.txt; fi \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be taken out: we don't need the DEV Image anymore (as we have pytest+pyld now).

Dockerfile Outdated
@@ -63,6 +63,7 @@ ARG TZ="Etc/UTC"
ARG LANG="en_US.UTF-8"
ARG BUILD_DEV_IMAGE="false"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed: no DEV Image anymore.

@justb4
Copy link
Member

justb4 commented Jan 13, 2023

@epifanio This looks much cleaner now! I've added some single-line comments above.
Can imagine why many tests fail. But that is not due to the Dockerfile. We need to fix these anyway. What helps is this snippet in docker/entrypoint.sh to skip the SensorThings and other tests that require a local server:

	  for test_py in $(ls tests/test_*.py)
	  do
	    # Skip tests requireing backend server or libs installed
	    case ${test_py} in
	        tests/test_elasticsearch__provider.py)
	        ;&
	        tests/test_sensorthings_provider.py)
	        ;&
	        tests/test_postgresql_provider.py)
			    ;&
	        tests/test_mongo_provider.py)
	        	echo "Skipping: ${test_py}"
	        ;;
	        *)
	        	python3 -m pytest ${test_py}
	         ;;
	    esac
	  done
	  ;;

@justb4
Copy link
Member

justb4 commented Jan 13, 2023

I removed two comment-lines for the DEV build ARG. It may be good to know that folks can build a smaller Image by populating ADD_DEB_PACKAGES with their own required Providers.

Only flaw, was my additions, in the Dockerfile is that ADD . /pygeoapi comes before all the package installs. This results in that the Image is always completely built when something is changed in the pygeoapi tree. May leave this for now.

@justb4
Copy link
Member

justb4 commented Jan 13, 2023

Sorry I am on a slow internet (4G in Spain), building the current Dockerfile takes ages...Yesterday was better, couple of minutes.
Tests are skipped now anyway in containers.yml.
Possibly we should comment out building the DOCKER_TEST_IMAGE (line 49) in this PR and create a new issue/PR to clean up the Unit tests and Docker CI/CD. Or in this PR if there is still energy...

skip the SensorThings and other tests that require a local server
@epifanio
Copy link
Contributor Author

I edited the entrypoint to skip the sensor test

@epifanio epifanio requested a review from justb4 January 15, 2023 12:06
@justb4
Copy link
Member

justb4 commented Jan 15, 2023

@epifanio Think we're in good shape now. When I run the Container I see some errors. Not fatal. WFS-related See below.

$ docker run -p 5000:80 --rm geopython/pygeoapi:latest

START /entrypoint.sh
Trying to generate openapi.yml
Error Number: 1, Type: Failure, Msg: HTTP error code : 400
Error Number: 1, Type: Failure, Msg: Error returned by server : HTTP error code : 400 (0)
Error Number: 1, Type: Failure, Msg: HTTP error code : 400
Generating /pygeoapi/local.openapi.yml
Done
openapi.yml generated continue to pygeoapi

I think main point is that so many versions have changed as we moved to a new base Image, skip Ubuntu-GIS and have most packages as python3- Debs i.s.o. pip that we can't foresee all use cases. But we need to move ahead. This will be a latest Image anyway. We can verify on the pygeoapi demo server where it is automatically deployed as
https://demo.pygeoapi.io/master on commit/push.

Edit: I remember the error cause above now: it is usually from a stale WFS endpoint in the OGR Provider, but the three WFS-es in docker/default.config.yml should be ok (just checked).

But I see it comes from the deegree 'Utah' WFS. Also when browsing Collection:

[2023-01-15T12:56:43Z] {/pygeoapi/pygeoapi/provider/ogr.py:811} ERROR - Error Number: 1, Type: Failure, Msg: HTTP error code : 400
[2023-01-15T12:56:43Z] {/pygeoapi/pygeoapi/provider/ogr.py:811} ERROR - Error Number: 1, Type: Failure, Msg: Error returned by server : HTTP error code : 400 (0)
[2023-01-15T12:56:44Z] {/pygeoapi/pygeoapi/provider/ogr.py:811} ERROR - Error Number: 1, Type: Failure, Msg: HTTP error code : 400

Maybe this is a subtle error as we may not have the latest/greatest GDAL version in the Image? We have libgdal.so.30.0.1. The previous latest, running on demo-server has libgdal.so.30.0.3.
Maybe the WFS uses app-schema? I've seen some special WFS-handling code for deegree once in GDAL-OGR source code...

This may be an example of subtle errors because of going back in versions (pip/PyPi and Ubuntu-GIS usually have higher versions....)Maybe @kalxas @frafra @tomkralidis can comment? But I think we should go ahead and fix stuff in upcoming issues/PRs..

@frafra frafra mentioned this pull request Jan 16, 2023
2 tasks
@frafra
Copy link
Contributor

frafra commented Jan 16, 2023

@justb4 I build an image without using UbuntuPPA dependencies, based on Ubuntu 22.04, and still get that error. I guess it could be a server issue.

I wonder why gdal is one of the few pinned dependencies, set to ==3.4.1, as there is no comment close to that requirement line.

@justb4
Copy link
Member

justb4 commented Jan 16, 2023

@frafra like I stated above the error is not fatal. It is almost an edge use case: a particular driver (WFS) in the pygeoapi OGR backend Provider for a particular (version) of the deegree WFS. I looked into the GDAL GitHub Code where changes were made after 3.0.1 (summer 2022). This is the consequence of not having the same (higher) GDAL version as before in the Dockerfile. Maybe that is why GDAL is pinned.

Ideally, and I did several tries, but failed on key errors (@kalxas does UbuntuGIS have a key?) would like to give Ubuntu-GIS again a try. But without software-properties-common .deb package. This installs an enormous amount of packages, adding to the Image size, for just using apt-add-repository. Maybe not the cleanest way but, old-school:

echo "deb https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu/ devel main\n" > /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list 

Then adding the PPA key (how?). Think older images forgot to run apt-get -y update after adding the PPA (?).

Yes, the PyPi (pip, conda, poetry) we may have more control over versioning and having latest versions. But at the price of a more complex Dockerfile (will need binaries for GDAL and the like anyway), very long (wheel) builds, maybe even larger image size. Reproducible builds are nice, but we don't bring out that new versions of pygeoapi, these are frozen on DockerHub, that frequently and the latest is in principle not for production purposes.

@kalxas
Copy link
Member

kalxas commented Jan 16, 2023

Steps to add the ppa manually:

  • Open file /etc/apt/source.list and add the url of the ppa to the bottom of the file or in a separate file as @justb4 pointed above
  • Import the PGP key
    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 314DF160
  • sudo apt update

@justb4
Copy link
Member

justb4 commented Jan 16, 2023

@kalxas still get errors, will not accept PPA:

#8 32.14 Running hooks in /etc/ca-certificates/update.d...
#8 32.15 done.
#8 32.40 Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
#8 32.44 Executing: /tmp/apt-key-gpghome.qQKwdRMwHe/gpg.1.sh --keyserver keyserver.ubuntu.com --recv-keys 314DF160
#8 32.70 gpg: key ABFB0707314DF160: public key "Totally Legit Signing Key <mallory@example.org>" imported
#8 32.71 gpg: Total number processed: 1
#8 32.71 gpg:               imported: 1
#8 33.17 Hit:1 http://security.ubuntu.com/ubuntu jammy-security InRelease
#8 33.29 Hit:2 http://archive.ubuntu.com/ubuntu jammy InRelease
#8 33.43 Get:3 https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu jammy InRelease [23.8 kB]
#8 33.45 Hit:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
#8 33.60 Hit:5 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
#8 33.73 Err:3 https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu jammy InRelease
#8 33.73   The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 089EBE08314DF160
#8 33.91 Reading package lists...
#8 35.19 W: GPG error: https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu jammy InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 089EBE08314DF160
#8 35.19 E: The repository 'https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu jammy InRelease' is not signed.
#8 ERROR: executor failed running [/bin/sh -c apt-get update -y     && apt-get upgrade -y     && apt-get --no-install-recommends install -y ${DEB_BUILD_DEPS}  	&& echo "deb https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu/ jammy main\n" > /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list 	&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 314DF160     && apt-get update -y     && apt-get --no-install-recommends install -y ${DEB_PACKAGES}     && localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8     && echo "For ${TZ} date=$(date)" && echo "Locale=$(locale)"      && mkdir /schemas.opengis.net     && curl -O http://schemas.opengis.net/SCHEMAS_OPENGIS_NET.zip     && unzip ./SCHEMAS_OPENGIS_NET.zip "ogcapi/*" -d /schemas.opengis.net     && rm -f ./SCHEMAS_OPENGIS_NET.zip     && pip3 install -r requirements-docker.txt     && pip3 install -e .     && cp /pygeoapi/docker/default.config.yml /pygeoapi/local.config.yml     && cp /pygeoapi/docker/entrypoint.sh /entrypoint.sh      && apt-get remove --purge -y gcc ${DEB_BUILD_DEPS}     && apt-get clean     && apt autoremove -y      && rm -rf /var/lib/apt/lists/*]: exit code: 100
------
 > [4/4] RUN     apt-get update -y     && apt-get upgrade -y     && apt-get --no-install-recommends install -y     curl     unzip     ca-certificates     gnupg2  	&& echo "deb https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu/ jammy main\n" > /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list 	&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 314DF160     && apt-get update -y     && apt-get --no-install-recommends install -y     locales     tzdata     gunicorn     python3-dateutil     python3-flask     python3-flask-cors     python3-gevent     python3-greenlet     python3-pip     python3-tz     python3-unicodecsv     python3-yaml         libsqlite3-mod-spatialite     python3-dask     python3-elasticsearch     python3-fiona     python3-gdal     python3-netcdf4     python3-pandas     python3-psycopg2     python3-pymongo     python3-pyproj     python3-rasterio     python3-scipy     python3-shapely     python3-tinydb     python3-xarray     python3-zarr     python3-mapscript     python3-pytest     python3-pyld     && localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8     && echo "For Etc/UTC date=$(date)" && echo "Locale=$(locale)"      && mkdir /schemas.opengis.net     && curl -O http://schemas.opengis.net/SCHEMAS_OPENGIS_NET.zip     && unzip ./SCHEMAS_OPENGIS_NET.zip "ogcapi/*" -d /schemas.opengis.net     && rm -f ./SCHEMAS_OPENGIS_NET.zip     && pip3 install -r requirements-docker.txt     && pip3 install -e .     && cp /pygeoapi/docker/default.config.yml /pygeoapi/local.config.yml     && cp /pygeoapi/docker/entrypoint.sh /entrypoint.sh      && apt-get remove --purge -y gcc     curl     unzip     ca-certificates     gnupg2     && apt-get clean     && apt autoremove -y      && rm -rf /var/lib/apt/lists/*:
------
executor failed running [/bin/sh -c apt-get update -y     && apt-get upgrade -y     && apt-get --no-install-recommends install -y ${DEB_BUILD_DEPS}  	&& echo "deb https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu/ jammy main\n" > /etc/apt/sources.list.d/ubuntugis-ubuntu-ubuntugis-unstable-jammy.list 	&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 314DF160     && apt-get update -y     && apt-get --no-install-recommends install -y ${DEB_PACKAGES}     && localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8     && echo "For ${TZ} date=$(date)" && echo "Locale=$(locale)"      && mkdir /schemas.opengis.net     && curl -O http://schemas.opengis.net/SCHEMAS_OPENGIS_NET.zip     && unzip ./SCHEMAS_OPENGIS_NET.zip "ogcapi/*" -d /schemas.opengis.net     && rm -f ./SCHEMAS_OPENGIS_NET.zip     && pip3 install -r requirements-docker.txt     && pip3 install -e .     && cp /pygeoapi/docker/default.config.yml /pygeoapi/local.config.yml     && cp /pygeoapi/docker/entrypoint.sh /entrypoint.sh      && apt-get remove --purge -y gcc ${DEB_BUILD_DEPS}     && apt-get clean     && apt autoremove -y      && rm -rf /var/lib/apt/lists/*]: exit code: 100

This is also a bit beyond my grasp: installed gnupg, gnupg2. Was confused in GDAL versions, but still get Jammy's version: GDAL 3.4.1, released 2021/12/27 while the current Docker latest version, like on demo has
GDAL 3.4.3, released 2022/04/22. Don't know for the other versions. Stuck how to get the PPA in without growing the Dockerfile....

@justb4
Copy link
Member

justb4 commented Jan 16, 2023

So maybe forget about the PPA approve the current version. Then fix all unit tests, maybe by skipping some. Fix the Docker build push in containers.yml and the proceed with #1105. We will have GDAL 3.4.1 (not 3.0.1 as I stated somewhere).
I now check by bash-ing into the container, apt-install gdal-bin and run gdalinfo --version. Ok, also from from within Python bindings:

$ python3
# gives 3.10.6
from osgeo import gdal as osgeo_gdal
osgeo_gdal.VersionInfo('VERSION_NUM')
'3040100'

@justb4
Copy link
Member

justb4 commented Jan 17, 2023

For sake of progress: let's move ahead, i.e. merge. Ok, GDAL will be a downgrade from 3.4.3 to 3.4.1. Can't tell for all the other N packages versions. But in the end: this is a "Latest" Docker Image to be published, not a pygeoapi version. Like a US Company, I once worked for, phrased: "We'll deal with the problem when it comes along" :-).

Also to provide a (re)base for @frafra in #1105 on which we can then all focus.
@kalxas @tomkralidis @francbartoli ok?

@justb4 justb4 merged commit 2d4d8e6 into geopython:master Jan 18, 2023
@justb4
Copy link
Member

justb4 commented Jan 18, 2023

Pff the Docker build on GH Workflow 1.5 hour underway, stay building...

@@ -75,8 +75,10 @@ case ${entry_cmd} in
case ${test_py} in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: is there any reason why we run tests in our Docker entrypoint? Is this needed (given we have CI)? Perhaps we should consider removing @francbartoli @justb4

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am working on it already on it on the other PR, as the entrypoint should not have commands, but just prepare the environment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to run (most of) the Unit-tests within a Docker Container. Especially currently, the entire installation with OS, Python and package-versions will surely differ from a custom install. Hopefully this will largely go away with PDM and #1105, but still, just relying on main.yml is fragile or we should somehow invoke there. Ok, maybe via entrypoint.sh is not the cleanest way but we will also need some logic to skip tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And btw the above Docker build does not even invoke Unit tests. There is some larger problem there. After 6 hours the above build just was canceled, think by GH, no logs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also containers.yml is now running for 5 hours after PR #1109 (unrelated to Docker build). At least the image is still built twice so that should be removed ASAP.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Takes around 8 minutes on my system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll do it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the ran the tests: docker run -p 5000:80 --rm --name pygeoapi geopython/pygeoapi:latest test > test.txt 2>&1. Output attached:
test.txt

Actually only 3 main failure-causes, most can be explained:

  • several tests in test_api.py
    Many tests IMO should not be here! Like tests for Postgresql CQL,Coverage, EDR. They should be in their Provider (PG) or Service-specific testfile. test_api.py should perform generic API tests, not specific, and relies on unavailable local servers.

  • WFS tests: still the wrong GeoSolutions URL is in, thought we fixed that ages ago...

  • test_async_hello_world_process_parallel fails, cannot explain

All other tests succeed, that is the good news, but still we should open an issue to cleanup tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Takes less than 3 minutes on mine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened issue: #1111.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants