Changes

Unreleased (latest)

1.20.3 (2022-08-18)

Fixes:

  • Canarie-api: fix unable to verify LetsEncrypt SSL certs

    LetsEncrypt's older root certificate "DST Root CA X3" expired on September 30, 2021; see https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/

    All the major browsers and OS platforms had added the new root certificate "ISRG Root X1" ahead of time, so the transition to the new root certificate was seamless for all clients.

    The Python requests package bundles its own copy of known root certificates and was late to add the new root cert "ISRG Root X1". Had it automatically fallen back to the OS copy of the root cert bundle, this would have been seamless.

    The fix is to force requests to use the OS copy of the root cert bundle.
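
    In practice this amounts to a single environment variable pointing requests at the OS bundle (the Debian/Ubuntu path below is the same one used in the verification further down):

    export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt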

    Fix for this error:

    $ docker exec proxy python -c "import requests; requests.request('GET', 'https://lvupavicsmaster.ouranos.ca/geoserver')"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 50, in request
        response = session.request(method=method, url=url, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 468, in request
        resp = self.send(prep, **send_kwargs)
      File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 576, in send
        r = adapter.send(request, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 433, in send
        raise SSLError(e, request=request)
    requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)
    

    Default SSL root cert bundle of requests:

    $ docker exec proxy python -c "import requests; print requests.certs.where()"
    /usr/local/lib/python2.7/dist-packages/requests/cacert.pem
    

    Confirm the fix works:

    $ docker exec -it proxy bash
    root@37ed3a2a03ae:/opt/local/src/CanarieAPI/canarieapi# REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt python -c "import requests; requests.request('GET', 'https://lvupavicsmaster.ouranos.ca/geoserver')"
    root@37ed3a2a03ae:/opt/local/src/CanarieAPI/canarieapi#
    
    $ docker exec proxy env |grep REQ
    REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
    

    Fixes #198

1.20.2 (2022-08-17)

Changes:

  • birdhouse-deploy: fix missing bump of server version reported in canarie service configuration

1.20.1 (2022-08-11)

Changes:

1.20.0 (2022-08-10)

Changes

  • Weaver: update weaver component default version from 4.12.0 to 4.20.0. See full CHANGELOG for details.

    Breaking changes

    • Docker commands that target weaver-worker to start or use celery must be adjusted according to how its new CLI resolves certain global parameters. Since the celery-healthcheck script uses this CLI, celery commands were adjusted to consider those changes. If custom scripts or command overrides are used to call celery, similar changes will need to be applied according to employed Weaver version. See details in Weaver 4.15.0 changes.

    Relevant changes

    • Support OpenAPI-based schema field for Process I/O definitions to align with latest OGC API - Processes changes.
    • Support Prefer header to define execution mode of jobs according to latest OGC API - Processes recommendations.
    • Support transmissionMode to return file-based outputs by HTTP Link header references as desired.
    • Support deployment of new processes using YAML and CWL based request contents directly to remove the need to convert and indirectly embed their definitions in specific JSON schema structures.
    • Support process revisions allowing users to iteratively update process metadata and their definitions without full un/re-deployment of the complete process for each change. This also allows multiple process revisions to live simultaneously on the instance, which can be described or launched for job executions with specific tagged versions.
    • Add control query parameters to retrieve outputs in different JSON schema variations according to desired structure.
    • Add statistics collection following job execution to obtain machine resource usage by the executed process.
    • Improve handling of Content-Type definitions for reporting inputs, outputs and logs retrieval from job executions.
    • Fixes related to reporting of job results with different formats and URL references based on requested execution methods and control parameters.
    • Fixes to resolve pending vulnerabilities or feature integrations by package dependencies (celery, pywps, etc.).
    • Fixes related to parsing of WPS-1/2 remote providers URL from a CWL definition using GetCapabilities endpoint.
    • Fixes and addition of multiple Weaver CLI capabilities to employ new features.

1.19.2 (2022-07-20)

Changes

  • Finch: new release for new Xclim

    Finch release notes:

    0.9.2 (2022-07-19)

    • Fix Finch unable to startup in the Docker image.

    0.9.1 (2022-07-07)

    • Avoid using a broken version of libarchive in the Docker image.

    0.9.0 (2022-07-06)

    • Fix use of output_name, add output_format to xclim indicators.

    • Change all outputs to use output as the main output field name (instead of output_netcdf).

    • Updated to xclim 0.37:

      • Percentile inputs of xclim indicators have been renamed with generic names, excluding an explicit mention to the target percentile.
      • In ensemble processes, these percentiles can now be chosen through perc_[var] inputs. The default values are inherited from earlier versions of xclim.
    • Average shape process downgraded to be single-threaded, as ESMF seems to have issues with multithreading.

    • Removed deprecated processes subset_ensemble_bbox_BCCAQv2, subset_ensemble_BCCAQv2 and BCCAQv2_heat_wave_frequency_gridpoint.

    • Added csv_precision to all processes allowing CSV output. When given, it controls the number of decimal places in the output.

1.19.1 (2022-07-19)

Changes

  • Various changes to get the new production host up and running

    Non-breaking changes

    • Bootstrap testsuite: only crawl a subset sufficient to pass canarie-api monitoring; faster when the system under test has too much other stuff.
    • New script: check-autodeploy-repos: to ensure autodeploy will trigger normally.
    • New script: sync-data: to pull data from existing production host to a new production host or to a staging host to emulate the production host.
    • thredds, geoserver, generic_bird: set more appropriate production values, taken from https://github.com/Ouranosinc/birdhouse-deploy/commit/316439e310e915e0a4ef35d25744cab76722fa99
    • monitoring: fix redundant network_mode: host together with ports binding, since host network mode already performs the port bindings automatically

    Breaking changes

    • None


1.19.0 (2022-06-08)

Changes:

  • Magpie/Twitcher: update magpie service from 3.21.0 to 3.26.0 and bundled twitcher from 0.6.2 to 0.7.0.

    • Adds Service Hooks allowing Twitcher to apply HTTP pre-request/post-response modifications to requested services and resources, in accordance with the MagpieAdapter implementation, using plugin Python scripts matched against specific request parameters.

    • Using Service Hooks, inject the X-WPS-Output-Context header in Weaver job submission requests through the proxied request by Twitcher and MagpieAdapter. This header contains the user ID that indicates to Weaver where to store job output results, allowing them to be saved in the corresponding user's workspace directory under the wpsoutputs path. More details found in PR #244.

    • Using Service Hooks, filter processes returned by Weaver in JSON response from /processes endpoint using respective permissions applied onto each /processes/{processID} for the requesting user. Users will only be able to see processes for which they have read access to retrieve the process description. More details found in PR #245.

    • Using Service Hooks, automatically apply permissions for the user that successfully deployed a Weaver process using POST /processes request, granting it direct access to this process during process listing, process description request and for submitting job execution of this process. Only this user deploying the process will have access to it until further permissions are added in Magpie to share or publish it with other users, groups and/or publicly. The user must have the necessary permission to deploy a new process in the first place. More details found in PR #247.

1.18.13 (2022-06-07)

Changes:

  • deploy-data: new env var DEPLOY_DATA_RSYNC_USER_GRP to avoid running cronjobs as root

    When deploy-data is used by the scheduler component, it runs as root. This new env var forces the rsync process to run as a regular user, following the security best practice of avoiding root when not needed.

    Note that the git checkout step done by deploy-data still runs as root. This is because deploy-data itself currently runs as root so it can execute docker commands (ex: spawning the rsync command above in its own docker container).

    To fix this limitation, the regular user inside the deploy-data container would need docker access both inside the container and outside on the host at the same time. Making that regular user configurable, so the deploy-data script stays generic and can work for any organization, is tricky for the moment, so it will have to be handled in another PR.

    So for the moment we have not achieved fully non-root cronjobs launched by the scheduler component, but the most important part, the part that performs the actual job (rsync or executing a custom command in an external docker container), runs as non-root.

    See PR bird-house/birdhouse-deploy-ouranos#18 that makes use of this new env var.

    When deploy-data invokes an external script that itself spawns a new docker run, it is up to this external script to ensure the proper non-root user is used by docker run. See PR Ouranosinc/pavics-vdb#50 that handles that case.
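
    A minimal env.local sketch (the user:group value format and the 1000:1000 value are hypothetical examples; pick a regular user that actually exists on the host):

    # Run the rsync step of deploy-data cronjobs as a non-root user
    export DEPLOY_DATA_RSYNC_USER_GRP="1000:1000"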

1.18.12 (2022-05-05)

Changes:

  • Jupyter env: new build for new XClim and to get Dask dashboard and Panel server app to work

    Deploy new Jupyter env from PR Ouranosinc/PAVICS-e2e-workflow-tests#105 on PAVICS.

    Detailed changes can be found at Ouranosinc/PAVICS-e2e-workflow-tests#105.

    Dask dashboard works without manual URL mangling (screenshot omitted).

    "Render with Panel" button works (screenshot omitted).

    Relevant changes:

    # new
    >   - dask-labextension=5.2.0=pyhd8ed1ab_0
    >   - jupyter-panel-proxy=0.2.0a2=py_0
    >   - jupyter-server-proxy=3.2.1=pyhd8ed1ab_0
    
    # removed, interfere with panel
    <     - handcalcs==1.4.1
    
    <   - xclim=0.34.0=pyhd8ed1ab_0
    >   - xclim=0.36.0=pyhd8ed1ab_0
    
    <   - cf_xarray=0.6.3=pyhd8ed1ab_0
    >   - cf_xarray=0.7.2=pyhd8ed1ab_0
    
    <   - clisops=0.8.0=pyh6c4a22f_0
    >   - clisops=0.9.0=pyh6c4a22f_0
    
    # downgrade by clisops
    <   - pandas=1.4.1=py38h43a58ef_0
    >   - pandas=1.3.5=py38h43a58ef_0
    
    <   - rioxarray=0.10.3=pyhd8ed1ab_0
    >   - rioxarray=0.11.1=pyhd8ed1ab_0
    
    <   - nc-time-axis=1.4.0=pyhd8ed1ab_0
    >   - nc-time-axis=1.4.1=pyhd8ed1ab_0
    
    <   - roocs-utils=0.5.0=pyh6c4a22f_0
    >   - roocs-utils=0.6.1=pyh6c4a22f_0
    
    <   - panel=0.12.7=pyhd8ed1ab_0
    >   - panel=0.13.1a2=py_0
    
    <   - plotly=5.6.0=pyhd8ed1ab_0
    >   - plotly=5.7.0=pyhd8ed1ab_0

1.18.11 (2022-04-21)

Changes:

  • Finch: new release for dask performance problem

    PR to deploy the new Finch release on PAVICS: bird-house/finch#233.

    See the Finch PR for more info.

    Finch release notes:

    0.8.3 (2022-04-21)

1.18.10 (2022-04-07)

Changes:

  • Jupyter env: new xlrd, pre-commit, pin dask, distributed, cf_xarray, latest of everything else

    Deploy new Jupyter env from PR Ouranosinc/PAVICS-e2e-workflow-tests#101 on PAVICS.

    Detailed changes can be found at Ouranosinc/PAVICS-e2e-workflow-tests#101.

    Relevant changes:

    >   - pre-commit=2.17.0=py38h578d9bd_0
    >   - xlrd=2.0.1=pyhd8ed1ab_3
    
    <   - xclim=0.32.1=pyhd8ed1ab_0
    >   - xclim=0.34.0=pyhd8ed1ab_0
    
    <   - cfgrib=0.9.9.1=pyhd8ed1ab_1
    >   - cfgrib=0.9.10.1=pyhd8ed1ab_0
    
    <   - cftime=1.5.1.1=py38h6c62de6_1
    >   - cftime=1.6.0=py38h3ec907f_0
    
    <   - intake-xarray=0.5.0=pyhd8ed1ab_0
    >   - intake-xarray=0.6.0=pyhd8ed1ab_0
    
    <   - pandas=1.3.5=py38h43a58ef_0
    >   - pandas=1.4.1=py38h43a58ef_0
    
    <   - regionmask=0.8.0=pyhd8ed1ab_1
    >   - regionmask=0.9.0=pyhd8ed1ab_0
    
    <   - rioxarray=0.9.1=pyhd8ed1ab_0
    >   - rioxarray=0.10.3=pyhd8ed1ab_0
    
    <   - xarray=0.20.2=pyhd8ed1ab_0
    >   - xarray=2022.3.0=pyhd8ed1ab_0
    
    <   - zarr=2.10.3=pyhd8ed1ab_0
    >   - zarr=2.11.1=pyhd8ed1ab_0

1.18.9 (2022-03-16)

Changes:

  • Finch: update finch component from 0.7.7 to 0.8.2

    Relevant Changes:

    • v0.8.0
      • Add hourly_to_daily process
      • Avoid annoying warnings by updating birdy (environment-docs)
      • Upgrade to clisops 0.8.0 to accelerate spatial averages over regions.
      • Upgrade to xesmf 0.6.2 to fix a spatial averaging bug that did not correctly weight cells with varying areas.
      • Update to PyWPS 4.5.1 to allow the creation of recursive directories for outputs.
    • v0.8.2
      • Add geoseries_to_netcdf process, converting a geojson (like an OGC-API request) to a CF-compliant netCDF.
      • Add output_name argument to most processes (except subsetting and averaging processes), to control the name (or prefix) of the output file.
      • New dependency python-slugify to ensure filenames are safe and valid.
      • Pinning dask to <=2022.1.0 to avoid a performance issue with 2022.1.1.

1.18.8 (2022-03-09)

Changes:

  • Weaver: fix tests

    Relevant changes:

    • Increase default timeout (60s -> 120s) for the components/weaver/post-docker-compose-up script to allow it to complete when many WPS birds take a long time to boot. Before this fix, test instances only managed to register the catalog, finch, and flyingpigeon providers, but timed out for hummingbird and the following WPS birds.

      This resolves the first few cell tests by having birds ready for use:

      [2022-03-09T02:13:34.966Z] pavics-sdi-master/docs/source/notebook-components/weaver_example.ipynb . [ 57%]
      [2022-03-09T02:13:46.069Z] .......FF.                                                               [ 61%]
      
    • Add override request_options.yml in birdhouse/optional-components/test-weaver that disables SSL verification specifically for the remaining 2 failing (F) cells above. The error is related to the job execution itself on the test instance, which fails when Weaver sends requests to hummingbird's ncdump process. An SSL verification error happens because the test instance uses a self-signed SSL certificate.

1.18.7 (2022-03-08)

Changes:

  • Weaver: update weaver component from 4.5.0 to 4.12.0.

    Relevant changes:

    • Adds WeaverClient and Weaver CLI. Although not strictly employed by the platform itself to offer Weaver as a service, these can be employed to interact with Weaver using Python or shell commands, providing access to all WPS birds offered by the platform using the common OGC API - Processes interface through Weaver Providers.
    • Adds Vault functionality allowing temporary and secure storage to upload files for single-use process execution.
    • Various bugfixes and conformance resolution related to OGC API - Processes.
    • Fix weaver-mongodb link references for weaver-worker. New default variables WEAVER_MONGODB_[HOST|PORT|URL] are defined to construct different INI configuration formats employed by weaver and weaver-worker images.
    • Fix missing EXTRA_VARS variables in Weaver's default.env.
    • Fix celery-healthcheck of weaver-worker to consider multiple tasks.

1.18.6 (2022-03-08)

  • Magpie: update magpie service from 3.19.1 to 3.21.0.

    Relevant changes:

    • Update WFS, WMS and WPS related services to properly implement the relevant Permissions and Resources according to their specific implementation details. For example, GeoServer-based WMS implementation supports Workspaces and additional operations that are not offered by standard OGC-based WMS. Some of these implementation specific operations can be taken advantage of with improved Permissions and Resources resolution.
    • Add multi-Resource effective access resolution for Services that support it. For example, accessing multiple Layers under a permission-restricted WFS with parameters that allow multiple values within a single request is now possible, if the user is granted access to all specified Resources. Previously, users had to access each Layer Resource individually with distinct requests.
    • Magpie's API and UI are more verbose about supported hierarchical Resource structure under a given Service type. When creating Resources, specific structures have to be respected, and only valid cases are proposed in the UI.
    • Minor UI fixes.

1.18.5 (2022-01-27)

Changes:

  • Jupyter: update Jupyter env for latest XClim, RavenPy and all packages

    Deploy new Jupyter env from PR Ouranosinc/PAVICS-e2e-workflow-tests#95 on PAVICS.

    Detailed changes can be found at Ouranosinc/PAVICS-e2e-workflow-tests#95.

    Relevant changes:

    <   - xclim=0.31.0=pyhd8ed1ab_0
    >   - xclim=0.32.1=pyhd8ed1ab_0
    
    <   - ravenpy=0.7.5=pyhff6ddc9_0
    >   - ravenpy=0.7.8=pyh8a188c0_0
    
    <   - python=3.7.12=hb7a2778_100_cpython
    >   - python=3.8.12=hb7a2778_2_cpython
    
    # removed
    <   - vcs=8.2.1=pyh9f0ad1d_0
    
    <   - numpy=1.21.4=py37h31617e3_0
    >   - numpy=1.21.5=py38h87f13fb_0
    
    <   - xarray=0.20.1=pyhd8ed1ab_0
    >   - xarray=0.20.2=pyhd8ed1ab_0
    
    <   - rioxarray=0.8.0=pyhd8ed1ab_0
    >   - rioxarray=0.9.1=pyhd8ed1ab_0
    
    <   - cf_xarray=0.6.1=pyh6c4a22f_0
    >   - cf_xarray=0.6.3=pyhd8ed1ab_0
    
    <   - gdal=3.3.2=py37hd5a0ba4_2
    >   - gdal=3.3.3=py38hcf2042a_0
    
    <   - rasterio=1.2.6=py37hc20819c_2
    >   - rasterio=1.2.10=py38hfd64e68_0
    
    <   - climpred=2.1.6=pyhd8ed1ab_1
    >   - climpred=2.2.0=pyhd8ed1ab_0
    
    <   - clisops=0.7.0=pyh6c4a22f_0
    >   - clisops=0.8.0=pyh6c4a22f_0
    
    <   - xesmf=0.6.0=pyhd8ed1ab_0
    >   - xesmf=0.6.2=pyhd8ed1ab_0
    
    <   - birdy=v0.8.0=pyh6c4a22f_1
    >   - birdy=0.8.1=pyh6c4a22f_1
    
    <   - cartopy=0.20.0=py37hbe109c4_0
    >   - cartopy=0.20.1=py38hf9a4893_1
    
    <   - dask=2021.11.2=pyhd8ed1ab_0
    >   - dask=2022.1.0=pyhd8ed1ab_0
    
    <   - numba=0.53.1=py37hb11d6e1_1
    >   - numba=0.55.0=py38h4bf6c61_0
    
    <   - pandas=1.3.4=py37he8f5f7f_1
    >   - pandas=1.3.5=py38h43a58ef_0

1.18.4 (2022-01-25)

Changes:

  • vagrant: support RockyLinux

    RockyLinux 8 is the successor to CentOS 7.

    CentOS 8 has become more like a "RHEL 8 beta" than the equivalent of RHEL 8.

    RockyLinux 8 is the new equivalent of RHEL 8, following the original spirit of the CentOS project.

    More info at https://rockylinux.org/about.

1.18.3 (2021-12-17)

Changes:

  • Jupyter: new build with latest changes

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#94 for more info.

    Change summary:

    <   - xclim=0.28.1=pyhd8ed1ab_0
    >   - xclim=0.31.0=pyhd8ed1ab_0
    
    <   - ravenpy=0.7.4=pyh7f9bfb9_0
    >   - ravenpy=0.7.5=pyhff6ddc9_0
    
    <   - xarray=0.19.0=pyhd8ed1ab_1
    >   - xarray=0.20.1=pyhd8ed1ab_0
    
    <   - rasterio=1.2.1=py37ha549118_0
    >   - rasterio=1.2.6=py37hc20819c_2
    
    <   - bokeh=2.3.3=py37h89c1867_0
    >   - bokeh=2.4.2=py37h89c1867_0
    
    <   - cartopy=0.19.0.post1=py37h0c48da3_1
    >   - cartopy=0.20.0=py37hbe109c4_0
    
    <   - cffi=1.14.6=py37hc58025e_0
    >   - cffi=1.15.0=py37h036bc23_0
    
    <   - climpred=2.1.5.post1=pyhd8ed1ab_0
    >   - climpred=2.1.6=pyhd8ed1ab_1
    
    <   - clisops=0.6.5=pyh6c4a22f_0
    >   - clisops=0.7.0=pyh6c4a22f_0
    
    <   - dask=2021.9.0=pyhd8ed1ab_0
    >   - dask=2021.11.2=pyhd8ed1ab_0
    
    <   - gdal=3.1.4=py37h2ec2946_8
    >   - gdal=3.3.2=py37hd5a0ba4_2
    
    <   - geopandas=0.9.0=pyhd8ed1ab_1
    >   - geopandas=0.10.2=pyhd8ed1ab_0
    
    <   - nc-time-axis=1.3.1=pyhd8ed1ab_2
    >   - nc-time-axis=1.4.0=pyhd8ed1ab_0
    
    <   - pandas=1.2.5=py37h219a48f_0
    >   - pandas=1.3.4=py37he8f5f7f_
    
    <   - poppler=0.89.0=h2de54a5_5
    >   - poppler=21.09.0=ha39eefc_3
    
    <   - rioxarray=0.7.0=pyhd8ed1ab_0
    >   - rioxarray=0.8.0=pyhd8ed1ab_0
    
    <   - roocs-utils=0.4.2=pyh6c4a22f_0
    >   - roocs-utils=0.5.0=pyh6c4a22f_0

1.18.2 (2021-12-13)

Fixes

  • Thredds: update for Log4j Vulnerability CVE-2021-44228

    The Quebec government has shut down its websites due to this vulnerability, so it's pretty serious (https://montrealgazette.com/news/quebec/quebec-government-shutting-down-websites-report).

    Thredds release notes: https://github.com/Unidata/thredds/releases

    https://www.oracle.com/security-alerts/alert-cve-2021-44228.html

    Oracle Security Alert Advisory - CVE-2021-44228

    Description

    This Security Alert addresses CVE-2021-44228, a remote code execution vulnerability in Apache Log4j. It is remotely exploitable without authentication, i.e., may be exploited over a network without the need for a username and password.

    Due to the severity of this vulnerability and the publication of exploit code on various sites, Oracle strongly recommends that customers apply the updates provided by this Security Alert as soon as possible.

    Affected Products and Versions: Apache Log4j, versions 2.0-2.14.1

    We have 4 Java components but only 1 is vulnerable: Thredds.

    After fix:

    $ docker run -it --rm unidata/thredds-docker:4.6.18 bash
    root@f65aadd2955c:/usr/local/tomcat# find -iname '**log4j**'
    ./webapps/thredds/WEB-INF/classes/log4j2.xml
    ./webapps/thredds/WEB-INF/lib/log4j-api-2.15.0.jar
    ./webapps/thredds/WEB-INF/lib/log4j-core-2.15.0.jar
    ./webapps/thredds/WEB-INF/lib/log4j-slf4j-impl-2.15.0.jar
    ./webapps/thredds/WEB-INF/lib/log4j-web-2.15.0.jar
    

    Before fix (unidata/thredds-docker:4.6.15):

    $ docker exec -it thredds find / -iname '**log4j**'
    find: ‘/proc/1/map_files’: Operation not permitted
    find: ‘/proc/12/map_files’: Operation not permitted
    find: ‘/proc/20543/map_files’: Operation not permitted
    /usr/local/tomcat/webapps/twitcher#ows#proxy#thredds/WEB-INF/classes/log4j2.xml
    /usr/local/tomcat/webapps/twitcher#ows#proxy#thredds/WEB-INF/lib/log4j-api-2.13.3.jar
    /usr/local/tomcat/webapps/twitcher#ows#proxy#thredds/WEB-INF/lib/log4j-core-2.13.3.jar
    /usr/local/tomcat/webapps/twitcher#ows#proxy#thredds/WEB-INF/lib/log4j-slf4j-impl-2.13.3.jar
    /usr/local/tomcat/webapps/twitcher#ows#proxy#thredds/WEB-INF/lib/log4j-web-2.13.3.jar
    

    Other components (ncwms2, geoserver, solr) have log4j older than version 2.0, so they are supposedly not affected:

    $ docker exec -it ncwms2 find / -iname '**log4j**'
    /opt/conda/envs/birdhouse/opt/apache-tomcat/webapps/ncWMS2/WEB-INF/classes/log4j.properties
    /opt/conda/envs/birdhouse/opt/apache-tomcat/webapps/ncWMS2/WEB-INF/lib/log4j-1.2.17.jar
    /opt/conda/envs/birdhouse/opt/apache-tomcat/webapps/ncWMS2/WEB-INF/lib/slf4j-log4j12-1.7.2.jar
    
    $ docker exec -it geoserver find / -iname '**log4j**'
    /build_data/log4j.properties
    find: ‘/etc/ssl/private’: Permission denied
    find: ‘/proc/tty/driver’: Permission denied
    find: ‘/proc/1/map_files’: Operation not permitted
    find: ‘/proc/15/task/47547’: No such file or directory
    find: ‘/proc/15/map_files’: Operation not permitted
    find: ‘/proc/47492/map_files’: Operation not permitted
    find: ‘/root’: Permission denied
    /usr/local/tomcat/log4j.properties
    /usr/local/tomcat/webapps/geoserver/WEB-INF/lib/log4j-1.2.17.jar
    /usr/local/tomcat/webapps/geoserver/WEB-INF/lib/metrics-log4j-3.0.2.jar
    /usr/local/tomcat/webapps/geoserver/WEB-INF/lib/slf4j-log4j12-1.6.4.jar
    find: ‘/var/cache/apt/archives/partial’: Permission denied
    find: ‘/var/cache/ldconfig’: Permission denied
    
    $ docker exec -it solr find / -iname '**log4j**'
    /data/solr/log4j.properties
    /opt/birdhouse/eggs/birdhousebuilder.recipe.solr-0.1.5-py2.7.egg/birdhousebuilder/recipe/solr/templates/log4j.properties
    /opt/conda/envs/birdhouse/opt/solr/docs/solr-core/org/apache/solr/logging/log4j
    /opt/conda/envs/birdhouse/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/Log4jInfo.html
    /opt/conda/envs/birdhouse/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/Log4jWatcher.html
    /opt/conda/envs/birdhouse/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/class-use/Log4jInfo.html
    /opt/conda/envs/birdhouse/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/class-use/Log4jWatcher.html
    /opt/conda/envs/birdhouse/opt/solr/example/resources/log4j.properties
    /opt/conda/envs/birdhouse/opt/solr/licenses/log4j-1.2.17.jar.sha1
    /opt/conda/envs/birdhouse/opt/solr/licenses/log4j-LICENSE-ASL.txt
    /opt/conda/envs/birdhouse/opt/solr/licenses/log4j-NOTICE.txt
    /opt/conda/envs/birdhouse/opt/solr/licenses/slf4j-log4j12-1.7.7.jar.sha1
    /opt/conda/envs/birdhouse/opt/solr/server/lib/ext/log4j-1.2.17.jar
    /opt/conda/envs/birdhouse/opt/solr/server/lib/ext/slf4j-log4j12-1.7.7.jar
    /opt/conda/envs/birdhouse/opt/solr/server/resources/log4j.properties
    /opt/conda/envs/birdhouse/opt/solr/server/scripts/cloud-scripts/log4j.properties
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/docs/solr-core/org/apache/solr/logging/log4j
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/Log4jInfo.html
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/Log4jWatcher.html
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/class-use/Log4jInfo.html
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/docs/solr-core/org/apache/solr/logging/log4j/class-use/Log4jWatcher.html
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/example/resources/log4j.properties
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/licenses/log4j-1.2.17.jar.sha1
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/licenses/log4j-LICENSE-ASL.txt
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/licenses/log4j-NOTICE.txt
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/licenses/slf4j-log4j12-1.7.7.jar.sha1
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/server/lib/ext/log4j-1.2.17.jar
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/server/lib/ext/slf4j-log4j12-1.7.7.jar
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/server/resources/log4j.properties
    /opt/conda/pkgs/solr-5.2.1-1/opt/solr/server/scripts/cloud-scripts/log4j.properties
    

1.18.1 (2021-12-08)

Fixes

1.18.0 (2021-12-08)

Changes

  • Upgrade default Weaver version to 4.5.0 (from 4.2.1) for new features and fixes. Most notable changes are:
    • Adds support of X-WPS-Output-Context header to define the WPS output nested directory (for user context).
    • Adds support of X-Auth-Docker header to define a private Docker registry authentication token when the referenced Docker image in the deployed Application Package requires it to fetch it for Process execution.
    • Require MongoDB==5.0 Docker image for Weaver's database.
    • Fixes related to handling dismiss operation of job executions and retrieval of their results.
    • Fixes related to fetching remote files and propagation of intermediate results between Workflow steps.

Important

Because of the new MongoDB==5.0 database requirement for Weaver, which uses a (potentially) different version from other birds (notably phoenix with MongoDB==3.4), a separate Docker image is employed only for Weaver. If some processes, jobs, or other Weaver-related data were already defined on one of your server instances, a manual transfer from the generic ${DATA_PERSIST_ROOT}/mongodb_persist directory to the new ${DATA_PERSIST_ROOT}/mongodb_weaver_persist directory must be accomplished. The data in the new directory should then be migrated to the new version following the procedure described in Database Migration.
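
A rough sketch of that manual transfer (assuming the default paths and that the stack is stopped first; keep a backup before moving anything):

    $ cd ${DATA_PERSIST_ROOT:-/data}
    $ cp -a mongodb_persist mongodb_weaver_persist   # then run the Database Migration procedure on this copy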

Legal Notice

While migrating from MongoDB==3.4 to MongoDB==5.0, its license changes from AGPL to SSPL (reference: mongodb/mongo@6ea81c8/README#L89-L95). This should not impact users using the platform for public and Open Source uses, but should be considered otherwise.

1.17.6 (2021-12-03)

Changes

  • Upgrade Magpie/Twitcher to 3.19.0, and add new related environment variables.
    • Adjust Twitcher runner to employ gunicorn instead of waitress.
    • Add new environment variables to handle email usage, used for features such as user registration/approval and user assignment to groups with terms and conditions.
    • Add expiration variable for temporary tokens.

1.17.5 (2021-11-16)

Changes

1.17.4 (2021-11-03)

Fixes

  • Add missing config/canarie-api/weaver_config.py entry to .gitignore of ./components/weaver that is generated from the corresponding template file.

    If upgrading from previous 1.17.x version, autodeploy will not resume automatically even with this fix because of the dirty state of the repository. A manual git pull will be required to fix subsequent autodeploy triggers.

1.17.3 (2021-11-03)

Fixes

  • Minor fix to install-docker.sh and comment update for other scripts due to Magpie upgrade

    install-docker.sh: fix to work with users having sudo privilege. Before, it required the root user.

    Other comments in scripts are due to new Magpie in PR #107.

1.17.2 (2021-11-03)

Changes

  • scripts: add extract-jupyter-users-from-magpie-db

    Extract Jupyter users from Magpie DB so we can send announcements to all Jupyter users.

    Sample output:

    $ ./scripts/extract-jupyter-users-from-magpie-db  > /tmp/out
    + echo SELECT email,user_name FROM users ORDER BY email
    + docker exec -i postgres-magpie psql -U postgres-magpie magpiedb
    
    $ cat /tmp/out
             email          |   user_name
    ------------------------+---------------
     admin-catalog@mail.com | admin-catalog
     admin@mail.com         | admin
     anonymous@mail.com     | anonymous
     authtest@example.com   | authtest
    (4 rows)
    

1.17.1 (2021-11-02)

Fixes

  • Apply mongodb network to mongodb image in order to allow phoenix to properly reference it.
  • Remove mongodb definition from ./components/weaver since the extended mongodb network is already provided.

1.17.0 (2021-11-01)

Changes

  • Adds Weaver to the stack (optional) when ./components/weaver is added to EXTRA_CONF_DIRS (a minimal sketch follows this list). For more details, refer to the Weaver Component documentation. The following happens when enabled:

    • Service weaver (API) gets added with endpoints /twitcher/ows/proxy/weaver and /weaver.

    • All birds offering a WPS 1.x/2.x endpoint are automatically added as providers known by Weaver (birds: catalog, finch, flyingpigeon, hummingbird, malleefowl and raven). This offers an automatic mapping of WPS 1.x/2.x requests of process descriptions and execution nested under the birds to corresponding OGC-API - Processes RESTful interface (and added functionalities).

    • New processes can be deployed and executed using Dockerized Application Packages. Additionally, all existing processes (across bird providers and Dockerized Application Packages) can be chained into Workflows.

    • Images weaver-worker (Weaver's job executor) and docker-proxy (sibling Docker container dispatcher) are added to the stack to support above functionalities.

    • Adds Magpie permissions and service for Weaver endpoints.

    • Adds ./optional-components/test-weaver, which grants even more extended Magpie permissions for Weaver, giving access to the resources needed by the Weaver Testing notebook.
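
    A minimal env.local sketch enabling the component (EXTRA_CONF_DIRS comes from the entry above; treat the exact list formatting as an assumption):

    # env.local excerpt (sketch): opt in to the Weaver component
    export EXTRA_CONF_DIRS="./components/weaver"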

1.16.2 (2021-10-27)

Changes

  • geoserver: enable geopkg plugin

    https://docs.geoserver.org/latest/en/user/community/geopkg/

    ==========

    This plugin brings in the ability to write GeoPackage files in GeoServer. Reading GeoPackage files is part of the core functionality of GeoServer, and does not require this extension.

    GeoPackage is an SQLite based standard format that is able to hold multiple vector and raster data layers in a single file.

    GeoPackage can be used as an output format for WFS GetFeature (creating one vector data layer) as well as WMS GetMap (creating one raster data layer). The GeoServer GeoPackage extension also allows to create a completely custom made GeoPackage with multiple layers, using the GeoPackage process.

    ==========

    Concretely, this plugin adds a new GeoPackage download format (screenshot omitted).
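
    For instance, a hypothetical WFS GetFeature request using the new output format (the host, workspace and layer names are illustrative; the outputFormat token follows the GeoServer geopkg docs linked above):

    $ curl -o layer.gpkg "https://pavics.example.com/geoserver/wfs?service=WFS&version=2.0.0&request=GetFeature&typeNames=myworkspace:mylayer&outputFormat=geopackage"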

1.16.1 (2021-10-25)

Changes

  • Thredds: Enable Netcdf Subset Service (NCSS)

    "The Netcdf Subset Service (NCSS) is one of the ways that the TDS can serve data. It is an experimental REST protocol for returning subsets of CDM datasets." https://www.unidata.ucar.edu/software/tds/current/reference/NetcdfSubsetServiceConfigure.html

    More NCSS docs: https://www.unidata.ucar.edu/software/tds/current/reference/NetcdfSubsetServiceReference.html

    Briefly, the advantage of enabling NCSS is the ability to perform subsetting directly in the browser (by manipulating URL parameters), avoiding the overhead of using OpenDAP (which needs a client other than the browser). This even works for .ncml files.

    Recall that previously, using the "HTTPServer" link type, we could download .nc files directly, but for .ncml files we got the XML content instead. With this new "NetcdfSubset" link type, we can actually download the NetCDF content of a .ncml file directly from the browser.
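
    For example, a hypothetical NCSS subset request against the dataset described in dataset.xml below (the host is illustrative; the query parameters follow the NCSS reference linked above, and the bounds fall inside the dataset's LatLonBox and TimeSpan):

    $ curl -o subset.nc "https://pavics.example.com/twitcher/ows/proxy/thredds/ncss/birdhouse/testdata/flyingpigeon/cmip3/tasmin.sresa2.miub_echo_g.run1.atm.da.nc?var=tasmin&north=61&south=43&west=-78&east=-57&time_start=2046-01-01T12:00:00Z&time_end=2046-12-30T12:00:00Z&accept=netcdf"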

    Sample screenshots (omitted): "Catalog Services" and "NetCDF Subset Service for Grids".

    dataset.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <gridDataset location="/twitcher/ows/proxy/thredds/ncss/birdhouse/testdata/flyingpigeon/cmip3/tasmin.sresa2.miub_echo_g.run1.atm.da.nc" path="path">
      <axis name="lat" shape="6" type="double" axisType="Lat">
        <attribute name="units" value="degrees_north"/>
        <attribute name="long_name" value="latitude"/>
        <attribute name="standard_name" value="latitude"/>
        <attribute name="bounds" value="lat_bnds"/>
        <attribute name="axis" value="Y"/>
        <attribute name="_ChunkSizes" type="int" value="6"/>
        <attribute name="_CoordinateAxisType" value="Lat"/>
        <values>42.67760468 46.38855743 50.09945297 53.81027222 57.52099228 61.2315712</values>
      </axis>
      <axis name="lon" shape="7" type="double" axisType="Lon">
        <attribute name="units" value="degrees_east"/>
        <attribute name="long_name" value="longitude"/>
        <attribute name="standard_name" value="longitude"/>
        <attribute name="bounds" value="lon_bnds"/>
        <attribute name="axis" value="X"/>
        <attribute name="_ChunkSizes" type="int" value="7"/>
        <attribute name="_CoordinateAxisType" value="Lon"/>
        <values start="281.25" increment="3.75" npts="7"/>
      </axis>
      <axis name="time" shape="7200" type="double" axisType="Time">
        <attribute name="units" value="days since 1860-1-1"/>
        <attribute name="calendar" value="360_day"/>
        <attribute name="bounds" value="time_bnds"/>
        <attribute name="_ChunkSizes" type="int" value="7200"/>
        <attribute name="_CoordinateAxisType" value="Time"/>
        <values start="66960.5" increment="1.0" npts="7200"/>
      </axis>
      <gridSet name="time lat lon">
        <projectionBox>
          <minx>279.375</minx>
          <maxx>305.625</maxx>
          <miny>40.82210731506348</miny>
          <maxy>63.08675956726074</maxy>
        </projectionBox>
        <axisRef name="time"/>
        <axisRef name="lat"/>
        <axisRef name="lon"/>
        <grid name="tasmin" desc="Minimum Daily Surface Air Temperature" shape="time lat lon" type="float">
          <attribute name="original_name" value="T2MIN"/>
          <attribute name="coordinates" value="height"/>
          <attribute name="long_name" value="Minimum Daily Surface Air Temperature"/>
          <attribute name="standard_name" value="air_temperature"/>
          <attribute name="cell_methods" value="time: minimum (interval: 30 minutes)"/>
          <attribute name="units" value="K"/>
          <attribute name="missing_value" type="float" value="1.0E20"/>
          <attribute name="history" value="tas=max(195,tas) applied to raw data; min of 194.73 detected;"/>
          <attribute name="_ChunkSizes" type="int" value="7200 6 7"/>
        </grid>
      </gridSet>
      <LatLonBox>
        <west>-78.7500</west>
        <east>-56.2500</east>
        <south>42.6776</south>
        <north>61.2315</north>
      </LatLonBox>
      <TimeSpan>
        <begin>2046-01-01T12:00:00Z</begin>
        <end>2065-12-30T12:00:00Z</end>
      </TimeSpan>
      <AcceptList>
        <GridAsPoint>
          <accept displayName="xml">xml</accept>
          <accept displayName="xml (file)">xml_file</accept>
          <accept displayName="csv">csv</accept>
          <accept displayName="csv (file)">csv_file</accept>
          <accept displayName="geocsv">geocsv</accept>
          <accept displayName="geocsv (file)">geocsv_file</accept>
          <accept displayName="netcdf">netcdf</accept>
          <accept displayName="netcdf4">netcdf4</accept>
        </GridAsPoint>
        <Grid>
          <accept displayName="netcdf">netcdf</accept>
          <accept displayName="netcdf4">netcdf4</accept>
        </Grid>
      </AcceptList>
    </gridDataset>

1.16.0 (2021-10-20)

Changes

1.15.2 (2021-09-22)

Changes

1.15.1 (2021-09-21)

Changes

  • Finch: Increase maxrequestsize from 100mb to 400mb to enable ERA5 data subsetting. It should be possible to bring this back down with smarter averaging processes.

1.15.0 (2021-09-20)

Changes

  • Backward-incompatible change: do not, by default, volume-mount the Jupyter env README file, since that file has been deleted in this repo. That file is fairly specific to Ouranos, while we want this repo to be generic. PR Ouranosinc/PAVICS-landing#31 restored that file in the PAVICS-landing repo, which is Ouranos-specific.

    • The previous default is added as a comment in env.local so existing deployments can restore the previous behavior. Although the README file has been deleted in this PR, it has already been deployed previously, so existing systems can restore the previous behavior of keeping the existing README file. That file will simply not be updated anymore.
  • Delete the deployment of that README file as well, since that README file is deleted. PR bird-house/birdhouse-deploy-ouranos#15 restores the deployment for Ouranos.

  • Each Org will be responsible for the deployment of their own README file. PR bird-house/birdhouse-deploy-ouranos#15 can be used as a working example from Ouranos.

  • Add sample code for simple and naive notebook sharing between Jupyter users.

Notebook sharing details

Shared notebooks will be visible to all logged-in users, even the public demo user, so do not share any notebooks containing sensitive private info.

Sharing with only a specific user is not possible.

Anyone will see the login id of everyone else, so if login ids need to be kept private, change this sample code.

Inside Jupyter, users will have the following additional folders:

.
├── mypublic/  # writable by current user
│   ├── current-user-public-share-file.ipynb
│   ├── (...)
├── public/  # read-only for everyone
│   ├── loginid-1-public/
│   │   └── loginid-1-shared-file.ipynb
│   │   └── (...)
│   ├── loginid-2-public/
│   │   └── loginid-2-shared-file.ipynb
│   │   └── (...)
│   ├── (...)-public/
│   │   └── (...)

Users can drop files to be shared under the mypublic folder and see other users' shares under public/{other-loginid}-public.

Matching PR Ouranosinc/PAVICS-landing#31 updates the README inside the Jupyter env to explain this new sharing mechanism.

Deployed to https://medus.ouranos.ca/jupyter/ for acceptance testing.

1.14.4 (2021-09-10)

Changes

  • Jupyter: update for new RavenPy and other new packages

    Bokeh png export now also works.

    Other noticeable changes:

    <   - ravenpy=0.7.0=pyh1bb2064_0
    >   - ravenpy=0.7.4=pyh7f9bfb9_0
    
    <   - xclim=0.28.0=pyhd8ed1ab_0
    >   - xclim=0.28.1=pyhd8ed1ab_0
    
    >   - geckodriver=0.29.1=h3146498_0
    >   - selenium=3.141.0=py37h5e8e339_1002
    >   - nested_dict=1.61=pyhd3deb0d_0
    >   - paramiko=2.7.2=pyh9f0ad1d_0
    >   - scp=0.14.0=pyhd8ed1ab_0
    >   - s3fs=2021.8.1=pyhd8ed1ab_0
    
    # Downgrade !
    <   - pandas=1.3.1=py37h219a48f_0
    >   - pandas=1.2.5=py37h219a48f_0
    
    <   - owslib=0.24.1=pyhd8ed1ab_0
    >   - owslib=0.25.0=pyhd8ed1ab_0
    
    <   - cf_xarray=0.6.0=pyh6c4a22f_0
    >   - cf_xarray=0.6.1=pyh6c4a22f_0
    
    <   - rioxarray=0.5.0=pyhd8ed1ab_0
    >   - rioxarray=0.7.0=pyhd8ed1ab_0
    
    <   - climpred=2.1.4=pyhd8ed1ab_0
    >   - climpred=2.1.5.post1=pyhd8ed1ab_0
    
    <   - dask=2021.7.1=pyhd8ed1ab_0
    >   - dask=2021.9.0=pyhd8ed1ab_0

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#89 for more info.

1.14.3 (2021-09-08)

Changes

1.14.2 (2021-09-01)

Changes

1.14.1 (2021-08-31)

  • monitoring: make some Prometheus alert thresholds configurable via env.local

    Default values are the previous hardcoded values, so this is fully backward compatible.

    Different organizations with different policies and hardware can now adapt the alert thresholds to their specific needs, decreasing false positive alerts.

    Too many false positive alerts decrease the importance and usefulness of each alert. Alerts should not feel like spam.

    Not all alert thresholds are made configurable. Only thresholds that are most likely to need customization, or that logically should be configurable, are made configurable.

    Fixes #66.

1.14.0 (2021-08-02)

Changes

  • Add request caching settings in Twitcher INI configuration to work with Magpie to help reduce permission and access control computation time.

  • Add magpie logger under Twitcher INI configuration to provide relevant logging details provided by MagpieAdapter it employs for service and resource access resolution.

  • Change logging level of sqlalchemy.engine under Magpie INI configuration to WARN in order to avoid by default over verbose database queries.

  • Update Magpie version to 3.14.0 with corresponding Twitcher using MagpieAdapter to obtain fixes about request caching and logging improvements during Twitcher security check failure following raised exception.

    Please note that because the previous default version was 3.12.0, a security fix introduced in 3.13.0 is included. (see details here: 3.13.0 (2021-06-29))

    This security fix explicitly disallows duplicate emails for different user accounts, which might require manual database updates if such users exist on your server instance. To look for possible duplicates, the following command can be used. Duplicate entries must be updated or removed such that only unique emails are present.

    echo "select email,user_name from users" | \
    docker exec -i postgres-magpie psql -U $POSTGRES_MAGPIE_USERNAME magpiedb | \
    sort > /tmp/magpie_users.txt
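
    To actually spot the duplicated emails in that dump, something like the following can work (a sketch assuming psql's default aligned output, where the email is the first whitespace-separated column):

    awk '{print $1}' /tmp/magpie_users.txt | sort | uniq -d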

Fixes

  • Adjust incorrect magpie.url value in Magpie INI configuration.

1.13.14 (2021-07-29)

  • jupyter: update for JupyterLab v3, fix memory monitor display and RavenPy-0.7.0

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#85 for more info.

    Relevant changes:

    <   - jupyterlab=2.2.9=pyhd8ed1ab_0
    >   - jupyterlab=3.1.0=pyhd8ed1ab_0
    
    <   - jupyterlab_server=1.2.0=py_0
    >   - jupyterlab_server=2.6.1=pyhd8ed1ab_0
    
    <   - jupyter-archive=2.2.0=pyhd8ed1ab_0
    >   - jupyter-archive=3.0.1=pyhd8ed1ab_0
    
    <   - jupyter_bokeh=2.0.4=pyhd8ed1ab_0
    >   - jupyter_bokeh=3.0.2=pyhd8ed1ab_0
    
    <   - jupyterlab-git=0.24.0=pyhd8ed1ab_0
    >   - jupyterlab-git=0.31.0=pyhd8ed1ab_0
    
    <   - nbdime=2.1.0=py_0
    >   - nbdime=3.1.0=pyhd8ed1ab_0
    
    # Pip to Conda package
    <     - nbresuse==0.4.0
    >   - nbresuse=0.4.0=pyhd8ed1ab_0
    
    >   - nbclassic=0.3.1=pyhd8ed1ab_1
    
    >   - jupyterlab-system-monitor=0.8.0=pyhd8ed1ab_1
    >   - jupyter-resource-usage=0.5.1=pyhd8ed1ab_0
    >   - jupyterlab-topbar=0.6.1=pyhd8ed1ab_2
    >     - jupyterlab-logout=0.5.0
    
    <   - jupyter_conda=5.1.1=hd8ed1ab_0
    
    <   - ravenpy=0.6.0=pyh1bb2064_2
    >   - ravenpy=0.7.0=pyh1bb2064_0
    
    <   - pandas=1.2.5=py37h219a48f_0
    >   - pandas=1.3.1=py37h219a48f_0
    
    <   - xarray=0.18.2=pyhd8ed1ab_0
    >   - xarray=0.19.0=pyhd8ed1ab_1
    
    <   - dask=2021.7.0=pyhd8ed1ab_0
    >   - dask=2021.7.1=pyhd8ed1ab_0
    
    <   - regionmask=0.6.2=pyhd8ed1ab_0
    >   - regionmask=0.7.0=pyhd8ed1ab_0

1.13.13 (2021-07-26)

Changes

  • jupyter: update for RavenPy-0.6.0, Xclim-0.28.0 and latest of everything else

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#84 for more info.

    Relevant changes:

    <   - ravenpy=0.5.2=pyh7f9bfb9_0
    >   - ravenpy=0.6.0=pyh1bb2064_2
    
    <   - xclim=0.27.0=pyhd8ed1ab_0
    >   - xclim=0.28.0=pyhd8ed1ab_0
    
    # birdy rebuild
    <   - birdy=v0.8.0=pyh6c4a22f_0
    >   - birdy=v0.8.0=pyh6c4a22f_1
    
    <   - cf_xarray=0.5.2=pyh6c4a22f_0
    >   - cf_xarray=0.6.0=pyh6c4a22f_0
    
    <   - cftime=1.4.1=py37h902c9e0_0
    >   - cftime=1.5.0=py37h6f94858_0
    
    <   - dask=2021.6.0=pyhd8ed1ab_0
    >   - dask=2021.7.0=pyhd8ed1ab_0
    
    <   - nc-time-axis=1.2.0=py_1
    >   - nc-time-axis=1.3.1=pyhd8ed1ab_2
    
    <   - rioxarray=0.4.1.post0=pyhd8ed1ab_0
    >   - rioxarray=0.5.0=pyhd8ed1ab_0
    
    <   - numpy=1.20.3=py37h038b26d_1
    >   - numpy=1.21.1=py37h038b26d_0
    
    <   - pandas=1.2.4=py37h219a48f_0
    >   - pandas=1.2.5=py37h219a48f_0
    
    <   - plotly=4.14.3=pyh44b312d_0
    >   - plotly=5.1.0=pyhd8ed1ab_1
    
    <     - nbconvert==5.6.1
    >   - nbconvert=6.1.0=py37h89c1867_0

1.13.12 (2021-07-13)

Changes

  • Add csv files to Thredds filter

1.13.11 (2021-07-06)

Changes

  • Notebook deployment: allow specifying the required branch for any tutorial notebook repo in env.local.

    Example: set WORKFLOW_TESTS_BRANCH and any other notebook deploy config like PAVICS_LANDING_BRANCH in env.local (see the sketch after this list).

    To support testing of this PR Ouranosinc/PAVICS-e2e-workflow-tests#79.

  • jupyter: minor update to add unzip package

    unzip needed to test PAVICS-landing notebooks under Jenkins. No other package updates.

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#79 for more details.
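
    A hypothetical env.local excerpt pinning notebook repo branches (the variable names come from the first entry above; the branch values are placeholders):

    # env.local excerpt (sketch): pin tutorial notebook repos to specific branches
    export WORKFLOW_TESTS_BRANCH="my-feature-branch"
    export PAVICS_LANDING_BRANCH="master"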

1.13.10 (2021-06-30)

Changes

  • Add bump2version configuration to allow self-update of files that refer to new version releases and to apply updates to the features listed in this changelog.
  • Add this CHANGES.md file with all previous version details extracted for PR merge commit messages.
  • Add listing of change history to generated documentation on bird-house/birdhouse-deploy ReadTheDocs.
  • Update CONTRIBUTING.rst file to include a note about updating this changelog for future PRs.

Fixes

1.13.9 (2021-06-18)

  • jupyter: update for raven notebooks

    To deploy the new Jupyter env to PAVICS.

    Given it's an incremental build, these are the only differences:

    >   - intake-geopandas=0.2.4=pyhd8ed1ab_0
    >   - intake-thredds=2021.6.16=pyhd8ed1ab_0
    >   - intake-xarray=0.5.0=pyhd8ed1ab_0

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#76.

1.13.8 (2021-06-15)

  • jupyter: new version for updated ravenpy, birdy and xclim

    PR to deploy the new Jupyter env to PAVICS.

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#75 for more details.

    Changes

    <   - ravenpy=0.4.2=py37_1
    >   - ravenpy=0.5.2=pyh7f9bfb9_0
    
    # Renamed.
    <   - raven=3.0.4.318=hc9bffa2_2
    >   - raven-hydro=3.0.4.322=h516393e_0
    
    <   - ostrich=21.03.16=h2bc3f7f_0
    >   - ostrich=21.03.16=h4bd325d_1
    
    <   - xclim=0.25.0=pyhd8ed1ab_0
    >   - xclim=0.27.0=pyhd8ed1ab_0
    
    # Old version was from pip.
    <     - birdhouse-birdy==0.7.0
    >   - birdy=v0.8.0=pyh6c4a22f_0
    
    # Was previously included in another package, now it is standalone.
    >   - pydantic=1.8.2=py37h5e8e339_0
    
    # New libs for upcoming Raven notebooks
    >   - gcsfs=2021.6.0=pyhd8ed1ab_0
    >   - intake=0.6.2=pyhd8ed1ab_0
    >   - intake-esm=2021.1.15=pyhd8ed1ab_0
    >   - zarr=2.8.3=pyhd8ed1ab_0
    
    <   - xarray=0.17.0=pyhd8ed1ab_0
    >   - xarray=0.18.2=pyhd8ed1ab_0
    
    <   - owslib=0.23.0=pyhd8ed1ab_0
    >   - owslib=0.24.1=pyhd8ed1ab_0
    
    <   - cf_xarray=0.5.1=pyh44b312d_0
    >   - cf_xarray=0.5.2=pyh6c4a22f_0
    
    <   - clisops=0.6.3=pyh44b312d_0
    >   - clisops=0.6.5=pyh6c4a22f_0
    
    <   - dask=2021.2.0=pyhd8ed1ab_0
    >   - dask=2021.6.0=pyhd8ed1ab_0
    
    # Downgrade !
    <   - gdal=3.2.1=py37hc5bc4e4_7
    >   - gdal=3.1.4=py37h2ec2946_8
    
    # Downgrade !
    <   - rasterio=1.2.2=py37hd5c4cce_0
    >   - rasterio=1.2.1=py37ha549118_0
    
    <   - hvplot=0.7.1=pyh44b312d_0
    >   - hvplot=0.7.2=pyh6c4a22f_0
    
    <   - rioxarray=0.3.1=pyhd8ed1ab_0
    >   - rioxarray=0.4.1.post0=pyhd8ed1ab_0
    
    # Downgrade !
    <   - xskillscore=0.0.19=pyhd8ed1ab_0
    >   - xskillscore=0.0.18=py_1

    Full diff of conda env export: 210415-210527.1-update210615-conda-env-export.diff.txt

    Full new conda env export: 210527.1-update210615-conda-env-export.yml.txt

1.13.7 (2021-06-10)

  • jupyterhub: allow config override via env.local

    Overview

    This is basically the same as ENABLE_JUPYTERHUB_MULTI_NOTEBOOKS but at the bottom of the file so it can override everything.

    ENABLE_JUPYTERHUB_MULTI_NOTEBOOKS is kept for backward-compat.

    First useful application is to enable server culling for auto-shutdown of idle kernels and idle Jupyter single-user servers, hopefully fixing #67.

    The culling settings will only take effect the next time users restart their personal Jupyter server, because it seems the Jupyter server is the one culling itself. JupyterHub does not perform the culling; it simply forwards the culling settings to the Jupyter server.

    $ docker inspect jupyter-lvu --format '{{ .Args }}'
    [run -n birdy /usr/local/bin/start-notebook.sh --ip=0.0.0.0 --port=8888 --notebook-dir=/notebook_dir --SingleUserNotebookApp.default_url=/lab --debug --disable-user-config --NotebookApp.terminals_enabled=False --NotebookApp.shutdown_no_activity_timeout=180 --MappingKernelManager.cull_idle_timeout=180 --MappingKernelManager.cull_connected=True]
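
    A sketch of what such an override could look like in env.local (the variable name JUPYTERHUB_CONFIG_OVERRIDE and the exact append mechanism are assumptions, not confirmed by this entry; the flags mirror the docker inspect output above):

    # env.local excerpt (sketch): extra config appended to jupyterhub_config.py
    export JUPYTERHUB_CONFIG_OVERRIDE="
    c.Spawner.args.extend([
        '--NotebookApp.shutdown_no_activity_timeout=180',
        '--MappingKernelManager.cull_idle_timeout=180',
        '--MappingKernelManager.cull_connected=True',
    ])
    "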


    Tests

    Deployed to https://lvupavicsdev.ouranos.ca/jupyter (timeout set to 5 mins)

1.13.6 (2021-06-02)

  • Bugfix for autodeploy job

    The new code added with this merge created a new bug for the autodeploy job.

    From the autodeploy job's log:

    triggerdeploy START_TIME=2021-05-13T14:00:03+0000
    Error: DEPLOY_DATA_JOB_SCHEDULE not set
    

    If the AUTODEPLOY_NOTEBOOK_FREQUENCY variable is not set in the env.local file, it would create the error above. The variable is set in the default.env file, in case it is not defined in env.local, and is then used for the new env file from pavics-jupyter-base here. The error happens because default.env was not sourced in the triggerdeploy.sh script, so the variable was not set when running env.local.

    The solution was tested in a test environment and the cronjob seems to be fixed now.

    Tests were executed to see if the same situation could be found anywhere else. From what was observed, default.env seems to be called consistently before env.local. Only here, default.env didn't seem to be called; a default.env call has also been added in that file.
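
    The pattern of the fix, roughly (a sketch; the actual paths used by triggerdeploy.sh may differ):

    # triggerdeploy.sh (sketch): source defaults before the local overrides so
    # variables like AUTODEPLOY_NOTEBOOK_FREQUENCY always have a value.
    . ./birdhouse/default.env
    . ./birdhouse/env.local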

1.13.5 (2021-05-19)

  • magpie 3.x + gunicorn bind

1.13.4 (2021-05-18)

  • Update to raven 0.13.0

1.13.3 (2021-05-11)

    • Add new docker-compose optional components
      • optional-components/database-external-ports
      • optional-components/wps-healthchecks

    Following is the output when using optional-components/wps-healthchecks:

    ubuntu@daccs-instance-26730-daccsci:~$ pavics-compose ps
    reading './components/monitoring/default.env'
    reading './optional-components/testthredds/default.env'
    COMPOSE_CONF_LIST=-f docker-compose.yml -f ./components/monitoring/docker-compose-extra.yml -f ./optional-components/canarie-api-full-monitoring/docker-compose-extra.yml -f ./optional-components/all-public-access/docker-compose-extra.yml -f ./optional-components/testthredds/docker-compose-extra.yml -f ./optional-components/secure-thredds/docker-compose-extra.yml -f ./optional-components/wps-healthchecks/docker-compose-extra.yml -f ./optional-components/database-external-ports/docker-compose-extra.yml
         Name                    Command                  State                                                                                Ports
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    alertmanager      /bin/alertmanager --config ...   Up             0.0.0.0:9093->9093/tcp
    cadvisor          /usr/bin/cadvisor -logtostderr   Up (healthy)   0.0.0.0:9999->8080/tcp
    catalog           /bin/sh -c python /home/do ...   Up (healthy)   0.0.0.0:8086->80/tcp
    finch             gunicorn --bind=0.0.0.0:50 ...   Up (healthy)   0.0.0.0:8095->5000/tcp
    flyingpigeon      /bin/bash -c source activa ...   Up (healthy)   0.0.0.0:8093->8093/tcp
    frontend          /bin/sh -c /bin/bash ./bin ...   Up             0.0.0.0:3000->3000/tcp
    geoserver         /entrypointwrapper               Up             0.0.0.0:8087->8080/tcp
    grafana           /run.sh                          Up             0.0.0.0:3001->3000/tcp
    hummingbird       /usr/bin/tini -- make upda ...   Up (healthy)   0.0.0.0:28097->28097/tcp, 0.0.0.0:38097->38097/tcp, 8000/tcp, 8080/tcp, 0.0.0.0:8097->8097/tcp, 8443/tcp, 0.0.0.0:48097->9001/tcp
    jupyterhub        jupyterhub                       Up             0.0.0.0:8800->8000/tcp
    magpie            /bin/sh -c crond -c $CRON_ ...   Up             0.0.0.0:2001->2001/tcp
    malleefowl        /usr/bin/tini -- make upda ...   Up (healthy)   0.0.0.0:28091->28091/tcp, 0.0.0.0:38091->38091/tcp, 8000/tcp, 8080/tcp, 0.0.0.0:8091->8091/tcp, 8443/tcp, 0.0.0.0:48091->9001/tcp
    mongodb           /entrypoint.sh bash -c cho ...   Up             0.0.0.0:27017->27017/tcp
    ncwms2            /usr/bin/tini -- make upda ...   Up             0.0.0.0:8080->8080/tcp, 0.0.0.0:48080->9001/tcp
    node-exporter     /bin/node_exporter --path. ...   Up
    phoenix           /usr/bin/tini -- make upda ...   Up             0.0.0.0:38443->38443/tcp, 8000/tcp, 8080/tcp, 0.0.0.0:8081->8081/tcp, 0.0.0.0:8443->8443/tcp, 0.0.0.0:9001->9001/tcp
    portainer         /portainer                       Up             0.0.0.0:9000->9000/tcp
    postgis           /bin/sh -c /start-postgis.sh     Up             5432/tcp
    postgres          docker-entrypoint.sh postgres    Up             0.0.0.0:5432->5432/tcp
    postgres-magpie   docker-entrypoint.sh postgres    Up             0.0.0.0:5433->5432/tcp
    project-api       /bin/sh -c npm run bootstr ...   Up             0.0.0.0:3005->3005/tcp
    prometheus        /bin/prometheus --config.f ...   Up             0.0.0.0:9090->9090/tcp
    proxy             /entrypoint                      Up             0.0.0.0:443->443/tcp, 0.0.0.0:80->80/tcp, 0.0.0.0:58079->8079/tcp, 0.0.0.0:58086->8086/tcp, 0.0.0.0:58091->8091/tcp, 0.0.0.0:58093->8093/tcp,
                                                                      0.0.0.0:58094->8094/tcp
    raven             /bin/bash -c source activa ...   Up (healthy)   0.0.0.0:8096->9099/tcp
    solr              /usr/bin/tini -- /bin/sh - ...   Up             0.0.0.0:8983->8983/tcp, 0.0.0.0:48983->9001/tcp
    testthredds       /entrypointwrapper               Up (healthy)   0.0.0.0:8084->8080/tcp, 8443/tcp
    thredds           /entrypointwrapper               Up (healthy)   0.0.0.0:8083->8080/tcp, 8443/tcp
    twitcher          pserve /opt/birdhouse/src/ ...   Up             0.0.0.0:8000->8000/tcp, 8080/tcp, 8443/tcp, 9001/tcp
    

1.13.2 (2021-05-11)

  • Custom notebooks

1.13.1 (2021-05-10)

1.13.0 (2021-05-06)

  • bump default log retention to 500m instead of 2m, more suitable for prod

    Overview

    Bump default log retention to 500m instead of 2m, more suitable for prod

    Forgot to push during PR #152.

    Changes

    Non-breaking changes

    • Bump default log retention to 500m instead of 2m, more suitable for prod

1.12.4 (2021-05-06)

1.12.3 (2021-05-04)

  • Change overview:
    • allow customization of /data persistence root on disk, retaining current default for existing deployment
    • add data persistence for mongodb container

1.12.2 (2021-04-28)

  • Add contributions guideline and policy

1.12.1 (2021-04-28)

  • proxy: allow homepage (location /) to be configurable

1.12.0 (2021-04-19)

  • Magpie upgrade strike II

    Strike II of this original PR #107.

    Matching notebook fix Ouranosinc/pavics-sdi#218

    Performed test upgrade on staging (Medus) using prod (Boreas) Magpie DB, everything went well and Jenkins passed (http://jenkins.ouranos.ca/job/ouranos-staging/job/medus.ouranos.ca/80/parameters/). This Jenkins build uses the corresponding branch in Ouranosinc/pavics-sdi#218 and with TEST_MAGPIE_AUTH enabled.

    Manual upgrade migration procedure (a condensed shell sketch of the data-handling steps follows the list):

    1. Save /data/magpie_persist folder from prod pavics.ouranos.ca: cd /data; tar czf magpie_persist.prod.tgz magpie_persist
    2. scp magpie_persist.prod.tgz to medus
    3. login to medus
    4. cd /path/to/birdhouse-deploy/birdhouse
    5. ./pavics-compose.sh down
    6. git checkout master
    7. cd /data
    8. rm -rf magpie_persist
    9. tar xzf magpie_persist.prod.tgz # restore Magpie DB with prod version
    10. cd /path/to/birdhouse-deploy/birdhouse
    11. ./pavics-compose.sh up -d
    12. Update env.local MAGPIE_ADMIN_PASSWORD with the prod password so Twitcher can access Magpie, since we just restored the Magpie DB from prod
    13. ./pavics-compose.sh restart twitcher # for Twitcher to get new Magpie admin passwd
    14. Baseline working state: trigger Jenkins test suite, ensure all pass except pavics_thredds.ipynb that requires new Magpie
    15. Baseline working state: view existing services permissions on group Anonymous (https://medus.ouranos.ca/magpie/ui/groups/anonymous/default)
    16. git checkout restore-previous-broken-magpie-upgrade-so-we-can-work-on-a-fix # This current branch
    17. ./pavics-compose.sh up -d # upgrade to new Magpie
    18. docker logs magpie: check no DB migration error
    19. Trigger Jenkins test suite again
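
    A condensed shell sketch of the data-handling steps above (steps 1 to 11), using the same paths as the list:

    # on prod: snapshot the Magpie DB folder, then copy it over
    cd /data; tar czf magpie_persist.prod.tgz magpie_persist
    scp magpie_persist.prod.tgz medus:
    # on medus: stop the stack, restore the prod DB, bring the stack back up
    cd /path/to/birdhouse-deploy/birdhouse
    ./pavics-compose.sh down
    git checkout master
    cd /data; rm -rf magpie_persist; tar xzf magpie_persist.prod.tgz
    cd /path/to/birdhouse-deploy/birdhouse
    ./pavics-compose.sh up -d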

1.11.29 (2021-04-16)

  • Update Raven and Jupyter env for Raven demo

    Raven release notes in PR Ouranosinc/raven#374 and Ouranosinc/raven#382

    Jupyter env update in PR Ouranosinc/PAVICS-e2e-workflow-tests#71

    Other fixes:

    • Fix intermittent Jupyter spawning errors by doubling various timeout configs (the problem is intermittent and hard to test, so we are not sure which of the timeouts fixed it)
    • Fix Finch and Raven "Broken pipe" errors when the request size is larger than the default 3mb (bumped to 100mb) (fixes Ouranosinc/raven#361 and the related Finch comment)
    • Lower the chance of "Max connection" errors for Finch and Raven (bump parallelprocesses from 2 to 10). In prod, the server has the CPU needed to run 10 concurrent requests, so this prevents users from having to wait for each other (see the config excerpt after this list).
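
    As a rough illustration only, the two bumps above correspond to PyWPS server settings along these lines (an excerpt sketch, not the exact config files shipped for Finch and Raven):

    [server]
    # accept requests up to 100mb instead of the 3mb default
    maxrequestsize=100mb
    # allow 10 concurrent WPS processes instead of 2
    parallelprocesses=10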

1.11.28 (2021-04-09)

  • jupyter: update for new clisops, xclim, ravenpy

    Matching PR to deploy the new Jupyter env to PAVICS.

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#68 for more info.

    Relevant changes:

    <   - clisops=0.5.1=pyhd3deb0d_0
    >   - clisops=0.6.3=pyh44b312d_0
    
    <   - xclim=0.23.0=pyhd8ed1ab_0
    >   - xclim=0.25.0=pyhd8ed1ab_0
    
    >   - ostrich=0.1.2=h2bc3f7f_0
    >   - raven=0.1.1=h2bc3f7f_0
    
    <     - ravenpy==0.2.3  # from pip
    >   - ravenpy=0.3.1=py37_0  # from conda
    
    >   - aiohttp=3.7.4=py37h5e8e339_0
    
    <   - roocs-utils=0.1.5=pyhd3deb0d_1
    >   - roocs-utils=0.3.0=pyh6c4a22f_0
    
    <   - cf_xarray=0.4.0=pyh44b312d_0
    >   - cf_xarray=0.5.1=pyh44b312d_0
    
    <   - rioxarray=0.2.0=pyhd8ed1ab_0
    >   - rioxarray=0.3.1=pyhd8ed1ab_0
    
    <   - xarray=0.16.2=pyhd8ed1ab_0
    >   - xarray=0.17.0=pyhd8ed1ab_0
    
    <   - geopandas=0.8.2=pyhd8ed1ab_0
    >   - geopandas=0.9.0=pyhd8ed1ab_0
    
    <   - gdal=3.1.4=py37h2ec2946_5
    >   - gdal=3.2.1=py37hc5bc4e4_7
    
    <   - jupyter_conda=4.1.0=hd8ed1ab_1
    >   - jupyter_conda=5.0.0=hd8ed1ab_0
    
    <   - python=3.7.9=hffdb5ce_100_cpython
    >   - python=3.7.10=hffdb5ce_100_cpython

1.11.27 (2021-04-01)

1.11.26 (2021-03-31)

  • Update canarieAPI doc links

    • Updated components' version numbers.
    • Replaced links to github.io docs with links to readthedocs.
    • renderer is provided by THREDDS-WMS.
    • slicer is provided by finch.

1.11.25 (2021-03-26)

  • finch: update to version 0.7.1

    See Finch release PR bird-house/finch#164 for more release info.

    This update will fix the following Jenkins error introduced by bird-house/finch#161 (comment):

    12:37:00  _________ finch-master/docs/source/notebooks/finch-usage.ipynb::Cell 1 _________
    12:37:00  Notebook cell execution failed
    12:37:00  Cell 1: Cell outputs differ
    12:37:00
    12:37:00  Input:
    12:37:00  help(wps.frost_days)
    12:37:00
    12:37:00  Traceback:
    12:37:00   mismatch 'stdout'
    12:37:00
    12:37:00   assert reference_output == test_output failed:
    12:37:00
    12:37:00    'Help on meth...ut files.\n\n' == 'Help on meth...ut files.\n\n'
    12:37:00    Skipping 70 identical leading characters in diff, use -v to show
    12:37:00    - min=None, missing_options=None, check_missing='any', thresh='0 degC', freq='YS', variable=None, output_formats=None) method of birdy.client.base.WPSClient instance
    12:37:00    + min=None, check_missing='any', cf_compliance='warn', data_validation='raise', thresh='0 degC', freq='YS', missing_options=None, variable=None, output_formats=None) method of birdy.client.base.WPSClient instance
    12:37:00          Number of days where daily minimum temperatures are below 0.
    12:37:00
    12:37:00          Parameters
    12:37:00          ----------
    12:37:00          tasmin : ComplexData:mimetype:`application/x-netcdf`, :mimetype:`application/x-ogc-dods`
    12:37:00              NetCDF Files or archive (tar/zip) containing netCDF files.
    12:37:00          thresh : string
    12:37:00              Freezing temperature.
    12:37:00          freq : {'YS', 'MS', 'QS-DEC', 'AS-JUL'}string
    12:37:00              Resampling frequency.
    12:37:00          check_missing : {'any', 'wmo', 'pct', 'at_least_n', 'skip', 'from_context'}string
    12:37:00              Method used to determine which aggregations should be considered missing.
    12:37:00          missing_options : ComplexData:mimetype:`application/json`
    12:37:00              JSON representation of dictionary of missing method parameters.
    12:37:00    +     cf_compliance : {'log', 'warn', 'raise'}string
    12:37:00    +         Whether to log, warn or raise when inputs have non-CF-compliant attributes.
    12:37:00    +     data_validation : {'log', 'warn', 'raise'}string
    12:37:00    +         Whether to log, warn or raise when inputs fail data validation checks.
    12:37:00          variable : string
    12:37:00              Name of the variable in the NetCDF file.
    12:37:00
    12:37:00          Returns
    12:37:00          -------
    12:37:00          output_netcdf : ComplexData:mimetype:`application/x-netcdf`
    12:37:00              The indicator values computed on the original input grid.
    12:37:00          output_log : ComplexData:mimetype:`text/plain`
    12:37:00              Collected logs during process run.
    12:37:00          ref : ComplexData:mimetype:`application/metalink+xml; version=4.0`
    12:37:00              Metalink file storing all references to output files.
    

    Jenkins build with Finch notebooks passing against newer Finch: http://jenkins.ouranos.ca/job/ouranos-staging/job/lvupavics.ouranos.ca/45/console

1.11.24 (2021-03-19)

  • Avoid docker pull since there is a pull rate limit on DockerHub

    Pin the bash image tag so runs are reproducible (previously they were only loosely reproducible since we always pulled the implicit "latest" tag).

    Avoid the following error:

    + docker pull bash
    Using default tag: latest
    Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
    
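
    A minimal sketch of the change: reference an explicitly pinned tag instead of the implicit latest (the tag below is illustrative, not necessarily the one pinned in the repo):

    # before: always pulling the moving "latest" tag hits the rate limit
    docker pull bash
    # after: pin an explicit tag; once present locally, no pull is needed
    docker run --rm bash:5.0 bash -c 'echo already cached, no pull required'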

1.11.23 (2021-03-17)

  • Custom Jupyter user images

    Adds CRIM's nlp and eo images to the available list of images in JupyterHub

    The base image (pavics-jupyter-base) wasn't added to the list, because it is assumed the users will always be using the other more specialized images.

    We were already able to add/override Jupyter images, but this PR makes it more integrated: those images will also be pulled in advance, so startup is much faster for big images since they will already be cached.

    Backward incompatible change: DOCKER_NOTEBOOK_IMAGE is renamed to DOCKER_NOTEBOOK_IMAGES and is now a space-separated list of images. Any existing override in env.local using the old name will have to switch to the new name, as in the example below.
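
    For example, a migrated env.local override could look like this (the image names are illustrative):

    # old (single image):
    # export DOCKER_NOTEBOOK_IMAGE="pavics/workflow-tests:210216"
    # new (space-separated list; all images are pre-pulled and cached):
    export DOCKER_NOTEBOOK_IMAGES="pavics/workflow-tests:210216 pavics/crim-jupyter-eo:latest pavics/crim-jupyter-nlp:latest"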

1.11.22 (2021-03-16)

1.11.21 (2021-02-19)

  • Configurable Jupyterhub README

    While the README.ipynb provided by birdhouse-deploy is good, it does not quite fit our needs at PCIC. This PR allows users to configure their own README for Jupyterhub.

    Changes

    • Adds JUPYTERHUB_README as a configuration option in the appropriate spots

1.11.20 (2021-02-19)

  • jupyter: update to version 210216 for xESMF

    Matching PR to deploy Ouranosinc/PAVICS-e2e-workflow-tests#61 to PAVICS.

    For regridding notebook, see Ouranosinc/pavics-sdi#201 (comment).

    Noticeable changes:

    >   - xesmf=0.5.2=pyhd8ed1ab_0
    
    <   - owslib=0.21.0=pyhd8ed1ab_0
    >   - owslib=0.23.0=pyhd8ed1ab_0
    
    <   - cftime=1.3.1=py37h6323ea4_0
    >   - cftime=1.4.1=py37h902c9e0_0
    
    <   - dask=2021.1.1=pyhd8ed1ab_0
    >   - dask=2021.2.0=pyhd8ed1ab_0
    
    <   - rioxarray=0.1.1=pyhd8ed1ab_0
    >   - rioxarray=0.2.0=pyhd8ed1ab_0

1.11.19 (2021-02-10)

  • proxy: proxy_read_timeout config should be configurable

    We have a performance problem with the production deployment at Ouranos, so we need a longer timeout. Being an Ouranos-specific need, it should not be hardcoded as in the previous PR #122.

    The previous increase was sometimes not enough!

    The value is now configurable via env.local, like most other customizations. Documentation updated.
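
    A minimal env.local sketch; the exact variable name below is an assumption based on this PR's description, so check env.local.example for the authoritative name:

    # bump nginx's proxy_read_timeout beyond the default to tolerate slow WPS responses
    export PROXY_READ_TIMEOUT_VALUE="240s"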

    Timeout in Prod:

    WPS_URL=https://pavics.ouranos.ca/twitcher/ows/proxy/raven/wps FINCH_WPS_URL=https://pavics.ouranos.ca/twitcher/ows/proxy/finch/wps FLYINGPIGEON_WPS
    _URL=https://pavics.ouranos.ca/twitcher/ows/proxy/flyingpigeon/wps pytest --nbval-lax --verbose docs/source/notebooks/Running_HMETS_with_CANOPEX_datas
    et.ipynb --sanitize-with docs/source/output-sanitize.cfg --ignore docs/source/notebooks/.ipynb_checkpoints
    
    HTTPError: 504 Server Error: Gateway Time-out for url: https://pavics.ouranos.ca/twitcher/ows/proxy/raven/wps
    
    ===================================================== 11 failed, 4 passed, 1 warning in 249.80s (0:04:09) ===========================================
    

    Pass easily on my test VM with very modest hardware (10G ram, 2 cpu):

    WPS_URL=https://lvupavicsmaster.ouranos.ca/twitcher/ows/proxy/raven/wps FINCH_WPS_URL=https://lvupavicsmaster.ouranos.ca/twitcher/ows/proxy/finch/wp
    s FLYINGPIGEON_WPS_URL=https://lvupavicsmaster.ouranos.ca/twitcher/ows/proxy/flyingpigeon/wps pytest --nbval-lax --verbose docs/source/notebooks/Runni
    ng_HMETS_with_CANOPEX_dataset.ipynb --sanitize-with docs/source/output-sanitize.cfg --ignore docs/source/notebooks/.ipynb_checkpoints
    
    =========================================================== 15 passed, 1 warning in 33.84s ===========================================================
    

    Pass against Medus:

    WPS_URL=https://medus.ouranos.ca/twitcher/ows/proxy/raven/wps FINCH_WPS_URL=https://medus.ouranos.ca/twitcher/ows/proxy/finch/wps FLYINGPIGEON_WPS_URL=https://medus.ouranos.ca/twitcher/ows/proxy/flyingpigeon/wps pytest --nbval-lax --verbose docs/source/notebooks/Running_HMETS_with_CANOPEX_dataset.ipynb --sanitize-with docs/source/output-sanitize.cfg --ignore docs/source/notebooks/.ipynb_checkpoints
    
    ============================================== 15 passed, 1 warning in 42.44s =======================================================
    

    Pass against hirondelle.crim.ca:

    WPS_URL=https://hirondelle.crim.ca/twitcher/ows/proxy/raven/wps FINCH_WPS_URL=https://hirondelle.crim.ca/twitcher/ows/proxy/finch/wps FLYINGPIGEON_WPS_URL=https://hirondelle.crim.ca/twitcher/ows/proxy/flyingpigeon/wps pytest --nbval-lax --verbose docs/source/notebooks/Running_HMETS_with_CANOPEX_dataset.ipynb --sanitize-with docs/source/output-sanitize.cfg --ignore docs/source/notebooks/.ipynb_checkpoints
    
    =============================================== 15 passed, 1 warning in 35.61s ===============================================
    

    For comparison, a run on Prod without Twitcher (PR bird-house/birdhouse-deploy-ouranos#5):

    WPS_URL=https://pavics.ouranos.ca/raven/wps FINCH_WPS_URL=https://pavics.ouranos.ca/twitcher/ows/proxy/finch/wps FLYINGPIGEON_WPS_URL=https://pavics
    .ouranos.ca/twitcher/ows/proxy/flyingpigeon/wps pytest --nbval-lax --verbose docs/source/notebooks/Running_HMETS_with_CANOPEX_dataset.ipynb --sanitize
    -with docs/source/output-sanitize.cfg --ignore docs/source/notebooks/.ipynb_checkpoints
    
    HTTPError: 504 Server Error: Gateway Time-out for url: https://pavics.ouranos.ca/raven/wps
    
    ================================================ 11 failed, 4 passed, 1 warning in 248.99s (0:04:08) =================================================
    

    A run on Prod without Twitcher and Nginx (direct hit Raven):

    WPS_URL=http://pavics.ouranos.ca:8096/ FINCH_WPS_URL=https://pavics.ouranos.ca/twitcher/ows/proxy/finch/wps FLYINGPIGEON_WPS_URL=https://pavics.oura
    nos.ca/twitcher/ows/proxy/flyingpigeon/wps pytest --nbval-lax --verbose docs/source/notebooks/Running_HMETS_with_CANOPEX_dataset.ipynb --sanitize-with
     docs/source/output-sanitize.cfg --ignore docs/source/notebooks/.ipynb_checkpoints
    
    ===================================================== 15 passed, 1 warning in 218.46s (0:03:38) ======================================================
    
    

1.11.18 (2021-02-02)

  • update Raven and Jupyter env

    See https://github.com/Ouranosinc/raven/compare/v0.10.0...v0.11.1 for change details.

    Jupyter env change details: Ouranosinc/PAVICS-e2e-workflow-tests#60

    Jenkins run (this Jupyter env pavics/workflow-tests:210201.2 against a devel version of Raven 0.11.1 + --nbval-lax) http://jenkins.ouranos.ca/job/PAVICS-e2e-workflow-tests/job/test-nbval-lax-DO_NOT_MERGE/4/console

    Only known error:

    20:25:45  =========================== short test summary info ============================
    20:25:45  FAILED pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb::Cell 1
    20:25:45  FAILED pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb::Cell 2
    20:25:45  FAILED pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb::Cell 3
    20:25:45  FAILED pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb::Cell 4
    20:25:45  FAILED pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb::Cell 5
    20:25:45  FAILED pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb::Cell 6
    20:25:45  FAILED pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb::Cell 7
    20:25:45  FAILED raven-master/docs/source/notebooks/Bias_correcting_climate_data.ipynb::Cell 8
    20:25:45  FAILED raven-master/docs/source/notebooks/Bias_correcting_climate_data.ipynb::Cell 9
    20:25:45  FAILED raven-master/docs/source/notebooks/Bias_correcting_climate_data.ipynb::Cell 10
    20:25:45  FAILED raven-master/docs/source/notebooks/Bias_correcting_climate_data.ipynb::Cell 11
    20:25:45  FAILED raven-master/docs/source/notebooks/Full_process_example_1.ipynb::Cell 13
    20:25:45  FAILED raven-master/docs/source/notebooks/Full_process_example_1.ipynb::Cell 17
    20:25:45  FAILED raven-master/docs/source/notebooks/Full_process_example_1.ipynb::Cell 18
    20:25:45  FAILED raven-master/docs/source/notebooks/Full_process_example_1.ipynb::Cell 19
    20:25:45  FAILED raven-master/docs/source/notebooks/Full_process_example_1.ipynb::Cell 20
    20:25:45  FAILED raven-master/docs/source/notebooks/Full_process_example_1.ipynb::Cell 21
    20:25:45  FAILED raven-master/docs/source/notebooks/Multiple_watersheds_simulation.ipynb::Cell 1
    20:25:45  FAILED raven-master/docs/source/notebooks/Multiple_watersheds_simulation.ipynb::Cell 3
    20:25:45  FAILED raven-master/docs/source/notebooks/Multiple_watersheds_simulation.ipynb::Cell 4
    20:25:45  FAILED raven-master/docs/source/notebooks/Multiple_watersheds_simulation.ipynb::Cell 5
    20:25:45  FAILED raven-master/docs/source/notebooks/Region_selection.ipynb::Cell 7
    20:25:45  FAILED raven-master/docs/source/notebooks/Region_selection.ipynb::Cell 8
    20:25:45  FAILED raven-master/docs/source/notebooks/Subset_climate_data_over_watershed.ipynb::Cell 5
    20:25:45  ============ 24 failed, 226 passed, 2 skipped in 2528.69s (0:42:08) ============
    

1.11.17 (2021-01-28)

1.11.16 (2021-01-14)

1.11.15 (2021-01-14)

  • jupyter: update to version 201214

    Matching PR to deploy the new Jupyter env in PR Ouranosinc/PAVICS-e2e-workflow-tests#56 to PAVICS.

    Relevant changes:

    >   - cfgrib=0.9.8.5=pyhd8ed1ab_0
    
    <   - clisops=0.3.1=pyh32f6830_1
    >   - clisops=0.4.0=pyhd3deb0d_0
    
    <   - dask=2.30.0=py_0
    >   - dask=2020.12.0=pyhd8ed1ab_0
    
    <   - owslib=0.20.0=py_0
    >   - owslib=0.21.0=pyhd8ed1ab_0
    
    <   - xarray=0.16.1=py_0
    >   - xarray=0.16.2=pyhd8ed1ab_0
    
    <   - xclim=0.21.0=py_0
    >   - xclim=0.22.0=pyhd8ed1ab_0
    
    <   - jupyter_conda=3.4.1=pyh9f0ad1d_0
    >   - jupyter_conda=4.1.0=hd8ed1ab_1

1.11.14 (2020-12-17)

  • Add ability to execute post actions for deploy-data script.

    The deploy-data script was previously introduced in PR #72 to deploy any files from any git repos to the local host it runs on.

    Now it gains the ability to run commands from the git repo it has just pulled.

    Being able to run commands opens new possibilities:

    combining this deploy-data with the scheduler component means cronjobs can automatically execute the most up-to-date version of any scripts from any git repos.

1.11.13 (2020-12-14)

  • jupyterhub: update to version 1.3.0 to include login terms patch

    This version of jupyterhub includes the login terms patch originally introduced in commit 8be8eeac211d3f5c2de620781db8832fdb8f9093 of PR #104.

    This official login terms feature has a few enhancements over the patch (see jupyterhub/jupyterhub#3264):

    • no javascript dependency
    • pop-up reminder for user to check the checkbox

    The behavior change: the "Sign in" button is no longer disabled when the checkbox is unchecked. Clicking it simply does nothing and reminds the user to check the checkbox.

    Before/after screen recordings are attached to the original PR.

1.11.12 (2020-11-25)

  • Fix geoserver not configured properly behind proxy.

    Hitting https://pavics.ouranos.ca/geoserver/wfs?request=GetCapabilities&version=1.1.0

    Before fix (wrong scheme and wrong port):

    <ows:Operation name="GetCapabilities">
    <ows:DCP>
    <ows:HTTP>
    <ows:Get xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
    <ows:Post xlink:href="http://pavics.ouranos.ca:80/geoserver/wfs"/>
    </ows:HTTP>
    </ows:DCP>
    

    After fix:

    <ows:Operation name="GetCapabilities">
    <ows:DCP>
    <ows:HTTP>
    <ows:Get xlink:href="https://pavics.ouranos.ca:443/geoserver/wfs"/>
    <ows:Post xlink:href="https://pavics.ouranos.ca:443/geoserver/wfs"/>
    </ows:HTTP>
    </ows:DCP>
    

    This config automates the manual step of setting the proxy base URL in the Geoserver UI, see https://docs.geoserver.org/2.9.3/user/configuration/globalsettings.html#proxy-base-url

    I had to override the docker image entrypoint to edit the server.xml on the fly before starting Geoserver (Tomcat) since setting Java proxy config did not seem to work (see first commit).
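
    A minimal sketch of such an entrypoint wrapper (the sed expression, file paths, and the name of the wrapped entrypoint are all illustrative; the actual edit in the repo may differ):

    #!/bin/sh
    # inject proxy attributes into Tomcat's HTTP connector before starting Geoserver
    sed -i 's|<Connector port="8080"|<Connector port="8080" proxyName="pavics.ouranos.ca" proxyPort="443" scheme="https"|' \
        /usr/local/tomcat/conf/server.xml
    # then hand control back to the image's original entrypoint (hypothetical name)
    exec /original-entrypoint "$@"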

    Related to Ouranosinc/raven#297.

1.11.11 (2020-11-20)

  • Various small fixes.

    monitoring: prevent losing stats when the VM auto-starts after a power failure

    check-instance-ready: new script to smoke-test an instance (used in bootstrap-instance-for-testsuite for our automation pipeline).

    jupyter: add CATALOG_USERNAME and anonymous to the blocked_users list for security. See the discussion in #102.

    They are not real Jupyter users and their password is known.

    See config/magpie/permissions.cfg.template that created those users.

    Tested:

    [W 2020-11-20 13:25:18.924 JupyterHub auth:487] User 'admin-catalog' blocked. Stop authentication
    [W 2020-11-20 13:25:18.924 JupyterHub base:752] Failed login for admin-catalog

    [W 2020-11-20 13:49:18.069 JupyterHub auth:487] User 'anonymous' blocked. Stop authentication
    [W 2020-11-20 13:49:18.070 JupyterHub base:752] Failed login for anonymous

1.11.10 (2020-11-18)

  • Add terms conditions to JupyterHub login page and update to latest JupyterHub version.

    Users have to check a checkbox agreeing to the terms and conditions in order to log in (fixes Ouranosinc/pavics-sdi#188).

    Users have to accept the terms and conditions (the checkbox) each time they log in. However, if a user does not log out or wipe their browser cookies, the next time they navigate to the login page they are logged right in: no password is asked, so there are no terms and conditions to accept either.

    This behavior is optional and only enabled if JUPYTER_LOGIN_TERMS_URL in env.local is set.
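
    For example, in env.local (the URL is a placeholder):

    # enable the login terms checkbox by pointing it to a terms and conditions page
    export JUPYTER_LOGIN_TERMS_URL="https://example.org/terms-and-conditions"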

    Had to patch the login.html template from jupyterhub docker image for this feature (PR jupyterhub/jupyterhub#3264).

    Also update jupyterhub docker image to latest version.

    Deployed to my test server https://lvupavics.ouranos.ca/jupyter/hub/login (pointing to a bogus terms and conditions link for now).

    Tested on Firefox and Google Chrome.

    Tested that upgrade from jupyterhub 1.0.0 to 1.2.1 is completely transparent to already logged in jupyter users.

    [D 2020-11-18 19:53:52.517 JupyterHub app:2055] Verifying that lvu is running at http://172.18.0.3:8888/jupyter/user/lvu/
    [D 2020-11-18 19:53:52.523 JupyterHub utils:220] Server at http://172.18.0.3:8888/jupyter/user/lvu/ responded with 302
    [D 2020-11-18 19:53:52.523 JupyterHub _version:76] jupyterhub and jupyterhub-singleuser both on version 1.2.1
    [I 2020-11-18 19:53:52.524 JupyterHub app:2069] lvu still running
    

    A screen recording is attached to the original PR.

1.11.9 (2020-11-13)

  • jupyter: new image with 4 new extensions

    The Google Drive extension for JupyterLab requires a settings file containing the client ID of the project created on developers.google.com, which grants authorization to use Google Drive.

    This PR's role is to include this file in the birdhouse configs.

    Matching PR Ouranosinc/PAVICS-e2e-workflow-tests#54 (commit 5d5a9aa2251386378406efb5b414b3aa6db0b37e) for the new image with 4 new extensions: jupytext, jupyterlab-google-drive, jupyter_conda and jupyterlab-git

    Matching PR Ouranosinc/pavics-sdi#185 for documentation about the new extensions.

1.11.8 (2020-11-06)

  • bump finch to version-0.5.3

1.11.7 (2020-11-06)

  • bump thredds-docker to 4.6.15

1.11.6 (2020-11-06)

  • Prepare fresh deployment for automated tests.

    @MatProv is building an automated pipeline that will provision and deploy a full PAVICS stack and run our Jenkins test suite for each PR here.

    So each time his fresh new instance comes up, there are a few steps to perform for the Jenkins test suite to pass. Those steps are captured in scripts/bootstrap-instance-for-testsuite. @MatProv please call this script rather than performing each step yourself, so any future changes to those steps remain transparent to your pipeline. A new optional component was also required, done in PR #92.
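
    For the pipeline, the invocation is simply (presumably from the repo checkout):

    # one-stop wrapper around all the individual preparation steps
    ./scripts/bootstrap-instance-for-testsuite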

    For security reasons, Jupyterhub will block the test user to login since its password is known publicly.

    Each step is also in its own script, so the steps can be assembled differently to prepare the fresh instance if desired.

    The Solr query in the canarie monitoring was also updated to target the minimal dataset from bootstrap-testdata, so the canarie monitoring page works on all PAVICS deployments (fixes #6). @MatProv you can use this canarie monitoring page (ex: https://pavics.ouranos.ca/canarie/node/service/status) to confirm the fresh instance is ready to run the Jenkins test suite.

1.11.5 (2020-10-27)

1.11.4 (2020-10-15)

  • Sync Raven testdata to Thredds for Raven tutorial notebooks.

    Leveraging the cron daemon of the scheduler component, sync Raven testdata to Thredds for Raven tutorial notebooks.

    Activation of the pre-configured cronjob is via env.local as usual for infra-as-code.

    The new generic deploy-data script can clone any number of git repos, sync any number of folders from each git repo to any number of local folders, with the ability to cherry-pick just the few files needed (Raven testdata has many types of files; we only need to sync the .nc files to Thredds, to avoid polluting the Thredds storage /data/datasets/testdata/raven).

    Limitation of the first version of this deploy-data script:

    • Does not handle re-organizing the file layout; this is a pure sync with very limited rsync filtering for now (tutorial notebooks deploy from multiple repos and need the file layout re-organized)

    So the script has room to grow. I see it as a generic solution to the repeated problem of "take files from various git repos and deploy them somewhere automatically". If we need to deploy another repo, just write a new config file instead of writing boilerplate code again.
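
    Purely to illustrate the idea, a config for another repo could look something like this; the variable names are hypothetical, not the actual deploy-data config format:

    # hypothetical deploy-data config sketch (not the real variable names)
    GIT_REPO_URL="https://github.com/Ouranosinc/raven-testdata"
    # sync only the .nc files from the repo into the Thredds data dir
    SYNC_RULES="*.nc:/data/datasets/testdata/raven"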

    Minor unrelated change in this PR:

    • README update to reference the new birdhouse-deploy-ouranos.
    • Make sourcing the various pre-configured cronjob backward-compat with older version of the repo where those cronjob did not exist yet.

1.11.3 (2020-09-28)

1.11.2 (2020-09-15)

  • Auto-renew LetsEncrypt SSL certificate.

    Auto-renew LetsEncrypt SSL certificate leveraging the cron jobs of the "scheduler" component. Meaning this feature is self-contained in the PAVICS stack, no dependency on the host's cron jobs.

    Default behavior is to attempt renewal every day. The certbot client in renew mode will not hit the LetsEncrypt server if renewal is not yet allowed (i.e. not within 1 month of expiry), so this should not put too much stress on the LetsEncrypt server. However, it gives us 30 retry opportunities (1 month) if something goes wrong on the first try.

    All configs are centralized in env.local, easing reproducibility on multiple deployments of PAVICS and following infra-as-code.

    Users can still perform the renewal manually by calling certbotwrapper directly. Users are not forced to enable the "scheduler" component, but then they miss out on the automatic renewal.
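
    Judging from the paths in the logs below, the wrapper lives under the deployment folder, so a manual run from the repo root looks like:

    # manual renewal; certbot itself decides whether renewal is actually due
    ./birdhouse/deployment/certbotwrapper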

    Documentation for activating this automatic renewal is in env.local.example.

    See vagrant-utils/configure-pavics.sh for how it's being used for real in a Vagrant box.

    Logs (/var/log/PAVICS/renew_letsencrypt_ssl.log) when no renewal is necessary, proxy down time less than 1 minute: certbot-renew-no-ops.txt

    ==========
    certbotwrapper START_TIME=2020-09-11T01:20:02+0000
    + realpath /vagrant/birdhouse/deployment/certbotwrapper
    + THIS_FILE=/vagrant/birdhouse/deployment/certbotwrapper
    + dirname /vagrant/birdhouse/deployment/certbotwrapper
    + THIS_DIR=/vagrant/birdhouse/deployment
    + pwd
    + SAVED_PWD=/
    + . /vagrant/birdhouse/deployment/../default.env
    + export 'DOCKER_NOTEBOOK_IMAGE=pavics/workflow-tests:200803'
    + export 'FINCH_IMAGE=birdhouse/finch:version-0.5.2'
    + export 'THREDDS_IMAGE=unidata/thredds-docker:4.6.14'
    + export 'JUPYTERHUB_USER_DATA_DIR=/data/jupyterhub_user_data'
    + export 'JUPYTER_DEMO_USER=demo'
    + export 'JUPYTER_DEMO_USER_MEM_LIMIT=2G'
    + export 'JUPYTER_DEMO_USER_CPU_LIMIT=0.5'
    + export 'JUPYTER_LOGIN_BANNER_TOP_SECTION='
    + export 'JUPYTER_LOGIN_BANNER_BOTTOM_SECTION='
    + export 'CANARIE_MONITORING_EXTRA_CONF_DIR=/conf.d'
    + export 'THREDDS_ORGANIZATION=Birdhouse'
    + export 'MAGPIE_DB_NAME=magpiedb'
    + export 'VERIFY_SSL=true'
    + export 'AUTODEPLOY_DEPLOY_KEY_ROOT_DIR=/root/.ssh'
    + export 'AUTODEPLOY_PLATFORM_FREQUENCY=7 5 * * *'
    + export 'AUTODEPLOY_NOTEBOOK_FREQUENCY=@hourly'
    + ENV_LOCAL_FILE=/vagrant/birdhouse/deployment/../env.local
    + set +x
    + CERT_DOMAIN=
    + '[' -z  ]
    + CERT_DOMAIN=lvupavicsmaster.ouranos.ca
    + '[' '!' -z 1 ]
    + cd /vagrant/birdhouse/deployment/..
    + docker stop proxy
    proxy
    + cd /
    + CERTBOT_OPTS=
    + '[' '!' -z 1 ]
    + CERTBOT_OPTS=renew
    + docker run --rm --name certbot -v /etc/letsencrypt:/etc/letsencrypt -v /var/lib/letsencrypt:/var/lib/letsencrypt -v /var/log/letsencrypt:/var/log/letsencrypt -p 443:443 -p 80:80 certbot/certbot:v1.3.0 renew
    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Processing /etc/letsencrypt/renewal/lvupavicsmaster.ouranos.ca.conf
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Cert not yet due for renewal
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    
    The following certs are not due for renewal yet:
      /etc/letsencrypt/live/lvupavicsmaster.ouranos.ca/fullchain.pem expires on 2020-11-02 (skipped)
    No renewals were attempted.
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    + RC=0
    + '[' '!' -z 1 ]
    + TMP_SSL_CERT=/tmp/tmp_certbotwrapper_ssl_cert.pem
    + CERTPATH=/etc/letsencrypt/live/lvupavicsmaster.ouranos.ca
    + cd /vagrant/birdhouse/deployment/..
    + docker run --rm --name copy_cert -v /etc/letsencrypt:/etc/letsencrypt bash cat /etc/letsencrypt/live/lvupavicsmaster.ouranos.ca/fullchain.pem /etc/letsencrypt/live/lvupavicsmaster.ouranos.ca/privkey.pem
    + diff /home/vagrant/certkey.pem /tmp/tmp_certbotwrapper_ssl_cert.pem
    + rm -v /tmp/tmp_certbotwrapper_ssl_cert.pem
    removed '/tmp/tmp_certbotwrapper_ssl_cert.pem'
    + '[' -z  ]
    + docker start proxy
    proxy
    + cd /
    + set +x
    
    certbotwrapper finished START_TIME=2020-09-11T01:20:02+0000
    certbotwrapper finished   END_TIME=2020-09-11T01:20:21+0000
    

    Logs when renewal is needed but failed due to firewall, certbot adds a random delay so proxy could be down up to 10 mins: certbot-renew-error.txt

    ==========
    certbotwrapper START_TIME=2020-09-11T13:00:04+0000
    + realpath /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + THIS_FILE=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + dirname /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + THIS_DIR=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment
    + pwd
    + SAVED_PWD=/
    + . /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/../default.env
    + export 'DOCKER_NOTEBOOK_IMAGE=pavics/workflow-tests:200803'
    + export 'FINCH_IMAGE=birdhouse/finch:version-0.5.2'
    + export 'THREDDS_IMAGE=unidata/thredds-docker:4.6.14'
    + export 'JUPYTERHUB_USER_DATA_DIR=/data/jupyterhub_user_data'
    + export 'JUPYTER_DEMO_USER=demo'
    + export 'JUPYTER_DEMO_USER_MEM_LIMIT=2G'
    + export 'JUPYTER_DEMO_USER_CPU_LIMIT=0.5'
    + export 'JUPYTER_LOGIN_BANNER_TOP_SECTION='
    + export 'JUPYTER_LOGIN_BANNER_BOTTOM_SECTION='
    + export 'CANARIE_MONITORING_EXTRA_CONF_DIR=/conf.d'
    + export 'THREDDS_ORGANIZATION=Birdhouse'
    + export 'MAGPIE_DB_NAME=magpiedb'
    + export 'VERIFY_SSL=true'
    + export 'AUTODEPLOY_DEPLOY_KEY_ROOT_DIR=/root/.ssh'
    + export 'AUTODEPLOY_PLATFORM_FREQUENCY=7 5 * * *'
    + export 'AUTODEPLOY_NOTEBOOK_FREQUENCY=@hourly'
    + ENV_LOCAL_FILE=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/../env.local
    + set +x
    + CERT_DOMAIN=
    + '[' -z  ]
    + CERT_DOMAIN=medus.ouranos.ca
    + '[' '!' -z 1 ]
    + cd /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/..
    + docker stop proxy
    proxy
    + cd /
    + CERTBOT_OPTS=
    + '[' '!' -z 1 ]
    + CERTBOT_OPTS=renew
    + docker run --rm --name certbot -v /etc/letsencrypt:/etc/letsencrypt -v /var/lib/letsencrypt:/var/lib/letsencrypt -v /var/log/letsencrypt:/var/log/letsencrypt -p 443:443 -p 80:80 certbot/certbot:v1.3.0 renew
    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Processing /etc/letsencrypt/renewal/medus.ouranos.ca.conf
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Cert is due for renewal, auto-renewing...
    Non-interactive renewal: random delay of 10.77459918236335 seconds
    Plugins selected: Authenticator standalone, Installer None
    Renewing an existing certificate
    Performing the following challenges:
    http-01 challenge for medus.ouranos.ca
    Waiting for verification...
    Challenge failed for domain medus.ouranos.ca
    http-01 challenge for medus.ouranos.ca
    Cleaning up challenges
    Attempting to renew cert (medus.ouranos.ca) from /etc/letsencrypt/renewal/medus.ouranos.ca.conf produced an unexpected error: Some challenges have failed.. Skipping.
    All renewal attempts failed. The following certs could not be renewed:
      /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem (failure)
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    
    All renewal attempts failed. The following certs could not be renewed:
      /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem (failure)
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1 renew failure(s), 0 parse failure(s)
    IMPORTANT NOTES:
     - The following errors were reported by the server:
    
       Domain: medus.ouranos.ca
       Type:   connection
       Detail: Fetching
       http://medus.ouranos.ca/.well-known/acme-challenge/F-_TzoOMcgoo5WC9FQvi_QdKuoqdsrQFa7MR2bEdnJE:
       Timeout during connect (likely firewall problem)
    
       To fix these errors, please make sure that your domain name was
       entered correctly and the DNS A/AAAA record(s) for that domain
       contain(s) the right IP address. Additionally, please check that
       your computer has a publicly routable IP address and that no
       firewalls are preventing the server from communicating with the
       client. If you're using the webroot plugin, you should also verify
       that you are serving files from the webroot path you provided.
    + RC=1
    + '[' '!' -z 1 ]
    + TMP_SSL_CERT=/tmp/tmp_certbotwrapper_ssl_cert.pem
    + CERTPATH=/etc/letsencrypt/live/medus.ouranos.ca
    + cd /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/..
    + docker run --rm --name copy_cert -v /etc/letsencrypt:/etc/letsencrypt bash cat /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem /etc/letsencrypt/live/medus.ouranos.ca/privkey.pem
    + diff /etc/letsencrypt/live/medus.ouranos.ca/certkey.pem /tmp/tmp_certbotwrapper_ssl_cert.pem
    + rm -v /tmp/tmp_certbotwrapper_ssl_cert.pem
    removed '/tmp/tmp_certbotwrapper_ssl_cert.pem'
    + '[' -z  ]
    + docker start proxy
    proxy
    + cd /
    + set +x
    
    certbotwrapper finished START_TIME=2020-09-11T13:00:04+0000
    certbotwrapper finished   END_TIME=2020-09-11T13:00:49+0000
    

    Logs when renewal is successful, again proxy could be down up to 10 mins due to random delay by certbot client: certbot-renew-success-in-2-run-after-file-copy-fix.txt

    ==========
    certbotwrapper START_TIME=2020-09-11T13:10:04+0000
    + realpath /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + THIS_FILE=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + dirname /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + THIS_DIR=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment
    + pwd
    + SAVED_PWD=/
    + . /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/../default.env
    + export 'DOCKER_NOTEBOOK_IMAGE=pavics/workflow-tests:200803'
    + export 'FINCH_IMAGE=birdhouse/finch:version-0.5.2'
    + export 'THREDDS_IMAGE=unidata/thredds-docker:4.6.14'
    + export 'JUPYTERHUB_USER_DATA_DIR=/data/jupyterhub_user_data'
    + export 'JUPYTER_DEMO_USER=demo'
    + export 'JUPYTER_DEMO_USER_MEM_LIMIT=2G'
    + export 'JUPYTER_DEMO_USER_CPU_LIMIT=0.5'
    + export 'JUPYTER_LOGIN_BANNER_TOP_SECTION='
    + export 'JUPYTER_LOGIN_BANNER_BOTTOM_SECTION='
    + export 'CANARIE_MONITORING_EXTRA_CONF_DIR=/conf.d'
    + export 'THREDDS_ORGANIZATION=Birdhouse'
    + export 'MAGPIE_DB_NAME=magpiedb'
    + export 'VERIFY_SSL=true'
    + export 'AUTODEPLOY_DEPLOY_KEY_ROOT_DIR=/root/.ssh'
    + export 'AUTODEPLOY_PLATFORM_FREQUENCY=7 5 * * *'
    + export 'AUTODEPLOY_NOTEBOOK_FREQUENCY=@hourly'
    + ENV_LOCAL_FILE=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/../env.local
    + set +x
    + CERT_DOMAIN=
    + '[' -z  ]
    + CERT_DOMAIN=medus.ouranos.ca
    + '[' '!' -z 1 ]
    + cd /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/..
    + docker stop proxy
    proxy
    + cd /
    + CERTBOT_OPTS=
    + '[' '!' -z 1 ]
    + CERTBOT_OPTS=renew
    + docker run --rm --name certbot -v /etc/letsencrypt:/etc/letsencrypt -v /var/lib/letsencrypt:/var/lib/letsencrypt -v /var/log/letsencrypt:/var/log/letsencrypt -p 443:443 -p 80:80 certbot/certbot:v1.3.0 renew
    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Processing /etc/letsencrypt/renewal/medus.ouranos.ca.conf
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Cert is due for renewal, auto-renewing...
    Non-interactive renewal: random delay of 459.45712705256506 seconds
    Plugins selected: Authenticator standalone, Installer None
    Renewing an existing certificate
    Performing the following challenges:
    http-01 challenge for medus.ouranos.ca
    Waiting for verification...
    Cleaning up challenges
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    new certificate deployed without reload, fullchain is
    /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    
    Congratulations, all renewals succeeded. The following certs have been renewed:
      /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem (success)
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    + RC=0
    + '[' '!' -z 1 ]
    + TMP_SSL_CERT=/tmp/tmp_certbotwrapper_ssl_cert.pem
    + CERTPATH=/etc/letsencrypt/live/medus.ouranos.ca
    + cd /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/..
    + docker run --rm --name copy_cert -v /etc/letsencrypt:/etc/letsencrypt bash cat /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem /etc/letsencrypt/live/medus.ouranos.ca/privkey.pem
    + diff /etc/letsencrypt/live/medus.ouranos.ca/certkey.pem /tmp/tmp_certbotwrapper_ssl_cert.pem
    --- /etc/letsencrypt/live/medus.ouranos.ca/certkey.pem
    +++ /tmp/tmp_certbotwrapper_ssl_cert.pem
    @@ -1,33 +1,33 @@
     -----BEGIN CERTIFICATE-----
    
    REMOVED for Privacy.
    
     -----END PRIVATE KEY-----
    + '[' 0 -eq 0 ]
    + cp -v /tmp/tmp_certbotwrapper_ssl_cert.pem /etc/letsencrypt/live/medus.ouranos.ca/certkey.pem
    cp: can't create '/etc/letsencrypt/live/medus.ouranos.ca/certkey.pem': File exists
    + rm -v /tmp/tmp_certbotwrapper_ssl_cert.pem
    removed '/tmp/tmp_certbotwrapper_ssl_cert.pem'
    + '[' -z  ]
    + docker start proxy
    proxy
    + cd /
    + set +x
    
    certbotwrapper finished START_TIME=2020-09-11T13:10:04+0000
    certbotwrapper finished   END_TIME=2020-09-11T13:18:10+0000
    ==========
    certbotwrapper START_TIME=2020-09-11T15:00:06+0000
    + realpath /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + THIS_FILE=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + dirname /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/certbotwrapper
    + THIS_DIR=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment
    + pwd
    + SAVED_PWD=/
    + . /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/../default.env
    + export 'DOCKER_NOTEBOOK_IMAGE=pavics/workflow-tests:200803'
    + export 'FINCH_IMAGE=birdhouse/finch:version-0.5.2'
    + export 'THREDDS_IMAGE=unidata/thredds-docker:4.6.14'
    + export 'JUPYTERHUB_USER_DATA_DIR=/data/jupyterhub_user_data'
    + export 'JUPYTER_DEMO_USER=demo'
    + export 'JUPYTER_DEMO_USER_MEM_LIMIT=2G'
    + export 'JUPYTER_DEMO_USER_CPU_LIMIT=0.5'
    + export 'JUPYTER_LOGIN_BANNER_TOP_SECTION='
    + export 'JUPYTER_LOGIN_BANNER_BOTTOM_SECTION='
    + export 'CANARIE_MONITORING_EXTRA_CONF_DIR=/conf.d'
    + export 'THREDDS_ORGANIZATION=Birdhouse'
    + export 'MAGPIE_DB_NAME=magpiedb'
    + export 'VERIFY_SSL=true'
    + export 'AUTODEPLOY_DEPLOY_KEY_ROOT_DIR=/root/.ssh'
    + export 'AUTODEPLOY_PLATFORM_FREQUENCY=7 5 * * *'
    + export 'AUTODEPLOY_NOTEBOOK_FREQUENCY=@hourly'
    + ENV_LOCAL_FILE=/home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/../env.local
    + set +x
    + CERT_DOMAIN=
    + '[' -z  ]
    + CERT_DOMAIN=medus.ouranos.ca
    + '[' '!' -z 1 ]
    + cd /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/..
    + docker stop proxy
    proxy
    + cd /
    + CERTBOT_OPTS=
    + '[' '!' -z 1 ]
    + CERTBOT_OPTS=renew
    + docker run --rm --name certbot -v /etc/letsencrypt:/etc/letsencrypt -v /var/lib/letsencrypt:/var/lib/letsencrypt -v /var/log/letsencrypt:/var/log/letsencrypt -p 443:443 -p 80:80 certbot/certbot:v1.3.0 renew
    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Processing /etc/letsencrypt/renewal/medus.ouranos.ca.conf
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Cert not yet due for renewal
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    
    The following certs are not due for renewal yet:
      /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem expires on 2020-12-10 (skipped)
    No renewals were attempted.
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    + RC=0
    + '[' '!' -z 1 ]
    + TMP_SSL_CERT=/tmp/tmp_certbotwrapper_ssl_cert.pem
    + CERTPATH=/etc/letsencrypt/live/medus.ouranos.ca
    + cd /home/mourad/PROJECTS/birdhouse-deploy/birdhouse/deployment/..
    + docker run --rm --name copy_cert -v /etc/letsencrypt:/etc/letsencrypt bash cat /etc/letsencrypt/live/medus.ouranos.ca/fullchain.pem /etc/letsencrypt/live/medus.ouranos.ca/privkey.pem
    + diff /etc/letsencrypt/live/medus.ouranos.ca/certkey.pem /tmp/tmp_certbotwrapper_ssl_cert.pem
    --- /etc/letsencrypt/live/medus.ouranos.ca/certkey.pem
    +++ /tmp/tmp_certbotwrapper_ssl_cert.pem
    @@ -1,33 +1,33 @@
     -----BEGIN CERTIFICATE-----
    
    REMOVED for Privacy.
    
     -----END PRIVATE KEY-----
    + '[' 0 -eq 0 ]
    + cp -v /tmp/tmp_certbotwrapper_ssl_cert.pem /etc/letsencrypt/live/medus.ouranos.ca/certkey.pem
    '/tmp/tmp_certbotwrapper_ssl_cert.pem' -> '/etc/letsencrypt/live/medus.ouranos.ca/certkey.pem'
    + rm -v /tmp/tmp_certbotwrapper_ssl_cert.pem
    removed '/tmp/tmp_certbotwrapper_ssl_cert.pem'
    + '[' -z  ]
    + docker start proxy
    proxy
    + cd /
    + set +x
    
    certbotwrapper finished START_TIME=2020-09-11T15:00:06+0000
    certbotwrapper finished   END_TIME=2020-09-11T15:00:31+0000
    

1.11.1 (2020-09-15)

1.11.0 (2020-08-25)

1.10.4 (2020-08-05)

  • jupyter: new updated image with hvplot pinned to an older version for violin plots

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#48 (commit https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/commit/4ad6ba6fa2a4ecf6d5d78e0602b39202307bcb76) for more detailed info.

    Deployed to Medus for testing (as the regular PAVICS image, not the devel image). @aulemahal reported back that violin plots still do not work, even with the old hvplot pinned in this image.

    I'll release this image as-is since violin plots were also not working in the previous image that had hvplot 0.6.0, so there is no new regression. Will unpin hvplot on the next image build because pinning it did not fix violin plots (probably interference from other newer packages in this build).

    Noticeable changes:

    <   - hvplot=0.6.0=pyh9f0ad1d_0
    >   - hvplot=0.5.2=py_0
    
    <   - dask=2.20.0=py_0
    >   - dask=2.22.0=py_0
    
    <   - geopandas=0.8.0=py_1
    >   - geopandas=0.8.1=py_0
    
    <   - pandas=1.0.5=py37h0da4684_0
    >   - pandas=1.1.0=py37h3340039_0
    
    <   - matplotlib=3.2.2=1
    >   - matplotlib=3.3.0=1
    
    <   - numpy=1.18.5=py37h8960a57_0
    >   - numpy=1.19.1=py37h8960a57_0
    
    <   - cryptography=2.9.2=py37hb09aad4_0
    >   - cryptography=3.0=py37hb09aad4_0
    
    <   - python=3.7.6=h8356626_5_cpython
    >   - python=3.7.8=h6f2ec95_1_cpython
    
    <   - nbval=0.9.5=py_0
    >   - nbval=0.9.6=pyh9f0ad1d_0
    
    <   - pytest=5.4.3=py37hc8dfbb8_0
    >   - pytest=6.0.1=py37hc8dfbb8_0

1.10.3 (2020-07-21)

  • proxy: increase timeout for reading a response from the proxied server

    Fixes Ouranosinc/raven#286

    "there seems to be a problem with the size of the ncml and the timeout if I use more than 10-12 years as the historical data. I get a : "Netcdf: DAP failure" error if I use too many years."

    ________________________________________________________ TestBiasCorrect.test_bias_correction ________________________________________________________
    Traceback (most recent call last):
      File "/zstore/repos/raven/tests/test_bias_correction.py", line 20, in test_bias_correction
        ds = (xr.open_dataset(hist_data).sel(lat=slice(lat + 1, lat - 1),lon=slice(lon - 1, lon + 1), time=slice(dt.datetime(1991,1,1), dt.datetime(2010,12,31))).mean(dim={"lat", "lon"}, keep_attrs=True))
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/common.py", line 84, in wrapped_func
        func, dim, skipna=skipna, numeric_only=numeric_only, **kwargs
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/dataset.py", line 4313, in reduce
        **kwargs,
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/variable.py", line 1586, in reduce
        input_data = self.data if allow_lazy else self.values
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/variable.py", line 349, in data
        return self.values
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/variable.py", line 457, in values
        return _as_array_or_item(self._data)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/variable.py", line 260, in _as_array_or_item
        data = np.asarray(data)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/numpy/core/_asarray.py", line 83, in asarray
        return array(a, dtype, copy=False, order=order)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/indexing.py", line 677, in __array__
        self._ensure_cached()
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/indexing.py", line 674, in _ensure_cached
        self.array = NumpyIndexingAdapter(np.asarray(self.array))
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/numpy/core/_asarray.py", line 83, in asarray
        return array(a, dtype, copy=False, order=order)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/indexing.py", line 653, in __array__
        return np.asarray(self.array, dtype=dtype)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/numpy/core/_asarray.py", line 83, in asarray
        return array(a, dtype, copy=False, order=order)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/indexing.py", line 557, in __array__
        return np.asarray(array[self.key], dtype=None)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 73, in __getitem__
        key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/core/indexing.py", line 837, in explicit_indexing_adapter
        result = raw_indexing_method(raw_key.tuple)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 85, in _getitem
        array = getitem(original_array, key)
      File "/home/lvu/.conda/envs/raven/lib/python3.7/site-packages/xarray/backends/common.py", line 54, in robust_getitem
        return array[key]
      File "netCDF4/_netCDF4.pyx", line 4408, in netCDF4._netCDF4.Variable.__getitem__
      File "netCDF4/_netCDF4.pyx", line 5352, in netCDF4._netCDF4.Variable._get
      File "netCDF4/_netCDF4.pyx", line 1887, in netCDF4._netCDF4._ensure_nc_success
    RuntimeError: NetCDF: DAP failure
    

1.10.2 (2020-07-18)

1.10.1 (2020-07-11)

  • Monitoring: add alert rules and alert handling (deduplicate, group, route, silence, inhibit).

    This is a follow up to the previous PR #56 that added the monitoring itself.

    Added the cAdvisor and Node-exporter collections of alert rules found at https://awesome-prometheus-alerts.grep.to/rules, with a few fixes for errors in the rules and some tweaking to reduce false-positive alarms (see the list of commits). It is a great collection of ready-made sample rules to hit the ground running, and to learn the PromQL query language along the way.


    Added Alertmanager to handle the alerts (deduplicate, group, route, silence, inhibit). Currently the only notification route configured is email but Alertmanager is able to route alerts to Slack and any generic services accepting webhooks.


    This is an initial attempt at alerting. There are several ways to tweak the system without changing the code:

    • To add more Prometheus alert rules, volume-mount more *.rules files to the prometheus container.
    • To disable existing Prometheus alert rules, add more Alertmanager inhibition rules using ALERTMANAGER_EXTRA_INHIBITION via env.local file.
    • Other possible Alertmanager configs via env.local: ALERTMANAGER_EXTRA_GLOBAL, ALERTMANAGER_EXTRA_ROUTES, ALERTMANAGER_EXTRA_RECEIVERS (see the sketch after this list).
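
    A rough env.local sketch of the last point; the receiver payload is illustrative and the exact format expected by the template may differ:

    # hypothetical extra receiver forwarding alerts to a Slack webhook
    export ALERTMANAGER_EXTRA_RECEIVERS='
    - name: slack-notifications
      slack_configs:
        - channel: "#alerts"
          api_url: "https://hooks.slack.com/services/REPLACE/ME"
    '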

    What more could be done after this initial attempt:

    • Possibly add more graphs to the Grafana dashboard, since we have alerts on metrics that have no matching Grafana graph. Graphs are useful for historical trends and correlation with other metrics, so they are not required if we do not need trends and correlation.

    • Only basic metrics are being collected currently. We could collect more useful metrics, like SMART status, and alert when a disk is failing.

    • The autodeploy mechanism can hook into this monitoring system to report pass/fail status and execution duration, with alerting for problems. Then we can also correlate any CPU, memory, or disk I/O spikes with autodeploy runs and keep a trace of previous autodeploy executions.

    I had to test these alerts directly in prod to tweak for fewer false-positive alerts and to debug non-working rules, so these changes are already in prod! This also tested the SMTP server on the network.

    See rules on Prometheus side: http://pavics.ouranos.ca:9090/rules, http://medus.ouranos.ca:9090/rules

    Manage alerts on Alertmanager side: http://pavics.ouranos.ca:9093/#/alerts, http://medus.ouranos.ca:9093/#/alerts

    Part of issue #12

1.10.0 (2020-07-02)

  • Monitoring for host and each docker container.

    (Screenshot: "Docker and system monitoring" Grafana dashboard.)

    For host, using Node-exporter to collect metrics:

    • uptime
    • number of containers
    • used disk space
    • used memory, available memory, used swap memory
    • load
    • cpu usage
    • in and out network traffic
    • disk I/O

    For each container, using cAdvisor to collect metrics:

    • in and out network traffic
    • cpu usage
    • memory and swap memory usage
    • disk usage

    Useful visualisation features:

    • zoom in on one graph and all the other graphs update to the same time range, so we can correlate events
    • view each graph independently for more details
    • mousing over a data point shows its value at that moment

    Prometheus is used as the time series DB and Grafana is used as the visualization dashboard.

    Node-exporter, cAdvisor and Prometheus are exposed so another Prometheus on the network can also scrape those same metrics and perform other analysis if required.

    The whole monitoring stack is a separate component, so users are not forced to enable it if another monitoring system is already in place. Enabling this monitoring stack is done via the env.local file, like all other components.
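
    A minimal env.local sketch; the EXTRA_CONF_DIRS variable name is an assumption about how components are enabled, so check env.local.example for the exact mechanism:

    # assumption: optional components are enabled by listing their dir in EXTRA_CONF_DIRS
    EXTRA_CONF_DIRS="$EXTRA_CONF_DIRS ./components/monitoring"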

    The Grafana dashboard is taken from https://grafana.com/grafana/dashboards/893 with many fixes (see commits) since most of the metric names have changed over time. Still, it was much quicker to hit the ground running this way than to learn the Prometheus query language and Grafana visualization options from scratch. On top of that, lots of metrics are exposed, and we had to filter out which ones are relevant to graph, so starting from a broken dashboard was still a big win. Grafana has a big collection of existing, though probably unmaintained, dashboards we can leverage.

    So this is a first draft for monitoring. Many things are unsure, will need tweaking, or are missing:

    • Probably have to add more metrics or remove some that might be irrelevant; with time we will see.
    • Probably will have to tweak the scrape interval and the retention time to keep the disk storage requirement reasonable; again, we'll see with time.
    • Missing alerting. With all the pretty graphs, we are not going to look at them all day; we need some kind of alerting mechanism.

    Test system: http://lvupavicsmaster.ouranos.ca:3001/d/pf6xQMWGz/docker-and-system-monitoring?orgId=1&refresh=5m, user: admin, passwd: the default passwd

    Also tested on Medus: http://medus.ouranos.ca:3001/d/pf6xQMWGz/docker-and-system-monitoring?orgId=1&refresh=5m (on Medus had to perform full yum update to get new kernel and new docker engine for cAdvisor to work properly).

    Part of issue #12

1.9.6 (2020-06-15)

  • flyingpigeon: update to version 1.6

    Deploy the new Flyingpigeon 1.6 on PAVICS.

    Has been deployed to Medus test environment.

    flyingpigeon changelog from release commit https://github.com/bird-house/flyingpigeon/commit/a6f54ed0c20919485c2420295729e30f914cfa15 (PR bird-house/flyingpigeon#332)

    1.6 (2020-06-10)

    • remove eggshell dependency
    • notebooks are part of the test suite
    • improved plot processes
    • remove mosaic option for subset processes
    • polygon subset processes handle files separately instead of an entire dataset at once
    • multiple outputs listed in Metalink output
    • update pywps to 4.2.3
    • use cruft to keep up-to-date with the cookie-cutter template

1.9.5 (2020-06-12)

1.9.4 (2020-06-03)

1.9.3 (2020-05-07)

1.9.2 (2020-04-29)

1.9.1 (2020-04-24)

  • Fix notebook autodeploy wiping already deployed notebooks when GitHub is down.

    Fixes #43

    Fail early on any unexpected error, to avoid wiping already deployed notebooks.

    Check that the source dir is not empty before wiping the dest dir containing the already deployed notebooks.

    Reduce cleaning verbosity for more concise logging.
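
    A minimal sketch of the not-empty guard described above; the path is illustrative:

    # abort before the destructive step if the freshly downloaded dir is empty
    if [ -z "$(ls -A /tutorial-notebooks 2>/dev/null)" ]; then
        echo "source dir is empty, keeping already deployed notebooks" >&2
        exit 1
    fi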

    To fix this error found in production logs when GitHub was down today:

    notebookdeploy START_TIME=2020-04-23T10:01:01-0400
    ++ mktemp -d -t notebookdeploy.XXXXXXXXXXXX
    + TMPDIR=/tmp/notebookdeploy.ICk70Vto2LaE
    + cd /tmp/notebookdeploy.ICk70Vto2LaE
    + mkdir tutorial-notebooks
    + cd tutorial-notebooks
    + wget --quiet https://raw.githubusercontent.com/Ouranosinc/PAVICS-e2e-workflow-tests/master/downloadrepos
    + chmod a+x downloadrepos
    chmod: cannot access ‘downloadrepos’: No such file or directory
    + wget --quiet https://raw.githubusercontent.com/Ouranosinc/PAVICS-e2e-workflow-tests/master/default_build_params
    + wget --quiet https://raw.githubusercontent.com/Ouranosinc/PAVICS-e2e-workflow-tests/master/binder/reorg-notebooks
    + chmod a+x reorg-notebooks
    chmod: cannot access ‘reorg-notebooks’: No such file or directory
    + wget --quiet --output-document - https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/archive/master.tar.gz
    + tar xz
    
    gzip: stdin: unexpected end of file
    tar: Child returned status 1
    tar: Error is not recoverable: exiting now
    + ./downloadrepos
    /etc/cron.hourly/PAVICS-deploy-notebooks: line 63: ./downloadrepos: No such file or directory
    + ./reorg-notebooks
    /etc/cron.hourly/PAVICS-deploy-notebooks: line 64: ./reorg-notebooks: No such file or directory
    + mv -v 'PAVICS-e2e-workflow-tests-master/notebooks/*.ipynb' ./
    mv: cannot stat ‘PAVICS-e2e-workflow-tests-master/notebooks/*.ipynb’: No such file or directory
    + rm -rfv PAVICS-e2e-workflow-tests-master
    + rm -rfv downloadrepos default_build_params reorg-notebooks
    + TMP_SCRIPT=/tmp/notebookdeploy.ICk70Vto2LaE/deploy-notebook
    + cat
    + chmod a+x /tmp/notebookdeploy.ICk70Vto2LaE/deploy-notebook
    + docker pull bash
    Using default tag: latest
    latest: Pulling from library/bash
    Digest: sha256:febb3d74f41f2405fe21b7c7b47ca1aee0eda0a3ffb5483ebe3423639d30d631
    Status: Image is up to date for bash:latest
    + docker run --rm --name deploy_tutorial_notebooks -u root -v /tmp/notebookdeploy.ICk70Vto2LaE/deploy-notebook:/deploy-notebook:ro -v /tmp/notebookdeploy.ICk70Vto2LaE/tutorial-notebooks:/tutorial-notebooks:ro -v /data/jupyterhub_user_data:/notebook_dir:rw --entrypoint /deploy-notebook bash
    + cd /notebook_dir
    + rm -rf tutorial-notebooks/WCS_example.ipynb tutorial-notebooks/WFS_example.ipynb tutorial-notebooks/WMS_example.ipynb tutorial-notebooks/WPS_example.ipynb tutorial-notebooks/catalog_search.ipynb tutorial-notebooks/dap_subset.ipynb tutorial-notebooks/esgf-compute-api-examples-devel tutorial-notebooks/esgf-dap.ipynb tutorial-notebooks/finch-usage.ipynb tutorial-notebooks/hummingbird.ipynb tutorial-notebooks/opendap.ipynb tutorial-notebooks/pavics_thredds.ipynb tutorial-notebooks/raven-master tutorial-notebooks/rendering.ipynb tutorial-notebooks/subsetting.ipynb
    + cp -rv '/tutorial-notebooks/*' tutorial-notebooks
    cp: can't stat '/tutorial-notebooks/*': No such file or directory
    + chown -R root:root tutorial-notebooks
    + set +x
    removed directory: ‘/tmp/notebookdeploy.ICk70Vto2LaE/tutorial-notebooks’
    removed ‘/tmp/notebookdeploy.ICk70Vto2LaE/deploy-notebook’
    removed directory: ‘/tmp/notebookdeploy.ICk70Vto2LaE’
    
    notebookdeploy finished START_TIME=2020-04-23T10:01:01-0400
    notebookdeploy finished   END_TIME=2020-04-23T10:02:12-0400
    

1.9.0 (2020-04-24)

  • vagrant: add centos7 and LetsEncrypt SSL cert support, fix remaining scheduler autodeploy issues

    Fixes #27.

    Centos7 support added to Vagrant to reproduce problems found on Medus in PR #39 (commit https://github.com/bird-house/birdhouse-deploy/commit/6036dbd5ff072544d902e7b84b5eff361b00f78b):

    Problem 1: wget of an httpS URL not working in the bash docker image, breaking the notebook autodeploy when run under the new scheduler autodeploy: not reproducible

    Problem 2: all containers are destroyed and recreated when alternating between running ./pavics-compose.sh up -d manually on the host and having the same command executed automatically by the scheduler autodeploy inside its own container: not reproducible

    Problem 3: sysctl: error: 'net.ipv4.tcp_tw_reuse' is an unknown key on ./pavics-compose.sh up -d when executed automatically by the scheduler autodeploy inside its own container: reproduced, but it seems harmless so not fixing it.

    Problem 4: the current user loses write permission to the birdhouse-deploy checkout and the other checkouts in AUTODEPLOY_EXTRA_REPOS when using the scheduler autodeploy: fixed

    Problem 5: no documentation for the new scheduler autodeploy: fixed

    Another autodeploy fix found while working on this PR: notebook autodeploy was broken when the /data/jupyterhub_user_data/tutorial-notebooks dir did not pre-exist. Regression from commit https://github.com/bird-house/birdhouse-deploy/pull/16/commits/6ddaddc74d384299e45b0dc8d50a63e59b3cc0d5 (PR #16): before that commit the entire dir was copied, not just its content, so the dir was created automatically.

    The Centos7 Vagrant box experience is not as fully automated as the Ubuntu box, even when using the same vagrant-disksize Vagrant plugin. Manual disk resize instructions are provided. This is a candidate for automation later if we end up destroying and recreating the Centos7 box very often. Hopefully the problem is not there for Centos8 so we can forget about this annoyance.

    Automatic generation of an SSL certificate from LetsEncrypt is also added for both the Ubuntu and Centos Vagrant boxes. It can be used outside of Vagrant, so Medus and Boreas can also benefit next time, if needed. A recent docker image of certbot is used, so it should already be using the ACMEv2 protocol (ACMEv1 is being deprecated).
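
    For illustration, a dockerized certbot invocation along those lines (a generic sketch, not the exact command used by these scripts; the email and domain are placeholders):

    $ docker run --rm -p 80:80 \
        -v /etc/letsencrypt:/etc/letsencrypt \
        certbot/certbot certonly --standalone -n --agree-tos \
        -m admin@example.com -d pavics.example.com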

    Pagekite is also preserved for both boxes, for when exposing ports 80 and 443 directly on the internet is not possible but PAVICS still needs a real SSL certificate.

    Test server: https://lvupavicsmaster.ouranos.ca (Centos7, on internet with LetsEncrypt SSL cert).

    The Jenkins run only has known errors: http://jenkins.ouranos.ca/job/ouranos-staging/job/lvupavicsmaster.ouranos.ca/4/console


1.8.10 (2020-04-09)

  • Autodeploy the autodeploy phase 2: everything operational but a few compatibility issues remain

    Part of #27

    Activating the ./components/scheduler will do everything. All configuration is centralized in the env.local file.

    One missing feature is a piece-wise choice of platform-only or notebook-only autodeploy, like with the old manual install-* scripts under https://github.com/bird-house/birdhouse-deploy/tree/master/birdhouse/deployment. Right now it's all or nothing. I can work on this if you guys think it's needed.

    Remaining compatibility issues with Medus (Vagrant box works fine):

    • Notebook autodeploy does not work: using the bash docker image, I am unable to wget any httpS address. The same docker run command works fine on my Vagrant box, so there is something specific to Medus.
    $ docker run --rm --name debug_wget_httpS -u root bash bash -c "wget https://google.com -O -"
    Connecting to google.com (172.217.13.206:443)
    wget: error getting response: Connection reset by peer
    
    • All the containers are recreated when ./pavics-compose.sh runs inside the container (first migration to the new autodeploy mechanism). To be investigated, but I suspect this might be due to the older versions of docker and docker-compose on Medus.

    • This one looks like it is due to the older kernel on Medus:

    sysctl: error: 'net.ipv4.tcp_tw_reuse' is an unknown key
    sh: 0: unknown operand
    
    • All the files updated by git pull are now owned by root (the user inside the container). I'll have to undo this ownership change somehow. This one is super weird: I should have hit it on my Vagrant box too. Probably Vagrant does some magic to ensure files under /vagrant are always owned by the user, even when changed by user root.

    • Documentation: update README and list the relevant configuration variables in env.local for this new ./components/scheduler.

    Migrating to this new mechanism requires manual deletion of all the artifacts created by the old install scripts: sudo rm /etc/cron.d/PAVICS-deploy /etc/cron.hourly/PAVICS-deploy-notebooks /etc/logrotate.d/PAVICS-deploy /usr/local/sbin/triggerdeploy.sh. The two cannot co-exist at the same time.

    Maximum backward-compatibility has been kept with the old existing install scripts style:

    • Still log to the same existing log files under /var/log/PAVICS.
    • The old single ssh deploy key is still compatible, but the new mechanism allows a different ssh deploy key for each extra repo (again, public repos should use the https clone path to avoid dealing with ssh deploy keys in the first place)
    • Old install scripts are kept

    Features missing in the old install scripts, or how this improves on them:

    • Autodeploy of the autodeploy itself! This is the biggest win. Previously, if triggerdeploy.sh or the PAVICS-deploy-notebooks script changed, it had to be deployed manually, which was very annoying. Now they are volume-mounted in, so they are fresh on each run.
    • env.local now drives absolutely everything; source control that file and we have a true DevOps pipeline.
    • Configurable platform and notebook autodeploy frequency. Previously this meant manually editing the generated cron file, which was less than ideal.
    • No support is needed on the local host other than docker and docker-compose: the cron/logrotate/git/ssh versions are all locked down in the docker images used by the autodeploy. Recall that previously we had to deal with git being too old on some hosts.
    • Each cron job runs in its own docker image, meaning the runtime environment is traceable and reproducible.
    • The newly introduced scheduler component is made extensible so other jobs (e.g. backup) can be added to it as well via env.local, which should itself be source controlled, meaning all surrounding maintenance-related tasks can also be traceable and reproducible (see the sketch below).
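
    As a purely hypothetical sketch of such an extra job (the variable name AUTODEPLOY_EXTRA_SCHEDULER_JOBS and the job format below are invented for illustration; see the scheduler component documentation for the actual mechanism):

    # Hypothetical env.local entry registering a nightly backup job;
    # the variable name and format are illustrative only.
    export AUTODEPLOY_EXTRA_SCHEDULER_JOBS='
    backup:
      schedule: "0 3 * * *"
      command: /path/to/backup-script.sh
    '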

    This is a rather large PR. For a less technical overview, start with the diff of README.md, env.local.example and common.env. If a change looks funny to you, read the commit description that introduced it; the reasoning should be there.

1.8.9 (2020-04-08)

  • finch: update to 0.5.2

    Fixes the following two Jenkins failures:

    Tested in this Jenkins run http://jenkins.ouranos.ca/job/ouranos-staging/job/lvupavics-lvu.pagekite.me/20/console

      _________ finch-master/docs/source/notebooks/dap_subset.ipynb::Cell 9 __________
      Notebook cell execution failed
      Cell 9: Cell outputs differ
    
      Input:
      resp = wps.sdii(pr + sub)
      out = resp.get(asobj=True)
      out.output_netcdf.sdii
    
      Traceback:
       mismatch 'text/html'
    
       assert reference_output == test_output failed:
    
        '<pre>&lt;xar...vera...</pre>' == '<pre>&lt;xar...vera...</pre>'
        Skipping 350 identical leading characters in diff, use -v to show
          m/day
        -     cell_methods:   time: mean (interval: 30 minutes)
              history:        pr=max(0,pr) applied to raw data;\n[DATE_TIME] ...
        +     cell_methods:   time: mean (interval: 30 minutes)
              standard_name:  lwe_thickness_of_precipitation_amount
              long_name:      Average precipitation during wet days (sdii)
              description:    Annual simple daily intensity index (sdii) : annual avera...</pre>
    
      _________ finch-master/docs/source/notebooks/finch-usage.ipynb::Cell 1 _________
      Notebook cell execution failed
      Cell 1: Cell outputs differ
    
      Input:
      help(wps.frost_days)
    
      Traceback:
       mismatch 'stdout'
    
       assert reference_output == test_output failed:
    
        'Help on meth...ut files.\n\n' == 'Help on meth...ut files.\n\n'
        Skipping 399 identical leading characters in diff, use -v to show
        -    freq : string
        +    freq : {'YS', 'MS', 'QS-DEC', 'AS-JUL'}string
                  Resampling frequency
    
              Returns
              -------
              output_netcdf : ComplexData:mimetype:`application/x-netcdf`
                  The indicator values computed on the original input grid.
              output_log : ComplexData:mimetype:`text/plain`
                  Collected logs during process run.
              ref : ComplexData:mimetype:`application/metalink+xml; version=4.0`
                  Metalink file storing all references to output files.
    

1.8.8 (2020-03-20)

  • jupyter: make configurable public demo user name, passwd, resource limit, login banner

    For security reasons, the public demo username and password are not hardcoded anymore.

    Compromising one PAVICS deployment should not compromise all the other PAVICS deployments, provided each deployment uses a different password.

    The password is set when the public demo user is created in Magpie, see the birdhouse/README.md update.

    The login banner does not display the public demo password anymore. If one really wants to display the password, the top or bottom section of the login banner, customizable via env.local, can be used.

    The login banner is updated with more notices; please review the wording.

    Resource limits are also customizable (only the memory limit seems to work with the DockerSpawner).

    All changes to env.local are live after a ./pavics-compose.sh up -d.

    Test server: https://lvupavics-lvu.pagekite.me/jupyter/ (ask me privately for the password :D)

1.8.7 (2020-03-19)

  • finch: update to v0.5.1

1.8.6 (2020-03-16)

1.8.5 (2020-03-13)

  • jupyter: update to pavics/workflow-tests:200312 for Raven notebooks

1.8.4 (2020-03-10)

  • raven: upgrade to pavics/raven:0.10.0

1.8.3 (2020-02-17)

1.8.2 (2020-02-10)

  • Optionally monitor all components behind Twitcher using canarie api.

    Fixes #8

    The motivation was the need for a quick dashboard of the working state of all the components, not to gather more stats.

    Right now we are bypassing Twitcher, which is not real life: it is not what real users will experience.

    This is ultra cheap to add and provides very fast and up-to-date (every minute) results. It's like an always-on sanity check that can quickly help debug any connectivity issue between the components.

    It is optional because it assumes all components are publicly accessible, which might not be the case for everyone. We can also override the override :D

    Public (behind Twitcher) monitoring is added for all components in config/canarie-api/docker_configuration.py.template that did not already have it.

    Also added Hummingbird and ncWMS2 public monitoring.

    @tlogan2000 This will catch accidental Thredds public url breakage like last time and will leverage the existing monitoring on https://pavics.ouranos.ca/canarie/node/service/stats by @moulab88.

    @davidcaron @dbyrns This is optional, so if CRIM does not want to enable it, that's fine.

    New node monitoring page:

    (screenshot: Ouranos node service monitoring page)

1.8.1 (2020-02-06)

  • Increase JupyterHub security.

    • ab56994 jupyter: limit memory of public user to 500 MB
    • 90c1950 jupyter: prevent user from loading user-owned config at spawner server startup
    • e8f2fa3 jupyter: avoid terminating user running jobs on Hub update
    • 3f97cc7 jupyter: get ready to prevent browser session re-use even if password changed
    • e2ebcc3 jupyter: disable notebook terminal for security reasons

1.8.0 (2020-02-03)

1.7.1 (2020-01-30)

  • jupyter: update various packages and add threddsclient

    Noticeable changes:

    <     - bokeh==1.4.0
    >   - bokeh=1.4.0=py36_0
    
    <   - python=3.7.3=h33d41f4_1
    >   - python=3.6.7=h357f687_1006
    
    >   - threddsclient=0.4.2=py_0
    
    <     - xarray==0.13.0
    >   - xarray=0.14.1=py_1
    
    <     - dask==2.8.0
    >     - dask==2.9.2
    
    <     - xclim==0.12.2
    >     - xclim==0.13.0

    See PR Ouranosinc/PAVICS-e2e-workflow-tests#34 for more info.

1.7.0 (2020-01-22)

  • backup solr: should save all of /data/solr, not just the index

Prior Versions

All versions prior to 1.7.0 were not officially tagged. It is strongly recommended to employ later versions to ensure better traceability of changes that could impact behavior and cause potential issues on new server instances.