gaiaDocker runs the core OHDSI GIS technology stack using cross-platform Docker container technology.
Information on Observational Health Data Sciences and Informatics (OHDSI)
This repository contains the Docker Compose file used to launch the OHDSI gaiaDocker Docker containers:
- the OHDSI GIS gaia stack [ with: --profile gaia ]
  - gaia-core
    Hades based R environment with additional GIS toolchain
    image: OHDSI/GIS#containerize:./docker/gaia-core
  - gaia-db
    postgis relational database as GIS datastore
    image: OHDSI/gaiaDB#main:./
  - gaia-catalog
    python flask app as an interface to gaia-solr at http://localhost:5000
    image: OHDSI/gaiaCatalog#main:./docker/repository
  - gaia-solr
    solr index of all catalog entries at http://localhost:8983
    image: OHDSI/gaiaCatalog#main:./docker/solr
  - gaia-osgeo
    gdal/ogr toolset for ETL
    image: OHDSI/gaiaDocker#main:./docker/ohdsi-osgeo
  - gaia-postgis
    postgis toolset for ETL
    image: OHDSI/gaiaDocker#main:./docker/ohdsi-postgis
  - gaia-git
    git for using external code
    image: OHDSI/gaiaDocker#main:./docker/ohdsi-git
  - gaia-gdsc
    python environment for data processing
    image: OHDSI/gaiaDocker#main:./docker/ohdsi-gdsc
- additional tools [ optional ] [ with: --profile degauss ]
  - gaia-degauss
    degauss geocoder for adding lat/lon to address information
    image: GDSC/docker#ohdsi:./builds/degauss
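The profile names listed above correspond to the `--profile` flag of Docker Compose, which can be repeated to enable more than one profile in a single command. As a minimal sketch (the helper name `profile_args` is hypothetical and not part of this repository), the flags for a launch command could be assembled like this:

```shell
# Hypothetical helper: turn profile names into repeated --profile flags.
profile_args() {
  args=""
  for profile in "$@"; do
    # Add a separating space only when args already holds a flag.
    args="${args:+$args }--profile $profile"
  done
  printf '%s\n' "$args"
}

# Print (do not run) the command that would start the gaia stack
# together with the optional degauss geocoder.
echo "docker compose $(profile_args gaia degauss) up -d"
```

The snippet only prints the command, so it is safe to run before Docker is installed.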
This repository is based on the OHDSI Broadsea implementation with future integration in mind.
Throughout this README, Docker Compose commands are written as docker compose (no hyphen), following the Docker Compose V2 convention outlined by Docker.
For gaiaDocker, you will need Docker version 1.27.0+.
- Linux, Mac, or Windows with WSL
- Docker 1.27.0+
- Git
- Chromium-based web browser (Chrome, Edge, etc.)
You also need the gaiaCatalog github repository cloned into the same parent folder as the gaiaDocker repository. See instructions below.
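Before continuing, it may help to confirm that the command-line prerequisites are installed. The following is a small sketch, not part of the repository; the function name `missing_tools` is hypothetical, and it only checks that `docker` and `git` are on the PATH, not their versions.

```shell
# Print each required command-line tool that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools docker git)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so several names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "docker and git are both installed."
fi
```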
All secrets are in the top-level secrets folder. For gaiaDocker there is a gaia subfolder with gaia-specific secrets. Note that you should change your internal secrets (postgres, internal API, etc.) and that you will need to provide your own external API secrets (Copernicus, Census, Earth Explorer, and so on).
For API keys see the README.md in the secrets/gaia directory of this repository.
If using Mac Silicon (M1, M2, etc), you may need to set the DOCKER_ARCH variable in Section 1 of the .env file to "linux/arm64" (line 5). Some Broadsea services still need to run via emulation of linux/amd64 and are hard-coded as such.
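To decide whether the DOCKER_ARCH change applies to your machine, you can inspect the CPU architecture reported by `uname -m`. The sketch below is illustrative only: the function name `suggest_docker_arch` is hypothetical, and mapping x86_64 machines to "linux/amd64" is an assumption based on the emulation note above rather than something this README states.

```shell
# Map the output of `uname -m` to a candidate DOCKER_ARCH value.
suggest_docker_arch() {
  case "$1" in
    arm64|aarch64) echo "linux/arm64" ;;  # Apple Silicon reports arm64
    x86_64|amd64)  echo "linux/amd64" ;;  # assumption: value for Intel/AMD hosts
    *)             echo "unknown" ;;
  esac
}

echo "Suggested DOCKER_ARCH: $(suggest_docker_arch "$(uname -m)")"
```

The script does not edit the .env file; set the variable there by hand once you have confirmed the value.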
It is likely the gaia-core container will run, but RStudio login will fail on Mac Silicon. The base Hades image upon which this version of gaia-core is built is only maintained for amd64 architectures.
- Download and install Docker.
  - Windows: See the installation instructions for Docker Desktop at the Docker Web Site. NOTE: you will have to install either WSL or Hyper-V. This repo has been tested on WSL.
  - Apple: See the installation instructions for Docker Desktop at the Docker Web Site.
    NOTE: you can also use Colima (from your shell/terminal)
    - first install homebrew:
      /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    - then install colima:
      brew install colima
- git clone this GitHub repo:
  git clone git@github.com:OHDSI/gaiaDocker.git
- Before starting gaiaDocker containers, you must authenticate to GitHub Container Registry (GHCR). This step is required to access the osgeo/gdal image. Detailed instructions here. Basic steps for authenticating in a command line / terminal window with a Personal Access Token (PAT) below:
  # Create a GitHub PAT with read:packages scope in order to authenticate (see above instructions)
  export CR_PAT=YOUR_TOKEN
  echo $CR_PAT | docker login ghcr.io -u USERNAME --password-stdin
  # > Login Succeeded
In the same parent directory as the gaiaDocker repository, run the following command:
git clone git@github.com:OHDSI/gaiaCatalog.git
The resulting structure should look like this:
.
+-- parentDirectory
| --- gaiaDocker
| --- gaiaCatalog
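The side-by-side layout shown above can be verified before starting the containers. This is a minimal sketch, assuming it is run from inside the gaiaDocker directory; the function name `check_layout` is hypothetical.

```shell
# Succeed only if both repositories sit side by side under the given parent directory.
check_layout() {
  parent="$1"
  status=0
  for repo in gaiaDocker gaiaCatalog; do
    if [ ! -d "$parent/$repo" ]; then
      echo "missing: $parent/$repo"
      status=1
    fi
  done
  return "$status"
}

# Run from inside the gaiaDocker directory, so the parent is "..".
check_layout .. && echo "Layout looks correct." || echo "Clone the missing repository before continuing."
```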
- In a command line / terminal window, navigate to the directory where this README.md file is located and start the gaia Docker containers using the command below. On Linux you may need to use 'sudo' to run this command.
docker compose --profile gaia up -d
--or--
docker-compose --profile gaia up -d
- The first time the above command is run, it will take several minutes for the containers to build.
- The containers will run and you can explore.
- In the gaia-catalog (localhost:5000), clicking the red button next to a dataset name loads that layer into the public schema of the gaia-db. The structure of the source data is maintained to the extent possible. If the load succeeds, the button turns green.
- In the gaia-solr (localhost:8983) you can explore the two indexes (collections and dcat). The dcat collection is the index for the data layers.
- In the gaia-core (localhost:8787) you can log in as user:ohdsi with pass:mypass to run RStudio with the geospatial extensions loaded (windows only)
- You can connect to the gaia-db with a postgres client like PGAdmin or QGIS with host:localhost, port:5433, database:gaiaDB, user:postgres, pass:SuperSecret
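Once the containers are up, the three web endpoints listed above can be probed from the command line. The sketch below assumes `curl` is installed; the function names `gaia_endpoints` and `check_endpoints` are hypothetical. "Reachable" here only means that something answered the HTTP request on that port, not that the service is fully initialized.

```shell
# Service names and URLs as listed above.
gaia_endpoints() {
  cat <<'EOF'
gaia-catalog http://localhost:5000
gaia-solr http://localhost:8983
gaia-core http://localhost:8787
EOF
}

# Report which endpoints answer an HTTP request (requires curl).
check_endpoints() {
  gaia_endpoints | while read -r name url; do
    if curl --silent --output /dev/null --max-time 5 "$url"; then
      echo "$name: reachable at $url"
    else
      echo "$name: not reachable at $url"
    fi
  done
}
```

Calling `check_endpoints` after the containers have started prints one line per service.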
Both the PostGIS database and the SOLR-based catalog persist on your local machine when Docker is stopped or the machine powers off. Occasionally, however, you may need to reset the PostGIS database and / or the SOLR index for the catalog.
To reset the PostGIS database, it is easiest to rebuild the gaia-db docker image and the gaia-db docker volume. To do this, navigate to the gaiaDocker directory in a command line / terminal window and run:
docker-compose --profile gaia down
docker image rm gaia-db
docker volume rm gaia-db
docker-compose --profile gaia up -d
You can update the catalog entries by pulling new json files from the gaiaCatalog repository. In a command line / terminal window, navigate to the gaiaCatalog directory and run:
git pull origin main
This updates the json files used as the source of truth for the catalog. NOTE: once the json is updated, the SOLR index must also be updated. The easiest way to do this is to navigate to the gaiaDocker directory in a command line / terminal window and rebuild the gaia-solr docker image and gaia-solr docker volume as follows:
docker-compose --profile gaia down
docker image rm gaia-solr
docker volume rm gaia-solr
docker-compose --profile gaia up -d
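The two reset procedures above differ only in the service name, so they can be generated from one helper. This is a sketch rather than a script shipped with the repository: the function name `reset_commands` is hypothetical, and it only prints the four commands so they can be reviewed before anything is deleted.

```shell
# Print the reset sequence for one service (gaia-db or gaia-solr) without running it.
reset_commands() {
  service="$1"
  case "$service" in
    gaia-db|gaia-solr) ;;
    *) echo "unsupported service: $service" >&2; return 1 ;;
  esac
  printf '%s\n' \
    "docker-compose --profile gaia down" \
    "docker image rm $service" \
    "docker volume rm $service" \
    "docker-compose --profile gaia up -d"
}

# Review the commands first; nothing is executed here.
reset_commands gaia-db
```

To actually perform the reset, the printed commands can be piped to a shell from the gaiaDocker directory, for example `reset_commands gaia-solr | sh`. This removes the stored data for that service.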
In the directory where this README.md is located, in a command line / terminal window, use the below command to terminate the containers.
docker compose --profile gaia down
--or--
docker-compose --profile gaia down
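Both the start and stop steps offer two spellings of the Compose command. If you are unsure which one your installation provides, a small helper can detect it. This is a sketch; the function name `compose_cmd` is hypothetical, and it prefers the `docker compose` plugin when both are present.

```shell
# Print whichever Compose spelling is available:
# "docker compose" (V2 plugin) or "docker-compose" (standalone binary).
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    echo "no Docker Compose installation found" >&2
    return 1
  fi
}

# Show (do not run) the stop command that matches this machine, if Compose is installed.
if cmd=$(compose_cmd 2>/dev/null); then
  echo "$cmd --profile gaia down"
fi
```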