
docassemble-compose

This repository demonstrates a way to run docassemble using Docker Compose.

Synopsis

To start up:

git clone https://github.com/jhpyle/docassemble-compose
cd docassemble-compose
./setup.sh
docker compose up --build --detach

To shut down, delete application data, and free up disk space:

docker compose down --volumes --rmi all

Requirements

The system requirements for using the Docker Compose implementation are the same as those described in the Docker section of the docassemble documentation. You should have at least 4GB of RAM and at least 40GB of hard drive space. The AMD64 (x86_64) architecture is recommended; the ARM64 architecture is available, if somewhat experimental.

Using a Linux server in the cloud is recommended.

If you want to install it on Windows, then:

  • Install WSL2. Using a recent version of Ubuntu (e.g., Ubuntu 24.04) is recommended.
  • Install git for Windows.
  • Install Docker Desktop for Windows.
  • Make sure Docker Desktop is integrated with WSL2.
  • With Docker Desktop running, launch a WSL2 command line (e.g., open the Ubuntu 24.04 "app").

How to install Docker Compose

The method for installing Docker Compose will depend on your host machine. Consult the internet for details.

On Debian, the command may be:

sudo apt-get install docker-compose

On Ubuntu, the command may be:

sudo apt-get install docker-compose-v2

However, if you are using Docker's repository, the command would be:

sudo apt-get install docker-compose-plugin

On Amazon Linux, the command may be:

sudo yum install docker-compose-plugin

On Windows, Docker Compose is included with Docker Desktop.

How to download docassemble-compose

Make sure git is installed. Then download docassemble-compose:

git clone https://github.com/jhpyle/docassemble-compose

This will create a directory called docassemble-compose in the current directory.

Switch into that directory:

cd docassemble-compose

From here, you can run Docker Compose commands. First, however, you need to create the necessary configuration files. To do so, run:

./setup.sh

This is an interactive script that will ask you questions about your deployment and create the appropriate configuration files. The script is designed for new installations, but you should use it even if you want to migrate an existing server.

If you want to use Let's Encrypt to obtain an SSL certificate, you will need to make sure that you have configured DNS so that your machine has an externally addressable hostname (not a default hostname that a provider like AWS might assign automatically). The ./setup.sh script will attempt to obtain an SSL certificate. However, if you already have an SSL certificate for your server (e.g., because you are migrating an existing server to Docker Compose), you can say no to the question "Do you want to try to obtain a certificate from Let's Encrypt now?"

You can then manually edit the configuration files to customize your deployment.

Configuration files

The following files must be present in order for the application to start. Initial versions of the files are created by the setup.sh script based on template files (files in the templates directory). These files can be customized to change the way that the application operates.

  • docker-compose.yml: specifies the "services," "volumes," and "secrets" that Docker Compose should manage.
  • initial-config.yml: when the container running the docassemble service starts up, it checks to see if the file /config/config.yml exists in the container. If it does not exist, initial-config.yml will be copied to /config/config.yml. Note that the /config directory is a Docker volume. The initial-config.yml is only the initial configuration. Once your docassemble server starts, the actual Configuration will be located at /config/config.yml inside the Docker volume, and it is not tied to initial-config.yml.
  • nginx.conf: specifies the NGINX configuration used by the nginx service.
  • initial-creds.sh: when the docassemble server starts for the first time, it will create a user account with administrative privileges. This file contains the username and password of the administrative user, and can also contain an API key owned by the administrative user. This file can be blank, and making it blank is a good idea once the docassemble server starts, so that your credentials are not available in plain text format on your host or in the docassemble container.
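As a sketch, the initial-creds.sh file contains shell variable assignments. The variable names below follow the conventions used elsewhere in docassemble's Docker tooling, but treat them as illustrative and check the file that setup.sh generates (or the templates directory) for the names this repository actually uses:

```shell
# Illustrative only: confirm the variable names against the generated
# initial-creds.sh before relying on them.
DA_ADMIN_EMAIL=admin@example.com
DA_ADMIN_PASSWORD=replace-me
DA_ADMIN_API_KEY=
```

Once the server has started and the administrative account exists, these values can be blanked out.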

By changing docker-compose.yml and initial-config.yml, you can customize your setup. For example, if you are using AWS RDS to provide a SQL database, you can remove the postgres service (and its volume, pg_data) from docker-compose.yml and then modify initial-config.yml to include the db configuration you want to use. If you are using a managed or external Redis service, you can remove the redis service (and its volume, redis_data) from docker-compose.yml and then modify initial-config.yml to include the redis URL that points to your external Redis service.
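For example, after removing the postgres and redis services, the relevant portion of initial-config.yml might look like the following sketch. The hostnames, passwords, and port values are placeholders; the db and redis directives themselves are standard docassemble Configuration directives:

```yaml
# Point docassemble at an external PostgreSQL server (e.g., AWS RDS).
db:
  prefix: postgresql+psycopg2://
  name: docassemble
  user: docassemble
  password: replace-me
  host: mydb.abc123.us-east-1.rds.amazonaws.com  # placeholder hostname
  port: 5432

# Point docassemble at an external Redis server.
redis: redis://:replace-me@my-redis.example.com:6379
```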

Using a random secret key

The setup.sh script sets secretkey in initial-config.yml to a random 32-character alphanumeric string. This is important for security reasons. The secretkey needs to be kept secret because it is the basis of encryption on the server. For the same reason, you should not change it after your server starts, and you should not lose it. If you lose it (e.g., you delete it or overwrite it in your Configuration), encrypted passwords and interview answers on the server will not be able to be decrypted.
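The kind of key the script generates can be sketched in Python. This is an illustration of the format (32 random alphanumeric characters), not the repository's actual generation code:

```python
import secrets
import string


def generate_secretkey(length: int = 32) -> str:
    """Return a random alphanumeric string of the given length,
    suitable for use as a docassemble secretkey."""
    alphabet = string.ascii_letters + string.digits
    # secrets.choice uses a cryptographically secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(generate_secretkey())
```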

Using Let's Encrypt

The setup.sh file asks you if you want to use HTTPS with Let's Encrypt. If you say yes, it includes the certbot service in docker-compose.yml, which renews the certificate every 24 hours. The setup.sh script also obtains the initial certificate, so make sure you have set up a DNS entry to associate your hostname with the machine running Docker Compose before you run setup.sh.
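The 24-hour renewal loop follows a common Docker Compose pattern for certbot. The following is a simplified sketch, not the repository's actual service definition, which is generated from a template and may differ:

```yaml
  certbot:
    image: certbot/certbot
    volumes:
      - certbot-certs:/etc/letsencrypt
      - certbot-www:/var/www/certbot
    # Re-run "certbot renew" every 24 hours; exit cleanly on SIGTERM.
    # ($$ escapes $ in docker-compose.yml.)
    entrypoint: sh -c 'trap exit TERM; while true; do certbot renew; sleep 24h & wait $${!}; done'
```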

Customizing the container

If editing docker-compose.yml and initial-config.yml does not provide sufficient flexibility, you can customize the application further by modifying the core Docker image, which contains the docassemble web application. The services directory contains a Dockerfile.core file and a core subdirectory with files that Dockerfile.core uses. The docker-compose.yml file references Dockerfile.core:

    build:
      context: ./services
      dockerfile: Dockerfile.core
    image: docassemble

Note that the image's tag is simply docassemble, not jhpyle/docassemble. The image will be built when you run docker compose up --build --detach. You can change the way the container operates by changing the following files in services:

  • core/run-www.sh: a bash script that runs as www-data and launches uwsgi, the celery workers, and the websockets listener.
  • core/run.sh: a bash script that runs as root and launches cron and run-www.sh.
  • core/cron.sh: a bash script that cleans up files in the temporary directory and executes docassemble scheduled tasks.
  • core/crontab: the file that will be installed at /etc/crontab and that will cause cron.sh to be executed.
  • core/docassemble.ini: the configuration file for uwsgi.
  • Dockerfile.core: this contains instructions on how to build the image containing the docassemble application. It is based on Ubuntu 24.04. The Ubuntu packages that are available within the container are determined by apt-get commands in this file.

Starting the application

When you are sure the correct configuration files are in place, you can start everything by running:

docker compose up --build --detach

Stopping the application

To stop each of the containers spawned by docker compose up, run:

docker compose down

This will not delete any application data or images. If you want to delete the application data, run:

docker compose down --volumes

If you want to delete the images, run:

docker compose down --rmi all

Editing files in a docker volume

If you need to edit your config.yml file and you cannot do so through the web interface, then while your application is running, you can do:

docker cp docassemble-compose-docassemble-1:/config/config.yml .
nano config.yml
docker cp config.yml docassemble-compose-docassemble-1:/config/config.yml
docker exec docassemble-compose-docassemble-1 chown www-data:www-data /config/config.yml
rm config.yml

(In place of nano, you can use whatever editor you want to use.)

Upgrading

To upgrade the application, cd to the docassemble-compose directory and run:

git pull
docker compose down --rmi all
docker volume rm docassemble-compose_app
docker compose up --build --detach

git pull will download a new version of docassemble-compose, if one exists. If you have modified any of the files that are part of the repository, you may have merge conflicts that you will need to resolve.

docker compose down --rmi all stops the application and removes the Docker images. This is necessary because if your docker-compose.yml calls for nginx:latest, and an nginx:latest image is already present (i.e., it is listed when you run docker images), then Docker Compose will use that version of the image, even if it is very old. Removing the images forces Docker Compose to build or download new versions of the images. Removing the images does not delete application data, because your application data is in Docker volumes.

docker volume rm docassemble-compose_app removes the app volume, which is mounted at /app and contains software for the docassemble web app. It is a Docker volume for purposes of sharing across containers, but it does not contain application data.

docker compose up --build --detach downloads and/or builds new versions of the images as necessary and starts the application again.

As long as you do not delete your Docker volumes (e.g., with docker volume rm or docker compose down --volumes) then when the services start up again, they will use the application data in your volumes. The initial-config.yml file will not be used because there is already a config.yml file in the app volume. Your PostgreSQL service will use the data in the pg_data volume. Your Redis server will use the data in the redis_data volume. If you are using Let's Encrypt, the nginx service will use the certificates in the certbot-certs volume. The rabbitmq service will use the data in rabbitmq_data, although it is unlikely that the contents of that volume are important.

Note that your docker-compose.yml file specifies particular images and tags. When you do git pull, your docker-compose.yml file will not change. If docassemble-compose has a new version of the docker-compose.yml.template file, you will not see those changes unless you delete your docker-compose.yml file and rerun setup.sh. Note that if you do that, you may need to manually migrate the data in your pg_data and redis_data Docker volumes. For example, if the data in pg_data were created by postgres:18, and you run postgres:19 while mounting the same pg_data volume, the postgres:19 container may not know how to read the data in the pg_data volume unless you perform manual migration steps.
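One way to avoid a surprise version jump is to pin the db service to the major version that created the pg_data volume. A sketch, with an illustrative version number:

```yaml
services:
  db:
    # Pin the major version the pg_data volume was created with,
    # rather than tracking "latest".
    image: postgres:16
```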

Managing application data

Application data are stored in Docker volumes. You will need to make sure you do not accidentally delete these Docker volumes.

If you want to transfer your docassemble server from one host to another, you will need to copy the volumes from one server to another. In the section below, "You want to move to a new server and the dabackup volume exists on the current machine," there is an explanation of how to use the command line to transfer a Docker volume from one machine to another.

If you run docker volume ls, you will see several volumes:

DRIVER    VOLUME NAME
local     docassemble-compose_app
local     docassemble-compose_config
local     docassemble-compose_files
local     docassemble-compose_pg_data
local     docassemble-compose_rabbitmq_data
local     docassemble-compose_redis_data
local     docassemble-compose_tempdata

If you are using Let's Encrypt, you will also see:

local     docassemble-compose_certbot-certs
local     docassemble-compose_certbot-www

docassemble-compose_app does not contain application data. It contains software, including bash scripts and a Python virtual environment. It is a volume because it needs to be accessible by other containers. In order to upgrade docassemble-compose, you will need to delete this volume.

docassemble-compose_config contains the config.yml file. It is mounted at /config/ inside the docassemble container. A Docker volume is used instead of a bind mount because the file needs to be readable and writable by the web application; if it were a bind mount, managing file permissions would be difficult.

docassemble-compose_files contains the uploaded files and Playground files. It will be empty if you are using S3 or Azure Blob Storage.

docassemble-compose_pg_data contains the files for the PostgreSQL database. The db service uses it.

docassemble-compose_rabbitmq_data contains application data for the RabbitMQ task queue system. During a migration, these files are important only if you shut down Docker Compose while there are pending tasks, and you want the tasks to resume.

docassemble-compose_redis_data contains the Redis database for the docassemble application and the background tasks system.

docassemble-compose_tempdata contains a filesystem of temporary files. The volume does not contain important application data. It is a volume because it needs to be shared across containers. If you are doing a migration, you can ignore this volume.

docassemble-compose_certbot-certs contains SSL certificates obtained by the certbot service. If you are migrating, you should copy this volume. However, certbot may issue you a new certificate even if you lose the contents of this volume.

docassemble-compose_certbot-www does not contain application data. It is a volume that is shared between the certbot and nginx services for purposes of communicating with Let's Encrypt servers.

Troubleshooting

If something goes wrong, you can troubleshoot by running docker compose commands from the docassemble-compose directory.

See the logs of the docassemble service, which includes the web application, celery workers, scheduled task system, and the websockets application:

docker compose logs docassemble

See the logs of the nginx service:

docker compose logs nginx

You can get a shell inside the docassemble service, where you can explore the files in the running container:

docker compose exec docassemble /bin/bash

The following does the same thing, but runs as the root user:

docker compose exec -u root docassemble /bin/bash

Before you start the docassemble service, you can get a shell inside the docassemble service:

docker compose run --rm --no-deps docassemble /bin/bash

Note that this will create the app volume at /app if it does not already exist. You can use this to edit the files in /app before starting the service.

You can get a shell inside the nginx service, where you can explore the configuration files supporting the nginx service:

docker compose exec nginx /bin/bash

You can use the Redis database:

docker compose exec redis redis-cli

Note that you will need to run AUTH abc123 to authenticate with the server before you can run redis commands.

You can run psql against the docassemble PostgreSQL database:

docker compose exec db psql -U docassemble -W docassemble

Note that it will ask you for the password, which is abc123.

Differences from the single Docker image deployment

This method of deploying docassemble is entirely separate from the method described in the Docker section of the docassemble documentation; that section of the documentation will be of little help when using docassemble-compose. It describes the jhpyle/docassemble image, which is not used by this Docker Compose implementation.

The monolithic jhpyle/docassemble Docker image uses supervisord to orchestrate services like Redis, RabbitMQ, PostgreSQL, NGINX, uWSGI, and Unoconv. By contrast, this implementation uses Docker Compose to orchestrate most of these services.

Control over NGINX

In the jhpyle/docassemble Docker image, Configuration values can be used to change the way that NGINX operates. This deployment does not work that way. For example, changing maximum content length in the Configuration will not affect NGINX's client_max_body_size setting; you will have to change that yourself.

The following Configuration directives will not affect the nginx service, and manual changes to nginx.conf are required.

  • root
  • url root
  • external hostname
  • http port
  • websockets port
  • nginx ssl protocols
  • nginx ssl ciphers
  • maximum content length
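For instance, to raise the upload size limit that the maximum content length directive would normally control, edit the server block in nginx.conf directly. The 16M value here is an example:

```nginx
server {
    # ...
    # Replaces what "maximum content length" would do in the
    # single-image deployment.
    client_max_body_size 16M;
}
```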

Persistence of application data

The Docker section describes methods of persisting application data in a single collection of backup files, stored in a Docker volume, an S3 bucket, or Azure Blob Storage, which is populated daily and on every safe shutdown. This set of backup files allows you to start a new container from a new image, with the application data automatically imported, even if the underlying OS of the image changed or the PostgreSQL database version changed.

This Docker Compose deployment method does not use this system. Rather, application data persist in several Docker volumes. Running docker compose down --volumes will delete all of the application data.

Migration of application data when upgrading

Note that if you upgrade the images in the Docker Compose application, there will be a version mismatch between the data in the Docker volume and the version of the software. The Docker Compose application uses a number of third-party images like postgres and redis, and these images may or may not have the ability to migrate data created by earlier versions. You may need to perform manual migration steps. See the documentation for those images.

Pandoc, LibreOffice, Tesseract, and Gotenberg

This method of deploying docassemble allows you to minimize disk usage by optionally not installing Pandoc, Tesseract, or LibreOffice. If you elect to use these features, they will run in separate containers that interact with the web application through Celery.

Conversion of DOCX files to PDF is handled by a Gotenberg service.

Non-essential functionality not installed

The Ubuntu packages installed in the web app container are kept to a minimum. As a result, certain features are not available.

  • fontconfig is not installed, so the fc-list command to list the installed fonts on the server is not available.
  • The mmdc() function, which is deprecated, is not supported because it requires node.js and Chrome to be installed.

Logs

The Logs menu item is not available when you use this deployment method. Log files are managed through Docker Compose.

Features not available yet

Some features are not yet supported, including:

  • Using SSL certificates to authenticate with a PostgreSQL database.
  • Setting a server time zone.
  • Setting a server locale.
  • Using an alternative pip package index.

Migrating from single Docker image deployment

Before you try to migrate an existing server from the jhpyle/docassemble single image deployment to the Docker Compose deployment, make sure you are confident in your knowledge of the Linux command line and Docker Compose. Do not irreversibly destroy your current deployment; you can shut it down while you attempt migration, but you need to plan to revert back to using it if the migration to Docker Compose does not succeed.

You also need to make sure that you have enough free disk space for the Docker Compose deployment. The software alone may take upwards of 10GB, and you need space for a copy of all of your stored files, your SQL database, and your Redis database.

Migrating your application data is a delicate process. There is no automated tool that will do it.

There are several components that need to be migrated:

  1. Configuration directives need to be copied and pasted into initial-config.yml.
  2. The SQL server needs to be copied, unless you are using an external SQL server currently.
  3. The Redis database needs to be copied, unless you are using an external Redis server currently.
  4. The stored files need to be copied, unless you are using S3 or Azure Blob Storage.
  5. If you are using Let's Encrypt, the SSL certificates need to be copied.

How you restore the SQL database, Redis database, and stored files depends on what form of data storage you are using.

The migration instructions in the following sections assume that on the same machine where you want to run Docker Compose, you have a Docker volume called dabackup that contains the application data you want to migrate.

Using the same server and dabackup already exists

If your current server uses a Docker volume for persistent storage, and you plan to run Docker Compose on the same host, the assumptions of the following sections (beginning with "Migrating the Configuration") are met. The dabackup Docker volume is mounted at /usr/share/docassemble/backup in the container running the jhpyle/docassemble image, and it contains the files you need. You can proceed to the Migrating the Configuration section.

Using the same server and dabackup does not exist

If you don't have a dabackup volume, you can create one and copy files from your current jhpyle/docassemble container to the dabackup volume.

First, find the name or ID of your running jhpyle/docassemble container.

docker ps -a

Suppose the container has the ID 6fe0f76e8b81. Then you can do:

MYCONTAINER=6fe0f76e8b81
docker stop -t 6000 ${MYCONTAINER}
docker run --rm --name tempcontainer -d -v dabackup:/import ubuntu:24.04 sleep infinity
docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/postgres /tmp/
docker cp -a /tmp/postgres tempcontainer:/import/
rm -r /tmp/postgres
docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/redis.rdb /tmp/
docker cp -a /tmp/redis.rdb tempcontainer:/import/
rm /tmp/redis.rdb

If you are using Let's Encrypt, copy the Let's Encrypt certificates as well:

docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/letsencrypt.tar.gz /tmp/
docker cp -a /tmp/letsencrypt.tar.gz tempcontainer:/import/
rm /tmp/letsencrypt.tar.gz

If you are using cloud-based data storage, you do not need to migrate any stored files, but otherwise, you need to add the stored files to the dabackup volume.

docker cp -a ${MYCONTAINER}:/usr/share/docassemble/files /tmp/
docker cp -a /tmp/files tempcontainer:/import/
rm -r /tmp/files

When you are done populating the dabackup volume, stop the temporary container:

docker stop tempcontainer

Now the dabackup volume contains the files you need. To verify:

docker run --rm -v dabackup:/data alpine ls -lR /data

You can proceed to the Migrating the Configuration section.

Here is an explanation of the above commands. First, your running container is stopped, and is given plenty of time to shut down (docker stop -t 6000 ${MYCONTAINER}). This will ensure that up-to-date application data exists in /usr/share/docassemble/backup and no new application data will be generated while you are performing the migration.

Next, a temporary container is run, solely for the purpose of receiving files via docker cp and storing them in the Docker volume dabackup.

docker run --rm --name tempcontainer -d -v dabackup:/import ubuntu:24.04 sleep infinity

The -d parameter "detaches" the container, so it runs in the background. The --name tempcontainer parameter gives the container a name. The -v dabackup:/import parameter creates the dabackup volume and mounts it at /import inside the container. The command sleep infinity does nothing; the container will just wait indefinitely.

The next command copies the postgres directory from the internal backup directory inside the container and writes it to the /tmp directory on the host.

docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/postgres /tmp/

The -a parameter to docker cp means "archive." It ensures that file attributes are preserved during the copy operation.

The next command copies the postgres directory from the /tmp directory on the host to the dabackup volume, using tempcontainer and its Docker volume mount to receive the file.

docker cp -a /tmp/postgres tempcontainer:/import/

This is a two-step process because docker cp does not support copying from one container to another. Finally we delete the copy of the data in /tmp because we don't need it anymore.

rm -r /tmp/postgres

The other commands work the same way.

You want to move to a new server and the dabackup volume exists on the current machine

If you want to run Docker Compose on a new host machine by following the instructions below, the dabackup volume needs to exist on the new machine, not the old machine. If you have a dabackup volume on your current machine, you can copy it to your new machine using command line operations.

The ssh command can be used to make a secure network connection between your current machine and the new machine.

The following instructions assume that you have an SSH private key (referred to below as a certificate) that you use to connect to your new machine through ssh or a similar application (e.g., PuTTY). If you want to make an ssh connection between your current server and your new server, the key needs to exist as a file on your current server.

Your certificate is just a text file that looks like this:

-----BEGIN RSA PRIVATE KEY-----
[several lines of random characters]
-----END RSA PRIVATE KEY-----

You can use copy and paste to recreate this file.

On your current server, you can do:

cat > mycert.rsa

It will then wait for your input. You can paste the contents of your certificate into the terminal, then type Ctrl-d. Now you can do:

cat mycert.rsa

to see what your certificate looks like and verify that it was created correctly.

You might get a complaint from ssh if your certificate is not locked down, so you should do:

chmod og-rwx mycert.rsa

Here is an example command that you can run (after customizing it) on your current server to copy your dabackup volume from the current server to a new server.

REMOTEUSER=ubuntu
REMOTEHOST=20.84.121.64
CERTFILE=mycert.rsa
docker run --rm -v dabackup:/from alpine tar -C /from -c -f - \
files postgres letsencrypt.tar.gz redis.rdb | ssh -i \
${CERTFILE} ${REMOTEUSER}@${REMOTEHOST} 'docker run --rm -i \
-v dabackup:/to alpine tar -C /to -x -p -v -f -'

This assumes:

  • On the new server, you log in with the username ubuntu
  • The new server's IP address is 20.84.121.64 (an external hostname would also work)
  • The name of your certificate file is mycert.rsa.
  • The dabackup volume contains the directories files and postgres, as well as files letsencrypt.tar.gz and redis.rdb. If you aren't using Let's Encrypt, take out the reference to letsencrypt.tar.gz. If you are using cloud-based data storage, take out the reference to files.

The copy operation may take a long time, especially if there are a lot of files in the files directory.

This is a pretty complicated one-liner. The basic idea is that it runs tar inside of a temporary Docker container on the local machine and simultaneously runs tar inside of a temporary container on the remote machine. Each container mounts the dabackup volume. The local tar command creates an archive, and the archive is sent as a data stream to the remote tar command, which extracts the archive. The output of the local tar command is sent to the remote tar command by way of a pipe (|). ssh is used to run the remote tar command from the local machine while sending the output of the local tar command to the remote tar command.

  • alpine is the name of a very minimal Linux distribution that contains basic commands like tar.
  • -v dabackup:/from means that the dabackup volume will be available at the path /from.
  • tar is an archiving utility.
    • -C /from means to make /from the active working directory
    • -c means "create" an archive
    • -f indicates the output file containing the archive and -f - has a special meaning, which is to write the file to standard output rather than to a file on the file system
    • files postgres letsencrypt.tar.gz redis.rdb is the list of paths to be included in the archive
  • The | character indicates that the output of the previous command should be "piped" to the next command, which is ssh.
  • ssh can do many things but in this context it runs a command on a remote server.
    • -i ${CERTFILE} means to use the certificate indicated in the file mycert.rsa
    • ${REMOTEUSER}@${REMOTEHOST} means that ssh should connect to the host 20.84.121.64 and log in as the user ubuntu
    • 'docker run --rm -i -v dabackup:/to alpine tar -C /to -x -p -v -f -' is the command that will be run on 20.84.121.64. In this docker run command:
      • --rm means remove the container afterward
      • -i means that the container should run interactively, meaning that the output from the previous command will be the input to the command that docker run will run.
      • -v dabackup:/to means mount the dabackup volume (which will be created if it doesn't exist) at the path /to
      • alpine, as before, is a minimal Linux distribution
      • tar -C /to -x -p -v -f - is the command to run inside the alpine container.
        • -C /to means to change the current working directory to /to, which is where the dabackup volume is mounted
        • -x means "extract" an archive
        • -p means preserve file permissions when extracting
        • -v means tar should be "verbose" in its output
        • -f indicates the input file containing the archive and -f - has a special meaning, which is to use the standard input as the archive rather than a file on the file system

You want to move to a new server and the dabackup volume does not exist on the current machine

If you want to move to a new server and you do not have a dabackup volume on the current machine, you can follow the instructions in the "Using the same server and dabackup does not exist" section above, followed by the instructions in the "You want to move to a new server and the dabackup volume exists on the current machine" section above.

Then, you will have a dabackup volume on your new server, and you can continue to the next section.

Migrating the Configuration

If you made modifications to the Configuration of your existing server, you will need to bring those modifications over to your initial-config.yml file.

To get your Configuration from your dabackup volume, you can do:

docker run --rm -v dabackup:/data alpine cat /data/config.yml > old-config.yml

The Docker Compose deployment relies on certain configuration directives to operate, so if you simply copy your existing config.yml file and overwrite initial-config.yml, your deployment will fail. You will need to copy over some directives but not others.

Items in the generated initial-config.yml file that you should not disturb are:

  • external hostname - should have been correctly set already by ./setup.sh
  • behind https load balancer - should have been set correctly to true by ./setup.sh; the nginx service acts as an HTTP proxy for the docassemble web app
  • url root - should have been correctly set by ./setup.sh
  • enable unoconv - should be false because this deployment does not support unoconv
  • use nginx to serve files - should be true, although you can turn it off if it causes any problems.
  • redis - only change this if you are using an external Redis server (see "Migrating Redis," below).
  • rabbitmq - this will point to the RabbitMQ server started by Docker Compose
  • gotenberg - the url attribute within gotenberg will point to the Gotenberg server started by Docker Compose
  • packages
  • uploads
  • log
  • ready file
  • webapp
  • version file
  • log to std
  • allow log viewing
  • pandoc
  • expose websockets
  • libreoffice
  • pandoc with celery
  • libreoffice with celery
  • tesseract with celery
  • celery modules
  • celery task routes
  • db - only change this if you are using an external SQL server (see "Migrating SQL," below)

Also, note that you may have configuration options in your Configuration that are not applicable in the Docker Compose deployment. These include:

  • root owned
  • os locale
  • other os locales
  • ubuntu packages
  • python packages
  • backup days
  • websockets
  • websockets ip
  • websockets port
  • http port
  • use lets encrypt
  • lets encrypt email
  • nginx ssl protocols
  • nginx ssl ciphers
  • update on start
  • enable unoconv
  • timezone
  • web server
  • pip index
  • supervisor
  • backup file storage

You should not copy these Configuration directives.

It is very important to copy over the secretkey from your existing deployment. Replace the secretkey that was generated by ./setup.sh.

You should run your initial-config.yml through a YAML linter when you are finished editing it, to make sure you don't have any duplicate keys or other errors that may cause problems.

sudo apt-get install yamllint
yamllint initial-config.yml

(Note that line-length errors are not a problem.)
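If yamllint is not available, a rough duplicate-key check for top-level directives can be improvised with standard tools. This sketch runs against a throwaway sample file; point the same pipeline at your initial-config.yml.

```shell
# Create a sample config with a deliberately duplicated top-level key
cat > /tmp/sample-config.yml <<'EOF'
external hostname: example.com
secretkey: abc123
external hostname: example.org
EOF
# Print any top-level key that appears more than once
grep -E '^[A-Za-z]' /tmp/sample-config.yml | cut -d: -f1 | sort | uniq -d
```

This is only a rough check (it ignores nested keys and quoting); yamllint gives a more thorough report.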

Migrating SQL

If you currently use an external SQL server, like AWS RDS, this part is easy: simply edit your initial-config.yml file and update the db configuration, then edit your docker-compose.yml file to remove the db service and the pg_data volume.

Otherwise, you need to migrate your SQL server's data to the db service in your Docker Compose deployment.

Before migrating your SQL database, it is important to safely docker stop your running jhpyle/docassemble container. This will ensure that the database is dumped to the postgres/docassemble file in data storage.

If you have a dabackup volume, start up the db service and attach dabackup to it at the path /import:

docker compose run --name tempdb --rm -d -v dabackup:/import db

The postgres:latest image will download and run as a container named tempdb. The docassemble database will be created, and the docassemble user will be given control over the docassemble database with the password abc123. This is all specified in the docker-compose.yml file under the db service. The --rm parameter means that the container should be deleted when it stops. (Note that the removal of the container does not mean the removal of the database, because the database is stored in the pg_data volume.) The -d parameter means that the db service should run in the background, so that you are returned to the command line. The -v dabackup:/import parameter means that the dabackup volume will be available inside the container at /import.

Now that the db service is running, you can use docker compose exec to run commands inside the db container, and those commands can make use of the files in the dabackup volume (mounted at /import). The following command restores the docassemble database from the backup file located in the postgres directory inside the dabackup volume:

docker compose exec db pg_restore -F c -c -d docassemble \
-U docassemble -W /import/postgres/docassemble

It will ask for a password, and the password is abc123 (as was specified in the docker-compose.yml file). The meaning of this command is:

  • Run docker compose
  • Execute a command in the container of the db service
  • The command to execute is pg_restore with the following parameters:
    • -F c specifies the format of the backup file ("custom")
    • -c means to clean the database first (this can be omitted if you know that the database is empty)
    • -d docassemble means restore to the docassemble database
    • -U docassemble means to connect to the PostgreSQL server as the docassemble user
    • -W means that pg_restore should interactively ask the user for a password
    • /import/postgres/docassemble is the path to the file containing the database backup

Note that this command may produce a number of error messages about tables and sequences not existing. This is expected when -c is used and the database is empty. It may end with a statement like:

pg_restore: warning: errors ignored on restore: 86

Now you can stop the container you created when you ran docker compose run.

docker stop tempdb

The pg_data volume now contains a populated SQL database ready to be used by the docassemble application.

Migrating Redis

If you currently use an external Redis server, this part is easy: simply edit your initial-config.yml file and update the redis directive, then edit your docker-compose.yml file to remove the redis service and the redis_data volume. Also, in the docker-compose.yml file, replace redis://:abc123@redis with your custom Redis URL wherever you see it.

Otherwise, you need to migrate your Redis server's data to the redis service in your Docker Compose deployment.

Before migrating your Redis database, it is important to safely docker stop your running jhpyle/docassemble container. This will ensure that the database is dumped to the redis.rdb file in data storage.

If your data storage method uses a Docker volume, you can copy the redis.rdb file from the volume to the redis_data volume by doing:

docker compose run --rm -v dabackup:/import redis install \
-o redis -g redis /import/redis.rdb /data/dump.rdb

The meaning of this command is:

  • Run docker compose
  • "Run" the redis service with the following options:
    • --rm means delete the container afterwards
    • -v dabackup:/import means mount the dabackup volume at /import inside the container
  • Instead of running the default command of the service, run the install command with the following parameters:
    • -o redis -g redis means the file being installed should be owned by the redis user
    • /import/redis.rdb is the source file, the backup of your Redis database in your dabackup volume
    • /data/dump.rdb is the destination file (the /data directory is where the redis service keeps its data)
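The effect of install can be previewed locally: it copies the file and sets its metadata in one step. (The ownership flags -o and -g require root, so this sketch sets only the mode.)

```shell
# Make a stand-in for redis.rdb and "install" it as dump.rdb with mode 644
echo 'stand-in dump' > /tmp/redis.rdb
install -m 644 /tmp/redis.rdb /tmp/dump.rdb
ls -l /tmp/dump.rdb
```

Unlike a plain cp, install atomically combines the copy with setting the mode (and, as root, the owner and group), which is why it is used for the Redis dump file above.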

Then you need to start the server with the Append-Only File (AOF) feature off.

docker compose run --rm --name tempredis -d redis redis-server --appendonly no

Redis will read the dump.rdb file and use its contents as the database.

The AOF feature is a good thing, so you should turn it on.

docker compose exec redis redis-cli BGREWRITEAOF

Now stop the server.

docker stop tempredis

The redis_data volume now contains a populated Redis database ready to be used by the docassemble application.

Migrating stored files

If you are currently using S3 or Azure blob storage, this part is easy: simply edit your initial-config.yml file and update the s3 or azure section with the information about your cloud data storage.

If you are not using cloud data storage, but you are using a Docker volume for data storage, the stored files exist in your dabackup volume under the directory files. You need to copy the files directory to the place where the Docker Compose deployment can use it.

Before migrating your stored files, it is important to safely docker stop your running jhpyle/docassemble container. This will ensure that the files are backed up to data storage.

First, you need to fix the symbolic links in the files directory. Prior to docassemble version 1.8.16, the symbolic links in the files directory used absolute rather than relative paths, and they won't work with Docker Compose because the Docker Compose implementation uses /app instead of /usr/share/docassemble. To fix this in your dabackup volume, run the following:

docker run --rm -v ./services/core/fixsym.sh:/fixsym.sh -v dabackup:/app ubuntu:24.04 /bin/bash /fixsym.sh

This may take a few minutes to complete.

Next, you need to build the image for the docassemble service.

docker compose build docassemble

This will take some time.

Then, you can copy the files directory from your dabackup volume to the location where it can be used by the docassemble service.

docker compose run --rm --no-deps -v dabackup:/import docassemble cp -r /import/files /

The meaning of this command is:

  • Run docker compose
  • "Run" the docassemble service with the following options:
    • --rm means delete the container afterwards
    • --no-deps means that Docker Compose should not start any other service except for docassemble
    • -v dabackup:/import means mount the dabackup volume at /import inside the container
  • Instead of running the default command of the service, run the cp command with the following parameters:
    • -r means to copy recursively, so that the contents of the directory are copied
    • /import/files is the source directory (a backup copy of your stored files)
    • / is the destination directory (/files is where the stored files are located)
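The behavior of cp -r here can be demonstrated with throwaway directories; the source directory itself (files) ends up inside the destination:

```shell
# Simulate the dabackup layout and copy the files directory into a
# scratch "root" directory, mirroring cp -r /import/files /
mkdir -p /tmp/import/files
echo 'stored file' > /tmp/import/files/example.txt
mkdir -p /tmp/fakeroot
cp -r /tmp/import/files /tmp/fakeroot/
ls /tmp/fakeroot/files
```

Because the source path has no trailing /., the directory files is created inside the destination, which is how /import/files becomes /files in the container.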

Migrating Let's Encrypt certificates

If you are running docassemble using HTTPS and you use Let's Encrypt to obtain certificates, you can migrate certificates from your jhpyle/docassemble deployment to your Docker Compose deployment.

If you are using a Docker volume dabackup for data storage, a copy of your Let's Encrypt certificates is located in the letsencrypt.tar.gz file in the volume. The letsencrypt.tar.gz archive contains an etc directory, which in turn contains a letsencrypt directory holding the files used by Let's Encrypt.

To unpack the letsencrypt.tar.gz file into a place where the Docker Compose deployment can use its contents, run:

docker compose run --rm --no-deps -v dabackup:/import --entrypoint tar \
certbot -C / -zxf /import/letsencrypt.tar.gz

The meaning of this command is:

  • Run docker compose
  • "Run" the certbot service with the following options:
    • --rm means delete the container afterwards
    • --no-deps means that Docker Compose should not start any other service except for certbot
    • -v dabackup:/import means mount the dabackup volume at /import inside the container
    • --entrypoint tar forces the service to run tar instead of the default entrypoint command.
  • Instead of running the default command of the service, run the tar command with the following parameters:
    • -C / means to change the working directory to / before extracting
    • -zxf /import/letsencrypt.tar.gz means use the gzip compression format (z), extract (x), from the file /import/letsencrypt.tar.gz (f /import/letsencrypt.tar.gz)
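The -C and -zxf flags can be demonstrated locally with a miniature archive laid out like letsencrypt.tar.gz (an etc directory containing a letsencrypt directory):

```shell
# Build a miniature letsencrypt.tar.gz, then unpack it with -C into a
# scratch directory standing in for /
mkdir -p /tmp/src/etc/letsencrypt
echo 'pem data' > /tmp/src/etc/letsencrypt/demo.pem
tar -C /tmp/src -zcf /tmp/letsencrypt.tar.gz etc
mkdir -p /tmp/newroot
tar -C /tmp/newroot -zxf /tmp/letsencrypt.tar.gz
ls /tmp/newroot/etc/letsencrypt
```

Because the archive stores paths relative to etc, extracting with -C recreates etc/letsencrypt under whatever directory you choose, which is why extracting at / populates /etc/letsencrypt.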

This will populate files in /etc/letsencrypt, where the certbot service can use them.

Debugging your migration

Once you have migrated your data, you should be able to start the docassemble application in the normal fashion:

docker compose up --build --detach

If something goes wrong and you want to get your working docassemble site back up, you can docker compose down to stop everything, or use docker stop and docker rm manually to stop and remove the Docker Compose containers, and then docker start your stopped jhpyle/docassemble container.

To see logs, use:

docker compose logs

or to look at a specific service, use:

docker compose logs docassemble

To see what Docker volumes exist, run:

docker volume ls

To inspect what is inside a volume, you can run a command like:

docker run --rm -ti -v docassemble-compose_certbot-certs:/certs ubuntu:24.04 /bin/bash

This will give you a bash command line inside of a plain ubuntu:24.04 container with the Docker volume docassemble-compose_certbot-certs mounted at /certs. You can run this whether or not there are containers running that are using the Docker volumes.

You can use docker volume rm to remove a Docker volume and try populating it again.

You can get inside of a running container by running a command like:

docker compose exec certbot /bin/sh

Nearly all containers have /bin/sh available. Some services also have /bin/bash installed; use /bin/bash when it is available, because it is a more user-friendly shell.

If a container is not running, but you want to run a shell within that container, and see everything that is inside of it (including mounted volumes), you can run a command like:

docker compose run --rm -i --no-deps redis /bin/bash

Sometimes you have to run the command this way:

docker compose run --rm -i --no-deps --entrypoint /bin/bash redis

Pros and cons of using Docker Compose

There are several ways to deploy docassemble:

  1. Install it manually on a Linux system.
  2. Do docker run on the monolithic jhpyle/docassemble image.
  3. Use Kubernetes (https://github.com/jhpyle/charts).
  4. Use Docker Compose.
  5. Build your own system.

Manual installation is not recommended because it involves many steps and requires knowing what you are doing. Using the monolithic jhpyle/docassemble image is simple but may use more disk space than you need. Both the Kubernetes and Docker Compose methods separate services into multiple containers. In the case of Kubernetes, this enables scalability that is practically limitless. Docker Compose does not provide any greater scalability than using docker run.

You could create your own way of deploying docassemble, even if you don't know Python. At its core, docassemble is a Python application that looks for a config.yml file at a location indicated by the DA_CONFIG_FILE environment variable. By changing values in config.yml, you can change how it works. The db, redis, and rabbitmq directives need to point to running services, and docassemble assumes that certain command-line applications are available, such as convert (ImageMagick), pdftk, git, pdftoppm, and pip. You could run docassemble with Gunicorn or Waitress instead of uWSGI.
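As a minimal sketch of the environment a homegrown deployment would set up (the configuration path here is hypothetical, and the web-server invocation is left as a comment because the exact WSGI entry point depends on your installation):

```shell
# Point docassemble at its configuration file (hypothetical path)
export DA_CONFIG_FILE=/etc/docassemble/config.yml
# A custom deployment would then launch the web app with its chosen
# server, e.g. Gunicorn or Waitress instead of uWSGI (invocation
# omitted; consult the docassemble documentation for the entry point)
echo "$DA_CONFIG_FILE"
```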

The main benefit of Docker Compose is that it can conserve disk space and facilitate vulnerability scanning.

Disk space

The jhpyle/docassemble Docker image is a large image, mostly because it contains a full TeX Live installation in order to support document assembly through Pandoc. Some users do not use the Pandoc document assembly system, so their servers would work even if TeX Live were not installed. LibreOffice, which is used for document manipulation and conversion tasks (even when there is an external DOCX to PDF converter like Gotenberg, ConvertAPI, or CloudConvert), is another large package that users may or may not need. If a user does not need to perform optical character recognition (OCR), or wants to use a cloud service for OCR, they could save disk space by not installing the Tesseract application. A user might only need a few specific fonts in order to assemble PDF documents, and could save disk space by installing only the fonts they need. If a user wants to use an external PostgreSQL database, or an external Redis database, they can save (a relatively small amount of) disk space by not installing PostgreSQL and Redis.

One way to optimize disk usage is to build a custom Docker image. You can write a custom Dockerfile that installs only the Ubuntu packages that are necessary. Note that the jhpyle/docassemble image is built on a base image, jhpyle/docassemble-os, both of which are on GitHub and Docker Hub. Using copy-and-paste, it is possible to combine the Dockerfile of jhpyle/docassemble-os with the Dockerfile of jhpyle/docassemble into a single Dockerfile, and then remove the commands that install unnecessary components. Another way is to create a Dockerfile that builds a custom image by starting with the large jhpyle/docassemble image and then uninstalling the components that are not needed. For example:

FROM jhpyle/docassemble

RUN DEBIAN_FRONTEND=noninteractive TERM=xterm \
apt-get -q -y remove \
  pandoc \
  texlive \
  texlive-luatex \
  texlive-xetex \
  texlive-latex-recommended \
  texlive-latex-extra \
  texlive-font-utils \
  texlive-lang-all \
  texlive-extra-utils \
&& apt-get -q -y autoremove

This Docker Compose implementation allows you to optimize hard drive space by only downloading the images and running the containers that are necessary for your deployment.

The following services run using standard images from Docker Hub:

  • Redis
  • RabbitMQ
  • PostgreSQL
  • NGINX
  • Certbot
  • Gotenberg

In addition, there are four services whose custom images Docker Compose builds on startup, from the following Dockerfiles:

  • services/Dockerfile.core
  • services/Dockerfile.pandoc
  • services/Dockerfile.libreoffice
  • services/Dockerfile.tesseract

The interactive setup.sh script constructs a docker-compose.yml file based on the user's needs. If the user elects to use Pandoc, LibreOffice, or Tesseract, those applications operate inside of separate Docker Compose "services." The "core" service communicates with the other services using Celery (which uses the Redis and RabbitMQ services for interprocess communication).

The setup.sh script creates a docker-compose.yml file and an initial-config.yml file that can be modified further. If you want to use an external SQL database, you can remove the db service from docker-compose.yml (along with the pg_data volume) and edit the db configuration in initial-config.yml. The only service that is actually required is the core service; all the other services are optional or can be replaced with external services.

Using Docker Compose can consume more disk space than a tailored monolithic image, because each image contains its own copy of operating system files. With the images that the docker-compose.yml file builds itself, this concern is mitigated because the images have layers in common (the Dockerfiles start with the same lines), but it is an issue when you run multiple images downloaded from Docker Hub.

Vulnerability scanning

Segregating services into separate images has advantages if you wish to scan your Docker images for software vulnerabilities. Since scanning a large image may require large amounts of RAM, using multiple smaller images may avoid resource limits. When interpreting lists of CVEs, it is helpful to know which vulnerabilities are associated with which service. For example, if there is a vulnerability in an Ubuntu .deb package in the rabbitmq image, you can evaluate the risk of that vulnerability knowing that the rabbitmq service is not directly connected to the internet, can only accept input through port 5672, and is isolated by Docker in a container that does not share file systems with other services.
