This repository demonstrates a way to run docassemble using Docker Compose.
To start up:

```
git clone https://github.com/jhpyle/docassemble-compose
cd docassemble-compose
./setup.sh
docker compose up --build --detach
```
To shut down, delete application data, and free up disk space:

```
docker compose down --volumes --rmi all
```
The system requirements for the Docker Compose implementation are the same as those described in the Docker section of the docassemble documentation: you should have at least 4GB of RAM and at least 40GB of hard drive space. The AMD64 (x86_64) architecture is recommended; the ARM architecture is available but somewhat experimental.
Using a Linux server in the cloud is recommended.
If you want to install it on Windows, then:
- Install WSL2. Using a recent version of Ubuntu (e.g., Ubuntu 24.04) is recommended.
- Install git for Windows.
- Install Docker Desktop for Windows.
- Make sure Docker Desktop is integrated with WSL2.
- With Docker Desktop running, launch a WSL2 command line (e.g., open the Ubuntu 24.04 "app").
The method for installing Docker Compose will depend on your host machine. Consult the internet for details.
On Debian, the command may be:

```
sudo apt-get install docker-compose
```

On Ubuntu, the command may be:

```
sudo apt-get install docker-compose-v2
```

However, if you are using Docker's repository, the command would be:

```
sudo apt-get install docker-compose-plugin
```

On Amazon Linux, the command may be:

```
sudo yum install docker-compose-plugin
```
On Windows, Docker Compose is included with Docker Desktop.
Make sure `git` is installed. Then download docassemble-compose:

```
git clone https://github.com/jhpyle/docassemble-compose
```

This will create a directory called `docassemble-compose` in the current directory. Switch into that directory:

```
cd docassemble-compose
```
From here, you can run Docker Compose commands. First, however, you need to create the necessary configuration files. To do so, run:

```
./setup.sh
```
This is an interactive script that will ask you questions about your deployment and create the appropriate configuration files. The script is designed for new installations, but you should use it even if you want to migrate an existing server.
If you want to use Let's Encrypt to obtain an SSL certificate, you will need to make sure that you have configured DNS so that your machine has an externally addressable hostname (not a default hostname that a provider like AWS might assign automatically). The `./setup.sh` script will attempt to obtain an SSL certificate. However, if you already have an SSL certificate for your server (e.g., because you are migrating an existing server to Docker Compose), you can say no to the question "Do you want to try to obtain a certificate from Let's Encrypt now?"
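Before saying yes, you can sanity-check that your hostname resolves to an address. A minimal sketch in Python (the hostname shown is a placeholder for your own):

```python
import socket

def hostname_resolves(hostname: str) -> bool:
    """Return True if the hostname resolves to an IP address via DNS."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False

# Replace with your server's externally addressable hostname.
print(hostname_resolves("interviews.example.com"))
```

Note that this only checks resolution from the machine where it runs; Let's Encrypt additionally needs the name to resolve publicly to your server's address.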
You can then manually edit the configuration files to customize your deployment.
The following files must be present in order for the application to start. Initial versions of the files are created by the `setup.sh` script based on template files (files in the `templates` directory). These files can be customized to change the way that the application operates.

- `docker-compose.yml`: specifies the "services," "volumes," and "secrets" that Docker Compose should manage.
- `initial-config.yml`: when the container running the `docassemble` service starts up, it checks whether the file `/config/config.yml` exists in the container. If it does not exist, `initial-config.yml` will be copied to `/config/config.yml`. Note that the `/config` directory is a Docker volume. The `initial-config.yml` file is only the initial configuration. Once your docassemble server starts, the actual Configuration will be located at `/config/config.yml` inside the Docker volume, and it is not tied to `initial-config.yml`.
- `nginx.conf`: specifies the NGINX configuration used by the `nginx` service.
- `initial-creds.sh`: when the docassemble server starts for the first time, it will create a user account with administrative privileges. This file contains the username and password of the administrative user, and can also contain an API key owned by the administrative user. This file can be blank, and making it blank is a good idea once the docassemble server starts, so that your credentials are not stored in plain text on your host or in the `docassemble` container.
By changing `docker-compose.yml` and `initial-config.yml`, you can customize your setup. For example, if you are using AWS RDS to provide a SQL database, you can remove the `db` service (and its volume, `pg_data`) from `docker-compose.yml` and then modify `initial-config.yml` to include the `db` configuration you want to use. If you are using a managed or external Redis service, you can remove the `redis` service (and its volume, `redis_data`) from `docker-compose.yml` and then modify `initial-config.yml` to include the `redis` URL that points to your external Redis service.
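For illustration, pointing docassemble at an external Redis service involves a single directive in `initial-config.yml`; the hostname and password below are placeholders (check the docassemble documentation for the exact `redis` URL format):

```yaml
redis: redis://:yourpassword@redis.example.com:6379
```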
The `setup.py` file sets a random `secretkey` in `initial-config.yml`: a random 32-character alphanumeric string. This is important for security reasons. The `secretkey` needs to be kept secret because it is the basis of encryption on the server. Since the `secretkey` is the basis of encryption, you should not change it after your server starts, and you should not lose it. If you lose it (e.g., you delete it or overwrite it in your Configuration), encrypted passwords and interview answers on the server will not be able to be decrypted.
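The kind of key the script generates can be reproduced in Python; this is an illustrative sketch, not the script's actual code:

```python
import secrets
import string

def generate_secretkey(length: int = 32) -> str:
    """Generate a random alphanumeric string suitable for use as a secretkey."""
    alphabet = string.ascii_letters + string.digits
    return ''.join(secrets.choice(alphabet) for _ in range(length))

print(generate_secretkey())
```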
The `setup.sh` script asks you if you want to use HTTPS with Let's Encrypt. If you say yes, it includes the `certbot` service in `docker-compose.yml`, which renews the certificate every 24 hours. The `setup.sh` script also obtains the initial certificate, so make sure you have set up a DNS entry to associate your hostname with the machine running Docker Compose before you run `setup.sh`.
If editing `docker-compose.yml` and `initial-config.yml` does not provide sufficient flexibility, you can customize the application further by modifying the `core` Docker image, which contains the docassemble web application. The `services` directory contains a `Dockerfile.core` file and a `core` subdirectory with files that `Dockerfile.core` uses. The `docker-compose.yml` file references this file:

```
build:
  context: ./services
  dockerfile: Dockerfile.core
image: docassemble
```
Note that the image's tag is simply `docassemble`, not `jhpyle/docassemble`. The image will be built when you run `docker compose up --build --detach`. You can change the way the container operates by changing the following files in `services`:
- `core/run-www.sh`: a `bash` script that runs as `www-data` and launches `uwsgi`, the `celery` workers, and the `websockets` listener.
- `core/run.sh`: a `bash` script that runs as `root` and launches `cron` and `run-www.sh`.
- `core/cron.sh`: a `bash` script that cleans up files in the temporary directory and executes docassemble scheduled tasks.
- `core/crontab`: the file that will be installed at `/etc/crontab` and that will cause `cron.sh` to be executed.
- `core/docassemble.ini`: the configuration file for `uwsgi`.
- `Dockerfile.core`: this contains instructions on how to build the image containing the docassemble application. It is based on Ubuntu 24.04. The Ubuntu packages that are available within the container are determined by `apt-get` commands in this file.
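For reference, a system crontab entry that runs a cleanup script looks like the following; the schedule and path here are illustrative, not the repository's actual values (see `core/crontab` for those):

```
15 4 * * * root /usr/local/bin/cron.sh
```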
When you are sure the correct configuration files are in place, you can start everything by running:

```
docker compose up --build --detach
```
To stop each of the containers spawned by `docker compose up`, run:

```
docker compose down
```

This will not delete any application data or images. If you want to delete the application data, run:

```
docker compose down --volumes
```

If you want to delete the images as well, run:

```
docker compose down --rmi all
```
If you need to edit your `config.yml` file and you cannot do so through the web interface, then while your application is running, you can do:

```
docker cp docassemble-compose-docassemble-1:/config/config.yml .
nano config.yml
docker cp config.yml docassemble-compose-docassemble-1:/config/config.yml
docker exec docassemble-compose-docassemble-1 chown www-data:www-data /config/config.yml
rm config.yml
```

(In place of `nano`, you can use whatever editor you want to use.)
To upgrade the application, `cd` to the `docassemble-compose` directory and run:

```
git pull
docker compose down --rmi all
docker volume rm docassemble-compose_app
docker compose up --build --detach
```

`git pull` will download a new version of docassemble-compose, if one exists. If you have modified any of the files that are part of the repository, you may have merge conflicts that you will need to resolve.
`docker compose down --rmi all` stops the application and removes the Docker images. This is necessary because if your `docker-compose.yml` calls for `nginx:latest`, and an `nginx:latest` image is present (i.e., it is listed when you run `docker images`), then Docker Compose will use that version of the image, even if it is very old. Removing the images forces Docker Compose to build or download new versions of the images. Removing the images does not delete application data, because your application data is in Docker volumes.
`docker volume rm docassemble-compose_app` removes the `app` volume, which is mounted at `/app` and contains software for the docassemble web app. It is a Docker volume for purposes of sharing across containers, but it does not contain application data.
`docker compose up --build --detach` downloads and/or builds the images as necessary, so that new versions are used, and starts the application again.
As long as you do not delete your Docker volumes (e.g., with `docker volume rm` or `docker compose down --volumes`), then when the services start up again, they will use the application data in your volumes. The `initial-config.yml` file will not be used because there is already a `config.yml` file in the `config` volume. Your PostgreSQL service will use the data in the `pg_data` volume. Your Redis server will use the data in the `redis_data` volume. If you are using Let's Encrypt, the `nginx` service will use the certificates in the `certbot-certs` volume. The `rabbitmq` service will use the data in `rabbitmq_data`, although it is unlikely that the contents of that volume are important.
Note that your `docker-compose.yml` file specifies particular images and tags. When you do `git pull`, your `docker-compose.yml` file will not change. If docassemble-compose has a new version of the `docker-compose.yml.template` file, you will not see those changes unless you delete your `docker-compose.yml` file and rerun `setup.sh`. Note that if you do that, you may need to manually migrate the data that are in your `pg_data` and `redis_data` Docker volumes. For example, if the data in `pg_data` were created by `postgres:18`, and you run `postgres:19` while mounting the same `pg_data` volume, the `postgres:19` container may not know how to read the data in the `pg_data` volume unless you perform manual migration steps.
Application data are stored in Docker volumes. You will need to make sure you do not accidentally delete these Docker volumes.
If you want to transfer your docassemble server from one host to another, you will need to copy the volumes from one server to another. In the section below, "You want to move to a new server and the dabackup volume exists on the current machine," there is an explanation of how to use the command line to transfer a Docker volume from one machine to another.
If you run `docker volume ls`, you will see several volumes:

```
DRIVER    VOLUME NAME
local     docassemble-compose_app
local     docassemble-compose_config
local     docassemble-compose_files
local     docassemble-compose_pg_data
local     docassemble-compose_rabbitmq_data
local     docassemble-compose_redis_data
local     docassemble-compose_tempdata
```

If you are using Let's Encrypt, you will also see:

```
local     docassemble-compose_certbot-certs
local     docassemble-compose_certbot-www
```
`docassemble-compose_app` does not contain application data. It contains software, including bash scripts and a Python virtual environment. It is a volume because it needs to be accessible by other containers. In order to upgrade docassemble-compose, you will need to delete this volume.

`docassemble-compose_config` contains the `config.yml` file. It is mounted at `/config/` inside the `docassemble` container. A Docker volume is used instead of a bind mount because the file needs to be readable and writable by the web application. If it were a bind mount, managing file permissions would be difficult.

`docassemble-compose_files` contains the uploaded files and Playground files. It will be empty if you are using S3 or Azure Blob Storage.

`docassemble-compose_pg_data` contains the files for the PostgreSQL database. The `db` service uses it.

`docassemble-compose_rabbitmq_data` contains application data for the RabbitMQ task queue system. During a migration, these files are important only if you shut down Docker Compose while there are pending tasks, and you want the tasks to resume.

`docassemble-compose_redis_data` contains the Redis database for the docassemble application and the background tasks system.

`docassemble-compose_tempdata` contains a filesystem of temporary files. The volume does not contain important application data. It is a volume because it needs to be shared across containers. If you are doing a migration, you can ignore this volume.

`docassemble-compose_certbot-certs` contains SSL certificates obtained by the `certbot` service. If you are migrating, you should copy this volume. However, `certbot` may issue you a new certificate even if you lose the contents of this volume.

`docassemble-compose_certbot-www` does not contain application data. It is a volume that is shared between the `certbot` and `nginx` services for purposes of communicating with Let's Encrypt servers.
If something goes wrong, you can troubleshoot by running `docker compose` commands from the `docassemble-compose` directory.

See the logs of the `docassemble` service, which includes the web application, the celery workers, the scheduled task system, and the websockets application:

```
docker compose logs docassemble
```

See the logs of the `nginx` service:

```
docker compose logs nginx
```

You can get a shell inside the `docassemble` service, where you can explore the files in the running container:

```
docker compose exec docassemble /bin/bash
```

The following does the same thing, but runs as the root user:

```
docker compose exec -u root docassemble /bin/bash
```

Before you start the `docassemble` service, you can get a shell inside the `docassemble` service:

```
docker compose run --rm --no-deps docassemble /bin/bash
```

Note that this will create the `app` volume at `/app` if it does not already exist. You can use this to edit the files in `/app` before starting the service.
You can get a shell inside the `nginx` service, where you can explore the configuration files supporting the `nginx` service:

```
docker compose exec nginx /bin/bash
```

You can use the Redis database:

```
docker compose exec redis redis-cli
```

Note that you will need to run `AUTH abc123` to authenticate with the server before you can run Redis commands.

You can run `psql` on the docassemble PostgreSQL database:

```
docker compose exec db psql -U docassemble -W docassemble
```

Note that it will ask you for the password, which is `abc123`.
This method of deploying docassemble is entirely separate from the method described in the Docker section of the docassemble documentation; that section of the documentation will be of little help when using docassemble-compose. It describes the `jhpyle/docassemble` image, which is not used by this Docker Compose implementation.

The monolithic `jhpyle/docassemble` Docker image uses `supervisord` to orchestrate services like Redis, RabbitMQ, PostgreSQL, NGINX, uWSGI, and Unoconv. By contrast, this implementation uses Docker Compose to orchestrate most of these services.
In the `jhpyle/docassemble` Docker image, Configuration values can be used to change the way that NGINX operates. This deployment does not work that way. For example, changing `maximum content length` in the Configuration will not affect NGINX's `client_max_body_size` setting; you will have to change that yourself.

The following Configuration directives will not affect the `nginx` service; manual changes to `nginx.conf` are required:

- `root`
- `url root`
- `external hostname`
- `http port`
- `websockets port`
- `nginx ssl protocols`
- `nginx ssl ciphers`
- `maximum content length`
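For example, to raise the upload size limit you would edit `nginx.conf` directly rather than the Configuration; a sketch of the relevant NGINX directive (the `100M` value and the surrounding block are illustrative):

```nginx
server {
    # allow uploads up to 100 megabytes
    client_max_body_size 100M;
}
```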
The Docker section describes methods of persisting application data in a single collection of backup files, stored in a Docker volume, an S3 bucket, or Azure Blob Storage, which is populated daily and on every safe shutdown. This set of backup files allows you to start a new container from a new image, where the application data are automatically imported, even if the underlying OS of the image or the PostgreSQL database version has changed.
This Docker Compose deployment method does not use this system. Rather, application data persist in several Docker volumes. Running `docker compose down --volumes` will delete all of the application data.
Note that if you upgrade the images in the Docker Compose application, there may be a version mismatch between the data in a Docker volume and the version of the software. The Docker Compose application uses a number of third-party images like `postgres` and `redis`, and these images may or may not have the ability to migrate data created by earlier versions. You may need to perform manual migration steps. See the documentation for those images.
This method of deploying docassemble allows you to minimize disk usage by optionally not installing Pandoc, Tesseract, or LibreOffice. If you elect to use these features, they will run in separate containers that interact with the web application through Celery.
Conversion of DOCX files to PDF is handled by a Gotenberg service.
The Ubuntu packages installed in the web app container are kept to a minimum. As a result, certain features are not available.
- `fontconfig` is not installed, so the `fc-list` command to list the installed fonts on the server is not available.
- The `mmdc()` function, which is deprecated, is not supported because it requires Node.js and Chrome to be installed.
The Logs menu item is not available when you use this deployment method. Log files are managed through Docker Compose.
Some features are not yet supported, including:

- Using SSL certificates to authenticate with a PostgreSQL database.
- Setting a server time zone.
- Setting a server locale.
- Using an alternative `pip` package index.
Before you try to migrate an existing server from the `jhpyle/docassemble` single-image deployment to the Docker Compose deployment, make sure you are confident in your knowledge of the Linux command line and Docker Compose. Do not irreversibly destroy your current deployment; you can shut it down while you attempt migration, but you need a plan to revert to using it if the migration to Docker Compose does not succeed.
You also need to make sure that you have enough free disk space for the Docker Compose deployment. The software alone may take upwards of 10GB, and you need space for a copy of all of your stored files, your SQL database, and your Redis database.
Migrating your application data is a delicate process. There is no automated tool that will do it.
There are several components that need to be migrated:

- Configuration directives need to be copied and pasted into `initial-config.yml`.
- The SQL database needs to be copied, unless you are currently using an external SQL server.
- The Redis database needs to be copied, unless you are currently using an external Redis server.
- The stored files need to be copied, unless you are using S3 or Azure Blob Storage.
- If you are using Let's Encrypt, the SSL certificates need to be copied.
How you restore the SQL database, Redis database, and stored files depends on what form of data storage you are using.
The migration instructions in the following sections assume that on the same machine where you want to run Docker Compose, you have a Docker volume called `dabackup` that contains the application data you want to migrate.

If your current server uses a Docker volume for persistent storage, and you plan to run Docker Compose on the same host, the assumptions of the following sections (beginning with "Migrating the Configuration") are met. The `dabackup` Docker volume mounts to `/usr/share/docassemble/backup` on the container running the `jhpyle/docassemble` image, and it contains the files you need. You can proceed to the Migrating the Configuration section.

If you don't have a `dabackup` volume, you can create one and copy files from your current `jhpyle/docassemble` container to the `dabackup` volume.
First, find the name or ID of your running `jhpyle/docassemble` container:

```
docker ps -a
```

Suppose the container has the ID `6fe0f76e8b81`. Then you can do:

```
MYCONTAINER=6fe0f76e8b81
docker stop -t 6000 ${MYCONTAINER}
docker run --rm --name tempcontainer -d -v dabackup:/import ubuntu:24.04 sleep infinity
docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/postgres /tmp/
docker cp -a /tmp/postgres tempcontainer:/import/
rm -r /tmp/postgres
docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/redis.rdb /tmp/
docker cp -a /tmp/redis.rdb tempcontainer:/import/
rm /tmp/redis.rdb
```
If you are using Let's Encrypt, copy the Let's Encrypt certificates as well:

```
docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/letsencrypt.tar.gz /tmp/
docker cp -a /tmp/letsencrypt.tar.gz tempcontainer:/import/
rm /tmp/letsencrypt.tar.gz
```
If you are using cloud-based data storage, you do not need to migrate any stored files, but otherwise, you need to add the stored files to the `dabackup` volume:

```
docker cp -a ${MYCONTAINER}:/usr/share/docassemble/files /tmp/
docker cp -a /tmp/files tempcontainer:/import/
rm -r /tmp/files
```
When you are done populating the `dabackup` volume, stop the temporary container:

```
docker stop tempcontainer
```

Now the `dabackup` volume contains the files you need. To verify:

```
docker run --rm -v dabackup:/data alpine ls -lR /data
```

You can proceed to the Migrating the Configuration section.
Here is an explanation of the above commands. First, your running container is stopped, and is given plenty of time to shut down (`docker stop -t 6000 ${MYCONTAINER}`). This ensures that up-to-date application data exist in `/usr/share/docassemble/backup` and that no new application data will be generated while you are performing the migration.

Next, a temporary container is run, solely for the purpose of receiving files via `docker cp` and storing them in the Docker volume `dabackup`:

```
docker run --rm --name tempcontainer -d -v dabackup:/import ubuntu:24.04 sleep infinity
```

The `-d` parameter "detaches" the container, so it runs in the background. The `--name tempcontainer` parameter gives the container a name. The `-v dabackup:/import` parameter creates the `dabackup` volume and mounts it at `/import` inside the container. The command `sleep infinity` does nothing; the container will just wait indefinitely.
The next command copies the `postgres` directory from the internal backup directory inside the container and writes it to the `/tmp` directory on the host:

```
docker cp -a ${MYCONTAINER}:/usr/share/docassemble/backup/postgres /tmp/
```

The `-a` parameter to `docker cp` means "archive." It ensures that file attributes are preserved during the copy operation.

The next command copies the `postgres` directory from the `/tmp` directory on the host to the `dabackup` volume, using `tempcontainer` and its Docker volume mount to receive the files:

```
docker cp -a /tmp/postgres tempcontainer:/import/
```

This is a two-step process because `docker cp` does not support copying from one container to another. Finally, we delete the copy of the data in `/tmp` because we don't need it anymore:

```
rm -r /tmp/postgres
```

The other commands work the same way.
If you want to run Docker Compose on a new host machine by following the instructions below, the `dabackup` volume needs to exist on the new machine, not the old machine. If you have a `dabackup` volume on your current machine, you can copy it to your new machine using command line operations.

The `ssh` command can be used to make a secure network connection between your current machine and the new machine.

The following instructions assume that you have an SSH private key (the examples below call this file a certificate) that you use to connect to your new machine through `ssh` or a similar application (e.g., PuTTY). If you want to make an `ssh` connection between your current server and your new server, your certificate needs to exist as a file on your current server.

Your certificate is just a text file that looks like this:

```
-----BEGIN RSA PRIVATE KEY-----
[several lines of random characters]
-----END RSA PRIVATE KEY-----
```

You can use copy and paste to recreate this file.
On your current server, you can do:

```
cat > mycert.rsa
```

It will then wait for your input. You can paste the contents of your certificate into the terminal, then type Ctrl-d. Now you can do:

```
cat mycert.rsa
```

to see what your certificate looks like and verify that it was created correctly.

You might get a complaint from `ssh` if your certificate is not locked down, so you should do:

```
chmod og-rwx mycert.rsa
```
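As a quick local illustration of what that permission change does, using a throwaway file rather than your real certificate:

```shell
touch /tmp/demo-cert.rsa
chmod og-rwx /tmp/demo-cert.rsa   # strip all group and other permissions
stat -c '%a' /tmp/demo-cert.rsa   # the group and other digits are now 0
```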
Here is an example command that you can run (after customizing it) on your current server to copy your `dabackup` volume from the current server to a new server:

```
REMOTEUSER=ubuntu
REMOTEHOST=20.84.121.64
CERTFILE=mycert.rsa
docker run --rm -v dabackup:/from alpine tar -C /from -c -f - \
  files postgres letsencrypt.tar.gz redis.rdb | ssh -i \
  ${CERTFILE} ${REMOTEUSER}@${REMOTEHOST} 'docker run --rm -i \
  -v dabackup:/to alpine tar -C /to -x -p -v -f -'
```
This assumes:

- On the new server, you log in with the username `ubuntu`.
- The new server's IP address is `20.84.121.64` (an external hostname would also work).
- The name of your certificate file is `mycert.rsa`.
- The `dabackup` volume contains the directories `files` and `postgres`, as well as the files `letsencrypt.tar.gz` and `redis.rdb`. If you aren't using Let's Encrypt, take out the reference to `letsencrypt.tar.gz`. If you are using cloud-based data storage, take out the reference to `files`.
The copy operation may take a long time, especially if there are a lot of files in the `files` directory.

This is a pretty complicated one-liner. The basic idea is that it runs `tar` inside a temporary Docker container on the local machine and simultaneously runs `tar` inside a temporary container on the remote machine. Each container mounts the `dabackup` volume. The local `tar` command creates an archive, and the archive is sent as a data stream to the remote `tar` command, which extracts the archive. The output of the local `tar` command is sent to the remote `tar` command by way of a pipe (`|`). `ssh` is used to run the remote `tar` command from the local machine while sending the output of the local `tar` command to it.
- `alpine` is the name of a very minimal Linux distribution that contains basic commands like `tar`.
- `-v dabackup:/from` means that the `dabackup` volume will be available at the path `/from`.
- `tar` is an archiving utility.
  - `-C /from` means to make `/from` the active working directory.
  - `-c` means "create" an archive.
  - `-f` indicates the output file containing the archive, and `-f -` has a special meaning, which is to write the archive to standard output rather than to a file on the file system.
  - `files postgres letsencrypt.tar.gz redis.rdb` is the list of paths to be included in the archive.
- The `|` character indicates that the output of the previous command should be "piped" to the next command, which is `ssh`.
- `ssh` can do many things, but in this context it runs a command on a remote server.
  - `-i ${CERTFILE}` means to use the certificate indicated in the file `mycert.rsa`.
  - `${REMOTEUSER}@${REMOTEHOST}` means that `ssh` should connect to the host `20.84.121.64` and log in as the user `ubuntu`.
- `'docker run --rm -i -v dabackup:/to alpine tar -C /to -x -p -v -f -'` is the command that will be run on `20.84.121.64`. In this `docker run` command:
  - `--rm` means remove the container afterward.
  - `-i` means that the container should run interactively, meaning that the output from the previous command will be the input to the command that `docker run` will run.
  - `-v dabackup:/to` means mount the `dabackup` volume (which will be created if it doesn't exist) at the path `/to`.
  - `alpine`, as before, is a minimal Linux distribution.
  - `tar -C /to -x -p -v -f -` is the command to run inside the `alpine` container.
    - `-C /to` means to change the current working directory to `/to`, which is where the `dabackup` volume is mounted.
    - `-x` means "extract" an archive.
    - `-p` means preserve file permissions when extracting.
    - `-v` means `tar` should be "verbose" in its output.
    - `-f` indicates the input file containing the archive, and `-f -` has a special meaning, which is to use the standard input as the archive rather than a file on the file system.
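You can try the tar-to-tar pipe locally, without ssh or Docker, to see how it works; this sketch uses throwaway paths under /tmp:

```shell
mkdir -p /tmp/tar-demo/from/files /tmp/tar-demo/to
echo "hello" > /tmp/tar-demo/from/files/a.txt
# create an archive on standard output, extract it from standard input
tar -C /tmp/tar-demo/from -c -f - files | tar -C /tmp/tar-demo/to -x -p -f -
cat /tmp/tar-demo/to/files/a.txt
```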
If you want to move to a new server and you do not have a `dabackup` volume on the current machine, you can follow the instructions in the "Using the same server and dabackup does not exist" section above, followed by the instructions in the "You want to move to a new server and the dabackup volume exists on the current machine" section above. Then you will have a `dabackup` volume on your new server, and you can continue to the next section.
If you made modifications to the Configuration of your existing server, you will need to bring those modifications over to your `initial-config.yml` file.

To get your Configuration from your `dabackup` volume, you can do:

```
docker run --rm -v dabackup:/data alpine cat /data/config.yml > old-config.yml
```
The Docker Compose deployment relies on certain configuration directives to operate, so if you simply copy your existing `config.yml` file over `initial-config.yml`, your deployment will fail. You will need to copy over some directives but not others.
Items in the generated `initial-config.yml` file that you should not disturb are:

- `external hostname`: should have been correctly set already by `./setup.sh`.
- `behind https load balancer`: should have been set correctly to `true` by `./setup.sh`; the `nginx` service acts as an HTTP proxy for the docassemble web app.
- `url root`: should have been correctly set by `./setup.sh`.
- `enable unoconv`: should be `false` because this deployment does not support `unoconv`.
- `use nginx to serve files`: should be `true`, although you can turn it off if it causes any problems.
- `redis`: only change this if you are using an external Redis server (see "Migrating Redis," below).
- `rabbitmq`: this will point to the RabbitMQ server started by Docker Compose.
- `gotenberg`: the `url` attribute within `gotenberg` will point to the Gotenberg server started by Docker Compose.
- `packages`
- `uploads`
- `log`
- `ready file`
- `webapp`
- `version file`
- `log to std`
- `allow log viewing`
- `pandoc`
- `expose websockets`
- `libreoffice`
- `pandoc with celery`
- `libreoffice with celery`
- `tesseract with celery`
- `celery modules`
- `celery task routes`
- `db`: only change this if you are using an external SQL server (see "Migrating SQL," below).
Also, note that you may have configuration options in your Configuration that are not applicable in the Docker Compose deployment. These include:

- `root owned`
- `os locale`
- `other os locales`
- `ubuntu packages`
- `python packages`
- `backup days`
- `websockets`
- `websockets ip`
- `websockets port`
- `http port`
- `use lets encrypt`
- `lets encrypt email`
- `nginx ssl protocols`
- `nginx ssl ciphers`
- `update on start`
- `enable unoconv`
- `timezone`
- `web server`
- `pip index`
- `supervisor`
- `backup file storage`
You should not copy these Configuration directives.
It is very important to copy over the `secretkey` from your existing deployment. Replace the `secretkey` that was generated by `./setup.sh`.
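The copy-some-but-not-others rule can be expressed as a small merge function. This is an illustrative sketch over plain dictionaries (the directive names in `DO_NOT_COPY` are only a subset, and `merge_config` is not a tool from the repository):

```python
# Directives that setup.sh already set correctly for this deployment,
# plus directives that do not apply under Docker Compose (subset only).
DO_NOT_COPY = {
    "external hostname", "behind https load balancer", "url root",
    "enable unoconv", "redis", "rabbitmq", "db",
    "web server", "supervisor", "backup file storage",
}

def merge_config(old: dict, initial: dict) -> dict:
    """Copy directives from the old Configuration into the generated one,
    skipping directives that must not be disturbed."""
    merged = dict(initial)
    for key, value in old.items():
        if key not in DO_NOT_COPY:
            merged[key] = value  # secretkey and custom directives carry over
    return merged

old = {"secretkey": "key-from-old-server", "db": {"host": "localhost"}}
initial = {"secretkey": "generated-by-setup", "db": {"host": "db"}}
print(merge_config(old, initial))
# → {'secretkey': 'key-from-old-server', 'db': {'host': 'db'}}
```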
You should run your `initial-config.yml` through a YAML linter when you are finished editing it, to make sure you don't have any duplicate keys or other errors that may cause problems:

```
sudo apt-get install yamllint
yamllint initial-config.yml
```

(Note that `line-length` errors are not a problem.)
If you currently use an external SQL server, like AWS RDS, this part is easy: simply edit your `initial-config.yml` file and update the `db` configuration, then edit your `docker-compose.yml` file to remove the `db` service and the `pg_data` volume.
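For illustration, a `db` block pointing at an external PostgreSQL server might look like the following; the hostname and credentials are placeholders, and you should check the docassemble documentation for the full set of `db` keys:

```yaml
db:
  prefix: postgresql+psycopg2://
  name: docassemble
  user: docassemble
  password: yourpassword
  host: mydb.example.com
  port: 5432
```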
Otherwise, you need to migrate your SQL server's data to the `db` service in your Docker Compose deployment.
Before migrating your SQL database, it is important to safely `docker stop` your running `jhpyle/docassemble` container. This will ensure that the database is dumped to the `postgres/docassemble` file in data storage.
If you have a `dabackup` volume, start up the `db` service and attach `dabackup` to it at the path `/import`:
docker compose run --name tempdb --rm -d -v dabackup:/import db
The `postgres:latest` image will download and run as a container named `tempdb`. The `docassemble` database will be created, and the `docassemble` user will be given control over the `docassemble` database with the password `abc123`. This is all specified in the `docker-compose.yml` file under the `db` service. The `--rm` parameter means that the container should be deleted when it stops. (Note that the removal of the container does not mean the removal of the database, because the database is stored in the `pg_data` volume.) The `-d` parameter means that the `db` service should run in the background, so that you are returned to the command line. The `-v dabackup:/import` parameter means that the `dabackup` volume will be available inside the container at `/import`.
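As a rough sketch (not copied from the repository's actual `docker-compose.yml`, which may differ in detail), a `db` service with the properties described above would be defined along these lines:

```yaml
services:
  db:
    image: postgres:latest
    environment:
      POSTGRES_DB: docassemble        # creates the docassemble database
      POSTGRES_USER: docassemble      # the user given control over it
      POSTGRES_PASSWORD: abc123       # the password referenced in this guide
    volumes:
      - pg_data:/var/lib/postgresql/data   # data survives container removal

volumes:
  pg_data:
```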
Now that the `db` service is running, you can use `docker compose exec` to run commands inside the `db` container, and those commands can make use of the files in the `dabackup` volume (mounted at `/import`). The following command restores the `docassemble` database from the backup file located in the `postgres` directory inside the `dabackup` volume:
docker compose exec db pg_restore -F c -c -d docassemble \
-U docassemble -W /import/postgres/docassemble
It will ask for a password, and the password is `abc123` (as was specified in the `docker-compose.yml` file). The meaning of this command is:
- Run `docker compose`
- Execute a command in the container of the `db` service
- The command to execute is `pg_restore` with the following parameters:
  - `-F c` specifies the format of the backup file ("custom")
  - `-c` means to clean the database first (this can be omitted if you know that the database is empty)
  - `-d docassemble` means restore to the `docassemble` database
  - `-U docassemble` means to connect to the PostgreSQL server as the `docassemble` user
  - `-W` means that `pg_restore` should interactively ask the user for a password
  - `/import/postgres/docassemble` is the path to the file containing the database backup
Note that this command may produce a number of error messages about tables and sequences not existing. This is expected when `-c` is used and the database is empty. It may end with a statement like:
pg_restore: warning: errors ignored on restore: 86
Now you can stop the container you created when you ran `docker compose run`.
docker stop tempdb
The `pg_data` volume now contains a populated SQL database ready to be used by the docassemble application.
If you currently use an external Redis server, this part is easy: simply edit your `initial-config.yml` file and update the `redis` directive, then edit your `docker-compose.yml` file to remove the `redis` service and the `redis_data` volume. Also, in the `docker-compose.yml` file, replace `redis://:abc123@redis` with your custom Redis URL wherever you see it.
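In `initial-config.yml`, the `redis` directive is a single Redis URL. With an external server, it might look like the following (the host and password here are placeholders):

```yaml
redis: redis://:example-password@redis.example.com:6379
```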
Otherwise, you need to migrate your Redis server's data to the `redis` service in your Docker Compose deployment.
Before migrating your Redis database, it is important to safely `docker stop` your running `jhpyle/docassemble` container. This will ensure that the database is dumped to the `redis.rdb` file in data storage.
If your data storage method uses a Docker volume, you can copy the `redis.rdb` file from the volume to the `redis_data` volume by doing:
docker compose run --rm -v dabackup:/import redis install \
-o redis -g redis /import/redis.rdb /data/dump.rdb
The meaning of this command is:
- Run `docker compose`
- "Run" the `redis` service with the following options:
  - `--rm` means delete the container afterwards
  - `-v dabackup:/import` means mount the `dabackup` volume at `/import` inside the container
- Instead of running the default command of the service, run the `install` command with the following parameters:
  - `-o redis -g redis` means the file being installed should be owned by the `redis` user (`-o`) and group (`-g`)
  - `/import/redis.rdb` is the source file, the backup of your Redis database in your `dabackup` volume
  - `/data/dump.rdb` is the destination file (the `/data` directory is where the `redis` service keeps its data)
Then you need to start the server with the Append-Only File (AOF) feature off.
docker compose run --rm --name tempredis -d redis redis-server --appendonly no
Redis will read the `dump.rdb` file and use its contents as the database.
The AOF feature is a good thing, so you should turn it on.
docker compose exec redis redis-cli BGREWRITEAOF
Now stop the server.
docker stop tempredis
The `redis_data` volume now contains a populated Redis database ready to be used by the docassemble application.
If you are currently using S3 or Azure blob storage, this part is easy: simply edit your `initial-config.yml` file and update the `s3` or `azure` section with the information about your cloud data storage.
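For example, an `s3` section for an existing bucket might look like the following. The bucket name and credentials are placeholders; if your server obtains credentials through an IAM role, the key directives can be omitted:

```yaml
s3:
  enable: True
  access key id: AKIAEXAMPLEKEYID
  secret access key: exampleSecretAccessKey
  bucket: my-docassemble-bucket
  region: us-east-1
```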
If you are not using cloud data storage, but you are using a Docker volume for data storage, the stored files exist in your `dabackup` volume under the directory `files`. You need to copy the `files` directory to the place where the Docker Compose deployment can use it.
Before migrating your stored files, it is important to safely `docker stop` your running `jhpyle/docassemble` container. This will ensure that the files are backed up to data storage.
First, you need to fix the symbolic links in the `files` directory. Prior to docassemble version 1.8.16, the symbolic links used in the `files` directory used absolute instead of relative paths, and they won't work with Docker Compose because the Docker Compose implementation uses `/app` instead of `/usr/share/docassemble`. To fix this in your `dabackup` volume, run the following:
docker run --rm -v ./services/core/fixsym.sh:/fixsym.sh -v dabackup:/app ubuntu:24.04 /bin/bash /fixsym.sh
This may take a few minutes to complete.
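The details are in the repository's `fixsym.sh` script, but the underlying problem can be illustrated without Docker (the paths below are invented for the demonstration): an absolute symbolic link hard-codes the old root and dangles when the tree is mounted elsewhere, while a relative link keeps working.

```shell
# Create a small file tree (demonstration paths only)
mkdir -p /tmp/symdemo/files/docs
echo "hello" > /tmp/symdemo/files/docs/target.txt

# An absolute link bakes in /usr/share/docassemble and breaks
# once the files directory is mounted under /app instead
ln -sf /usr/share/docassemble/files/docs/target.txt /tmp/symdemo/files/abs-link.txt

# A relative link resolves against the link's own directory, so it
# works no matter where the tree is mounted
ln -sf docs/target.txt /tmp/symdemo/files/rel-link.txt
cat /tmp/symdemo/files/rel-link.txt
```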
Next, you need to build the image for the `docassemble` service.
docker compose build docassemble
This will take some time.
Then, you can copy the `files` directory from your `dabackup` volume to the location where it can be used by the `docassemble` service.
docker compose run --rm --no-deps -v dabackup:/import docassemble cp -r /import/files /
The meaning of this command is:
- Run `docker compose`
- "Run" the `docassemble` service with the following options:
  - `--rm` means delete the container afterwards
  - `--no-deps` means that Docker Compose should not start any other service except for `docassemble`
  - `-v dabackup:/import` means mount the `dabackup` volume at `/import` inside the container
- Instead of running the default command of the service, run the `cp` command with the following parameters:
  - `-r` means to copy recursively, so that the contents of the directory are copied
  - `/import/files` is the source directory (a backup copy of your stored files)
  - `/` is the destination directory (`/files` is where the stored files are located)
If you are running docassemble using HTTPS and you use Let's Encrypt to obtain certificates, you can migrate certificates from your `jhpyle/docassemble` deployment to your Docker Compose deployment.
If you are using a Docker volume `dabackup` for data storage, a copy of your Let's Encrypt certificates is located in the `letsencrypt.tar.gz` file in the volume. The `letsencrypt.tar.gz` archive includes a directory `etc` containing a directory `letsencrypt` containing the files used by Let's Encrypt.
To unpack the `letsencrypt.tar.gz` file into a place where the Docker Compose deployment can use the certificates, run:
docker compose run --rm --no-deps -v dabackup:/import --entrypoint tar \
certbot -C / -zxf /import/letsencrypt.tar.gz
The meaning of this command is:
- Run `docker compose`
- "Run" the `certbot` service with the following options:
  - `--rm` means delete the container afterwards
  - `--no-deps` means that Docker Compose should not start any other service except for `certbot`
  - `-v dabackup:/import` means mount the `dabackup` volume at `/import` inside the container
  - `--entrypoint tar` forces the service to run `tar` instead of the default entrypoint command
- Run the `tar` command with the following parameters:
  - `-C /` means to change the current working directory to `/` before extracting
  - `-zxf /import/letsencrypt.tar.gz` means use the gzip compression format (`z`), extract (`x`), from the file `/import/letsencrypt.tar.gz` (`f /import/letsencrypt.tar.gz`)
This will populate files in `/etc/letsencrypt`, where the `certbot` service can use them.
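If the flag combination is unfamiliar, you can reproduce the `-C` and `-zxf` behavior locally with a throwaway archive (all paths here are invented for the demonstration):

```shell
# Build a small archive with the same etc/letsencrypt layout
mkdir -p /tmp/tardemo/src/etc/letsencrypt
echo "example" > /tmp/tardemo/src/etc/letsencrypt/cert.txt
tar -C /tmp/tardemo/src -czf /tmp/tardemo/demo.tar.gz etc

# Extract it relative to a different root, just as the certbot
# command extracts relative to / with -C /
mkdir -p /tmp/tardemo/dest
tar -C /tmp/tardemo/dest -zxf /tmp/tardemo/demo.tar.gz
cat /tmp/tardemo/dest/etc/letsencrypt/cert.txt
```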
Once you have migrated your data, you should be able to start the docassemble application in the normal fashion:
docker compose up --build --detach
If something goes wrong and you want to get your working docassemble site back up, you can run `docker compose down` to stop everything, or use `docker stop` and `docker rm` manually to stop and remove the Docker Compose containers, and then `docker start` your stopped `jhpyle/docassemble` container.
To see logs, use:
docker compose logs
or to look at a specific service, use:
docker compose logs docassemble
To see what Docker volumes exist, run:
docker volume ls
To inspect what is inside a volume, you can run a command like:
docker run --rm -ti -v docassemble-compose_certbot-certs:/certs ubuntu:24.04 /bin/bash
This will give you a `bash` command line inside of a plain `ubuntu:24.04` container with the Docker volume `docassemble-compose_certbot-certs` mounted at `/certs`. You can run this whether or not there are containers running that are using the Docker volumes.
You can use `docker volume rm` to remove a Docker volume and try populating it again.
You can get inside of a running container by running a command like:
docker compose exec certbot /bin/sh
Some services have `/bin/bash` installed; use `/bin/bash` whenever possible because it is a more user-friendly shell. Nearly all containers will have `/bin/sh` available.
If a container is not running, but you want to run a shell within that container, and see everything that is inside of it (including mounted volumes), you can run a command like:
docker compose run --rm -i --no-deps redis /bin/bash
Sometimes you have to run the command this way:
docker compose run --rm -i --no-deps --entrypoint /bin/bash redis
There are several ways to deploy docassemble:
- Install it manually on a Linux system.
- Do `docker run` on the monolithic `jhpyle/docassemble` image.
- Use Kubernetes (https://github.com/jhpyle/charts).
- Use Docker Compose.
- Build your own system.
Manual installation is not recommended because there are multiple steps involved, and you would need to know what you are doing. Using the monolithic `jhpyle/docassemble` image is simple but may use more disk space than you need. Both the Kubernetes and Docker Compose methods separate services into multiple containers. In the case of Kubernetes, this enables scalability that is practically limitless. Docker Compose does not provide any greater scalability than using `docker run`.
You could create your own way of deploying docassemble, even if you don't know Python. At its core, docassemble is a Python application that looks for a `config.yml` file at a location indicated by `DA_CONFIG_FILE`. By changing values in `config.yml`, you can change how it works. The `db`, `redis`, and `rabbitmq` directives need to point to running services, and docassemble assumes that certain command-line applications can run, such as `convert` (ImageMagick), `pdftk`, `git`, `pdftoppm`, and `pip`. You could run docassemble with Gunicorn or Waitress instead of uWSGI.
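To make that concrete, a hand-rolled deployment would set `DA_CONFIG_FILE` to the path of a configuration file along the following lines. The hostnames and passwords below are placeholders, and a real deployment will likely need additional directives:

```yaml
# Hypothetical minimal config.yml for a custom deployment;
# docassemble reads it from the path in DA_CONFIG_FILE.
db:
  prefix: postgresql+psycopg2://
  name: docassemble
  user: docassemble
  password: example-password
  host: postgres.internal
redis: redis://:example-password@redis.internal
rabbitmq: pyamqp://guest@rabbitmq.internal//
```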
The main benefit of Docker Compose is that it can conserve disk space and facilitate vulnerability scanning.
The `jhpyle/docassemble` Docker image is a large image, mostly because it contains a full TeX Live installation in order to support document assembly through Pandoc. Some users do not use the Pandoc document assembly system, so their servers would work if TeX Live were not installed. LibreOffice, which is used for document manipulation and conversion tasks (even when there is an external DOCX-to-PDF converter like Gotenberg, ConvertAPI, or CloudConvert), is another large package that users may or may not need. If a user does not need to perform optical character recognition (OCR), or wants to use a cloud service for OCR, they could save disk space by not installing the Tesseract application. A user might only need a few specific fonts in order to assemble PDF documents, and could save disk space by only installing the specific fonts they need. If a user wants to use an external PostgreSQL database, or an external Redis database, they can save (a relatively small amount of) disk space by not installing PostgreSQL and Redis.
One way to optimize disk usage is to build a custom Docker image. You can write a custom Dockerfile that only installs the Ubuntu packages that are necessary. Note that the `jhpyle/docassemble` image is built using a base image, `jhpyle/docassemble-os`, both of which are on GitHub and Docker Hub. Using copy-and-paste, it is possible to combine the `Dockerfile` of `jhpyle/docassemble-os` with the `Dockerfile` of `jhpyle/docassemble` into a single `Dockerfile`, and then remove the commands that install unnecessary components. Another way is to create a Dockerfile that builds a custom image by starting with the large `jhpyle/docassemble` image and then uninstalling the components that are not needed. For example:
FROM jhpyle/docassemble
RUN DEBIAN_FRONTEND=noninteractive TERM=xterm \
apt-get -q -y remove \
pandoc \
texlive \
texlive-luatex \
texlive-xetex \
texlive-latex-recommended \
texlive-latex-extra \
texlive-font-utils \
texlive-lang-all \
texlive-extra-utils \
&& apt-get -q -y autoremove
This Docker Compose implementation allows you to optimize hard drive space by only downloading the images and running the containers that are necessary for your deployment.
The following services run using standard images from Docker Hub:
- Redis
- RabbitMQ
- PostgreSQL
- NGINX
- Certbot
- Gotenberg
In addition, there are four services using custom images that Docker Compose builds upon startup:
- `services/Dockerfile.core`
- `services/Dockerfile.pandoc`
- `services/Dockerfile.libreoffice`
- `services/Dockerfile.tesseract`
The interactive `setup.sh` script constructs a `docker-compose.yml` file based on the user's needs. If the user elects to use Pandoc, LibreOffice, or Tesseract, those applications operate inside of separate Docker Compose "services." The "core" service communicates with the other services using Celery (which uses the Redis and RabbitMQ services for interprocess communication).
The `setup.sh` script creates a `docker-compose.yml` file and an `initial-config.yml` file that can be modified further. If you want to use an external SQL database, you can remove the `db` service from `docker-compose.yml` (and its volume) and edit the `db` configuration in `initial-config.yml`. The only service that is actually required is the core service; all the other services are optional or can be replaced with external services.
Using Docker Compose can use more disk space than a tailored monolithic image, since each image contains a separate copy of operating system files. With the images that the `docker-compose.yml` file builds itself, this concern is mitigated because the images have layers in common (the Dockerfiles start with the same lines), but it is an issue when you run multiple images downloaded from Docker Hub.
Segregating services into separate images has advantages if you wish to scan your Docker images for software vulnerabilities. Since scanning a large image may require large amounts of RAM, using multiple smaller images may avoid resource limits. When interpreting lists of CVEs, it is helpful to know which vulnerabilities are associated with which service. For example, if there is a vulnerability in an Ubuntu `.deb` package in the `rabbitmq` image, you can evaluate the risk of that vulnerability knowing that the `rabbitmq` service is not directly connected to the internet, can only accept input through port 5672, and is isolated by Docker in a container that does not share file systems with other services.