27
Migration
- Call Caching has been improved in this version of Cromwell: the time needed to determine whether or not a job can be cached
has drastically decreased. To achieve that, the database schema has been modified and a migration is required in order to preserve the pre-existing cached jobs.
This migration is relatively fast compared to previous migrations. To get an idea of the time needed, look at the size of your CALL_CACHING_HASH_ENTRY table.
As a benchmark, it takes 1 minute for a table with 6 million rows.
The migration will only be executed on MySQL. Other databases will lose their previous cached jobs.
In order to run properly on MySQL, the following flag needs to be adjusted: https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_group_concat_max_len
The following query will give you a minimum to set the group_concat_max_len value to:
SELECT MAX(aggregated) as group_concat_max_len FROM
(
SELECT cche.CALL_CACHING_ENTRY_ID, SUM(LENGTH(CONCAT(cche.HASH_KEY, cche.HASH_VALUE))) AS aggregated
FROM CALL_CACHING_HASH_ENTRY cche
GROUP BY cche.CALL_CACHING_ENTRY_ID
) aggregation
Here is the SQL command to run to set the group_concat_max_len flag to the proper value:
SET GLOBAL group_concat_max_len = value
Where value is replaced with the value you want to set it to.
Note that the migration will fail if the flag is not set properly.
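To make it clearer what the sizing query above computes, here is a minimal Python sketch of the same aggregation over in-memory rows. The function name and the sample rows are hypothetical and for illustration only; on a real database, run the SQL query itself.

```python
from collections import defaultdict


def min_group_concat_max_len(rows):
    """Mirror the sizing query for rows of (entry_id, hash_key, hash_value).

    For each CALL_CACHING_ENTRY_ID, sum the byte length of
    HASH_KEY + HASH_VALUE, then return the largest per-entry total.
    """
    totals = defaultdict(int)
    for entry_id, hash_key, hash_value in rows:
        # MySQL LENGTH() counts bytes, so measure the UTF-8 encoded size.
        totals[entry_id] += len((hash_key + hash_value).encode("utf-8"))
    # The SQL MAX() yields NULL on an empty table; this sketch returns 0 instead.
    return max(totals.values(), default=0)


# Hypothetical sample rows, not real Cromwell data.
sample_rows = [
    (1, "command template", "abc123"),
    (1, "runtime attribute: docker", "def456"),
    (2, "input: String name", "0a1b2c"),
]

minimum = min_group_concat_max_len(sample_rows)
```

The result is only a lower bound for group_concat_max_len, exactly as with the SQL query.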
Breaking Changes
- The metadata endpoint now returns compressed HTTP responses depending on the request's headers, using the following rules:
| Accept-Encoding header | Resulting response |
|---|---|
| Accept-Encoding: gzip | compressed with Gzip |
| Accept-Encoding: deflate | compressed with Deflate |
| Accept-Encoding: deflate, gzip | compressed with Gzip |
| Accept-Encoding: identity | uncompressed |
| no Accept-Encoding header present | compressed with Gzip |
- The update to Slick 3.2 requires a database stanza to
switch from using driver to profile.
database {
  #driver = "slick.driver.MySQLDriver$" #old
  profile = "slick.jdbc.MySQLProfile$" #new
  db {
    driver = "com.mysql.jdbc.Driver"
    url = "jdbc:mysql://host/cromwell?rewriteBatchedStatements=true"
    user = "user"
    password = "pass"
    connectionTimeout = 5000
  }
}
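The Accept-Encoding rules in the table above can be summarized as a small decision function. This is a sketch of the table only, not Cromwell's actual implementation: the function name is hypothetical, quality weights (`;q=`) are ignored for simplicity, and the handling of header values that the table does not list is an assumption.

```python
def choose_encoding(accept_encoding=None):
    """Return "gzip", "deflate", or "identity" for an Accept-Encoding value.

    Pass None when the request carries no Accept-Encoding header.
    """
    if accept_encoding is None:
        # No header present: the table says the response is Gzip-compressed.
        return "gzip"
    # Split "deflate, gzip" into codings and drop any ";q=..." weights.
    codings = {
        part.split(";")[0].strip().lower()
        for part in accept_encoding.split(",")
    }
    if "gzip" in codings:
        return "gzip"  # Gzip wins even when deflate is also accepted
    if "deflate" in codings:
        return "deflate"
    # "identity" is in the table; treating any other value as
    # uncompressed is an assumption of this sketch.
    return "identity"
```

Clients that previously assumed an uncompressed body and send no Accept-Encoding header fall into the last table row, so they should either decompress Gzip or request `identity` explicitly.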
Call Caching
Cromwell now supports call caching with floating Docker tags (e.g. docker: "ubuntu:latest"). Note it is still considered
a best practice to specify Docker images as hashes where possible, especially for production use.
Within a single workflow Cromwell will attempt to resolve all floating tags to the same Docker hash, even if Cromwell is restarted
during the execution of a workflow. In call metadata the docker runtime attribute is now the same as the
value that actually appeared in the WDL:
"runtimeAttributes": {
  "docker": "ubuntu:latest",
  "failOnStderr": "false",
  "continueOnReturnCode": "0"
}
Previous versions of Cromwell rewrote the docker value to the hash of the Docker image.
There is a new call-level metadata value dockerImageUsed which captures the hash of the Docker image actually used to
run the call:
"dockerImageUsed": "library/ubuntu@sha256:382452f82a8bbd34443b2c727650af46aced0f94a44463c62a9848133ecb1aa8"
Important note: There is a Cromwell-wide cache for the resolution of Docker tags to Docker hashes.
By default, this cache retains a tag -> hash association for 20 minutes before discarding it.
This means that if two workflows with a task using a Docker tag (e.g. ubuntu:latest) are started within that time span, they will both be run with the same hash, even if the tag latest is updated in the meantime.
The purpose of this cache is to reduce the time spent by Cromwell resolving docker tags.
This cache can be adjusted or completely disabled in the configuration.
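The effect of the tag -> hash cache described above can be illustrated with a small time-to-live cache. This is a minimal sketch, not Cromwell's implementation: the class name, the resolver callback, the fake clock, and the sample hashes are all hypothetical; only the 20 minute default comes from the text above.

```python
import time


class TagHashCache:
    """Remember tag -> hash lookups for a limited time (default 20 minutes)."""

    def __init__(self, resolve, ttl_seconds=20 * 60, clock=time.monotonic):
        self._resolve = resolve      # callable: tag -> hash (e.g. a registry lookup)
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the demo can control time
        self._entries = {}           # tag -> (hash, time stored)

    def lookup(self, tag):
        now = self._clock()
        cached = self._entries.get(tag)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]         # still fresh: reuse the remembered hash
        image_hash = self._resolve(tag)
        self._entries[tag] = (image_hash, now)
        return image_hash


# Hypothetical registry state and a fake clock, for demonstration only.
registry = {"ubuntu:latest": "sha256:aaa"}
now = [0.0]
cache = TagHashCache(resolve=lambda tag: registry[tag], clock=lambda: now[0])

first = cache.lookup("ubuntu:latest")       # resolved and cached
registry["ubuntu:latest"] = "sha256:bbb"    # the tag is updated upstream
now[0] = 10 * 60
second = cache.lookup("ubuntu:latest")      # within 20 minutes: old hash reused
now[0] = 21 * 60
third = cache.lookup("ubuntu:latest")       # entry expired: resolved again
```

In this sketch, `first` and `second` are the same hash even though the tag moved, which is the behavior the note warns about; `third` picks up the new hash once the entry has expired.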
Docker
- The Docker section of the configuration has been slightly reworked.
An option to specify how a Docker hash should be looked up has been added. Two methods are available.
"local" will try to look for the image on the machine where Cromwell is running. If it can't be found, Cromwell will try to pull the image and use the hash from the retrieved image.
"remote" will try to look up the image hash directly on the remote repository where the image is located (Docker Hub and GCR are supported).
Note that the "local" option will require Docker to be installed on the machine running Cromwell, in order for it to call the docker CLI.
- Adds hash lookup support for public quay.io images.
WDL Feature Support
- Added support for the new WDL basename function. Allows WDL authors to get just the file name from a File (i.e. removing the directory path).
- Allows coercion of Map objects into Arrays of Pairs. This also allows WDL authors to directly scatter over WDL Maps.
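As a rough analogy for readers less familiar with WDL, the two features above behave like the following Python operations. This is not WDL and not how Cromwell implements them; the helper names and sample values are hypothetical.

```python
import posixpath


def basename(path):
    """Analogy for WDL basename: keep the file name, drop the directory path."""
    return posixpath.basename(path)


def map_to_pairs(mapping):
    """Analogy for Map -> Array[Pair] coercion: one (left, right) pair per entry."""
    return list(mapping.items())


file_name = basename("/data/samples/reads.bam")

# "Scattering" over a map then amounts to iterating over its pairs.
pairs = map_to_pairs({"sample1": "a.bam", "sample2": "b.bam"})
labels = [f"{left}:{right}" for left, right in pairs]
```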
Miscellaneous
- Adds support for the JSON file format for Google service account credentials. As of Cromwell 27, PEM credentials for PAPI are deprecated and support might be removed in a future version.
google {
  application-name = "cromwell"
  auths = [
    {
      name = "service-account"
      scheme = "service_account"
      json-file = "/path/to/file.json"
    }
  ]
}
General Changes
- The /query endpoint now supports querying by label. See the README for more information.
- The read_X standard library functions limit accepted file sizes. These differ by type, e.g. read_bool has a smaller limit than read_string. See reference.conf for default settings.