metadata database cannot be configured past 1000000 rows #7203

mvforster opened this issue Aug 14, 2023 · 0 comments
Dear Cromwell Team,
I am trying to run a workflow written in WDL using Cromwell v65. The workflow fails, reporting the following error on stdout:

```
cromwell.services.MetadataTooLargeNumberOfRowsException: Metadata for workflow <UUID> exists in database but cannot be served because row count of 3138431 exceeds configured limit of 1000000.
```

This occurs even after editing `cromwell.conf` as suggested in [this thread](https://github.com/broadinstitute/cromwell/issues/2519).
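To make the change easy to spot, here is the specific override I added, shown in isolation before the full file. My assumption, taken from the exception text rather than the Cromwell source, is that this setting is what produces the "configured limit of 1000000" message:

```
services {
  MetadataService {
    # Assumption: this is the setting behind the "configured limit of 1000000"
    # in the exception above; its default appears to be 1000000.
    metadata-read-row-number-safety-threshold = 5000000
  }
}
```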

The configuration file used is as follows (edited to remove the main script):

```
include required(classpath("application"))

backend {
  default = LSF
  providers {
    LSF {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        exit-code-timeout-seconds = 300

        runtime-attributes = """
        Int cpu
        Int memory_mb
        String? lsf_queue
        String? lsf_project
        String? docker
        """

        submit = """
        bsub \
        -q ${lsf_queue} \
        -P ${lsf_project} \
        -J ${job_name} \
        -cwd ${cwd} \
        -o ${out} \
        -e ${err} \
        -n ${cpu} \
        -R 'rusage[mem=${memory_mb}] span[hosts=1]' \
        -M ${memory_mb} \
        /usr/bin/env bash ${script}
        """

        submit-docker = """
        module load tools/singularity/3.8.3
        SINGULARITY_MOUNTS='<redacted>'
        export SINGULARITY_CACHEDIR=$HOME/.singularity/cache
        LOCK_FILE=$SINGULARITY_CACHEDIR/singularity_pull_flock

        export SINGULARITY_DOCKER_USERNAME=<redacted>
        export SINGULARITY_DOCKER_PASSWORD=<redacted>

        flock --exclusive --timeout 900 $LOCK_FILE \
        singularity exec docker://${docker} \
        echo "Successfully pulled ${docker}"

        bsub \
        -q ${lsf_queue} \
        -P ${lsf_project} \
        -J ${job_name} \
        -cwd ${cwd} \
        -o ${out} \
        -e ${err} \
        -n ${cpu} \
        -R 'rusage[mem=${memory_mb}] span[hosts=1]' \
        -M ${memory_mb} \
        singularity exec --containall $SINGULARITY_MOUNTS --bind ${cwd}:${docker_cwd} docker://${docker} ${job_shell} ${docker_script}
        """

        job-id-regex = "Job <(\\d+)>.*"
        kill = "bkill ${job_id}"
        kill-docker = "bkill ${job_id}"
        check-alive = "bjobs -w ${job_id} |& egrep -qvw 'not found|EXIT|JOBID'"

        filesystems {
          local {
            localization: [
              "soft-link", "copy", "hard-link"
            ]
            caching {
              duplication-strategy: [
                "soft-link", "copy", "hard-link"
              ]
              hashing-strategy: "path+modtime"
            }
          }
        }
      }
    }
  }
}

call-caching {
  enabled = true
  invalidate-bad-cache-results = true
}

database {
  profile = "slick.jdbc.HsqldbProfile$"
  db {
    driver = "org.hsqldb.jdbcDriver"
    url = """
    jdbc:hsqldb:file:cromwell-executions/cromwell-db/cromwell-db;
    shutdown=false;
    hsqldb.default_table_type=cached;hsqldb.tx=mvcc;
    hsqldb.result_max_memory_rows=10000;
    hsqldb.large_data=true;
    hsqldb.applog=1;
    hsqldb.lob_compressed=true;
    hsqldb.script_format=3;
    hsqldb.log_size=0
    """
    connectionTimeout = 86400000
    numThreads = 2
  }
  insert-batch-size = 2000
  read-batch-size = 5000000
  write-batch-size = 5000000

  metadata {
    profile = "slick.jdbc.HsqldbProfile$"
    db {
      driver = "org.hsqldb.jdbcDriver"
      url = """
      jdbc:hsqldb:file:cromwell-executions/cromwell-db/cromwell-metadata-db/;
      shutdown=false;
      hsqldb.default_table_type=cached;hsqldb.tx=mvcc;
      hsqldb.result_max_memory_rows=10000;
      hsqldb.large_data=true;
      hsqldb.applog=1;
      hsqldb.lob_compressed=true;
      hsqldb.script_format=3;
      hsqldb.log_size=0
      """
      connectionTimeout = 86400000
      numThreads = 2
    }
    insert-batch-size = 2000
    read-batch-size = 5000000
    write-batch-size = 5000000
  }
}

services {
  MetadataService {
    metadata-read-row-number-safety-threshold = 5000000
  }
}
```

The main issue, as far as I can see, is that Cromwell ignores the increased metadata row threshold. This is despite my separating out the metadata database and raising the thresholds on both databases.
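One possibility I cannot rule out: in the copies of Cromwell's `reference.conf` I have seen, `MetadataService` settings appear nested inside a `config` sub-block, so my flat override above may simply not be picked up. A sketch of that alternative placement, assuming the nesting is required (I have not verified this against the v65 source):

```
services {
  MetadataService {
    # Hypothetical placement: same setting, nested one level deeper inside
    # a "config" block, matching the shape of reference.conf.
    config {
      metadata-read-row-number-safety-threshold = 5000000
    }
  }
}
```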

Before rerunning with the changes listed above, I completely purged the working directory of logs and metadata to ensure a clean run.

The documentation currently provides no additional guidance on how to overcome this error. Any assistance would be appreciated.
Best wishes,

Matthieu
