
HCMI Memory issue when processing Transcriptomics files #444

@jjacobson95

Description


This error is intermittent because memory usage sits right at the limit, so background processes such as Docker can tip it over. The HCMI omics script downloads hundreds (thousands?) of samples, builds a dataframe for each, and then concatenates them all at once. Streaming each dataframe to file as it is processed would avoid holding all of them in memory simultaneously and crashing.

Memory error:

build_omics.sh: line 7:     7 Killed                  python 02-getHCMIData.py -m full_manifest.txt -t transcriptomics -o /tmp/hcmi_transcriptomics.csv.gz -g $1 -s $2
Error on or near line 7 while executing: python 02-getHCMIData.py -m full_manifest.txt -t transcriptomics -o /tmp/hcmi_transcriptomics.csv.gz -g $1 -s $2
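The streaming fix suggested above could look roughly like the sketch below: append each per-sample dataframe to the gzipped output as soon as it is built, writing the header only once, so at most one dataframe is resident in memory at a time. The `frames` iterable and column names here are hypothetical stand-ins, not the actual structures in `02-getHCMIData.py`.

```python
import gzip
import pandas as pd

def stream_concat(frames, out_path):
    """Append each DataFrame to a gzipped CSV as it arrives.

    Writes the header only for the first frame, so the result reads
    back as one table, while only one frame is in memory at a time.
    """
    with gzip.open(out_path, "wt") as fh:
        for i, df in enumerate(frames):
            df.to_csv(fh, header=(i == 0), index=False)

# Demo: a generator of small frames stands in for per-sample tables,
# so frames are built lazily rather than accumulated in a list.
frames = (pd.DataFrame({"gene": ["A", "B"], "tpm": [i, i + 1]})
          for i in range(3))
stream_concat(frames, "/tmp/hcmi_stream_demo.csv.gz")
```

Using a generator (rather than a list of dataframes) is what actually caps the peak memory here; a list followed by `pd.concat` would reintroduce the problem.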

Status: Done
