estimate

ESTIMATE provides researchers with scores for tumor purity, the level of stromal cells present, and the infiltration level of immune cells in tumor tissues, based on expression data.

Dependencies

Usage

Cromwell

java -jar cromwell.jar run estimate.wdl --inputs inputs.json

Inputs

Required workflow parameters:

| Parameter | Type | Description |
|:---|:---|:---|
| `inputData` | Array[Pair[File,File]]+ | Input file pairs. The merge commands in the Commands section read RSEM `*.genes.results` and STAR `*.ReadsPerGene.out.tab` files. |
| `launchEstimate.estimateScript` | String | Script that runs ESTIMATE. |
| `launchEstimate.rsemZscoreRScript` | String | Script that calculates z-scores for the ESTIMATE results. |
| `launchEstimate.ensFile` | String | File used to convert Ensembl gene_id values to HUGO symbols. |
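
As an illustration of how the required parameters above might be supplied, the following sketch writes a minimal `inputs.json`. It rests on several assumptions that are not confirmed by this README: the workflow is assumed to be named `estimate` (hence the `estimate.` key prefix), each pair is assumed to hold an RSEM file on the left and the matching STAR file on the right, and every path is a placeholder.

```shell
# Write a minimal inputs.json for Cromwell.
# Assumptions: the workflow name is "estimate", each pair is
# (RSEM genes.results, STAR ReadsPerGene.out.tab), and all paths are placeholders.
cat > inputs.json <<'EOF'
{
  "estimate.inputData": [
    {
      "left": "/path/to/sample1.genes.results",
      "right": "/path/to/sample1.ReadsPerGene.out.tab"
    }
  ],
  "estimate.launchEstimate.estimateScript": "/path/to/estimate_script.R",
  "estimate.launchEstimate.rsemZscoreRScript": "/path/to/rsem_zscore.R",
  "estimate.launchEstimate.ensFile": "/path/to/ensembl_to_hugo.txt"
}
EOF
```

Cromwell reads a WDL `Pair` from a JSON object with `left` and `right` keys, which is why each sample is written as such an object here.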

Optional workflow parameters:

| Parameter | Type | Default | Description |
|:---|:---|:---|:---|
| `outputFileNamePrefix` | String | "ESTIMATE" | Prefix for output file names, customizable. |

Optional task parameters:

| Parameter | Type | Default | Description |
|:---|:---|:---|:---|
| `preProcessRsem.jobMemory` | Int | 8 | Memory allocated to the task. |
| `preProcessRsem.timeout` | Int | 20 | Timeout in hours, needed to override imposed limits. |
| `preProcessRsem.tmpDir` | String | "tmp" | Temporary directory. |
| `preProcessRsem.dataDir` | String | "data" | Data directory. |
| `launchEstimate.jobMemory` | Int | 8 | Memory allocated to the task. |
| `launchEstimate.timeout` | Int | 20 | Timeout in hours, needed to override imposed limits. |
| `launchEstimate.dataDir` | String | "." | Data directory. |
| `launchEstimate.modules` | String | "estimate/1.0.13" | Names and versions of required modules. This needs to be customized by Shesmu. |

Outputs

| Output | Type | Description | Labels |
|:---|:---|:---|:---|
| `gRcounts` | File | File with raw counts | vidarr_label: gRcounts |
| `gCounts` | File | File with estimated counts | vidarr_label: gCounts |
| `gFpkm` | File | FPKMs from RSEM | vidarr_label: gFpkm |
| `gTpm` | File | TPMs from RSEM | vidarr_label: gTpm |
| `estimateFile` | File | File with results from ESTIMATE | vidarr_label: estimateFile |

Commands

This section lists the commands run by the estimate workflow.

Running ESTIMATE

Merge data

Bash code is used to extract data from RSEM and STAR inputs into separate tables for TPMs, FPKMs and counts.

 
 TMP='~{tmpDir}'
 DATA='~{dataDir}'
 mkdir -p "$TMP"
 mkdir -p "$DATA"
 
 # Stage all RSEM and STAR inputs in one directory
 cp ~{sep=' ' rsemData} "$DATA"/
 cp ~{sep=' ' starData} "$DATA"/
 
 # Gene ID column for the STAR table: skip the first three summary rows
 # and turn the fourth (N_ambiguous) into the header
 STARG=$(ls "$DATA"/*.tab | head -1);
 if [ -n "$STARG" ]; then
   awk 'NR>3 {print $1}' "$STARG" | sed 's/N_ambiguous/gene_id/' > "$TMP"/sgene;
 fi;
 
 # Gene ID column for the RSEM tables (header included)
 RSEMG=$(ls "$DATA"/*.genes.results | head -1);
 if [ -n "$RSEMG" ]; then
   cut -f 1 "$RSEMG" > "$TMP"/genes;
 fi;
 
 # We will use basename as a sample ID here
 for t in "$DATA"/*results;do
   BASE=$(basename "$t" | sed 's/\.results$//');
   NAME=$(echo "$BASE" | sed 's/\..*//');
   echo "$t";
   # RSEM columns: 5 = expected_count, 6 = TPM, 7 = FPKM
   echo "$NAME" > "$TMP/$NAME.fpkm";
   cut -f 7 "$t" | awk 'NR>1' >> "$TMP/$NAME.fpkm";
   echo "$NAME" > "$TMP/$NAME.tpm";
   cut -f 6 "$t" | awk 'NR>1' >> "$TMP/$NAME.tpm";
   echo "$NAME" > "$TMP/$NAME.count";
   cut -f 5 "$t" | awk 'NR>1' >> "$TMP/$NAME.count";
   # STAR raw counts: keep the larger of columns 3 and 4 for each gene
   echo "$NAME" > "$TMP/$NAME.rcount";
   awk 'NR>4 {if ($4 >= $3) print $4; else print $3}' "$DATA/$NAME.ReadsPerGene.out.tab" >> "$TMP/$NAME.rcount";
 done
 
 # Merging: gene ID column first, then one column per sample
 paste "$TMP"/sgene "$TMP"/*.rcount > ~{outputPrefix}_genes_all_samples_RCOUNT.txt;
 paste "$TMP"/genes "$TMP"/*.count > ~{outputPrefix}_genes_all_samples_COUNT.txt;
 paste "$TMP"/genes "$TMP"/*.fpkm > ~{outputPrefix}_genes_all_samples_FPKM.txt;
 paste "$TMP"/genes "$TMP"/*.tpm > ~{outputPrefix}_genes_all_samples_TPM.txt;
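
To make the merge step easier to follow, here is a small self-contained sketch that applies the same `cut`, `awk`, and `paste` pattern to toy data. The directory `merge_demo`, the sample names, the gene IDs, and all values are made up for illustration. Only the FPKM and raw-count tables are built, and the output names use a hypothetical `toy` prefix.

```shell
set -eu

# Scratch layout (hypothetical names; nothing here comes from the real workflow inputs)
WORK=./merge_demo
DATA=$WORK/data
TMP=$WORK/tmp
mkdir -p "$DATA" "$TMP"

# Toy RSEM genes.results files. Columns: gene_id, transcript_id(s), length,
# effective_length, expected_count, TPM, FPKM
rsem_row='%s\t%s\t%s\t%s\t%s\t%s\t%s\n'
printf "$rsem_row" gene_id 'transcript_id(s)' length effective_length expected_count TPM FPKM \
  ENSG1 ENST1 1000 900 10 1.5 2.5 \
  ENSG2 ENST2 2000 1900 20 3.5 4.5 > "$DATA/sampleA.genes.results"
printf "$rsem_row" gene_id 'transcript_id(s)' length effective_length expected_count TPM FPKM \
  ENSG1 ENST1 1000 900 30 5.0 5.5 \
  ENSG2 ENST2 2000 1900 40 6.0 6.5 > "$DATA/sampleB.genes.results"

# Toy STAR ReadsPerGene.out.tab files: four summary rows, then one row per gene
# with three count columns (columns 2 to 4)
star_row='%s\t%s\t%s\t%s\n'
printf "$star_row" N_unmapped 5 5 5 N_multimapping 1 1 1 N_noFeature 2 9 3 N_ambiguous 0 0 0 \
  ENSG1 12 2 11 ENSG2 30 28 4 > "$DATA/sampleA.ReadsPerGene.out.tab"
printf "$star_row" N_unmapped 5 5 5 N_multimapping 1 1 1 N_noFeature 2 9 3 N_ambiguous 0 0 0 \
  ENSG1 8 7 1 ENSG2 16 3 14 > "$DATA/sampleB.ReadsPerGene.out.tab"

# Gene ID columns, taken from the first file of each kind
cut -f 1 "$DATA/sampleA.genes.results" > "$TMP/genes"
awk 'NR>3 {print $1}' "$DATA/sampleA.ReadsPerGene.out.tab" | sed 's/N_ambiguous/gene_id/' > "$TMP/sgene"

# One single-column file per sample and measure, headed by the sample name
for t in "$DATA"/*.genes.results; do
  NAME=$(basename "$t" | sed 's/\..*//')
  { echo "$NAME"; cut -f 7 "$t" | awk 'NR>1'; } > "$TMP/$NAME.fpkm"
  { echo "$NAME"; awk 'NR>4 {if ($4 >= $3) print $4; else print $3}' \
      "$DATA/$NAME.ReadsPerGene.out.tab"; } > "$TMP/$NAME.rcount"
done

# Column-wise merge: gene IDs first, then one column per sample
paste "$TMP/genes" "$TMP"/*.fpkm > "$WORK/toy_genes_all_samples_FPKM.txt"
paste "$TMP/sgene" "$TMP"/*.rcount > "$WORK/toy_genes_all_samples_RCOUNT.txt"

cat "$WORK/toy_genes_all_samples_FPKM.txt"
cat "$WORK/toy_genes_all_samples_RCOUNT.txt"
```

Because `paste` joins files line by line, the merged tables are only correct when every sample lists the same genes in the same order. The sketch, like the workflow command, relies on that rather than checking it.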

Run ESTIMATE using FPKM values

  set -euo pipefail
  Rscript ~{estimateScript} ~{inRSEM} ~{dataDir} ~{ensFile} ~{rsemZscoreRScript} ~{outputFileNamePrefix}
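
For orientation, the sketch below shows what the command line above could look like once the WDL placeholders are filled in. All values are hypothetical stand-ins. In particular, it assumes that `inRSEM` is the merged FPKM table from the previous step and that the prefix is `ESTIMATE`. The command is printed rather than executed, because the script paths are placeholders.

```shell
set -eu

# Hypothetical values standing in for the WDL placeholders
estimateScript=/path/to/estimate_script.R
inRSEM=ESTIMATE_genes_all_samples_FPKM.txt   # assumed: merged FPKM table
dataDir=.
ensFile=/path/to/ensembl_to_hugo.txt
rsemZscoreRScript=/path/to/rsem_zscore.R
outputFileNamePrefix=ESTIMATE

# Assemble the command in the same positional order as the workflow command
cmd="Rscript $estimateScript $inRSEM $dataDir $ensFile $rsemZscoreRScript $outputFileNamePrefix"

# Print only; do not run
echo "$cmd"
```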

Support

For support, please file an issue on the GitHub project or send an email to gsi@oicr.on.ca.

Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)
