Winner algorithm for CAGI4 eQTL-causal SNP challenge. EnsembleExpr can predict MPRA reporter expression level from sequence, and predict which sequence varaints will lead to significant allele-specific expression.
docker pull haoyangz/ensembleexpr docker run -v VCF_FILE:/infile.vcf -v OUTPUT_DIR:/outdir -v /etc/passwd:/etc/passwd -u $(id -u) -it --rm haoyangz/ensembleexpr python main.py /infile.vcf /outdir ORDER
VCF_FILE: the absolte path to a list of sequence variants in VCF format (example)
OUTPUT_DIR: the absolute path to the output directory, under which the expression predictions from each components in the ensemble and the average will be saved.
ORDER: several orders can be concatenated separated by space
-f: feature generation
-e: predict expression for both alleles of each variant
-v: predict which variant will lead to significant allele-specific expression
python main.py -h for detail descriptions of the options.