Genotype-phenotype analysis using the ADAM genomics analysis platform. This is a work in progress. Currently, we implement a simple case/control analysis using a chi-squared test.
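To illustrate what a chi-squared case/control test computes, here is a minimal sketch in plain Python. It is not Gnocchi's implementation: the 2x2 table layout (carriers vs. non-carriers, cases vs. controls), the function name `chi_squared_2x2`, and the counts are all assumptions made for illustration.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table.

    table[i][j] is the observed count for genotype group i and phenotype
    status j. Assumes every row and column total is non-zero.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of genotype and phenotype.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper tail probability of the
    # chi-squared distribution equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows = carriers / non-carriers, columns = cases / controls.
observed = [[30, 10], [20, 40]]
stat, p = chi_squared_2x2(observed)
print(f"chi2 = {stat:.3f}, p = {p:.3g}")
```

A large statistic (small p-value) suggests that carrying the variant and having the phenotype are not independent in the sample.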
To build, install Maven. Then run:
Maven will automatically pull down and install all of the necessary dependencies.
Occasionally, building in Maven will fail due to memory issues. You can work around this by setting the MAVEN_OPTS environment variable to raise the JVM memory limits.
Once Spark is installed, set the environment variable SPARK_HOME to point to the Spark installation root directory. Then, you can run Gnocchi through the ./bin/gnocchi-submit script.
We include test data. You can run with the test data by running:
./bin/gnocchi-submit regressPhenotypes testData/sample.vcf testData/samplePhenotypes.csv testData/associations -saveAsText
We accept phenotype inputs in a CSV format:
Sample,Phenotype,Has Phenotype
mySample,a phenotype,true
The Has Phenotype column is a binary true/false value. See the test data for more examples.
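As a quick way to sanity-check a phenotype file before submitting a job, the sketch below parses the three-column format with Python's standard csv module. It assumes the file starts with the header row Sample,Phenotype,Has Phenotype; the helper name `parse_phenotypes` is hypothetical and is not part of Gnocchi.

```python
import csv
import io


def parse_phenotypes(text):
    """Parse phenotype CSV text into (sample, phenotype, has_phenotype) tuples.

    Assumes a header row with the columns Sample, Phenotype, Has Phenotype.
    Raises ValueError if the Has Phenotype value is not true/false.
    """
    reader = csv.DictReader(io.StringIO(text))
    records = []
    for row in reader:
        flag = row["Has Phenotype"].strip().lower()
        if flag not in ("true", "false"):
            raise ValueError(f"expected true/false, got {flag!r}")
        records.append(
            (row["Sample"].strip(), row["Phenotype"].strip(), flag == "true")
        )
    return records


example = "Sample,Phenotype,Has Phenotype\nmySample,a phenotype,true\n"
print(parse_phenotypes(example))
```

To check a file on disk, read its contents first, for example with `open(path).read()`, and pass the resulting string to the function.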
This project is released under an Apache 2.0 license.