Utilizes docker-compose, which should be installed on your machine
- Please ensure your terms of service, either individually or with your institution, permit you to use the source data. Contact each source listed below to find out more.
- Additionally, this database is intended for nutrition research and is not a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of your health care provider with any medical or health-related questions. Don't disregard medical advice or delay seeking treatment because of anything you find in this database.
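Before going further, you can sanity-check the prerequisites (docker-compose from the requirement above; docker and tar are used in the steps below):

```shell
# Check that the tools used in the steps below are installed:
# docker/docker-compose to build and run the stack, tar to unpack the NLP archive
for cmd in docker docker-compose tar; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: NOT FOUND - install it before continuing"
  fi
done
```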
Name | Node Labels |
---|---|
Chemical Entities of Biological Interest (ChEBI) | C |
Comparative Toxicogenomics Database | Co |
Disease Ontology | D |
Gene Ontology | Pa |
Human Phenotype Ontology | D |
Medical Subject Headings | O,Pl,C,Pa,D |
MONDO | D |
Nal Thesaurus | Pl,C,D |
NCBI Gene | G |
NCBI Taxonomy | O,Pl |
NCBI MedGen | Co |
Text-mined data from MEDLINE Abstracts | Co |
Each source is downloaded from its respective URL; the table above shows the abbreviated node labels extracted from each: Plant (Pl), Chemical (C), Gene (G), Pathway (Pa), Phenotype (D), Organism (O), Connectivity (Co).
Step 1: Clone/fork this project
git clone https://github.com/atrautm1/ABCkb
Step 2: Prepare for data
- Ensure proper permissions for data usage are acquired
- Allocate at least 8 GB of RAM and 2 GB of swap for Docker
- Storage requirements
    - 20 GB total
    - 16 GB purgeable data (docker/data; docker/neo4j/import)
    - 4 GB database size
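One way to check the storage requirement before building (a sketch using POSIX `df -P`; run it from the clone directory):

```shell
# Report free space on the filesystem holding the current directory,
# to compare against the ~20 GB requirement above
df -Pk . | awk 'NR==2 {printf "free: %.1f GB\n", $4/1048576}'
```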
Step 3: Download the NLP results and place them into the docker/data folder
- NLP results
- Uncompress the archive, but leave the individual TSV files compressed:
tar -xzvf linguamatics.tar.gz
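If you prefer to verify the archive and extract it in one step, the following sketch assumes the archive sits in the current directory and targets the docker/data folder from Step 3:

```shell
ARCHIVE=linguamatics.tar.gz
if [ -f "$ARCHIVE" ]; then
  # List contents first to confirm the compressed TSV files are present
  tar -tzf "$ARCHIVE"
  # Extract into docker/data; the individual .tsv.gz members stay compressed
  mkdir -p docker/data
  tar -xzf "$ARCHIVE" -C docker/data
else
  echo "$ARCHIVE not found; download the NLP results first (Step 3)"
fi
```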
Step 4: Start docker container
docker-compose -f docker-compose.yml up
- Make some coffee; this takes about 45 minutes on the first run
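While you wait, you can poll the two endpoints this README uses (the interface on port 8000 and the Neo4j browser on 7474); the sketch below simply reports "not reachable yet" until the first build finishes:

```shell
# Poll the two service endpoints (8000: web interface, 7474: Neo4j browser);
# harmless to run early -- it only reports status, never fails
for url in http://localhost:8000/ http://localhost:7474/; do
  if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
    echo "$url is up"
  else
    echo "$url is not reachable yet"
  fi
done
```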
Credits: Aaron Trautman, Dr. Richard Linchangco, Steven Blanchard, Dr. Jeremy Jay, Dr. Cory Brouwer, and the interns who have contributed many features of this program
3 Ways to use the ABCkb
- Via the custom interface
- Use for searching and developing queries
- After the Knowledgebase is built and the database has started, you will see that an instance is running on http://0.0.0.0:8000/
- In the neo4j browser
- Create graphs and interact with the database using cypher
- After the Knowledgebase is built, you should see that an instance is running on http://localhost:7474/
- Via the terminal
- Interact with the database using cypher
- In the terminal, you can run cypher-shell "&lt;cypher query&gt;;" (Cypher statements must end with a semicolon)
Here are some example queries to get you started:
- Does the node exist?
MATCH (n) WHERE n.name = "Avena sativa" OR "Avena sativa" in n.synonyms RETURN n
- Open discovery with Avena sativa...
MATCH p=(:Plant {name:"Avena sativa"})-[:text_mined]->(:Chemical)-[:text_mined]->(:Gene) RETURN p
- Closed discovery with Avena sativa...
MATCH p=(:Plant {name:"Avena sativa"})-[:text_mined]->(:Chemical)-[:text_mined]->(:Gene {name:"HSD11B1"}) RETURN p
- Show schema
- Graph databases do not really have a "schema", but the following command works well for showing connectivity:
call db.schema.visualization()
Downloading data failed...
- I have written a script that should automatically extract the needed pieces from the sources listed in the sources.json file, but it may not always work. Some sources prevent programmatic access to their data, in which case you will need to download the files manually.
I want to add my own data...
- There are a couple of options, depending on how you want to proceed:
    - Add it to the already generated database using neo4j
    - Add it when building the KB
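For the first option, a minimal sketch using cypher-shell (the Plant label and the name/synonyms properties mirror the example queries above; the node values themselves are purely illustrative):

```shell
# Sketch only: pipe a CREATE statement into cypher-shell on the host running the KB.
# Falls back to a message if cypher-shell is missing or the database is not reachable.
if command -v cypher-shell >/dev/null 2>&1; then
  echo 'CREATE (:Plant {name:"Example plant", synonyms:["example"]});' | cypher-shell \
    || echo "could not connect to the database; is the stack running?"
else
  echo "cypher-shell not found on this host; run it inside the neo4j container instead"
fi
```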