The knowledgebase connecting plants to human health

Aliment to Bodily Condition Knowledgebase

Requires docker-compose, which must be installed on your machine.

Important!

  • Please ensure that your terms of service, either individually or through your institution, permit you to use the source data. Contact each source listed below to find out more.
  • Additionally, this database is intended for nutrition research and is not a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of your health care provider with any medical or health-related questions. Don't disregard medical advice or delay seeking treatment because of anything you find in this database.

Data Sources

Name                                              Data
Chemical Entities of Biological Interest (ChEBI)  C
Comparative Toxicogenomics Database               Co
Disease Ontology                                  D
Gene Ontology                                     Pa
Human Phenotype Ontology                          D
Medical Subject Headings                          O, Pl, C, Pa, D
MONDO                                             D
NAL Thesaurus                                     Pl, C, D
NCBI Gene                                         G
NCBI Taxonomy                                     O, Pl
NCBI MedGen                                       Co
Text-mined data from MEDLINE Abstracts            Co

The table lists every source, each retrieved from its respective URL, together with the abbreviated labels of the nodes extracted from it: Plant (Pl), Chemical (C), Gene (G), Pathway (Pa), Phenotype (D), Organism (O), Connectivity (Co).
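As a small illustration of the legend above, the abbreviations can be kept in a lookup table and expanded per source. This is only a sketch: the names `NODE_LABELS` and `expand_labels` are hypothetical and are not part of the ABCkb code base.

```python
from typing import List

# Abbreviation -> node label, copied from the legend of the Data Sources table.
NODE_LABELS = {
    "Pl": "Plant",
    "C": "Chemical",
    "G": "Gene",
    "Pa": "Pathway",
    "D": "Phenotype",
    "O": "Organism",
    "Co": "Connectivity",
}


def expand_labels(cell: str) -> List[str]:
    """Expand a comma-separated table cell such as 'O,Pl' into full label names."""
    return [NODE_LABELS[abbr.strip()] for abbr in cell.split(",") if abbr.strip()]


# Medical Subject Headings contributes five node types.
print(expand_labels("O,Pl,C,Pa,D"))
```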

Instructions for Building the Knowledgebase

Step 1: Clone/fork this project

  • git clone https://github.com/atrautm1/ABCkb

Step 2: Prepare for data

  • Ensure proper permissions for data usage are acquired
  • Allocate at least 8 GB RAM and 2 GB swap for Docker
  • Storage requirements
    • 20 GB total
    • 16 GB purgeable data (docker/data; docker/neo4j/import)
    • 4 GB database size
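Before starting the build, it may help to confirm that the target disk has room for the 20 GB listed above. The following is a minimal sketch using only the Python standard library; the helper name `has_enough_space` is hypothetical, and reading "20 GB" as 20 GiB is an assumption made here to stay on the conservative side.

```python
import shutil

GIB = 1024 ** 3  # assumption: treat the README's "GB" as GiB


def has_enough_space(free_bytes: int, required_gb: int = 20) -> bool:
    """Return True if free_bytes covers the storage needed for the build."""
    return free_bytes >= required_gb * GIB


if __name__ == "__main__":
    # Check the disk that holds the current directory (e.g. the cloned repo).
    free = shutil.disk_usage(".").free
    status = "ok" if has_enough_space(free) else "not enough space"
    print(f"free: {free / GIB:.1f} GiB ({status})")
```

The memory and swap limits are Docker settings and are not checked by this sketch.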

Step 3: Download the NLP results and place them into the docker/data folder

  • NLP results
  • Uncompress the archive, but leave the individual tsv files compressed
    • tar -xzvf linguamatics.tar.gz
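The same step can be sketched in Python: unpack only the outer archive and read a member file without decompressing it on disk. This assumes the individual tsv files are gzip-compressed (for example `*.tsv.gz`), which the instructions do not state; `extract_archive` and `peek_tsv` are hypothetical helper names.

```python
import csv
import gzip
import tarfile
from pathlib import Path
from typing import List


def extract_archive(archive: Path, dest: Path) -> List[str]:
    """Unpack the outer .tar.gz only; member files are written exactly as stored."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return sorted(p.name for p in dest.rglob("*") if p.is_file())


def peek_tsv(path: Path, n: int = 3) -> List[List[str]]:
    """Read the first n rows of a gzip-compressed TSV, leaving the file compressed."""
    with gzip.open(path, "rt", newline="") as fh:
        reader = csv.reader(fh, delimiter="\t")
        return [row for _, row in zip(range(n), reader)]
```

For example, `extract_archive(Path("docker/data/linguamatics.tar.gz"), Path("docker/data"))` would mirror the tar command above.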

Step 4: Start docker container

  • docker-compose -f docker-compose.yml up
  • Make some coffee; this takes about 45 minutes on the first run
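Because the first run is long, a small polling script can report when the services start accepting connections. This is a hedged sketch using only the standard library: ports 7474 and 8000 come from the usage section of this README, while the function names and timing defaults are assumptions. An open port only shows that a service is listening; it does not prove that the build has finished.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int, max_wait: float = 3600.0,
                  interval: float = 15.0) -> bool:
    """Poll host:port until it accepts connections or max_wait seconds elapse."""
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        if port_open(host, port):
            return True
        time.sleep(interval)
    return False


if __name__ == "__main__":
    for port in (7474, 8000):
        print(port, "up" if wait_for_port("localhost", port) else "timed out")
```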

Credits: Aaron Trautman, Dr. Richard Linchangco, Steven Blanchard, Dr. Jeremy Jay, Dr. Cory Brouwer, and the interns that have contributed many features of this program

How to use the knowledgebase

3 Ways to use the ABCkb

  1. Via the custom interface
    • Use for searching and developing queries
    • After the knowledgebase is built and the database has started, you will see that an instance is running on http://0.0.0.0:8000/
  2. In the neo4j browser
    • Create graphs and interact with the database using cypher
    • After the knowledgebase is built, you should see that an instance is running on localhost:7474/
  3. Via the terminal
    • Interact with the database using cypher
    • In the terminal, you can run cypher-shell ";"

Cypher reference guide

Here are some example queries to get you started.

  1. Does the node exist?
    • MATCH (n) WHERE n.name = "Avena sativa" OR "Avena sativa" IN n.synonyms RETURN n
  2. Open discovery with Avena sativa...
    • MATCH p=(:Plant {name:"Avena sativa"})-[:text_mined]->(:Chemical)-[:text_mined]->(:Gene) RETURN p
  3. Closed discovery with Avena sativa...
    • MATCH p=(:Plant {name:"Avena sativa"})-[:text_mined]->(:Chemical)-[:text_mined]->(:Gene {name:"HSD11B1"}) RETURN p
  4. Show schema
    • Graph databases do not have a fixed "schema" in the relational sense, but the following command gives a good overview of how the node labels are connected
    • CALL db.schema.visualization()
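The open and closed discovery queries above differ only in whether the final Gene node is constrained by name. A minimal sketch of a helper that builds either form with Cypher parameters (`$plant`, `$gene`) instead of inlined strings follows; `discovery_query` is a hypothetical name and not part of the ABCkb code base, and how the parameters are passed depends on the client you use.

```python
from typing import Dict, Optional, Tuple


def discovery_query(plant: str, gene: Optional[str] = None) -> Tuple[str, Dict[str, str]]:
    """Build a Plant -> Chemical -> Gene path query.

    With gene=None the query is open (any gene); otherwise it is closed
    (only paths ending at the named gene).
    """
    params = {"plant": plant}
    gene_pattern = "(:Gene)"
    if gene is not None:
        gene_pattern = "(:Gene {name: $gene})"
        params["gene"] = gene
    query = (
        "MATCH p=(:Plant {name: $plant})-[:text_mined]->(:Chemical)"
        f"-[:text_mined]->{gene_pattern} RETURN p"
    )
    return query, params


open_query, open_params = discovery_query("Avena sativa")
closed_query, closed_params = discovery_query("Avena sativa", "HSD11B1")
print(open_query)
print(closed_query, closed_params)
```

Keeping the values in parameters avoids quoting problems when a plant or gene name contains quotation marks.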

Troubleshooting

Downloading data failed...

  • The build includes a script that tries to download the data automatically from the sources listed in the sources.json file, but this may not always work. Some sources prevent programmatic access to their data, in which case you will need to download the files manually.
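The fallback described above can be sketched as a loop that attempts every download and collects the sources that need manual handling. This is not the project's actual script: the layout of sources.json is not documented here, so the sketch assumes a hypothetical mapping of file name to URL, and `fetch_url` and `download_all` are illustrative names.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Dict, List


def fetch_url(url: str, dest: Path) -> None:
    """Download url to dest; raises an OSError subclass on network or HTTP failure."""
    with urllib.request.urlopen(url, timeout=60) as resp, open(dest, "wb") as out:
        out.write(resp.read())


def download_all(sources: Dict[str, str], out_dir: Path,
                 fetch: Callable[[str, Path], None] = fetch_url) -> List[str]:
    """Try every source; return the names that failed and need a manual download."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for name, url in sources.items():
        try:
            fetch(url, out_dir / name)
        except OSError as err:  # urllib's URLError and HTTPError derive from OSError
            print(f"could not download {name}: {err}")
            failed.append(name)
    return failed
```

Any name returned by `download_all` would then be downloaded by hand and placed in the same folder.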

I want to add my own data...

  • There are a couple of options, depending on how you want to proceed:
    1. Add it to the already generated database using neo4j
    2. Add it when building the KB