Welcome to the ENPKG Full Workflow!

🌟 We're delighted to have you explore our computational workflow. This guide walks you through the installation, setup, and execution of the ENPKG full workflow. Interested in the science behind ENPKG? Check out the paper (https://doi.org/10.1021/acscentsci.3c00800)! It describes the insights and methodologies that power this workflow.

🌱 Getting Started

Clone the repository

First, clone the repository to your local machine:

git clone https://github.com/enpkg/enpkg_full.git

Navigate to the newly created folder:

cd enpkg_full

Install the required environment

We support installation with either Mamba or Poetry, as described below:

Mamba

Start your journey by setting up the required environment. It's a breeze (or not) with Mamba! See the Mamba documentation for more details.

mamba env create -f environment.yml

Poetry

See the Poetry documentation for more details, then install the dependencies:

poetry install

Activate the environment

Mamba

Once the environment is ready, bring it to life with this simple command:

conda activate enpkg_full

Poetry

poetry shell
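
To confirm that activation worked before going further, a small check along the following lines can help. It is a sketch and not part of the repository: the function name in_enpkg_env is hypothetical, and it assumes that conda activate sets CONDA_DEFAULT_ENV to the environment name and that poetry shell sets VIRTUAL_ENV, which is the usual behaviour of both tools.

```shell
# Hypothetical helper (not shipped with enpkg_full).
# Succeeds when a conda/Mamba environment named enpkg_full is active,
# or when any virtualenv (e.g. one started by `poetry shell`) is active.
in_enpkg_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "enpkg_full" ]; then
        echo "conda environment enpkg_full is active"
        return 0
    fi
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "virtualenv is active: $VIRTUAL_ENV"
        return 0
    fi
    echo "no enpkg_full environment detected" >&2
    return 1
}
```

After pasting the function into your shell, run in_enpkg_env. Note that the virtualenv branch cannot tell whether the active virtualenv actually belongs to this project.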

🌐 Install Sirius Locally

Check details at https://boecker-lab.github.io/docs.sirius.github.io/install/

To get the latest version for your platform, run the install_sirius.sh script, passing the installation path as its argument. For example, from the root of the repository:

bash src/install_sirius.sh /home/username/sirius

Once Sirius is installed, you will need to specify the path to the executable (see the Editing Config Files section).
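
If you are unsure where the executable ended up, the following sketch may help you locate it. It is not part of the repository: find_sirius_bin is a hypothetical helper, and it assumes the executable is a file named sirius somewhere below the installation directory, which may differ on your platform.

```shell
# Hypothetical helper (not shipped with enpkg_full).
# Prints the first file named "sirius" found under the directory given
# as $1, provided that file is executable; fails otherwise.
find_sirius_bin() {
    install_dir=$1
    found=$(find "$install_dir" -type f -name sirius 2>/dev/null | head -n 1)
    if [ -n "$found" ] && [ -x "$found" ]; then
        echo "$found"
        return 0
    fi
    echo "no executable named sirius under $install_dir" >&2
    return 1
}
```

For the installation path used in the example above, you would call find_sirius_bin /home/username/sirius and copy the printed path into the config file.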

🔐 Setting Up Environment Variables

To log in to Sirius, you need to set two environment variables: SIRIUS_USERNAME and SIRIUS_PASSWORD. You can do so by running the following command:

bash src/setup_sirius_env.sh
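
Before launching the workflow, you may want to verify that both variables are visible in your current shell. The sketch below is one way to do that. check_sirius_env is a hypothetical helper, not part of the repository, and it assumes the setup script makes the variables available to the shell you run the workflow from (you may need to open a new terminal first).

```shell
# Hypothetical helper (not shipped with enpkg_full).
# Returns 0 when SIRIUS_USERNAME and SIRIUS_PASSWORD are both non-empty,
# 1 otherwise. The values themselves are never printed.
check_sirius_env() {
    missing=0
    if [ -z "${SIRIUS_USERNAME:-}" ]; then
        echo "SIRIUS_USERNAME is not set" >&2
        missing=1
    fi
    if [ -z "${SIRIUS_PASSWORD:-}" ]; then
        echo "SIRIUS_PASSWORD is not set" >&2
        missing=1
    fi
    return "$missing"
}
```

Running check_sirius_env prints nothing when both variables are set and names the missing one otherwise.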

🛠 Editing Config Files

You will need to edit the following parameters file:

Parameters at user.yaml

All parameters are commented and should be self-explanatory.

E.g. selection of the dataset to process

For example, you can enter the record_id and record_name of a Zenodo dataset on line. As currently configured, this will download a small test dataset (https://doi.org/10.5281/zenodo.10018590).
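
If you only have the DOI of a Zenodo dataset, the sketch below shows one way to derive the numeric record ID and the record page URL from it. It is an illustration only: zenodo_record_url is a hypothetical helper, and it assumes that the record_id expected in user.yaml is the numeric suffix of the Zenodo DOI.

```shell
# Hypothetical helper (not shipped with enpkg_full).
# Extracts the numeric record ID from a Zenodo DOI (bare or as a
# doi.org URL) and prints the matching Zenodo record page URL.
zenodo_record_url() {
    doi=$1
    case "$doi" in
        *zenodo.*) record_id=${doi##*zenodo.} ;;
        *) echo "not a Zenodo DOI: $doi" >&2; return 1 ;;
    esac
    case "$record_id" in
        ''|*[!0-9]*) echo "unexpected record ID: $record_id" >&2; return 1 ;;
    esac
    echo "https://zenodo.org/records/$record_id"
}
```

For the test dataset, zenodo_record_url https://doi.org/10.5281/zenodo.10018590 prints https://zenodo.org/records/10018590, where the file names of the record are listed.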

🚀 Launching the Workflow

From the root of the repository, run:

sh workflow/00_workflow_all.sh

On the test dataset mentioned above, this should take about 10 minutes to run.
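
Because the run takes several minutes and chains multiple steps, it can be handy to keep a log and see the exit status and elapsed time at the end. The wrapper below is a sketch of that idea. run_logged is a hypothetical helper, not part of the repository, and it relies on date +%s, which is available in GNU and BSD date.

```shell
# Hypothetical helper (not shipped with enpkg_full).
# Usage: run_logged LOG_FILE COMMAND [ARGS...]
# Runs the command with stdout and stderr redirected to LOG_FILE, then
# prints its exit status and elapsed seconds, and returns that status.
run_logged() {
    log_file=$1
    shift
    start=$(date +%s)
    if "$@" >"$log_file" 2>&1; then
        status=0
    else
        status=$?
    fi
    end=$(date +%s)
    echo "exit status $status after $((end - start)) s (log: $log_file)"
    return "$status"
}
```

From the root of the repository, this could be used as run_logged workflow.log sh workflow/00_workflow_all.sh, where workflow.log is an arbitrary file name. The output goes to the log file only, so use tail -f workflow.log in another terminal to follow progress.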

🎉 Explore your newly generated Knowledge Graph

You can use GraphDB to explore the generated Knowledge Graph. To do so, you will need to install GraphDB (https://graphdb.ontotext.com/download/) and import the generated .ttl files. Make sure to read the latest GraphDB documentation (https://graphdb.ontotext.com/documentation/) to get started.
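
Before importing, it can help to list the .ttl files the workflow produced. The sketch below does this for a directory you pass in. list_ttl_files is a hypothetical helper, and the location of the generated files depends on your configuration, so the directory is left as an argument rather than assumed.

```shell
# Hypothetical helper (not shipped with enpkg_full).
# Prints every .ttl file found below the directory given as $1,
# one path per line, sorted. Fails if $1 is not a directory.
list_ttl_files() {
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    find "$1" -type f -name '*.ttl' | sort
}
```

For example, list_ttl_files path/to/your/output | wc -l gives the number of files to import, where path/to/your/output is a placeholder for your own data directory.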

🌟 Your Feedback Matters

Facing an issue? Whether you have hit a glitch or have a suggestion, your input is crucial to us. Here’s how you can help:

Report Issues: Use the 'Issues' tab in our GitHub repo to report any problems or ideas.
Detailed Descriptions Help: Include as much detail as possible - error messages, steps to reproduce, and screenshots are all super helpful.
Stay Updated: We’ll keep you in the loop as we work on fixing the issue or considering your suggestion.

🔎 Explore the ENPKG graph

A Knowledge Graph has been built from a collection of 1,600 tropical plant extracts (https://doi.org/10.1093/gigascience/giac124). This KG also integrates data from a metabolomics study of 337 medicinal plants of the Korean Pharmacopeia (https://doi.org/10.1038/s41597-022-01662-2). It can be explored through the following links.

The ENPKG graph is available at https://enpkg.commons-lab.org/graphdb/. No login is required!
The SPARQL query interface can be reached at https://enpkg.commons-lab.org/graphdb/sparql. Make sure to check the paper for example queries.
The ENPKG vocabulary is described at https://enpkg.commons-lab.org/doc/index.html
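
Besides the web interface, a SPARQL endpoint can usually be queried from the command line. The sketch below sends a query with curl using the URL-encoded POST form defined by the SPARQL 1.1 Protocol. It is an illustration under assumptions: sparql_select is a hypothetical helper, and the machine-readable endpoint URL is not stated in this README, so it is passed as an argument rather than hard-coded.

```shell
# Hypothetical helper (not shipped with enpkg_full).
# Usage: sparql_select ENDPOINT_URL QUERY
# ENDPOINT_URL is an ASSUMPTION you must supply yourself; it is not
# given in this README. Results are requested as SPARQL JSON.
sparql_select() {
    endpoint=$1
    query=$2
    curl --silent --show-error --fail \
        --header 'Accept: application/sparql-results+json' \
        --data-urlencode "query=$query" \
        "$endpoint"
}
```

GraphDB instances generally expose repositories under a path of the form /repositories/<repository-id>, so a call might look like sparql_select "<base-url>/repositories/<repository-id>" 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'. Both placeholders are hypothetical here, so check the GraphDB interface linked above for the actual repository name.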

📜 Citations

This set of scripts forms a pipeline that calls multiple tools. Please make sure to cite the original authors of the tools used in this workflow.

Allard et al. 2016, Analytical Chemistry
Davies et al. 2015, Nucleic Acids Research
Djoumbou et al. 2016, Journal of Cheminformatics
Dührkop et al. 2015, Nature Methods
Dührkop et al. 2019, Nature Methods
Dührkop et al. 2020, Nature Biotechnology
Gaudry et al. 2022, Frontiers in Bioinformatics
Hoffmann et al. 2022, Nature Biotechnology
Huber et al. 2020, Journal of Open Source Software
Kim et al. 2021, Journal of Natural Products
Ludwig et al. 2020, Nature Machine Intelligence
McTavish et al. 2021, Systematic Biology
Rutz et al. 2016, Frontiers in Plant Science
Rutz et al. 2022, eLife

To cite the ENPKG workflow, please use the following reference: Gaudry et al. 2023.

📋 Note

This workflow is a pilot application that aims to move classical metabolomics datasets to Linked Open Data. It is currently being ported to the more generic EMIKG framework (https://github.com/earth-metabolome-initiative/emikg) that we are developing as part of the Earth Metabolome Initiative (http://www.earthmetabolome.org/). Stay tuned.
