refactor tutorial notebooks to default include nft minting (#579) (#580)
acashmoney committed Aug 11, 2023
1 parent c00ab30 commit b306791
Showing 20 changed files with 1,428 additions and 1,008 deletions.
30 changes: 30 additions & 0 deletions docs/docs/concepts/data.md
@@ -0,0 +1,30 @@
---
title: Content-Addressed Data
sidebar_position: 2
sidebar_label: Data
---

Plex uses [**IPFS**](https://docs.ipfs.tech/), a decentralized storage protocol, to store files in its scientific computing workflows. In IPFS, all data is content-addressed: each file is identified by a unique content identifier ([**CID**](https://docs.ipfs.tech/concepts/content-addressing/#what-is-a-cid)).

CIDs are derived from a file's content rather than its location.

Because a CID is derived from the content, the same CID retrieves the same bytes from whichever node holds them. It also protects data integrity: the identifier changes whenever the content does, so any alteration is immediately noticeable.
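The idea of deriving an identifier from content can be illustrated with a plain hash. This is only a sketch: a real CID wraps the hash in multihash and multibase encoding, and IPFS chunks larger files, so the digest below is a stand-in and not a valid CID. The sample bytes are made up for illustration.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a hex SHA-256 digest of the bytes (a stand-in for a CID, not a real one)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical file contents, used only for illustration.
original = b"ATOM      1  N   MET A   1"
copy_elsewhere = bytes(original)  # same bytes, as if stored at another location
edited = original + b" "          # a single extra byte

same_id = content_digest(original) == content_digest(copy_elsewhere)
changed_id = content_digest(original) != content_digest(edited)
print(same_id, changed_id)
```

Identical bytes always produce the same identifier regardless of where they are stored, and any edit produces a different one.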

**Plex [pins](https://docs.ipfs.tech/how-to/pin-files/) all input and output data to IPFS.** See [Input / Output](io.md) for more details.

An example of content-addressed data:

```json
"protein": {
  "class": "File",
  "filepath": "6d08_protein_processed.pdb",
  "ipfs": "QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk"
}
```
The CID, **QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk**, can be used to access the content in multiple ways.

| Source | Access |
| ------ | ---- |
| IPFS-enabled browser (e.g., [Brave](https://brave.com/ipfs-support/)) | ipfs://QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk |
| IPFS Desktop | QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk |
| IPFS HTTPS gateway | https://ipfs.io/ipfs/QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk |
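
The access forms in the table follow a fixed pattern, so they can be built from a CID with simple string formatting. The helper name `access_paths` is hypothetical and not part of plex; the default gateway is the `ipfs.io` gateway used in the table.

```python
def access_paths(cid: str, gateway: str = "https://ipfs.io") -> dict:
    """Build the native IPFS URI and an HTTPS gateway URL for a CID."""
    return {
        "native": f"ipfs://{cid}",
        "gateway": f"{gateway.rstrip('/')}/ipfs/{cid}",
    }


paths = access_paths("QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk")
print(paths["native"])
print(paths["gateway"])
```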
15 changes: 15 additions & 0 deletions docs/docs/concepts/intro.md
@@ -0,0 +1,15 @@
---
title: Key Concepts
sidebar_position: 1
sidebar_label: Intro
---

Plex is a [Python package](https://pypi.org/project/PlexLabExchange/) designed for executing scientific workflows on a decentralized infrastructure. It supports distributed compute and storage, enabling workflows on any internet-connected machine.

A key feature of plex is its strict composability, ensuring each tool has defined inputs and outputs for seamless integration. Moreover, every file processed by plex is content-addressed, guaranteeing traceability and consistent data sharing. Explore plex's foundational concepts below:

* [Data](data.md)
* [Tools](tools.md)
* [Input / Output](io.md)
* [Tokens](tokens.md)
* [Additional resources](resources.md)
107 changes: 107 additions & 0 deletions docs/docs/concepts/io.md
@@ -0,0 +1,107 @@
---
title: Input / Output (IO)
sidebar_position: 4
sidebar_label: Input / Output (IO)
---

Plex employs a streamlined approach to input and output data management, facilitating consistency and transparency throughout the computation process.

Plex begins its IO process with [`plex_init`](../reference/python.md), which creates an `io.json` file. This file holds the instructions for the [Bacalhau](https://docs.bacalhau.org/) compute cluster: the parameters and expected outputs of each computational job.

Key components of the initialized `io.json`:

* **Input Data:** lists the provided input files, detailing their filename and corresponding CID
* **Output Data Placeholder:** lays out the expected outputs, as defined by the tool config
* **Tool Information:** indicates the computational tool to be used, along with the CID of its config
* **Job State:** initially set to `created`, it tracks the job's progression
* **Bacalhau Job ID Placeholder:** reserved for the unique job identifier once submitted to the Bacalhau compute cluster

## Initialized `io.json`

```json
[
  {
    "outputs": {
      "best_docked_small_molecule": {
        "class": "File",
        "filepath": "",
        "ipfs": ""
      },
      "protein": {
        "class": "File",
        "filepath": "",
        "ipfs": ""
      }
    },
    "tool": {
      "name": "equibind",
      "ipfs": "QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB"
    },
    "inputs": {
      "protein": {
        "class": "File",
        "filepath": "6d08_protein_processed.pdb",
        "ipfs": "QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk"
      },
      "small_molecule": {
        "class": "File",
        "filepath": "6d08_ligand.sdf",
        "ipfs": "QmPErdymxLwpXcEHnWXYqEVHvRBVnh7kr3Uu5DNt2Y8wMR"
      }
    },
    "state": "created",
    "errMsg": "",
    "bacalhauJobId": ""
  }
]
```
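
Since `io.json` is plain JSON, an initialized file can be inspected with the standard library alone. The following is a minimal sketch that parses a compact copy of the example above and reports which outputs are still placeholders. The helper `summarize` is hypothetical and not part of the plex API.

```python
import json

# Compact copy of the initialized io.json shown above.
IO_JSON = """
[
  {
    "outputs": {
      "best_docked_small_molecule": {"class": "File", "filepath": "", "ipfs": ""},
      "protein": {"class": "File", "filepath": "", "ipfs": ""}
    },
    "tool": {"name": "equibind", "ipfs": "QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB"},
    "inputs": {
      "protein": {"class": "File", "filepath": "6d08_protein_processed.pdb",
                  "ipfs": "QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk"},
      "small_molecule": {"class": "File", "filepath": "6d08_ligand.sdf",
                         "ipfs": "QmPErdymxLwpXcEHnWXYqEVHvRBVnh7kr3Uu5DNt2Y8wMR"}
    },
    "state": "created",
    "errMsg": "",
    "bacalhauJobId": ""
  }
]
"""


def summarize(job: dict) -> dict:
    """Summarize one io.json entry: tool, state, input files, unfilled outputs."""
    return {
        "tool": job["tool"]["name"],
        "state": job["state"],
        "inputs": {name: f["filepath"] for name, f in job["inputs"].items()},
        # An output with an empty CID has not been produced yet.
        "pending_outputs": sorted(n for n, f in job["outputs"].items() if not f["ipfs"]),
    }


jobs = json.loads(IO_JSON)
summaries = [summarize(job) for job in jobs]
for summary in summaries:
    print(summary)
```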

## Execution with `plex_run`

Execution starts with [`plex_run`](../reference/python.md). When it is called, the computational job(s) outlined in the `io.json` are dispatched to the Bacalhau cluster for processing.

As the computations run and finish, the `io.json` is updated in real time:

* **Output Data:** once a job completes, the `io.json` populates with the resultant data and its CID
* **Bacalhau Job ID:** the unique identifier for the job is added, facilitating traceability; useful in cases when a job fails to run
* **Updated Job State:** reflects the final status of the job, transitioning to `completed` if successful

## Completed `io.json`

```json
[
  {
    "outputs": {
      "best_docked_small_molecule": {
        "class": "File",
        "filepath": "6d08_protein_processed_6d08_ligand_docked.sdf",
        "ipfs": "QmWdzgrt5wtUJPyCrcKycU3voKGmT59FZXMasuaa1XCbkk"
      },
      "protein": {
        "class": "File",
        "filepath": "6d08_protein_processed.pdb",
        "ipfs": "QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk"
      }
    },
    "tool": {
      "name": "equibind",
      "ipfs": "QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB"
    },
    "inputs": {
      "protein": {
        "class": "File",
        "filepath": "6d08_protein_processed.pdb",
        "ipfs": "QmeTreLhxMmBaRqHemJcStvdyHZThdzi4gTmvTyY1igeCk"
      },
      "small_molecule": {
        "class": "File",
        "filepath": "6d08_ligand.sdf",
        "ipfs": "QmPErdymxLwpXcEHnWXYqEVHvRBVnh7kr3Uu5DNt2Y8wMR"
      }
    },
    "state": "completed",
    "errMsg": "",
    "bacalhauJobId": "7a01e92a-877e-4d1b-ba91-9effec6f170e"
  }
]
```
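
After a run, the completed entries can be separated from the rest by checking `state`. The sketch below is not plex code: the helper names are hypothetical, and because only the `created` and `completed` states are documented here, any other state is simply treated as unfinished. The second sample entry is invented to show that case.

```python
def completed_outputs(jobs: list) -> list:
    """Return (bacalhau_job_id, output_name, filepath, cid) for every completed job."""
    rows = []
    for job in jobs:
        if job.get("state") != "completed":
            continue  # not finished (or failed); nothing to collect
        for name, out in job["outputs"].items():
            rows.append((job["bacalhauJobId"], name, out["filepath"], out["ipfs"]))
    return rows


def unfinished_jobs(jobs: list) -> list:
    """Return (bacalhau_job_id, state, errMsg) for jobs that are not completed."""
    return [
        (job.get("bacalhauJobId", ""), job.get("state"), job.get("errMsg", ""))
        for job in jobs
        if job.get("state") != "completed"
    ]


# First entry: abbreviated from the completed io.json above.
# Second entry: hypothetical job that has not run yet.
jobs = [
    {
        "outputs": {
            "best_docked_small_molecule": {
                "class": "File",
                "filepath": "6d08_protein_processed_6d08_ligand_docked.sdf",
                "ipfs": "QmWdzgrt5wtUJPyCrcKycU3voKGmT59FZXMasuaa1XCbkk",
            }
        },
        "state": "completed",
        "errMsg": "",
        "bacalhauJobId": "7a01e92a-877e-4d1b-ba91-9effec6f170e",
    },
    {
        "outputs": {"best_docked_small_molecule": {"class": "File", "filepath": "", "ipfs": ""}},
        "state": "created",
        "errMsg": "",
        "bacalhauJobId": "",
    },
]

done = completed_outputs(jobs)
todo = unfinished_jobs(jobs)
print(done)
print(todo)
```

For unfinished entries, the `bacalhauJobId` (when present) is the value to use when investigating the job on the Bacalhau side.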
12 changes: 12 additions & 0 deletions docs/docs/concepts/resources.md
@@ -0,0 +1,12 @@
---
title: Resources
sidebar_position: 6
sidebar_label: Additional resources
---

We believe in the power of open source. [Plex](https://github.com/labdao/plex) is built on the foundations of other great open-source projects:

* [**IPFS**](https://docs.ipfs.tech/)
* [**Bacalhau**](https://docs.bacalhau.org/)
* [**Ethereum**](https://ethereum.org/en/developers/docs/intro-to-ethereum/)
* [**Optimism**](https://community.optimism.io/)
* [**OpenZeppelin**](https://docs.openzeppelin.com/)
75 changes: 75 additions & 0 deletions docs/docs/concepts/tokens.md
@@ -0,0 +1,75 @@
---
title: Tokens
sidebar_position: 5
sidebar_label: Tokens
---

**ProofOfScience** represents plex's unique approach to preserving, acknowledging, and ensuring the reproducibility of scientific computations. By leveraging the power of blockchain, each computation in plex can be minted into an [ERC-1155](https://ethereum.org/en/developers/docs/standards/tokens/erc-1155/) token called a ProofOfScience Non-Fungible Token (NFT).

## Minting with `plex_mint`

Once a computation concludes and its results are recorded in a completed `io.json`, the `plex_mint` command can be invoked. This process transforms the results into a tangible, traceable, and verifiable ProofOfScience NFT.

## Metadata Preservation

Within the NFT's metadata, the `graph` key contains the `io.json` content. All completed job runs are visible. All input and output data are accessible.

By providing this level of transparency and detail, others can validate, reproduce, or build upon the work.

```json
{
  "description": "Research, Reimagined. All Scientists Welcome.",
  "graph": [
    {
      "errMsg": "",
      "inputs": {
        "protein": {
          "class": "File",
          "filepath": "7n9g.pdb",
          "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
        },
        "small_molecule": {
          "class": "File",
          "filepath": "ZINC000003986735.sdf",
          "ipfs": "QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz"
        }
      },
      "outputs": {
        "best_docked_small_molecule": {
          "class": "File",
          "filepath": "7n9g_ZINC000003986735_docked.sdf",
          "ipfs": "QmZdoaKEGtESnLoHFMb9bvqdwXjyUuRK6DbEoYz8PYpZ8W"
        },
        "protein": {
          "class": "File",
          "filepath": "7n9g.pdb",
          "ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"
        }
      },
      "state": "completed",
      "tool": {
        "ipfs": "QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB",
        "name": "equibind"
      }
    }
  ],
  "image": "ipfs://bafybeiba666bzbff5vu6rayvp5st2tk7tdltqnwjppzyvpljcycfhshdhq",
  "name": "yielding hubble proteins"
}
```
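
Because the `graph` key carries the full `io.json` content, anyone holding the metadata can list every CID needed to re-run or audit a computation. The sketch below assumes the metadata has already been fetched and parsed into a dict; the helper `graph_cids` is hypothetical, and the sample is abbreviated from the metadata above.

```python
def graph_cids(metadata: dict) -> list:
    """List (step, role, name, cid) for the tool, inputs, and outputs of each job."""
    rows = []
    for step, job in enumerate(metadata.get("graph", [])):
        rows.append((step, "tool", job["tool"]["name"], job["tool"]["ipfs"]))
        for role in ("inputs", "outputs"):
            for name, entry in job[role].items():
                rows.append((step, role, name, entry["ipfs"]))
    return rows


# Abbreviated from the NFT metadata shown above.
metadata = {
    "name": "yielding hubble proteins",
    "graph": [
        {
            "inputs": {
                "protein": {"ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"},
                "small_molecule": {"ipfs": "QmV6qVzdQLNM6SyEDB3rJ5R5BYJsQwQTn1fjmPzvCCkCYz"},
            },
            "outputs": {
                "best_docked_small_molecule": {"ipfs": "QmZdoaKEGtESnLoHFMb9bvqdwXjyUuRK6DbEoYz8PYpZ8W"},
                "protein": {"ipfs": "QmUWCBTqbRaKkPXQ3M14NkUuM4TEwfhVfrqLNoBB7syyyd"},
            },
            "state": "completed",
            "tool": {"ipfs": "QmZ2HarAgwZGjc3LBx9mWNwAQkPWiHMignqKup1ckp8NhB", "name": "equibind"},
        }
    ],
}

rows = graph_cids(metadata)
for row in rows:
    print(row)

# The protein passes through the tool unchanged, so its CID is identical on both sides.
job = metadata["graph"][0]
protein_unchanged = job["inputs"]["protein"]["ipfs"] == job["outputs"]["protein"]["ipfs"]
```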

## Reproducibility and Acknowledgement

Storing computations as ProofOfScience tokens on-chain sets a high standard for scientific reproducibility. Each token becomes an immutable record of achievement, open to scrutiny and validation by peers.

## Gasless Transactions

Plex employs an [OpenZeppelin Defender Relayer](https://docs.openzeppelin.com/defender/relay) so users don't have to pay [gas fees](https://ethereum.org/en/developers/docs/gas/) to mint ProofOfScience tokens.

:::warning

Please only interact with the official [**smart contract**](https://goerli-optimism.etherscan.io/address/0xda70C0709d4213eE8441E4731A5F662C0406ed7e#code). The only blockchain we are on is the Optimism Goerli testnet. We are **NOT** on mainnet.

**Official address:** 0xda70C0709d4213eE8441E4731A5F662C0406ed7e

:::
60 changes: 60 additions & 0 deletions docs/docs/concepts/tools.md
@@ -0,0 +1,60 @@
---
title: Tools
sidebar_position: 3
sidebar_label: Tools
---

Plex is paving the way for permissionless science by ensuring that computational biology tools are not just available, but also easily accessible for open-source, early-stage drug discovery.

To facilitate ease of access and the spirit of open science, plex uses Docker containers of computational biology tools and makes them publicly available. These tools enable transparency and easy replication, ensuring that researchers can validate, reproduce, and build upon existing work with confidence.

## Tool Configs

Plex employs **tool configs** as computation templates that dictate how computations should be carried out. As demonstrated in the JSON below, these configs:

* Specify the Docker container used
* Detail the input data format, ensuring that the data fed into the tool aligns with its expectations
* Define the output data format, allowing for standardized retrieval and further processing

This approach, reminiscent of the Common Workflow Language ([**CWL**](https://www.commonwl.org/)), ensures consistency, interoperability, and reproducibility across different tools and workflows.

### Colabfold Tool Config

```json
{
  "class": "CommandLineTool",
  "name": "colabfold-mini",
  "description": "Protein folding prediction using Colabfold (mini settings)",
  "baseCommand": ["/bin/bash", "-c"],
  "arguments": [
    "colabfold_batch --templates --max-msa 32:64 --num-recycle $(inputs.recycle.default) /inputs /outputs;"
  ],
  "dockerPull": "public.ecr.aws/p7l9w5o7/colabfold:latest",
  "gpuBool": true,
  "networkBool": true,
  "inputs": {
    "sequence": {
      "type": "File",
      "item": "",
      "glob": ["*.fasta"]
    },
    "recycle": {
      "type": "int",
      "item": "",
      "default": "1"
    }
  },
  "outputs": {
    "best_folded_protein": {
      "type": "File",
      "item": "",
      "glob": ["*rank_1*.pdb"]
    },
    "all_folded_proteins": {
      "type": "Array",
      "item": "File",
      "glob": ["*rank*.pdb"]
    }
  }
}
```
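
The `glob` patterns in the `outputs` section describe which files in the output directory belong to each named output. How plex resolves them internally is not shown here; the sketch below only illustrates the pattern semantics using Python's `fnmatch` module. The filenames are hypothetical, chosen to resemble folding results.

```python
from fnmatch import fnmatchcase

# Output patterns copied from the tool config above.
OUTPUT_GLOBS = {
    "best_folded_protein": ["*rank_1*.pdb"],
    "all_folded_proteins": ["*rank*.pdb"],
}


def match_outputs(filenames: list, output_globs: dict) -> dict:
    """Map each output name to the filenames matching any of its glob patterns."""
    return {
        name: sorted(
            f for f in filenames if any(fnmatchcase(f, pattern) for pattern in patterns)
        )
        for name, patterns in output_globs.items()
    }


# Hypothetical contents of an output directory.
files = [
    "seq_unrelaxed_rank_1_model_3.pdb",
    "seq_unrelaxed_rank_2_model_1.pdb",
    "seq_coverage.png",
]

matched = match_outputs(files, OUTPUT_GLOBS)
print(matched)
```

Note that a pattern such as `*rank_1*.pdb` would also match a file containing `rank_10`, which is worth keeping in mind when writing globs for a new tool config.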
12 changes: 6 additions & 6 deletions docs/docs/quickstart/installation.md
@@ -1,11 +1,11 @@
---
-title: Install PLEX
-description: How to install PLEX
-sidebar_label: Install PLEX
+title: Install plex
+description: How to install plex
+sidebar_label: Install Plex
sidebar_position: 1
---

-PLEX is a Python package developed by LabDAO that enables you to seamlessly run computational biology tools. PLEX manages all dependencies and installations and requests compute-time from the LabDAO network, ensuring an effortless experience.
+Plex is a [Python package](https://pypi.org/project/PlexLabExchange/) developed by LabDAO that enables you to seamlessly run computational biology tools. PLEX manages all dependencies and installations and requests compute-time from the LabDAO network, ensuring an effortless experience.

:::note

@@ -20,7 +20,7 @@ PLEX is a Python package developed by LabDAO that enables you to seamlessly run

## Installation

-To install [PLEX](https://pypi.org/project/PlexLabExchange/), run the following command:
+To install [plex](https://pypi.org/project/PlexLabExchange/), run the following command:

```
pip install PlexLabExchange
@@ -36,7 +36,7 @@ If using a Jupyter notebook or Google Colab, you should prefix the command with

## Verification

-After installation, ensure PLEX is working as expected by running one of the following tools:
+After installation, ensure plex is working as expected by running one of the following tools:

- [Small Molecule Binding Tool](../tutorials/small-molecule-binding): A quick-run algorithm; complete a job and visualize results within 5 minutes.
- [Protein Folding Tool](../tutorials/protein-folding): Comprehensive guide provided for a step-by-step walkthrough.
11 changes: 0 additions & 11 deletions docs/docs/tutorials/protein-folding-nft-minting.md

This file was deleted.
