
NoLabs


Open source biolab


About

NoLabs is an open source biolab that lets you run experiments with the latest state-of-the-art models for bio research.

The goal of the project is to accelerate bio research by making inference models easy to use for everyone. We currently support a protein biolab (predicting useful protein properties such as solubility, localisation, gene ontology, and folding), a drug discovery biolab (constructing ligands and testing their binding to target proteins), and a small molecules design biolab (designing small molecules for a given protein target and checking drug-likeness and binding affinity).

We are working on expanding these labs and on adding a cell biolab and a genetic biolab, and we would appreciate your support and contributions.

Let's accelerate bio research!

Features

BioBuddy - drug discovery copilot:

BioBuddy is a drug discovery copilot that supports:

  • Downloading data from ChEMBL
  • Downloading data from RCSB PDB
  • Answering questions about the drug discovery process, targets, chemical components, etc.
  • Writing review reports based on published papers

For example, you can ask:

  • "Can you pull me some latest approved drugs?"
  • "Can you download me 1000 rhodopsins?"
  • "What does an aspirin molecule look like?"

BioBuddy will carry out these requests and answer other questions.

To enable BioBuddy, run this command when starting NoLabs:

$ ENABLE_BIOBUDDY=true docker compose up nolabs

And also start the biobuddy microservice:

$ OPENAI_API_KEY=your_openai_api_key TAVILY_API_KEY=your_tavily_api_key docker compose up biobuddy

NoLabs runs on GPT-4 for the best performance. You can change the model in microservices/biobuddy/biobuddy/services.py

You can ignore OPENAI_API_KEY warnings when running other services with docker compose.

Drug discovery lab:

  • Drug-target interaction prediction, high throughput virtual screening (HTVS) based on:
  • Automatic pocket prediction via P2Rank
  • Automatic MSA generation via HH-suite3

Protein lab:

  • Prediction of subcellular localisation via fine-tuned ritakurban/ESM_protein_localization model (to be updated with a better model)
  • Prediction of folded structure via facebook/esmfold_v1
  • Gene ontology prediction for 200 most popular gene ontologies
  • Protein solubility prediction

Protein design Lab:


Conformations Lab:

Small molecules design lab:

  • Small molecules design using a protein target with drug-likeness scoring component REINVENT4

Specify the search space (location) where the designed molecule should bind relative to the protein target. Then run reinforcement learning to generate new molecules in the specified binding region.

WARNING: The reinforcement learning process can take a long time (with 128 molecules per epoch and 50 epochs, it could take a day).
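To put the warning in perspective, the figures it quotes can be turned into a rough per-molecule time budget. This is only back-of-the-envelope arithmetic; it assumes "a day" means 24 hours of wall-clock time and that time is spread evenly across molecules, neither of which the project states.

```python
# Rough per-molecule time budget implied by the warning above.
molecules_per_epoch = 128
epochs = 50
total_seconds = 24 * 60 * 60  # assumption: "a day" is 24 hours of wall-clock time

total_molecules = molecules_per_epoch * epochs
seconds_per_molecule = total_seconds / total_molecules

print(f"{total_molecules} molecules, about {seconds_per_molecule:.1f} s each")
# prints: 6400 molecules, about 13.5 s each
```

Halving either the number of epochs or the molecules per epoch would halve the total under the same assumptions.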


Starting

# Clone this project
$ git clone https://github.com/BasedLabs/nolabs
$ cd nolabs

Generate a new token for the Docker registry at https://github.com/settings/tokens/new and select the 'read:packages' scope.

$ docker login ghcr.io -u username -p ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

If you want to run a single feature (recommended):

$ docker compose up nolabs
$ docker compose up diffdock
$ docker compose up p2rank
...

OR if you want to run everything on one machine:

$ docker compose up

The server will be available at http://localhost:9000
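Containers can take a while to start, so it may help to poll the address until it responds. The following is a minimal sketch using only the Python standard library; the function name `wait_for_server` and its default timings are illustrative and are not part of NoLabs.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout_s=60.0, interval_s=2.0):
    """Poll `url` until it answers with any HTTP response or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timed out, or reset: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


# Example call (not executed here): wait_for_server("http://localhost:9000")
```

The function returns True as soon as the server answers and False if nothing responds before the timeout.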

⚠️ Warning: For macOS users, there are known issues with running Docker Compose properly for certain setups. As an alternative, please follow these recommended steps to run nolabs:

  1. Create a Python Environment with Python 3.11

    • First, ensure you have Python 3.11 installed. If not, you can download it from python.org or use a version manager like pyenv.
    • Create a new virtual environment:
      python3.11 -m venv nolabs-env
  2. Activate the Virtual Environment and Install Poetry

    • Activate the virtual environment:
      source nolabs-env/bin/activate
    • Install Poetry, a tool for dependency management and packaging in Python. You can install it with pip:
      pip install poetry uvicorn
  3. Install Dependencies Using Poetry

    poetry install
  4. Start a Uvicorn Server

    • Set your environment variable and start the Uvicorn server with the following command:
      NOLABS_ENVIRONMENT=dev poetry run uvicorn nolabs.api:app --host=127.0.0.1 --port=8000
    • This command runs the nolabs API server on localhost at port 8000.
  5. Set Up the Frontend

    • In a separate terminal, ensure you have npm installed. If not, you can install Node.js and npm from nodejs.org.
    • Run npm install to install the necessary Node.js packages:
      npm install
    • After installing the packages, start the frontend development server:
      npm run dev

The server will be available at http://localhost:9000

APIs

We provide individual Docker containers backed by FastAPI for each feature, which are available in the /microservices folder. You can use them individually as APIs.

For example, to run the esmfold service, you can use Docker Compose:

$ docker compose up esmfold

Once the service is up, you can make a POST request to perform a task, such as predicting a protein's folded structure. Here's a simple Python example:

import requests

# Define the API endpoint
url = 'http://127.0.0.1:5736/run-folding'

# Specify the protein sequence in the request body
data = {
    'protein_sequence': 'YOUR_PROTEIN_SEQUENCE_HERE'
}

# Make the POST request and fail early on an HTTP error status
response = requests.post(url, json=data)
response.raise_for_status()

# Extract the PDB content from the response
pdb_content = response.json().get('pdb_content', '')

print(pdb_content)

This Python script makes a POST request to the esmfold microservice with a protein sequence and prints the predicted PDB content.
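If you would rather not depend on `requests`, the same call can be sketched with the standard library and the result written to a file. The endpoint, port, and the `protein_sequence` and `pdb_content` field names are taken from the example above; the helper names (`build_folding_request`, `extract_pdb`, `fold_to_file`) are hypothetical and exist only in this sketch.

```python
import json
import urllib.request


def build_folding_request(sequence, base_url="http://127.0.0.1:5736"):
    """Build the POST request for the /run-folding endpoint shown above."""
    body = json.dumps({"protein_sequence": sequence}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/run-folding",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_pdb(response_body):
    """Return the 'pdb_content' field of a JSON response body, or '' if absent."""
    return json.loads(response_body).get("pdb_content", "")


def fold_to_file(sequence, out_path, base_url="http://127.0.0.1:5736"):
    """Send the sequence to the service and write the predicted structure to out_path."""
    request = build_folding_request(sequence, base_url)
    with urllib.request.urlopen(request) as response:  # raises on HTTP error statuses
        pdb_content = extract_pdb(response.read())
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(pdb_content)
    return out_path
```

With the esmfold service running, `fold_to_file("YOUR_PROTEIN_SEQUENCE_HERE", "prediction.pdb")` would save the returned structure next to your script.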

Running services on a separate machine

Because each feature ships as its own FastAPI-backed Docker container in the /microservices folder, you can run the services on separate machines. This setup is particularly useful if you're developing on a computer without a GPU but have access to a VM with a GPU for tasks like folding and docking.

For instance, to run the diffdock service on your server, VM, or computer with a GPU, run:

$ docker compose up diffdock

Once the service is up, check that you can reach it from your computer by navigating to http://<gpu_machine_ip>:5737/docs

If everything is correct, you should see the FastAPI page with diffdock's API surface.

Next, update the nolabs/infrastructure/settings.ini file on your primary machine to include the IP address of the service (replace 127.0.0.1 with your GPU machine's IP):

...
p2rank = http://127.0.0.1:5731
esmfold = http://127.0.0.1:5736
esmfold_light = http://127.0.0.1:5733
msa_light = http://127.0.0.1:5734
umol = http://127.0.0.1:5735
diffdock = http://127.0.0.1:5737 -> http://74.82.28.227:5737
...

And now you are ready to use this service hosted on a separate machine!
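The same edit can be scripted. The sketch below swaps the host of one service URL while keeping its port, using `configparser` on an in-memory copy of the entries shown above. The section name `microservices` is a placeholder, since the excerpt does not show the real section header, and the helpers `retarget` and `point_service_at` are hypothetical.

```python
import configparser
from urllib.parse import urlsplit, urlunsplit


def retarget(url, new_host):
    """Replace the host in `url` with `new_host`, keeping scheme, port and path."""
    parts = urlsplit(url)
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def point_service_at(config, section, service, new_host):
    """Rewrite one service URL inside an already-loaded ConfigParser."""
    config[section][service] = retarget(config[section][service], new_host)


# Assumption: "microservices" stands in for the real section name in settings.ini.
config = configparser.ConfigParser()
config.read_string(
    "[microservices]\n"
    "esmfold = http://127.0.0.1:5736\n"
    "diffdock = http://127.0.0.1:5737\n"
)
point_service_at(config, "microservices", "diffdock", "74.82.28.227")
print(config["microservices"]["diffdock"])  # http://74.82.28.227:5737
```

To apply this to the real file you would load it with `config.read(...)` and save it with `config.write(...)`. Note that `configparser` does not preserve comments when writing, so editing by hand may be safer if the file contains any.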

Supported microservices list

1) Protein design docker API

Model: RFdiffusion

RFdiffusion is an open source method for structure generation, with or without conditional information (a motif, target etc).

docker compose up protein_design

Swagger UI will be available on http://localhost:5789/docs

or install as a python package

2) ESMFold docker API

Model: ESMFold - Evolutionary Scale Modeling

docker compose up esmfold

Swagger UI will be available on http://localhost:5736/docs

or install as a python package

3) ESMAtlas docker API

Model: ESMAtlas

docker compose up esmfold_light

Swagger UI will be available on http://localhost:5733/docs

or install as a python package

4) Protein function prediction docker API

Model: Hugging Face

docker compose up gene_ontology

Swagger UI will be available on http://localhost:5788/docs

or install as a python package

5) Protein localisation prediction docker API

Model: Hugging Face

docker compose up localisation

Swagger UI will be available on http://localhost:5787/docs

or install as a python package

6) Protein binding site prediction docker API

Model: p2rank

docker compose up p2rank

Swagger UI will be available on http://localhost:5731/docs

or install as a python package

7) Protein solubility prediction docker API

Model: Hugging Face

docker compose up solubility

Swagger UI will be available on http://localhost:5786/docs

or install as a python package

8) Protein-ligand structure prediction docker API

Model: UMol

docker compose up umol

Swagger UI will be available on http://localhost:5735/docs

or install as a python package

9) RoseTTAFold docker API

Model: RoseTTAFold

docker compose up rosettafold

Swagger UI will be available on http://localhost:5738/docs

or install as a python package

WARNING: To use RoseTTAFold, you must change the '.' volume entries to point to the specified folders.

10) REINVENT4 Reinforcement Learning on a Protein receptor API

Model: REINVENT4

Misc: DockStream, QED, AutoDock Vina

docker compose up reinvent

Swagger UI will be available on http://localhost:5790/docs

or install as a python package

WARNING: Do not change the number of gunicorn workers (1); changing it will cause issues in the microservice.
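The services and ports listed in this section (plus diffdock on port 5737, mentioned earlier) can be collected into a small lookup to see which containers are currently reachable. This is a sketch only: the port numbers are copied from this README, while the helper names `docs_url`, `is_listening` and `report` are made up for illustration.

```python
import socket

# Ports as listed in this README.
SERVICE_PORTS = {
    "p2rank": 5731,
    "esmfold_light": 5733,
    "umol": 5735,
    "esmfold": 5736,
    "diffdock": 5737,
    "rosettafold": 5738,
    "solubility": 5786,
    "localisation": 5787,
    "gene_ontology": 5788,
    "protein_design": 5789,
    "reinvent": 5790,
}


def docs_url(service, host="localhost"):
    """Return the Swagger UI address for a service."""
    return f"http://{host}:{SERVICE_PORTS[service]}/docs"


def is_listening(service, host="localhost", timeout_s=1.0):
    """Return True if a TCP connection to the service's port succeeds."""
    try:
        with socket.create_connection((host, SERVICE_PORTS[service]), timeout=timeout_s):
            return True
    except OSError:
        return False


def report(host="localhost"):
    """Print one line per service with its state and docs URL."""
    for name in SERVICE_PORTS:
        state = "up" if is_listening(name, host) else "down"
        print(f"{name:15s} {state:5s} {docs_url(name, host)}")
```

Calling `report()` checks the local machine; passing the IP of a GPU machine checks services hosted there instead.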

Technologies

The following tools were used in this project:

Requirements

[Recommended for laptops] If you are using a laptop, use the --test argument (it does not need much compute):

  • RAM > 16GB
  • [Optional] GPU memory >= 16GB (greatly speeds up inference)

[Recommended for powerful workstations] Otherwise, if you want to host everything on your machine and have faster inference (this is also required for folding sequences longer than 400 amino acids):

  • RAM > 30GB
  • [Optional] GPU memory >= 40GB (greatly speeds up inference)

Made by Igor and Tim

 
