# OpenFold Local Notebook

This notebook lets you run inference with a local, Docker-based installation of OpenFold while keeping the convenience of visualizing results with the same plots as the OpenFold Colab notebook.

The notebook uses the provided utility functions to execute OpenFold via Docker and adds logic to handle the results, so you can experiment with different parameters, reuse computed MSAs, select the best model, plot metrics, and run long jobs asynchronously.

If you have access to a machine and want to do quick inference and visualize results, there are some useful things you can do with this notebook:

- Use precomputed alignments, so you can run inference with different model parameters and compare the results.
- Get the best model and metric plots.
- Handle long-running executions.
- Work with big datasets, i.e., split your input and perform asynchronous runs on multiple GPUs using asyncio.
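The last point can be illustrated with a minimal sketch of splitting an input and dispatching the parts across GPUs with asyncio. It is not the notebook's implementation: `chunk`, `run_on_gpus`, and `fake_run_batch` are hypothetical names, and `fake_run_batch` only stands in for whatever blocking call starts inference on one GPU.

```python
import asyncio


def chunk(items, n_chunks):
    """Split items into at most n_chunks roughly equal, contiguous parts."""
    size, extra = divmod(len(items), n_chunks)
    parts, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            parts.append(items[start:end])
        start = end
    return parts


async def run_on_gpus(batches, gpu_ids, run_batch):
    """Run one blocking job per batch, with at most one job per GPU at a time."""
    free_gpus = asyncio.Queue()
    for gpu_id in gpu_ids:
        free_gpus.put_nowait(gpu_id)

    async def worker(batch):
        gpu_id = await free_gpus.get()  # wait until a GPU is free
        try:
            # run_batch blocks, so run it in a thread to keep the event loop responsive
            return await asyncio.to_thread(run_batch, batch, gpu_id)
        finally:
            free_gpus.put_nowait(gpu_id)  # hand the GPU back

    # gather returns results in the same order as the batches
    return await asyncio.gather(*(worker(b) for b in batches))


def fake_run_batch(batch, gpu_id):
    """Hypothetical stand-in for a blocking inference call on one GPU."""
    return {"gpu": gpu_id, "n_sequences": len(batch)}


sequences = ["SEQ1", "SEQ2", "SEQ3", "SEQ4", "SEQ5"]  # placeholder inputs
batches = chunk(sequences, 2)
results = asyncio.run(run_on_gpus(batches, gpu_ids=[0, 1], run_batch=fake_run_batch))
```

Inside Jupyter an event loop is already running, so in a notebook cell you would write `results = await run_on_gpus(...)` rather than calling `asyncio.run`.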

Of course, you can do all of this using only Docker commands in the terminal, but you would need to write or adapt the Colab functions to work with local data. This notebook gives you a head start.

## Set up the notebook

First, build OpenFold using Docker by following this [guide](https://openfold.readthedocs.io/en/latest/original_readme.html#building-and-using-the-docker-container).

Then start the notebook:

Go to the notebook folder:

`cd notebooks`

Install the requirements in your environment:

`pip install -r utils/requirements.txt`

Start your Jupyter server:

`jupyter lab . --ip="0.0.0.0"`


## Running Inference 

**Inputs:** FASTA files or sequence strings

**Output:** 

```bash
data/ 
├── run_<run_id>/ # each run is stored under a random ID, which can be used to re-run inference
│   ├── fasta_dir/ 
│   │   ├── tmp/ # generated .fasta file per sequence
│   │   └── sequences.fasta # validated input sequences are merged into a .fasta file
│   └── output/
│       ├── alignments/ # one folder per sequence with the resulting MSA
│       ├── predictions/ # inference results .pkl and .pdb files
│       └── timings.json # inference time
```
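Given this layout, the prediction files of a run can be located with a few lines of `pathlib`. The helper below is a sketch, not part of the notebook's utilities: `list_predictions` is a hypothetical name, and it assumes the run folder is literally named `run_<run_id>` as in the tree above. The demo builds a throwaway directory so the example runs without any real results.

```python
import tempfile
from pathlib import Path


def list_predictions(data_dir, run_id):
    """Return the .pdb and .pkl files of a run, sorted by path."""
    predictions_dir = Path(data_dir) / f"run_{run_id}" / "output" / "predictions"
    if not predictions_dir.is_dir():
        return []  # unknown run, or inference has not produced predictions yet
    return sorted(p for p in predictions_dir.iterdir() if p.suffix in {".pdb", ".pkl"})


# Demo on a temporary directory that mimics the layout (file names are made up)
with tempfile.TemporaryDirectory() as tmp:
    pred = Path(tmp) / "run_DEMO01" / "output" / "predictions"
    pred.mkdir(parents=True)
    for name in ("example.pdb", "example.pkl", "notes.txt"):
        (pred / name).touch()
    found = [p.name for p in list_predictions(tmp, "DEMO01")]
    missing = list_predictions(tmp, "NOSUCH")
```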

#### Example usage 1: Using a sequence string

```python
from src.model import InferenceClientOpenFold

# Initialize the OpenFold Docker client with the path to the databases
# (adjust this path to your own machine)
databases_dir = "/home/juliocesar/DataZ/Models/alphafold_all_data"
openfold_client = InferenceClientOpenFold(databases_dir)

# Set the input; for multiple sequences, separate them with a colon `:`
input_string = "DAGAQGAAIGSPGVLSGNVVQVPVHVPVNVCGNTVSVIGLLNPAFGNTCVNA:AGETGRTGVLVTSSATNDGDSGWGRFAG"

model_name = "multimer"
weight_set = "AlphaFold"

# Run inference; the returned run_id can be used to re-run inference later
run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_string)
```
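To make the colon-separated input format concrete, here is a rough sketch of how such a string could be split, checked, and written out as FASTA records. This is an illustration only: `sequences_to_fasta` is a hypothetical helper, the check against the 20 standard amino-acid letters is an assumption, and the notebook's own validation and record naming may differ.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumed alphabet


def sequences_to_fasta(input_string, prefix="sequence"):
    """Turn 'SEQ1:SEQ2' into FASTA text with one record per sequence."""
    records = []
    for index, sequence in enumerate(input_string.split(":"), start=1):
        sequence = sequence.strip().upper()
        if not sequence:
            raise ValueError(f"Sequence {index} is empty")
        invalid = set(sequence) - STANDARD_AMINO_ACIDS
        if invalid:
            raise ValueError(f"Sequence {index} has invalid residues: {sorted(invalid)}")
        records.append(f">{prefix}_{index}\n{sequence}")
    return "\n".join(records) + "\n"


fasta_text = sequences_to_fasta("ACD:EFG")  # toy input, not a real protein
```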

#### Example usage 2: Using a fasta file

```python
input_file = "/home/juliocesar/Models/openfold/notebooks/data/test.fasta"

run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_file)
```

#### Example usage 3: Using precomputed alignments for a run_id

```python
openfold_client.run_inference(weight_set, model_name, use_precomputed_alignments=True, run_id="8CJUIY")
```
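Because the alignments are reused, the same call can be repeated with different parameters to compare results. The loop below is a sketch of that pattern under stated assumptions: `rerun_with_precomputed_alignments` and `RecordingClient` are hypothetical, the second configuration is a placeholder rather than a known valid option, and what `run_inference` returns in this mode is not documented here, so the values are stored as-is.

```python
def rerun_with_precomputed_alignments(client, run_id, configs):
    """Call run_inference once per (weight_set, model_name), reusing the alignments of run_id."""
    outcomes = {}
    for weight_set, model_name in configs:
        outcomes[(weight_set, model_name)] = client.run_inference(
            weight_set, model_name, use_precomputed_alignments=True, run_id=run_id
        )
    return outcomes


class RecordingClient:
    """Stand-in that only records calls; it runs no inference."""

    def __init__(self):
        self.calls = []

    def run_inference(self, weight_set, model_name, **kwargs):
        self.calls.append((weight_set, model_name, kwargs))
        return kwargs.get("run_id")  # assumed return value, for illustration only


configs = [
    ("AlphaFold", "multimer"),                    # values from the example above
    ("<other_weight_set>", "<other_model_name>"),  # placeholders, replace with real options
]
client = RecordingClient()
outcomes = rerun_with_precomputed_alignments(client, "8CJUIY", configs)
```

With the real client you would pass `openfold_client` in place of `RecordingClient()`.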