# Chroma Tutorial

First, run the [setup cell](#setup) below. Then, run [this cell](#unconditional-chain) to get a Chroma sample. Further examples are below.

In [None]:
# @title Setup

# @markdown [Get your API key here](https://chroma-weights.generatebiomedicines.com) and enter it below before running.

from google.colab import output

output.enable_custom_widget_manager()

import os

os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
import contextlib

api_key = ""  # @param {type:"string"}

!pip install git+https://github.com/generatebio/chroma.git > /dev/null 2>&1

import torch

torch.use_deterministic_algorithms(True, warn_only=True)

import warnings
from tqdm import tqdm, TqdmExperimentalWarning

warnings.filterwarnings("ignore", category=TqdmExperimentalWarning)
from functools import partialmethod

tqdm.__init__ = partialmethod(tqdm.__init__, leave=False)

from google.colab import files
from IPython.display import display
import ipywidgets as widgets


def create_button(filename, description=""):
    """Show a button that downloads `filename` from the Colab runtime when clicked."""
    button = widgets.Button(description=description)
    display(button)

    def on_button_click(b):
        files.download(filename)

    button.on_click(on_button_click)


def render(protein, trajectories=None, output="protein.cif"):
    """Display a protein, print it, and offer CIF downloads of it and its trajectory."""
    display(protein)
    print(protein)
    protein.to_CIF(output)
    create_button(output, description="Download sample")
    if trajectories is not None:
        # Name the trajectory file after the sample file, e.g. protein_trajectory.cif.
        traj_output = output.replace(".cif", "_trajectory.cif")
        trajectories["trajectory"].to_CIF(traj_output)
        create_button(traj_output, description="Download trajectory")


import locale

locale.getpreferredencoding = lambda: "UTF-8"

from chroma import Chroma, Protein, conditioners
from chroma.models import graph_classifier, procap
from chroma.utility.api import register_key
from chroma.utility.chroma import letter_to_point_cloud, plane_split_protein

register_key(api_key)

device = "cuda"

## Sampling basics

Use `Chroma.sample` to get a protein from Chroma. By default, a backbone is generated through reverse diffusion from random noise, and then the sequence and associated side chain atoms are designed on this backbone.

In [None]:
chroma = Chroma()

chain_lengths = [160]  # can have multiple chains in a single complex

protein = chroma.sample(chain_lengths=chain_lengths, steps=200)

Printing a protein shows its sequence, and `display` shows the full structure. The `render` function defined in the setup cell does both, writes the structure to a CIF file with `Protein.to_CIF`, and adds a download button.

In [None]:
print(protein)
display(protein)

In [None]:
render(protein)

## Sampling options

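There are several ways to control the backbone generation and sequence design processes. For instance, the `inverse_temperature` argument to `Chroma.sample` controls the temperature of the backbone sampling. A lower inverse temperature (that is, a higher sampling temperature) gives more diverse but riskier samples. The two cells below use the same random seed so that the only difference between the samples is the temperature.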

In [None]:
torch.manual_seed(42)
hight = chroma.sample(chain_lengths=[100], steps=200, inverse_temperature=1)

In [None]:
torch.manual_seed(42)
lowt = chroma.sample(chain_lengths=[100], steps=200, inverse_temperature=10)
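
To build intuition for what an inverse temperature does in general, here is a toy sketch that is independent of Chroma. It assumes a made-up set of four states with made-up energies and weights them as `exp(-beta * E)`; Chroma's backbone sampler is far more involved, so treat this only as an illustration of the diversity trade-off.

```python
import math


def boltzmann(energies, inverse_temperature):
    """Return probabilities p_i proportional to exp(-beta * E_i)."""
    beta = inverse_temperature
    # Subtracting the minimum energy avoids overflow and leaves the result unchanged.
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy in nats; higher means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


energies = [0.0, 0.5, 1.0, 2.0]  # made-up energies for four hypothetical states
for beta in (1, 10):
    probs = boltzmann(energies, beta)
    print(f"beta={beta:>2}: p(lowest energy)={probs[0]:.3f}, entropy={entropy(probs):.3f}")
```

At the larger inverse temperature almost all of the probability sits on the lowest-energy state, so samples are less diverse.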

## Scoring proteins

We can score the generated proteins with `Chroma.score`. Generally, lower-temperature sampling (a higher `inverse_temperature`) gives better quality at the expense of diversity.

In [None]:
lowt_scores = chroma.score(lowt)
hight_scores = chroma.score(hight)
print(lowt_scores["elbo"].score, hight_scores["elbo"].score)

## Getting diffusion trajectories

Let's make a complex with two chains. This time, we'll set `full_output=True` so that `Chroma.sample` also returns the diffusion trajectories.

In [None]:
protein, trajectories = chroma.sample(
    chain_lengths=[140, 140], steps=200, full_output=True
)
render(protein, trajectories)
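
The `render` helper in the setup cell names the trajectory file with `output.replace(".cif", "_trajectory.cif")`. If `output` does not contain `.cif`, `str.replace` returns it unchanged, so the trajectory would be written to the same file as the sample. A minimal sketch of a suffix-agnostic alternative follows; `trajectory_path` is a hypothetical helper, not part of Chroma or of the setup cell.

```python
from pathlib import Path


def trajectory_path(output):
    """Return a sibling path with '_trajectory' appended to the file stem."""
    path = Path(output)
    # with_name keeps the directory and swaps only the final path component.
    return str(path.with_name(f"{path.stem}_trajectory{path.suffix}"))


print(trajectory_path("protein.cif"))
print(trajectory_path("sample.pdb"))
```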

At each step in the reverse diffusion process, the model produces a best guess of what the sample should look like when fully denoised. These predictions are stored under the `Xhat_trajectory` key of the trajectory output. We can write them to a CIF file and compare them with the sample trajectory to see how the generated sample evolves towards the denoised prediction.

In [None]:
print(list(trajectories.keys()))

trajectories["Xhat_trajectory"].to_CIF("xhat_trajectory.cif")
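
The relationship between the noisy sample and the denoised prediction can be illustrated with a one-dimensional toy model. The sketch below is not Chroma's sampler: it assumes scalar data drawn from a Gaussian (so the ideal denoiser has a closed form), a cosine noise schedule, and a deterministic DDIM-style update, all chosen only for illustration. The names `schedule`, `denoise`, and `sample` are hypothetical.

```python
import math
import random


def schedule(t):
    """Variance-preserving noise schedule with alpha^2 + sigma^2 = 1."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def denoise(x_t, t, mu=3.0, s=0.5):
    """Posterior mean E[x_0 | x_t] when the clean data are N(mu, s^2)."""
    alpha, sigma = schedule(t)
    gain = alpha * s**2 / (alpha**2 * s**2 + sigma**2)
    return mu + gain * (x_t - alpha * mu)


def sample(steps=50, seed=0):
    """Run the toy reverse process and record both trajectories."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)  # start from pure noise at t = 1
    x_traj, xhat_traj = [], []
    for i in range(steps):
        t, t_next = 1 - i / steps, 1 - (i + 1) / steps
        alpha, sigma = schedule(t)
        xhat = denoise(x, t)                  # best guess of the clean sample
        eps_hat = (x - alpha * xhat) / sigma  # noise implied by that guess
        alpha_next, sigma_next = schedule(t_next)
        x = alpha_next * xhat + sigma_next * eps_hat  # step to the lower noise level
        x_traj.append(x)
        xhat_traj.append(xhat)
    return x_traj, xhat_traj


x_traj, xhat_traj = sample()
for step in (0, 24, 49):
    print(f"step {step:>2}: sample={x_traj[step]:.3f}, prediction={xhat_traj[step]:.3f}")
```

Early on the prediction sits near the data mean while the sample is still mostly noise; by the last step, where no noise remains, the two coincide.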

## Conditional generation

Usually, we want to generate a protein that satisfies particular conditions. Chroma's conditioner framework enables this. Here, we show an example where we redesign the backbone of a protein with some residues fixed; the condition is that the coordinates of the fixed residues can't change through the diffusion process.

We also show the `design_selection` option, which allows us to fix part of the sequence. There's even more you can do with sequence design, including specifying which residues are allowed by position.

In [None]:
protein = Protein("1XYZ", device="cuda")
substructure_conditioner = conditioners.SubstructureConditioner(
    protein=protein,
    backbone_model=chroma.backbone_network,
    selection="not (chain A and resid 30-60)",
).to("cuda")

new_protein = chroma.sample(
    protein_init=protein,
    conditioner=substructure_conditioner,
    langevin_factor=4.0,
    langevin_isothermal=True,
    inverse_temperature=8.0,
    sde_func="langevin",
    steps=500,
    design_selection="chain B and resid 30-60",
)

render(new_protein)
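
To make the selection string concrete, the sketch below builds the equivalent boolean mask by hand. It does not use Chroma's selection parser; the two-chain, 100-residue layout, the inclusive reading of `30-60`, and the names `residues` and `in_redesign_region` are all assumptions made for illustration.

```python
# Hypothetical layout: two chains of 100 residues each, numbered from 1.
residues = [(chain, resid) for chain in ("A", "B") for resid in range(1, 101)]


def in_redesign_region(chain, resid):
    """True for residues matched by 'chain A and resid 30-60' (inclusive range assumed)."""
    return chain == "A" and 30 <= resid <= 60


# 'not (chain A and resid 30-60)' is the complement: these residues stay fixed.
fixed_mask = [not in_redesign_region(chain, resid) for chain, resid in residues]

print(sum(fixed_mask), "fixed;", fixed_mask.count(False), "free to move")
```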

Note that sequence and side-chain design can also be done independently of backbone generation. Here's an example of redesigning the sequence of the same PDB structure and printing the designed and original proteins for comparison.

In [None]:
designed_protein = chroma.design(protein)
print(designed_protein)
print(protein)
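
One simple way to quantify how much a redesign changed the sequence is the fraction of positions that kept the same residue. The sketch below works on plain strings and makes no assumption about how to extract a sequence from a `Protein` object; the two sequences are made up, and `sequence_identity` is a hypothetical helper.

```python
def sequence_identity(seq_a, seq_b):
    """Fraction of positions with the same residue letter (no alignment is performed)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


original = "MKTAYIAKQR"  # made-up sequence
designed = "MKSAYLAKQR"  # made-up redesign with two substitutions
print(f"identity = {sequence_identity(original, designed):.2f}")
```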

## Residue-level conditioning

While the above example used a conditioner applied to the generated structure as a whole, Chroma can also condition on individual residues. Here's a conditioner where we can specify the secondary structure for each residue. You can specify a string where H = helix, E = strand, and T = turn.

In [None]:
SS = "HHHHHHHTTTHHHHHHHTTTEEEEEETTTEEEEEEEETTTTHHHHHHHH"

proclass_model = graph_classifier.load_model("named:public", device=device)
ss_conditioner = conditioners.ProClassConditioner(
    "secondary_structure", SS, max_norm=10.0, model=proclass_model
)
ss_conditioned_protein = chroma.sample(
    conditioner=ss_conditioner, steps=500, chain_lengths=[len(SS)]
)
render(ss_conditioned_protein, output="ss_conditioned_protein.cif")
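
Because the string length sets the chain length, it can help to check the string before sampling. The sketch below validates the letters and summarizes the string as runs; it assumes that only `H`, `E`, and `T` are valid labels, as described above, and `ss_segments` is a hypothetical helper rather than part of Chroma.

```python
from itertools import groupby

ALLOWED = {"H", "E", "T"}  # helix, strand, turn


def ss_segments(ss):
    """Validate a secondary-structure string and return (label, length) runs."""
    unknown = set(ss) - ALLOWED
    if unknown:
        raise ValueError(f"unexpected secondary-structure labels: {sorted(unknown)}")
    return [(label, len(list(run))) for label, run in groupby(ss)]


SS = "HHHHHHHTTTHHHHHHHTTTEEEEEETTTEEEEEEEETTTTHHHHHHHH"
for label, length in ss_segments(SS):
    print(label, length)
```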

## Composing conditioners

The `conditioners` module in Chroma allows for composition, via `composed_conditioner = conditioners.ComposedConditioner([conditioner1, conditioner2, ...])`. We can use the secondary structure conditioner from above along with a symmetry conditioner.

In [None]:
symm_conditioner = conditioners.SymmetryConditioner(G="C_3", num_chain_neighbors=2)
composed_cond = conditioners.ComposedConditioner([ss_conditioner, symm_conditioner])

symm_ss_protein = chroma.sample(
    chain_lengths=[len(SS)],
    conditioner=composed_cond,
    langevin_factor=8,
    inverse_temperature=8,
    sde_func="langevin",
    steps=500,
)

render(symm_ss_protein, output="symm_ss_protein.cif")
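
For readers unfamiliar with the notation, `C_3` is the cyclic group of order 3: three copies related by 120-degree rotations about a single axis. The sketch below shows that geometry for one point. It is not how `SymmetryConditioner` is implemented; the choice of the z-axis and the helper names `rotate_z` and `cyclic_copies` are assumptions for illustration.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def cyclic_copies(point, order):
    """Return the `order` images of a point under rotations by multiples of 2*pi/order."""
    return [rotate_z(point, 2 * math.pi * k / order) for k in range(order)]


copies = cyclic_copies((1.0, 0.0, 0.5), order=3)
for p in copies:
    print(tuple(round(v, 3) for v in p))
```

Applying the 120-degree rotation three times returns every point to where it started, which is what makes the three copies equivalent.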