# Third-party models


In [None]:
import numpy as np

from pepme.core import (
    compute_metrics,
    show_table,
)
from pepme.metrics.fid import FID
from pepme.third_party import ThirdPartyModel

Some models of interest are not published on a package index or model hub such as PyPI or Hugging Face; only a git repository may be available. Here we show how to run such models in pepme.


An external model is compatible with pepme if it satisfies the following three criteria:

- The git repository is public.
- It defines a dependency file that `pip install` can detect (for example `pyproject.toml` or `setup.py`).
- It contains a function with signature `Callable[[list[str], ...], np.ndarray]`.
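To make the third criterion concrete, here is a minimal sketch of a function with a compatible signature. It is purely illustrative: the name `embed`, the `batch_size` keyword, and the three hand-picked features are assumptions for this example, not the contents of any real repository.

```python
import numpy as np


def embed(sequences: list[str], batch_size: int = 32) -> np.ndarray:
    """Hypothetical entry point: returns one fixed-length feature row per sequence."""
    rows = []
    # Process the input in chunks, as a real model would do for memory reasons.
    for start in range(0, len(sequences), batch_size):
        for seq in sequences[start : start + batch_size]:
            n = max(len(seq), 1)  # avoid division by zero for empty strings
            rows.append(
                [
                    float(len(seq)),                       # sequence length
                    sum(seq.count(a) for a in "KR") / n,  # fraction of K/R residues
                    sum(seq.count(a) for a in "DE") / n,  # fraction of D/E residues
                ]
            )
    # Shape is (number of sequences, 3), also for an empty input list.
    return np.asarray(rows, dtype=float).reshape(-1, 3)
```

The first positional argument is the list of sequences; any further arguments (here `batch_size`) are passed as keywords, and the result is a 2D array with one row per input sequence.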


Let's show this through a toy repository which satisfies these three criteria.


In [None]:
thirdparty_model = ThirdPartyModel(
    entry_point="pepmem.model:embed",
    repo_url="git+https://github.com/RasmusML/pepme-models",
    save_dir="../plugins/pepme-models/embed-1",
    python_bin=None,  # Path to an environment's Python executable. If None, a venv is created.
    branch="embed-1",
)

`ThirdPartyModel` clones the model repository, creates a virtual environment (venv) if `python_bin=None`, and installs the dependencies using `pip install .`.
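For intuition, the setup steps described above roughly correspond to three shell commands. The sketch below only builds those commands and does not run them. It is not pepme's actual implementation: the helper name `plan_setup_commands`, the `.venv` location inside `save_dir`, and the shallow clone are all assumptions made for illustration.

```python
import os
import sys
from pathlib import Path


def plan_setup_commands(repo_url: str, save_dir: str, branch: str) -> list[list[str]]:
    """Hypothetical helper: list the commands for clone, venv creation, and install."""
    # `git clone` does not understand pip's "git+" URL prefix, so drop it.
    clone_url = repo_url[4:] if repo_url.startswith("git+") else repo_url
    repo_dir = Path(save_dir).resolve()
    venv_dir = repo_dir / ".venv"  # assumed location, chosen for this sketch
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    venv_python = str(venv_dir / bin_dir / "python")
    return [
        # 1. Clone the requested branch into save_dir.
        ["git", "clone", "--branch", branch, "--depth", "1", clone_url, str(repo_dir)],
        # 2. Create an isolated environment next to the code.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # 3. Install the repository and its dependencies into that environment.
        [venv_python, "-m", "pip", "install", str(repo_dir)],
    ]


for cmd in plan_setup_commands(
    "git+https://github.com/RasmusML/pepme-models",
    "../plugins/pepme-models/embed-1",
    "embed-1",
):
    print(" ".join(cmd))
```

To actually execute the plan, each command could be passed to `subprocess.run(cmd, check=True)`. Installing into a separate venv keeps the external model's dependencies from conflicting with those of the current environment.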

Assuming everything went well, let's now compute a metric using this embedding model.

In [None]:
def embedder(seq: list[str]) -> np.ndarray:
    return thirdparty_model(seq, batch_size=32)


embedder(["MKQW", "RKSPL"])

array([[22.        ,  4.        , 26.        , 16.        ],
       [24.59674775,  4.47213595, 29.06888371, 17.88854382]])

In [None]:
sequences = {
    "HydrAMP": ["MMRK", "RKSPL", "RRLSK", "RRLSK"],
    "hyformer": ["MKQW", "RKSPL"],
    "Random": ["KKKKK", "PLQ", "RKSPL"],
}

metrics = [FID(reference=sequences["Random"], embedder=embedder)]
df = compute_metrics(sequences, metrics)

show_table(df, decimals=2)

100%|██████████| 3/3 [00:00<00:00, 16.69it/s, data=Random, metric=FID]  


| | FID↓ |
|---|---|
| HydrAMP | - |
| hyformer | 6.40 |
| Random | 0.00 |
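As background on the metric, the Fréchet distance between two sets of embeddings is commonly computed by fitting a Gaussian to each set and comparing means and covariances. The sketch below shows that standard formula using NumPy only. It is an illustrative assumption about the general technique, not pepme's `FID` implementation, and the function name `frechet_distance` is made up for this example.

```python
import numpy as np


def frechet_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to the rows of x and y.

    Each input needs at least two rows so that a covariance can be estimated.
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # Tr(sqrt(cov_x @ cov_y)) equals the sum of the square roots of the
    # eigenvalues of the product, which are real and non-negative for
    # covariance matrices. Clip small negative values from round-off.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)
```

Under this formula a set compared with itself has distance zero, and shifting every embedding by a constant vector adds the squared length of that vector.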
