GitHub - krdav/SPURF: Substitution Profiles Using Related Families

Substitution Profiles Using Related Families (SPURF)

This repository contains a command-line implementation of SPURF, which takes a single antibody heavy chain DNA sequence and returns its inferred substitution profile together with a logo plot of that profile. SPURF feeds cached data from a large-scale Rep-Seq dataset into a statistical model designed to infer a detailed, clonal-family-specific substitution profile for a single input sequence. Source code for fitting the SPURF model from scratch on another dataset is also provided. Results and methods are described in our preprint. The dataset used in the paper is available in our Zenodo bucket.

Cloning this repo

Clone this GitHub repo recursively to get the necessary submodules:

git clone --recursive https://github.com/krdav/SPURF.git
cd SPURF
git pull --recurse-submodules https://github.com/krdav/SPURF.git
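
A non-recursive clone leaves the submodule directories empty, which can cause the installation to fail later. The sketch below is one way to check for that before installing. The helper name check_submodules is hypothetical, and the two directory names are assumptions taken from the repository's top-level listing:

```shell
# Hypothetical helper: report whether the submodule directories were populated.
# ANARCI_KD_copy and partis are assumed to be the submodule directories.
check_submodules() {
    repo_dir="${1:-.}"
    status=0
    for sub in ANARCI_KD_copy partis; do
        # A submodule that was never fetched is a missing or empty directory.
        if [ -d "$repo_dir/$sub" ] && [ -n "$(ls -A "$repo_dir/$sub")" ]; then
            echo "ok: $sub"
        else
            echo "missing: $sub"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_submodules from inside the SPURF directory prints one line per submodule and returns a non-zero status if either directory is missing or empty.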

Installation

There are two supported ways of installing the command-line implementation of SPURF: 1) using Conda on Linux and 2) using Docker with the provided Dockerfile. The Conda installation has been tested on our own servers and on a fresh Ubuntu installation in VirtualBox. With VirtualBox, SPURF can therefore be installed on both Mac and Windows. Alternatively, Docker can be used on any platform that supports it.

Using Conda

First, install Conda for Python 2. Miniconda is sufficient and much faster to install. Remember to run source ~/.bashrc if you continue the installation in the same terminal window.

Install dependencies with apt-get:

sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get install -y libz-dev cmake scons libgsl0-dev libncurses5-dev libxml2-dev libxslt1-dev mafft hmmer

Run the INSTALL executable (./INSTALL) to install the required Python environment and partis. After installation, the Conda environment must be activated before each use:

source activate SPURF
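
Because the environment has to be activated in every new shell, a wrapper script may want to check for it first. The following is a minimal sketch, assuming Conda exports the name of the active environment in the CONDA_DEFAULT_ENV variable. The helper name require_spurf_env is hypothetical:

```shell
# Hypothetical guard: succeed only if the SPURF Conda environment is active.
# Assumes Conda sets CONDA_DEFAULT_ENV to the name of the active environment.
require_spurf_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "SPURF" ]; then
        return 0
    fi
    echo "SPURF environment not active; run: source activate SPURF" >&2
    return 1
}
```

A script could then call require_spurf_env || exit 1 before invoking any SPURF command.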

Using Docker

First install Docker.

We have a Docker image on Docker Hub that is automatically kept up to date with the master branch of this repository. It can be pulled and used directly:

sudo docker pull krdav/spurf

Alternatively you can build the container yourself from inside the main repository directory:

sudo docker build -t spurf .

To run this container, use a command such as the following (see the modifications below):

sudo docker run -it -v host-dir:/host krdav/spurf /bin/bash

Replace host-dir with the local directory that you would like to access inside your container.
Replace /host with the path where you would like this directory to be mounted.
If you built your own container, use spurf in place of krdav/spurf.

Detach using ctrl-p ctrl-q.
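
The substitutions described for the docker run command can be sketched as a small helper that only prints the final command. The name spurf_docker_cmd and its argument order are hypothetical. The host directory is resolved to an absolute path because Docker interprets a non-absolute source in -v as a named volume instead of a directory:

```shell
# Hypothetical helper: print the docker run command for a given host directory.
# Arguments: host directory, mount point (default /host), image (default krdav/spurf).
spurf_docker_cmd() {
    host_dir="${1:-.}"
    mount_point="${2:-/host}"
    image="${3:-krdav/spurf}"
    # Resolve to an absolute path; fail if the directory does not exist.
    abs_dir="$(cd "$host_dir" && pwd)" || return 1
    echo "sudo docker run -it -v \"$abs_dir:$mount_point\" $image /bin/bash"
}
```

For example, spurf_docker_cmd . /host spurf prints the command for a locally built image with the current directory mounted at /host.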

Running SPURF

SPURF is wrapped in an Rscript named run_SPURF.R that takes three inputs:

an antibody heavy chain DNA sequence
(optional) the basename for the two output files, which are a substitution profile and a logo plot
the model type (l2 or jaccard).

Example run:

Rscript --vanilla run_SPURF.R <input_sequence> <output_base> <model_type>

E.g.:

Rscript --vanilla run_SPURF.R CGCAGGACTGTTGANGCCTTCGGAGACCCTGTCCCTCACCTGCGTTGTCTCTGGCGGGTCCTTCAGTGATTACTACTGGAGCTGGATCCATCAGCCCCCAGGGAAGGGGCTGGAGTGGATTGGGGAAATCAATCATAGTGGGAGCACCAACTACAACCCGTCCCTCGAAAGTCGAGCCACCATATCAGTAGACACGTCCCAGAACAACCTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACTCGGCTGTGTATTACTGTGCGAGAGGCCCGACTACAATGGCTCACGACTTTGACTACTGGGGCCAGGGAACCCTGGTCACC seqXYZ_SPURF_output l2

By default, the model type l2 is used.
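
Since the sequence is passed directly on the command line, it can help to validate the arguments before starting R. The sketch below builds the run_SPURF.R command after two checks. The helper name spurf_cmd and the fallback basename SPURF_output are hypothetical, and the allowed alphabet (A, C, G, T, N) is an assumption based on the characters in the example sequence above:

```shell
# Hypothetical helper: validate the arguments and print the Rscript command.
spurf_cmd() {
    seq="${1:-}"
    out_base="${2:-SPURF_output}"   # hypothetical fallback basename for this sketch
    model="${3:-l2}"                # l2 is the documented default model type
    # Assumed alphabet: reject empty input or anything outside A, C, G, T, N.
    case "$seq" in
        ''|*[!ACGTN]*)
            echo "error: sequence must be non-empty and contain only A, C, G, T, N" >&2
            return 1 ;;
    esac
    # Only the two documented model types are accepted.
    case "$model" in
        l2|jaccard) ;;
        *)
            echo "error: model type must be l2 or jaccard" >&2
            return 1 ;;
    esac
    echo "Rscript --vanilla run_SPURF.R $seq $out_base $model"
}
```

The helper only prints the command, so the output can be inspected first and then executed from the repository directory.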

Output example

Zooming in on CDR2 and its flanking frameworks:
