# Example: PD-L1 Antibodies

This example builds a multiple sequence alignment (MSA) viewer with Bokeh, using PD-L1 antibody sequences from Patent ID US 20220298244 A1. You can find the patent [here](https://ppubs.uspto.gov/pubwebapp/) by searching for the ID.

I adapted some code from [this article](https://dmnfarrell.github.io/bioinformatics/bokeh-sequence-aligner) by [Damien Farrell](https://github.com/dmnfarrell) at UC Davis.

All files used and generated here are in the `example` folder. Each alignment creates a FASTA file and an alignment file.

Let's start by importing the necessary libraries and the modules I created for this:

In [1]:
import sys 
from bokeh.plotting import figure
import panel as pn
import pandas as pd
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
pn.extension()

from bokehMSA import *
from AbNum import *

# Read Sequences and Create FASTA File

I created a CSV file with some of the heavy chain sequences from the patent:

In [2]:
df = pd.read_csv("example/PD-L1_Abs.csv", sep=",")
df

| Unnamed: 0 | Seq_ID_No | Seq |
|---|---|---|
| 0 | 12 | QVQLNQSGPELMKAGTSVKISCKASGYSFTDYHVNWVKQRPGQGLE... |
| 1 | 13 | QVQLVQSGAEVKKPGASVKVSCKASGYTFTDYHVNWVRQAPGQGLE... |
| 2 | 14 | QVQLVQSGAEVKKPGASVKVSCKASGYTFTDYHVNWVKQRPGQGLE... |
| 3 | 15 | QVQLVQSGAEVKKPGASVKVSCKASGYSFTDYHVNWVKQRPGQGLE... |
| 4 | 16 | QVQLVQSGAEVKKPGASVKVSCKASGYSFTDYHVNWVKQRPGQGLE... |
| 5 | 17 | QVQLVQSGAEVKKPGASVKVSCKASGYTFTDYHVNWVRQAPGQGLE... |
| 6 | 18 | EVQLQESGPGLAKPSQTLSLTCSVTGYSITSDYWNWIRKFPGNKLE... |
| 7 | 19 | QVQLQESGPGLVKPSQTLSLTCTVSGGSITSDYWNWIRQHPGKGLE... |
| 8 | 20 | QVQLQESGPGLVKPSQTLSLTCTVSGGSITSDYWNWIRQHPGNKLE... |
| 9 | 21 | QVQLQESGPGLVKPSQTLSLTCTVSGGSITSDYWNWIRQHPGNKLE... |
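
The `Unnamed: 0` column is most likely the row index that was written out when the CSV was saved. A minimal sketch, assuming that is the cause, of reading such a file with `index_col=0` so the saved index is not loaded as a data column. The inline CSV text is a two-row stand-in built from the truncated prefixes shown above, not the real file.

```python
import io

import pandas as pd

# Stand-in for example/PD-L1_Abs.csv. The empty first header field is what
# pandas displays as "Unnamed: 0". Sequences are truncated prefixes only.
csv_text = (
    ",Seq_ID_No,Seq\n"
    "0,12,QVQLNQSGPELMKAGTSVKISCKASGYSFTDYHVNWVKQRPGQGLE\n"
    "1,13,QVQLVQSGAEVKKPGASVKVSCKASGYTFTDYHVNWVRQAPGQGLE\n"
)

# Default read: the saved index comes back as an extra column.
df_default = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that first column as the DataFrame index instead.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(df_default.columns))
print(list(df_indexed.columns))
```

The rest of the notebook only uses the `Seq_ID_No` and `Seq` columns, so either form works for the steps that follow.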


Generate a list of sequences as Biopython `SeqRecord` objects:

In [3]:
seqs = []
for index, row in df.iterrows():
    seqs.append(SeqRecord(Seq(row['Seq']), id=str(row['Seq_ID_No'])))
print(seqs[0])

ID: 12
Name: <unknown name>
Description: <unknown description>
Number of features: 0
Seq('QVQLNQSGPELMKAGTSVKISCKASGYSFTDYHVNWVKQRPGQGLEWIGWIFPG...VSS')
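
Note the `<unknown description>` placeholder in the printed record. Biopython's FASTA writer appends a non-empty description to the header line after the ID, so the placeholder can end up in the file. A minimal sketch, assuming plain `>ID` headers are preferred, that passes `description=""` when building the record. The eight-residue sequence is a stand-in taken from the start of Seq ID 12, and whether the header text matters to the downstream aligner is not something the notebook states.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Short stand-in sequence (first residues of Seq ID 12), for illustration only.
record = SeqRecord(Seq("QVQLNQSG"), id="12", description="")

# Write to an in-memory buffer instead of a file so the text can be inspected.
buffer = StringIO()
SeqIO.write([record], buffer, "fasta")
fasta_text = buffer.getvalue()

print(fasta_text)
```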


Write the records to a FASTA file using `SeqIO`:

In [4]:
SeqIO.write(seqs, "example/PD-L1_Abs.fasta", "fasta")

11
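
The `11` above is the number of records that `SeqIO.write` reports having written. A minimal standard-library sketch for double-checking a FASTA file independently, assuming every record starts with a `>` header line. `count_fasta_records` is a hypothetical helper, and the demo file holds two short stand-in sequences rather than the real data.

```python
import os
import tempfile


def count_fasta_records(path):
    """Count FASTA records by counting header lines (those starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Demo on a throwaway two-record file; the sequences are short stand-ins.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.fasta")
    with open(demo_path, "w") as handle:
        handle.write(">12\nQVQLNQSG\n>13\nQVQLVQSG\n")
    n_records = count_fasta_records(demo_path)

print(n_records)
```

Calling `count_fasta_records("example/PD-L1_Abs.fasta")` after the write step should then agree with the count printed by the notebook.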

# Alignment of All Antibodies

Next, pass the FASTA file to the MSA builder and view the result in a Panel Bokeh pane:

In [5]:
p = getAbMSA("example/PD-L1_Abs.fasta")
p.title = 'PD-L1 Abs Alignment'
pn.pane.Bokeh(p)

The viewer also supports IMGT numbering (`numbering="i"`). We'll stick with the default IMGT CDR definitions.

In [6]:
pIMGT = getAbMSA("example/PD-L1_Abs.fasta", numbering="i")
pIMGT.title = 'PD-L1 Abs Alignment'
pn.pane.Bokeh(pIMGT)

It looks like there are two groups of related antibodies. Let's split them up and look at the alignment for each group.
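
The grouping can also be checked numerically. A rough sketch that compares the 46-residue prefixes visible in the truncated table output for four of the heavy chains, position by position and without gaps. This is only an indication, not a real alignment score, because the full sequences are not used and no alignment is performed. `percent_identity` is a hypothetical helper name.

```python
from itertools import combinations

# First 46 residues of four heavy chains, copied from the truncated table
# output earlier in the notebook (not the full sequences).
prefixes = {
    "12": "QVQLNQSGPELMKAGTSVKISCKASGYSFTDYHVNWVKQRPGQGLE",
    "13": "QVQLVQSGAEVKKPGASVKVSCKASGYTFTDYHVNWVRQAPGQGLE",
    "18": "EVQLQESGPGLAKPSQTLSLTCSVTGYSITSDYWNWIRKFPGNKLE",
    "19": "QVQLQESGPGLVKPSQTLSLTCTVSGGSITSDYWNWIRQHPGKGLE",
}


def percent_identity(a, b):
    """Percentage of positions with identical residues (ungapped, equal length)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Identity for every unordered pair of sequence IDs.
identity = {
    (i, j): percent_identity(prefixes[i], prefixes[j])
    for i, j in combinations(prefixes, 2)
}

for (i, j), value in identity.items():
    print(f"{i} vs {j}: {value:.0f}%")
```

On these prefixes, the pairs within each suspected group (12 vs 13, 18 vs 19) score higher than the pairs across groups, which matches what the alignment plot suggests.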

# Aligning Selected Antibodies

Create a FASTA file for each group:

In [7]:
seqs12_17 = []
seqs18_22 = []

for index, row in df.iterrows():
    if int(row['Seq_ID_No']) <= 17:
        seqs12_17.append(SeqRecord(Seq(row['Seq']), id=str(row['Seq_ID_No'])))

    else:
        seqs18_22.append(SeqRecord(Seq(row['Seq']), id=str(row['Seq_ID_No'])))

SeqIO.write(seqs12_17, "example/PD-L1_Abs12_17.fasta", "fasta")

SeqIO.write(seqs18_22, "example/PD-L1_Abs18_22.fasta", "fasta")

5
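
The split above hard-codes a single cutoff at Seq ID 17. A minimal sketch of the same idea with named ID ranges, which extends more easily if further groups turn up. `GROUPS` and `assign_group` are hypothetical names, and the demo assumes the IDs run from 12 to 22, as the file names and the record count of 11 suggest. Only the ID numbers are bucketed here, and the sequences are left out.

```python
# Hypothetical group definitions: label -> range of Seq ID numbers it covers.
GROUPS = {
    "12_17": range(12, 18),
    "18_22": range(18, 23),
}


def assign_group(seq_id, groups=GROUPS):
    """Return the label of the first group whose range contains seq_id."""
    for label, id_range in groups.items():
        if seq_id in id_range:
            return label
    raise ValueError(f"Seq ID {seq_id} is not covered by any group")


# Bucket the Seq ID numbers 12..22 by group label.
buckets = {label: [] for label in GROUPS}
for seq_id in range(12, 23):
    buckets[assign_group(seq_id)].append(seq_id)

print({label: len(ids) for label, ids in buckets.items()})
```

In the notebook loop, `assign_group(int(row['Seq_ID_No']))` would pick the list to append each `SeqRecord` to, and an ID outside every range raises an error instead of silently landing in the second group.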

In [8]:
p2 = getAbMSA("example/PD-L1_Abs12_17.fasta", numbering="i")
p2.title = "PD-L1 Abs 12-17 Alignment"
p3 = getAbMSA("example/PD-L1_Abs18_22.fasta", numbering="i")
p3.title = "PD-L1 Abs 18-22 Alignment"

In [9]:
pn.pane.Bokeh(p2)

In [10]:
pn.pane.Bokeh(p3)

# Outputting All Plots to HTML

In [11]:
title = pn.pane.Markdown("""# PD-L1 Antibody Alignments (HC)
PD-L1 heavy chain alignments from US Patent ID#20220298244. 
""",
style={'font-family':'arial, sans-serif'})

column1 = pn.Column(title, p, p2, p3) 
column1.save('example/Example PD-L1 Antibody Alignment.html', embed=True, title="PD-L1 Antibody Alignments")

In [12]:
column1