<div align="center">
<h1> XBCR-net (Cross B-Cell Receptor network) for antibody-antigen binding prediction </h1>

[![DOI](https://img.shields.io/badge/DOI-10.1038%2Fs41422--022--00727--6-darkyellow)](https://www.nature.com/articles/s41422-022-00727-6) \|
<a href="https://github.com/jianqingzheng/XBCR-net"><img src="https://img.shields.io/github/stars/jianqingzheng/XBCR-net?style=social&label=Code+★" /></a>
\|
[![Explore XBCR-net in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jianqingzheng/XBCR-net/blob/main/XBCR_net.ipynb)
</div>


Code for the *Cell Research* paper [Deep learning-based rapid generation of broadly reactive antibodies against SARS-CoV-2 and its Omicron variant](https://doi.org/10.1038/s41422-022-00727-6).

> This implementation includes the training and inference pipeline of XBCR-net, based on TensorFlow and Keras. The original implementation of its backbone network, ACNN, can be found in the [ACNN repo](https://github.com/XiaoYunZhou27/ACNN).


---
#### Contents ####
- 1. Installation
- 2. Usage
  - 2.1. Training (optional)
  - 2.2a. Inference by entering data
  - 2.2b. Batch Inference
- 3. Citing this work
---

In [None]:
#@title 1. Installation {run: "auto"}
#@markdown Clone code from Github repo: https://github.com/jianqingzheng/XBCR-net.git

!git clone https://github.com/jianqingzheng/XBCR-net.git
%cd XBCR-net/

#@markdown and Install packages

#@markdown [![OS](https://img.shields.io/badge/OS-Windows%7CLinux-darkblue)]()
#@markdown [![Python](https://img.shields.io/badge/Python-3.8-blue)](https://www.python.org/)
#@markdown [![TensorFlow](https://img.shields.io/badge/TensorFlow-2.4.1-lightblue)](https://www.tensorflow.org)
#@markdown [![Numpy](https://img.shields.io/badge/Numpy-1.19.5-lightblue)](https://numpy.org)
#@markdown [![Pandas](https://img.shields.io/badge/Pandas-1.1.0-lightblue)](https://pandas.pydata.org/)

#@markdown > Other versions of these packages may also work

!pip install tensorflow==2.4.1
!pip install numpy==1.19.5
!pip install pandas==1.1.0
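
To confirm which versions ended up installed, a quick check such as the one below may help. It is a minimal sketch that relies only on the standard library (`importlib.metadata`, available from Python 3.8); the helper name `check_versions` and the `PINNED` dictionary are illustrative and not part of the XBCR-net code. A reported mismatch is not necessarily a problem, since other versions may also work.

```python
from importlib import metadata

# Versions pinned in the installation cell above.
PINNED = {"tensorflow": "2.4.1", "numpy": "1.19.5", "pandas": "1.1.0"}


def check_versions(pinned, get_version=metadata.version):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Report every package whose installed version differs from the pinned one.
for name, (expected, installed) in check_versions(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```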

## 2. Usage

\* Setup
```
[$DOWNLOAD_DIR]/XBCR-net/           
├── data/[$data_name]/
|   ├── exper/
|   |   |   # experimental dataset for training (.xlsx|.csv files)
|   |   └── example-experimental_data.xlsx
|   ├── nonexp/
|   |   |   # negative samples for training (.xlsx|.csv files)
|   |   └── example-negative_data.xlsx
|   └── test/
|       ├── ab_to_pred/
|       |   |   # the antibody data for inference
|       |   └── example-antibody_to_predict.xlsx
|       ├── ag_to_pred/
|       |   |     # the antigen data for inference
|       |   └── example-antigen_to_predict.xlsx
|       └── results/
|           |    # the files to print the inference results
|           └── results_rbd_[$model_name]-[$model_num].xlsx
└── models/[$data_name]/
    └── [$data_name]-[$model_name]/
        |   # the files of model parameters (.tf.index and .tf.data-000000-of-00001 files)
        ├── model_rbd_[$model_num].tf.index
        └── model_rbd_[$model_num].tf.data-000000-of-00001
```
> The default data can also be downloaded from [Data_S1](https://static-content.springer.com/esm/art%3A10.1038%2Fs41422-022-00727-6/MediaObjects/41422_2022_727_MOESM2_ESM.xlsx) (not required for usage)
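
When starting a new dataset, the empty folders from the layout above can be created in one step. The following is a minimal sketch, assuming it is run with the `XBCR-net/` folder as `root`; the function name `make_data_skeleton` is hypothetical and not provided by the repository.

```python
from pathlib import Path


def make_data_skeleton(root, data_name):
    """Create the empty data folders shown in the layout and return them."""
    base = Path(root) / "data" / data_name
    subdirs = [
        base / "exper",                # experimental training data
        base / "nonexp",               # negative training samples
        base / "test" / "ab_to_pred",  # antibodies for inference
        base / "test" / "ag_to_pred",  # antigens for inference
        base / "test" / "results",     # inference output
    ]
    for folder in subdirs:
        # exist_ok=True keeps existing folders and their files untouched.
        folder.mkdir(parents=True, exist_ok=True)
    return subdirs
```

For example, `make_data_skeleton(".", "my_dataset")` called from inside `XBCR-net/` would prepare `data/my_dataset/`, where `my_dataset` is a placeholder name.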


### 2.1. Training (optional)
1. Upload the experimental data to ```XBCR-net/data/$data_name/exper/``` and the non-experimental data to ```XBCR-net/data/$data_name/nonexp/```

2. Run
```!python main_train.py --model_name XBCR_net --data_name $data_name --model_num $model_num --max_epochs $max_epochs --include_light [1/0]```

3. Check the saved model in ```XBCR-net/models/$data_name/$data_name-XBCR_net/```

<div align="center">

| Argument              | Description                                	|
| --------------------- | ----------------------------------------------|
| `--data_name` 	| The data folder name                       	|
| `--model_name`        | The used model                      	     	|
| `--model_num`         | The index number of the trained model      	|
| `--max_epochs`        | The max epoch number for training 	     	|
| `--include_light`     | 1/0: include/exclude input of a light chain	|

</div>
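
Step 3 above (checking the saved model) can also be done programmatically. This is a minimal sketch that assumes the folder and file naming shown in the Setup layout (`models/$data_name/$data_name-$model_name/model_rbd_$model_num.tf.*`); the helper name `find_checkpoint_files` is hypothetical.

```python
from pathlib import Path


def find_checkpoint_files(root, data_name, model_name, model_num):
    """List files named model_rbd_<model_num>.tf.* in the model folder."""
    model_dir = Path(root) / "models" / data_name / f"{data_name}-{model_name}"
    if not model_dir.is_dir():
        return []  # nothing has been saved for this data/model combination
    return sorted(model_dir.glob(f"model_rbd_{model_num}.tf.*"))
```

An empty list from `find_checkpoint_files(".", "binding", "XBCR_net", 0)` would suggest that training has not written a model with that index yet.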

In [None]:
#@markdown \* Example for training (optional):

model_name = 'XBCR_net' #@param {type:"string"}
data_name = 'binding' #@param {type:"string"}
model_num = 0     #@param {type:"integer"}
max_epochs = 100   #@param {type:"integer"}
include_light = 1     #@param [0,1]

!python main_train.py --model_name {model_name} --data_name {data_name} --model_num {model_num} --max_epochs {max_epochs} --include_light {include_light}

#@markdown > This training process is optional as the trained model has been provided.

### 2.2a. Inference by entering data ###

In [None]:
#@markdown \* Example for a single data point:

HEAVY='VQLVESGGGLVQPGGSLRLSCAASGFTFSSYDMHWVRQTTGKGLEWVSTIGTAGDTYYPDSVKGRFTISREDAKNSLYLQMNSLRAGDTAVYYCARGDSSGYYYYFDYWGQGTLLTVSS' #@param {type:"string"}
LIGHT='DIEMTQSPSSLSAAVGDRVTITCRASQSIGSYLNWYQQKPGKAPKLLIYAASSLQSGVPSRFSGSGSGTDFTLTISSLQPEDFAIYYCQQSYVSPTYTFGPGTKVDIK'      #@param {type:"string"}
ANTIG='RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF' #@param {type:"string"}

!python pred_bcr.py --heavy $HEAVY --light $LIGHT --antig $ANTIG --model_name XBCR_net --data_name binding --model_num 0


In [None]:
#@markdown \* Example for multiple data points (split by ','):

HEAVY='VQLVESGGGLVQPGGSLRLSCAASGFTFSSYDMHWVRQTTGKGLEWVSTIGTAGDTYYPDSVKGRFTISREDAKNSLYLQMNSLRAGDTAVYYCARGDSSGYYYYFDYWGQGTLLTVSS,EVQLVESGGGLVQPGGSLRLSCAASGFTFNNYWMSWVRQAPGKGLEWVANINQDGSEKYYVDSVMGRFAISRDNAKNSLYLQMNSLRAEDTAVYYCARDQGYGDYFEYNWFDPWGQGTLVTVSS' #@param {type:"string"}
LIGHT='DIEMTQSPSSLSAAVGDRVTITCRASQSIGSYLNWYQQKPGKAPKLLIYAASSLQSGVPSRFSGSGSGTDFTLTISSLQPEDFAIYYCQQSYVSPTYTFGPGTKVDIK,DIQLTQSPSFLSASVGDRVTITCRASQGIYSYLAWYQQKPGKAPKLLIYAASTLQSGVPSRFSGSGSGTEFTLTISSLQPEDFATYYCQQLNSYPITFGQGTRLEIK' #@param {type:"string"}
ANTIG='RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF,RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF' #@param {type:"string"}

#@markdown > Spaces (' ' or '_') and line breaks ('\n') do not affect data recognition

!python pred_bcr.py --heavy $HEAVY --light $LIGHT --antig $ANTIG --model_name XBCR_net --data_name binding --model_num 0
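
The comma-separated input format can be illustrated with a small parser. This is only a sketch of the behaviour described in the note above (ignoring `' '`, `'_'` and `'\n'`, then splitting on `','`); it is not the code used by `pred_bcr.py`, and the function name `split_sequences` is hypothetical. Dropping empty entries, for example after a trailing comma, is an added assumption.

```python
def split_sequences(raw):
    """Remove spaces, underscores and line breaks, then split on commas."""
    cleaned = "".join(ch for ch in raw if ch not in " _\n")
    # Assumption: empty entries (e.g. from a trailing comma) are dropped.
    return [seq for seq in cleaned.split(",") if seq]


# Two short, made-up fragments written with a space, a line break and an underscore.
print(split_sequences("VQLV ESGG,\nEVQL_VESG"))
```

Comparing `len(split_sequences(HEAVY))`, `len(split_sequences(LIGHT))` and `len(split_sequences(ANTIG))` before running the prediction is a cheap way to spot a missing or extra comma.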


<div align="center">

| Argument              | Description                                	|
| --------------------- | ----------------------------------------------|
| `--heavy` 		| The heavy chain           			|
| `--light` 		| The light chain                       	|
| `--antig` 		| The antigen                       		|
| `--data_name` 	| The data folder name                       	|
| `--model_name`        | The used model                      	     	|
| `--model_num`         | The index number of the used model         	|

</div>

### 2.2b. Batch Inference ###
1. Upload the antibody file to ```XBCR-net/data/$data_name/test/ab_to_pred/``` and the antigen file to ```XBCR-net/data/$data_name/test/ag_to_pred/```

2. Run
```!python main_infer.py --model_name XBCR_net --data_name $data_name --model_num $model_num --include_light [1/0]```

3. Download the resulting Excel file from ```XBCR-net/data/$data_name/test/results/```

<div align="center">

| Argument              | Description                                	|
| --------------------- | ----------------------------------------------|
| `--data_name` 	| The data folder name                       	|
| `--model_name`        | The used model                      	     	|
| `--model_num`         | The index number of the used model         	|
| `--include_light`     | 1/0: include/exclude input of a light chain	|

</div>

In [None]:
#@markdown \* Example for batch inference:

model_name = 'XBCR_net' #@param {type:"string"}
data_name = 'binding' #@param {type:"string"}
model_num = 0     #@param {type:"integer"}
include_light = 1     #@param [0,1]

!python main_infer.py --model_name {model_name} --data_name {data_name} --model_num {model_num} --include_light {include_light}


In [None]:
#@markdown \* Download the result file.

from google.colab import files
import os
download_path = os.path.join('data',data_name,'test','results','results_rbd_'+model_name+'-'+str(model_num)+'.xlsx')
files.download(download_path)
print('Download the file: '+download_path)
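
The `google.colab` module is only available inside Colab. When the code is run elsewhere, the result file can simply be located on disk. The sketch below builds the same path as the cell above from the naming pattern `results_rbd_$model_name-$model_num.xlsx`; the helper name `result_path` and its `root` parameter are illustrative.

```python
from pathlib import Path


def result_path(data_name, model_name, model_num, root="."):
    """Build the path of the Excel file written by batch inference."""
    filename = f"results_rbd_{model_name}-{model_num}.xlsx"
    return Path(root) / "data" / data_name / "test" / "results" / filename


path = result_path("binding", "XBCR_net", 0)
# The file only exists after main_infer.py has been run from this folder.
print(path, "(found)" if path.exists() else "(not found yet)")
```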

## 3. Citing this work

Any publication that discloses findings arising from using this source code or the network model should cite
- Hantao Lou, Jianqing Zheng, Xiaohang Leo Fang, Zhu Liang, Meihan Zhang, Yu Chen, Chunmei Wang, Xuetao Cao, "Deep learning-based rapid generation of broadly reactive antibodies against SARS-CoV-2 and its Omicron variant." *Cell Research* 33.1 (2023): 80-82.

```bibtex
@article{lou2022deep,
  title={Deep learning-based rapid generation of broadly reactive antibodies against SARS-CoV-2 and its Omicron variant},
  author={Lou, Hantao and Zheng, Jianqing and Fang, Xiaohang Leo and Liang, Zhu and Zhang, Meihan and Chen, Yu and Wang, Chunmei and Cao, Xuetao},
  journal={Cell Research},
  volume={33},
  number={1},
  pages={80--82},
  year={2023},
  publisher={Nature Publishing Group},
  doi={10.1038/s41422-022-00727-6},
}
```
and, if applicable, the [ACNN paper](https://ieeexplore.ieee.org/abstract/document/9197328):
- Xiao-Yun Zhou, Jian-Qing Zheng, Peichao Li, and Guang-Zhong Yang, "ACNN: a full resolution dcnn for medical image segmentation." *2020 IEEE International Conference on Robotics and Automation (ICRA)*. IEEE, 2020.

```bibtex
@inproceedings{zhou2020acnn,
  title={Acnn: a full resolution dcnn for medical image segmentation},
  author={Zhou, Xiao-Yun and Zheng, Jian-Qing and Li, Peichao and Yang, Guang-Zhong},
  booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={8455--8461},
  year={2020},
  organization={IEEE},
  doi={10.1109/ICRA40945.2020.9197328},
}
```