# XBCR-net (Cross B-Cell Receptor network) for antibody-antigen binding prediction

[![DOI](https://img.shields.io/badge/DOI-10.1038%2Fs41422--022--00727--6-darkyellow)](https://www.nature.com/articles/s41422-022-00727-6)
|
<a href="https://github.com/jianqingzheng/XBCR-net"><img src="https://img.shields.io/github/stars/jianqingzheng/XBCR-net?style=social&label=Code+★" /></a>

Code for the *Cell Research* paper [Deep learning-based rapid generation of broadly reactive antibodies against SARS-CoV-2 and its Omicron variant](https://doi.org/10.1038/s41422-022-00727-6)

This implementation includes the training and inference pipelines of XBCR-net, built on TensorFlow and Keras. The original implementation of its backbone network, ACNN, can be found in the [ACNN repo](https://github.com/XiaoYunZhou27/ACNN).


## Installation
Clone the code from the GitHub repo: https://github.com/jianqingzheng/XBCR-net.git


In [None]:
!git clone https://github.com/jianqingzheng/XBCR-net.git
%cd XBCR-net/

Install packages

[![OS](https://img.shields.io/badge/OS-Windows%7CLinux-darkblue)]()
[![PyPI pyversions](https://img.shields.io/badge/Python-3.8-blue)](https://www.python.org)
[![TensorFlow](https://img.shields.io/badge/TensorFlow-2.4.1-lightblue)](https://www.tensorflow.org)
[![Numpy](https://img.shields.io/badge/Numpy-1.19.5-lightblue)](https://numpy.org)
[![Pandas](https://img.shields.io/badge/Pandas-1.1.0-lightblue)](https://pandas.pydata.org/)


In [None]:
!pip install tensorflow==2.4.1
!pip install numpy==1.19.5
!pip install pandas==1.1.0
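
After installation, it can help to confirm that the environment matches the pinned versions listed above. The snippet below is a minimal sketch using only the Python standard library; the helper name `check_versions` is hypothetical and is not part of the XBCR-net repository.

```python
from importlib import metadata

# Pinned versions taken from the badges and pip commands above.
PINNED = {"tensorflow": "2.4.1", "numpy": "1.19.5", "pandas": "1.1.0"}


def check_versions(pinned=PINNED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions().items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means all three packages are installed at the pinned versions.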

## Usage

### Training (optional)
1. Upload the experimental data to ```XBCR-net/data/$data_name/exper/``` and the non-experimental data to ```XBCR-net/data/$data_name/nonexp/```

2. Run ```!python main_train.py --model_name XBCR_net --data_name $data_name --model_num $model_num --max_epochs $max_epochs --include_light [1/0]```

3. Check the saved model in ```XBCR-net/models/$data_name/$data_name-XBCR_net/```
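
Before launching a training run, the data folders named in step 1 can be checked programmatically. This is a minimal sketch, assuming the folder layout described above and that it is run with the repository root as `repo_root`; the helper `missing_training_dirs` is hypothetical and does not exist in the repository.

```python
from pathlib import Path


def missing_training_dirs(repo_root, data_name):
    """Return the expected training-data folders that are absent or empty."""
    data_root = Path(repo_root) / "data" / data_name
    missing = []
    for sub in ("exper", "nonexp"):
        folder = data_root / sub
        # A folder counts as missing if it does not exist or holds no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(folder)
    return missing


if __name__ == "__main__":
    for folder in missing_training_dirs(".", "binding"):
        print(f"missing or empty: {folder}")
```

If the list is empty, both `exper/` and `nonexp/` exist and contain at least one entry. The check does not validate the file contents.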

#### - Example:
1. Check the experimental data in ```XBCR-net/data/$data_name/exper/``` and the non-experimental data in ```XBCR-net/data/$data_name/nonexp/```

2. Run

In [None]:
!python main_train.py --model_name XBCR_net --data_name binding --model_num 0 --max_epochs 100 --include_light 1

3. Check the saved model in ```XBCR-net/models/$data_name/$data_name-XBCR_net/```

### Inference by entering data
- Example for a single data point:

In [23]:
HEAVY='VQLVESGGGLVQPGGSLRLSCAASGFTFSSYDMHWVRQTTGKGLEWVSTIGTAGDTYYPDSVKGRFTISREDAKNSLYLQMNSLRAGDTAVYYCARGDSSGYYYYFDYWGQGTLLTVSS'
LIGHT='DIEMTQSPSSLSAAVGDRVTITCRASQSIGSYLNWYQQKPGKAPKLLIYAASSLQSGVPSRFSGSGSGTDFTLTISSLQPEDFAIYYCQQSYVSPTYTFGPGTKVDIK'
ANTIG='RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF'

!python pred_bcr.py --heavy $HEAVY --light $LIGHT --antig $ANTIG --model_name XBCR_net --data_name binding --model_num 0 --include_light 1


2023-09-04 03:13:21.718444: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Instructions for updating:
non-resource variables are not supported in the long term
2023-09-04 03:13:28.747137: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
2023-09-04 03:13:28.927718: W tensorflow/c/c_api.cc:300] Operation '{name:'conv1d_76/bias/Assign' id:5144 op device:{requested: '', assigned: ''} def:{{{node conv1d_76/bias/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](conv1d_76/bias, conv1d_76/bias/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future.

- Example for multiple data points (split by ','):

In [22]:
HEAVY='VQLVESGGGLVQPGGSLRLSCAASGFTFSSYDMHWVRQTTGKGLEWVSTIGTAGDTYYPDSVKGRFTISREDAKNSLYLQMNSLRAGDTAVYYCARGDSSGYYYYFDYWGQGTLLTVSS,EVQLVESGGGLVQPGGSLRLSCAASGFTFNNYWMSWVRQAPGKGLEWVANINQDGSEKYYVDSVMGRFAISRDNAKNSLYLQMNSLRAEDTAVYYCARDQGYGDYFEYNWFDPWGQGTLVTVSS'
LIGHT='DIEMTQSPSSLSAAVGDRVTITCRASQSIGSYLNWYQQKPGKAPKLLIYAASSLQSGVPSRFSGSGSGTDFTLTISSLQPEDFAIYYCQQSYVSPTYTFGPGTKVDIK,DIQLTQSPSFLSASVGDRVTITCRASQGIYSYLAWYQQKPGKAPKLLIYAASTLQSGVPSRFSGSGSGTEFTLTISSLQPEDFATYYCQQLNSYPITFGQGTRLEIK'
ANTIG='RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF,RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNF'

!python pred_bcr.py --heavy $HEAVY --light $LIGHT --antig $ANTIG --model_name XBCR_net --data_name binding --model_num 0 --include_light 1


2023-09-04 03:12:17.573340: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Instructions for updating:
non-resource variables are not supported in the long term
2023-09-04 03:12:25.497433: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
2023-09-04 03:12:25.671421: W tensorflow/c/c_api.cc:300] Operation '{name:'conv1d_68/bias/Assign' id:4662 op device:{requested: '', assigned: ''} def:{{{node conv1d_68/bias/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](conv1d_68/bias, conv1d_68/bias/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future.
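
When there are many sequences, building the comma-separated arguments by hand is error-prone. The sketch below joins Python lists into the `--heavy`, `--light` and `--antig` strings used above and assembles the `pred_bcr.py` command. It rests on two assumptions that the examples suggest but do not confirm: that entries are paired by position (so the three lists must have equal length), and that only the 20 standard amino-acid letters are expected. The helpers `join_sequences` and `build_pred_args` are hypothetical and not part of the repository.

```python
# The 20 standard amino-acid one-letter codes (an assumption about accepted input).
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")


def join_sequences(seqs):
    """Validate each sequence and join them with ',' as in the example above."""
    cleaned = []
    for seq in seqs:
        seq = seq.strip()
        if not seq:
            raise ValueError("empty sequence")
        bad = set(seq) - VALID_AA
        if bad:
            raise ValueError(f"unexpected characters {sorted(bad)} in sequence")
        cleaned.append(seq)
    return ",".join(cleaned)


def build_pred_args(heavy, light, antig, model_num=0, include_light=1):
    """Return the argument list for pred_bcr.py, using the flags shown above."""
    # Assumed: the i-th heavy chain, light chain and antigen form one data point.
    if not (len(heavy) == len(light) == len(antig)):
        raise ValueError("heavy, light and antig must have the same length")
    return [
        "python", "pred_bcr.py",
        "--heavy", join_sequences(heavy),
        "--light", join_sequences(light),
        "--antig", join_sequences(antig),
        "--model_name", "XBCR_net",
        "--data_name", "binding",
        "--model_num", str(model_num),
        "--include_light", str(include_light),
    ]
```

The returned list can be passed to `subprocess.run(args, check=True)` from the repository root, which avoids shell-quoting issues with long sequences.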

### Batch Inference
1. Upload the antibody file to ```XBCR-net/data/$data_name/ab_to_pred/``` and the antigen file to ```XBCR-net/data/$data_name/ag_to_pred/```

2. Run ```!python main_infer.py --model_name XBCR_net --data_name $data_name --model_num $model_num --include_light [1/0]```

3. Download the resulting Excel file from ```XBCR-net/data/binding/test/results/*```

#### - Example:
1. Check the antibody file in ```XBCR-net/data/$data_name/ab_to_pred/``` and the antigen file in ```XBCR-net/data/$data_name/ag_to_pred/```
2. Run

In [None]:
!python main_infer.py --model_name XBCR_net --data_name binding --model_num 0 --include_light 1

3. Download the resulting Excel file from ```XBCR-net/data/binding/test/results/results_rbd_XBCR_net-0.xlsx```
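
To locate the output of a batch run without browsing the folder, the results directory can be listed from Python. This is a minimal sketch based on the path shown above; whether other values of `data_name` use the same `test/results/` layout is an assumption, and the helper `list_result_files` is hypothetical.

```python
from pathlib import Path


def list_result_files(repo_root, data_name="binding"):
    """Return the .xlsx files under data/<data_name>/test/results/, sorted by name."""
    results_dir = Path(repo_root) / "data" / data_name / "test" / "results"
    if not results_dir.is_dir():
        # No inference has been run yet, or the layout differs from the assumed one.
        return []
    return sorted(results_dir.glob("*.xlsx"))


if __name__ == "__main__":
    for path in list_result_files("."):
        print(path)
```

Each returned path can then be opened in a spreadsheet program or loaded with `pandas.read_excel`, which requires an Excel engine such as `openpyxl` that is not in the install list above.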

## Citing this work

Any publication that discloses findings arising from using this source code or the network model should cite
```bibtex
@article{lou2022deep,
  title={Deep learning-based rapid generation of broadly reactive antibodies against SARS-CoV-2 and its Omicron variant},
  author={Lou, Hantao and Zheng, Jianqing and Fang, Xiaohang Leo and Liang, Zhu and Zhang, Meihan and Chen, Yu and Wang, Chunmei and Cao, Xuetao},
  journal={Cell Research},
  pages={1--3},
  year={2022},
  publisher={Nature Publishing Group},
  doi={10.1038/s41422-022-00727-6},
}
```
and, if applicable, the [ACNN paper](https://ieeexplore.ieee.org/abstract/document/9197328):
```bibtex
@inproceedings{zhou2020acnn,
  title={{ACNN}: a full resolution {DCNN} for medical image segmentation},
  author={Zhou, Xiao-Yun and Zheng, Jian-Qing and Li, Peichao and Yang, Guang-Zhong},
  booktitle={2020 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={8455--8461},
  year={2020},
  organization={IEEE},
  doi={10.1109/ICRA40945.2020.9197328},
}
```