As we explore strategies to mitigate dual use risks in predictive chemistry (DURPC), we present our data-level mitigation strategy: Selective Noise Addition. To support the safe public distribution of chemical data, we test adding noise only to the data points in a dataset whose labels are identified as sensitive. We test this method with three models:
- 1-D Polynomial Regression
- Multilayer Perceptron (MLP)
- Graph Convolutional Network (GCN) predicting lipophilicity
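
The idea can be illustrated with a minimal sketch for the 1-D polynomial regression case. This is not the code from the paper: the toy cubic target, the threshold that marks a label as sensitive, the Gaussian noise model, the choice to perturb the labels themselves, and the helper name `selective_noise` are all assumptions made for illustration.

```python
import numpy as np


def selective_noise(y, sensitive_mask, noise_scale, rng):
    """Return a copy of y with Gaussian noise added only where sensitive_mask is True.

    Hypothetical helper: the noise model (zero-mean Gaussian on the labels)
    is an assumption, not necessarily the one used in the paper.
    """
    y_noisy = np.array(y, dtype=float, copy=True)
    n_sensitive = int(np.count_nonzero(sensitive_mask))
    y_noisy[sensitive_mask] += rng.normal(0.0, noise_scale, size=n_sensitive)
    return y_noisy


rng = np.random.default_rng(0)

# Toy 1-D dataset (assumed): a cubic function sampled on [-1, 1].
x = np.linspace(-1.0, 1.0, 200)
y = x**3 - 0.5 * x

# Assumed sensitivity rule: labels above an arbitrary threshold are sensitive.
sensitive = y > 0.2

# The "public" dataset keeps non-sensitive labels intact and perturbs the rest.
y_public = selective_noise(y, sensitive, noise_scale=0.5, rng=rng)

# Fit a polynomial regression model on the public (partially noised) data.
coeffs = np.polyfit(x, y_public, deg=3)
y_pred = np.polyval(coeffs, x)

# Compare error against the true labels in each region.
rmse_sensitive = np.sqrt(np.mean((y_pred[sensitive] - y[sensitive]) ** 2))
rmse_other = np.sqrt(np.mean((y_pred[~sensitive] - y[~sensitive]) ** 2))
print(f"RMSE on sensitive region:     {rmse_sensitive:.3f}")
print(f"RMSE on non-sensitive region: {rmse_other:.3f}")
```

Comparing the two error values for different noise scales gives a rough sense of how much the noise degrades predictions in the sensitive region relative to the rest of the data.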
Read the paper
@article{campbell2023censoring,
title={Censoring chemical data to mitigate dual use risk},
author={Quintina L. Campbell and Jonathan Herington and Andrew D. White},
year={2023},
eprint={2304.10510},
archivePrefix={arXiv},
primaryClass={cs.LG}
}