Bayesian optimization for machine learning model compromise detection

nphdang/BCD

BCD: Bayesian Optimization for Compromise Detection

This is the implementation of the BCD method in the paper "Detection of Compromised Models Using Bayesian Optimization", AI 2019: https://link.springer.com/chapter/10.1007/978-3-030-35288-2_39

Introduction

A developer implements a machine learning model locally and then deploys it to the cloud. The "cloud" model provides machine learning services (e.g., image classification) to end users. However, hosting a model in the cloud carries the risk that an attacker alters it for malicious purposes, e.g., through a trojan attack or a poisoning attack. Our goal is to use the "local" model and its training data to generate a sensitive sample that can detect whether the "cloud" model has been modified or compromised. We formalize the search for a sensitive sample as an optimization problem in which the sensitive sample (e.g., an image) maximizes the difference in prediction between the "local" model and the "cloud" model.
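To make the detection idea concrete, here is a minimal sketch of how a sensitive sample could be used once it exists: query both models on it and compare the outputs. The function names, the L1 distance, and the tolerance `tol` are illustrative assumptions, not taken from the repository or the paper.

```python
def prediction_gap(local_probs, cloud_probs):
    """L1 distance between two class-probability vectors (assumed metric)."""
    if len(local_probs) != len(cloud_probs):
        raise ValueError("prediction vectors must have the same length")
    return sum(abs(p - q) for p, q in zip(local_probs, cloud_probs))


def is_compromised(local_probs, cloud_probs, tol=1e-3):
    """Flag the cloud model when its output on the sensitive sample
    deviates from the local model's output by more than tol (hypothetical threshold)."""
    return prediction_gap(local_probs, cloud_probs) > tol


# Hypothetical outputs of the two models on the same sensitive sample.
local_out = [0.7, 0.2, 0.1]
cloud_out = [0.1, 0.2, 0.7]
flagged = is_compromised(local_out, cloud_out)
```

A sample is "sensitive" in this sense when even a small change to the model produces a gap well above the tolerance, which is what the optimization problem above is trying to achieve.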

We propose BCD (Bayesian Optimization for Compromise Detection) to solve this optimization problem. The method has two main steps: (1) train a generative model (a Variational AutoEncoder, VAE) to transform the high-dimensional data space into a non-linear low-dimensional space, and (2) use Bayesian optimization to find the optimal sensitive sample.
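The two steps can be sketched on a toy problem. This is not the repository's implementation: `decode` is a hand-written stand-in for a trained VAE decoder, `local_model` and `cloud_model` are hypothetical one-output models, and the optimizer is a small Gaussian-process loop with an upper-confidence-bound acquisition maximized over random candidates. All names and hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two latent points."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i, bi in enumerate(b):
        y.append((bi - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def backward_sub(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(Z, y, jitter=1e-4):
    """Factorize the kernel matrix and precompute alpha = K^{-1} y."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(Z)]
         for i, a in enumerate(Z)]
    L = cholesky(K)
    return L, backward_sub(L, forward_sub(L, y))


def predict(Z, L, alpha, z):
    """GP posterior mean and standard deviation at latent point z."""
    k_star = [rbf(a, z) for a in Z]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(L, k_star)
    var = max(1.0 - sum(vi * vi for vi in v), 0.0)  # rbf(z, z) == 1
    return mean, math.sqrt(var)


def decode(z):
    """Toy stand-in for a VAE decoder: 2-D latent point -> 3 'pixels'."""
    return [math.tanh(z[0]), math.tanh(z[1]), math.tanh(z[0] - z[1])]


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def local_model(x):
    """Hypothetical local model."""
    return sigmoid(1.0 * x[0] - 2.0 * x[1] + 0.5 * x[2])


def cloud_model(x):
    """Hypothetical tampered copy: the last weight has been changed."""
    return sigmoid(1.0 * x[0] - 2.0 * x[1] + 2.5 * x[2])


def objective(z):
    """Prediction gap between the two models on the decoded sample."""
    x = decode(z)
    return abs(local_model(x) - cloud_model(x))


def bayes_opt(f, dim=2, n_init=5, n_iter=15, n_cand=200, bound=3.0, kappa=2.0, seed=0):
    """Maximize f over [-bound, bound]^dim with a GP surrogate and UCB."""
    rng = random.Random(seed)

    def draw():
        return [rng.uniform(-bound, bound) for _ in range(dim)]

    Z = [draw() for _ in range(n_init)]
    y = [f(z) for z in Z]
    for _ in range(n_iter):
        y_mean = sum(y) / len(y)
        L, alpha = fit_gp(Z, [v - y_mean for v in y])  # centered targets

        def ucb(z):
            mean, std = predict(Z, L, alpha, z)
            return mean + kappa * std

        z_next = max((draw() for _ in range(n_cand)), key=ucb)
        Z.append(z_next)
        y.append(f(z_next))
    best = max(range(len(y)), key=y.__getitem__)
    return Z[best], y[best], y


z_best, gap_best, history = bayes_opt(objective)
sensitive_sample = decode(z_best)
```

Searching over the low-dimensional `z` instead of the raw sample keeps the surrogate model small, and decoding `z_best` yields the candidate sensitive sample. In a real setting the decoder would come from the trained VAE and the objective from actual model queries.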

Detection rate on the Olivetti dataset

[Figure: detection rate]

Sensitive samples generated by our method

[Figures: sensitive samples 1 and 2]

How to run

  1. Run "python 1_nn_vae" to train the local model (a neural network) and a VAE that transforms the high-dimensional data space into a low-dimensional space.
  2. Run "python 2_attack_detect" to compare the detection rates of three methods: a random training image (Random), a local optimization method (VerIDeep), and our method (BCD).
  3. Run "python 3_plot_result" to plot the results: the detection rates and the sensitive images.
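The three steps above can also be chained from a small driver. The sketch below is hypothetical and not part of the repository; the script names are copied verbatim from the steps above, so they may need adjusting (for example a file extension) to match the actual files.

```python
import subprocess
import sys

# Script names copied verbatim from the numbered steps; adjust if the files differ.
STEPS = ["1_nn_vae", "2_attack_detect", "3_plot_result"]


def build_commands(python=sys.executable, steps=STEPS):
    """Return one argv list per step, in execution order."""
    return [[python, step] for step in steps]


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        print("running:", " ".join(argv))
        subprocess.run(argv, check=True)  # raises CalledProcessError on non-zero exit
```

Calling `run_pipeline(build_commands())` from the repository root would execute the steps in sequence. Because each step depends on the output of the previous one, `check=True` stops the run as soon as a step fails.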

Reference

Deepthi Kuttichira, Sunil Gupta, Dang Nguyen, Santu Rana, Svetha Venkatesh (2019). Detection of Compromised Models Using Bayesian Optimization. AI 2019, Adelaide, Australia. Springer LNCS, vol. 11919, pp. 485-496.