
Adversarial Machine Learning

Table of Contents

  • Conferences
  • Papers
  • Blogs
  • Github
  • Summary

Conferences

  • Evading Machine Learning Malware Detection, Anderson et al., Blackhat US 2017

    Uses reinforcement learning to evade a machine learning malware detection model without breaking the sample's functionality.

  • EvadeML, Weilin Xu, USENIX Enigma 2017

    Uses a genetic algorithm to generate evasive samples that bypass a PDF malware detection model (see the evasion-loop sketch after this list).
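Both talks boil down to the same black-box loop: mutate a sample with functionality-preserving transformations, query the detector, and keep the variants that score least malicious. Below is a minimal sketch of the genetic-algorithm version; `score_malicious` and `mutate` are hypothetical placeholders for a real PDF classifier and its transformation operators, not code from either project.

```python
import random

# Toy genetic-algorithm evasion loop. score_malicious() and mutate() are
# hypothetical stand-ins for a real detector and for functionality-preserving
# transformations (e.g. inserting benign PDF objects).

def score_malicious(sample):
    return random.random()          # placeholder detector score in [0, 1]

def mutate(sample):
    return sample + "?"             # placeholder transformation

def evolve(seed, pop_size=40, generations=100, threshold=0.5):
    population = [mutate(seed) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted((score_malicious(s), s) for s in population)
        best_score, best = scored[0]
        if best_score < threshold:
            return best                                # detector evaded
        parents = [s for _, s in scored[: pop_size // 2]]
        population = parents + [mutate(random.choice(parents)) for _ in parents]
    return None                                        # no evasive variant found
```

A real pipeline (as in EvadeML) also queries an oracle such as a sandbox to confirm that each variant still exhibits the malicious behavior; the reinforcement-learning approach replaces the random mutation choice with a learned policy over the same transformations.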

Papers

General

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods, Carlini and Wagner, AISec 2017

In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses.
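The "new loss function" trick is simple to state: fold the detector's output into the attack objective so gradient descent fools the classifier and the detector at once. The sketch below uses toy linear models and an Adam-optimized perturbation; it illustrates the idea only and is not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

# Toy combined-loss attack: perturb x so the classifier predicts the target
# class while the detector's "adversarial" logit stays low. Both models are
# illustrative stand-ins, not the defenses evaluated in the paper.

classifier = torch.nn.Linear(20, 10)        # logits over 10 classes
detector = torch.nn.Linear(20, 1)           # logit > 0 means "flag as adversarial"

x = torch.randn(1, 20)
target = torch.tensor([3])
delta = torch.zeros_like(x, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)

for _ in range(200):
    adv = x + delta
    loss = (F.cross_entropy(classifier(adv), target)   # fool the classifier
            + F.softplus(detector(adv)).mean()         # evade the detector
            + 0.1 * delta.norm())                      # keep the change small
    opt.zero_grad(); loss.backward(); opt.step()
```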

Generative Adversarial Nets, Goodfellow et al., NIPS 2014

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake.
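The minimax game reduces to two alternating gradient steps. Below is a minimal sketch on a synthetic 2-D "data distribution"; the architectures and hyperparameters are illustrative, not those of the paper.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # sample -> "real" logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) + 3.0                 # stand-in "data distribution"
    fake = G(torch.randn(64, 8))

    # D step: learn to tell real samples from generated ones.
    loss_d = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # G step: maximize the probability of D making a mistake on fakes.
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```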

Attack

Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN, Hu and Tan, 2017

Machine learning has been used to detect new malware in recent years, while malware authors have strong motivation to attack such algorithms. Malware authors usually have no access to the detailed structures and parameters of the machine learning models used by malware detection systems, and therefore they can only perform black-box attacks. This paper proposes a generative adversarial network (GAN) based algorithm named MalGAN to generate adversarial malware examples, which are able to bypass black-box machine learning based detection models. MalGAN uses a substitute detector to fit the black-box malware detection system. A generative network is trained to minimize the generated adversarial examples' malicious probabilities as predicted by the substitute detector. MalGAN's advantage over traditional gradient-based adversarial example generation algorithms is that it can decrease the detection rate to nearly zero and renders retraining-based defenses against adversarial examples largely ineffective.
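A loose sketch of that training loop follows. The black-box detector, the feature dimension, and both networks are placeholders; in the spirit of the paper's setup, the generator can only add features (so functionality is preserved), the substitute detector is fit to the black box's labels, and gradients flow through a continuous relaxation of the generated features.

```python
import torch
import torch.nn as nn

DIM = 128                                    # binary API-call feature vector
gen = nn.Sequential(nn.Linear(DIM + 16, 256), nn.ReLU(),
                    nn.Linear(256, DIM), nn.Sigmoid())
sub = nn.Sequential(nn.Linear(DIM, 256), nn.ReLU(), nn.Linear(256, 1))
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_s = torch.optim.Adam(sub.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def black_box(x):                            # placeholder detection system
    return (x.sum(dim=1, keepdim=True) > DIM / 4).float()

for step in range(500):
    mal = (torch.rand(32, DIM) > 0.7).float()
    noise = torch.rand(32, 16)
    probs = gen(torch.cat([mal, noise], dim=1))
    adv_soft = torch.max(mal, probs)                   # differentiable relaxation
    adv_hard = torch.max(mal, (probs > 0.5).float())   # what the black box sees

    # Substitute detector fits the black-box system's labels.
    loss_s = bce(sub(adv_hard), black_box(adv_hard))
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

    # Generator minimizes the substitute's predicted malicious probability.
    loss_g = bce(sub(adv_soft), torch.zeros(32, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```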

Defence

Theory

Blogs

Github

Several deep learning approaches have recently been tried for detecting malware binaries using convolutional neural networks and stacked deep autoencoders. Although they show respectable performance on large datasets, practical defense systems require precise detection during malware outbreaks, when only a handful of samples are available. This project demonstrates the effectiveness of the latent representations obtained through an adversarial autoencoder for malware outbreak detection, which solves a fine-grained unsupervised clustering problem on a limited number of unlabeled samples. Using program code distributions mapped to semantic latent vectors, the model provides a highly effective neural signature that helps detect newly arrived malware samples mutated with minor functional upgrades, function shuffling, or slightly modified obfuscations. To obtain the number of clusters, the latent representation z is fed to a density-based clustering algorithm such as HDBSCAN.
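The clustering step is nearly a one-liner once the autoencoder is trained. In the sketch below, `encode` is a hypothetical stand-in for the trained encoder, and the `hdbscan` package (assumed installed) picks the number of clusters on its own.

```python
import numpy as np
import hdbscan   # pip install hdbscan

def encode(samples):
    # Hypothetical stand-in: map program-code features to latent vectors z.
    return np.random.randn(len(samples), 32)

z = encode(list(range(500)))
labels = hdbscan.HDBSCAN(min_cluster_size=5).fit_predict(z)
print("clusters:", labels.max() + 1, "| noise points:", int((labels == -1).sum()))
```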

CleverHans: an adversarial example library for constructing attacks, building defenses, and benchmarking both.

This is a library dedicated to adversarial machine learning. Its purpose is to allow rapid crafting and analysis of attack and defense methods for machine learning models. The Adversarial Robustness Toolbox provides implementations of many state-of-the-art methods for attacking and defending classifiers.
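A typical workflow, sketched for the PyTorch backend with a toy model (class and attack names are from ART 1.x; check them against the version you install):

```python
import numpy as np
import torch.nn as nn
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

# Wrap a (toy) model so ART's attacks and defenses can drive it.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
classifier = PyTorchClassifier(model=model, loss=nn.CrossEntropyLoss(),
                               input_shape=(1, 28, 28), nb_classes=10)

x = np.random.rand(4, 1, 28, 28).astype(np.float32)   # stand-in input batch
x_adv = FastGradientMethod(estimator=classifier, eps=0.1).generate(x=x)
```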

Deep-pwning is a lightweight framework for experimenting with machine learning models with the goal of evaluating their robustness against a motivated adversary.

pix2pix: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
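The core of the approach is the objective: the discriminator judges (input, output) pairs, and the generator loss mixes the adversarial term with an L1 reconstruction term. A stripped-down sketch with stand-in networks (single conv layers, not the paper's U-Net and PatchGAN):

```python
import torch
import torch.nn.functional as F

G = torch.nn.Conv2d(3, 3, 3, padding=1)     # stand-in generator: input -> output image
D = torch.nn.Conv2d(6, 1, 3, padding=1)     # stand-in discriminator over (input, output)

x = torch.randn(2, 3, 64, 64)               # e.g. edge maps or label maps
y = torch.randn(2, 3, 64, 64)               # corresponding real photos

fake = G(x)
d_fake = D(torch.cat([x, fake], dim=1))     # condition D on the input image
adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
loss_g = adv + 100.0 * F.l1_loss(fake, y)   # adversarial + lambda * L1 (lambda = 100 in the paper)
```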

Summary