GAN Cryptosystem

Overview

This repo contains a PyTorch implementation of the Google Brain paper Learning to Protect Communications with Adversarial Neural Cryptography [1], which introduces a novel deep learning architecture for learning a symmetric encryption system.

The system is organized as three parties, Alice, Bob, and Eve, in accordance with the classic symmetric cryptosystem problem introduced by Rivest, et al. [2]. Alice and Bob wish to communicate securely, and Eve wishes to eavesdrop on their communication. The desired security property is secrecy (not integrity): Eve can intercept the communication but can do nothing more than eavesdrop.

To model this problem we introduce the system displayed in Figure 1. Each of the three parties is itself a neural network with a different objective. Alice and Bob wish to communicate with maximal clarity while also maximally hiding their communication from Eve, the eavesdropper in the system. To communicate with Bob, Alice sends a confidential message P (the plaintext). P is an input to the Alice network along with the key K; Alice processes P and K and produces the ciphertext C. Both the Bob and Eve networks receive C and attempt to recover P, the original message from Alice. The Bob and Eve networks output P_Bob and P_Eve, respectively. Alice and Bob share the same secret key K, which gives them an advantage over Eve. The secret key K is regenerated for each plaintext.

The objective of each network is as follows. Eve's goal is to reconstruct P accurately, i.e. to minimize the error between P and P_Eve. Alice and Bob's goal is to communicate clearly, minimizing the error between P and P_Bob, while also hiding their communication from Eve. Because Eve's objective directly opposes Alice and Bob's, this problem is a natural candidate for adversarial training, through which Alice and Bob discover a cryptosystem that achieves their objectives.

To state each network's objective more formally:

  • Alice network: C = A(θ_A, P, K)
  • Bob network: P_Bob = B(θ_B, C, K)
  • Eve network: P_Eve = E(θ_E, C)
  • L1 distance: d(P, P′) = Σ_{i=1}^{N} |P_i − P′_i|, where N is the length of the plaintext
  • Bob's reconstruction error: L_B(θ_A, θ_B, P, K) = d(P, B(θ_B, A(θ_A, P, K), K))
  • Eve's reconstruction error: L_E(θ_A, θ_E, P, K) = d(P, E(θ_E, A(θ_A, P, K)))
  • Loss for Alice and Bob: L_AB(θ_A, θ_B) = L_B(θ_A, θ_B, P, K) − L_E(θ_A, θ_E, P, K); this combination reflects that Alice and Bob want to minimize Bob's reconstruction error while maximizing Eve's reconstruction error (see the sketch after this list).
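
As a rough illustration, the reconstruction errors above can be computed directly in PyTorch. This is only a sketch: the tensor names, shapes, and the random stand-ins for the network outputs are assumptions for the example, not code taken from this repo.

```python
import torch

def l1_distance(p: torch.Tensor, p_hat: torch.Tensor) -> torch.Tensor:
    """d(P, P') = sum_i |P_i - P'_i|, averaged over the mini-batch."""
    return torch.abs(p - p_hat).sum(dim=1).mean()

# Hypothetical mini-batch of 4 plaintexts of N = 16 bits, mapped to {-1, 1}.
batch_size, n_bits = 4, 16
p = torch.randint(0, 2, (batch_size, n_bits)).float() * 2 - 1

# In the real system these come from Bob(C, K) and Eve(C); random stand-ins here.
p_bob = torch.randint(0, 2, (batch_size, n_bits)).float() * 2 - 1
p_eve = torch.randint(0, 2, (batch_size, n_bits)).float() * 2 - 1

loss_bob = l1_distance(p, p_bob)        # L_B: Bob's reconstruction error
loss_eve = l1_distance(p, p_eve)        # L_E: Eve's reconstruction error
loss_alice_bob = loss_bob - loss_eve    # L_AB: Alice and Bob's adversarial loss
```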

Files in this repo

  • Main interface to the program (model training and inference): main.py
  • Definition of the Mix and Transform architecture: models.py
  • Paths, plaintext/key generation, encoding/decoding between UTF-8 and binary: utils.py
  • Plots of training results: plots.ipynb

Dependencies

  • python 3
  • PyTorch 0.4.0

How to run

  • python3 src/main.py -h: get a list of all command line arguments
  • python3 src/main.py: train the model with default command line arguments
  • python3 src/main.py --run_type inference: run model inference

Network Details

The network architecture introduced by Abadi, et al. [1] is known as the Mix and Transform architecture. All binary-encoded plaintext bits are mapped to [-1, 1]. Alice and Bob each consist of one fully connected layer of size 2N x 2N, where N is the length of the message in bits. The fully connected layer is followed by four 1D convolutional layers with filter sizes [4, 2, 1, 1], input channels [1, 2, 4, 4], and output channels [2, 4, 4, 1]. The strides of the 1D convolutions, by layer, are [1, 2, 1, 1]. "Same"-style padding is used for all convolutional layers so that, apart from the stride-2 layer, input and output dimensions match. The activation function at each layer is the sigmoid, except for the final layer, which uses a tanh to bring values back into the range [-1, 1] so they can be mapped to binary values. Eve uses essentially the same architecture, except that her fully connected layer has dimensions N x 2N because she receives only C. Note that P and K are vectors of the same size here; however, there is no reason that K has to be the same size as P. P and K are generated from a uniform distribution, with values mapped from [0, 1] to [-1, 1]. All network parameters are randomly initialized.
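
The sketch below shows one way to express this architecture in PyTorch. It is not the repo's models.py; the class name, the padding values chosen to approximate "same" convolution, and the hyperparameters are assumptions made for illustration, consistent with the description above.

```python
import torch
import torch.nn as nn

class MixAndTransform(nn.Module):
    """Sketch of the Mix and Transform network (not the repo's models.py).

    in_features is 2N for Alice/Bob (plaintext/ciphertext concatenated with the key)
    and N for Eve (ciphertext only), where N is the message length in bits.
    """

    def __init__(self, in_features: int, hidden: int):
        super().__init__()
        self.fc = nn.Linear(in_features, hidden)
        # 1D conv stack: filter sizes [4, 2, 1, 1], strides [1, 2, 1, 1],
        # output channels [2, 4, 4, 1]; padding chosen to mimic "same" convolution.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 2, kernel_size=4, stride=1, padding=2), nn.Sigmoid(),
            nn.Conv1d(2, 4, kernel_size=2, stride=2, padding=0), nn.Sigmoid(),
            nn.Conv1d(4, 4, kernel_size=1, stride=1, padding=0), nn.Sigmoid(),
            nn.Conv1d(4, 1, kernel_size=1, stride=1, padding=0), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.sigmoid(self.fc(x))    # mix the inputs (plaintext/ciphertext and key)
        x = x.unsqueeze(1)               # add a channel dimension for Conv1d
        return self.conv(x).squeeze(1)   # values back in [-1, 1], length N

# Example usage with hypothetical sizes: Alice maps (P, K) of length 2N to C of length N.
n = 16
alice = MixAndTransform(in_features=2 * n, hidden=2 * n)
p = torch.randint(0, 2, (4, n)).float() * 2 - 1
k = torch.randint(0, 2, (4, n)).float() * 2 - 1
c = alice(torch.cat([p, k], dim=1))      # ciphertext, shape (4, n)
```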

The optimization strategy is mini-batch gradient descent with the Adam optimizer. We want to approximate an optimal Eve, so training alternates between Alice-Bob updates and Eve updates, with Eve trained on two mini-batches per step to give the adversary an advantage (see the sketch below).
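
A minimal sketch of one alternating training step is shown below. It reuses l1_distance and MixAndTransform from the earlier sketches; the optimizer setup, batch size, and step count are illustrative assumptions, not the repo's main.py.

```python
import torch

n, batch_size = 16, 256
alice = MixAndTransform(in_features=2 * n, hidden=2 * n)  # input: P ++ K
bob   = MixAndTransform(in_features=2 * n, hidden=2 * n)  # input: C ++ K
eve   = MixAndTransform(in_features=n,     hidden=2 * n)  # input: C only

opt_ab  = torch.optim.Adam(list(alice.parameters()) + list(bob.parameters()))
opt_eve = torch.optim.Adam(eve.parameters())

def sample_batch():
    # Plaintext and key bits drawn uniformly and mapped from {0, 1} to {-1, 1};
    # a fresh key is generated for every plaintext.
    p = torch.randint(0, 2, (batch_size, n)).float() * 2 - 1
    k = torch.randint(0, 2, (batch_size, n)).float() * 2 - 1
    return p, k

for step in range(15000):
    # Alice-Bob update: minimize L_AB = L_B - L_E on one mini-batch.
    p, k = sample_batch()
    c = alice(torch.cat([p, k], dim=1))
    loss_ab = l1_distance(p, bob(torch.cat([c, k], dim=1))) - l1_distance(p, eve(c))
    opt_ab.zero_grad()
    loss_ab.backward()
    opt_ab.step()

    # Eve update: minimize L_E on two mini-batches per step to favor the adversary.
    for _ in range(2):
        p, k = sample_batch()
        c = alice(torch.cat([p, k], dim=1)).detach()  # do not update Alice here
        loss_eve = l1_distance(p, eve(c))
        opt_eve.zero_grad()
        loss_eve.backward()
        opt_eve.step()
```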

Results

Bob and Eve accomplish their training goals within approximately 15,000 training steps. Bob's reconstruction error continues to improve, while Eve's reconstruction error remains only one or two bits better than random guessing. This shows that Bob's error is being minimized while Eve's error is being maximized.

In the figure above it can be observed that Alice and Bob's total training error improves throughout the training process. These findings are consistent with the results published in the original paper by Abadi, et al. [1].

Limitations of the Mix and Transform Architecture

The symmetric cryptosystem proposed by Abadi, et al. [1] shows very promising results for encrypting binary-encoded plaintext messages; however, there are several implementation details that prevent this system from having practical application. Working with binary-encoded plaintext is problematic because a single incorrectly reconstructed bit can produce an invalid binary string or completely change the meaning of the original message. In the realm of deep learning, working with bits does not provide the computational advantage it does for classic encryption algorithms; the neural networks proposed in this system could therefore operate over tokenized plaintext rather than binary-encoded plaintext with the same computational complexity. In future work, to improve the practical applicability of a deep-learning-based symmetric cryptosystem, the input to the system could be tokenized plaintext or word-embedding vectors.

References

  1. Martín Abadi and David G. Andersen. Learning to Protect Communications with Adversarial Neural Cryptography. arXiv:1610.06918, October 2016.
  2. R. L. Rivest, A. Shamir, and L. Adleman. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, 21(2), 1978.
