👥 A Python post-quantum cryptography library
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.
attacks updated structure Jul 15, 2016
crypto added cli functionality Jul 16, 2016
operations discrepancy Jul 17, 2016
README.md updated description Sep 9, 2016
__init__.py updated structure Jul 15, 2016
pqp added cli functionality Jul 16, 2016
test.py refresh Jul 16, 2016


libPQP - a Python post-quantum library

Update 8 September 2016: Current version of libPQP is deprecated due to a newly published attack. Possible mitigations:

  • Forward-secrecy method where the secret key and corresponding public key gets updated over short intervals.
  • Automatically reject any message which is decoded with higher error rate than some constant t, where t implies a negligable decoding error.

This is a simplistic prototype of a post-quantum cryptography library in Python. PQP stands for Post-Quantum PGP. The library is not production ready and should not be used in a real-life context, but works fine for testing purposes. The plan is, once the code has been audited, to translate it to Javascript and create a webapp.

In this prototype, the focus has mainly been on making the QC-MDPC part efficient and not the actual protocol. Hence, you may find vulnerabilities in the current implementation of the protocol. Also, the primitives used in the code are not the ones mentioned below. This prototype uses:

  • AES-256(m, k, iv) as symmetric cipher,
  • SHA-256(token + salt) as PBKDF2,
  • A truncated SHA-512(token + salt) for iv.

The final product will use Salsa-20 as symmetric-cipher primitive and Poly1305 for authentication purposes. Moreover, PBKDF2 or similar will be used for symmetric-key generation.

Speed-ups in the decoding use the fast fourier transform (FFT) to achieve O(n log n) complexity in modular polynomial multiplications, instead of O(n²). Because the FFT implementation in Numpy is restricted to certain lengths (multiples of powers of 2), we use pyfftw which is a wrapper for FFTW3. FFTW3 implements Winograd's FFT algoritm and supports prime-length blocks. See below for known vulnerabilities.

Below are given the proposed parameters for rate R = 1/2.

Public-key size Private-key size Rate Error weight Bit security
4801 9602 1/2 84 80
9857 19714 1/2 134 128
32771 65542 1/2 264 256

Since the encrypted token is a codeword of length 9602 (for 80-bit security), we add approximately 1200 bytes of data to the ciphertext. Apart from this, a 32-byte MAC is included. This inflates a (padded) message of size M to size 1232 + M. For higher security levels, the inflation will be larger — but still constant. In the DER format, the inflation is about 35 %.

What is post-quantum cryptography?

Today, most security assertions depend on primitives based on number theory. In principle, all of these primitives stand and fall on the problem of factoring integers. In the 1980's, a theoretical model of a computer that exploits certain quantum mechanical effects to achieve better complexity in certain classes of problems (BQP) was proposed. It so happens that the problem of factoring integers is contained in this class. Such a computer, called a quantum computer, can factor any integer N in time polynomial to the number of bits in N. This poses an actual problem for the security industry because RSA, ECC and DH, to name a few, can be broken efficiently. In turn, such a break causes the whole security architecture, upon which the secure internet is built, to collapse. Even symmetric primitives, such as AES, are subject to quantum attacks. However, the impact is much less severe; the speed-up in attacks is only a square-root factor.

To remedy the problem of quantum attacks, post-quantum cryptography was proposed. There has been many candidates, often based on so-called NP-complete problems. One such candidate is McEliece public-key cryptosystem, which is based on a hard problem called random linear decoding. Most of the linear-decoding problems are hard, but we should point out that some linear codes with special properties are very easy to decode. McEliece PKC extends this idea by defining such an easily decodable linear code, which becomes the private key. Then, a scrambled (random-looking) version of the linear code is used as public key. This faces the holder of the private key with an easy problem, but the attacker faces a hard problem.

1. Generate a linear code with generator matrix G (usually a Goppa code).
2. Compute G' = S × G × P, where S is an invertible matrix and P a permutation matrix.
3. Return keypair G, G'

Suppose Bob wants to send Alice a message m. To encrypt, he does the following:

1. Retrieves Alice's public key G'.
2. Compute ciphertext c = m × G' + e and send it to Alice

Note that these operations require basically no work at all, so encryption is very fast. Now Alice receives the ciphertext c.

1. Alice obtains c. Knowing S, G and P, she computes u = c × inv(P) = m × (S × G) = (m × S) × G + e × inv(P).
2. Using the generator matrix G (which defined an efficiently decodable code), she can decode m × S.
3. Having c' = m × S, the message is formed by removing the S, i.e., c' × inv(S) = m × S × inv(S) = m.

The QC-MDPC McEliece is a bit different, but the principle is the same. Instead of using matrices, it operates on polynomials (or, equivalently, circular matrices). Here, the private key consists of two sparse polynomials H₀ and H₁, which can be used to perform efficient decoding. H₀ and H₁ form the private key. The public key is H₀ × inv(H₁), which is not sparse and (presumably) cannot be used for efficient decoding. Encryption and decryption is done is similar ways to the above.

High-level description

The sender side

In this section, we will briefly describe the protocol. Much like a Fujisaki-Okamoto transform, it contains both an asymmetric part and a symmetric one. Consider the following scenario. Assume that Bob wants to send Alice a message. Denote Alice's keypair (pubkey, privkey). Bob takes the following steps:

1. Bob picks a random token T.
2. He then uses Alice's public key denoted pubkey and encrypts the token T using QC-MDPC McEliece.
3. The token T is used to generate the symmetric key k₁ and the MAC key k₂ (PBKDF2).
4. The error vector used in the second step is concatenated with the message and a MAC is generated using k₂.
4. The message and the MAC are then encrypted with the symmetric key k₁.
5. The ciphertext is the concatenation of the encrypted token and encrypted message + MAC.

The ciphertext can now be distributed to Alice, using arbitrary means of communication. Below is a graphical interpretation of the above steps.

protocol sender

The receiver end

Now Alice wants to decrypt the message sent by Bob. She performs the following steps in order to do so:

1. Alice decrypts the encrypted token T using her private key privkey. In the decryption, the error vector is determined.
2. Using the decrypted token T, she derives the same symmetric key k₁ as Bob and decrypts the message.
3. The message and the MAC are extracted.
4. The MAC of the message and error vector is verified using the key k₂ derived from the token.
5. If the verification returns True, Alice accepts the message. Otherwise she rejects it.

This completes the outline of the protocol. Below is a graphical interpretation of the above steps.

protocol receiver

Key format

The keys are encoded in Base64 and ASN.1, just like the normal ascii-armored keys used in public-key cryptography. The public-key structures contains only the generator polynomial G.

class ASN1PublicKey(pyasn1.type.univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('G',     pyasn1.type.univ.BitString())

A typical (encoded) public key has the following appearance:


The private key contains the two secret-key polynomials H₀ and H₁. Because inversion takes quite a while to perform, the inverse of H₁ is also contained in the private-key structure. This is not necessary, since inv(H₁) is only used when deriving the public key.

class ASN1PrivateKey(pyasn1.type.univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('H0',    pyasn1.type.univ.BitString()),
        namedtype.NamedType('H1',    pyasn1.type.univ.BitString()),
        namedtype.NamedType('H1inv', pyasn1.type.univ.BitString()),

Below is a typical private key given. As we can see, the elements H₀ and H₁ are very sparse. These polynomials could be encoded more efficiently, but we don't care too much about private-key size and is therefore left as is.


The messages are structured in a similar way. The encrypted token (C₀, C₁) is stored in two separate Bitstrings and the symmetric ciphertext are stored as an OctetString.

 class ASN1Ciphertext(pyasn1.type.univ.Sequence):
     componentType = namedtype.NamedTypes(
         namedtype.NamedType('C0',    pyasn1.type.univ.BitString()),
         namedtype.NamedType('C1',    pyasn1.type.univ.BitString()),
         namedtype.NamedType('Sym',   pyasn1.type.univ.OctetString()),

We encrypted the message

this is a really secret message that is padded with some random.

and put it into the ASN.1 format. Below is the ciphertext.



Below is a graph of timings of my computer (Macbook Pro 15" Retina 2 GHz Intel Core i7) running 1000 decryptions of a small plaintext (thus, isolating the asymmetric timing) using the same private key (σ = 0.0386, μ = 0.493):


Here is the output of encrypting and decrypting a 190 mb video file:

$ time ./pqp --encrypt libpqp.mov --pubkey pub --output libpqp.enc
Encrypting: libpqp.mov

real   0m11.331s
user   0m9.293s
sys   0m1.715s

$ time ./pqp --decrypt libpqp.enc --privkey priv --output libpqp2.mov
Decrypting: libpqp.enc
MAC verified.

real   0m11.422s
user   0m8.633s
sys   0m2.446s

$ ls -la libpqp* 
-rw-r--r--  1 carl  staff  258900672 17 Jul 00:57 libpqp.enc
-rw-r--r--@ 1 carl  staff  191652442 16 Jul 23:22 libpqp.mov
-rw-r--r--@ 1 carl  staff  191652442 17 Jul 00:58 libpqp2.mov

$ md5 libpqp2.mov 
MD5 (libpqp2.mov) = 88ebe9d8aa74ebba7c5de6faa048af46

$ md5 libpqp.mov 
MD5 (libpqp.mov) = 88ebe9d8aa74ebba7c5de6faa048af46

Command-line tool

The library can be invoked by a supplied CLI tool (the pqp file). Padding is not included, so some kind of PKCS padding will be added.

To generate a keypair, run the following command:

pqp --gen [private-key file] [public-key file]

This creates two files in the same folder containing the two keys.

To encrypt a file, call pqp as follows:

pqp --encrypt [plaintext file] --pubkey [public-key file]

This writes an ASN.1 encoded ciphertext to stdout. Appending --output writes to file:

pqp --encrypt [plaintext file] --pubkey [public-key file] --outout [ciphertext file]

To decrypt, invoke the following command. To write to file, use the above methodology.

pqp --decrypt [ciphertext file] --privkey [private-key file]


Below are the required steps to make libPQP run on Linux (tested on Ubuntu 16.04 Xenial):

git clone https://github.com/grocid/libPQP.git
sudo pip install numpy
sudo pip install pycrypto
sudo apt-get install libfftw3-dev
sudo pip install pyFFTW
sudo pip install pyasn1

Possible vulnerabilities

Decryption oracle

The protocol can be designed using normal McEliece or Niederreiter. In case of McEliece, the error vector should be part of the authentication (for instance, generate MAC using a concatenation of message and error vector). Such a measure will mitigate the usual decryption oracle attack, described below.

1. Intercept an encrypted message.
2. Pick a random bit of the ciphertext.
3. Flip it. If decryption fails, this was not an error position.
4. Repeat until all error positions have been unraveled.

Obviously, there is an implicit assumption that the receiver will either reject any error larger than T or the decoder will fail (which rarely is the case).

If the protocol instead is designed using the Niederreiter model, the error vector will be/encode the token. In this case, there is no need to authenticate the error vector. Since any flipped bit in the ciphertext will cause the receiver to decode a different token, it will break the decryption oracle.

Timing attacks

This is a slight variation of the above. Instead of observing decryption errors, we measure the timing. There has been some effort in making decoding run in constant time. See this paper.

The decoding we use is probabilistic and susceptible to timing attacks. Below is an image with the timing attack implemented running 15 decryptions for each bit flipped.


However, in the PGP-like setting we do not worry too much about this.

Distinguishing attacks

The simplest imaginable distinguisher will detect a constant-error encryption with probability 1.

1. Pick a ciphertext block with block weight l and error weight w.
2. Sum all symbols mod 2 and check if it equals (l + w) mod 2.

The theory is described in more detail here. There is an easy counter-measure; we can thwart this attack completely by picking the error weight odd with probability 1/2:

1. Flip a balanced coin.
2. If the coin shows tails, pick a position at random and flip it.

This attack is contained in distinguisher.py.

Squaring/subcode attacks

Squaring attacks exploit that (the now deprecated) p = 4800 = 2⁶ × 75. By squaring the polynomial, the vector space decreases in size by a factor 2 (which can be done six times). It may also lead to collisions in the error vector, causing a decrease in error weight. This allows an attacker to go quite far below 80-bit security. See this paper.

This attack can be mitigated by picking a prime block length p. In the example above, p = 4801.

Academic papers

MDPC-McEliece: New McEliece Variants from Moderate Density Parity-Check Codes

Lightweight Code-based Cryptography: QC-MDPC McEliece Encryption on Reconfigurable Devices

Squaring attacks on McEliece public-key cryptosystems using quasi-cyclic codes of even dimension


Miroslav Kratochvil (creator of codecrypt) for pointing out a weakness in the protocol.