# Preparation

## Cleartext

We create a text file with some content. We will use that file as a byte sequence to be encrypted and decrypted.

Obviously, you can use any other kind of file: pdf, pptx, mp3, jpg, and so on. In that case you will need a different command to view the content of the file.

In [None]:
!echo "kakà, shevchenko, gullit, van basten" > top_secret.txt
!cat top_secret.txt

## Modifying a single byte in a file

Some of the examples below suggest modifying a single byte in a file and seeing what happens (in particular, in a file that contains an encrypted byte sequence or an encryption key).

A simple way to do that is the following code: just copy it and modify it as appropriate. It first copies file *tobemodified* to file *modified*; then it overwrites the first byte of *modified* with value 31 (hex).

In [None]:
!cp tobemodified modified
!printf '\x31' | dd of=modified bs=1 seek=0 count=1 conv=notrunc
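
The same operation can also be sketched in Python, which may be convenient when the offset or the value has to be computed. The helper name `overwrite_byte` and the demo file content are assumptions made for illustration; the demo runs in a temporary directory so that no file of yours is touched.

```python
import tempfile
from pathlib import Path


def overwrite_byte(src, dst, offset, value):
    """Copy src to dst, then replace the byte at `offset` with `value` (0-255)."""
    data = bytearray(Path(src).read_bytes())
    if not 0 <= offset < len(data):
        raise ValueError(f"offset {offset} is outside the file ({len(data)} bytes)")
    data[offset] = value
    Path(dst).write_bytes(bytes(data))


# Demo on throwaway files created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "tobemodified"
    dst = Path(tmp) / "modified"
    src.write_bytes(b"ABCDEF")
    overwrite_byte(src, dst, offset=0, value=0x31)  # 0x31 is the ASCII code of "1"
    print(src.read_bytes(), "->", dst.read_bytes())
```

Changing `offset` selects a different byte; the original file is never modified.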


## Man pages

OpenSSL is very powerful and, as such, quite complex. We will explore only a few of its many features.

The relevant man pages with all the (complex) options are:
*   Symmetric cryptography: [enc command](https://www.openssl.org/docs/man1.0.2/man1/openssl-enc).
*   Asymmetric cryptography: [pkeyutl command](https://www.openssl.org/docs/man3.0/man1/openssl-pkeyutl.html).

# Private key (symmetric) cryptography

## Encryption

The *openssl enc* command encrypts a specified input file (-in) and writes the result to a specified output file (-out). Many encryption algorithms (called ciphers) are available; here we use AES-CBC with a 256-bit key.

We generate the key from a *passphrase* (a password). There are several deterministic algorithms (*key derivation functions*) for deriving a key from a text string; here we use pbkdf2.
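
To get a feeling for what a key derivation function does, here is a minimal sketch based on `hashlib.pbkdf2_hmac` from the Python standard library. The digest (SHA-256), the iteration count, the salt length and the example passphrase are assumptions chosen for illustration; they are not meant to reproduce exactly what *openssl enc* does internally.

```python
import hashlib
import os


def derive_key(passphrase, salt, iterations=10_000, length=32):
    """Derive `length` bytes from a passphrase with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=length
    )


salt = os.urandom(8)  # random, non-secret value that must be kept for decryption
key = derive_key("correct horse battery staple", salt)
print(len(key) * 8, "bit key:", key.hex())

# Deterministic: the same passphrase and the same salt give the same key.
print(derive_key("correct horse battery staple", salt) == key)

# A different salt gives an unrelated key, even with the same passphrase.
print(derive_key("correct horse battery staple", os.urandom(8)) == key)
```

The salt is what makes two encryptions with the same passphrase use different keys.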

Note that the passphrase is not stored anywhere: *openssl enc* asks for it every time it encrypts or decrypts. To execute the command without involving the user, use the *-pass* option, for example *-pass pass:mysecret*.


In [None]:
!openssl enc -aes-256-cbc -pbkdf2 -in top_secret.txt -out top_secret.enc
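
When the command has to run without a prompt, one option is to pass the passphrase through an environment variable with `-pass env:NAME`, which keeps it out of the command line. The sketch below only builds the argument list and runs it if both `openssl` and the input file are present. The helper name `build_enc_command`, the variable name `DEMO_PASSPHRASE`, the output file name and the passphrase are hypothetical.

```python
import os
import shutil
import subprocess


def build_enc_command(infile, outfile, env_var, decrypt=False):
    """Return the argument list for a non-interactive `openssl enc` call."""
    cmd = [
        "openssl", "enc", "-aes-256-cbc", "-pbkdf2",
        "-pass", f"env:{env_var}",  # read the passphrase from this variable
        "-in", infile, "-out", outfile,
    ]
    if decrypt:
        cmd.insert(2, "-d")
    return cmd


cmd = build_enc_command("top_secret.txt", "top_secret_batch.enc", "DEMO_PASSPHRASE")
print(" ".join(cmd))

# Run only if the tool and the input file are actually available.
if shutil.which("openssl") and os.path.exists("top_secret.txt"):
    env = dict(os.environ, DEMO_PASSPHRASE="not a real passphrase")
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    print("exit status:", result.returncode)
```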

Print the encrypted file to make sure you do not understand anything (of course, the content is not merely "strange"; it does not provide any information about the original file, nor about the encryption key that has been used).

In [None]:
!cat top_secret.enc

You might want to have the encrypted data as a sequence of printable characters rather than as an arbitrary byte sequence. To do that, use the -a option (Base64 encoding) while encrypting (and while decrypting).

In [None]:
!openssl enc -aes-256-cbc -pbkdf2 -a -in top_secret.txt -out top_secret_encrypted.txt
!cat top_secret_encrypted.txt
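
The effect of a Base64 encoding can be seen in isolation with a few lines of Python. The random bytes below merely stand in for an encrypted byte sequence; this is an illustration of the encoding, not of *openssl* itself.

```python
import base64
import os

raw = os.urandom(24)  # stand-in for arbitrary (encrypted) bytes
text = base64.b64encode(raw).decode("ascii")

print("raw bytes:", raw)
print("as text  :", text)

# The encoding is reversible and is not a form of encryption.
print(base64.b64decode(text) == raw)
```

Every 3 bytes become 4 printable characters, so the textual form is about one third larger.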

## Decryption

Decryption from an input file to an output file is done with the same *openssl enc* command, with the -d option. The command will ask for the passphrase.

The encryption algorithm and key derivation function have to be specified again (since we did not use the defaults when encrypting).

If the -out option is omitted, the result of the decryption is printed on the standard output rather than written to a file (of course that makes sense only for a textual cleartext, not for a pdf or a jpg).

In [None]:
!openssl enc -aes-256-cbc -pbkdf2 -d -in top_secret.enc -out decrypted

In [None]:
!cat decrypted

# Practical considerations

## How to choose the encryption key

Deriving a key from a passphrase is simple but not very secure. The reason is that people tend to choose passphrases that are easy to remember, so the set of passphrases likely to be used in practice is much smaller than the set of possible keys. An attacker who has the ciphertext may therefore attempt to decrypt it with large dictionaries of "commonly used passphrases", trying one after the other. In practice, this kind of brute force attack may succeed.
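
The following sketch shows the idea of such a dictionary attack. It makes a simplifying assumption: the attacker can check a guess by comparing derived keys, whereas a real attacker would check whether a trial decryption succeeds. The function names, the word list, the salt and the low iteration count are all hypothetical and chosen to keep the demo fast.

```python
import hashlib


def pbkdf2_key(passphrase, salt, iterations=1_000):
    """Derive a 256-bit key from a passphrase (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32)


def dictionary_attack(target_key, salt, candidates):
    """Return the first candidate passphrase that yields target_key, else None."""
    for candidate in candidates:
        if pbkdf2_key(candidate, salt) == target_key:
            return candidate
    return None


salt = bytes(8)  # the salt is not secret, so the attacker is assumed to know it
victim_key = pbkdf2_key("letmein", salt)

common_passphrases = ["123456", "password", "qwerty", "letmein", "dragon"]
print(dictionary_attack(victim_key, salt, common_passphrases))
```

A higher iteration count makes every guess slower, but it cannot rescue a passphrase that appears in the attacker's dictionary.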

A more secure way to choose the key is to instruct *openssl* to choose one at random. In that case the chosen key will be a sequence of bits whose length depends on the specified encryption algorithm. The key will be stored by *openssl* in a file, and that file will have to be passed to the decryption procedure. The relevant options are more complex to use.
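
For comparison, this is what a randomly chosen key looks like when generated with the Python `secrets` module. The sizes assume AES-256 in CBC mode (256-bit key, 128-bit initialization vector); the snippet only illustrates random key material and is not a replacement for the *openssl* options.

```python
import secrets

key = secrets.token_bytes(32)  # 256 bits, the key length of AES-256
iv = secrets.token_bytes(16)   # 128 bits, the block size of AES

print("key:", key.hex())
print("iv: ", iv.hex())
```

*openssl enc* accepts a key and an initialization vector written in hexadecimal through its -K and -iv options.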


## Providing the wrong key

When decrypting, try to provide an incorrect passphrase. You will see that decryption of the input file fails.

This is an interesting and important property.

Decryption is a procedure that transforms an input byte sequence to an output byte sequence, where the transformation depends on a second input parameter, i.e., the decryption key. How can the decryption procedure determine whether you provided the correct decryption key (i.e., the same key that was used for encrypting)? The decryption procedure constructs a byte sequence: it has no way to tell whether that sequence is the correct one, because the encryption key is not stored anywhere.

Decryption is nevertheless able to detect whether the correct key is used, because every practical application of cryptography makes use of additional techniques and mathematical tools not discussed here. All that matters to us is that every practical application of cryptography has this important property, which we state very informally: **after decrypting with a certain key you can tell whether that key was the correct one or not**.


## Modifying an encrypted byte sequence

Modify the encrypted file (before doing that, make a backup copy). Then try to decrypt the modified encrypted file by providing the correct passphrase. You will see that decryption fails.

This is also an interesting and important property.

By similar reasoning, how can the decryption procedure tell that the byte sequence it has produced is not the correct one? In this case the problem is even subtler than the previous one, because the correct decryption key was used.

Decryption is able to detect that the encrypted file was modified, again because every practical application of cryptography makes use of additional techniques and mathematical tools not discussed here. All that matters to us is that every practical application of cryptography has this important property, which we state very informally: **secrecy is always associated with integrity; after decrypting with the correct key you can tell whether the byte sequence is identical to the one that was encrypted or not**.

## Cryptography in practice

The two properties discussed above apply to *every* practical application of cryptography. In particular, they apply to byte sequences implemented as *files* as well as to byte sequences implemented as *messages*.

Furthermore, they apply to private key (symmetric) cryptography as well as to public key (asymmetric) cryptography.

In summary, in practice we always have the following properties (that we state very informally):
*   whenever you decrypt something, you can tell whether you used the correct decryption key.
*   whenever you decrypt with the correct decryption key, you can tell whether the decrypted byte sequence is identical to the one that was encrypted (integrity).
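
One family of "additional techniques" that can provide both properties is the message authentication code: a short tag computed from the key and the data, stored next to the data and recomputed when reading it back. The sketch below uses HMAC-SHA256 from the Python standard library. It is an illustrative assumption, not a description of what *openssl enc* does internally, and the data is left unencrypted to keep the example focused on the check. The function names `protect` and `verify` are hypothetical.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def protect(key, data):
    """Append an authentication tag computed from key and data."""
    return data + hmac.new(key, data, hashlib.sha256).digest()


def verify(key, blob):
    """Return the data if the tag matches, otherwise raise ValueError."""
    data, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or modified data")
    return data


key = os.urandom(32)
blob = protect(key, b"kaka, shevchenko, gullit, van basten")
print(verify(key, blob))  # correct key, untouched data

tampered = bytes([blob[0] ^ 0x01]) + blob[1:]  # flip one bit of the first byte
for label, k, b in [("wrong key", os.urandom(32), blob), ("modified data", key, tampered)]:
    try:
        verify(k, b)
    except ValueError as err:
        print(label, "->", err)
```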


# Public key (asymmetric) cryptography

## Generate a keypair

Generate a 1024-bit public-private keypair and store it in a file called alice_private.pem (-out).

This file must be encrypted because it contains the private key of the keypair. The file is encrypted with symmetric cryptography (AES with a 128-bit key). We use a passphrase as encryption key (as described in the above sections).

Note that the passphrase is not stored anywhere.

Note also that the public-private keypair is not associated with any identity. The two keys are merely a pair of numbers without any implicit link to a specific person, organization, machine, or application: they are not associated with any *subject*.

In [None]:
!openssl genrsa -aes128 -out alice_private.pem 1024
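
To make the idea that a keypair is "merely a pair of numbers" concrete, here is a toy RSA keypair built from two tiny primes. The primes, the public exponent and the message are assumptions chosen so that everything fits on one line; numbers this small offer no security at all, and real encryption also applies padding to the message, which this sketch omits.

```python
# Toy RSA with tiny numbers, for illustration only (requires Python 3.8+).
p, q = 61, 53            # two secret primes
n = p * q                # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                   # public exponent
d = pow(e, -1, phi)      # private exponent: inverse of e modulo phi

message = 65             # a message must be a number smaller than n
ciphertext = pow(message, e, n)    # encrypt with the public key (n, e)
recovered = pow(ciphertext, d, n)  # decrypt with the private key (n, d)

print("public key :", (n, e))
print("private key:", (n, d))
print(message, "->", ciphertext, "->", recovered)
```

Nothing in `n`, `e` or `d` says who the keypair belongs to.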

Print the beginning of the file where the (encrypted) keypair is stored, just to have an idea of what it looks like.

In [None]:
!head alice_private.pem

## Extract the public key

Print the keypair in textual form on the standard output.

The command execution will ask for the passphrase.

In [None]:
!openssl rsa -in alice_private.pem -noout -text

Read the public key from the keypair file and store it in a file without the matching private key.

In [None]:
!openssl rsa -in alice_private.pem -pubout > alice_public.pem


Print the public key in human-readable form, by reading it from a public key file.

This time no passphrase is required.

In [None]:
!openssl rsa -in alice_public.pem -pubin -text -noout

## Encryption

Encrypt our example cleartext with the public key stored in alice_public.pem and store the encrypted byte sequence in file top_secret.enc.

The encrypted file can be decrypted only with the private key associated with the public key used for encryption.

In [None]:
!openssl rsautl -encrypt -inkey alice_public.pem -pubin -in top_secret.txt -out top_secret.enc

Print the encrypted file to make sure you do not understand anything (of course, the content is not merely "strange"; it does not provide any information about the original file, nor about the encryption key that has been used).

In [None]:
!cat top_secret.enc

## Decryption

Decrypt the file with the private key stored in alice_private.pem and store the decrypted byte sequence in file decrypted.txt. Of course, this time the passphrase is needed.

In [None]:
!openssl rsautl -decrypt -inkey alice_private.pem -in top_secret.enc > decrypted.txt

Print the content of decrypted.txt to make sure it is indeed the same as the original file. Of course, if you encrypted/decrypted a file of a different type (pdf, jpg,...) you will need a different program for inspecting or visualizing its content.

In [None]:
!cat decrypted.txt
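
For files that cannot be checked by eye (pdf, jpg, ...), a byte-by-byte comparison is more reliable than viewing the content. A minimal sketch with `filecmp` from the Python standard library follows; the helper name `same_content` is hypothetical, and the comparison runs only if both files from the steps above exist.

```python
import filecmp
import os


def same_content(path_a, path_b):
    """Return True if the two files contain exactly the same bytes."""
    return filecmp.cmp(path_a, path_b, shallow=False)


# Compare the original cleartext with the result of the decryption.
if os.path.exists("top_secret.txt") and os.path.exists("decrypted.txt"):
    print("identical:", same_content("top_secret.txt", "decrypted.txt"))
```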