# Asymmetric Encryption

Unlike symmetric encryption, which relies on a single shared password or key, asymmetric encryption splits the secret into a pair of keys: one used to encrypt (usually called the Public Key) and another used to decrypt (usually called the Private Key). As the names imply, the former can be shared publicly and the latter must be kept secret. This method is also known as [**Public-key cryptography**](https://en.wikipedia.org/wiki/Public-key_cryptography).

The keys are sequences of bytes that are generated together and are mathematically linked; together they are called a "Key Pair". Each party in the communication should have its own Key Pair and share its Public Key with the others. This means there is no need for a "secure channel" to exchange keys as in symmetric encryption, since Public Keys are shareable by design.

This type of encryption is used nowadays in many applications, ranging from the commonplace SSH protocol to the trendy Bitcoin transactions.

A party can generate as many Key Pairs as needed, meaning that, in case of a Private Key being compromised, a new Key Pair can be generated.
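A minimal sketch of what replacing a Key Pair implies, assuming the PyCA `cryptography` library used later in this chapter (the variable names are illustrative). The new pair is generated independently of the old one, so a message encrypted for the old Public Key cannot be read with the new Private Key.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

oaep = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

# Original Key Pair, assumed to have been compromised at some point
old_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
# Replacement Key Pair, generated independently of the old one
new_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# A message encrypted for the OLD public key
ciphertext_for_old_key = old_private_key.public_key().encrypt(b"Hello World!", oaep)

try:
    new_private_key.decrypt(ciphertext_for_old_key, oaep)
    new_key_reads_old_messages = True
except ValueError:
    # Decryption with an unrelated private key fails
    new_key_reads_old_messages = False

print(new_key_reads_old_messages)  # False
```

In practice this means that after rotating keys, the new Public Key has to be shared again with every party.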

## Public Key Encryption != Certificate Based Communication

One way to implement Public Key Encryption is through [**Certificates**](https://en.wikipedia.org/wiki/Public_key_certificate). A certificate is broader in scope than a Key Pair: the Key Pair is simply two sequences of bytes, whereas the certificate includes not only the Public Key but also information about the algorithm, the version, the subject, the issuer and so on.

Therefore, a Key Pair is enough to encrypt and decrypt, but a certificate carries the metadata needed to establish a **secure channel of communication**. One of the most widely used algorithms for Public Key Encryption is [`RSA`](https://en.wikipedia.org/wiki/RSA_(cryptosystem)) and the current standard for certificates is [`X.509`](https://en.wikipedia.org/wiki/X.509) (which can carry RSA Public Keys, among other key types). Certificate based communication is used by most websites; the key indicator is the use of TLS, most commonly reflected by the use of HTTPS (instead of HTTP).

Some of the information a certificate contains is:

- Information about the **identity** (Email, Organization Name, Country, State, [among others](https://en.wikipedia.org/wiki/Certificate_signing_request#Procedure)) of who generated the Public Key (the subject).
- A **digital signature** [^1] that validates the identity information is correct and it corresponds to the subject.

The drawback of this approach is that nothing prevents an attacker from signing their own certificates, i.e. anyone can claim to be anyone else. These are the so-called [self-signed certificates](https://en.wikipedia.org/wiki/Self-signed_certificate). To avoid that, the signature should come from a trusted third party, known as a [**Certificate Authority**](https://en.wikipedia.org/wiki/Certificate_authority) (CA). It needs to be an independent, widely trusted party that validates the identity of whoever is generating the Public Key. This is necessary whenever the subject is unknown or cannot be trusted.

When the subject acts as its own CA, generating a self-signed certificate, it triggers warnings in most environments: most programming libraries will throw validation errors and web browsers will consider the certificate **insecure**. The issue is not the encryption itself (the data is encrypted anyway) but trust: trusting that the received Public Key comes from the intended subject and that there is no man-in-the-middle.

For debugging and testing, self-signed certificates are usually not a concern. For all other use cases, a certificate signed by a CA should be used.
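For testing purposes, a self-signed certificate could be built roughly as follows. This is a sketch using the `x509` module of the PyCA `cryptography` library; the identity values (`Example Org`, `localhost`) and the 30-day validity are assumptions chosen for illustration. What makes it self-signed is that the issuer equals the subject and the signature is produced with the subject's own Private Key.

```python
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Hypothetical identity, used only for illustration
name = x509.Name([
    x509.NameAttribute(NameOID.COUNTRY_NAME, "US"),
    x509.NameAttribute(NameOID.ORGANIZATION_NAME, "Example Org"),
    x509.NameAttribute(NameOID.COMMON_NAME, "localhost"),
])

now = datetime.datetime.now(datetime.timezone.utc)

certificate = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)  # Self-signed: the issuer is the subject itself
    .public_key(private_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=30))
    .sign(private_key, hashes.SHA256())  # Signed with the subject's own key
)

print(certificate.subject == certificate.issuer)  # True
```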

To get a certificate signed by a CA, one has to submit a [**Certificate Signing Request**](https://en.wikipedia.org/wiki/Certificate_signing_request) (CSR). Depending on the CA and the level of validation, this can take anywhere from minutes to several days and is often a paid service. The CA also validates that the identity information corresponds to whoever is asking for the certificate. A CA will not issue a certificate to anyone on behalf of anyone else, i.e. I cannot get a CA to sign a certificate saying I am Google.
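As a rough illustration, a Certificate Signing Request can be generated with the `x509` module of the PyCA `cryptography` library. The organization and domain below (`Example Org`, `example.com`) are placeholders; the actual submission process depends on the chosen CA and is not shown.

```python
from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

csr = (
    x509.CertificateSigningRequestBuilder()
    # Placeholder identity information about the subject
    .subject_name(x509.Name([
        x509.NameAttribute(NameOID.ORGANIZATION_NAME, "Example Org"),
        x509.NameAttribute(NameOID.COMMON_NAME, "example.com"),
    ]))
    # Domain names the certificate should be valid for
    .add_extension(
        x509.SubjectAlternativeName([x509.DNSName("example.com")]),
        critical=False,
    )
    # Signing with the private key shows the requester owns the Key Pair
    .sign(private_key, hashes.SHA256())
)

# The PEM-encoded request is what would be sent to the CA
csr_pem = csr.public_bytes(serialization.Encoding.PEM)

print(csr_pem.decode().splitlines()[0])
print(csr.is_signature_valid)
```

Note that only the Public Key and the identity information travel in the request; the Private Key never leaves the requester.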

[^1]: Digital signature is the topic of the next chapter

### Example

When logging into a service (e.g. email), one enters a username and a password. The password should travel encrypted over the internet, so that no one but the service provider can read it. To achieve that, certificate based communication is used: in simplified terms, the Public Key is used to encrypt the password and the service provider then uses its Private Key to read it.

That being said, what happens if the Public Key does not come from the service provider but was instead injected by an attacker and belongs to the attacker's Key Pair? The attacker would hold the corresponding Private Key, with which they could read the password (and then, if needed, forward the request to the real service provider).

That is where Certificate Authorities come in: the certificate also includes a digital signature stating "This is the public key for this service provider". Attackers can present their own Public Key under any subject's name; however, they cannot forge the CA's digital signature, because anyone can verify whether a digital signature comes from a CA or not.
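The scenario above can be sketched with the PyCA `cryptography` library. The helper functions `issue_certificate` and `is_signed_by`, as well as the names `mail.example.com` and `Example CA`, are hypothetical and exist only for this illustration. The check covers the signature alone; a real client also validates dates, host names and the full chain of trust.

```python
import datetime

from cryptography import x509
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.x509.oid import NameOID


def make_name(common_name):
    return x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, common_name)])


def issue_certificate(subject_cn, subject_public_key, issuer_cn, issuer_private_key):
    """Build a certificate for the subject, signed with the issuer's private key."""
    now = datetime.datetime.now(datetime.timezone.utc)
    return (
        x509.CertificateBuilder()
        .subject_name(make_name(subject_cn))
        .issuer_name(make_name(issuer_cn))
        .public_key(subject_public_key)
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=30))
        .sign(issuer_private_key, hashes.SHA256())
    )


def is_signed_by(certificate, issuer_public_key):
    """Check the certificate signature against a trusted issuer public key."""
    try:
        issuer_public_key.verify(
            certificate.signature,
            certificate.tbs_certificate_bytes,
            padding.PKCS1v15(),
            certificate.signature_hash_algorithm,
        )
        return True
    except InvalidSignature:
        return False


ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
provider_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
attacker_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Certificate legitimately issued by the CA for the service provider
genuine = issue_certificate(
    "mail.example.com", provider_key.public_key(), "Example CA", ca_key
)
# The attacker copies every name but can only sign with their own key
forged = issue_certificate(
    "mail.example.com", attacker_key.public_key(), "Example CA", attacker_key
)

print(is_signed_by(genuine, ca_key.public_key()))  # True
print(is_signed_by(forged, ca_key.public_key()))   # False
```

Both certificates claim the same subject and issuer, yet only the genuine one passes verification against the CA's Public Key.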

### Other usage for Certificates

Another use of certificates is **User Authentication**. In this case an organization generates a specific certificate for each user with all the authentication relevant information. The user adds that certificate to their operating system and then, when making a request, only a user id (email, GUID or similar) needs to be provided.

If the user id matches the information in the certificate and the signature of the certificate is valid (i.e. it was signed by the organization's server), the user is considered authenticated and the request is processed.

An example of such an authentication mechanism in Flask can be seen in this [Anaconda Repo](https://github.com/ContinuumIO/flask-ssl-authentication).

### Practical Example

This particular site has **HTTPS** with a certificate signed by [DigiCert](https://www.digicert.com/), one of many Certificate Authorities.

<center>
<img src="images/certificate_CA_example.png">
</center>

When opened, some details are shown:

<center>
<img src="images/certificate_details.png">
</center>

However, one detail might draw attention: it says "**Issued to**: www.github.com" but the site is [**elc.github.io**](https://elc.github.io/). If we inspect the details, we can see that the **Issuer** is the CA, the **Subject** is GitHub and there is a special field called **Subject Alternative Name**. There, one of the DNS Names is **\*.github.io**, which matches this site's URL. Therefore, the browser knows the certificate belongs to GitHub and was validated by DigiCert, and even though the URL is not GitHub's main domain, it falls under one of the registered alternative names.

<center>
<img src="images/certificate_details_dns.png">
</center>
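The Subject Alternative Name check can be approximated in code. The sketch below builds a local stand-in certificate rather than downloading the real one, so the DNS names are assumptions mirroring the screenshot. The `matches` helper is a hypothetical, simplified version of the wildcard rule, in which `*` covers exactly one left-most label.

```python
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID


def matches(hostname, pattern):
    """Simplified wildcard check: '*' stands for exactly one left-most label."""
    host_labels = hostname.lower().split(".")
    pattern_labels = pattern.lower().split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


# Stand-in certificate with assumed DNS names (not the real GitHub certificate)
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "www.github.com")])
now = datetime.datetime.now(datetime.timezone.utc)

certificate = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=1))
    .add_extension(
        x509.SubjectAlternativeName([
            x509.DNSName("www.github.com"),
            x509.DNSName("*.github.io"),
        ]),
        critical=False,
    )
    .sign(key, hashes.SHA256())
)

# Read the DNS names back from the Subject Alternative Name extension
san = certificate.extensions.get_extension_for_class(x509.SubjectAlternativeName)
dns_names = san.value.get_values_for_type(x509.DNSName)

print(dns_names)
print(any(matches("elc.github.io", dns_name) for dns_name in dns_names))  # True
```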

### Using Certificates - Implementation Considerations

Sometimes the process of getting a signed certificate can be troublesome or tedious. Fortunately, there are shortcuts:

- Using a hosting service that provides HTTPS out of the box (e.g. GitHub Pages does it for free)
- Outsourcing the certificate request (e.g. most cloud providers will do that on our behalf)
- Using hosted services (e.g. Azure Web Apps comes with HTTPS support)

It is usually useful to work in layers: often a gateway server (Apache, Nginx or similar) takes care of the TLS connection (certificate based communication) while the underlying application uses it seamlessly, meaning that no code change is needed on the application side.

The following examples focus on Public Key Encryption without the CA signing process, for simplicity. To implement a certificate signed by a CA, [follow the official tutorial](https://cryptography.io/en/latest/x509/tutorial). However, **unless supported by a security expert, it is always recommended to rely on hosted/managed services instead of implementing security from scratch**.

## RSA

RSA is one of the asymmetric encryption algorithms available in the PyCA cryptography library. The particular objects used here are part of the `hazmat` package. `hazmat` stands for "Hazardous Materials" and, quoting from their [site](https://cryptography.io/en/latest/hazmat/primitives/asymmetric/rsa/):

    This is a “Hazardous Materials” module. You should ONLY use it if you’re 100% absolutely sure that you know what you’re doing because this module is full of land mines, dragons, and dinosaurs with laser guns.

In [6]:
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

### Generating Keys

The first step is to generate the private key. Two parameters are needed: the `public_exponent` and the `key_size`. The former should be fixed to `65537`, whereas the latter can be changed and, as per modern security standards, should be at least `2048`. The Public Key is then derived from the Private Key object.

In [81]:
key_size = 2048  # Should be at least 2048

private_key = rsa.generate_private_key(
    public_exponent=65537,  # Do not change
    key_size=key_size,
)

public_key = private_key.public_key()
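To make the "mathematically linked" part more tangible, the numbers behind the generated keys can be inspected. This is only an exploratory sketch (a fresh key is generated so the snippet is self-contained); in real code there is rarely a reason to touch these values directly.

```python
from cryptography.hazmat.primitives.asymmetric import rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

public_numbers = public_key.public_numbers()
private_numbers = private_key.private_numbers()

print(public_key.key_size)             # 2048, size of the modulus in bits
print(public_numbers.e)                # 65537, the public exponent
print(public_numbers.n.bit_length())   # 2048

# The public modulus n is the product of the two secret primes p and q
print(private_numbers.p * private_numbers.q == public_numbers.n)  # True

# Deriving the public key again always yields the same public numbers
print(private_key.public_key().public_numbers() == public_numbers)  # True
```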

### Encrypting

In [8]:
message = b"Hello World!"

message_encrypted = public_key.encrypt(
    message,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)

print(f"Encrypted Text: {message_encrypted.hex()}")

Encrypted Text: 52a643e71a32707517fd5ba144de0a4a53ad2c74b0a6d1569ade22b1237619acf646eed8fc3703b1f3e0ebf0c6fa62a42d88ba0fe61815904a81e50c0e16999d0dcf0f00ad6724bc5c40d2a7109da8c4e2f7f0c887201ed150439b21ca7a0d4dc4c6bfca7c4b6f3909b439951af10918fd4464af5dac0d90cf4531a3df85808fd266947d3ca653db713a6b8092c24861f04d9d11ab64340b1f50cad40aba16225c5d9af06025ccda6cb0d3a1b618333d0f16f1af059203f84a5c17566f76e1d15120451d6ce0a1966c47f7f8090c178602e37f42d9f912a3cfc0dfb6409edf2cac8006d010ba1a6158a84ac7e5036a046c10f03bdbba7a6e9f5e8a7e0a940508
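Two properties of RSA with OAEP padding are worth knowing before relying on it; the sketch below is an illustration with a freshly generated key. First, the padding is randomized, so encrypting the same message twice gives different ciphertexts. Second, only short messages fit: for a 2048-bit key with SHA-256 the limit is 256 - 2 * 32 - 2 = 190 bytes.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()


def oaep():
    return padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None,
    )


message = b"Hello World!"
first = public_key.encrypt(message, oaep())
second = public_key.encrypt(message, oaep())

print(first != second)  # True, OAEP adds random padding on every call
print(len(first))       # 256, ciphertext is as long as the key (2048 bits)

# Maximum plaintext size: key size in bytes - 2 * hash size in bytes - 2
max_length = public_key.key_size // 8 - 2 * hashes.SHA256().digest_size - 2
print(max_length)       # 190

# A message of exactly max_length bytes still fits
public_key.encrypt(b"x" * max_length, oaep())

try:
    public_key.encrypt(b"x" * (max_length + 1), oaep())
    too_long_rejected = False
except ValueError:
    too_long_rejected = True

print(too_long_rejected)  # True
```

Because of this size limit, RSA is typically used to protect small pieces of data rather than large payloads.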


### Decrypting

In [9]:
message_decrypted = private_key.decrypt(
    message_encrypted,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)

print(f"Decrypted Message: {message_decrypted}")

Decrypted Message: b'Hello World!'


## Using PEM Files

Public Keys, Private Keys and certificates can be saved as files, commonly in the [Privacy-Enhanced Mail (PEM)](https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail) format. Those files usually have the suffix `.pem`, `.cer`, `.cert` or `.crt`. They have characteristic `BEGIN` and `END` lines which enclose the content. Since keys are sequences of bytes, base64 is used to convert them to text.

The extension only hints at the content; a PEM file is plain text, i.e. it can be opened with any text editor.
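The structure of a PEM file can be checked by hand. In this sketch a throwaway public key is serialized both as PEM and as DER (the binary encoding), and the base64 body between the `BEGIN` and `END` lines is decoded to confirm it holds exactly the DER bytes.

```python
import base64

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

public_key = rsa.generate_private_key(
    public_exponent=65537, key_size=2048
).public_key()

pem = public_key.public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
der = public_key.public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)

lines = pem.decode().strip().splitlines()
header, body, footer = lines[0], "".join(lines[1:-1]), lines[-1]

print(header)  # -----BEGIN PUBLIC KEY-----
print(footer)  # -----END PUBLIC KEY-----

# The text between the two marker lines is the base64 of the binary key
print(base64.b64decode(body) == der)  # True
```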

In [10]:
from pathlib import Path

from cryptography.hazmat.primitives import serialization

### Storing the Keys as PEM Files

#### Saving Private Key

The PEM file for the private key **should be kept secret** and never shared; the **whole asymmetric encryption scheme depends on it being hidden**.

The [Public-Key Cryptography Standards (PKCS) #8 (`PKCS8`)](https://en.wikipedia.org/wiki/PKCS_8) is used as the serialization format and, because an `encryption_algorithm` with a password is passed, the private key is not stored in raw bytes but is encrypted using symmetric encryption. The encryption key is derived from the password with a key derivation function such as [PBKDF2](https://en.wikipedia.org/wiki/PBKDF2); the exact cipher is chosen by the library. Therefore, as long as the password is strong, a leaked file should not immediately expose the key. That being said, sharing private key files is against all good practices.

**Note: NEVER upload private key files to source control, make sure to add the file to your `.gitignore`**
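Whether the private key PEM is protected depends on the `encryption_algorithm` argument. The comparison below is a sketch in which `first_line` is a hypothetical helper: with `NoEncryption` the key is written in the clear, whereas `BestAvailableEncryption` produces the `ENCRYPTED PRIVATE KEY` header seen in this chapter.

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)


def first_line(encryption_algorithm):
    """Serialize the key as PKCS8 PEM and return only its header line."""
    pem = private_key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=encryption_algorithm,
    )
    return pem.decode().splitlines()[0]


plain_header = first_line(serialization.NoEncryption())
encrypted_header = first_line(serialization.BestAvailableEncryption(b"my secret"))

print(plain_header)      # -----BEGIN PRIVATE KEY-----
print(encrypted_header)  # -----BEGIN ENCRYPTED PRIVATE KEY-----
```

An unencrypted private key file gives full access to anyone who obtains it, so the password-protected variant is the safer default.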

In [87]:
password = b"my secret"

key_pem_bytes = private_key.private_bytes(
   encoding=serialization.Encoding.PEM,  # PEM Format is specified
   format=serialization.PrivateFormat.PKCS8,
   encryption_algorithm=serialization.BestAvailableEncryption(password),
)

# Filename could be anything
key_pem_path = Path("key.pem")
key_pem_path.write_bytes(key_pem_bytes);

warning_message = "\n\n     TRUNCATED CONTENT TO REMIND THIS SHOULD NOT BE SHARED\n"

content = key_pem_path.read_text()
content = content[:232] + warning_message + content[1597:]

print(content)

-----BEGIN ENCRYPTED PRIVATE KEY-----
MIIFLTBXBgkqhkiG9w0BBQ0wSjApBgkqhkiG9w0BBQwwHAQIOmcnePGVUgkCAggA
MAwGCCqGSIb3DQIJBQAwHQYJYIZIAWUDBAEqBBB+ac/2DQgW64toFlxGQbf0BIIE
0MRARXl0T3SAzEh1IzmrYfGHZDJXv8dVunjRVTbCky8IuCzztHLsP06YQnM/KEnW

     TRUNCATED CONTENT TO REMIND THIS SHOULD NOT BE SHARED

yAD4jKkUo5ypvtXbsAXy0DssarPHQSkOKgziItRAb3au7gjgBkZN5BT1MNvaBpHg
oOknekFchNR+Y/FmiDX7Pcw80fBoeI8dn2QgpCA6vTeiLCYRGnnl10wkakyF/Pl/
vYMvXFDh6nCOE6vUkQs6m46l3ZJkluTXGRsQYn4bJwhG+6ONhYvJ7k0IJgDKLRmm
VY3UT9mP/sk1q8WmW1+qhbI/CFS/Raq7Stfc/fxLb8E7
-----END ENCRYPTED PRIVATE KEY-----



#### Saving Public Key

The public key stored in this PEM file is what ends up in the public certificate, which will be sent to everyone wanting to communicate with the subject. **There is no risk in sharing it**.

In [90]:
public_key = private_key.public_key()

public_pem_bytes = public_key.public_bytes(
   encoding=serialization.Encoding.PEM,  # PEM Format is specified
   format=serialization.PublicFormat.SubjectPublicKeyInfo,
)

# Filename could be anything
public_pem_path = Path("public.pem")
public_pem_path.write_bytes(public_pem_bytes);

public_key_content = public_pem_path.read_text()
print(public_key_content)

-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA3jq5qPmNS+wMFp1Lhu7K
yBm/9IgKa0aO8QlT2ZgkJZTzqxcliswQzuQEpZzZEzOkEbUEfGRPAWTSJuu4KnfK
J+6Kn47X3d/YoUq6N2rZEP03mJSqEf3tC1in8bwWO/3IPNv0R28FZIwdRxZxhvvs
VcTc6LRLu9i8t7A5zq41sBNetUEY4SObv4UBatz6/+SlDO2hJWCN6wx1b2OS4pxy
COWQtgV1KqPyozBuX6wdLMWk1/yGRd7WHPXZKx07smltkxzu7FII/JYj4Nfoh77Q
ujHI3a6IvQWIu+Mihj7QdHgzFfN0lJHm+nqwlz8+dqdXGbEgcjX5dn559dSkrUMn
4QIDAQAB
-----END PUBLIC KEY-----



### Loading PEM Files

PEM files can be loaded directly using the load functions. In the case of the private key, a password must be provided because the key was encrypted when it was serialized.

#### Wrong Password

The `cryptography` library will throw a `ValueError` if the password is incorrect.

In [75]:
private_pem_bytes = Path("key.pem").read_bytes()
public_pem_bytes = Path("public.pem").read_bytes()

guess_password = b"my pass"

try:
    private_key_from_pem = serialization.load_pem_private_key(
        private_pem_bytes,
        password=guess_password,
    )
    public_key_from_pem = serialization.load_pem_public_key(public_pem_bytes)
    print("Keys Correctly Loaded")
except ValueError:
    print("Incorrect Password")

Incorrect Password


#### Right Password

If the correct password is used, no errors should be thrown.

In [76]:
private_pem_bytes = Path("key.pem").read_bytes()
public_pem_bytes = Path("public.pem").read_bytes()

try:
    private_key_from_pem = serialization.load_pem_private_key(
        private_pem_bytes,
        password=password,
    )
    public_key_from_pem = serialization.load_pem_public_key(public_pem_bytes)
    print("Keys Correctly Loaded")
except ValueError:
    print("Incorrect Password")

Keys Correctly Loaded
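A third case is worth distinguishing: not providing any password at all. According to the library's documentation, loading an encrypted key with `password=None` raises `TypeError` rather than `ValueError`. The sketch below uses an in-memory key and a hypothetical `try_load` helper to compare the three situations.

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

encrypted_pem = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.BestAvailableEncryption(b"my secret"),
)


def try_load(password):
    """Return a short label describing the outcome of loading the key."""
    try:
        serialization.load_pem_private_key(encrypted_pem, password=password)
        return "loaded"
    except TypeError:
        return "missing password"
    except ValueError:
        return "incorrect password"


print(try_load(None))          # No password supplied for an encrypted key
print(try_load(b"my pass"))    # Wrong password
print(try_load(b"my secret"))  # Right password
```

Handling both exception types separately lets an application tell a forgotten password apart from a mistyped one.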


### Encryption and Decryption using PEM Files

#### Encrypting

Only the Public Key is needed for encryption.

In [79]:
message = b"Hello World!"

public_pem_bytes = Path("public.pem").read_bytes()
public_key_from_pem = serialization.load_pem_public_key(public_pem_bytes)

message_encrypted = public_key_from_pem.encrypt(
    message,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)

print(f"Encrypted Text: {message_encrypted.hex()}")

Encrypted Text: 3849a8acc1ab3ed9ad7c4f7a2af60616943254a2d85e6e1420c080f77234026f6895cf845ab7c04a096727d3ab8420ac16641c2917d0ba6189552dd0a8fe24c56d8af039dd4f3a9faa17393a584bbc9a1f5dbe7765dc45376e31ba33dcb1caa0e549d0df7070857548680023c1a03b70340ef3fa1cdb2e9b25769476b7fbdcaa285b6da0a4f8b0b4dbadc833204ea23cdc68a7b0f2b74f315071d08b92aa6098b996f74d49888d19bb78174d843943ccb0d2a6d0fe018d4de38f15829b1b624ad5d6fe50f2ccfecfc848b23f1fbfeee1c756b8a5e9e042576f27b26ad2fb7ae79481821326e71fe3f023f5d53cca795a7917a3cda2b790bcb513e7f2a7a758ea


#### Decrypting

To decrypt, both the Private Key PEM file and the password used to protect it are needed. If either is missing, the message cannot be decrypted.

In [80]:
private_pem_bytes = Path("key.pem").read_bytes()

private_key_from_pem = serialization.load_pem_private_key(
    private_pem_bytes,
    password=password,
)

message_decrypted = private_key_from_pem.decrypt(
    message_encrypted,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)

print(f"Decrypted Message: {message_decrypted}")

Decrypted Message: b'Hello World!'


## Conclusion

As opposed to symmetric encryption, asymmetric encryption does not rely on a single key shared across parties. Instead, a Key Pair, consisting of a Public and a Private Key which are mathematically linked, is generated by each party, then the public keys are shared. The public key is used to encrypt the message and only the linked Private Key can decrypt it.

For serialization and persistence, Key Pairs can be stored on disk using the PEM format. For security reasons the Private Key PEM file is saved encrypted with symmetric encryption and hence requires a passphrase.

Asymmetric Encryption is mostly used as part of a broader technology, Certificates, which provide not only access to the Public Key needed to encrypt the message but also information about the subject, the issuer and a digital signature (among other fields). Certificates can be self-signed (untrusted by default and only suitable for testing) or signed by a trusted third party called a Certificate Authority.

One of the most widely used algorithms for Public Key Encryption is RSA and the standard for Certificate Based Communication is X.509.