Demo that shows how to understand RSA encryption, decryption and key generation.
This system generates a wheel that contains tools to implement the RSA algorithm to help people understand how key generation, encryption and decryption work at a somewhat detailed level.
It provides tools that allow a user to generate public and private key files using keygen and then uses those files to encrypt and decrypt files. It also provides tools to read and dump the public and private key files.
One interesting feature is that it will encrypt and decrypt text of sizes greater than a single block. Another is that it uses the same key structure as production tools (PKCS#1 and ssh-rsa).
The code is written in python3 (compatible with 3.7 or later). It exists
in a local module named rsa_demo that is bundled as a wheel that is
released in a local pipenv environment. It has very simple mypy
stubs for the pyasn1 and faker packages.
The theory behind RSA, including Fermat's Little Theorem, the Extended Euclidean Algorithm, Bezout's Identity, modular arithmetic, factoring complexity, prime numbers (and their relation to the Riemann Hypothesis, a personal favorite) and the host of other concepts and history related to the RSA algorithm, is not discussed here because there are so many great resources on the web. My hope is that you will find and read those references and then compare them to the implementations here.
The goal is purely pedagogical. Do not try to use it for any
production work. It is too slow and it is not secure (see the
discussion of vulnerabilities below). For production work always
use tools like openssl and openssh.
The key generation is done by the keygen program.
It generates three files: a private key file in PKCS#1 (Public-Key Cryptography Standards)
format encoded using ASN.1 (Abstract Syntax Notation One) DER (Distinguished Encoding Rules),
a public key file with the .pub.pem extension using the same format as the private key
and another public key file with the .pub extension using the SSH RSA key format as
described in RFC 4716. Both public key files have the same information.
The generated keys can be read by tools like openssl and openssh.
Note that the key generation only supports version 0. It does not support multiprimes (version 1).
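To make the contents of a version 0 (two-prime) key concrete, here is a minimal sketch of the arithmetic behind an RSA key. It uses tiny textbook primes instead of the large random primes a real generator has to search for. The function names (`egcd`, `modinv`, `toy_keygen`) are hypothetical and are not taken from the rsa_demo module; the dictionary keys follow the field names of the RSAPrivateKey structure in RFC 8017.

```python
from math import gcd


def egcd(a, b):
    """Extended Euclidean Algorithm: return (g, x, y) with a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def modinv(a, m):
    """Return the inverse of a modulo m (works on Python 3.7)."""
    g, x, _ = egcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return x % m


def toy_keygen(p, q, e=65537):
    """Build the version 0 (two-prime) key fields from two given primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)*(q-1)")
    d = modinv(e, phi)
    return {
        "version": 0,
        "modulus": n,
        "publicExponent": e,
        "privateExponent": d,
        "prime1": p,
        "prime2": q,
        "exponent1": d % (p - 1),
        "exponent2": d % (q - 1),
        "coefficient": modinv(q, p),
    }


# Toy primes for illustration only; real keys use primes of ~1024 bits each.
key = toy_keygen(61, 53, e=17)
print(key["modulus"], key["privateExponent"])  # 3233 2753
```

The last three fields are not needed for correctness. They are precomputed values that let a decryptor use the Chinese Remainder Theorem to speed up the private key operation.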
Here is the list of key tools. They must all be run using pipenv.
| Tool | Description |
|---|---|
| keygen | Generates the keys. Run with --help for detailed information. |
| read_rsa_pkcs1_pem_public | Simple utility to read and dump a standard RSA PKCS#1 public key file. It is basically the same as running openssl asn1parse. |
| read_rsa_pkcs1_private | Simple utility to read and dump a standard RSA PKCS#1 private key file. It is basically the same as running openssl asn1parse. |
| read_rsa_ssh_public | Simple utility to read and dump a standard RSA SSH public key file. |
Here is the list of files generated by the keygen tool.
| File | Format | Standard | Description |
|---|---|---|---|
| FILE | PKCS#1 ASN.1 DER private | RFC 8017 | The private key data. |
| FILE.pub | SSH RSA public | RFC 4716 | The public key data. |
| FILE.pub.pem | PKCS#1 ASN.1 DER public | RFC 8017 | The public key data. |
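The single-line .pub file can be decoded with a few lines of Python. The sketch below assumes the usual OpenSSH public key layout: the line is `ssh-rsa <base64> [comment]`, and the base64 payload is a sequence of fields, each prefixed by a 4-byte big-endian length, holding the algorithm name, the public exponent and the modulus in that order. The function names are hypothetical and this is not the code used by read_rsa_ssh_public.

```python
import base64
import struct


def read_fields(blob):
    """Yield each length-prefixed field of an SSH key blob."""
    offset = 0
    while offset < len(blob):
        (length,) = struct.unpack(">I", blob[offset:offset + 4])
        offset += 4
        yield blob[offset:offset + length]
        offset += length


def parse_ssh_rsa_public(line):
    """Return (algorithm, pubexp, modulus) from the text of a .pub file."""
    algorithm, b64 = line.split()[:2]
    fields = list(read_fields(base64.b64decode(b64)))
    if len(fields) != 3 or fields[0].decode("ascii") != algorithm:
        raise ValueError("unexpected ssh-rsa key layout")
    pubexp = int.from_bytes(fields[1], "big")
    modulus = int.from_bytes(fields[2], "big")
    return algorithm, pubexp, modulus


def make_field(data):
    """Prefix data with its 4-byte big-endian length."""
    return struct.pack(">I", len(data)) + data


def to_mpint(value):
    """Encode a positive integer; this byte count always leaves the top bit clear."""
    return value.to_bytes(value.bit_length() // 8 + 1, "big")


# Build a toy key line (e = 17, n = 3233) to exercise the parser.
blob = make_field(b"ssh-rsa") + make_field(to_mpint(17)) + make_field(to_mpint(3233))
line = "ssh-rsa " + base64.b64encode(blob).decode("ascii") + " toy@example"
print(parse_ssh_rsa_public(line))  # ('ssh-rsa', 17, 3233)
```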
The RSA encryption and decryption algorithms are defined in the encrypt
and decrypt programs respectively. They are very simple implementations
of the RSA algorithm without optimizations. Both programs have extensive
help available through the --help option.
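At its core, unoptimized RSA is a single modular exponentiation in each direction. The sketch below shows that step on one integer using a tiny textbook key (n = 3233, e = 17, d = 2753). The function names are hypothetical; the real encrypt and decrypt programs additionally handle key files, multiple blocks and the file header.

```python
def rsa_encrypt_int(m, e, n):
    """Encrypt one integer m (0 <= m < n) with the public key (e, n)."""
    if not 0 <= m < n:
        raise ValueError("message integer must be in the range [0, n)")
    return pow(m, e, n)


def rsa_decrypt_int(c, d, n):
    """Decrypt one integer c with the private key (d, n)."""
    return pow(c, d, n)


n, e, d = 3233, 17, 2753  # toy key built from p = 61, q = 53
m = 65
c = rsa_encrypt_int(m, e, n)
print(c)                         # 2790
print(rsa_decrypt_int(c, d, n))  # 65
```

Python's three-argument `pow` performs modular exponentiation without ever building the huge intermediate power, which is what makes even this naive version workable for 2048-bit keys.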
To facilitate communication between the programs I created a very simple
custom format called joes-rsa that has a short prefix that identifies
the file as having been encrypted by the encrypt program. This header
information is not present in encrypted files generated by
production-worthy tools (like openssl), so these files cannot be read by other tools.
The decision to not make the encrypted file interoperable with standard production tools was intentional because this is a demo system. The tools and generated files should not be used outside of this learning environment.
The header consists of the following fields.
| Field | Position | Description |
|---|---|---|
| id | 0..7 | The bytes "joes-rsa". |
| version | 8..9 | The version number in big-endian format. |
| padding | 10..11 | The number of padding bytes in big-endian format. This is the number of dummy bytes present at the end of the last block to fill it out. |
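The 12-byte header described in the table maps directly onto Python's struct module. The following is a minimal sketch, assuming both numeric fields are unsigned 16-bit integers (the table gives only their positions and byte order). The names `pack_header`, `unpack_header` and `split_blocks` are hypothetical, and padding the last block with zero bytes is an assumption, since the text only says that dummy bytes are used.

```python
import struct

HEADER_FORMAT = ">8sHH"  # big-endian: 8-byte id, 2-byte version, 2-byte padding
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes
MAGIC = b"joes-rsa"


def pack_header(version, padding):
    """Build the 12-byte header."""
    return struct.pack(HEADER_FORMAT, MAGIC, version, padding)


def unpack_header(data):
    """Return (version, padding) after checking the id bytes."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data is too short to contain a header")
    magic, version, padding = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    if magic != MAGIC:
        raise ValueError("not a joes-rsa file")
    return version, padding


def split_blocks(data, block_size):
    """Split data into equal-sized blocks; return (blocks, padding byte count)."""
    padding = -len(data) % block_size
    padded = data + b"\x00" * padding  # assumed filler: zero bytes
    blocks = [padded[i:i + block_size] for i in range(0, len(padded), block_size)]
    return blocks, padding


blocks, padding = split_blocks(b"hello world", 4)
header = pack_header(version=0, padding=padding)  # version 0 is an arbitrary example value
print(blocks)                 # [b'hell', b'o wo', b'rld\x00']
print(unpack_header(header))  # (0, 1)
```

Recording the padding count in the header is what lets a decryptor strip the filler from the last block and recover the plaintext at its original length.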
There is a program called gendata that generates dummy test data using the faker package.
It was used to generate the test data.
Here is a simple example of how to use it:
$ # Download
$ git clone https://github.com/jlinoff/rsa_demo.git
$ cd rsa_demo
$ # Install the system in a local pipenv.
$ make
$ # Create a dummy data file.
$ pipenv run gendata >dummy.txt
$ # Create public and private keys.
$ # This can take a few minutes.
$ # The output is exactly the same as running ssh-keygen
$ # in the following, insecure way except that the .pub.pem
$ # file will not be generated.
$ # ssh-keygen -t rsa -b 2048 -f dummykeys -N '' -m PEM -q
$ time pipenv run keygen -o dummykeys -v
$ ls -1 dummykeys*
dummykeys
dummykeys.pub
dummykeys.pub.pem
$ # Encrypt the dummy.
$ pipenv run encrypt -k dummykeys.pub -i dummy.txt -o dummy.txt.enc -v
$ # Decrypt the encrypted data.
$ pipenv run decrypt -k dummykeys -i dummy.txt.enc -o dummy.txt.dec -v
$ # Verify that the decrypted file matches the original file.
$ diff dummy.txt dummy.txt.dec
$ # Dump the private key file using the openssl tool (it is PKCS#1 compatible).
$ openssl asn1parse -in dummykeys | tr '\t' ' ' | cat -n | cut -c -80
1 0:d=0 hl=4 l=2343 cons: SEQUENCE
2 4:d=1 hl=2 l= 1 prim: INTEGER :00
3 7:d=1 hl=4 l= 512 prim: INTEGER :6A24F79C34A35D5EC513E8930
4 523:d=1 hl=2 l= 3 prim: INTEGER :010001
5 528:d=1 hl=4 l= 512 prim: INTEGER :41B2A7DFF364BA421251843E9
6 1044:d=1 hl=4 l= 257 prim: INTEGER :C9AFEB6710E640C5D4C9E2142
7 1305:d=1 hl=4 l= 257 prim: INTEGER :86BA706D7C6963EA766940BED
8 1566:d=1 hl=4 l= 257 prim: INTEGER :9E25B8A397A7C4F09B5B36507
9 1827:d=1 hl=4 l= 256 prim: INTEGER :1DC14B5742E4DBC64A8490621
10 2087:d=1 hl=4 l= 256 prim: INTEGER :2BD63A3167B42596BC34FF3B5
$ # Dump the public PEM file.
$ openssl asn1parse -in dummykeys.pub.pem | tr '\t' ' ' | cat -n | cut -c -80
1 0:d=0 hl=4 l= 521 cons: SEQUENCE
2 4:d=1 hl=4 l= 512 prim: INTEGER :6A24F79C34A35D5EC513E8930
3 520:d=1 hl=2 l= 3 prim: INTEGER :010001
$ # Dump the public SSH file using the key reader provided.
$ pipenv run read_rsa_ssh_public dummykeys.pub | tr '\t' ' ' | cat -n | cut -c -80
1 dummykeys.pub
2 algorithm = ssh-rsa
3 pubexp = 0x10001
4 modulus = 0x6a24f79c34a35d5ec513e893085f997296a544a013dd161b519f5e26
In this example Alice and Bob want to communicate a message that cannot be decoded by someone observing their communication.
The important idea is that the public key is used to encrypt a message and the private key is used to decrypt that same message.
- Alice creates the key files: alice (private) and alice.pub (public).
- Alice then stores the private key in a safe place and sends the public key to Bob. It doesn't matter if anyone intercepts the public key because they can only use it to encrypt messages for Alice.
- Bob creates the key files: bob (private) and bob.pub (public).
- Bob then stores the private key in a safe place and sends the public key to Alice. It doesn't matter if anyone intercepts the public key because they can only use it to encrypt messages for Bob.
- At this point Alice has three files: alice, alice.pub and bob.pub. Bob also has three files: bob, bob.pub and alice.pub. Alice will use bob.pub to encrypt messages that are sent to Bob and Bob will use alice.pub to encrypt messages that are sent to Alice.
- Alice composes her plaintext message in message-to-bob.txt, encrypts it using bob.pub as the public key file and sends it to Bob.
- Bob receives the encrypted message and then decrypts it using the bob private key file. If anyone intercepts the encrypted message, they cannot decrypt it because they do not have the bob private key file.
- Bob then composes a response to Alice in message-to-alice.txt, encrypts it using alice.pub and sends it to Alice.
- Alice receives the encrypted message and then decrypts it using the alice private key file. If anyone intercepts the encrypted message, they cannot decrypt it because they do not have the alice private key file.
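The exchange above can be simulated with two toy key pairs. In this sketch the key values are tiny textbook numbers chosen for illustration (they are not produced by keygen), each byte of the message is encrypted separately because the moduli are so small, and the helper names are hypothetical.

```python
# Toy key pairs: public exponent e, private exponent d, modulus n.
ALICE = {"e": 17, "d": 2753, "n": 3233}  # n = 61 * 53
BOB = {"e": 17, "d": 497, "n": 8633}     # n = 89 * 97


def encrypt(message, public_key):
    """Encrypt each byte of message with the recipient's public key (e, n)."""
    e, n = public_key
    return [pow(byte, e, n) for byte in message]


def decrypt(ciphertext, private_key):
    """Decrypt each value with the recipient's private key (d, n)."""
    d, n = private_key
    return bytes(pow(value, d, n) for value in ciphertext)


# Alice -> Bob: Alice only needs bob.pub, modeled here as (e, n).
to_bob = encrypt(b"hello bob", (BOB["e"], BOB["n"]))
print(decrypt(to_bob, (BOB["d"], BOB["n"])))        # b'hello bob'

# Bob -> Alice: Bob only needs alice.pub.
to_alice = encrypt(b"hello alice", (ALICE["e"], ALICE["n"]))
print(decrypt(to_alice, (ALICE["d"], ALICE["n"])))  # b'hello alice'
```

Encrypting byte by byte keeps the example short but leaks patterns, because equal bytes produce equal ciphertext values. It is only meant to show which key is used in each direction.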
At this point Alice and Bob have communicated back and forth. The messages are protected from someone who can only passively observe their communications. Unfortunately, this does not mean that the messages are secure against other kinds of attack.
Here are some of the tactics an attacker could employ to access communications between Alice and Bob.
- System hack: an attacker could access their computer systems and take their private key files. That would allow the attacker to decrypt all messages.
- MITM (man-in-the-middle) attack: An attacker could sit in the middle of the communications between Alice and Bob and spoof them. When Alice sends her public key to Bob, the attacker intercepts it and substitutes their own public key. When Bob responds, he is unknowingly using the attacker's public key to encrypt messages for Alice. When he sends the encrypted message to Alice, the attacker intercepts it, decrypts it using their own private key, re-encrypts it using Alice's original public key and then sends it to Alice. Thus, Alice and Bob see the same communication pattern as before but their communications have been compromised. The same applies in the reverse direction.
- Library/Tool compromise: An attacker could provide Alice and Bob with a hacked version of openssl or openssh or a hacked system library (like a pseudo-random number generator (PRNG) library). Whenever Alice or Bob create their keys using the compromised tools, the attacker will be able to decrypt the messages.
- Algorithm compromise: An attacker figures out a vulnerability in one or more of the underlying algorithms. This is the main reason that you should never use untested tools and libraries (like the ones in this demo) for secure communications. You want battle-tested tools that are under constant scrutiny by experts to detect and fix vulnerabilities.
Here are some of the mitigation tactics for the vulnerabilities in the previous section.
- The probability of system hacks can be reduced by good security hygiene.
- The probability of MITM attacks can be reduced by using certificates.
- The probability of library/tool compromise attacks can be reduced by verifying the official checksums of all libraries and tools used (white-listing). This is part of good security hygiene but deserves to be called out because it is often neglected.
- The probability of an algorithm compromise is reduced by continuing to encourage security research.
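Checking a downloaded tool or library against its published checksum is easy to script. This is a minimal sketch using SHA-256 from the Python standard library. The function names are hypothetical, and the expected digest is assumed to come from the project's official release information over a channel you trust.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex):
    """Return True only if the file's digest matches the published one."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A checksum only proves that the file matches what the publisher listed. If the attacker can also replace the published checksum, a signature check is needed instead.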
The source code is available in the rsa_demo module directory.
The keys generated are equivalent to running the following command.
$ ssh-keygen -t rsa -b 2048 -f test1 -N '' -m PEM -q
The above command is not secure. Do not use it for production keys. One should always use a non-empty passphrase.
I hope that this helps you understand how RSA works.