# Tiny-TLS 1.3 Toy Implementation for COSI107a, Spring 2025, version 0.1

This document contains the implementation of a toy version of TLS 1.3, to be used as material for Brandeis' COSI107a course. The goal is to implement a minimalist version of TLS 1.3 that can communicate with a server using the protocol.

This is a work in progress, and it is expected that changes will be made to the protocol as we move forward


## 1. Import the necessary libraries

Since we're working with TLS in Python, it is helpful to use Python libraries that allow us to pack our data into binary form. TLS protocol specifies exact byte length and format, and we'll be doing a lot of conversion between numbers and bytes. We'll use the Python Struct library for this purpose

We'll also be importing the x25519 curve from the Cryptography library to generate the key used in our messages



In [53]:
import struct
from cryptography.hazmat.primitives.asymmetric import x25519
from cryptography.hazmat.primitives import serialization, hashes
from cryptography.hazmat.primitives.kdf import hkdf


import os #for random nonce generation

# 2a. Define some helper functions

TLS represents all of its messages in byte form. As such, some elements (such as the content length) must be converted from integer to big edian, specifically with 2 bytes. We also need to be able to join the distinct elements in a message into a single byte slice


In [54]:
def u16_to_byte(x: int) -> bytes:
    return struct.pack('>H', x) # Use the struct package to pack a number into big-edian 2 bytes

def concatenate(*bufs: bytes) -> bytes: # Concatenate multiple byte slices into one singular byte slice
    return b''.join(bufs)



# 2b. Define our private key and public key

We can then generate the public key and private key using the x25519 curve from Cryptography


In [55]:
def key_pair() -> bytes:
    private_key = x25519.X25519PrivateKey.generate()
    public_key = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.Raw,
    format=serialization.PublicFormat.Raw
    )
    return private_key, public_key

print(key_pair())


(<cryptography.hazmat.backends.openssl.x25519._X25519PrivateKey object at 0x7fd2f9888250>, b'\\Bm\x9d\xbf[9\xb8\xd5\xf6\xaa\x1a#\xf9\xc6\xf4\x15\x90\xa18\xd5\xdbt\x0f\xbd\xfa\x99&l\xe9C\x1d')


Notice how each time we run the code, the public_key is different. This is intended behavior, as it prevents attackers from being able to predict our key pairs

# 3. The Extension Blueprint:

Extensions play a large part in shaping a TLS message. From negotiating supported key group to exchanging keys, all of these are achieved using extensions. Luckily for us, these different extensions have a common blueprint



In [56]:
def extension(id: int, content: bytes) -> bytes:
    return concatenate(
        u16_to_byte(id),                         #The ID of the extension. (e.g 0x0a = Supported Group)
        u16_to_byte(len(content)),               #Length of content
        content                                  #The actual content itself
    )

print(extension(0x0a, bytes([0x00, 0x1d]))) #Example Supported Group extension


b'\x00\n\x00\x02\x00\x1d'


# 4. The ClientHello

With those building blocks, we can now write our first TLS message: The ClientHello. The ClientHello is always the first message to be sent in a TLS handshake, indicating that the client wants to connect with the server

The ClientHello has these components, in this order:
1. ProtocolVersion (Negotiate which version of TLS we're using)
2. Random Nonce (32 bit, for key generation)
3. Legacy Session ID (For our purposes, we won't be using sessions)
4. Cipher Suites (contains a suite of cipher - how to actually encrypt the key once we have it)
5. Legacy Compression Method (For TLS 1.3, this is null)
6. Extensions 

The code for it is as follows:

In [57]:
private_key, public_key = key_pair()

def client_hello() -> bytes:
    client_random = os.urandom(32)
    def key_share(pubkey: bytes) -> bytes:      # Encode our public key to be sent over the message
        return concatenate(
            u16_to_byte(len(pubkey) + 4),       # +4 represents the 4 extra byte before the pubkey (2 bytes for the x25519, 2 bytes for len of pubkey)
            u16_to_byte(0x1d),                  # 0x1d is the value for x25519 key
            u16_to_byte(len(pubkey)),           
            pubkey
        )
    
    def DNI(domain: str) -> bytes:
        return concatenate(
            u16_to_byte(len(bytes(domain,'utf-8')) + 3),
            bytes([0x00]),
            u16_to_byte(len(bytes(domain,'utf-8'))),
            bytes(domain, 'utf-8')
        )
    
    def extensions() -> bytes: #This intializes the extensions we need in our message
        return concatenate(
            extension(0x00, DNI('www.brandeis.edu')),
            extension(0x0a, bytes([0x00, 0x02, 0x00, 0x1d])), #Supported Group extensions. Currently only contains the x25519 curve
            extension(0x33, key_share(public_key)), #Key Share. Contains the public key generated from the x25519 curve
            extension(0x2b, bytes([0x02, 0x03, 0x04])) #TLS Version. This is how we negotiate TLS 1.3
        )
    
    def handshake() -> bytes: #This constitutes our actual ClientHello message
        return concatenate(
            bytes([0x03, 0x03]),                          # This value is for TLS 1.2. TLS 1.3 must disguise itself as TLS 1.2 to be received, after which it negotiates into TLS 1.3 through the TLS 1.3 extension
            client_random,                             # Random Nonce for key
            bytes(0x00),                                # Session ID. Empty for our purposes
            bytes([0x00, 0x02, 0x13, 0x01]),              # Cipher Suite. We have a single cipher for our cipher suite (SHA256)
                                                          # I'm aware that there are 2 SHA256 ciphers: AES and CHACHA. I've included one here. Not sure
                                                          # if we need the other one or not
            bytes([0x01, 0x00]),                          # Compression Method. Empty for our purposes
            u16_to_byte(len(extensions())),
            extensions()
        )
    
    return concatenate(                                 #Include record layers for TLS 1.3 to complete message
        bytes([0x16, 0x03, 0x01]),
        u16_to_byte(len(handshake()) + 4),
        bytes([0x01]),
        u16_to_byte(len(handshake())),
        handshake()
    ), client_random

client_hello_msg, client_random = client_hello()
print(client_hello_msg)
print("hello lmao")
print(client_random)
        

b'\x16\x03\x01\x00\x80\x01\x00|\x03\x03\xcc(T\x8e_\xcd2\x9b0\xb02U\x9cg\n\x04@\xd3\xdf\xd7\xb7\xfd\x11\xe3\xb5\xf3J\x1e&\xdd L\x00\x02\x13\x01\x01\x00\x00R\x00\x00\x00\x15\x00\x13\x00\x00\x10www.brandeis.edu\x00\n\x00\x04\x00\x02\x00\x1d\x003\x00&\x00$\x00\x1d\x00 =\x0e\xe7\xcd\xe7i\x8b\x9e\xc0\xf5F\x9e\x0f\x93\xe5\xfd\xae\x02\xdf\x8fB\x88\x9bs\xb4\xa6\xac\x84f\x08Pv\x00+\x00\x03\x02\x03\x04'
hello lmao
b'\xcc(T\x8e_\xcd2\x9b0\xb02U\x9cg\n\x04@\xd3\xdf\xd7\xb7\xfd\x11\xe3\xb5\xf3J\x1e&\xdd L'


# 5. Parsing the ServerHello

Once the ClientHello is sent, the server responds with its own message, called the ServerHello. Assuming that our ClientHello message is configured correctly, the server will respond with its chosen cipher suite and its own key. 

We are particularly interested in the server's key and server random, which we use to establish cryptographic parameters. We can then extract these properties and start the key calculation process

# 5a. Creating a Parser class 

The ServerHello message isn't always of the same size. Certain elements like the sessionID, as well as the content of extensions, may have variable length depending on the message itself. That's why we can't just extract the information based on indexes alone. We need a parser that would keep track of what element we're at and how many bytes we have to skip forward


In [58]:

class Parser:
    def __init__(self, data: bytes) -> None:
        self.data = data
        self.cursor = 0
    
    def skip(self, position: int) -> None:
        self.cursor += position

    def read(self, position: int) -> bytes:
        result = self.data[self.cursor : self.cursor + position]
        self.cursor += position
        return result 
    
    def read_uint8_prefixed(self) -> bytes:                           #Most extensions with variable lengths have bytes that denote their length. We can use this to skip forward appropriately 
        length = self.data[self.cursor]
        self.cursor += 1
        result = self.data[self.cursor : self.cursor + length]
        self.cursor += length
        return result

    def read_uint16_prefixed(self) -> bytes:
        length = int.from_bytes(self.data[self.cursor:self.cursor + 2], 'big') #Since some lengths are represented with 2 bytes, we need to convert them 
        self.cursor += 2
        result = self.data[self.cursor:self.cursor + length]
        self.cursor += length
        return result


# 5b. Parsing the ServerHello

With our Parser class, we can now parse the ServerHello to extract the ServerRandom and the public key. Keep in mind that we are only dealing with 1 cipher suite, 1 cryptographic and as such expects only 1 public key. A full version of TLS 1.3 will be much more complex

In [59]:
def server_hello_parser(msg: bytes):
    parser = Parser(msg)
    parser.skip(4)                                  #Skip HandShake Header
    parser.skip(2)                                  #Skip Server Version
    server_random = parser.read(32)                 #Get the serverRandom
    parser.read_uint8_prefixed()                    #Skip SessionID
    parser.skip(2)                                  #Skip Cipher Suite
    parser.skip(1)                                  #Skip Compression Method
    public_key = None
    extensions = parser.read_uint16_prefixed()
    extension_reader = Parser(extensions)
    while(extension_reader.cursor < len(extensions)):
        extension_type = extension_reader.read(2)
        extension_data = extension_reader.read_uint16_prefixed()
        if (extension_type == b'\x00\x33'):
            data = Parser(extension_data)
            data.skip(2)
            public_key = data.read_uint16_prefixed()
    return server_random, public_key

    # extension_number = int.from_bytes(parser.read(2), 'big')
    # print(extension_number)
    # for i in range (extension_number):
    #     extension_type = parser.read(2)
    #     extension_content = parser.read_uint16_prefixed()
    #     if extension_type == 0x0033:
    #         print("Key_Share Identified")
    #         extension_parser = Parser(extension_content)
    #         extension_parser.skip(2)
    #         public_key = extension_parser.read_uint16_prefixed()
    # return server_random, public_key

server_hello_msg = b'\x02\x00\x00\x76\x03\x03\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x20\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff\x13\x02\x00\x00\x2e\x00\x2b\x00\x02\x03\x04\x00\x33\x00\x24\x00\x1d\x00\x20\x9f\xd7\xad\x6d\xcf\xf4\x29\x8d\xd3\xf9\x6d\x5b\x1b\x2a\xf9\x10\xa0\x53\x5b\x14\x88\xd7\xf8\xfa\xbb\x34\x9a\x98\x28\x80\xb6\x15'
server_random, public_key = server_hello_parser(server_hello_msg)
print(server_random)
print(public_key)


           

        
        
        


b'pqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f'
b'\x9f\xd7\xadm\xcf\xf4)\x8d\xd3\xf9m[\x1b*\xf9\x10\xa0S[\x14\x88\xd7\xf8\xfa\xbb4\x9a\x98(\x80\xb6\x15'


# 6. Key Derivation

Now that we have our private key and the server has sent back their public key, we can now encrypt our data to the server via key calculations. We can break down the key calculations into a few steps

# 6a. Transcript Hash

Transcript Hash refers to the hash of the ClientHello and the ServerHello messages. The idea is to associate these messages with the keys we are about to derive: These keys work with these messages. That way, even if an attacker somehow got hold of our keys, it would not work with whatever messages they try to send



In [60]:
def transcript_hash(client: bytes, server: bytes) -> bytes:
    digest = hashes.Hash(hashes.SHA256()) # We'll be using SHA384 to create the hash for our messages.
    digest.update(client)     # Note that both the ClientHello and ServerHello is used here
    digest.update(server)
    transcript_hash = digest.finalize()
    return transcript_hash

print(client_hello_msg)       #This is the clientHello message we generated from our code
print(server_hello_msg)       #Right now, this server_hello is pre-generated for testing purposes. We'll be parsing an actual server hello when the entire thing is ready

hello_hash = transcript_hash(client_hello_msg, server_hello_msg)

b'\x16\x03\x01\x00\x80\x01\x00|\x03\x03\xcc(T\x8e_\xcd2\x9b0\xb02U\x9cg\n\x04@\xd3\xdf\xd7\xb7\xfd\x11\xe3\xb5\xf3J\x1e&\xdd L\x00\x02\x13\x01\x01\x00\x00R\x00\x00\x00\x15\x00\x13\x00\x00\x10www.brandeis.edu\x00\n\x00\x04\x00\x02\x00\x1d\x003\x00&\x00$\x00\x1d\x00 =\x0e\xe7\xcd\xe7i\x8b\x9e\xc0\xf5F\x9e\x0f\x93\xe5\xfd\xae\x02\xdf\x8fB\x88\x9bs\xb4\xa6\xac\x84f\x08Pv\x00+\x00\x03\x02\x03\x04'
b'\x02\x00\x00v\x03\x03pqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f \xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff\x13\x02\x00\x00.\x00+\x00\x02\x03\x04\x003\x00$\x00\x1d\x00 \x9f\xd7\xadm\xcf\xf4)\x8d\xd3\xf9m[\x1b*\xf9\x10\xa0S[\x14\x88\xd7\xf8\xfa\xbb4\x9a\x98(\x80\xb6\x15'


# 6a. Shared secret
The idea of the key exchange is that given the private key and the public key of the other party, both client and server can perform calculations to arrive at the same number. Essentially, client private x server public = server private x client public. This is called the shared secret. We'll perform the calculations on our side while the server does it on theirs.


In [61]:
# Our public key right now is raw bytes. Our private key, on the other hand, is a x25519PrivateKey object. The most convenient way to calculate the 
# shared secret is to turn the public key into a x25519PublicKey object and leverage the built in methods to generate the shared secret

server_public_key = x25519.X25519PublicKey.from_public_bytes(public_key) 
shared_secret = private_key.exchange(server_public_key)
print(shared_secret)



b"\x04\xce\x02'\xd4\xaa\x00\xf1G\x12\xf0\x94Of\x1eS\xf3b]\x95h!<a\xb5\x84\x0e<\x9al\x0bK"


# 6b. Early Secret, Derived Secret, Handshake Secret
We then go on to generate the early secret, the derived secret and finally the handshake secret. 

- Early secret is for 0-RTT data (essentially data sent with pre-shared keys). Since we're not using PSKs, this is a bunch of 0 bytes.
- Derived secret is mainly used as salt, mixed in with the early secret to get the handshake key. This way, even if an attacker has the early secret, it's hard to determine the handshake key

Each of these secrets have a "label" associated with them to identify their use, usually with the format "tls13 + ID of the component". 

* Note 1: We can pass in the label directly into the HKDF info parameter, or we can make a method that will append the TLS13 infront and add the necessary bytes for length, after which it is passed into the infor parameter. To avoid hardcoding, I went with the second options. If this example works, we can go back and try hardcoding the labels to see if it affects anything


In [62]:
## REMEMBER TO CHANGE IT BACK TO SHA256 AFTER TESTING. Algorithm goes back to SHA256, Length goes back to 32.


# UNCOMMENT THIS FOR TESTING. SHOULD WORK FOR SHA384 IMPLEMENTATIONS:

#shared_secret = bytes.fromhex('df4a291baa1eb7cfa6934b29b474baad2697e29f1f920dcc77c8a0a088447624')
#hello_hash = bytes.fromhex('e05f64fcd082bdb0dce473adf669c2769f257a1c75a51b7887468b5e0e7a7de4f4d34555112077f16e079019d5a845bd')

# expected result
# handshake secret: bdbbe8757494bef20de932598294ea65b5e6bf6dc5c02a960a2de2eaa9b07c929078d2caa0936231c38d1725f179d299
# client secret: db89d2d6df0e84fed74a2288f8fd4d0959f790ff23946cdf4c26d85e51bebd42ae184501972f8d30c4a3e4a3693d0ef0
# server secret: 23323da031634b241dd37d61032b62a4f450584d1f7f47983ba2f7cc0cdcc39a68f481f2b019f9403a3051908a5d1622
# client handshake key: 1135b4826a9a70257e5a391ad93093dfd7c4214812f493b3e3daae1eb2b1ac69
# client handshake iv: 4256d2e0e88babdd05eb2f27
# server handshake key: 9f13575ce3f8cfc1df64a77ceaffe89700b492ad31b4fab01c4792be1b266b7f
# server handshake iv: 9563bc8b590f671f488d2da3


# TLS specs make use of 2 methods: HKDF-Extract and HKDF-Expand. HKDF-Extract can be found in hkdf.HKDF()_extract(). HKDF-Expand is hkdf.HKDFExpand()


#Define our own hkdf_expand_label. This is basically calling HKDF to derive secrets, with an added step of configuring the labels associated with the component.

def hkdf_expand_label(secret: bytes, label: str, context: bytes, length: int) -> bytes:
    # Construct the HkdfLabel as specified in TLS 1.3
    label = b"tls13 " + label.encode('utf-8')
    hkdf_label = (
        struct.pack("!H", length) +  # 2 bytes for length
        struct.pack("!B", len(label)) + label +  # 1 byte for label length + label
        struct.pack("!B", len(context)) + context  # 1 byte for context length + context
    )
    
    # Use HKDF-Expand
    encryption = hkdf.HKDFExpand(
        algorithm=hashes.SHA256(),
        length=length,
        info=hkdf_label,
    )
    
    return encryption.derive(secret)


early_secret = hkdf.HKDF(
    algorithm=hashes.SHA256(), #We negotiated SHA256 in our cipher suite
    length=32,
    salt=b'\x00',
    info=b'\x00'
)._extract(b'\x00' * 48) #The first input key is simply 00

empty_hash = hashes.Hash(hashes.SHA384())
empty_hash.update(b"")  # Empty string
empty_hash = empty_hash.finalize()

# We then derive another secret called the derived secret from the early secret
derived_secret = hkdf_expand_label(
    secret = early_secret,
    label="derived",
    context=empty_hash,
    length=32
)

# The derived secret is mixed into the early secret to get the handshake secret
handshake_secret = hkdf.HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=derived_secret,  # derived_secret is used as salt
    info=None
)._extract(shared_secret)

# 6c. Traffic Secrets and Key derivation

Now that we have our handshake secret, we can now move on to derive the traffic secrets, and subsequently the keys and IVs for both the client and the server

In [63]:
# Deriving the secrets and subsequent keys + IVs.
client_secret = hkdf_expand_label(
    secret = handshake_secret, 
    label = "c hs traffic", 
    context = hello_hash, 
    length = 32
)

server_secret = hkdf_expand_label(
    secret = handshake_secret,
    label = "s hs traffic",
    context = hello_hash,
    length = 32
)


client_handshake_key = hkdf_expand_label(
    secret = client_secret,
    label = "key",
    context = b"",
    length = 16
)

server_handshake_key = hkdf_expand_label(
    secret = server_secret,
    label = "key",
    context = b"",
    length = 16
)

client_handshake_iv = hkdf_expand_label(
    secret = client_secret,
    label = "iv",
    context = b"",
    length = 12
)

server_handshake_iv = hkdf_expand_label(
    secret = server_secret,
    label = "iv",
    context = b"",
    length = 12
)




print("handshake secret: " + handshake_secret.hex())
print("client secret: " + client_secret.hex())
print("server secret: " + server_secret.hex())
print("client handshake key: " + client_handshake_key.hex())
print("client handshake iv: "+ client_handshake_iv.hex())
print("server handshake key: " + server_handshake_key.hex())
print("server handshake iv: " + server_handshake_iv.hex())


handshake secret: aa9e4e5a9f72a7ffe2f7a8cd5a6e6a4c08e8f4c2a85002e077ff9c166745f971
client secret: 13c6f6d395684fe3e00ad590c9f10d370785a727dfd668f9cdf8a4c6b51f85fd
server secret: 4da087cf5612f6b7f391570c0b79e369876609af65cb9d97e2acc37f71b5b379
client handshake key: 33e9884a8e478c463358f0b8260ffd05
client handshake iv: bf566b739942fe2f2e751768
server handshake key: 7e11c9303a0b18c8d815bc66e21a2c82
server handshake iv: 9c24614595e5a4fb9de63273
