# Stream Ciphers: Concepts, Implementation, and Cryptanalysis

## Introduction

Cryptography is central to protecting digital communication, and stream ciphers are widely used for their speed and small footprint, especially in systems with limited processing power. A stream cipher encrypts data one bit (or byte) at a time by combining it with a pseudorandom keystream. A common keystream generator is the Linear Feedback Shift Register (LFSR), which is simple and fast but, because of its linear structure, cryptographically weak on its own. The Berlekamp-Massey algorithm exploits this linearity: it reconstructs the shortest LFSR that produces a given sequence, which enables Known-Plaintext Attacks (KPA) in which an attacker who observes enough keystream recovers the generator and its internal state.

To address this weakness, stream cipher designs add nonlinearity. One example is the Alternating Step Generator (ASG), which combines several LFSRs so that one LFSR controls the clocking of the others, making the keystream irregular and harder to analyze or predict. This report covers these components (bit handling, LFSR construction, the Berlekamp-Massey algorithm, and the ASG) to show how these systems work and where their strengths and weaknesses lie.


## Stream Ciphers


Stream ciphers are symmetric-key algorithms that encrypt data one bit or byte at a time, and they are often smaller and faster to implement than block ciphers. Their design is inspired by the One-Time Pad (OTP), introduced by Gilbert Vernam in 1917 and later formalized by Shannon (1949). The OTP achieves perfect secrecy by XORing the plaintext with a truly random key that is as long as the message and used only once. However, because generating and distributing such keys is impractical, real-world stream ciphers rely on pseudorandom number generators (PRNGs) that expand a short key into a long keystream.

A common PRNG building block in stream ciphers is the Linear Feedback Shift Register (LFSR), which is simple and efficient but vulnerable to linear analysis. The Berlekamp-Massey algorithm can reconstruct the shortest LFSR that generates a given output sequence, which makes it useful in Known-Plaintext Attacks (KPA). To strengthen security, constructions like the Alternating Step Generator (ASG) combine multiple LFSRs in a nonlinear, irregular clocking scheme. This project explores these components through code implementations, analyzing their structure and role in stream cipher design.

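Key properties of stream ciphers:

- The plaintext is treated as a stream, and its bits are encrypted individually.
- Implementations tend to be small and fast, so stream ciphers are generally assumed to be more efficient than block ciphers.
- Encryption and decryption are the same procedure: an XOR with the keystream.
- They can reach a very high level of secrecy. The One-Time Pad is information-theoretically secure, provided the key is as long as the plaintext, uniformly distributed over the key space, and used only once.
- The security of a stream cipher rests entirely on its keystream.
- Because the OTP is impractical, practical stream ciphers generate the keystream with a random number generator (RNG) instead.
- The RNG must be reproducible (so the receiver can regenerate the same keystream from the shared key) and unpredictable (so an attacker cannot).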
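The XOR symmetry and the one-time-pad conditions can be illustrated with a minimal sketch. It uses only the Python standard library; the helper name `xor_bytes` and the sample message are illustrative assumptions and are not part of the `bits` module used in this report.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of at least the same length (hypothetical helper)."""
    if len(key) < len(data):
        raise ValueError("a one-time pad key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"attack at dawn"
# One-time pad conditions: random key, as long as the message, used only once.
key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, key)    # encryption
recovered = xor_bytes(ciphertext, key)  # decryption is the very same operation
```

Because XOR is its own inverse, applying `xor_bytes` twice with the same key returns the original message. A practical stream cipher would replace `key` with a keystream expanded from a short secret.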



In [4]:
from bits import Bits
from lfsr import LFSR, berlekamp_massey
from bitgenerator import AlternatingStep

## LFSR

Linear Feedback Shift Registers (LFSRs) are one of the main building blocks of PRNGs and are often used to generate pseudorandom sequences in stream ciphers. An LFSR consists of a series of one-bit registers that shift their values at each step, with the new input bit determined by a linear function of the current state: the XOR of selected bits, as specified by a feedback polynomial. Despite their simplicity, LFSRs can produce sequences with long periods and good statistical properties when properly configured; an LFSR of length $L$ with a primitive feedback polynomial cycles through all $2^L - 1$ nonzero states.

In stream ciphers, LFSRs act as keystream generators whose output is XORed with the plaintext to produce ciphertext. The same sequence can be reproduced during decryption, as long as the initial state (or seed) and polynomial are known. However, the linear nature of LFSRs also makes them vulnerable. If an attacker can observe enough of the keystream, algorithms like Berlekamp-Massey can reconstruct the LFSR’s feedback polynomial and internal state, allowing full prediction of the remaining sequence. Even so, LFSRs are the foundation for more complex and secure systems such as combination generators or the ASG.
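The shift-and-feedback behavior described above can be sketched in a few lines of plain Python. This is an illustration only and not the `LFSR` class from the `lfsr` module: the function name `lfsr_bits` is hypothetical, the polynomial is given as a set of degrees (the same form in which polynomials are printed later in this notebook), and the seed is assumed to hold the first $L$ output bits.

```python
def lfsr_bits(degrees, seed, n):
    """Return the first n bits of an LFSR sequence (illustrative sketch).

    degrees: exponents of the feedback polynomial, e.g. {0, 1, 4} for 1 + x + x^4.
    seed:    the first L output bits, where L is the largest degree (assumed convention).
    Each new bit is the XOR of the bits i positions back, for every degree i > 0.
    """
    taps = sorted(d for d in degrees if d > 0)
    length = max(taps)
    if len(seed) != length:
        raise ValueError("seed must contain exactly L bits")
    out = list(seed)
    while len(out) < n:
        t = len(out)
        bit = 0
        for i in taps:
            bit ^= out[t - i]
        out.append(bit)
    return out[:n]


seq = lfsr_bits({0, 1, 4}, [1, 0, 0, 0], 30)
print(seq[:15])  # [1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
```

With the primitive polynomial $1 + x + x^4$ the output repeats with period $2^4 - 1 = 15$, so `seq[:15]` and `seq[15:]` are identical.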

## Berlekamp-Massey Algorithm

The Berlekamp-Massey algorithm is a key tool for analyzing and attacking stream ciphers that use Linear Feedback Shift Registers (LFSRs). Originally developed for decoding linear error-correcting codes, it finds the shortest LFSR capable of producing a given binary sequence. If an attacker observes a long enough portion of a keystream, they can use Berlekamp-Massey to reconstruct the LFSR’s feedback polynomial and internal state, and from there predict the rest of the keystream.

The algorithm exploits the property that any sequence generated by an LFSR of length $L$ must satisfy the linear recurrence

$$\sum_{i=0}^{L} p_i \, b_{t-i} = 0 \pmod 2 \quad \text{for all } t \ge L,$$

where $p_i$ are the coefficients of the feedback polynomial $P(X)$ (with $p_0 = 1$) and $b_t$ are the known bits of the sequence. At each step, the algorithm checks for a discrepancy, that is, a violation of this condition for the current candidate polynomial, and if one is found it updates the polynomial by combining it with a shifted version of a previous one. This continues until the full sequence has been processed, at which point the minimal LFSR that fits the data is returned. The running time is $O(n^2)$, where $n$ is the length of the input sequence, and $2L$ consecutive bits are enough to determine an LFSR of linear complexity $L$ uniquely.

This makes stream ciphers built on a single LFSR prone to Known-Plaintext Attacks (KPA). If Eve knows enough plaintext bits $x_i$ and the corresponding ciphertext bits $y_i$, she can compute the keystream bits $b_i = x_i \oplus y_i$ and apply the Berlekamp-Massey algorithm to infer $P(X)$.
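The discrepancy-and-update loop can be sketched as follows. This is the standard textbook formulation over GF(2), written as a plain-Python illustration; it is not the `berlekamp_massey` function imported from the `lfsr` module, and the name `berlekamp_massey_sketch` and its return format are assumptions made for this example.

```python
def berlekamp_massey_sketch(bits):
    """Return (L, degrees) for the shortest LFSR generating `bits` (a list of 0/1).

    L is the linear complexity; degrees is the set of exponents i with p_i = 1.
    """
    n = len(bits)
    c = [1] + [0] * n  # current candidate polynomial, c[0] = p_0 = 1
    b = [1] + [0] * n  # candidate saved at the last length change
    L, m = 0, -1       # current length, position of the last length change
    for t in range(n):
        # Discrepancy: does the current candidate predict bits[t]?
        d = bits[t]
        for i in range(1, L + 1):
            d ^= c[i] & bits[t - i]
        if d:
            previous = c[:]
            shift = t - m
            # Combine the candidate with a shifted copy of the saved polynomial.
            for i in range(n + 1 - shift):
                c[i + shift] ^= b[i]
            if 2 * L <= t:
                L, m, b = t + 1 - L, t, previous
    return L, {i for i in range(L + 1) if c[i]}


keystream = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
L, degrees = berlekamp_massey_sketch(keystream)
print(L, sorted(degrees))  # 4 [0, 1, 4]
```

The sample keystream satisfies $b_t = b_{t-1} \oplus b_{t-4}$, so the sketch recovers the polynomial $1 + x + x^4$ and a linear complexity of 4.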

In [13]:
# read the binary sequence from the file
with open('binary_sequence.bin', 'rb') as f:
    binary_sequence = f.read()

binary_sequence[:50]

b'\xbb`\xef\x067\xae\xd0K"Vd]#Q\xeb\x02~<\xe6C\xbe\xed5\xd0\xec\xada\xe8\x89h\xf3\xbdFc\x96\xb5\x8e\xb0\x03\xabVFY#\xd1\xeb">\xb5\xe4'

In [14]:
# convert the binary sequence to a Bits object
bits = Bits(binary_sequence)

In [15]:
# find the shortest feedback polynomial using the Berlekamp-Massey algorithm
poly = berlekamp_massey(bits)
linear_complexity = max(poly) if poly else 0

print("Shortest feedback polynomial degrees:", poly)
print("Linear complexity:", linear_complexity)

Shortest feedback polynomial degrees: {0, 18, 7}
Linear complexity: 18


## Alternating Step Generator

The Alternating Step Generator (ASG) is a stream cipher design that aims to overcome the weaknesses of individual LFSRs by combining multiple LFSRs in a nonlinear and irregular fashion. In a typical ASG setup, three LFSRs are used: two generate the bits that are XORed to form the output, while the third acts as a control register that determines which of the other two is stepped at each time instant. This irregular clocking makes the output sequence harder for an attacker to analyze or reverse-engineer.

The core idea behind the ASG is to break the linearity that makes LFSRs vulnerable to attacks like Berlekamp-Massey. By conditionally stepping two LFSRs based on the output of a third, the generator produces a more complex and less predictable keystream, even though the individual components are linear. This added complexity significantly raises the difficulty of reconstructing the internal state through known cryptanalytic techniques. Still, while an ASG is more secure than a single LFSR, it is not immune to attack. If the parameters are not chosen carefully, or if the design does not introduce enough nonlinearity, it can still be exposed to more advanced or targeted attacks.
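The clocking rule can be sketched in plain Python, following the usual textbook description: the control register is stepped every time, exactly one of the other two registers is stepped depending on the control bit, the idle register repeats its previous output, and the keystream bit is the XOR of the two. The names `TinyLFSR` and `alternating_step` are hypothetical, the register sizes are toy values, and this is not the `AlternatingStep` class imported from `bitgenerator`.

```python
class TinyLFSR:
    """Toy LFSR: each new bit is the XOR of the bits `i` steps back, for i in taps."""

    def __init__(self, taps, state):
        self.taps = tuple(taps)
        self.state = list(state)  # oldest bit first, most recent bit last
        if len(self.state) != max(self.taps):
            raise ValueError("state length must equal the largest tap")

    def step(self):
        new = 0
        for i in self.taps:
            new ^= self.state[-i]
        self.state = self.state[1:] + [new]
        return new


def alternating_step(control, reg_a, reg_b, n):
    """Generate n keystream bits from three LFSRs (illustrative sketch)."""
    out_a = out_b = 0  # previous outputs are taken as 0 before the first step
    stream = []
    for _ in range(n):
        if control.step():
            out_a = reg_a.step()  # reg_b idles and repeats its last output
        else:
            out_b = reg_b.step()  # reg_a idles and repeats its last output
        stream.append(out_a ^ out_b)
    return stream


keystream = alternating_step(
    TinyLFSR((1, 3), [1, 0, 0]),
    TinyLFSR((1, 4), [1, 0, 0, 0]),
    TinyLFSR((2, 5), [1, 0, 0, 0, 0]),
    32,
)
```

Real designs use much longer registers; the short ones here only keep the example readable.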

## Known-Plaintext Attack (KPA)

A Known-Plaintext Attack (KPA) targets stream ciphers by using knowledge of part of the original plaintext and its corresponding ciphertext to recover the underlying keystream. Since stream ciphers encrypt data by XORing the plaintext with a keystream, simply XORing the known plaintext with its ciphertext segment yields the matching portion of the keystream. 
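As a small illustration of this step, the sketch below recovers a keystream prefix from a known plaintext prefix. The function name `recover_keystream` and the toy keystream bytes are assumptions made for the example; the report's own cells do the same thing with `Bits` objects.

```python
def recover_keystream(known_plaintext: bytes, ciphertext: bytes) -> bytes:
    """XOR the known plaintext with the ciphertext prefix of the same length."""
    # zip stops at the shorter input, so only the overlapping prefix is used.
    return bytes(p ^ c for p, c in zip(known_plaintext, ciphertext))


# Toy setup: a made-up keystream encrypts a short message.
toy_keystream = bytes([0x5A, 0xC3, 0x10, 0x7E, 0x99, 0x04])
plaintext = b"SECRET"
ciphertext = bytes(p ^ k for p, k in zip(plaintext, toy_keystream))

# The attacker knows only the first three plaintext bytes and the ciphertext.
recovered = recover_keystream(b"SEC", ciphertext)
```

Here `recovered` equals the first three bytes of `toy_keystream`, without the attacker ever seeing the key.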

In this case, the ciphertext (`ciphertext.bin`) and a portion of the original plaintext (`known-plaintext.txt`) were provided. XORing the known plaintext with the first part of the ciphertext produced the initial keystream segment. The Berlekamp-Massey algorithm was then used to analyze this keystream and compute the shortest feedback polynomial for a Linear Feedback Shift Register (LFSR) capable of generating it. An LFSR with this polynomial, seeded with the first recovered keystream bits, was then constructed to regenerate the full keystream. XORing this generated keystream with the entire ciphertext produced the original plaintext.
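Before regenerating the full keystream, it can be worth checking that the recovered polynomial really explains the whole known segment, and that the segment is long enough ($2L$ bits for linear complexity $L$) for the result to be unique. The sketch below does both on plain lists of bits. The helper names `fits_recurrence` and `enough_keystream` are hypothetical, and the polynomial is given as a set of degrees including 0.

```python
def fits_recurrence(bits, degrees):
    """Check that the XOR of bits[t - i] over all degrees i is 0 for every t >= L."""
    length = max(degrees)
    for t in range(length, len(bits)):
        acc = 0
        for i in degrees:
            acc ^= bits[t - i]
        if acc:
            return False
    return True


def enough_keystream(bits, degrees):
    """2L bits are sufficient to pin down an LFSR of linear complexity L."""
    return len(bits) >= 2 * max(degrees)


# Toy keystream generated by 1 + x + x^4, i.e. b_t = b_(t-1) XOR b_(t-4).
segment = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
ok = fits_recurrence(segment, {0, 1, 4}) and enough_keystream(segment, {0, 1, 4})

# A single flipped bit breaks the recurrence.
corrupted = segment[:10] + [segment[10] ^ 1] + segment[11:]
broken = fits_recurrence(corrupted, {0, 1, 4})
```

In this toy case `ok` is `True` and `broken` is `False`. For the attack in this report, the same check would be applied to the recovered keystream bits and the degree set returned by Berlekamp-Massey.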

In [19]:
# open the ciphertext and known plaintext files
with open("ciphertext.bin", "rb") as f:
    ciphertext = f.read()

with open("known-plaintext.txt", "r", encoding="utf-8") as f:
    known_plaintext = f.read().encode("utf-8")
    
print(f"Ciphertext: {ciphertext[:30]} ..., \nKnown-Plaintext: {known_plaintext[:90]}...")

Ciphertext: b"\xb7;\xcep\x9e\x7f\xc0\xe3H_'\xc6D\x9b\xe8\xbd\x8e[\x8b\xb0\x94\x00\xdf]\xa9\xd9\x152k\x06" ..., 
Known-Plaintext: b'The Legacy of the Hidden Key\n\nIn a quiet corner of the university library, where dust mote'...


In [42]:
# convert both byte sequences to Bits objects
kp_bits = Bits(known_plaintext)
cipher_bits = Bits(ciphertext)

In [43]:
# XOR the known plaintext with the matching ciphertext prefix to recover the keystream,
# then find the shortest feedback polynomial that generates it
bit_sequence = kp_bits ^ Bits(cipher_bits[:len(kp_bits)])
poly = berlekamp_massey(bit_sequence)
linear_complexity = max(poly) if poly else 0

print("Shortest feedback polynomial degrees:", poly)
print("Linear complexity:", linear_complexity)

Shortest feedback polynomial degrees: {0, 1, 9, 48, 19}
Linear complexity: 48


In [44]:
# recover the initial state of the LFSR from the first L = linear_complexity (here 48)
# bits of the keystream, taken in reversed order
init_state = Bits(bit_sequence[:linear_complexity][::-1])
lfsr = LFSR(poly, init_state)

In [45]:
# Generate the binary sequence for decryption using LFSR, starting with the initial state
decryption_bits = Bits([lfsr.output])
decryption_bits += lfsr.run_steps(len(cipher_bits)-1, init_state)

In [46]:
# decrypt the ciphertext using the generated binary sequence
decrypted_bits = decryption_bits ^ cipher_bits
decrypted_bytes = decrypted_bits.to_bytes()
print(decrypted_bytes[:130])

b'The Legacy of the Hidden Key\n\nIn a quiet corner of the university library, where dust motes danced in the slanted afternoon light,'


## Conclusion

Stream ciphers built on LFSRs offer efficient encryption but are directly exposed to the Berlekamp-Massey algorithm and Known-Plaintext Attacks: in the attack above, a keystream of linear complexity 48 was recovered from a known plaintext prefix and used to decrypt the entire ciphertext. The development of stream ciphers, from the One-Time Pad to modern PRNG-based systems, reflects ongoing efforts to keep the keystream unpredictable. Nonlinear designs such as the ASG and sound keystream generators are essential to counter these vulnerabilities. This report shows that while stream ciphers are valuable for real-time applications, their security depends on addressing cryptanalytic risks, and continued work on cryptographic design is needed for them to remain effective in securing digital communications (Katz & Lindell, 2014).

## References

- Biryukov, A., Shamir, A., & Wagner, D. (2000). Real-time cryptanalysis of A5/1 on a PC. Fast Software Encryption, 185–199.
- Hell, M., Johansson, T., & Meier, W. (2006). Grain: A stream cipher for constrained environments. eSTREAM, ECRYPT Stream Cipher Project.
- Katz, J., & Lindell, Y. (2014). Introduction to Modern Cryptography (2nd ed.). CRC Press.
- Massey, J. L. (1969). Shift-register synthesis and BCH decoding. IEEE Transactions on Information Theory, 15(1), 122–127.
- Menezes, A. J., van Oorschot, P. C., & Vanstone, S. A. (1996). Handbook of Applied Cryptography. CRC Press.
- Rueppel, R. A. (1986). Analysis and Design of Stream Ciphers. Springer.
- Shannon, C. E. (1949). Communication theory of secrecy systems. Bell System Technical Journal, 28(4), 656–715.
- Singh, S. (1999). The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography. Anchor Books.
- Stallings, W. (2017). Cryptography and Network Security: Principles and Practice (7th ed.). Pearson.

