<a id='top'></a>

<h1>Stream Cipher</h1>
<br>
Notes on Lecture 3, [Introduction to Cryptography](https://www.youtube.com/channel/UC1usFRN4LCMcfIV7UjHNuQg/videos), by Christof Paar, and Chapter 2, [Understanding Cryptography](https://www.amazon.com/Understanding-Cryptography-Christof-Paar-ebook/dp/B00HWUO98A), by Paar & Pelzl:<br>
<br>
Note: Many notebook functions and diagrams are unavailable /  will not display in static views (like GitHub). [View 'live' on MyBinder](https://mybinder.org/v2/gh/jinjagit/Cryptography/master?filepath=StreamCipher.ipynb)

In [None]:
# Run this code cell for help on using 'live' notebooks
with open('HowTo.txt', 'r') as f:  # the context manager closes the file after reading
    print(f.read())

<h2>Contents:</h2><br>
<a href='#intro'>Introduction</a><br>
<a href='#rngs'>Random Number Generators</a><br>
<a href='#one_time'>The One Time Pad</a><br>

**NOTE: The following code should be run before other code in this notebook is run** (imports necessary libraries):

In [None]:
%matplotlib inline 
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from ipywidgets import interact,FloatSlider,IntSlider
from IPython.display import HTML as html_print

print("libraries imported")  # <- (re)run until output prints; may take a few seconds to run.

<a id='intro'></a>

<h2>Introduction:</h2><br>

<center><img src=diagrams/CryptoTree.svg / ></center>

Note: GSM, used by mobile phones worldwide, was developed in the mid-1980s and was the first large-scale application of encryption. GSM uses a stream cipher to encrypt digital voice (audio) signals before they are transmitted to base towers.

**Definition:** A stream cipher encrypts bits individually (as opposed to a block cipher, which encrypts blocks of bits), **by adding a bit from a <i>key stream</i> to a plaintext bit.**

Usually, stream ciphers use very simple encryption and decryption methods.

Each bit $\Large{x_i}$ is encrypted by adding a secret key stream bit $\Large{s_i}$, modulo 2.

<center><img src=diagrams/asynchStream.svg / ></center>

Stream ciphers tend to be small and fast and are, therefore, relevant to applications that run in environments with relatively limited resources. In general, however, block ciphers are still used more frequently for encrypting computer communications. Neither kind can claim better overall efficiency (in software, fewer cycles; in hardware, fewer gates or smaller chip area).

**Encryption and Decryption:**

<i>The plaintext, ciphertext and key consist of individual bits;</i>

$x_i, y_i, s_i\in\{1, 0\}$

**Encryption:** $y_i = e_{s_i}(x_i)\equiv x_i + s_i\mod{2}$<br>
**Decryption:** $x_i = d_{s_i}(y_i)\equiv y_i + s_i\mod{2}$

  1. Encryption and decryption are the same function!
  2. Why can we use a simple modulo 2 addition as encryption?
  3. What is the nature of the stream bits $\large{s_i}$?

**Proof** that decryption uses the same algorithm as encryption: substitute the encryption expression $\large{x_i + s_i}$ for $\large{y_i}$ in the decryption function, then simplify:<br>
<br>
$d_{s_i}(y_i)\equiv y_i + s_i\mod{2}$ <br>
$d_{s_i}(y_i)\equiv (x_i + s_i) + s_i\mod{2}$ <br>
$d_{s_i}(y_i)\equiv x_i + 2s_i\mod{2}$ <br>
$d_{s_i}(y_i)\equiv x_i + 0\mod{2}$ <br>
$d_{s_i}(y_i)\equiv x_i\mod{2} \;\blacksquare$

Thus, simple modulo 2 addition works for decryption, as well as encryption, because $x_i, y_i, s_i \in\{1, 0\}$, that is, $x_i, y_i, s_i \in \mathbb{Z}_2$. In other words, they are all bits and can only have values of either $0$ or $1$, so $2s_i \equiv 0 \mod{2}$ whichever value $s_i$ takes, and adding the same key stream bit twice cancels out.
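
Since there are only four combinations of plaintext bit and key stream bit, the proof can also be checked exhaustively. A minimal sketch; the function names `encrypt_bit` and `decrypt_bit` are illustrative and do not come from the lecture:

```python
def encrypt_bit(x, s):
    """y = x + s mod 2"""
    return (x + s) % 2

def decrypt_bit(y, s):
    """x = y + s mod 2 (the same operation as encryption)"""
    return (y + s) % 2

# Try every combination of plaintext bit x and key stream bit s.
for x in (0, 1):
    for s in (0, 1):
        y = encrypt_bit(x, s)
        recovered = decrypt_bit(y, s)
        assert recovered == x  # decryption must return the original bit
        print(f"x={x} s={s} -> y={y} -> decrypted={recovered}")
```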

**RULE:** Modulo 2 addition and subtraction are the same operation.

In [None]:
# calculates a+b mod 2, and a-b mod 2

a = int(input("enter 1st integer: "))
b = int(input("enter 2nd integer: "))

c = (a + b)%2
d = (a - b)%2

print("%d + %d, modulo 2 = %d" % (a, b, c))
print("%d - %d, modulo 2 = %d" % (a, b, d))

Thus, $\;\;\;\;d_{s_i}(y_i)\equiv y_i + s_i\mod{2}\;\;\;\;$ and $\;\;\;\;d_{s_i}(y_i)\equiv y_i - s_i\mod{2}\;\;\;\;$ are equivalent.

Consider the truth table for x, s and y:

|        |  x  |  s  |  y = x + s  |  y = x - s  |
|--------|-----|-----|-------------|-------------|
| case 1 |  0  |  0  |      0      |      0      |
| case 2 |  0  |  1  |      1      |      1      |
| case 3 |  1  |  0  |      1      |      1      |
| case 4 |  1  |  1  |      0      |      0      |

The above represents an [XOR](https://en.wikipedia.org/wiki/XOR_gate) binary gate, executed using either addition or subtraction in modulo 2.<br>
An XOR gate implements an exclusive **or**; that is, a true output results if one, and only one, of the inputs to the gate is true.
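
In Python, the bitwise `^` operator is XOR, so the table above can be reproduced without modular arithmetic. A small sketch that compares the three formulations for every case (the variable names are illustrative):

```python
rows = []
for x in (0, 1):
    for s in (0, 1):
        add = (x + s) % 2   # modulo 2 addition
        sub = (x - s) % 2   # modulo 2 subtraction
        xor = x ^ s         # bitwise XOR
        rows.append((x, s, add, sub, xor))
        print(x, s, add, sub, xor)

# All three columns should be identical in every row.
all_agree = all(add == sub == xor for _, _, add, sub, xor in rows)
print("all three agree:", all_agree)
```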

Idea: To reduce any binary integer modulo 2, we need only read the value of its last bit (0 or 1). Not a great insight, as it is really another way of saying that the 'units' digit of any integer equals its value modulo the base. This is true for any base (e.g. the last decimal digit is the value modulo 10). With binary, however, there is also the advantage that testing a single bit is probably generally faster than doing modulo math on an integer. <i>I imagine</i> most decent math libraries / languages exploit this fact when performing modulo 2 math on an integer (but I do not know this).

In [None]:
# compare modulo 2 representation of an integer with the last digit of its binary representation

a = int(input("enter integer: "))

get_binary = lambda x: format(x, 'b') # useful function for converting integer to binary string

binary = get_binary(a)
last_digit = binary[-1:]
mod2 = a%2

print("binary representation: " + binary)
print("last binary digit: " + last_digit)
print("%d modulo 2 = %d" % (a, mod2))
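
The same idea can be written with a bit mask: `a & 1` keeps only the lowest bit of `a`. A brief sketch that checks it against `a % 2` over a range of integers; the helper name `lowest_bit` is illustrative. In Python the two expressions also agree for negative integers, because `%` with a positive modulus never returns a negative result.

```python
def lowest_bit(a):
    """Return the least significant bit of integer a using a bit mask."""
    return a & 1

# Compare the bit mask with modulo 2 for a range of positive and negative integers.
agree = all(lowest_bit(a) == a % 2 for a in range(-1000, 1001))
print("a & 1 == a % 2 for all tested a:", agree)
print("65 & 1 =", lowest_bit(65))
```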

<center><img src=diagrams/streamBasic.svg / ></center>

Considering the above truth table and its consequences, we can see that <i>if the probability of any stream bit $s_i$ being $1$ or $0$ is equal,</i> then the encoding of either $x_i = 0$, or $x_i = 1$, has an equal chance of giving $y_i = 0$, or $y_i = 1$, in either case.<br>
<br>
**Flipping bits:** Also, the value of the stream bit has a particular effect on the $x$ value provided: $s = 1$ will always flip the input bit; $0\,\to\,1$, or $1\,\to\,0$, whereas $s = 0$ will not change the input bit.<br>
  As there are two modulo 2 addition (XOR) operations in the encryption/decryption path, one using $s_i$ to encrypt $x_i$ and one using it to decrypt $y_i$, $x_i$ is eventually either flipped twice (if $s_i = 1$) or not at all (if $s_i = 0$).
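
The equal-probability claim can be made concrete by computing the ciphertext distribution exactly. The sketch below is an illustration rather than part of the lecture; `p_one` (the probability that a key stream bit is 1) and the function name are assumptions introduced here. With `p_one = 1/2` the ciphertext bit is uniformly distributed whatever the plaintext bit is, whereas a biased key stream lets the ciphertext leak information about the plaintext.

```python
from fractions import Fraction

def ciphertext_distribution(x, p_one):
    """Return (P(y=0), P(y=1)) for plaintext bit x, where P(s=1) = p_one."""
    p_flip = p_one        # s = 1 flips the plaintext bit
    p_keep = 1 - p_one    # s = 0 leaves it unchanged
    p_y1 = p_flip if x == 0 else p_keep
    return (1 - p_y1, p_y1)

# Compare an unbiased key stream with a heavily biased one.
for p_one in (Fraction(1, 2), Fraction(9, 10)):
    for x in (0, 1):
        p0, p1 = ciphertext_distribution(x, p_one)
        print(f"P(s=1)={p_one}, x={x}: P(y=0)={p0}, P(y=1)={p1}")
```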

In [None]:
# XOR on 4 cases of x + s, applied twice to emulate encryption and decryption of single bits.
# Note the replacement of modular arithmetic with a single XOR operation.

def get_XOR(x, s):
    """Return x XOR s: 0 if the inputs are equal, 1 otherwise."""
    return 0 if x == s else 1

def col_str(s):  # use html to print color output (white text on blue background)
    return "<span style='color:white;background-color:blue'>{}</span>".format(s)

def bl_sp(s):  # use html to print blue space (blue text on blue background)
    return "<span style='color:blue;background-color:blue'>{}</span>".format(s)

def boxed(bit):  # a highlighted bit with blue padding on either side
    return bl_sp('.') + col_str(bit) + bl_sp('.')

lines = []
for s in (0, 1):
    for x in (0, 1):
        y = get_XOR(x, s)      # encrypt
        x_dec = get_XOR(y, s)  # decrypt by applying the same operation again
        lines.append("encrypt {} with s = {} -> {} ... decrypt {} with s = {} -> {}".format(
            boxed(x), s, y, y, s, boxed(x_dec)))

html_print("<br>".join(lines))

Example: Encoding the ASCII character "A":

It's number 65 in the ASCII table, which is 1000001 in binary.
We apply a pre-defined stream of 0101101:

$x_7...x_1$ $= 1000001$ <br>
$s_7...s_1$ $= 0101101$

In [None]:
# Encrypting an ASCII 'A' in binary (requires previous code cell to have run)

def encr_bin_str(data_in, key_stream):
    # XOR each data bit with the corresponding key stream bit (both given as strings of '0'/'1')
    return "".join(str(get_XOR(x, s)) for x, s in zip(data_in, key_stream))

data_in = "1000001"
key_stream = "0101101"

encoded = encr_bin_str(data_in, key_stream)
print("data in = %s, after encryption = %s" % (data_in, encoded))
decoded = encr_bin_str(encoded, key_stream)  # note use of encryption algo to decrypt
print("encoded = %s, after decryption = %s" % (encoded, decoded))
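
The same scheme extends from one character to a whole message by XORing byte by byte. A minimal sketch, assuming the key stream is supplied as a `bytes` object at least as long as the message; the function name `xor_bytes` and the key stream values below are made up for illustration, and a fixed key stream like this is not secure.

```python
def xor_bytes(data, key_stream):
    """XOR each byte of data with the corresponding key stream byte."""
    if len(key_stream) < len(data):
        raise ValueError("key stream is shorter than the data")
    return bytes(d ^ k for d, k in zip(data, key_stream))

plaintext = "ATTACK AT DAWN".encode("ascii")
# Made-up key stream, one byte per plaintext byte (illustration only).
key_stream = bytes([0x2D, 0x5A, 0x13, 0xC4, 0x7E, 0x91, 0x08,
                    0xB6, 0x3F, 0xE2, 0x4C, 0xA7, 0x19, 0x60])

ciphertext = xor_bytes(plaintext, key_stream)
recovered = xor_bytes(ciphertext, key_stream)  # same function decrypts

print("ciphertext (hex):", ciphertext.hex())
print("recovered:", recovered.decode("ascii"))
```

Applied to the single character above, `xor_bytes(b"A", bytes([0b0101101]))` gives the byte `0b1101100`, matching the bit-string result.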

<a id='rngs'></a>

The generation of key stream bits needs to be somehow related to randomness.<br>
<br>
<br>
<h2>Random Number Generators (RNG):</h2><br>
Three main types:<br>
  1. **True Random Number Generator (TRNG):**<br>
  True random numbers stem from randomness in some physical processes; e.g. flipping a coin, roulette, noise (thermal), computer mouse movement, keystroke timing, radioactive decay, etc. Because such randomness cannot be reproduced, a TRNG is problematic as a direct source of key streams: sender and receiver cannot each generate the same stream, so the whole stream would have to be shared.
  2. **Pseudo-random Number Generator (PRNG):**<br>
  PRNs are computed and, therefore, deterministic (not random). From certain points of view, however, they may appear random (unpredictable). Generally, they are not adequate for cryptographic use, despite being useful in many other areas of computing (e.g. some software testing). In practice, the output from many common PRNGs exhibits artifacts that cause it to fail statistical pattern-detection tests [[Wikipedia]](https://en.wikipedia.org/wiki/Pseudorandom_number_generator). They are often calculated recursively from a seed: $s_0 =$ seed, $s_{i+1} = f(s_i)$.
  3. **Cryptographically Secure PRNG (CSPRNG):**<br>
  CSPRNGs are also computed using a recursive update function, like a PRNG, but additionally have the property that the numbers they produce <i>are</i> unpredictable. Specifically, this means that given $n$ consecutive bits of the key stream, there is no [polynomial time](http://mathworld.wolfram.com/PolynomialTime.html) algorithm that can predict the next bit $s_{n+1}$ with better than 50% chance of success. Informally, this definition is reduced to: it is computationally infeasible to compute $s_{n+1}$.
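
In Python's standard library, the last two categories map roughly onto two modules: `random` is a seedable PRNG (Mersenne Twister), and `secrets` draws from the operating system's cryptographically secure source. A short sketch of the practical difference; the helper name `prng_bits` is illustrative.

```python
import random
import secrets

def prng_bits(seed, n):
    """Return n bits from Python's (non-cryptographic) PRNG, started from seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(n)]

# A PRNG is deterministic: the same seed always reproduces the same stream.
first = prng_bits(seed=42, n=16)
second = prng_bits(seed=42, n=16)
print("same seed, same stream:", first == second)

# The secrets module takes no seed; its output is intended to be unpredictable.
csprng_bits = [secrets.randbits(1) for _ in range(16)]
print("CSPRNG bits:", csprng_bits)
```

The reproducibility shown in the first half is what lets two parties derive the same key stream from a shared seed; it is also why a predictable generator is unsuitable for this purpose.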

In [None]:
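# Emulating rand() function in ANSI C, a pseudo-random number generator.
# Based on code from: http://pubs.opengroup.org/onlinepubs/009695399/functions/rand.html

state = int(input("enter a seed value (an integer): "))
print("pseudo-random sequence of 10 integers: ", end='')
binary = ""

for i in range(0, 10):
    # Update the full internal state; % 2**32 emulates a 32-bit unsigned long (an assumption).
    state = (state * 1103515245 + 12345) % 2**32
    x = (state // 65536) % 32768  # output bits 16..30 of the state; RAND_MAX assumed to be 32767.
    print(x, end=' ')
    binary = binary + str(x % 2)
    
print("\nPRNG binary from integer sequence: " + binary)

Note: The constants 65536 ($2^{16}$) and 32768 ($2^{15}$) select bits 16 to 30 of the internal state: integer division by 65536 discards the 16 low-order bits, which are the least random bits of this kind of generator, and the result modulo 32768 keeps the next 15 bits, giving an output between 0 and RAND_MAX. The full state, not the 15-bit output, must be carried into the next iteration; feeding the output back in as the state leaves only 15 bits of state, which makes the sequence visibly non-random and makes a seed of 0 stay at 0. The $\mod{2^{31}}$ version given by the professor is the same kind of recurrence, with the state reduced modulo $2^{31}$ instead of $2^{32}$.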

<a id='one_time'></a>

<h2>The One Time Pad (OTP):</h2><br>
Goal: to build the 'perfect' cipher.<br>
<br>
**Definition:** a cipher is **'unconditionally secure'** (or information theoretically secure) if it cannot be broken even with <i>**infinite**</i> computing resources.<br>
<br>
The OTP is a stream cipher where:
  1. The key stream bits stem from a TRNG.
  2. Each key stream bit is used only once.
  
Of course, the problem now lies in how to communicate the key to the receiver, since the key is as long as the message (e.g. encrypting a 400 MB video file requires 400 MB, i.e. 3.2 Gbit, of key). Even if the key is communicated safely, it still needs to be kept safe (and it is probably very large). Moreover, the key cannot be reused (one time) if security is to be maintained.
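
A minimal OTP sketch follows. A true random source is not available from plain Python, so `secrets.token_bytes` (the operating system's CSPRNG) is used here as a stand-in; strictly speaking, that makes this an approximation rather than a true one-time pad. The messages and the helper name `xor_bytes` are made up for illustration. The second half shows why each key stream bit may be used only once: XORing two ciphertexts produced with the same key cancels the key and leaves the XOR of the two plaintexts.

```python
import secrets

def xor_bytes(a, b):
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))

message = b"MEET AT NOON"
key = secrets.token_bytes(len(message))  # the key is as long as the message
ciphertext = xor_bytes(message, key)
print("decrypts correctly:", xor_bytes(ciphertext, key) == message)

# Reusing the same key for a second message of the same length:
other = b"STAY AT HOME"
ciphertext2 = xor_bytes(other, key)

# An eavesdropper who XORs the two ciphertexts obtains message XOR other,
# with the key eliminated entirely.
leak = xor_bytes(ciphertext, ciphertext2)
print("key cancels out:", leak == xor_bytes(message, other))
```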

Document unfinished. In progress...

<center><a href='#top'>Back to top of document</a><br></center>