<h1><center>cs1001.py , Tel Aviv University, Spring 2018</center></h1>
<img src="http://www.pngall.com/wp-content/uploads/2016/05/Python-Logo-PNG-Image-180x180.png" width=50/>

## Recitation 5

We continued discussing complexity. Then we reviewed some properties of prime numbers and used them for primality testing. We reviewed the Diffie-Hellman protocol for finding a shared secret key and also tried to crack it. 

### Takeaways:
<ol>
    <li>In order to analyze the time complexity of a piece of code, try to bound the number of "basic operations" it performs.
    If your code contains loops, try to understand their structure (in series or nested, and dependent or independent). This may help in bounding the overall complexity.</li>    
    <li>The probabilistic function is_prime, that uses Fermat's primality test, can be used to detect primes quickly and efficiently, but has a (very small) probability of error. Its time complexity is $O(n^3)$, where $n$ is the number of bits of the input.</li>
    <li>The DH protocol relies on two main principles: the following equality $(g^{a}\mod p)^b \mod p = g^{ab} \mod p $ and the (believed) hardness of the discrete log problem (given $g,p$, and $x = g^{a} \mod p$ finding $a$ is hard). Make sure you understand the details of the protocol.</li>
</ol>

#### Code for printing several outputs in one cell (not part of the recitation):

In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

### Reminder: Big O notation

Given two functions $f(n)$ and $g(n)$,

$f(n) = O(g(n))$ 
 if and only if there exist $c > 0$ and $n_{0}\in \mathbb{R}$ such that
 $\forall n>n_0$:    
   $|f(n)| \leq c\cdot|g(n)|$.
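As a quick numerical illustration of the definition (a sanity check over a finite range, not a proof), the sketch below uses the hypothetical functions $f(n) = 3n^2 + 5n$ and $g(n) = n^2$ with the constants $c = 4$ and $n_0 = 5$. All of these are chosen here only for the example.

```python
def f(n):
    return 3 * n**2 + 5 * n   # hypothetical f, chosen for the example

def g(n):
    return n**2               # hypothetical g, chosen for the example

c, n0 = 4, 5                  # candidate constants from the definition

# check |f(n)| <= c*|g(n)| for every n with n0 < n < 10000
holds = all(abs(f(n)) <= c * abs(g(n)) for n in range(n0 + 1, 10_000))
print(holds)
```

The inequality $3n^2 + 5n \leq 4n^2$ is equivalent to $5 \leq n$, which is why $n_0 = 5$ works and smaller values of $n$ (for example $n = 4$) violate the bound.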

### Series Loops
Let $n$ denote the input size.
Let $f_1(n) = O(g_1(n))\;$ and $f_2(n)=O(g_2(n))$.

    for i in range(f1(n)):
        O(1)
    for j in range(f2(n)):
        O(1)

We showed that $f_1(n) + f_2(n) = O(g_1(n) + g_2(n))$ and that $f_1(n) + f_2(n) = O(\max(g_1(n), g_2(n)))$.


Show that $f_1 + f_2 + \ldots + f_k = O(f_{\max})$. That is, in a sum of a constant number of functions, the dominant function determines the growth rate.
A special case is that of a polynomial.
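To make the series case concrete, here is a minimal sketch that counts the $O(1)$ steps of two loops in series. The choices $f_1(n) = n$ and $f_2(n) = n^2$ and the name `count_series_ops` are assumptions made for this example.

```python
def count_series_ops(n):
    """Count the O(1) steps of two loops in series (hypothetical f1, f2)."""
    ops = 0
    for i in range(n):       # f1(n) = n iterations
        ops += 1
    for j in range(n ** 2):  # f2(n) = n**2 iterations
        ops += 1
    return ops

for n in (10, 100, 1000):
    total = count_series_ops(n)
    # the ratio to the dominant term n**2 approaches 1 as n grows
    print(n, total, total / n ** 2)
```

The total is $n + n^2$, and the printed ratio shows that the $n^2$ term alone describes the growth rate.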



### Independent nested loops

    for i in range(f1(n)):
        for j in range(f2(n)):
            O(1)

Show that $f_1(n) \cdot f_2(n) = O(g_1(n) \cdot g_2(n))$.


### Dependent nested loops

    for i in range(f1(n)):
        for j in range(i):
            O(1)
Use $\sum$ to bound the time complexity in this case:

$\sum_{i=1}^{n}{i} = O(n^2)$ - the arithmetic series

$\sum_{i=1}^{n}{2^i} = O(2^n)$ - the geometric series
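The two sums can be checked numerically. The sketch below counts the steps of a dependent nested loop with $f_1(n) = n$ (an assumption for the example) and compares the count with the closed form of the arithmetic series. It then compares the geometric sum with its closed form.

```python
def count_dependent_ops(n):
    """Count the O(1) steps when the inner loop length depends on i."""
    ops = 0
    for i in range(n):
        for j in range(i):   # inner loop runs i times
            ops += 1
    return ops

n = 200
# 0 + 1 + ... + (n-1) = n(n-1)/2, which is O(n**2)
print(count_dependent_ops(n), n * (n - 1) // 2)

# 2 + 4 + ... + 2**n = 2**(n+1) - 2, which is at most 2 * 2**n, i.e. O(2**n)
geometric = sum(2 ** i for i in range(1, n + 1))
print(geometric == 2 ** (n + 1) - 2)
```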

### Exercise: Analyze loops

    for i in range(1,n+1):
        j=1
        while j<=n:
            # the inner loop body is exactly ONE of the following updates:
            j += 1  # O(n**2)
            j += 7  # O(n**2), the inner loop does about n/7 iterations 
                    #   for each outer iteration
            j *= 2  # O(n*log(n))
            j *= 7  # O(n*log(n)), changing the log base is like 
                    #   multiplying by a constant
            j **= 2 # O(n*log(log(n))), we need to take a log on both sides 
                    #   *twice* (also, in this case j should start from 2)
            j += i  # O(n*log(n)), the sum of 1/i from i=1 to n is O(log(n))
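The bounds in the exercise can be checked empirically by counting inner-loop iterations for each update rule separately. This is a sketch: the helper `inner_iterations` and the choice $n = 10^4$ are assumptions made for the example.

```python
import math

def inner_iterations(n, update, start=1):
    """Count how many times the inner while loop runs for one outer iteration."""
    j, count = start, 0
    while j <= n:
        j = update(j)
        count += 1
    return count

n = 10 ** 4
print(inner_iterations(n, lambda j: j + 1))            # grows like n
print(inner_iterations(n, lambda j: j + 7))            # grows like n/7
print(inner_iterations(n, lambda j: j * 2))            # grows like log2(n)
print(inner_iterations(n, lambda j: j * 7))            # grows like log7(n)
print(inner_iterations(n, lambda j: j ** 2, start=2))  # grows like log2(log2(n))

# j += i depends on the outer index i, so sum over the whole outer loop
total = sum(inner_iterations(n, lambda j, i=i: j + i) for i in range(1, n + 1))
print(total, n * math.log(n))  # same order of magnitude
```

For the first five rules the inner count does not depend on `i`, so the overall complexity is $n$ times the printed value. For `j += i` the inner count is about $n/i$, which is why the whole double loop is summed.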

## Primes and Diffie-Hellman

#### Primality test using Fermat's witness

Fermat's little theorem: if $p$ is prime and $1 < a < p$, then $a^{p-1} \equiv 1 \pmod{p}$.

Equivalently: if there exists $1 < a < m$ such that $a^{m-1} \not\equiv 1 \pmod{m}$, then $m$ is not prime. Such a number $a$ is called a witness to the fact that $m$ is not prime.
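For a small number, the witnesses can simply be listed. The sketch below does this by brute force. The helper name `fermat_witnesses` and the sample values $15$ and $13$ are chosen for this example only.

```python
def fermat_witnesses(m):
    """Return all a with 1 < a < m such that a**(m-1) mod m != 1."""
    return [a for a in range(2, m) if pow(a, m - 1, m) != 1]

composite_witnesses = fermat_witnesses(15)  # 15 = 3 * 5 is composite
prime_witnesses = fermat_witnesses(13)      # 13 is prime

print(composite_witnesses)  # most values of a expose 15 as composite
print(prime_witnesses)      # a prime has no witnesses at all
```

Note that a composite number may still have some values of $a$ that are not witnesses (for $15$ these include $a = 4$, since $4^2 = 16 \equiv 1 \pmod{15}$), which is why a single trial is not enough.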

We can use Fermat's little theorem in order to test whether a given number is prime. Note that if the number has $n$ bits then testing all possible $a$-s will require $O(2^n)$ iterations (a lot!).

Instead, we will try 100 random $a$-s in the range and see if one works as a witness.

In [3]:
import random

def is_prime(m, show_witness=False):

    """ probabilistic test for m's compositeness """

    for i in range(0,100):
        a = random.randint(1,m-1) # a is a random integer in [1..m-1]
        if pow(a,m-1,m) != 1:
            if show_witness:  # caller wishes to see a witness
                print(m,"is composite","\n",a,"is a witness, i=",i+1)
            return False

    return True


For $a,b,c$ of at most $n$ bits each, time complexity of modpower is $O(n^3)$

In [3]:
def modpower(a, b, c):
    """ computes a**b modulo c, using iterated squaring """
    result = 1
    while b>0: # while b is nonzero
        if b%2 == 1: # b is odd
            result = (result * a) % c
        a = (a*a) % c
        b = b//2
    return result
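To see where the bound comes from, the sketch below repeats the iterated squaring loop with an added counter (the name `modpower_counted` is made up for this example). The loop runs once per bit of $b$, so at most $n$ times, and each iteration performs at most two multiplications and reductions of $n$-bit numbers. Assuming each of those takes $O(n^2)$, as in the course's analysis, this gives $O(n^3)$ overall.

```python
def modpower_counted(a, b, c):
    """Compute a**b modulo c by iterated squaring and count loop iterations."""
    result, iterations = 1, 0
    while b > 0:
        if b % 2 == 1:            # current lowest bit of b is 1
            result = (result * a) % c
        a = (a * a) % c           # square the base
        b = b // 2                # drop the lowest bit
        iterations += 1
    return result, iterations

value, iterations = modpower_counted(3, 200, 13)
print(value == pow(3, 200, 13))          # agrees with the built-in pow
print(iterations, (200).bit_length())    # one iteration per bit of b
```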

#### The probability of error:
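First, notice that if the function says that an input number $m$ is not prime, then this is true. 
The function can make a mistake only in the case where a number $m$ is not prime but is accidentally categorized by the function as prime. This can happen only if none of the $100$ values of $a$ that the function tried was a witness. According to the Miller-Rabin theorem, at least $\frac{3}{4}$ of all possible $a$s are witnesses, so the probability of error is at most $(\frac{1}{4})^{100}$ (this is extremely low). Strictly speaking, this bound is for the witnesses of the Miller-Rabin test, a strengthened version of Fermat's test; for the plain Fermat test there are rare composite numbers (Carmichael numbers, e.g. $561$) for which every $a$ coprime to $m$ fails to be a witness.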
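To get a feeling for how small this bound is, it can be computed exactly. The sketch below takes the witness fraction quoted above as given. The helper `error_bound` is a name made up for this example.

```python
from fractions import Fraction

def error_bound(non_witness_fraction, trials):
    """Bound on the chance that all independent trials miss a witness."""
    return non_witness_fraction ** trials

bound = error_bound(Fraction(1, 4), 100)
print(float(bound))                          # on the order of 1e-61
print(float(error_bound(Fraction(1, 4), 10)))  # already below one in a million
```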

#### Testing the prime number theorem: for a large n, a number of n bits is prime with a probability of O(1/n)
We decide on the size of the sample (to avoid testing all possible $2^{n-1}$ numbers of $n$ bits) and test whether each number we sample is prime. Then we divide the number of primes by the size of the sample.

In [1]:
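def prob_prime(n, sample):
    """ estimates the fraction of primes among n-bit numbers,
        using a random sample of the given size """
    cnt = 0
    for i in range(sample):
        m = random.randint(2**(n-1), 2**n-1)  # a random n-bit number
        cnt += is_prime(m)  # True counts as 1, False as 0
    return cnt/sample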



In [6]:
prob_prime(2, 10**4)
prob_prime(3, 10**5)


0.50197

In [7]:
prob_prime(100, 10**4)

0.0164

In [8]:
prob_prime(200, 10**4)

0.0059
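For comparison, the prime number theorem says that the density of primes near $x$ is roughly $1/\ln x$. For $n$-bit numbers $x \approx 2^n$, so the expected fraction is roughly $1/(n \ln 2)$. The sketch below evaluates this approximation next to the sampled values from the runs above. The helper name `pnt_estimate` is an assumption for this example, and only rough agreement should be expected since both numbers are approximations.

```python
import math

def pnt_estimate(n):
    """Approximate fraction of primes among n-bit numbers: 1 / ln(2**n)."""
    return 1 / (n * math.log(2))

# sampled values copied from the outputs above
for n, observed in ((100, 0.0164), (200, 0.0059)):
    print(n, observed, round(pnt_estimate(n), 4))
```

Doubling the number of bits halves the estimate, which matches the $O(1/n)$ behaviour stated in the heading.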

Diffie-Hellman, from the lecture:

<img src="DH.PNG">

#### The protocol as code

In [12]:
def DH_exchange(p):
    """ generates a shared DH key """
    g = random.randint(1, p-1)
    a = random.randint(1, p-1)  # Alice's secret
    b = random.randint(1, p-1)  # Bob's secret
    x = pow(g, a, p)  # sent by Alice to Bob
    y = pow(g, b, p)  # sent by Bob to Alice
    key_A = pow(y, a, p)  # the key as computed by Alice
    key_B = pow(x, b, p)  # the key as computed by Bob
    # the next line is different from lecture
    return g, a, b, x, y, key_A  # key_A == key_B
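The equality $(g^{a} \mod p)^b \mod p = g^{ab} \mod p$ that the protocol relies on can be traced by hand with tiny numbers. The values below ($p = 23$, $g = 5$, $a = 4$, $b = 3$) are toy values chosen for this example and are far too small for real use.

```python
p, g = 23, 5           # public prime and base (toy values)
a, b = 4, 3            # Alice's and Bob's secrets (toy values)

x = pow(g, a, p)       # Alice sends x = 5**4 mod 23
y = pow(g, b, p)       # Bob sends y = 5**3 mod 23
key_A = pow(y, a, p)   # Alice computes y**a mod p
key_B = pow(x, b, p)   # Bob computes x**b mod p

print(x, y, key_A, key_B, pow(g, a * b, p))  # prints 4 10 18 18 18
```

Both parties arrive at $g^{ab} \mod p$ without either secret being sent over the channel.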

#### Find a prime number

In [10]:
def find_prime(n):
    """ find random n-bit long prime """
    while(True):
        candidate = random.randrange(2**(n-1),2**n)
        if is_prime(candidate):
            return candidate

Demonstration:

In [16]:
import random
p = find_prime(10)
print(p)
g,a,b,x,y,key = DH_exchange(p)
g,a,b,x,y,key

881


(742, 530, 729, 391, 9, 260)

In [17]:
print(pow(g, a, p))
print(pow(x, b, p))

391
260


#### Crack the Diffie Hellman code
There is no known way to find $a$ efficiently, so we try the naive approach: iterating over all possible values of $a$ and checking whether the equation $g^a \mod p = x$ holds for them. 

If we found $a'$ that satisfies the condition but is not the original $a$, does it matter?

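The time complexity of crack_DH is $O(2^n n^3)$, where $n$ is the number of bits of $p$: there are fewer than $2^n$ candidates for $a$, and each one costs a modular exponentiation, which takes $O(n^3)$.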

In [18]:
def crack_DH(p, g, x):
    ''' find secret "a" that satisfies g**a%p == x
        Not feasible for large p '''
    for a in range(1,p-1):
        if a%100000==0:
            print(a) #just to estimate running time
        if pow(g,a,p) == x:
            return a
    return None #should never get here

In [19]:
print(a)
crack_DH(p,g,x)

530


90
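In the run above the search returned a different exponent than the original secret. The following sketch, with toy values chosen for this example ($p = 23$, $g = 2$, $a = 15$, $b = 7$), shows why this does not matter to an attacker: any $a'$ with $g^{a'} \mod p = x$ yields the same key, because $y^{a'} = g^{ba'} = (g^{a'})^b = x^b \pmod{p}$.

```python
p, g = 23, 2           # toy public values; 2 has order 11 modulo 23
a, b = 15, 7           # toy secrets of Alice and Bob

x = pow(g, a, p)       # what Alice sends
y = pow(g, b, p)       # what Bob sends

# brute-force search, as in crack_DH: first exponent that reproduces x
a_prime = next(e for e in range(1, p - 1) if pow(g, e, p) == x)

real_key = pow(y, a, p)           # the key Alice computes
cracked_key = pow(y, a_prime, p)  # the key the attacker computes
print(a, a_prime, real_key, cracked_key)
```

Here the search stops at an exponent smaller than $a$, yet the two keys coincide.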

#### Trying to crack the protocol with a 100-bit prime

In [20]:
import random
p = find_prime(100)
print(p)
g,a,b,x,y,key = DH_exchange(p)
print(g,a,b,x,y,key)

crack_DH(p,g,x)

810543813422532635594122039951
83739677916973497183932873435 422273583332289289081486172915 652126468197482555431588471278 180840888183890983767154816426 649462685083822063286973416790 665441102083594163792581623840
100000
200000
300000
400000
500000
600000
700000
800000
900000
1000000
1100000
1200000
1300000
1400000
1500000
1600000
1700000
1800000
1900000


KeyboardInterrupt: 

Estimating the number of years it would take to crack the protocol if $a$ is found only at the end (assuming that iterating over 100000 values of $a$ takes a second):

In [21]:
810543813422532635594122039951//100000/60/60/24/365


2.5702175717355805e+17
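The same back-of-the-envelope estimate can be repeated for other key sizes. The sketch below assumes the same rate of 100000 candidates per second and a worst case of about $2^n$ candidates for an $n$-bit prime. The function name `years_to_try_all` is made up for this example.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def years_to_try_all(n_bits, candidates_per_second=100_000):
    """Rough worst-case years needed to try about 2**n_bits exponents."""
    candidates = 2 ** n_bits
    return candidates // candidates_per_second / SECONDS_PER_YEAR

for n_bits in (20, 40, 60, 80, 100):
    print(n_bits, years_to_try_all(n_bits))
```

Each additional bit doubles the estimate, so the exhaustive search becomes infeasible long before $n$ reaches practical key sizes.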