# How long does the attack take for a given $n$ on given hardware?

The attack parameters are $n$ and $l$: there are $2^n$ possible digests and $2^l$ stored hashes. We also let $g$ denote the number of ignored bits.

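A single query hits one of the stored hashes with probability $p = \frac{2^l}{2^n}$. The number of queries until the first hit is a geometric random variable, so we expect a collision after $\#queries = \frac{1}{p} = \frac{2^n}{2^l}$ queries.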
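As a quick numeric check of this expectation, the minimal sketch below evaluates $1/p$ for values that appear later in this notebook ($n = 80$, $l = 52$). The helper name `expected_queries` is hypothetical and introduced only for illustration.

```python
from math import log2

def expected_queries(n, l):
    """Expected number of queries until the first hit.

    Each query hits one of the 2**l stored digests with probability
    p = 2**l / 2**n, so the number of queries until the first hit is
    geometric with mean 1/p = 2**(n - l).
    """
    p = 2**l / 2**n
    return 1 / p

q = expected_queries(80, 52)
print(f"expected queries = 2^{log2(q):.0f}")
```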

Assume we only accept digests that have a certain number of zeros, denoted $d$ (the difficulty). We can then pretend that we are working with shorter digests:
$$
\begin{align}
&\#queries = \frac{2^n}{2^l} \\
&\#queries_{sec} \cdot t_{sec} = \frac{2^n}{2^l} \\
\Rightarrow &n = \log_2\left(\#queries_{sec} \cdot t_{sec} \cdot 2^{l}  \right)
\end{align}
$$
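Solving $\#queries_{sec} \cdot t_{sec} = \frac{2^n}{2^l}$ for $n$ gives the largest digest size that can be attacked within a time budget. The following is a minimal sketch that ignores the difficulty $d$; the function name `max_attackable_n` and the example query rate of $2^{30}$ per second are assumptions made for illustration, not measurements.

```python
from math import log2

def max_attackable_n(queries_per_sec, t_sec, l):
    """Largest n (in bits) such that 2**n / 2**l queries fit in t_sec seconds."""
    # queries_per_sec * t_sec = 2**(n - l)  =>  n = l + log2(queries_per_sec * t_sec)
    return l + log2(queries_per_sec * t_sec)

# Hypothetical rate of 2**30 queries per second, one day, 2**52 stored hashes
n_max = max_attackable_n(2**30, 24 * 3600, 52)
print(f"n <= {n_max:.2f} bits")
```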


There are three points of view on $\#queries$:
- Senders: how many hashes do they generate?
    - Their speed is affected by the difficulty, but from their perspective the overall attack time does not change when the difficulty changes (explanation to be added later).
    - $\#snd\_queries_{sec} = \frac{\#senders \cdot \#gen\_hashes_{sec}} {2^{d}}$

- Receivers: how many dictionary queries can they make?
    - In their world, the higher the difficulty, the better the chance of hitting a collision (since the digests are effectively shorter).
    - $\#rcv\_queries_{sec} = \#receivers \cdot \#dict\_queries_{sec}$

- Bandwidth: how many hashes the network can carry per second.
    - From its perspective, difficulty reduces the rate of transmitted messages.
    - $\#bdwth\_queries_{sec}$


Thus,

$$\#queries_{sec} := \min\left(\#snd\_queries_{sec}, \#rcv\_queries_{sec}, \#bdwth\_queries_{sec}\right)$$
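The bottleneck rule can be sketched as a small function. This is an illustration under stated assumptions: the name `effective_queries_per_sec` and the bandwidth figure of $2^{40}$ hashes per second are hypothetical, while the remaining numbers are parameters and results that appear elsewhere in this notebook.

```python
def effective_queries_per_sec(nsenders, gen_hashes_sec, nreceivers,
                              dict_queries_sec, bandwidth_queries_sec,
                              difficulty=0):
    """Return the bottleneck rate: the minimum of the three views."""
    # senders: only one hash in 2**difficulty passes the filter
    snd = nsenders * gen_hashes_sec / 2**difficulty
    # receivers: dictionary lookups per second
    rcv = nreceivers * dict_queries_sec
    return min(snd, rcv, bandwidth_queries_sec)

# bandwidth_queries_sec is a made-up placeholder, not a measurement
rate = effective_queries_per_sec(nsenders=672, gen_hashes_sec=2**24.87,
                                 nreceivers=352, dict_queries_sec=2**21.863350,
                                 bandwidth_queries_sec=2**40, difficulty=2)
```

With these inputs the receivers are the slowest of the three, so `rate` equals the receiver rate.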


In [1]:
# basic parameters per cluster
nservers = 32
server_memory = (196 - 20)*10**9 # bytes per server: 196 GB - 20 GB = 176 GB
ncores_per_server = 32
hashes_sec_core = 2**24.87
dict_queries_sec = 2**21.863350
t_sec =  1 * 24 * 3600
hashes_sec_phase_i = 2**24.72
dict_add_sec = 2**23.41
log2_l = 52 # log2 of #hashes stored
nhashes_stored = 2**log2_l # 2**52 hashes



In [2]:
# basic functions

def seconds_2_time(t):
    """
    Convert seconds into understandable string
    """
    
    from math import floor
    t = float(t)
    days  = floor(t/(3600*24))
    t = t - days*24*3600
    hours = floor(t/3600)
    t = t - hours*3600
    minutes = floor(t/60)
    t = t - minutes*60

    return f"{days} days, {hours} hours, {minutes} mins, {floor(t)} sec"



def nqueries_sender(nsenders, hashes_sec_core, difficulty=0):
    """ Return how many queries senders can generate per second """

    return nsenders*hashes_sec_core/(2**difficulty)

def nqueries_receiver(nreceivers, dict_queries_sec):
    """
    Return how many queries receivers can make in a second
    """
    return nreceivers * dict_queries_sec

# phase 0 of phase ii
def t_regen_msg(nsenders,
                nreceivers,
                hashes_sec_core,
                dict_add_sec,
                difficulty,
                nhashes_stored):
    """ 
    return number of seconds needed to regenerate the long message
    """
    nsecs_sender = nhashes_stored / nqueries_sender(nsenders,
                                                   hashes_sec_core,
                                                   difficulty=0)
    
    # how many hashes receivers will try to store?
    nhashes_receiver = (nhashes_stored/(2**difficulty)) 
    nsecs_receiver = nhashes_receiver / nqueries_receiver(nreceivers,
                                                          dict_add_sec)
    
    
    # since we will wait for everyone to finish
    return max(nsecs_receiver, nsecs_sender)


# phase 1 of phase ii
def t_enough_candidates(n,
                       nsenders,
                       nreceivers,
                       hashes_sec_core,
                       dict_add_sec,
                       difficulty,
                       nhashes_in_dict):
    """
    Return time needed to generate enough candidates
    """
    # how many hashes the senders have to make
    nreq_qsender = (2**n/nhashes_in_dict)
    # time needed for senders to get this number of queries
    t_req_qsender = nreq_qsender / nqueries_sender(nsenders,
                                                   hashes_sec_core,
                                                   difficulty)
    # how many queries the receivers need to get enough candidates?
    # it should be less than or equal to the number of hashes
    nreq_qrecv = nreq_qsender/(2**difficulty)
    # the lookup rate is the module-level dict_queries_sec;
    # the dict_add_sec argument is not used in this function
    t_req_qrecv = nreq_qrecv/nqueries_receiver(nreceivers,
                                               dict_queries_sec)

    # both sides must finish, so the slower one sets the time
    return max(t_req_qrecv, t_req_qsender)
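
The sender and receiver times in this phase move in opposite directions as the difficulty grows: sender time doubles with each extra bit, receiver time halves. The sketch below mirrors the two formulas of `t_enough_candidates` in a self-contained form. The name `phase_times` is hypothetical, and holding `nhashes_in_dict` fixed at 176e9 for every difficulty is a simplifying assumption made here.

```python
def phase_times(n, nsenders, nreceivers, hashes_sec_core,
                dict_queries_sec, difficulty, nhashes_in_dict):
    """Return (sender_time, receiver_time) in seconds for one difficulty."""
    nreq = 2**n / nhashes_in_dict
    # senders produce accepted hashes 2**difficulty times more slowly
    t_snd = nreq / (nsenders * hashes_sec_core / 2**difficulty)
    # receivers see 2**difficulty times fewer queries
    t_rcv = (nreq / 2**difficulty) / (nreceivers * dict_queries_sec)
    return t_snd, t_rcv

for d in range(5):
    t_snd, t_rcv = phase_times(80, 672, 352, 2**24.87, 2**21.863350, d, 176e9)
    print(f"d={d}: senders {t_snd:8.0f} s, receivers {t_rcv:8.0f} s, "
          f"max {max(t_snd, t_rcv):8.0f} s")
```

Under these assumptions the two times roughly balance at `d = 2`, which is where their maximum is smallest.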
    

In [5]:
def best_parameter(n,
                   nservers,
                   server_memory,
                   ncores_per_server,
                   nhashes_stored,
                   hashes_sec_core,
                   dict_queries_sec,
                   verbose=False):
    
    """
    Find the best parameters: nsenders, nreceivers, and difficulty
    that minimizes the run time on given cluster.
    """
    from math import log2
    # step 1 loop over decompose nsenders, nreceivers
    # step 2  loop over difficulty
    # step 3 find the time
    # step 4 store the minimum
    t_min = float("inf")
    # how many hashes the servers can store in total
    # (the divisor 32 is assumed to be the bytes needed per stored hash)
    max_nhashes_in_memory = server_memory*nservers / (32)
    
    if (verbose):
        print(f"Memory can take at most 2^{log2(max_nhashes_in_memory)} hashes")
        print("++++++++++++++++++++++++++++++++++++++++++++++++++\n\n")
    
    for nreceivers in range(nservers, 
                           nservers*ncores_per_server,
                           nservers):
        
        nsenders = nservers*ncores_per_server - nreceivers
        
        for difficulty in range(9):
            t1 = t_regen_msg(nsenders,
                            nreceivers,
                            hashes_sec_core,
                            dict_add_sec,
                            difficulty,
                            nhashes_stored)
            
            if (t1<0):
                continue
                
            nhashes_in_dict = min(max_nhashes_in_memory,
                                 nhashes_stored/(2**difficulty))

            
            
            t2 = t_enough_candidates(n,
                                     nsenders,
                                     nreceivers,
                                     hashes_sec_core,
                                     dict_add_sec,
                                     difficulty,
                                     nhashes_in_dict)
    
            t = t1 + t2
            if (t < t_min):
                if (verbose):
                    print(f"t={seconds_2_time(t)}\n"
                          +f"t1={seconds_2_time(t1)}\n"
                          +f"t2={seconds_2_time(t2)}\n"
                          +f"nsenders={nsenders}, nreceivers={nreceivers}, "
                          +f"difficulty={difficulty}\n"
                          +f"log2(nhashes)={log2(nhashes_in_dict)}")
                    print("===============================\n\n")
                
                t_min = t
                t1_min = t1
                t2_min = t2
                nsenders_min = nsenders
                nreceivers_min = nreceivers
                difficulty_min = difficulty
                nhashes_in_dict_min = nhashes_in_dict
    
    return {"t" : seconds_2_time(t_min), 
            "t1" : seconds_2_time(t1_min),
            "t2" : seconds_2_time(t2_min),
            "nsenders" : nsenders_min,
            "nreceivers" : nreceivers_min,
            "difficulty" : difficulty_min,
            "lg2(nhashes_in_dict)" : log2(nhashes_in_dict_min)}



In [4]:
i = 10 # not set in this cell; i = 10 is consistent with the t1 and t2 shown below
best_parameter(80,
               nservers,
               server_memory,
               ncores_per_server,
               nhashes_stored/(2**i),
               hashes_sec_core,
               dict_queries_sec,
               verbose=False)


{'t': '0 days, 0 hours, 26 mins, 53 sec',
 't1': '0 days, 0 hours, 4 mins, 40 sec',
 't2': '0 days, 0 hours, 22 mins, 13 sec',
 'nsenders': 672,
 'nreceivers': 352,
 'difficulty': 2,
 'nhashes_in_dict': 176000000000.0}

In [8]:
for i in range(20):
    print(f"i={i}")
    print(best_parameter(88,
               nservers,
               server_memory,
               ncores_per_server,
               nhashes_stored/(2**i),
               hashes_sec_core,
               dict_queries_sec,
               verbose=False))
    print("-------------------\n\n")

i=0
{'t': '7 days, 4 hours, 15 mins, 8 sec', 't1': '2 days, 19 hours, 27 mins, 5 sec', 't2': '4 days, 8 hours, 48 mins, 3 sec', 'nsenders': 608, 'nreceivers': 416, 'difficulty': 2, 'lg2(nhashes_in_dict)': 37.35678447262356}
-------------------


i=1
{'t': '5 days, 14 hours, 40 mins, 39 sec', 't1': '1 days, 15 hours, 51 mins, 27 sec', 't2': '3 days, 22 hours, 49 mins, 11 sec', 'nsenders': 672, 'nreceivers': 352, 'difficulty': 2, 'lg2(nhashes_in_dict)': 37.35678447262356}
-------------------


i=2
{'t': '4 days, 18 hours, 44 mins, 55 sec', 't1': '0 days, 19 hours, 55 mins, 43 sec', 't2': '3 days, 22 hours, 49 mins, 11 sec', 'nsenders': 672, 'nreceivers': 352, 'difficulty': 2, 'lg2(nhashes_in_dict)': 37.35678447262356}
-------------------


i=3
{'t': '4 days, 8 hours, 47 mins, 3 sec', 't1': '0 days, 9 hours, 57 mins, 51 sec', 't2': '3 days, 22 hours, 49 mins, 11 sec', 'nsenders': 672, 'nreceivers': 352, 'difficulty': 2, 'lg2(nhashes_in_dict)': 37.35678447262356}
-------------------


i=4
