# Best attack parameters for a given hardware setup

We assume the attack parameters are $n$ (the digest length in bits) and $l$ (the dictionary holds $2^l$ entries). Let $g$ denote the number of ignored bits.

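Each query hits the dictionary with probability $p = \frac{2^l}{2^n}$. The number of queries until the first hit is a geometric random variable, so we expect a collision after $\#queries = \frac{1}{p} = \frac{2^n}{2^l}$ queries.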
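As a sanity check on the geometric model, the following sketch simulates it at toy scale. The values `n = 12` and `l = 6`, the seed, the number of trials, and the choice of the first $2^l$ integers as the "dictionary" are illustrative assumptions, not parameters of the attack. The empirical mean should land near $2^{n-l} = 64$.

```python
import random

def queries_until_hit(n, l, rng):
    """Draw uniform n-bit values until one falls in a set of 2^l stored values."""
    queries = 0
    while True:
        queries += 1
        # Assumption: the dictionary is modelled as the integers 0 .. 2^l - 1.
        if rng.randrange(2**n) < 2**l:
            return queries

rng = random.Random(0)   # fixed seed so the run is repeatable
n, l = 12, 6             # toy sizes, far smaller than a real attack
trials = 5000
mean_queries = sum(queries_until_hit(n, l, rng) for _ in range(trials)) / trials
expected_queries = 2**(n - l)   # 1/p for p = 2^l / 2^n
print(f"simulated mean = {mean_queries:.2f}, expected = {expected_queries}")
```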

Assume we only accept digests that have a certain number of zero bits, denoted $d$ (the difficulty). We can then treat the digests as if they were only $n-d$ bits long:
$$
\begin{align}
&\#queries = \frac{2^{n-d}}{2^l} = 2^{n-l-d} \\
&\#queries_{sec} \cdot t_{sec} = 2^{n-l-d} \\
\Rightarrow &n = \log_2\left(\#queries_{sec} \cdot t_{sec}\right) + l + d
\end{align}
$$
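A minimal sketch of the relation $n = \log_2\left(\#queries_{sec} \cdot t_{sec}\right) + l + d$ and its inverse, the time needed for a given $n$. The function names and the toy values (a rate of $2^{20}$ queries per second for $2^{10}$ seconds, $l = 30$, $d = 4$) are assumptions chosen so that the arithmetic comes out in whole numbers.

```python
from math import log2

def largest_n_for_budget(queries_sec, t_sec, l, d):
    """n = log2(queries_sec * t_sec) + l + d."""
    return log2(queries_sec * t_sec) + l + d

def attack_time_sec(n, l, d, queries_sec):
    """Expected seconds needed to perform the 2^(n-l-d) queries."""
    return 2**(n - l - d) / queries_sec

# Toy budget: 2^20 queries/sec for 2^10 seconds.
n = largest_n_for_budget(2**20, 2**10, l=30, d=4)
# Going back from n to the time budget should recover 2^10 seconds.
t = attack_time_sec(n, l=30, d=4, queries_sec=2**20)
print(n, t)
```

Running both directions is a quick way to confirm that the two functions are inverses of each other for a fixed rate.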


There are three points of view on $\#queries_{sec}$:
- Senders: how many hashes can they generate?
    - Their speed is affected by the difficulty, but from their perspective the overall attack time does not change when the difficulty changes: their rate drops by a factor of $2^d$, and so does the number of required queries, $2^{n-l-d}$.
    - $\#snd\_queries_{sec} = \frac{\#senders \cdot \#gen\_hashes_{sec}} {2^{d}}$

- Receivers: how many dictionary queries can they make?
    - In their world, a higher difficulty gives a better chance of hitting a collision per query (since the digests are effectively shorter).
    - $\#rcv\_queries_{sec} = \#receivers \cdot \#dict\_queries_{sec} $

- Bandwidth: how many hashes the network can carry per second.
    - From its perspective, difficulty reduces the rate of transmitted messages.
    - $\#bdwth\_queries_{sec}$


Thus,

$$\#queries_{sec} := \min\left(\#snd\_queries_{sec}, \#rcv\_queries_{sec}, \#bdwth\_queries_{sec}\right)$$
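The bottleneck selection can be sketched as follows. The function name, the returned label, and in particular the bandwidth figure are assumptions: no network measurement is given here, so the example passes `float("inf")` as a placeholder, which means the network never limits the result.

```python
def effective_queries_sec(nsenders, nreceivers, hashes_sec_core,
                          dict_queries_sec, difficulty, bdwth_queries_sec):
    """Return (rate, bottleneck), where rate is the minimum of the three views."""
    rates = {
        "senders": nsenders * hashes_sec_core / 2**difficulty,
        "receivers": nreceivers * dict_queries_sec,
        "bandwidth": bdwth_queries_sec,
    }
    bottleneck = min(rates, key=rates.get)
    return rates[bottleneck], bottleneck

# The 1400/40 core split and the per-core rates are the values used elsewhere
# in this document; the bandwidth value is a placeholder, not a measurement.
for difficulty in (3, 4):
    rate, who = effective_queries_sec(1400, 40, 2**24, 2**25.4227,
                                      difficulty, float("inf"))
    print(f"difficulty={difficulty}: {rate:.3e} queries/sec, limited by {who}")
```

Returning the name of the limiting side alongside the rate makes it easy to see at which difficulty the bottleneck moves from the receivers to the senders.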


In [7]:
# Numbers from Gros cluster, nancy, grid5000.fr
nservers = 40
server_memory = (96-7)*10**9 # bytes: 96 GB minus 7 GB
ncores_per_server = 36
hashes_sec_core =  2**24 # 56 MB
dict_queries_sec = 2**25.4227
t_sec = 3600*11 # 11 hours; alternative: 31 * 24 * 3600 (31 days)

hashes_sec_phase_i = 2**24.72

# 1 core hashing power
# thd2 sha_avx512_16way  elapsed 1.78sec i.e. 898392.92 hashes/sec = 2^19.777 hashes, 57.4971 M


# Querying 100000000, took 2.22 sec i.e. 44977939.99 elm/sec = 2^25.4227 elm/sec 


def seconds_2_time(t):
    from math import floor

    t = float(t)
    days  = floor(t/(3600*24))
    t = t - days*24*3600

    hours = floor(t/3600)
    t = t - hours*3600
    minutes = floor(t/60)
    t = t - minutes*60

    return f"{days} days, {hours} hours, {minutes} mins, {floor(t)} sec"

print(f"server_memory={server_memory}")

server_memory=89000000000


In [2]:

def nqueries_sender(nsenders, hashes_sec_core, difficulty):
    """ Return how many queries senders can generate per second """

    return nsenders*hashes_sec_core/(2**difficulty)


def nqueries_receiver(nreceivers, dict_queries_sec):
    """
    Return how many queries receivers can make in a second
    """
    return nreceivers * dict_queries_sec

def phase_i_time(l, difficulty, hashes_sec_phase_i):
    """
    Return how many seconds it takes to complete phase_i
    """
    return 2**l * 2**difficulty / hashes_sec_phase_i


def largest_n(l,
              nsenders,
              nreceivers,
              dict_queries_sec,
              hashes_sec_core,
              difficulty,
              t_sec):

    """
    Given the attack parameters, return the largest n that can be attacked in t_sec
    """
    from math import log2

    nqueries_sec = min(nqueries_sender(nsenders, hashes_sec_core, difficulty),
                   nqueries_receiver(nreceivers, dict_queries_sec))

    return log2(nqueries_sec*t_sec) + l + difficulty






def find_best_parameters(nservers,
              server_memory,
              ncores_per_server,
              dict_queries_sec,
              hashes_sec_core,
              hashes_sec_phase_i,
              t_sec,
              phase_i_timeout=365*24*60*60):
    """
    Find the attack parameters that allow attacking the largest possible n in t_sec.
    Return a dictionary containing the attack parameters.
    phase_i_timeout defaults to 365 days, since phase i can be done offline
    phase_ii_reconstruct_timeout 
    """

    from math import log2
    from itertools import product

    memory = nservers * server_memory
    val_size_bytes = 4 # size of one dictionary entry in bytes
    filling_rate = 0.93 # fraction of the dictionary slots that are used
    l = log2(filling_rate * memory / val_size_bytes)

    ncores = nservers * ncores_per_server

    best_difficulty = 0
    best_n = 0 # optimize: find largest n
    best_nsenders = 0
    best_time_phase_i = float('inf')
    largest_difficulty = 40
    
    for nsenders, difficulty in product(range(1, ncores-nservers + 1), range(0, largest_difficulty)):
        nreceivers = ncores - nsenders
        n = largest_n(l,
                      nsenders,
                      nreceivers,
                      dict_queries_sec,
                      hashes_sec_core,
                      difficulty,
                      t_sec)

#         if (nreceivers == nservers):
#             print(f"n={n}, l={l}, nsenders={nsenders}, difficulty={difficulty}")

        # keep the candidate only if n improves and phase i fits within its timeout
        t_phase_i = phase_i_time(l, difficulty, hashes_sec_phase_i)
        
        if (n > best_n  and t_phase_i <= phase_i_timeout):
            best_n = n
            best_difficulty = difficulty
            best_nsenders = nsenders


    return {"n": best_n, "l": l,
            "difficulty": best_difficulty,
            "nsenders": best_nsenders,
            "nreceivers": ncores - best_nsenders}

In [3]:
%%time
find_best_parameters(nservers,
                     server_memory,
                     ncores_per_server,
                     dict_queries_sec,
                     hashes_sec_core,
                     hashes_sec_phase_i,
                     t_sec)


CPU times: user 378 ms, sys: 110 µs, total: 378 ms
Wall time: 378 ms


{'n': 89.31474092285998,
 'l': 39.590317001173325,
 'difficulty': 4,
 'nsenders': 1400,
 'nreceivers': 40}
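The reported result can be cross-checked by hand. The sketch below re-evaluates the formula for a few difficulties with the reported split of 1400 senders and 40 receivers and the reported $l$; the helper name `n_for_difficulty` is an assumption made for this illustration. Once the senders become the bottleneck, each extra bit of difficulty halves their rate but adds one bit to $n$, so $n$ stops growing.

```python
from math import log2

l = 39.590317001173325            # value of l reported above
t_sec = 3600 * 11                 # 11 hours
hashes_sec_core = 2**24
dict_queries_sec = 2**25.4227

def n_for_difficulty(difficulty, nsenders=1400, nreceivers=40):
    """Largest n for a fixed sender/receiver split at the given difficulty."""
    snd = nsenders * hashes_sec_core / 2**difficulty
    rcv = nreceivers * dict_queries_sec
    return log2(min(snd, rcv) * t_sec) + l + difficulty

for d in range(2, 7):
    print(f"difficulty={d}: n={n_for_difficulty(d):.4f}")
```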

In [4]:
seconds_2_time(phase_i_time(41.22258521667284, 22, hashes_sec_phase_i))

'4507329 days, 14 hours, 41 mins, 42 sec'

In [5]:
N(4507329/365)

12348.8465753425

In [6]:
N((365/4340)*24)

2.01843317972350