**11.3-2<br>Suppose that we hash a string of $r$ characters into $m$ slots by treating it as a radix-128 number and then using the division method. We can easily represent the number $m$ as a 32-bit computer word, but the string of $r$ characters, treated as a radix-128 number, takes many words. How can we apply the division method to compute the hash value of the character string without using more than a constant number of words of storage outside the string itself?**

Given a string, we can express it as a radix-128 integer $s$:

$s=a_r\bullet128^{r-1}+a_{r-1}\bullet128^{r-2}+\cdots+a_1\bullet128^0$
<br>$\; =(m\bullet q_r+r_r)+(m\bullet q_{r-1}+r_{r-1})+\cdots+(m\bullet q_1+r_1)$
<br>$\; =m\bullet (q_r+q_{r-1}+\cdots+q_1)+(r_r+r_{r-1}+\cdots+r_1)$
<br>$\; =m\bullet (q_r+q_{r-1}+\cdots+q_1)+(m\bullet q^*+r^*)$
<br>$\; =m\bullet (q_r+q_{r-1}+\cdots+q_1+q^*)+r^*$
<br>$\Rightarrow s\: mod \: m=r^*=[(a_r\bullet128^{r-1})\: mod \: m+(a_{r-1}\bullet128^{r-2})\: mod\: m+\cdots+(a_1\bullet128^0)\: mod \: m]\: mod \: m$

This can be generalised as the **distributive property of modulo operations**, expressed as: $(a+b)\:mod\:m=[(a\:mod\:m)+(b\:mod\:m)]\:mod\:m$. Thus, we can show that for a two-digit string $s_1=(a\bullet 128+b)$:

$$
\begin{align*}
s_1\:mod\:m=&[(a\bullet 128)\:mod\:m +b\:mod\:m]\:mod\:m\\
=&[\underbrace{(a+a+\cdots+a)}_{128\text{ terms}}\:mod\:m +b\:mod\:m]\:mod\:m\\
=&[(128\bullet (a\:mod\:m))\:mod\:m +b\:mod\:m]\:mod\:m\\
=&[128\bullet (a\:mod\:m) +b]\:mod\:m\\
\end{align*}
$$
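
As a quick numeric sanity check of the two-digit identity, here is a minimal sketch. The values of `a`, `b`, and `m` are arbitrary choices for illustration and are not taken from the exercise.

```python
# Arbitrary illustrative values: two character codes and a small modulus.
a, b, m = 97, 99, 7

direct = (a * 128 + b) % m            # reduce the full two-digit number
stepwise = (128 * (a % m) + b) % m    # reduce a first, then fold in b

print(direct, stepwise)               # both values are 6
```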

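In Python, the identity derived above lets us compute $s\:mod\:m$ iteratively: `start` holds the hash of the prefix read so far (the role of $a\:mod\:m$), and the code of the next character takes the role of $b$:
```
start=0
for i in range(len(s)):
    start=(start*128+ord(s[i]))%m #ord() turns the character s[i] into its integer code
```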
Function `h_new()` returns the same results as the division method defined in *11.3_Hash_functions.ipynb*, but its intermediate values stay far smaller: `to_radix` is reduced modulo $m$ at every step, so it never exceeds $128m$ (you can `print(to_radix)` inside the `for` loop to see this).



In [15]:
"""return an integer whose decimal digits are the binary digits of m, e.g. bit(7)==111"""
def bit(m):
    if m==0:
        return 0
    elif m==1:
        return 1
    else:
        n=1
        while 2**n<=m: #find the smallest n with 2**n>m, so 2**(n-1) is the leading bit
            n+=1
        return 10**(n-1)+bit(m-2**(n-1)) #emit the leading bit, then recurse on the rest
    
"""compute h(s) directly, functions in 11.3_Hash_function.ipynb"""
def radix_p(l,p=128): #default p=128
    """step 1: convert string to ASCII"""
    to_ascii=[ord(i) for i in l]
    
    """step 2: unite multiple ASCII integers to an integer with radix of base p"""
    to_radix=0
    for i in range(len(to_ascii)):
        
        to_radix=to_radix*p+to_ascii[i]
    return to_radix
def h_div(k,m):
    return k%m

"""compute h(s) by pseudocodes in 11.3-2"""
def h_new(l,m):
    to_ascii=[ord(i) for i in l]
    to_radix=0
    
    for i in range(len(to_ascii)):
        to_radix=(to_radix*128+to_ascii[i])%m
        
    return to_radix
    
s1='acdf'
m=bit(7) #bit(7) is the integer 111, so the modulus used below is 111

"""compute h(s) directly, functions in 11.3_Hash_function.ipynb"""
h_div(radix_p(s1),m)
"""compute h(s) by pseudocodes in 11.3-2"""
h_new(s1,m)


37
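
The storage claim can be made concrete with a sketch that consumes the string one character at a time and records the largest intermediate value it ever holds. The function name `hash_stream` and the `peak` bookkeeping are illustrative additions, not part of the notebook; ASCII input is assumed, and the modulus 111 is the value of `bit(7)` used in the cell above.

```python
def hash_stream(chars, m, radix=128):
    """Return (hash, peak), where peak is the largest value held before a reduction."""
    h = 0
    peak = 0
    for c in chars:                  # any iterable of characters works
        t = h * radix + ord(c)       # t < radix * m, since h < m and ord(c) < radix (ASCII assumed)
        peak = max(peak, t)
        h = t % m
    return h, peak

h, peak = hash_stream(iter('acdf'), 111)
print(h, peak < 128 * 111)           # prints: 37 True
```

Because the intermediate value stays below $128m$, a 32-bit $m$ needs fewer than 39 bits of working storage, which is a constant number of machine words regardless of the string length.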

**11.3-3<br>Consider a version of the division method in which $h(k)=k\: mod\: m$, where $m=2^p-1$ and $k$ is a character string interpreted in radix $2^p$. Show that if we can derive string $x$ from string $y$ by permuting its characters, then $x$ and $y$ hash to the same value.**

We can prove by **mathematical induction on the length of string $s$** that, when the hash function is $h(s)=s\:mod\:m$ with $m=2^p-1$, each string hashes to the sum of its digits modulo $m$.
* The proof is trivial when `len(s)==1`
* We assume that it is true for a string $s_1$ of length $l_1$
* Let $s=s_1s_2$, where $s_2$ is a single character string added to the right side of $s_1 \Rightarrow len(s)=len(s_1)+1$
$$
\begin{align*}
s&=s_1\bullet 2^p+s_2\bullet 2^0\\
&=s_1\bullet (2^p-1)+s_1+s_2\\
h(s)&=[s_1\bullet (2^p-1)+s_1+s_2]\: mod\:(2^p-1)\\
&=(s_1+s_2)\: mod\:(2^p-1)\\
\end{align*}
$$
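By the induction hypothesis, $s_1\:mod\:(2^p-1)$ equals the sum of the digits of $s_1$ modulo $2^p-1$, so $h(s)$ is the sum of all digits of $s$ modulo $2^p-1$. A sum does not depend on the order of its terms, so every permutation of a string hashes to the same value.

In Python, we can check this with [itertools.permutations](https://docs.python.org/2/library/itertools.html). Remember that in this case the radix is $2^p=128$, so $m$ must be equal to $128-1$.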

In [21]:
from itertools import permutations as ip
def allpermutations(l):
    return [''.join(i) for i in ip(l,len(l))]
for item in allpermutations(s1):
    print(item)
    #print(h_div(radix_p(item),128-1)) #same value via the direct computation
    print(h_new(item,m=128-1))


acdf
17
acfd
17
adcf
17
adfc
17
afcd
17
afdc
17
cadf
17
cafd
17
cdaf
17
cdfa
17
cfad
17
cfda
17
dacf
17
dafc
17
dcaf
17
dcfa
17
dfac
17
dfca
17
facd
17
fadc
17
fcad
17
fcda
17
fdac
17
fdca
17
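
The induction result can also be checked directly: with $m=2^p-1$, the hash should equal the digit sum reduced modulo $m$. Below is a minimal sketch, assuming ASCII input and $p=7$; the helper names `radix_hash` and `digit_sum_hash` are hypothetical and do not appear in the notebook.

```python
from itertools import permutations

def radix_hash(s, p=7):
    """Interpret s in radix 2**p and reduce modulo 2**p - 1, one character at a time."""
    m = 2**p - 1
    h = 0
    for c in s:
        h = (h * 2**p + ord(c)) % m
    return h

def digit_sum_hash(s, p=7):
    """Sum of the character codes, reduced modulo 2**p - 1."""
    return sum(ord(c) for c in s) % (2**p - 1)

# Collect the distinct hash values over all 24 orderings of the string.
values = {radix_hash(''.join(q)) for q in permutations('acdf')}
print(values, digit_sum_hash('acdf'))   # prints: {17} 17
```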


**11.3-4<br>Consider a hash table of size $m=1000$ and a corresponding hash function $h(k)=\lfloor m(kA\:mod\: 1)\rfloor$ for $A=(\sqrt{5}-1)/2$. Compute the locations to which the keys $61,62,63,64,$ and $65$ are mapped.**
* Use the function `h_mul()` defined in *11.3_Hash_functions.ipynb*

In [22]:
"""The multiplication method:"""
import numpy as np
import math
A_knuth=(np.sqrt(5)-1)/2 #A from Knuth
def h_mul(k,m,A=A_knuth): #default as A= A from Knuth
    return int((k*A-int(k*A))*m) #int(x) is equal to the floor of x if x>=0
h_mul(61,m=1000) 

700
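
The cell above evaluates only the first key. Here is a minimal sketch that covers all five keys from the exercise using only the standard library; `h_mul_std` is a hypothetical name chosen to avoid clashing with the notebook's `h_mul`.

```python
import math

A = (math.sqrt(5) - 1) / 2          # the constant A given in the exercise

def h_mul_std(k, m, a=A):
    """Multiplication method: floor(m * (k*a mod 1)) for k >= 0."""
    frac = (k * a) % 1.0            # fractional part of k*a
    return math.floor(m * frac)

for k in range(61, 66):
    print(k, h_mul_std(k, 1000))
```

With this sketch, the keys 61, 62, 63, 64, and 65 map to slots 700, 318, 936, 554, and 172 respectively.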