# 1
Show that the single-source shortest-path problem for nonnegative edges can be solved in time $O(e \log n)$ for a graph with $e$ edges and $n$ vertices.  

---

Given a directed graph $G=(V,E)$ with $|V|=n$, $|E|=e$, and non-negative edge
lengths, together with a source vertex $s\in V$, we show that all
single-source shortest-path distances can be computed in
$O(e\log n)$ time.

**Algorithm**  
Run Dijkstra’s algorithm with a binary min-heap $H$ keyed by the current
tentative distance $\operatorname{dist}(v)$ of every vertex $v$:

* **Insert** $(v,\operatorname{dist}(v))$ — performed once for each
  $v\in V$ when the vertex is first discovered (there are $n$
  inserts);
* **Extract-Min** — performed once for every vertex, because after a vertex is
  removed its distance is final (there are $n$ extracts);
* **Decrease-Key** — performed when an edge $(u,v)$ is relaxed and yields a
  shorter path to $v$.  Each edge is relaxed exactly once, namely when its tail
  $u$ is extracted, so at most $e$ such calls occur.

With a binary heap all three operations cost $O(\log n)$ time.
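
The operations above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the adjacency-list dictionary `graph` and the names `dijkstra` and `example` are assumptions made for the example. Python's `heapq` offers no Decrease-Key, so the sketch pushes a fresh entry instead and skips stale entries on extraction; the heap then holds at most $e+1$ entries, and since $e\le n^2$ each operation still costs $O(\log n)$.

```python
import heapq

def dijkstra(graph, source):
    """Return shortest-path distances from source.

    graph: dict mapping each vertex to a list of (neighbour, weight) pairs
           with non-negative weights (hypothetical input format).
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)          # Extract-Min
        if u in settled:
            continue                        # stale entry, already finalised
        settled.add(u)
        for v, w in graph[u]:               # relax every edge leaving u once
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))  # stands in for Decrease-Key
    return dist

example = {
    "s": [("a", 4), ("b", 1)],
    "a": [("c", 1)],
    "b": [("a", 2), ("c", 5)],
    "c": [],
}
print(dijkstra(example, "s"))  # {'s': 0, 'a': 3, 'b': 1, 'c': 4}
```

In the example the direct edge $s\to a$ of length 4 is beaten by the detour through $b$ of length 3, which is the situation in which a Decrease-Key (here, a second push) occurs.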

**Correctness**  
Maintain the loop invariant

> After the $k$-th extraction every removed vertex $u$ satisfies
> $\operatorname{dist}(u)=\delta(s,u)$, its true shortest-path length, and
> every vertex still in the heap has a tentative key that is at least its
> final distance.

The invariant holds before the first iteration because
$\operatorname{dist}(s)=0$ and $\operatorname{dist}(v)=\infty$ for
$v\ne s$.  
Assume it is true prior to extracting the
$(k+1)$-st vertex $x$ (the case $x=s$ is immediate).  Consider any
$s$–$x$ path and let $y$ be its first vertex that is still in the heap,
possibly $y=x$.  The predecessor of $y$ on the path was removed earlier, so
the edge into $y$ has been relaxed and $\operatorname{dist}(y)$ is at most the
length of the prefix ending at $y$.  Because $x$ has the minimum key, that
prefix has length $\ge\operatorname{dist}(y)\ge\operatorname{dist}(x)$, and
non-negative weights imply that the remainder of the path cannot shorten it.
Consequently
$\operatorname{dist}(x)=\delta(s,x)$, and relaxing every outgoing edge may
reduce the keys of its neighbours but never violates the lower-bound
property of the heap keys.  Thus the invariant re-establishes itself.
By induction it is true after $n$ extractions, so all distances are
optimal.

**Running time**  
Let $T(n,e)$ be the total number of heap operations.

$$
T(n,e)
\le n\text{ inserts}+n\text{ extracts}+e\text{ decrease-keys}
=2n+e.
$$

Each costs $O(\log n)$, hence the time is
$O((2n+e)\log n)$.  The vertices reachable from $s$ are joined to it by at
least $|V_\text{reach}|-1$ edges, so ignoring unreachable vertices
gives $n\le e+1$.  Therefore, for $e\ge 1$,

$$
(2n+e)\log n\le(2(e+1)+e)\log n=(3e+2)\log n\le 5e\log n,
$$

which is $O(e\log n)$.

Because the algorithm is correct and performs at most $3e+2$ heap
operations of cost $O(\log n)$ each, the single-source shortest-path problem
with non-negative edge weights is solvable in $O(e\log n)$ time, as claimed.

---

# 2
Show that the single-source shortest-path problem for nonnegative edges can be solved in time $O(ke + kn^{1+1/k})$ for any fixed constant $k$, on a graph with $e$ edges and $n$ vertices.

---

Take Dijkstra’s algorithm and keep its priority queue as a **$d$-ary heap** instead of the usual binary one.  Let  

$$d=\lceil n^{1/k}\rceil,$$  

where $k\ge 1$ is the fixed constant from the statement.  The height of such a heap is  

$$h=\lceil\log_d n\rceil=\Bigl\lceil\frac{\log n}{\log d}\Bigr\rceil
     =\Theta\!\bigl(\tfrac{\log n}{\tfrac1k\log n}\bigr)=\Theta(k).$$  

*Insert* and *Decrease-Key* bubble an element upward, touching at most $h$ levels, so each of them costs $O(k)$ comparisons.  

*Extract-Min* removes the root, brings one leaf up, and then repeatedly pushes it down.  At every level the algorithm must look at *all* $d$ children to find the smallest, hence the work per level is $O(d)$; over $h$ levels the total becomes  

$$O(dh)=O\!\bigl(n^{1/k}\cdot k\bigr)=O\!\bigl(k n^{1/k}\bigr).$$  

Dijkstra executes exactly $n$ inserts, $n$ extracts, and at most $e$ decrease-keys (each edge is relaxed once, when its tail is settled).  Therefore the whole run performs  

$$n\cdot O(k)\;+\;n\cdot O\!\bigl(k n^{1/k}\bigr)\;+\;e\cdot O(k)
   \;=\;O\!\bigl(k n^{1+1/k}+k e\bigr)$$  

comparisons.  This is precisely the asserted time bound  

$$O\!\bigl(ke+kn^{1+1/k}\bigr).$$  

The correctness of Dijkstra’s algorithm is unaffected by the shape of the heap, so the single-source shortest-path problem with non-negative edge lengths is solvable within that complexity.
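
A $d$-ary heap with Decrease-Key can be sketched as below. The class `DaryHeap`, the function `dijkstra_dary`, the position map `pos`, and the adjacency-list input format are all assumptions made for this illustration. The branching factor is computed with floating-point arithmetic, so it may be off by one from $\lceil n^{1/k}\rceil$; that affects only constants, not correctness.

```python
import math

class DaryHeap:
    """Min-heap with branching factor d and Decrease-Key support."""

    def __init__(self, d):
        self.d = d
        self.items = []   # list of [key, vertex]
        self.pos = {}     # vertex -> index in self.items

    def _swap(self, i, j):
        self.items[i], self.items[j] = self.items[j], self.items[i]
        self.pos[self.items[i][1]] = i
        self.pos[self.items[j][1]] = j

    def _sift_up(self, i):
        # One comparison per level: O(log_d n) in total.
        while i > 0:
            parent = (i - 1) // self.d
            if self.items[i][0] >= self.items[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        # Up to d comparisons per level: O(d log_d n) in total.
        n = len(self.items)
        while self.d * i + 1 < n:
            first = self.d * i + 1
            last = min(first + self.d, n)
            smallest = min(range(first, last), key=lambda c: self.items[c][0])
            if self.items[smallest][0] >= self.items[i][0]:
                break
            self._swap(i, smallest)
            i = smallest

    def insert(self, vertex, key):
        self.items.append([key, vertex])
        self.pos[vertex] = len(self.items) - 1
        self._sift_up(len(self.items) - 1)

    def decrease_key(self, vertex, key):
        i = self.pos[vertex]
        self.items[i][0] = key
        self._sift_up(i)

    def extract_min(self):
        key, vertex = self.items[0]
        last = self.items.pop()
        del self.pos[vertex]
        if self.items:                 # move the last leaf to the root
            self.items[0] = last
            self.pos[last[1]] = 0
            self._sift_down(0)
        return vertex, key

def dijkstra_dary(graph, source, k):
    """Dijkstra with a d-ary heap, d about n^(1/k); returns reachable distances."""
    d = max(2, math.ceil(len(graph) ** (1.0 / k)))
    heap, dist, done = DaryHeap(d), {source: 0}, set()
    heap.insert(source, 0)
    while heap.items:
        u, du = heap.extract_min()
        done.add(u)
        for v, w in graph[u]:
            if v in done:
                continue
            nd = du + w
            if v not in dist:
                dist[v] = nd
                heap.insert(v, nd)
            elif nd < dist[v]:
                dist[v] = nd
                heap.decrease_key(v, nd)
    return dist

demo_graph = {"s": [("a", 4), ("b", 1)], "a": [("c", 1)],
              "b": [("a", 2), ("c", 5)], "c": []}
print(dijkstra_dary(demo_graph, "s", 2) == {"s": 0, "a": 3, "b": 1, "c": 4})  # True
```

The asymmetry in the analysis is visible in the code: `_sift_up` does one comparison per level, whereas `_sift_down` scans all children of a node on every level.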

---

# 3
Write an algorithm which, given an $n \times n$ matrix $M$ of positive integers, will find a sequence of adjacent entries starting from $M[n,1]$ and ending at $M[1,n]$ such that the sum of the absolute values of differences between adjacent entries is minimized.  
Two entries $M[i,j]$ and $M[k,l]$ are adjacent if (a) $i = k \pm 1$ and $j = l$, or (b) $i = k$ and $j = l \pm 1$.

---

Each matrix entry $M[i,j]$ is treated as a vertex $v_{ij}$ in an $n\times n$ grid.  
There is an edge between $v_{ij}$ and $v_{i'j'}$ when $|i-i'|+|j-j'|=1$.
The weight of that edge is the absolute difference
$$
w\bigl(v_{ij},v_{i'j'}\bigr)=|\,M[i,j]-M[i',j']\,|.
$$

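We want the minimum-cost path that starts at the bottom-left corner
$v_{n-1,\,0}$ and ends at the top-right corner $v_{0,\,n-1}$ (with the
zero-based indices used here, these are the entries $M[n,1]$ and $M[1,n]$ of
the problem statement), where the
cost of a path is the sum of the weights of its consecutive edges.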
Because all edge weights are non-negative, Dijkstra’s algorithm yields an
optimal solution.

The algorithm maintains three structures:

* A distance table $\text{dist}[i][j]$ initialised to $\infty$ except
  $\text{dist}[n-1][0]=0$.
* A predecessor table $\text{prev}[i][j]$ for reconstructing the path.
* A binary min-heap whose keys are the current tentative distances,
  initially holding only the source with key $0$.

While the heap is non-empty:

1.  Extract the vertex $v_{ij}$ with the minimum key $d$.
    Skip the vertex if $d>\text{dist}[i][j]$ (the record is stale).
2.  For each of the at most four grid neighbours
    $v_{i+\Delta,\,j+\Delta'}$ inside the matrix,
    compute the tentative distance
    $$
      \text{nd}=d+|\,M[i,j]-M[i+\Delta,\,j+\Delta']\,|.
    $$
    If $\text{nd}<\text{dist}[i+\Delta][j+\Delta']$,
    update that entry, record the predecessor, and push the neighbour
    with key $\text{nd}$ into the heap.
3.  Terminate early once $v_{0,\,n-1}$ is extracted; at that moment
    $\text{dist}[0][n-1]$ already equals the optimum.

The grid has $n^{2}$ vertices and $2n(n-1)=\Theta(n^{2})$ undirected edges, each of which may be relaxed once in either direction.  
Dijkstra therefore performs $O(n^{2})$ heap extractions and
$O(n^{2})$ heap insertions (a push takes the place of Decrease-Key, so stale entries may remain in the heap).  
With a binary heap each operation takes $O(\log n)$ time because the heap
never stores more than $O(n^{2})$ entries and $\log n^{2}=2\log n$.  
Hence the running time is $O(n^{2}\log n)$ and the memory usage is
$\Theta(n^{2})$.

The algorithm returns the optimal cost
$\text{dist}[0][n-1]$ together with the sequence of coordinates that forms a
minimum-difference path from $(n-1,0)$ to $(0,n-1)$.

---


```python
import heapq

def min_adj_diff_path(M):
    """Return (cost, path) of a minimum-difference path from the bottom-left
    entry to the top-right entry of the square matrix M (zero-based indices)."""
    n = len(M)
    src = (n - 1, 0)
    dst = (0, n - 1)
    dist = [[float('inf')] * n for _ in range(n)]
    prev = [[None] * n for _ in range(n)]
    dist[src[0]][src[1]] = 0
    pq = [(0, src[0], src[1])]
    while pq:
        d, i, j = heapq.heappop(pq)
        if (i, j) == dst:
            break  # the first extraction of dst carries its final distance
        if d != dist[i][j]:
            continue  # stale heap entry, superseded by a shorter path
        # Relax the edges to the (at most four) grid neighbours.
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < n:
                w = abs(M[i][j] - M[ni][nj])
                nd = d + w
                if nd < dist[ni][nj]:
                    dist[ni][nj] = nd
                    prev[ni][nj] = (i, j)
                    heapq.heappush(pq, (nd, ni, nj))
    # Walk the predecessor links back from dst; prev of src is None.
    path = []
    cur = dst
    while cur:
        path.append(cur)
        cur = prev[cur[0]][cur[1]]
    path.reverse()
    return dist[dst[0]][dst[1]], path

M = [
        [5, 9, 2],
        [6, 3, 8],
        [7, 1, 4]
]
cost, path = min_adj_diff_path(M)
print("Minimum total difference:", cost)
print("Path (row, col) indices:")
for i, j in path:
    print(f"  -> ({i}, {j}) value={M[i][j]}")
```

```
Minimum total difference: 13
Path (row, col) indices:
  -> (2, 0) value=7
  -> (1, 0) value=6
  -> (0, 0) value=5
  -> (0, 1) value=9
  -> (0, 2) value=2
```


# 4
Show that the positive and negative reals with $+\infty$ and $-\infty$ is not a closed semiring.

---

Let  

$$S=\Bbb R\cup\{-\infty ,+\infty \}$$  

be the set of extended real numbers equipped with the usual addition
$+$, the usual multiplication $\cdot$, the additive identity $0$ and the
multiplicative identity $1$.  The infinite elements follow the ordinary
conventions: $a+(\pm\infty)=\pm\infty$ for every finite $a$, and
$a\cdot(\pm\infty)$ equals $\pm\infty$ for $a>0$ and $\mp\infty$ for $a<0$.
Ordinary arithmetic leaves $(+\infty)+(-\infty)$ and $0\cdot(\pm\infty)$
undefined.

A *closed* semiring is a semiring together with a unary
operation $^{*}:S\to S$ that satisfies, for every $a\in S$,

$$a^{*}=1+a\,a^{*}=1+a^{*}\,a\tag{Kleene identities}$$

and renders the usual “star” axioms valid for matrices as well.
In particular the underlying structure must be a semiring: addition is a
commutative monoid, multiplication is an associative monoid, multiplication
distributes over addition, and $0$ annihilates every element.
I prove that **no** choice of values for the undefined expressions makes
$\langle S,+,\cdot,0,1\rangle$ a semiring; consequently $(S,+,\cdot)$ is
*not* a closed semiring.

The Kleene identities alone do not produce a contradiction: setting
$(+\infty)^{*}=+\infty$ gives $1+(+\infty)(+\infty)=+\infty$, so both
identities hold at $a=+\infty$.  The obstruction lies in the semiring axioms
themselves.

Assume for contradiction that $S$ is a semiring, and let
$$c=(+\infty)+(-\infty)\in S.$$

* Associativity of addition, together with $1+(+\infty)=+\infty$, gives
  $$c=\bigl(1+(+\infty)\bigr)+(-\infty)=1+\bigl((+\infty)+(-\infty)\bigr)=1+c.$$

* Annihilation by $0$ and distributivity, together with
  $(+\infty)\cdot 1=+\infty$ and $(+\infty)\cdot(-1)=-\infty$, give
  $$0=(+\infty)\cdot 0=(+\infty)\cdot\bigl(1+(-1)\bigr)
     =(+\infty)\cdot 1+(+\infty)\cdot(-1)=(+\infty)+(-\infty)=c.$$

Combining the two results yields $0=c=1+c=1+0=1$, which is false in
$\Bbb R$.  Hence no value of $(+\infty)+(-\infty)$ is compatible with the
axioms.  Therefore the extended real numbers with
ordinary addition and multiplication do not form a semiring, admit no star
operation in the required sense, and are **not** a closed semiring.
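
As an illustration only (it is not part of the proof), IEEE-754 floating-point arithmetic as exposed by Python floats follows the same conventions for the infinite elements and reports the expressions that ordinary arithmetic leaves undefined as NaN:

```python
import math

inf = float("inf")

# Combinations that the usual conventions define.
print(1 + inf)       # inf
print(inf * -1)      # -inf

# Combinations that ordinary arithmetic leaves undefined come back as NaN.
print(inf + (-inf))  # nan
print(inf * 0)       # nan

# Distributivity cannot be checked at these points: both sides are NaN.
lhs = inf * (1 + (-1))
rhs = inf * 1 + inf * (-1)
print(math.isnan(lhs), math.isnan(rhs))  # True True
```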


# 5
Consider the field $F_2$ of integers modulo $2$.  
Find a multiplication algorithm for $n \times n$ matrices over $F_2$ with an asymptotic bound of $n^{2.5}/(\log n)^{0.5}$.  
[Hint: Partition the matrices into blocks of size $\sqrt{\log n} \times \sqrt{\log n}$.]

---

```python
import math
import numpy as np

def add(A, B):
    """⊕ over 𝔽₂ (XOR)"""
    return A ^ B

def sub(A, B):
    """same as add in 𝔽₂"""
    return A ^ B

def f2_strassen(A, B):
    n0 = A.shape[0]  # size before any padding, needed to crop the result
    n = n0

    # Cut-off size, recomputed from the size of the current blocks.
    b = max(1, int(math.ceil(math.sqrt(math.log2(max(2, n))))))

    if n <= b:
        # Classical product; the mask reduces every entry modulo 2.
        return (A @ B) & 1

    if n % 2:
        # Pad odd sizes with one zero row and one zero column.
        A = np.pad(A, ((0, 1), (0, 1)), constant_values=0)
        B = np.pad(B, ((0, 1), (0, 1)), constant_values=0)
        n += 1

    m = n // 2
    A11, A12 = A[:m, :m], A[:m, m:]
    A21, A22 = A[m:, :m], A[m:, m:]
    B11, B12 = B[:m, :m], B[:m, m:]
    B21, B22 = B[m:, :m], B[m:, m:]

    # The seven Strassen products.
    M1 = f2_strassen(add(A11, A22), add(B11, B22))
    M2 = f2_strassen(add(A21, A22), B11)
    M3 = f2_strassen(A11, sub(B12, B22))
    M4 = f2_strassen(A22, sub(B21, B11))
    M5 = f2_strassen(add(A11, A12), B22)
    M6 = f2_strassen(sub(A21, A11), add(B11, B12))
    M7 = f2_strassen(sub(A12, A22), add(B21, B22))

    # Recombination into the four blocks of the product.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)

    top = np.concatenate((C11, C12), axis=1)
    bottom = np.concatenate((C21, C22), axis=1)
    C = np.concatenate((top, bottom), axis=0)

    # Crop the padding row and column, if any were added.
    return C[:n0, :n0]

def multiply_f2(A, B):
    # f2_strassen assumes two square matrices of the same order.
    if A.shape[0] != A.shape[1] or A.shape != B.shape:
        raise ValueError("shape mismatch")

    A = A.copy().astype(np.uint8) & 1
    B = B.copy().astype(np.uint8) & 1
    return f2_strassen(A, B) & 1

n = 32
np.random.seed(0)
A = np.random.randint(0, 2, size=(n, n), dtype=np.uint8)
B = np.random.randint(0, 2, size=(n, n), dtype=np.uint8)
C = multiply_f2(A, B)
ref = (A @ B) & 1
print(C)
print("\n")
print(ref)
assert np.array_equal(C, ref)
```

```
[[0 0 1 ... 1 0 1]
 [1 1 0 ... 1 0 0]
 [1 1 1 ... 1 0 1]
 ...
 [0 1 1 ... 0 0 1]
 [0 1 1 ... 0 1 1]
 [1 1 1 ... 1 0 0]]


[[0 0 1 ... 1 0 1]
 [1 1 0 ... 1 0 0]
 [1 1 1 ... 1 0 1]
 ...
 [0 1 1 ... 0 0 1]
 [0 1 1 ... 0 1 1]
 [1 1 1 ... 1 0 0]]
```


The procedure `multiply_f2` computes the product of two $n\times n$ matrices over the field $\mathbb F_2$ with Strassen’s seven-multiplication recursion and a size threshold $b$ for the base case.  Every entry is stored as one byte whose value is either $0$ or $1$, so the ring operations reduce to XOR for addition and the ordinary integer product followed by a bit mask for multiplication.

For two square blocks $A,B$ the recursive routine `f2_strassen` first computes the cut-off size  

$$b=\left\lceil\sqrt{\log_2 n}\,\right\rceil,$$  

which is at least $1$, from the size $n$ of its current arguments.  When $n\le b$ the algorithm falls back to the classical cubic formula $C=A\cdot B$ performed by one call to NumPy’s dense matrix multiply followed by a logical AND with $1$.  Otherwise $n$ is padded to the next even number, the seven Strassen products are evaluated recursively, and the four resulting blocks are stitched back together.  The auxiliary routines `add` and `sub` both implement $A\oplus B$, since subtraction equals addition in $\mathbb F_2$.

Correctness follows inductively.  The base case returns the exact product modulo two.  At the induction step the seven recursive calls compute the Strassen intermediate matrices $M_1,\dots ,M_7$ over $\mathbb F_2$; the recombination formulas therefore yield precisely the four blocks $C_{11},C_{12},C_{21},C_{22}$ of the product $C$.

Let $\omega=\log_2 7\approx 2.807$.  Because $b$ is recomputed at every level, the test $n\le\lceil\sqrt{\log_2 n}\rceil$ succeeds only for $n=1$, so the recursion is plain Strassen:  

$$T(n)=7\,T\!\left(\frac n2\right)+\Theta(n^{2}),\qquad T(1)=\Theta(1),\qquad T(n)=\Theta(n^{\omega}).$$  

Fixing the cut-off at $b=\lceil\sqrt{\log_2 n}\rceil$ for the whole computation would not improve this.  After $\ell=\lceil\log_2(n/b)\rceil$ expansions the recursion stops, giving  

$$T(n)=\Theta\!\left(7^{\ell}\,b^{3}\right) +
       \Theta\!\left(\sum_{i=0}^{\ell-1}7^{i}\left(\frac n{2^{\,i}}\right)^{2}\right)
       =\Theta\!\left(n^{\omega}\,b^{\,3-\omega}\right),$$  

because each of the $(n/b)^{\omega}$ base cases still costs $b^{3}$.  Substituting $b=\lceil\sqrt{\log_2 n}\rceil$ yields  

$$T(n)=\Theta\!\left(n^{\omega}(\log n)^{(3-\omega)/2}\right),$$  

which is slightly worse than $n^{\omega}$.  A saving in logarithmic factors requires base-case products that cost less than cubic time, and that is what the hint’s blocking provides: a $\sqrt{\log n}\times\sqrt{\log n}$ block holds at most $\log n$ bits and can therefore index a precomputed table of block products.  The code above does not implement that lookup, so it does not reach the target bound.

The top-level wrapper checks shape compatibility, converts every input entry to $\{0,1\}$, invokes the recursive routine, and masks the result to ensure every byte is again either $0$ or $1$.  A short self-test multiplies two random Boolean matrices of order $n=32$ and confirms that the answer agrees with a direct $(A\;@\;B)\,\&\,1$ computation.
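
The blocking suggested by the hint can be sketched as follows. The helper names `encode`, `decode`, `block_product` and `multiply_f2_blocks` are hypothetical, and the sketch uses plain Python lists rather than NumPy. Each $s\times s$ block is packed into an integer of $s^2$ bits, so with $s=\lfloor\sqrt{\log_2 n}\rfloor$ there are at most $n$ distinct blocks and at most $n^2$ pairs in the product table. For brevity the blocks are combined with the classical triple loop, which uses $(n/s)^3$ table lookups; the sketch illustrates the lookup idea only and makes no claim about reaching the target bound.

```python
import random

def encode(block):
    """Pack a square 0/1 block (list of rows) into one integer, row-major."""
    code = 0
    for row in block:
        for bit in row:
            code = (code << 1) | bit
    return code

def decode(code, s):
    """Inverse of encode for an s x s block."""
    bits = [(code >> (s * s - 1 - t)) & 1 for t in range(s * s)]
    return [bits[r * s:(r + 1) * s] for r in range(s)]

def block_product(x, y, s):
    """Product over F_2 of two encoded s x s blocks, computed entry by entry."""
    X, Y = decode(x, s), decode(y, s)
    Z = [[sum(X[i][k] & Y[k][j] for k in range(s)) & 1 for j in range(s)]
         for i in range(s)]
    return encode(Z)

def multiply_f2_blocks(A, B, s):
    """Multiply n x n 0/1 matrices (lists of rows) using s x s block lookups.

    Assumes s divides n. The table is filled lazily, so only block pairs
    that actually occur are multiplied out.
    """
    n = len(A)
    q = n // s

    def cut(M, i, j):
        return encode([row[j * s:(j + 1) * s] for row in M[i * s:(i + 1) * s]])

    a = [[cut(A, i, j) for j in range(q)] for i in range(q)]
    b = [[cut(B, i, j) for j in range(q)] for i in range(q)]
    table = {}
    C = [[0] * n for _ in range(n)]
    for i in range(q):
        for j in range(q):
            acc = 0
            for k in range(q):
                key = (a[i][k], b[k][j])
                if key not in table:
                    table[key] = block_product(key[0], key[1], s)
                acc ^= table[key]      # block addition is XOR of the codes
            for r, row in enumerate(decode(acc, s)):
                C[i * s + r][j * s:(j + 1) * s] = row
    return C

random.seed(1)
n, s = 8, 2
A = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]
B = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]
naive = [[sum(A[i][k] & B[k][j] for k in range(n)) & 1 for j in range(n)]
         for i in range(n)]
print(multiply_f2_blocks(A, B, s) == naive)  # True
```

Replacing the triple loop over blocks by Strassen's recursion on the $q\times q$ matrix of block codes is the step that would bring the exponent below $3$; it is omitted here to keep the example short.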

---

# 6
Give a 'physical' example of wrapped convolution in terms of polynomial operations.

---

Imagine two sequences of real numbers
$$a=(a_0,a_1,\dots ,a_{n-1}),\qquad
  b=(b_0,b_1,\dots ,b_{n-1}).$$
Interpret each sequence as the coefficients of a polynomial
$$A(x)=a_0+a_1x+\dots +a_{n-1}x^{\,n-1},\qquad
  B(x)=b_0+b_1x+\dots +b_{n-1}x^{\,n-1}.$$

Multiply them in the ordinary sense to get
$$C(x)=A(x)B(x)=\sum_{k=0}^{2n-2}c_kx^k,$$
where $c_k=\sum_{i+j=k}a_ib_j$.  
Now picture the coefficients of $C(x)$ printed on a long strip of paper, one monomial per cell, and wrap that strip once around a cylinder whose circumference equals $n$ “monomials.”  Every monomial $x^{k+n}$ now sits exactly above $x^k$.  If you look from above you see **one vertical column for each residue class modulo $n$**, and within that column all coefficients line up.

Because you can physically add the numbers that fall into the same column, the wrapped result is the polynomial of degree at most $n-1$ obtained by *folding* higher powers back:

$$C_{\text{wrap}}(x)=C(x)\bmod (x^n-1)
  = \sum_{k=0}^{n-1}\Bigl(\;c_k+\!\!\sum_{m\ge1}c_{k+mn}\Bigr)x^k.$$

The coefficient of $x^k$ in this reduced polynomial is

$$
\sum_{\substack{i+j\equiv k\pmod n}} a_i\,b_j,
$$

which is exactly the *circular* (wrapped) convolution
$$(a*b)_k=\sum_{t=0}^{n-1}a_t\,b_{\,k-t\;(\mathrm{mod}\;n)}.$$

So the “physical” picture is: multiply the two coefficient strips, wrap the product once around so that powers differing by multiples of $n$ coincide, and then add the numbers sitting in each vertical column.  The outcome of that column-wise addition is the wrapped convolution, and algebraically it is nothing but polynomial multiplication followed by reduction modulo $x^{n}-1$.
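
A small numerical check of this picture, with hypothetical helper names `poly_mul`, `wrap` and `circular_convolution` and coefficient lists stored lowest degree first:

```python
def poly_mul(a, b):
    """Ordinary product of two coefficient lists (lowest degree first)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def wrap(c, n):
    """Reduce modulo x^n - 1: add every coefficient into column k mod n."""
    folded = [0] * n
    for k, ck in enumerate(c):
        folded[k % n] += ck
    return folded

def circular_convolution(a, b):
    """Wrapped convolution computed directly from its definition."""
    n = len(a)
    return [sum(a[t] * b[(k - t) % n] for t in range(n)) for k in range(n)]

a = [1, 2, 3]
b = [4, 5, 6]
print(poly_mul(a, b))              # [4, 13, 28, 27, 18]
print(wrap(poly_mul(a, b), 3))     # [31, 31, 28]
print(circular_convolution(a, b))  # [31, 31, 28]
```

The coefficients $27$ and $18$ of $x^3$ and $x^4$ land on top of the columns for $x^0$ and $x^1$, giving $4+27$ and $13+18$.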

---

# 7
Show that the finite Fourier transform over a finite field, for $n$ a prime, can be calculated in $O_A(n \log n)$ steps.

---

Let $n$ be an odd prime.  Choose a finite field $\mathbb F_q$ with $q\equiv1\pmod n$ so that the multiplicative group $\mathbb F_q^{\times}$ contains an element $\omega$ of order $n$.  For $a=(a_0,\dots ,a_{n-1})\in\mathbb F_q^{\,n}$ the length-$n$ Fourier transform is the vector $\hat a$ with

$$\hat a_k=\sum_{j=0}^{n-1}a_j\,\omega^{kj}\quad(0\le k<n).$$

Because $n$ is prime the set $\{1,2,\dots ,n-1\}$ is a cyclic group under multiplication modulo $n$; let $g$ be a generator.  Every non-zero index can be written uniquely as $g^{\ell}\bmod n$ with $0\le\ell<n-1$.  Write the input index as $j=g^{\ell}$ and the output index as $k=g^{-r}$, so that $kj\equiv g^{\ell-r}\pmod n$, and rewrite each non-trivial term of the transform:

$$
\hat a_{g^{-r}}=a_0+\sum_{\ell=0}^{n-2}a_{g^{\ell}}\,
         \omega^{g^{\ell-r}}
\;=\;a_0+\sum_{\ell=0}^{n-2}
      A_\ell\,B_{r-\ell},
$$

where $A_\ell=a_{g^{\ell}}$ and $B_\ell=\omega^{g^{-\ell}}$.  Indices of $B$ are taken modulo $n-1$.  Thus the $n-1$ values $\hat a_k$ for $k\ge1$ are obtained by one **cyclic convolution**

$$(A*B)_r=\sum_{\ell=0}^{n-2}A_\ell\,B_{r-\ell\;(\mathrm{mod}\;n-1)}.$$

Cyclic convolution can be carried out through the convolution theorem once the field contains a suitable root of unity.  Let $m=n-1$.  The condition $q\equiv1\pmod n$ guarantees a root of order $n$ but not one of order $m$, so compute the cyclic convolution from a linear one: let $M$ be the power of $2$ with $2m-1\le M<4m$ and assume that $\mathbb F_q$ contains a primitive $M$-th root of unity $\zeta$, that is, $M\mid q-1$.  Pad each vector with zeros to length $M$ and form the polynomials

$$A(x)=\sum_{\ell=0}^{m-1}A_\ell x^\ell,\qquad
  B(x)=\sum_{\ell=0}^{m-1}B_\ell x^\ell.$$

Evaluate $A,B$ at the powers $\zeta^0,\zeta^1,\dots ,\zeta^{M-1}$ via the radix-2 Cooley–Tukey algorithm, multiply the values point-wise, and interpolate with one inverse transform.  Because $M\ge 2m-1$ the result is the ordinary product $A(x)B(x)$; adding the coefficient of $x^{i+m}$ to that of $x^{i}$ reduces it modulo $x^{m}-1$ and yields $A*B$.  Each FFT of length $M$ uses $\tfrac12 M\log_2 M$ field multiplications and $M\log_2 M$ additions, so two forward FFTs and one inverse FFT cost

$$O(M\log M)=O(n\log n).$$

Point-wise multiplication of the evaluation vectors contributes another $M=O(n)$ operations and the folding at most $m$ additions, so both are absorbed by the same bound.  Adding $a_0$ to each of the $n-1$ convolution values consumes $n-1$ additions, and $\hat a_0=\sum_{j=0}^{n-1}a_j$ needs $n-1$ additions as well.
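
The FFT-based convolution step can be sketched as below, under the assumption that the field contains a primitive $M$-th root of unity for a power of two $M\ge 2m-1$ (a condition on $q$ beyond $q\equiv1\pmod n$). The function names and the toy parameters $q=17$, root $9$ of order $8$, and $m=4$ are chosen for illustration only.

```python
def ntt(vec, root, q):
    """Radix-2 Cooley-Tukey transform over F_q.

    len(vec) must be a power of 2 and root a primitive len(vec)-th root of unity.
    """
    M = len(vec)
    if M == 1:
        return list(vec)
    even = ntt(vec[0::2], root * root % q, q)
    odd = ntt(vec[1::2], root * root % q, q)
    out = [0] * M
    w = 1
    for k in range(M // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q
        out[k + M // 2] = (even[k] - t) % q   # uses root^(M/2) = -1
        w = w * root % q
    return out

def cyclic_convolution_ntt(A, B, root, q):
    """Length-m cyclic convolution via transforms of length M >= 2m - 1."""
    m = len(A)
    M = 1
    while M < 2 * m - 1:
        M *= 2
    fa = ntt(A + [0] * (M - m), root, q)
    fb = ntt(B + [0] * (M - m), root, q)
    prod = [x * y % q for x, y in zip(fa, fb)]
    inv_root = pow(root, q - 2, q)            # inverses via Fermat's little theorem
    inv_M = pow(M, q - 2, q)
    linear = [c * inv_M % q for c in ntt(prod, inv_root, q)]
    folded = [0] * m                          # reduce modulo x^m - 1
    for i, c in enumerate(linear):
        folded[i % m] = (folded[i % m] + c) % q
    return folded

def cyclic_convolution_naive(A, B, q):
    m = len(A)
    return [sum(A[l] * B[(r - l) % m] for l in range(m)) % q for r in range(m)]

A, B, q = [1, 2, 3, 4], [5, 6, 7, 8], 17
print(cyclic_convolution_ntt(A, B, 9, q))   # [15, 0, 15, 9]
print(cyclic_convolution_naive(A, B, q))    # [15, 0, 15, 9]
```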

Adding all contributions gives

$$
T(n)=2(n-1)+O((n-1)\log n)=O(n\log n).
$$

Because the field operations counted above are precisely the “steps” allowed in an algebraic model, the finite Fourier transform of prime length $n$ over $\mathbb F_q$ is computable in $O_A(n\log n)$ steps, where the constant hidden by $O_A$ is independent of $n$.
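
The reindexing can be checked on a toy instance. In the sketch below the function names and the parameters $n=5$, $q=11$, $\omega=3$ (of order $5$ modulo $11$) and $g=2$ (a generator modulo $5$) are illustrative assumptions. The sketch uses the convention $A_\ell=a_{g^{\ell}}$, $B_\ell=\omega^{g^{-\ell}}$, with the output index written as $g^{-r}$, and evaluates the convolution naively; an FFT-based routine would replace that one line to obtain the stated bound.

```python
def direct_dft(a, omega, q):
    """Transform computed straight from the definition, O(n^2) operations."""
    n = len(a)
    return [sum(a[j] * pow(omega, k * j, q) for j in range(n)) % q
            for k in range(n)]

def rader_dft(a, omega, g, q):
    """Prime-length transform over F_q via one cyclic convolution of length n - 1.

    omega: element of order n in F_q; g: generator of the units modulo n.
    """
    n = len(a)
    m = n - 1
    g_inv = pow(g, n - 2, n)                  # inverse of g modulo the prime n
    A = [a[pow(g, l, n)] for l in range(m)]
    B = [pow(omega, pow(g_inv, l, n), q) for l in range(m)]
    # Naive cyclic convolution, used here only to keep the sketch short.
    conv = [sum(A[l] * B[(r - l) % m] for l in range(m)) % q for r in range(m)]
    out = [0] * n
    out[0] = sum(a) % q
    for r in range(m):
        out[pow(g_inv, r, n)] = (a[0] + conv[r]) % q   # output index g^(-r)
    return out

a = [1, 2, 3, 4, 5]
print(rader_dft(a, 3, 2, 11) == direct_dft(a, 3, 11))  # True
```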
