# Graph non-isomorphism

In this chapter we construct a zero-knowledge protocol around graph non-isomorphism.

This chapter is based on [a lecture from the Max Planck Institute for Informatics](https://resources.mpi-inf.mpg.de/departments/d1/teaching/ss13/gitcs/lecture9.pdf).

# What is a graph?

[A graph](https://en.wikipedia.org/wiki/Graph_(discrete_mathematics)) consists of nodes and edges. Nodes are points in space. Edges are bridges between nodes.
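
As a minimal sketch, a small undirected graph can be written down with nothing but the Python standard library. The names `nodes`, `edges`, and `neighbors` are illustrative and not part of the workshop code, which uses `networkx` graphs instead.

```python
# A graph with four nodes and four edges, stored as plain Python sets.
# Each edge is a frozenset so that (0, 1) and (1, 0) count as the same edge.
nodes = {0, 1, 2, 3}
edges = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}


def neighbors(node: int) -> set:
    """Return all nodes that share an edge with `node`."""
    return {other for edge in edges if node in edge for other in edge if other != node}


print(sorted(neighbors(0)))  # [1, 3]
```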

# What is an isomorphism?

[Two graphs are isomorphic](https://en.wikipedia.org/wiki/Graph_isomorphism) if they have the same structure. By changing the names of the nodes of the first graph, we can obtain the second graph, and vice versa. There exists a translation of node names.
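
The translation of node names can be sketched as a dictionary that is applied to both endpoints of every edge. The helper `relabel` below is hypothetical; the workshop code uses `Mapping.apply_graph` from the local module, which appears to play a similar role.

```python
from typing import Dict, FrozenSet, Set

Edge = FrozenSet[int]


def relabel(edges: Set[Edge], mapping: Dict[int, int]) -> Set[Edge]:
    """Rename both endpoints of every edge according to `mapping`."""
    return {frozenset(mapping[node] for node in edge) for edge in edges}


# A path 0 - 1 - 2 ...
path = {frozenset({0, 1}), frozenset({1, 2})}
# ... and a translation of node names (a permutation of the nodes).
translation = {0: 2, 1: 0, 2: 1}

renamed = relabel(path, translation)
print(sorted(sorted(edge) for edge in renamed))  # [[0, 1], [0, 2]]

# Applying the inverse translation recovers the original graph ("vice versa").
inverse = {new: old for old, new in translation.items()}
assert relabel(renamed, inverse) == path
```

The renamed graph is still a path with three nodes; only the names have changed.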

Given two large graphs, it can be hard to decide whether they are isomorphic. No algorithm is known that decides this efficiently (in polynomial time) for all graphs.
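
To get a feeling for the cost, consider the naive approach of trying every possible translation of node names. The sketch below is a brute-force check, not the algorithm behind `nx.is_isomorphic`; the function name `brute_force_isomorphic` is made up for illustration, and both graphs are assumed to use the same node names. For $n$ nodes it may test up to $n!$ translations.

```python
from itertools import permutations
from math import factorial
from typing import FrozenSet, List, Set

Edge = FrozenSet[int]


def brute_force_isomorphic(nodes: List[int], edges_a: Set[Edge], edges_b: Set[Edge]) -> bool:
    """Try every renaming of `nodes` and check whether it turns graph A into graph B."""
    if len(edges_a) != len(edges_b):
        return False
    for permuted in permutations(nodes):
        mapping = dict(zip(nodes, permuted))
        renamed = {frozenset(mapping[node] for node in edge) for edge in edges_a}
        if renamed == edges_b:
            return True
    return False


nodes = [0, 1, 2, 3]
path = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3)]}
other_path = {frozenset(e) for e in [(2, 0), (0, 3), (3, 1)]}
star = {frozenset(e) for e in [(0, 1), (0, 2), (0, 3)]}

print(brute_force_isomorphic(nodes, path, other_path))  # True: both are paths
print(brute_force_isomorphic(nodes, path, star))  # False: the star has a node with 3 neighbors

# The number of candidate translations grows factorially with the number of nodes.
for n in (4, 10, 20):
    print(n, factorial(n))
```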

# What are we proving?

Peggy and Victor are engaged in an interactive proof.

There are two graphs.

Peggy claims she can tell the two graphs apart (the graphs are non-isomorphic). She wants to prove this to Victor.

Victor is sceptical and wants to see evidence. If the two graphs are actually isomorphic, he wants to expose Peggy as a liar.

Peggy wins if she convinces Victor. Victor wins by accepting only graphs that are structurally different.

# Set up Jupyter

Run the following snippet to set up your Jupyter notebook for the workshop.

In [None]:
import os
import sys

# Add project root so we can import local modules
root_dir = os.path.abspath("..")
if root_dir not in sys.path:
    sys.path.append(root_dir)

# Import here so cells don't depend on each other
from IPython.display import display
from typing import List, Tuple, Dict
import ipywidgets as widgets
import random
import networkx as nx
import matplotlib.pyplot as plt

from local.graph import Mapping, random_graph, non_isomorphic_graph
import local.stats as stats

# Select the scenario

Choose the good or the evil scenario. See how it affects the other cells further down.

1. **Peggy is honest** 😇 She knows a way to differentiate both graphs. She wants to convince Victor of a true statement.
2. **Peggy is lying**  😈 Both graphs actually look the same to her! She tries to fool Victor into believing a false statement.

Also select the **size of the graphs**.

In [None]:
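def generate_graphs(values: Dict):
    """Regenerate both graphs; called whenever a widget value changes."""
    global graph1, graph2, from_1_to_2

    # Use as many edges as there are nodes
    n_edges = n_nodes_slider.value
    graph1 = random_graph(n_nodes_slider.value, n_edges)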

    if honest_dropdown.value:
        # Good: Both graphs are different
        graph2 = non_isomorphic_graph(graph1)
    else:
        # Evil: Both graphs are isomorphic
        from_1_to_2 = Mapping.shuffle_graph(graph1)
        graph2 = from_1_to_2.apply_graph(graph1)

honest_dropdown = widgets.Dropdown(
    options=[
        ("Peggy can differentiate 😇", True),
        ("Peggy cannot differentiate 😈", False)],
    value=True,
    description="Scenario:",
)
honest_dropdown.observe(generate_graphs, names="value")

n_nodes_slider = widgets.IntSlider(min=4, max=20, value=4, step=1, description="#Nodes")
n_nodes_slider.observe(generate_graphs, names="value")

# Generate default values
generate_graphs({})
# Display selection
display(honest_dropdown)
display(n_nodes_slider)

# Visualize your graphs

Visualize the graphs you generated.

In [None]:
print("Graph 1")
nx.draw(graph1, with_labels=True)
plt.show()

print("Graph 2")
nx.draw(graph2, with_labels=True)
plt.show()

# How the proof goes

1. Victor randomly chooses graph 1 or 2 and shuffles it, to obtain graph $S$.
1. Victor sends $S$ to Peggy.
1. Peggy decides if $S$ came from graph 1 or 2 and sends her answer to Victor.
1. Victor checks if Peggy answered correctly.

In [None]:
class Peggy:
    def __init__(self, graph1: nx.Graph, graph2: nx.Graph):
        self.graph1 = graph1
        self.graph2 = graph2
    
    def distinguish(self, shuffled_graph: nx.Graph) -> int:
        if nx.is_isomorphic(self.graph1, shuffled_graph):
            return 0
        else:
            assert nx.is_isomorphic(self.graph2, shuffled_graph)
            return 1


class Victor:
    def __init__(self, graph1: nx.Graph, graph2: nx.Graph):
        self.graphs = [graph1, graph2]
    
    def shuffled_graph(self) -> nx.Graph:
        self.chosen_index = random.randrange(0, 2)
        chosen_graph = self.graphs[self.chosen_index]
        shuffle = Mapping.shuffle_graph(chosen_graph)
        shuffled_graph = shuffle.apply_graph(chosen_graph)
        
        return shuffled_graph
    
    def verify(self, index: int) -> bool:
        return index == self.chosen_index

# Run the proof

Let's see the proof in action.

Run the Python code below and see what happens.

The outcome depends on the scenario you picked. It is also random, so it may differ from run to run.

Feel free to run the code multiple times!

In [None]:
peggy = Peggy(graph1, graph2)
victor = Victor(graph1, graph2)

shuffled_graph = victor.shuffled_graph()
index = peggy.distinguish(shuffled_graph)

if victor.verify(index):
    if honest_dropdown.value:
        print("Victor is convinced 👌 (expected)")
    else:
        print("Victor is convinced 👌 (Victor was fooled)")
else:
    if honest_dropdown.value:
        print("Victor is not convinced... 🤨 (Peggy was dumb)")
    else:
        print("Victor is not convinced... 🤨 (expected)")

# How the proof is complete

If Peggy can differentiate between both graphs, then **Victor will always be convinced** by her proof.

This is because Peggy is always able to answer which graph was shuffled.

Let's run a couple of exchanges and see how they go.

In [None]:
n_exchanges_complete_slider = widgets.IntSlider(min=10, max=1000, value=10, step=10, description="#Exchanges")
n_exchanges_complete_slider

In [None]:
# Good scenario:
# Both graphs are different
graph3 = non_isomorphic_graph(graph1)

honest_peggy = Peggy(graph1, graph3)
victor = Victor(graph1, graph3)

peggy_success = 0

for _ in range(n_exchanges_complete_slider.value):
    shuffled_graph = victor.shuffled_graph()
    index = honest_peggy.distinguish(shuffled_graph)

    if victor.verify(index):
        peggy_success += 1
        
peggy_success_rate = peggy_success / n_exchanges_complete_slider.value * 100

print(f"Running {n_exchanges_complete_slider.value} exchanges.")
print(f"Honest Peggy wins {peggy_success_rate:0.2f}% of the time.")
print()

assert peggy_success_rate == 100
print("Peggy always wins if she is honest.")

# How the proof is sound

If Peggy cannot differentiate both graphs, then **Victor has a chance to reject** her proof.

Because there are two graphs, Peggy has a 50% chance of guessing the graph that Victor shuffled. After a single round, this leaves Victor with little confidence.

We can increase Victor's confidence by running the protocol for **multiple rounds**. This means Victor randomly selects and shuffles a graph multiple times, and each time Peggy has to answer which graph he shuffled. Victor accepts if Peggy answers correctly **every** time. He rejects if Peggy answers incorrectly **even once**.

The chance that Peggy randomly guesses correctly for $n$ rounds is $\left(\frac{1}{2}\right)^n$, which decreases exponentially in $n$. For $n = 20$ this is already less than one in a million. If Peggy answers correctly in every round, then Victor can be confident that she didn't cheat.
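
These numbers can be checked with a few lines of Python. The sketch assumes that a lying Peggy can do no better than guess each round independently with probability $\frac{1}{2}$; the helper names `cheating_probability` and `rounds_needed` are illustrative and not part of the workshop code.

```python
import math


def cheating_probability(n_rounds: int) -> float:
    """Chance that a guessing Peggy answers all `n_rounds` rounds correctly."""
    return 0.5 ** n_rounds


def rounds_needed(max_cheating_probability: float) -> int:
    """Smallest number of rounds that pushes the cheating chance to or below the target."""
    return math.ceil(math.log2(1 / max_cheating_probability))


for n in (1, 5, 10, 20):
    print(f"{n:2d} rounds: {cheating_probability(n):.7f}")

print(rounds_needed(0.001))  # 10, because 2**-10 is just below 0.001
```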

Let's run a couple of exchanges and see how they go.

In [None]:
n_exchanges_sound_slider = widgets.IntSlider(min=10, max=1000, value=10, step=10, description="#Exchanges")
n_rounds_slider = widgets.IntSlider(min=1, max=10, value=1, step=1, description="#Rounds")

display(n_exchanges_sound_slider)
display(n_rounds_slider)

In [None]:
# Evil scenario:
# Both graphs are isomorphic
from_1_to_4 = Mapping.shuffle_graph(graph1)
graph4 = from_1_to_4.apply_graph(graph1)

lying_peggy = Peggy(graph1, graph4)
victor = Victor(graph1, graph4)

victor_success = 0

for _ in range(n_exchanges_sound_slider.value):
    for _ in range(n_rounds_slider.value):
        shuffled_graph = victor.shuffled_graph()
        index = lying_peggy.distinguish(shuffled_graph)
    
        if not victor.verify(index):
            victor_success += 1
            break
            
victor_success_rate = victor_success / n_exchanges_sound_slider.value * 100

print(f"Running {n_exchanges_sound_slider.value} exchanges with {n_rounds_slider.value} rounds each.")
print(f"Victor wins against lying Peggy {victor_success_rate:0.2f}% of the time.")
print()

if victor_success_rate < 50:
    print("Victor loses quite often for a small number of rounds.")
elif victor_success_rate < 90:
    print("Victor gains more confidence with each added round.")
else:
    print("At some point it is basically impossible to fool Victor.")

# How the proof is zero-knowledge

The proof itself looks like random noise. Nothing can be extracted from this noise.

Everything that is sent over the wire is randomized:

1. Victor sends a randomly shuffled graph.
1. Peggy sends an index which depends on Victor's random choice.

We can replicate this pattern without any help from Peggy:

1. Compute a random index (0 or 1).
1. Randomly shuffle the graph at the index.

Let's run a chi-square test to see if the original transcripts are distinguishable from the fake transcripts.

**Try small graphs first!** They require fewer samples than large graphs.
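
The implementation of `stats.chi_square_equal` is not shown in this chapter. As a rough sketch of the idea, a two-sample chi-square statistic for equally sized samples can be computed by counting how often each distinct transcript occurs in each sample. The function name `chi_square_statistic` and the toy samples below are assumptions for illustration only, and the sketch stops at the statistic without turning it into a p-value.

```python
from collections import Counter
from typing import Hashable, Sequence


def chi_square_statistic(sample_a: Sequence[Hashable], sample_b: Sequence[Hashable]) -> float:
    """Two-sample chi-square statistic; assumes both samples have the same size."""
    assert len(sample_a) == len(sample_b), "this simplified formula needs equal sample sizes"
    counts_a = Counter(sample_a)
    counts_b = Counter(sample_b)
    statistic = 0.0
    # Every distinct transcript is one category.
    for outcome in set(counts_a) | set(counts_b):
        a, b = counts_a[outcome], counts_b[outcome]
        statistic += (a - b) ** 2 / (a + b)
    return statistic


# Toy transcripts: (placeholder for a shuffled graph, index)
real = [("g", 0)] * 50 + [("h", 1)] * 50
fake_similar = [("g", 0)] * 48 + [("h", 1)] * 52
fake_different = [("g", 0)] * 90 + [("h", 1)] * 10

print(chi_square_statistic(real, fake_similar))  # small value: samples look alike
print(chi_square_statistic(real, fake_different))  # large value: samples differ
```

Each distinct shuffled graph forms its own category, so larger graphs produce many more categories, and each category needs enough observations for the counts to be meaningful.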

In [None]:
n_transcripts_slider = widgets.IntSlider(min=1000, max=50000, value=10000, step=1000, description="#Transcripts")
n_transcripts_slider

In [None]:
peggy = Peggy(graph1, graph2)
victor = Victor(graph1, graph2)

def real_transcript() -> Tuple:
    shuffled_graph = victor.shuffled_graph()
    index = peggy.distinguish(shuffled_graph)
    
    return tuple(shuffled_graph.edges()), index


def fake_transcript() -> Tuple:
    index = random.randrange(0, 2)
    chosen_graph = [graph1, graph2][index]
    shuffle = Mapping.shuffle_graph(chosen_graph)
    shuffled_graph = shuffle.apply_graph(chosen_graph)
    
    return tuple(shuffled_graph.edges()), index


real_samples = [real_transcript() for _ in range(n_transcripts_slider.value)]
fake_samples = [fake_transcript() for _ in range(n_transcripts_slider.value)]

null_hypothesis = stats.chi_square_equal(real_samples, fake_samples)
print()

if null_hypothesis:
    print("Real and fake transcripts are the same distribution.")
    print("Victor learns nothing 👌")
else:
    print("Real and fake transcripts are different distributions.")
    print("Victor might learn something 😧")

stats.plot_comparison(real_samples, fake_samples, "real", "fake")