# Random Walk on MaxCut Problem

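In this notebook, we demonstrate a random walk heuristic on the MaxCut problem. MaxCut asks for a partition of a graph's nodes into two sets that maximizes the total weight of the edges crossing between them.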

A random walk is a process where an entity makes random steps through a state space, with the next state determined by a random choice from the current one, continuing for a set number of steps or until a stopping condition is met.

1. [Introduction](#intro)
2. [Random Walk](#siminit)
3. [Testing](#testing)
4. [Benchmarked Results](#benchmark)

<a id='intro'></a>
## 1. Introduction

### Imports

In [None]:
%pip install numpy
%pip install networkx
%pip install torch
%pip install pandas

In [1]:
import copy
import time
from typing import List, Union
import numpy as np
import random
import networkx as nx
from util import read_nxgraph
from util import obj_maxcut

copy: deep copies <br>
time: measurement <br>
typing: type annotations <br>
numpy: arrays<br>
random: sampling <br>
networkx: graph manipulation/analysis <br>
util: read graphs, calculate Max-Cut objective function

<a id='siminit'></a>
## 2. Random Walk
The function iteratively updates the current solution by flipping the binary values of randomly selected nodes. In each step, it generates `max_num_flips` candidate solutions: for every `j` from 1 to `max_num_flips`, it flips `j` randomly chosen nodes of the current solution and evaluates the result with `obj_maxcut`. The candidate with the highest score is then accepted as the new current solution if its score is greater than or equal to the current score; otherwise the current solution is kept.

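To make the flip-and-score step concrete, here is a minimal sketch on a 4-node cycle. The helper `cut_value` is a hypothetical stand-in for `obj_maxcut` from `util`, whose implementation is not shown in this notebook; it assumes the objective is the total weight of edges whose endpoints lie on opposite sides of the partition, with a default weight of 1.

```python
import networkx as nx
import numpy as np


def cut_value(solution, graph):
    """Hypothetical stand-in for util.obj_maxcut: total weight of cut edges."""
    return sum(
        data.get("weight", 1)
        for u, v, data in graph.edges(data=True)
        if solution[u] != solution[v]
    )


# 4-node cycle 0-1-2-3-0 with unit weights
graph = nx.cycle_graph(4)

solution = np.array([0, 0, 0, 0])      # all nodes on the same side: nothing is cut
flipped = solution.copy()
flipped[[1, 3]] = 1 - flipped[[1, 3]]  # flip nodes 1 and 3 to the other side

print(cut_value(solution, graph))  # 0
print(cut_value(flipped, graph))   # 4: every edge of the cycle is now cut
```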
In [2]:
def random_walk(init_solution: Union[List[int], np.ndarray], num_steps: int, max_num_flips: int, graph: nx.Graph) -> (int, Union[List[int], np.ndarray], List[int]):
    """
    Performs random walk for Max Cut.

    Parameters:
    -- init_solution: Initial solution
    -- num_steps: Number of steps
    -- max_num_flips: Maximum number of node flips to try in each step
    -- graph: A NetworkX graph for the MaxCut problem

    Return:
    -- score: Final objective value after random walk
    -- curr_solution: Final solution partitioning of nodes
    -- scores: List of objective values at each step
    """
    print('random_walk')
    start_time = time.time()
    curr_solution = copy.deepcopy(init_solution)
    init_score = obj_maxcut(init_solution, graph)
    num_nodes = len(curr_solution)
    scores = []
    nodes = list(range(num_nodes))
    if max_num_flips > num_nodes:
        max_num_flips = num_nodes
    for i in range(num_steps):
        # select nodes randomly
        traversal_scores = []
        traversal_solutions = []
        for j in range(1, max_num_flips + 1):
            selected_nodes = random.sample(nodes, j)
            new_solution = copy.deepcopy(curr_solution)
            new_solution = np.array(new_solution)
            new_solution[selected_nodes] = (new_solution[selected_nodes] + 1) % 2
            new_solution = new_solution.tolist()
            # calc the obj
            new_score = obj_maxcut(new_solution, graph)
            traversal_scores.append(new_score)
            traversal_solutions.append(new_solution)
        best_traversal_score = max(traversal_scores)
        index = traversal_scores.index(best_traversal_score)
        best_traversal_solution = traversal_solutions[index]
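        # accept the best candidate only if it is at least as good as the current score
        # (the initial score is the reference until a candidate has been accepted)
        if best_traversal_score >= (scores[-1] if scores else init_score):
            curr_solution = best_traversal_solution
            scores.append(best_traversal_score)
    # fall back to the initial score if no candidate was ever accepted
    score = max(scores) if scores else init_score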
    print("score, init_score of random_walk", score, init_score)
    print("scores: ", scores)
    print("solution: ", curr_solution)
    running_duration = time.time() - start_time
    print('running_duration: ', running_duration)
    return score, curr_solution, scores


<a id='testing'></a>
## 3. Testing
Below are the sizes of the graphs in our dataset (sourced from https://web.stanford.edu/~yyye/yyye/Gset/). <br>
Note that running the random walk on the largest graphs may take too long to be practical. <br>
| Graph | # Nodes | # Edges |
|-------|---------|---------|
|  G14  |   800   |   4694  |
|  G15  |   800   |   4661  |
|  G22  |  2000   |  19990  |
|  G49  |  3000   |   6000  |
|  G50  |  3000   |   6000  |
|  G55  |  5000   |  12498  |
|  G70  | 10000   |   9999  |

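For orientation, the sketch below parses a small graph laid out the way Gset files are commonly described: a header line with the node and edge counts, followed by one `u v weight` line per edge with 1-based node ids. The function `load_gset` is a hypothetical illustration, not the `read_nxgraph` helper from `util`, and the layout is an assumption that should be checked against the actual data files.

```python
import io

import networkx as nx


def load_gset(lines):
    """Hypothetical reader: build a graph from Gset-style lines.

    Assumes the first line is '<num_nodes> <num_edges>' and each following
    line is '<u> <v> <weight>' with 1-based node ids.
    """
    rows = [line.split() for line in lines if line.strip()]
    num_nodes, num_edges = int(rows[0][0]), int(rows[0][1])
    graph = nx.Graph()
    graph.add_nodes_from(range(num_nodes))
    for u, v, w in rows[1:]:
        graph.add_edge(int(u) - 1, int(v) - 1, weight=int(w))  # shift to 0-based ids
    assert graph.number_of_edges() == num_edges, "edge count does not match header"
    return graph


# Tiny in-memory example in the same layout: a triangle with unit weights
sample = io.StringIO("3 3\n1 2 1\n2 3 1\n1 3 1\n")
graph = load_gset(sample)
print(graph.number_of_nodes(), graph.number_of_edges())  # 3 3
```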
In [3]:
if __name__ == '__main__':
    # read data
    graph = read_nxgraph('data/gset_14.txt')

    # run alg
    init_solution = list(np.random.randint(0, 2, graph.number_of_nodes()))
    rw_score, rw_solution, rw_scores = random_walk(init_solution=init_solution, num_steps=1000, max_num_flips=20, graph=graph)


random_walk
score, init_score of random_walk 2928.0 2374.0
scores:  [2391.0, 2404.0, 2408.0, 2427.0, 2440.0, 2449.0, 2465.0, 2471.0, 2488.0, 2497.0, 2506.0, 2513.0, 2524.0, 2531.0, 2536.0, 2541.0, 2541.0, 2544.0, 2558.0, 2563.0, 2566.0, 2571.0, 2575.0, 2577.0, 2580.0, 2582.0, 2588.0, 2595.0, 2600.0, 2610.0, 2615.0, 2616.0, 2617.0, 2622.0, 2626.0, 2631.0, 2632.0, 2632.0, 2633.0, 2638.0, 2640.0, 2651.0, 2656.0, 2658.0, 2661.0, 2662.0, 2663.0, 2673.0, 2679.0, 2682.0, 2684.0, 2689.0, 2691.0, 2694.0, 2695.0, 2701.0, 2702.0, 2702.0, 2702.0, 2706.0, 2708.0, 2709.0, 2710.0, 2714.0, 2714.0, 2716.0, 2716.0, 2721.0, 2721.0, 2723.0, 2729.0, 2730.0, 2732.0, 2732.0, 2733.0, 2733.0, 2733.0, 2733.0, 2735.0, 2735.0, 2737.0, 2737.0, 2748.0, 2749.0, 2749.0, 2753.0, 2753.0, 2756.0, 2759.0, 2760.0, 2761.0, 2761.0, 2762.0, 2763.0, 2764.0, 2764.0, 2764.0, 2764.0, 2767.0, 2767.0, 2769.0, 2772.0, 2774.0, 2775.0, 2778.0, 2780.0, 2780.0, 2783.0, 2783.0, 2784.0, 2787.0, 2788.0, 2788.0, 2793.0, 2796.0, 2800.0, 280

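The run above starts from a random initial solution drawn with `np.random.randint`, and `random_walk` picks nodes with `random.sample`, so the scores differ from run to run. A minimal sketch of making such draws repeatable is to seed both generators before the run; the seed value and the helper `seeded_draws` are chosen here only for illustration.

```python
import random

import numpy as np

SEED = 0  # arbitrary value chosen for illustration


def seeded_draws(seed):
    """Seed both generators, then draw an initial solution and a node sample."""
    random.seed(seed)     # controls random.sample, used for node selection
    np.random.seed(seed)  # controls np.random.randint, used for the initial solution
    init_solution = np.random.randint(0, 2, 8).tolist()
    selected_nodes = random.sample(range(8), 3)
    return init_solution, selected_nodes


first = seeded_draws(SEED)
second = seeded_draws(SEED)
print(first == second)  # True: identical seeds give identical draws
```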
<a id='benchmark'></a>
## 4. Benchmarked Results

The random walk algorithm was benchmarked on six of the Gset graphs listed above. The seventh graph, G70, was omitted because it was too large to run in a reasonable amount of time.

![title](images/random_walk_scores.png)