In this Jupyter notebook, we demonstrate how manipulated gradients can boost the performance of FEM.
The example is the MaxCut problem on the Gset instance G1. The best known cut value for this instance is 11624.
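
To make the objective concrete, here is a minimal sketch of how a cut value can be computed for a graph given in the Gset text layout. It assumes the usual Gset convention: a header line with the node and edge counts, followed by one `u v w` line per edge with 1-based node indices. The helper names `parse_gset` and `cut_value` are hypothetical and are not part of the FEM package, and the toy graph is not a real Gset instance.

```python
def parse_gset(text, index_start=1):
    """Parse Gset-style text into (num_nodes, edges) with 0-based node indices."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    num_nodes, num_edges = int(lines[0][0]), int(lines[0][1])
    # Shift node labels so that the first node becomes index 0.
    edges = [(int(u) - index_start, int(v) - index_start, float(w))
             for u, v, w in lines[1:]]
    assert len(edges) == num_edges, "edge count does not match the header"
    return num_nodes, edges


def cut_value(edges, side):
    """Sum the weights of edges whose endpoints lie on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])


# Toy 4-node cycle written in the assumed layout (illustrative only).
toy = """4 4
1 2 1
2 3 1
3 4 1
4 1 1
"""
num_nodes, edges = parse_gset(toy)
best_side = [0, 1, 0, 1]  # alternate sides around the cycle
print(cut_value(edges, best_side))
```

Alternating the sides around the cycle cuts every edge, while putting all nodes on one side cuts none. MaxCut asks for the assignment that maximizes this sum.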

First, load the required packages. Note that PyTorch version 2.0 or later is required.

In [1]:
import sys
sys.path.append('../')
from FEM import FEM
import torch
import time

Next, set the hyperparameters.

In [2]:
num_trials = 500
num_steps = 1000
dev = 'cuda'  # use 'cpu' here if no GPU is available

For the first case, we use the automatic-differentiation version of the FEM solver, with the `manual_grad` argument set to `False`.
The timings reported here were measured on an NVIDIA A100 80G GPU.


In [3]:
case_maxcut = FEM.from_file(
    'maxcut', 'instances/G1.txt', index_start=1
)
case_maxcut.set_up_solver(
    num_trials, num_steps, manual_grad=False, betamin=0.001, betamax=0.5, 
    learning_rate=0.1, optimizer='rmsprop', dev=dev
)
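if torch.device(dev).type == 'cuda':
    torch.cuda.synchronize(dev)  # wait for pending GPU work so the timing is accurate
t0 = time.perf_counter()
config, result = case_maxcut.solve()
if torch.device(dev).type == 'cuda':
    torch.cuda.synchronize(dev)  # skipped on 'cpu', where this call would fail
t1 = time.perf_counter()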
optimal_inds = torch.argwhere(result==result.max()).reshape(-1)
print(f'maxcut instance with auto-differentiation, cut value {result.max():.0f}, consume {t1-t0:.4f} seconds.')

maxcut instance with auto-differentiation, cut value 11559, consume 1.6089 seconds.


For the second case, we use the explicit-gradient version of the FEM solver, with the `manual_grad` argument set to `True`.
Explicit gradient means that the gradients with respect to the parameters are written out in closed form rather than obtained by automatic differentiation.
The running time is much lower than in the previous case (0.4201 s versus 1.6089 s), and the cut value is slightly better (11617 versus 11559), which may be due to numerical error in automatic differentiation.
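
To illustrate what an explicit gradient looks like, the sketch below writes out the gradient of a mean-field expected cut by hand and checks it against central finite differences. It is not the FEM package's code: the objective (independent node marginals `p[i]`, with each edge contributing `w * (p[u] * (1 - p[v]) + p[v] * (1 - p[u]))`) is an assumed stand-in, and all function names are hypothetical. It uses only the standard library, so it runs without a GPU.

```python
def expected_cut(edges, p):
    """Expected cut weight when node i is on side 1 with independent probability p[i]."""
    return sum(w * (p[u] * (1 - p[v]) + p[v] * (1 - p[u])) for u, v, w in edges)


def explicit_grad(edges, p):
    """Hand-derived gradient: each edge contributes w * (1 - 2 * p[v]) to d/dp[u]."""
    grad = [0.0] * len(p)
    for u, v, w in edges:
        grad[u] += w * (1 - 2 * p[v])
        grad[v] += w * (1 - 2 * p[u])
    return grad


def finite_diff_grad(edges, p, eps=1e-6):
    """Central finite differences, used here only as an independent check."""
    grad = []
    for i in range(len(p)):
        hi = p[:i] + [p[i] + eps] + p[i + 1:]
        lo = p[:i] + [p[i] - eps] + p[i + 1:]
        grad.append((expected_cut(edges, hi) - expected_cut(edges, lo)) / (2 * eps))
    return grad


edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]  # toy 4-cycle
p = [0.9, 0.2, 0.6, 0.4]
g_explicit = explicit_grad(edges, p)
g_numeric = finite_diff_grad(edges, p)
max_gap = max(abs(a - b) for a, b in zip(g_explicit, g_numeric))
print(g_explicit, max_gap)
```

A closed-form gradient like this needs no computation graph to be recorded and traversed, which is one plausible reason for the shorter running time reported here.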

In [5]:
case_maxcut = FEM.from_file(
    'maxcut', 'instances/G1.txt', index_start=1
)
case_maxcut.set_up_solver(
    num_trials, num_steps, manual_grad=True, betamin=0.001, betamax=0.5, 
    learning_rate=0.1, optimizer='rmsprop', dev=dev
)
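if torch.device(dev).type == 'cuda':
    torch.cuda.synchronize(dev)  # wait for pending GPU work so the timing is accurate
t0 = time.perf_counter()
config, result = case_maxcut.solve()
if torch.device(dev).type == 'cuda':
    torch.cuda.synchronize(dev)  # skipped on 'cpu', where this call would fail
t1 = time.perf_counter()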
optimal_inds = torch.argwhere(result==result.max()).reshape(-1)
print(f'maxcut instance with explicit gradients, cut value {result.max():.0f}, consume {t1-t0:.4f} seconds.')

maxcut instance with explicit gradients, cut value 11617, consume 0.4201 seconds.


For the final case, we use the manipulated-gradient version of the FEM solver, with both the `manual_grad` and `discretization` arguments set to `True`.
The `discretization` argument replaces the marginal `p` in the energy gradient with `p.round()`, which discretizes the marginals into binary configurations.
With this manipulated gradient, the FEM solver finds the best known cut value (11624).
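
The effect of substituting `p.round()` for `p` can be sketched on a toy objective. Assume, for illustration only, a mean-field expected cut whose gradient with respect to `p[u]` is the sum over neighbours `v` of `w * (1 - 2 * p[v])`; the manipulated gradient then evaluates that expression at the rounded marginals. How FEM applies this internally is not shown in this notebook, so the function `cut_grad` below is a hypothetical stand-in written with the standard library only.

```python
def cut_grad(edges, p, discretization=False):
    """Gradient of an assumed mean-field expected cut with respect to each marginal.

    With discretization=True the marginals are rounded to 0 or 1 before the
    gradient is evaluated, mimicking the p -> p.round() substitution.
    """
    q = [float(round(x)) for x in p] if discretization else list(p)
    grad = [0.0] * len(p)
    for u, v, w in edges:
        grad[u] += w * (1 - 2 * q[v])
        grad[v] += w * (1 - 2 * q[u])
    return grad


edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]  # toy 4-cycle
p = [0.9, 0.2, 0.6, 0.4]
smooth = cut_grad(edges, p)
manipulated = cut_grad(edges, p, discretization=True)
print(smooth)
print(manipulated)
```

In this toy case the manipulated gradient keeps the sign of the smooth one but takes its full magnitude, as if every neighbour had already committed to a side.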

In [6]:
case_maxcut = FEM.from_file(
    'maxcut', 'instances/G1.txt', index_start=1, discretization=True
)
case_maxcut.set_up_solver(
    num_trials, num_steps, manual_grad=True, betamin=0.001, betamax=0.5, 
    learning_rate=0.1, optimizer='rmsprop', dev=dev
)
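if torch.device(dev).type == 'cuda':
    torch.cuda.synchronize(dev)  # wait for pending GPU work so the timing is accurate
t0 = time.perf_counter()
config, result = case_maxcut.solve()
if torch.device(dev).type == 'cuda':
    torch.cuda.synchronize(dev)  # skipped on 'cpu', where this call would fail
t1 = time.perf_counter()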
optimal_inds = torch.argwhere(result==result.max()).reshape(-1)
print(f'maxcut instance with manipulated gradients, cut value {result.max():.0f}, consume {t1-t0:.4f} seconds.')

maxcut instance with manipulated gradients, cut value 11624, consume 0.4248 seconds.
