# Max Cut approach

After testing NMcut, I wondered whether Max Cut could be used directly to split the distance matrix.

---

Here we work with a distance matrix directly, so it has to be computed first. Since every element of the existing matrices is normalized between 0 and 100, no new functions are needed: each matrix can be converted element-wise by subtracting it from 100.
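As a minimal sketch of that conversion, the example below applies it to a tiny hand-made matrix. The values are hypothetical, and it assumes the stored matrices hold similarity-like scores on the 0 to 100 scale, so that `100 - x` behaves like a distance.

```python
import numpy as np

# Hypothetical 3x3 score matrix on the 0-100 scale (illustrative values only).
similarity = np.array([
    [100.0, 80.0, 30.0],
    [80.0, 100.0, 25.0],
    [30.0, 25.0, 100.0],
])

# "Inverting" here means reflecting each element on the 0-100 scale,
# not computing a matrix inverse.
distance = 100.0 - similarity

print(distance)
```

Identical items (score 100) end up at distance 0, and the result stays symmetric whenever the input is.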

In [1]:
import os, sys
import numpy as np
sys.path.append('../')

In [None]:
# Create the output folders 8 to 39
for i in range(8, 40):
    os.makedirs(f'../distance_matrices/{i}', exist_ok=True)

file_names = os.listdir('../matrices')

for folder in file_names:
    files = os.listdir(f'../matrices/{folder}')
    for file in files:
        # Load the stored matrix
        distance_matrix = np.load(f'../matrices/{folder}/{file}')
        # Invert the matrix
        distance_matrix = 100 - distance_matrix
        # Save the inverted matrix
        np.save(f'../distance_matrices/{folder}/{file}', distance_matrix)

        del distance_matrix

---

We now have the distance matrices derived from the alignments. Next, we define the new optimization problem, using Max Cut as a basis:

$$Max_{cut}=\sum_{i=1}^{n-1}\sum_{j=0}^{i-1}D_{ij}z_iz_j,\quad z_i\in \{-1,1\}.$$

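A pair placed in different groups has $z_iz_j=-1$ and contributes $-D_{ij}$, so minimizing this expression maximizes the total distance across the cut. Our problem can therefore be defined as: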

$$\min Max_{cut}$$
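To see why minimizing the spin objective yields a maximum cut, here is a small brute-force sketch. The 4x4 matrix `D` and the helper names `objective` and `cut_weight` are hypothetical and only serve as illustration; the check relies on the identity objective = (sum of all pair distances) - 2 * (cut weight).

```python
from itertools import product

# Hypothetical symmetric distances: items 0 and 1 are close, items 2 and 3 are close.
D = [
    [0, 1, 9, 8],
    [1, 0, 7, 9],
    [9, 7, 0, 2],
    [8, 9, 2, 0],
]
n = len(D)


def objective(z):
    """Spin objective: sum of D[i][j] * z[i] * z[j] over the lower triangle."""
    return sum(D[i][j] * z[i] * z[j] for i in range(1, n) for j in range(i))


def cut_weight(z):
    """Total distance between items assigned to different groups."""
    return sum(D[i][j] for i in range(1, n) for j in range(i) if z[i] != z[j])


# Enumerate all 2^n spin assignments and keep the one with the lowest objective.
best = min(product((-1, 1), repeat=n), key=objective)
total = sum(D[i][j] for i in range(1, n) for j in range(i))

print(best, objective(best), cut_weight(best))
```

The minimizer separates {0, 1} from {2, 3}, which is also the assignment with the largest cut weight. Brute force is only practical for very small `n`, since the number of assignments doubles with each added item.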

In [None]:
import dimod
from dimod import BinaryQuadraticModel, BINARY, SPIN
from dimod import ExactSolver

distance_matrix = np.load('../distance_matrices/8/matrix_1822.npy')
# Build the coupling dict J for the BQM from the lower triangle,
# skipping zero entries
J = {}
for i in range(len(distance_matrix)):
    for j in range(i):
        if distance_matrix[i][j] != 0:
            J[(i, j)] = distance_matrix[i][j]

model = BinaryQuadraticModel({},J,0.0,SPIN)
print(model)

BinaryQuadraticModel({1: 0.0, 0: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0}, {(0, 1): 39.87122337790986, (2, 1): 2.7322404371584668, (2, 0): 40.2281746031746, (3, 1): 39.615194869264926, (3, 0): 40.3448275862069, (3, 2): 40.76086956521739, (4, 1): 40.82286277408229, (4, 0): 41.00861008610086, (4, 2): 41.72218110041944, (4, 3): 6.00343053173242, (5, 1): 44.07324919574363, (5, 0): 39.31307141092167, (5, 2): 45.13011152416357, (5, 3): 43.78538026089097, (5, 4): 46.017699115044245, (6, 1): 46.824808500123545, (6, 0): 41.67283493708364, (6, 2): 47.73570898292502, (6, 3): 48.09535512410912, (6, 4): 50.220913107511045, (6, 5): 12.179487179487182, (7, 1): 68.59340114115604, (7, 0): 63.93361406985385, (7, 2): 68.09937888198758, (7, 3): 68.51714779175919, (7, 4): 70.23164120256284, (7, 5): 67.97029702970298, (7, 6): 70.09391992090954}, 0.0, 'SPIN')


In [14]:
# ExactDQMSolver.sample_dqm expects a DiscreteQuadraticModel; called with a BQM it
# fails with "TypeError: 'int' object is not callable", so use ExactSolver instead.
solver = dimod.ExactSolver()
sol = solver.sample(model)

# A BQM has no constraints, so every sample is feasible.
# The best solution is simply the lowest-energy sample.
best = sol.first
print(best.sample, best.energy)
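Once a spin assignment is available, the split can be read off by grouping the variable indices by sign. The sketch below assumes the assignment is a plain dict mapping each index to -1 or +1; `partition_from_spins` is a hypothetical helper, and the example assignment is made up for illustration rather than taken from a solver run.

```python
def partition_from_spins(sample):
    """Split variable labels into two sorted groups according to their spin."""
    group_minus = sorted(v for v, spin in sample.items() if spin == -1)
    group_plus = sorted(v for v, spin in sample.items() if spin == 1)
    return group_minus, group_plus


# Made-up spin assignment for four variables (not a solver result).
example_sample = {0: -1, 1: -1, 2: 1, 3: 1}

group_a, group_b = partition_from_spins(example_sample)
print(group_a, group_b)
```

Flipping every spin gives the same split with the two groups swapped, so which group is labelled -1 carries no meaning.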