# Solving Sudoku with exact cover

Sudoku solved with an exact cover algorithm, formulated as a QUBO and run on a quantum annealer.

A sudoku can be converted to an exact cover problem by the following procedure:

- The elements of the set $U$ in the exact cover problem represent the sudoku rules as constraints: cell $(y,x)$ is filled, block $(\lfloor y/3 \rfloor, \lfloor x/3 \rfloor)$ has number $n$ somewhere in the block, row $y$ has number $n$, and column $x$ has number $n$. There are 81 cells, and 9 numbers for each of the 9 blocks, 9 rows, and 9 columns, so altogether there are $4 \cdot 9 \cdot 9 = 324$ elements in $U$.
- Every subset $V_i \in V$ is one option for filling a cell of the sudoku: number $n$ is placed in cell $(y,x)$. Each of these subsets has four elements: cell $(y,x)$ is filled, block $(\lfloor y/3 \rfloor, \lfloor x/3 \rfloor)$ has number $n$, row $y$ has number $n$, and column $x$ has number $n$.
- To reduce the number of qubits needed, “evidently impossible” choices of $n$ for cell $(y,x)$ are not included in $V$, for example when number $n$ is already in that row.

After this, the problem is solved by the exact cover algorithm using the set $V$.
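
As a minimal sketch of the mapping described above, the function below numbers the constraints of a 9x9 grid in four groups (cell, block, row, column) and returns the four elements covered by one option. The name `option_elements` and its signature are illustrative assumptions, not part of the notebook.

```python
def option_elements(y, x, n, ssize=3):
    """Return the four constraint indices covered by placing number n (1-based)
    in cell (y, x) (0-based) of an ssize^2 x ssize^2 sudoku."""
    size2 = ssize * ssize
    cells = size2 * size2              # number of constraints in each group
    d = n - 1                          # 0-based number index
    block = (y // ssize) * ssize + (x // ssize)
    return [
        y * size2 + x,                 # cell (y, x) is filled
        cells + block * size2 + d,     # block has number n
        2 * cells + y * size2 + d,     # row y has number n
        3 * cells + x * size2 + d,     # column x has number n
    ]

# All options of an empty 9x9 grid and the constraints they touch.
options = [option_elements(y, x, n)
           for y in range(9) for x in range(9) for n in range(1, 10)]
covered = {e for opt in options for e in opt}
print(len(options), len(covered))      # prints: 729 324
```

An empty grid therefore gives 729 subsets over 324 elements; the pruning of impossible options is what brings the subset count down for a partly filled grid.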

In [109]:
import numpy as np
import time
import dimod
from dwave.system import DWaveSampler, EmbeddingComposite, LeapHybridSampler
from dwave.samplers import SimulatedAnnealingSampler
import dwave.inspector
from minorminer.busclique import find_clique_embedding

## Some helper functions

In [172]:
def count_zeros(sudoku):
    count = 0
    for y in range(size2):
        for x in range(size2):
            if sudoku[y][x]==0:
                count += 1
    return count

def print_sudoku(sudoku):
    for y in range(size2):
        if y!=0 and y%ssize==0:
            if ssize==4: print('----+----+----+----')      
            if ssize==3: print('---+---+---')      
            if ssize==2: print('--+--')      
        for x in range(size2):
            if x!=0 and x%ssize==0:
                print('|',end='')
            n = int(sudoku[y][x])
            print('.' if n==0 else n , end='')
        print('')
        
def merge_sudoku(sudoku1, sudoku2):
    sudoku_res = np.zeros((ssize*ssize,ssize*ssize))
    for y in range(size2):
        for x in range(size2):
            if sudoku1[y][x]==0:
                sudoku_res[y][x] = sudoku2[y][x]
            else:
                sudoku_res[y][x] = sudoku1[y][x]
                if sudoku2[y][x]>0:
                    print('! Sudokus have overlapping cell: {},{}'.format(y+1,x+1))
    return sudoku_res

def check_sudoku(sudoku):
    f = 0
    for i in range(size2):
        g = []
        for j in range(size2):
            g.append(sudoku[i][j])
        for n in range(1,size2+1):
            if not n in g:
                print('Number {} missing in row {}.'.format(n,i+1))
                f += 1
    for i in range(size2):
        g = []
        for j in range(size2):
            g.append(sudoku[j][i])
        for n in range(1,size2+1):
            if not n in g:
                print('Number {} missing in column {}.'.format(n,i+1))
                f += 1
    for i1 in range(ssize):
        for i2 in range(ssize):
            g = []
            for j1 in range(ssize):
                for j2 in range(ssize):
                    g.append(sudoku[i1*ssize+j1][i2*ssize+j2])
            for n in range(1,size2+1):
                if not n in g:
                    print('Number {} missing in block {},{}.'.format(n,i1+1,i2+1))
                    f += 1
    if f==0:
        print('sudoku OK')
    else:
        print('number of problems:',f)
                     

## Load sudoku

Sudoku file structure: the first line is the number of sudokus. The first line of every sudoku is the size of the sudoku.

`3_level_48.ss`: these should be rather easy sudokus, with 48 cells not filled.
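
The sketch below shows one layout that is consistent with the description above and with the indexing used by the loader in this notebook. The example content is made up, and the exact separator characters (`|` inside a row, a separator line after each band of rows) are an assumption inferred from the loader, not taken from the real test files. The names `example` and `read_sudokus` are hypothetical.

```python
import io

# Hypothetical file content: one 4x4 sudoku (size 2), '.' marks an empty cell.
example = """1
2
1.|.4
..|2.
--+--
.1|..
3.|.2
"""

def read_sudokus(f):
    """Parse sudokus from a text stream laid out as assumed above."""
    scount = int(f.readline())         # number of sudokus
    ssize = int(f.readline())          # size of the first sudoku
    size2 = ssize * ssize
    grids = []
    for _ in range(scount):
        grid = []
        for _band in range(ssize):
            for _row in range(ssize):
                line = f.readline()
                # skip one separator character after every ssize cells
                chars = [line[l + l // ssize] for l in range(size2)]
                grid.append([0 if c == '.' else int(c) for c in chars])
            f.readline()               # separator line, or the next sudoku's size line
        grids.append(grid)
    return ssize, grids

ssize_demo, grids = read_sudokus(io.StringIO(example))
print(grids[0])
```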

In [10]:
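# Read all sudokus into one array of shape (scount, ssize*ssize, ssize*ssize);
# 0 marks an empty cell.
with open('testdata/3_level_48.ss','r') as f:
#with open('testdata/2_level_varia.ss','r') as f:
    scount = int(f.readline())   # number of sudokus in the file
    ssize = int(f.readline())    # size of the sudoku (3 for a 9x9 grid)

    sudoku = np.zeros((scount,ssize*ssize,ssize*ssize))

    for i in range(scount):
        for j in range(ssize):              # band of ssize rows
            for k in range(ssize):          # row inside the band
                line = f.readline()
                for l in range(ssize*ssize):
                    c = line[l + l//ssize]  # skip one separator character after every ssize cells
                    sudoku[i][j*ssize+k][l] = 0 if c=='.' else int(c)
            line = f.readline()             # consume the line that follows each band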

## Create set and subset from sudoku

A sudoku can be converted to an exact cover problem:
- The set elements are the constraints, in four groups: 1) every cell can have only one number, 2) every 3x3 block can have each number only once, 3) every row can have each number only once, 4) every column can have each number only once.
- The subsets are all options in all cells, so an empty sudoku has 9x9x9 subsets. For slightly better efficiency, only empty cells are counted here.

Each of the four groups has 81 elements: 1) there are 81 cells in a sudoku, 2) a sudoku has 9 blocks, with one element for each number, 3) 9 rows, with one element for each number, 4) 9 columns, with one element for each number.

Some preprocessing is done: the constraints already satisfied by the known numbers are collected into one list, and every option that conflicts with that list is dropped.
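
The preprocessing can be sketched on a small 4x4 sudoku. This is a simplified, self-contained illustration of the same idea; the function name `build_options` and the example grid are assumptions made for the sketch.

```python
def build_options(grid, ssize):
    """Return (options, labels) for the empty cells of grid after pruning.

    Each option is [cell, block, row, column] as constraint indices;
    each label is (y, x, n) with n the 1-based number.
    """
    size2 = ssize * ssize
    cells = size2 * size2

    def elements(y, x, d):
        block = (y // ssize) * ssize + (x // ssize)
        return [y * size2 + x,
                cells + block * size2 + d,
                2 * cells + y * size2 + d,
                3 * cells + x * size2 + d]

    # Constraints already satisfied by the known numbers.
    covered = set()
    for y in range(size2):
        for x in range(size2):
            if grid[y][x] != 0:
                covered.update(elements(y, x, grid[y][x] - 1))

    options, labels = [], []
    for y in range(size2):
        for x in range(size2):
            if grid[y][x] != 0:
                continue                     # known cells get no options
            for d in range(size2):
                opt = elements(y, x, d)
                # Drop numbers already present in this block, row, or column.
                if not covered.intersection(opt[1:]):
                    options.append(opt)
                    labels.append((y, x, d + 1))
    return options, labels

grid = [[1, 0, 0, 4],
        [0, 0, 2, 0],
        [0, 1, 0, 0],
        [3, 0, 0, 2]]
options, labels = build_options(grid, 2)
print(len(options), 'options instead of', 4 * 4 * 4)
```

For this grid the pruning leaves 16 options instead of 64, which is the kind of reduction that keeps the qubit count down.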

In [65]:
sind = 0             # which sudoku from the list
size2 = ssize*ssize

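U = [i for i in range(size2*size2*4)]  # 4 groups of constraints: cell, block, row, column
V = []                                  # one subset per option: [cell, block, row, column]
cell_id = []                            # (y, x, n) for each subset in V

t1 = time.time()

# constraints already covered by the known numbers; options that touch them are “evidently impossible”
c = []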
for y in range(size2):
    for x in range(size2):
        if sudoku[sind][y][x]!=0:
            n = sudoku[sind][y][x]-1
            c.append(int(y*size2+x))
            c.append(int(size2*size2 + int(y/ssize)*size2*ssize + int(x/ssize)*size2 + n))
            c.append(int(2*size2*size2 + y*size2 + n))
            c.append(int(3*size2*size2 + x*size2 + n))
c.sort()
            
y=0
x=0
while y<size2:
    if sudoku[sind][y][x]==0:
        for j in range(size2):
            # cell, block, row, column, restriction
            bl = size2*size2 + int(y/ssize)*size2*ssize + int(x/ssize)*size2 + j
            row = 2*size2*size2 + y*size2 + j
            col = 3*size2*size2 + x*size2 + j
            if not (bl in c or row in c or col in c):
                V.append([y*size2+x, bl, row, col])
                cell_id.append((y,x,j+1))
    x += 1
    if x==size2:
        x = 0
        y += 1

z = count_zeros(sudoku[sind])
t2 = time.time()
print('Time used for constructing subsets (ms): {:.1f}'.format((t2-t1)*1000))
print('Number of blanks:',z)            
print('Number of elements:',len(U))
print('Number of subsets:',len(V))
print('Number of subsets if no zipping:',size2*size2*size2)

Time used for constructing subsets (ms): 3.2
Number of blanks: 48
Number of elements: 324
Number of subsets: 141
Number of subsets if no zipping: 729


## Create QUBO

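Constraints:
- the total number of elements in the selected subsets should be $|U|$ (here four for every blank cell, because the known numbers are left out)
- each element may be in only one selected subset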
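
To make the two constraints concrete, here is a minimal brute-force sketch on a toy instance. It uses the same weights as the cell below: a reward of $-|V_i|$ on the diagonal and a penalty of $|V_i| + |V_j| + 2$ for every pair of overlapping subsets, so that the energy is $E(x) = -\sum_i |V_i| x_i + \sum_{i<j,\ V_i \cap V_j \neq \emptyset} (|V_i| + |V_j| + 2)\, x_i x_j$. The toy sets and the dictionary representation of `Q` are assumptions made for this example only.

```python
from itertools import product

U = [0, 1, 2, 3]
V = [[0, 1], [2, 3], [1, 2], [3]]

# Upper-triangular QUBO as a dict {(i, j): weight}.
Q = {}
for i in range(len(V)):
    Q[(i, i)] = -len(V[i])                          # reward for covered elements
    for j in range(i + 1, len(V)):
        if set(V[i]) & set(V[j]):                   # the two subsets overlap
            Q[(i, j)] = len(V[i]) + len(V[j]) + 2   # penalty exceeds both rewards

def energy(x):
    """QUBO energy of a 0/1 assignment x."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())

# Enumerate all 2^4 assignments and keep the lowest-energy one.
best = min(product([0, 1], repeat=len(V)), key=energy)
print(best, energy(best))                           # prints: (1, 1, 0, 0) -4
```

The minimum selects `V[0]` and `V[1]`, the only exact cover of `U`, and its energy is $-|U|$. Because the penalty is larger than the combined reward of the two overlapping subsets, selecting both is never better than selecting neither.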

In [96]:
Q = np.zeros((len(V),len(V)))
t1 = time.time()

# Total elements constraint
for i in range(len(V)):
    Q[i][i] = -len(V[i])

# each element only in one subset
for a in U:
    for i in range(len(V)):
        for j in range(i+1, len(V)):
            if a in V[i] and a in V[j]:
                Q[i][j] = len(V[i]) + len(V[j]) + 2
                
t2 = time.time()
print('Time used for construction Q (ms): {:.1f}'.format((t2-t1)*1000))

Time used for construction Q (ms): 353.7


## Create BQM from QUBO

In [100]:
bqm = dimod.BinaryQuadraticModel(Q, 'BINARY')
print('Number of logical qubits needed:',Q.shape[0])
print('Number of logical couplers needed:', len(bqm.quadratic))

Number of logical qubits needed: 141
Number of logical couplers needed: 588


## Local heuristic classical solver

A local deterministic solver cannot be used here because it fails with the error "Maximum allowed dimension exceeded".
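
An exhaustive solver has to consider every assignment of the binary variables. As a back-of-the-envelope sketch, assuming the 141 logical variables reported above and a purely hypothetical rate of $10^9$ evaluations per second, the size of that search space is:

```python
n_vars = 141                        # logical variables in this QUBO (reported above)
n_states = 2 ** n_vars              # assignments an exhaustive search would visit
rate = 10 ** 9                      # assumed evaluations per second (hypothetical)
years = n_states / rate / (3600 * 24 * 365)
print('assignments: {:.2e}'.format(n_states))
print('years at the assumed rate: {:.1e}'.format(years))
```

Whatever the exact cause of the error message, enumeration is out of reach at this size, which is why a heuristic sampler is used instead.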

In [101]:
num_reads = 100
t1 = time.time()
sampleset = SimulatedAnnealingSampler().sample(bqm, num_reads=num_reads).aggregate()
t2 = time.time()
print('Time used by solver (s): {:.1f}'.format((t2-t1)))
print('Lowest energy reached:',int(sampleset.first.energy))
print('Lowest energy should be:',-count_zeros(sudoku[sind])*4)   
print('Lowest energy occurrences: {} %'.format(int(sampleset.first.num_occurrences/num_reads*100)))

Time used by solver (s): 0.3
Lowest energy reached: -192
Lowest energy should be: -192
Lowest energy occurrences: 24 %


## Then analyse results...

In [160]:
sudoku_res = np.zeros((ssize*ssize,ssize*ssize))
r = sampleset.first.sample
for k,v in r.items():
    if v==1 and k<len(cell_id):
        y,x,n = cell_id[k]
        sudoku_res[y][x] = n

In [161]:
print_sudoku(sudoku[sind])

..9|..3|.1.
..1|.27|.8.
...|15.|963
---+---+---
5.8|..1|.49
94.|..6|.3.
...|..9|...
---+---+---
726|3.8|1.4
.34|.6.|..2
...|...|...


In [162]:
print_sudoku(sudoku_res)

45.|68.|2.7
36.|9..|4.5
287|..4|...
---+---+---
.7.|23.|6..
..2|87.|5.1
613|54.|728
---+---+---
...|.9.|.5.
1..|7.5|89.
895|412|376


In [163]:
merge = merge_sudoku(sudoku[sind],sudoku_res)

In [164]:
print_sudoku(merge)

459|683|217
361|927|485
287|154|963
---+---+---
578|231|649
942|876|531
613|549|728
---+---+---
726|398|154
134|765|892
895|412|376


In [165]:
check_sudoku(merge)

sudoku OK


## Quantum solver

In [120]:
machine = DWaveSampler(solver={'chip_id': 'Advantage_system4.1'})
print('Chip:', machine.properties['chip_id'])
print('Qubits:', machine.properties['num_qubits'])

Chip: Advantage_system4.1
Qubits: 5760


In [140]:
num_reads = 1000

embedding = find_clique_embedding(bqm.variables, machine.to_networkx_graph())  
num_qubits_needed = sum(len(chain) for chain in embedding.values())
print('Number of actual qubits needed:',num_qubits_needed)

anneal_schedule = [[0.0, 0.0], [40.0, 0.4], [1040.0, 0.4], [1042, 1.0]]
estimated_runtime = machine.solver.estimate_qpu_access_time(num_qubits_needed, num_reads=num_reads, anneal_schedule=anneal_schedule)    
print("Estimate of {:.0f}ms on {}".format(estimated_runtime/1000, machine.solver.name)) 

Number of actual qubits needed: 1952
Estimate of 1313ms on Advantage_system4.1


In [141]:
sampleset2 = EmbeddingComposite(machine).sample(bqm, num_reads=num_reads)

In [142]:
qtime = sampleset2.info['timing']['qpu_access_time'] / 1000
qubits = sum(len(x) for x in sampleset2.info['embedding_context']['embedding'].values())
print('Lowest energy should be:',-count_zeros(sudoku[sind])*4)  
print('Lowest energy reached:',int(sampleset2.first.energy))
print('Lowest energy occurrences: {} %'.format(int(sampleset2.first.num_occurrences/num_reads*100)))
print('QPU time used (ms): {:.1f}'.format(qtime))
print('Physical qubits used: {}'.format(qubits))

Lowest energy should be: -192
Lowest energy reached: -152
Lowest energy occurrences: 0 %
QPU time used (ms): 207.0
Physical qubits used: 572


In [166]:
sudoku_res2 = np.zeros((ssize*ssize,ssize*ssize))
r = sampleset2.first.sample
for k,v in r.items():
    if v==1 and k<len(cell_id):
        y,x,n = cell_id[k]
        sudoku_res2[y][x] = n

In [167]:
print_sudoku(sudoku[sind])

..9|..3|.1.
..1|.27|.8.
...|15.|963
---+---+---
5.8|..1|.49
94.|..6|.3.
...|..9|...
---+---+---
726|3.8|1.4
.34|.6.|..2
...|...|...


In [168]:
print_sudoku(sudoku_res2)

45.|8..|2.7
36.|...|5..
2.7|..4|...
---+---+---
...|...|6..
..2|58.|..1
613|24.|758
---+---+---
...|...|.9.
1..|7.5|8..
895|412|376


In [175]:
merge2 = merge_sudoku(sudoku[sind],sudoku_res2)

In [176]:
print_sudoku(merge2)

459|8.3|217
361|.27|58.
2.7|154|963
---+---+---
5.8|..1|649
942|586|.31
613|249|758
---+---+---
726|3.8|194
134|765|8.2
895|412|376


In [178]:
check_sudoku(merge2)

Number 6 missing in row 1.
Number 4 missing in row 2.
Number 9 missing in row 2.
Number 8 missing in row 3.
Number 2 missing in row 4.
Number 3 missing in row 4.
Number 7 missing in row 4.
Number 7 missing in row 5.
Number 5 missing in row 7.
Number 9 missing in row 8.
Number 7 missing in column 2.
Number 8 missing in column 2.
Number 6 missing in column 4.
Number 9 missing in column 4.
Number 3 missing in column 5.
Number 7 missing in column 5.
Number 9 missing in column 5.
Number 4 missing in column 7.
Number 2 missing in column 8.
Number 5 missing in column 9.
Number 8 missing in block 1,1.
Number 6 missing in block 1,2.
Number 9 missing in block 1,2.
Number 4 missing in block 1,3.
Number 7 missing in block 2,1.
Number 3 missing in block 2,2.
Number 7 missing in block 2,2.
Number 2 missing in block 2,3.
Number 9 missing in block 3,2.
Number 5 missing in block 3,3.
number of problems: 30


In [132]:
dwave.inspector.show(sampleset2)

'http://127.0.0.1:18000/?problemId=7062af46-1a40-44eb-9d50-3c1394796ab4'

In [174]:
print(sampleset2.truncate(30))

    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 ... 140 energy num_oc. ...
0   0  1  0  0  1  0  0  0  0  0  1  0  0  1  0  0 ...   0 -152.0       1 ...
1   0  0  1  0  1  0  0  0  0  0  1  0  0  1  0  0 ...   0 -148.0       1 ...
2   1  0  0  0  0  1  0  0  0  0  1  0  0  0  0  1 ...   0 -146.0       1 ...
3   0  0  1  0  1  0  0  0  0  0  1  0  0  1  0  0 ...   0 -144.0       1 ...
4   0  0  1  0  1  0  0  0  0  0  1  0  0  0  1  0 ...   0 -140.0       1 ...
5   0  1  0  0  1  0  0  0  0  1  0  0  1  0  0  0 ...   0 -140.0       1 ...
6   0  1  0  0  1  0  0  0  0  0  1  0  0  1  0  0 ...   0 -140.0       1 ...
7   0  0  1  0  0  0  0  1  1  0  0  0  0  0  0  1 ...   0 -140.0       1 ...
8   0  1  0  0  1  0  0  0  0  1  0  0  0  1  0  0 ...   0 -140.0       1 ...
9   0  0  0  0  1  0  0  0  0  1  0  0  1  0  0  0 ...   0 -140.0       1 ...
10  0  1  0  0  1  0  0  0  0  1  0  0  1  0  0  0 ...   1 -140.0       1 ...
11  0  0  1  0  1  0  0  0  1  0  0  0  1  1  0  0 ...   0 -140.