# Crossword creator, Model 2

## Purpose
The purpose of this model is to generate a crossword grid using Mixed Integer Linear Programming (MILP).

This model is inspired by "Generating Crossword Grids Using Constraint Programming" by Philippe Olivier, https://pedtsr.ca/2023/generating-crossword-grids-using-constraint-programming.html

## Usage:
- Define a crossword grid in an Excel file. See the examples for a guide.
- Specify the grid file name in <code>Grid</code>.
- Specify a word lexicon file to use as <code>Lexicon</code>. The lexicon is an Excel list of words, each with a frequency. The grid is populated from a random sample of the lexicon, with the sample size set by <code>SampleSize</code> (0 selects all words).
- In the grid file, define a list ("Fix words") of grid positions to fix, meaning that words of the appropriate length will be randomly selected for those positions. Turn this feature on/off with "Use fix words".
- Specify global assumptions, as defined below.
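
The seeded sampling step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual implementation: the function name `sample_lexicon` and the toy word list are hypothetical, and uniform sampling (ignoring word frequency) is an assumption.

```python
import random

def sample_lexicon(words, sample_size, seed):
    """Return a reproducible random sample of words.

    Hypothetical helper: a sample_size of 0 keeps every word,
    mirroring the meaning of SampleSize = 0 in the globals.
    """
    if sample_size == 0 or sample_size >= len(words):
        return list(words)
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return rng.sample(list(words), sample_size)

# Toy lexicon for illustration only
lexicon = ["frantic", "crc", "apple", "grid", "word", "solver"]
sample = sample_lexicon(lexicon, 3, seed=1)
print(sample)
```

Using a separate generator per seed means that incrementing the seed on each iteration, as <code>RandomSeed</code> and <code>MaxIterations</code> do, gives a different but repeatable sample each time.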

## Model 2 features, in addition to the Model 1 features:
- Specified grid positions can be pre-populated with fixed values, using words randomly selected from the lexicon. Use this feature sparingly, as it is easy to make the model infeasible. It is typically best to fix only a few positions, and to avoid fixing positions that intersect.
- The constraints are constructed by excluding terms that are clearly infeasible. This feature materially reduces the model size and reduces solve time.
- Variables are fixed at zero if a candidate word length doesn't match grid word length. This feature substantially reduces solve time.
- Alternative objective functions are available, selected via <code>ObjectiveChoice</code>.
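
To illustrate why length matching shrinks the model, here is a minimal sketch in plain Python. It assumes one binary variable per (grid slot, candidate word) pair. The function `compatible_pairs` and the toy data are hypothetical and are not taken from the formulation notebook.

```python
def compatible_pairs(slot_lengths, words):
    """Return the (slot, word) pairs whose lengths match.

    Every other pair corresponds to a variable that could be fixed at zero.
    """
    # Group words by length once, so each slot only scans words that fit
    by_length = {}
    for word in words:
        by_length.setdefault(len(word), []).append(word)
    return [(slot, word)
            for slot, length in slot_lengths.items()
            for word in by_length.get(length, [])]

# Toy data: slot number -> required word length
slot_lengths = {1: 7, 2: 3, 3: 5}
words = ["frantic", "crc", "apple", "grape", "cat", "lexicon"]

pairs = compatible_pairs(slot_lengths, words)
total = len(slot_lengths) * len(words)
print(f"{len(pairs)} of {total} slot-word variables can be non-zero")
```

In this toy case only a third of the candidate variables survive. With a lexicon of thousands of words spread across many lengths, the proportion that can be fixed at zero is likely to be larger.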

In [1]:
# Include other notebooks

%run ./components/imports-1.ipynb
%run ./components/utilities-1.ipynb
%run ./components/solver-1.ipynb
%run ./components/data-model-2.ipynb
%run ./components/formulation-model-2.ipynb
%run ./components/output-model-1.ipynb
%run ./components/main-model-2.ipynb

In [2]:
# Globals

# Data assumptions
Lexicon = 'gutenberg.xlsx'   # gutenberg.xlsx
Grid = 'grid-7-2.xlsx'
SampleSize = 0   # Number of words to randomly select from WordFile. 0 means select all words
SingleWordSquare = False   # Use True only for 100% dense square grids, otherwise must be False

# Run options
MipGap = 100   # Highs: 100 (10000%) = stop on first feasible solution, or thereabouts; 0 = find optimum; CPLEX: 1 = first feasible, 0 = optimal; Gurobi: 100
SolutionLimit = 1   # Gurobi only, 1 = stop on first MIP solution
MaxIterations = 3   # Iterate random seeds, starting with RandomSeed and incrementing by 1 each iteration
StopOnFirst = True   # Stop on first solution, even if < MaxIterations
RandomSeed = 1   # Starting value
Direction = 1   # 1 = maximize, -1 = minimize
ObjectiveChoice = 3   # 1 = Allocated words weighted by frequency; 2 = Allocated words; 3 = Number of intersections with same letter (requires maximize)

# Solver options
Neos = False
SolverName = 'gurobi_direct'   #'appsi_highs'   'gurobi_direct'
os.environ['NEOS_EMAIL'] = 'your.email@example.com'
Verbose = True
LoadSolution = False
TimeLimit = 4*3600   # seconds

# Model file
WriteFile = False
ModelFile = 'model-2.gams'   # Extensions: .gams .lp .nl

# Fixed
ModelName = 'Crossword creator - Model 2'
Checkpoints = []   # List of time checkpoints
WordWorksheet = 'Data'
GridWorksheet = 'Grid'
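WordFile = os.path.join(os.getcwd(), 'lexicon', Lexicon)   # Let os.path.join insert the separator; avoids the invalid '\l' escape
GridFile = os.path.join(os.getcwd(), 'grid', Grid)   # Same fix for '\g'; also works on non-Windows systems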
MaxWordLength = 15

In [3]:
Main()

Crossword creator - Model 2
Grid:     grid-7-2.xlsx
Lexicon:  gutenberg.xlsx

Iteration:  1  of  3
Lexicon size: 6,574 (out of 6,574 that fit, from lexicon of 35,259)
Defining model...
Initial words:
5 :  frantic        
19 :  crc            

Calling solver...
Set parameter TimeLimit to value 14400
Set parameter LogFile to value "gurobi.log"
Set parameter MIPGap to value 100
Set parameter SolutionLimit to value 1
Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (win64 - Windows 11.0 (22000.2))

CPU model: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 6678 rows, 144669 columns and 518673 nonzeros
Model fingerprint: 0xfa92764f
Variable types: 0 continuous, 144669 integer (144669 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+02]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+02]
Presolve removed 5