# How to run the Deits solver
1. Clone the Deits repository (https://github.com/rdeits/cryptics) into `./deits/`. Note that the Julia solver is better, but for our research we used the Python version.
2. Check out commit 402579 (in the deits repo).
3. Apply patch 0001-Set-up... (in this directory).
4. Use this notebook to set up the Deits input clue file.
5. Run `validate_cryptics.py` in the deits directory.
    - This file is created when the patch is applied.
    - See the bottom of `validate_cryptics.py` for the command-line arguments that should be included.
    - You will need to specify an output file. Use the absolute path of '../deits/clues/'.
    - You will need to specify the input clue file. It should be the one generated in this notebook ('../deits/clues/*').
6. Use this notebook (this file) to run the evaluation.


In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from decrypt import config
k_json_folder = config.DataDirs.Guardian.json_folder
k_deits_clue_folder = config.DataDirs.Deits.k_deits_clues
k_deits_output_folder = config.DataDirs.Deits.k_deits_outputs

In [None]:
from decrypt.scrape_parse import (
    load_guardian_splits,
    load_guardian_splits_disjoint_hash
)
from decrypt.common.validation_tools import (
    load_deits,
    all_aggregate,
)
from decrypt.common.puzzle_clue import GuardianClue

Setting up the datasets

In [None]:
## produce deits

# produce test set for deits
def make_deits_format(gc: GuardianClue):
    """Format one clue as '<clue> (<lengths>) | <solution>'."""
    len_str = "(" + ",".join(map(str, gc.lengths)) + ")"
    clue_str = gc.clue + " " + len_str
    final = clue_str + " | " + gc.soln_with_spaces
    return final

def make_deits(fname, clueset):
    """Write every clue in clueset to fname, one formatted clue per line."""
    with open(fname, "w") as f:
        for c in clueset:
            f.write(make_deits_format(c) + "\n")


def prep_deits(val_or_test: str,
               naive_or_disj: str):
    """Load the requested split and build the clue file name and output folder."""
    assert val_or_test in ["val", "test"]
    assert naive_or_disj in ["naive", "disj"]
    if naive_or_disj == "naive":
        load_fn = load_guardian_splits
    else:
        load_fn = load_guardian_splits_disjoint_hash

    # the train split is not needed here
    _, _, (_, val, test) = load_fn(k_json_folder)
    if val_or_test == "val":
        val_local = val
    else:
        val_local = test

    # note: the two folder paths are concatenated directly with the file name
    append_name = f'guardian_{naive_or_disj}_{val_or_test}'
    fname = f'{k_deits_clue_folder}{append_name}.txt'
    output_folder = f'{k_deits_output_folder}{append_name}/'        # outputs will be like output/{append_name}/*-1.json
    return val_local, fname, output_folder

def make_deits_clues(val_or_test,
                     naive_or_disj):
    val_local, fname, _ = prep_deits(val_or_test, naive_or_disj)
    make_deits(fname, val_local)

This produces a clue file in `k_deits_clue_folder`.
For example, for the disjoint validation set:
make_deits_clues('val', 'disj')     # produces a clue file in k_deits_clue_folder

To fully replicate results, also run
- `make_deits_clues('test', 'disj')`
- `make_deits_clues('val', 'naive')`
- `make_deits_clues('test', 'naive')`
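
To make the clue file format concrete, here is a minimal sketch of the line that `make_deits_format` writes for one clue. `ExampleClue` is a hypothetical stand-in for `GuardianClue` that carries only the three fields used above, and the clue text and solution are placeholders, not real data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExampleClue:
    # Hypothetical stand-in for GuardianClue; only the fields used by
    # make_deits_format are modeled here.
    clue: str
    lengths: List[int]
    soln_with_spaces: str


def format_clue_line(c: ExampleClue) -> str:
    # Same layout as make_deits_format: "<clue> (<lengths>) | <solution>"
    len_str = "(" + ",".join(map(str, c.lengths)) + ")"
    return f"{c.clue} {len_str} | {c.soln_with_spaces}"


# Placeholder clue, for illustration only
example = ExampleClue(clue="Illustrative clue text",
                      lengths=[3, 4],
                      soln_with_spaces="two word")
line = format_clue_line(example)
print(line)
```

Each clue occupies exactly one line of the file, so the line position doubles as the clue index used by the solver script.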

Now we run `deits/validate_cryptics.py` to generate outputs from the model.
For example,
`python validate_cryptics.py <output_file> <timeout_len_in_seconds> <starting idx> <end idx> <input_file> <output_directory>`

To run all clues, first determine the last index of the clues in the set on which you are evaluating. For example, if the last index is 28442, run
`python validate_cryptics.py out.json  120 0 28443  k_deits_clue_folder/guardian_naive_val.txt outputs/naive_val/`
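
Since `make_deits` writes one clue per line, the last index can be found by counting lines in the clue file. The sketch below assumes 0-based indexing (suggested by the start index of 0 in the example command); `count_clues` is a hypothetical helper, and the temporary file only stands in for a real clue file.

```python
import os
import tempfile


def count_clues(clue_file: str) -> int:
    # Hypothetical helper: count the non-empty lines (one clue per line)
    with open(clue_file, "r") as f:
        return sum(1 for line in f if line.strip())


# Demo on a throwaway file with placeholder clues; point count_clues at
# the generated clue file instead when running for real.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_clues.txt")
    with open(demo_path, "w") as f:
        f.write("First placeholder clue (5) | xxxxx\n")
        f.write("Second placeholder clue (3,4) | xxx xxxx\n")
    n_clues = count_clues(demo_path)

last_index = n_clues - 1   # assuming 0-based indexing
end_idx = n_clues          # value to pass as <end idx>, as in the example above
print(last_index, end_idx)
```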

To actually replicate this, it is best to split up the runs into sets of, e.g., 100 clues and parallelize.
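
One way to split the work is to generate one command per block of 100 clues and hand the resulting list to whatever job runner you use. This is only a sketch: it assumes `<end idx>` is exclusive (as the 28442/28443 example suggests), and the per-chunk output file names (`out_<start>_<end>.json`) and the paths in the demo call are hypothetical.

```python
from typing import List, Tuple


def chunk_ranges(n_clues: int, chunk_size: int = 100) -> List[Tuple[int, int]]:
    # Split [0, n_clues) into consecutive (start, end) ranges, end exclusive
    return [(start, min(start + chunk_size, n_clues))
            for start in range(0, n_clues, chunk_size)]


def build_commands(n_clues: int,
                   input_file: str,
                   output_dir: str,
                   timeout_s: int = 120,
                   chunk_size: int = 100) -> List[str]:
    # Argument order follows the usage line shown above
    commands = []
    for start, end in chunk_ranges(n_clues, chunk_size):
        out_file = f"out_{start}_{end}.json"  # hypothetical naming scheme
        commands.append(
            f"python validate_cryptics.py {out_file} {timeout_s} "
            f"{start} {end} {input_file} {output_dir}"
        )
    return commands


# Demo with 250 clues and placeholder paths
commands = build_commands(250, "clues/guardian_naive_val.txt", "outputs/naive_val/")
for cmd in commands:
    print(cmd)
```

The commands are only built as strings here, not executed, so they can be inspected before being submitted to a scheduler or run in separate shells.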

We recommend using the Julia version of the Deits solver rather than the Python version used here, since the Python version is incredibly slow.

Finally, we run the evaluation code.
Below we provide code to run the two models to produce row 1 of the Main Results table (Table 2) in the paper.

Note that for the Main Results Table 2, the metrics we include in the table correspond to
- `agg_top_match`
- `agg_top_10_after_filter`

More details of these metric calculations can be found in `decrypt.common.validation_tools`
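
As a rough intuition only, the sketch below shows one plausible reading of the two metric names for a single clue: whether the top-ranked output equals the answer, and whether the answer appears among the first 10 outputs once outputs with the wrong word lengths are removed. This is an assumption based on the names, not the implementation in `decrypt.common.validation_tools`, and all function names here are hypothetical.

```python
from typing import List, Sequence


def matches_lengths(candidate: str, lengths: Sequence[int]) -> bool:
    # True if the candidate's word lengths equal the clue's enumeration
    return [len(w) for w in candidate.split()] == list(lengths)


def top_match(candidates: List[str], answer: str) -> bool:
    # Assumed meaning: the highest-ranked candidate is exactly the answer
    return bool(candidates) and candidates[0] == answer


def top_10_after_filter(candidates: List[str], answer: str,
                        lengths: Sequence[int]) -> bool:
    # Assumed meaning: drop wrong-length candidates, then check the first 10
    kept = [c for c in candidates if matches_lengths(c, lengths)]
    return answer in kept[:10]


# Placeholder ranked outputs for a clue with enumeration (3)
ranked = ["cat nap", "tab", "bat"]
print(top_match(ranked, "bat"))
print(top_10_after_filter(ranked, "bat", [3]))
```

An aggregate score would then be the fraction of clues for which each check holds.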

In [None]:
def eval_deits_clues(val_or_test,
                     naive_or_disj):
    """Load the solver outputs for the given split and compute the aggregate metrics."""
    val_local, _, output_folder = prep_deits(val_or_test, naive_or_disj)
    deits_outputs_glob = output_folder + "*-1.json"
    model_outputs = load_deits(val_local, deits_outputs_glob)
    return all_aggregate(model_outputs)


In [None]:
# for example
eval_deits_clues('val', 'disj')     # will evaluate the outputs

