The setup below is for a Google Colab notebook. It assumes that this notebook is in a Google Drive folder called `TMaze`, which contains all the files in [the GitHub repository](https://github.com/annikaheuser/TMaze/blob/main/tmaze.py).

In [ ]:
# Added March 2024: Ensuring stable compatibility between CUDA (11.8), numpy (1.23.1), torch (2.2.0), and mxnet (mxnet-cu117).

# Downgrade CUDA to 11.8
!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
!mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
!wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu1804-11-8-local_11.8.0-520.61.05-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1804-11-8-local_11.8.0-520.61.05-1_amd64.deb
!cp /var/cuda-repo-ubuntu1804-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
!apt-get update
!apt-get -y install cuda-11-8

# Downgrade numpy to handle np.bool issue
# (will require restarting runtime after execution)
!pip install numpy==1.23.1

In [None]:
#Installations
!ln -s /usr/local/cuda/lib64/libcusolver.so.11 /usr/local/cuda/lib64/libcusolver.so.10
!ls /usr/local/cuda/lib64/libcusolver*
!pip install torch==2.2.0 --index-url https://download.pytorch.org/whl/cu118
!pip install mxnet-cu117
!pip install wordfreq language_tool_python transformers
!git clone https://github.com/awslabs/mlm-scoring.git mlm_scoring/
!cd mlm_scoring/; git checkout 9cab61e6774bcc4983f7117f1a280c334f3e68b5; sed -i '21s/.*/"transformers",/' setup.py; cat setup.py; pip install .; pip install .; cd ..

In [None]:
!pwd
import torch
import mxnet as mx
from google.colab import drive
drive.mount('/content/gdrive/')
WORK_PATH = "/content/gdrive/My Drive/TMaze"
from mlm.scorers import MLMScorer, MLMScorerPT, LMScorer
from mlm.models import get_pretrained
import pickle
import wordfreq
import string
from scipy.stats import norm
import spacy
import numpy as np
import pandas as pd
import sys
sys.path.append(WORK_PATH)
import tmaze
import materials
import ibex_prep
import lang_spec

In [None]:
%load_ext autoreload
%autoreload 2

If you're using a language other than English, please refer to [Heuser, 2022](https://dspace.mit.edu/handle/1721.1/147233) for detailed instructions on how to create the necessary language-specific files. For English, these are `nonwords_en.pkl` and `freq_bins_en_ensemble.pkl`, which were created via functions in `lang_spec.py` and are included in [the GitHub repository](https://github.com/annikaheuser/TMaze/blob/main/tmaze.py).



In [None]:
eng = lang_spec.lang_spec("en-US",True,WORK_PATH)
eng.compile_freq_bins_and_nonwords_set()

`boyce_materials_formatted.txt` has the expected format for experimental materials that are to be matched with distractors, namely:
```
ConditionName;ItemID;Sentence
```
For example:

```
adverb_high;72;Kim will display the photos she took next month, but she won't show all of them.
adverb_low;72;Kim will display the photos she took last month, but she won't show all of them.
```
This file was derived from [g_maze.js](https://github.com/vboyce/Maze/blob/master/experiment/Materials/g_maze.js), made by Boyce et al. (2020).
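
Before loading your own materials, it can help to check that every line splits cleanly into the three expected fields. The following is a minimal sketch of such a check, not the parsing that `materials.materials` performs internally; `MaterialLine` and `parse_material_line` are hypothetical names introduced here for illustration.

```python
from typing import NamedTuple


class MaterialLine(NamedTuple):
    condition: str
    item_id: str
    sentence: str


def parse_material_line(line: str, sep: str = ";") -> MaterialLine:
    """Split one 'ConditionName;ItemID;Sentence' line into its three fields."""
    # maxsplit=2 keeps any later separators inside the sentence itself.
    parts = line.rstrip("\n").split(sep, 2)
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields separated by {sep!r}, got: {line!r}")
    condition, item_id, sentence = (part.strip() for part in parts)
    return MaterialLine(condition, item_id, sentence)


example = (
    "adverb_high;72;Kim will display the photos she took next month, "
    "but she won't show all of them.\n"
)
parsed = parse_material_line(example)
print(parsed.condition, parsed.item_id)
```

A line with fewer than three fields raises a `ValueError`, which makes malformed rows easy to spot before they reach TMaze.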


In [None]:
with open(f'{WORK_PATH}/boyce_materials_formatted.txt') as f:
    sents = f.readlines()

The materials object has a number of potentially useful attributes. Refer to [materials.py](https://github.com/annikaheuser/TMaze/blob/main/materials.py) for all of them. The following code creates new files at `WORK_PATH/{file_name}`, using the file names specified in the dictionary. For example, in this case we will create `/content/gdrive/My Drive/TMaze/BoyceCondDict.pkl`.

In [None]:
m_pickle_dict = {"cond_dict": "BoyceCondDict.pkl", "word_info": "BoyceWordInfo.pkl","item_pairs": "BoyceNumItemPairs.pkl"}
boyce = materials.materials(sents,";",WORK_PATH,'en',m_pickle_dict)

Here we specify the transformer model that TMaze should use to produce materials. Run

```
mlm.models.SUPPORTED_MLMS
```
to see which other models can be run by just changing the string in the code below. 'bert-base-multi-cased' may work decently well for languages like German, French, or Spanish.


In [None]:
ctxs = [mx.gpu(0)]
model, vocab, tokenizer = get_pretrained(ctxs, 'bert-base-en-uncased')
scorer = MLMScorer(model, vocab, tokenizer, ctxs)

In [None]:
pickle_dict = {"freq_dict":f'{WORK_PATH}/freq_bins_en_ensemble.pkl', "word_info": f"{WORK_PATH}/BoyceWordInfo.pkl", "nonwords_set": f"{WORK_PATH}/nonwords_en.pkl", "dists_dict":f"{WORK_PATH}/EnglishDistractors.pkl"}
with open(pickle_dict['dists_dict'], "wb") as f:
  pickle.dump({},f) #to initialize the dictionary within TMaze

The pickle files above come either from the language-specific setup (these are included in the GitHub repository for English) or from loading the materials (i.e. `materials.materials(...)`). The final one, the distractor dictionary, is initialized as an empty dictionary at the desired file path in the code block above.
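
Since a missing file is an easy mistake to make here, a quick existence check before constructing the TMaze object may save a confusing error later. This is a minimal sketch; `missing_pickles` is a hypothetical helper, not part of TMaze, and the demo uses a temporary directory rather than your Drive folder.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def missing_pickles(paths: Dict[str, str]) -> List[str]:
    """Return the dictionary keys whose file does not exist on disk."""
    return [key for key, path in paths.items() if not Path(path).is_file()]


# Demo with a throwaway directory: one file is present, one is not.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "nonwords_en.pkl"
    present.write_bytes(b"")
    demo_paths = {
        "nonwords_set": str(present),
        "freq_dict": str(Path(tmp) / "freq_bins_en_ensemble.pkl"),
    }
    missing = missing_pickles(demo_paths)

print(missing)
```

In this notebook you would call `missing_pickles(pickle_dict)` and expect an empty list.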

TMaze takes the scorer (which assigns strings pseudo-log-likelihood or log-likelihood values based on a model) as its first argument. The scorer does not need to come from `mlm.scorers`, but it does need a `score_sentences` method to be compatible with the current code. You may therefore need to build a simple wrapper object around a transformer whose `score_sentences` returns a list with one likelihood value for each string it is passed. See the [mlm-scoring repository](https://github.com/awslabs/mlm-scoring) from [Salazar et al. (2020)](https://aclanthology.org/2020.acl-main.240/) for more details.
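
The shape of such a wrapper can be sketched as follows. `CallableScorer` and `toy_log_likelihood` are hypothetical names, and the toy function is only a placeholder for a real model. Accepting and ignoring extra keyword arguments is a defensive assumption, since this notebook does not show exactly how TMaze calls `score_sentences`.

```python
from typing import Callable, List


class CallableScorer:
    """Hypothetical adapter exposing score_sentences() around any scoring function."""

    def __init__(self, score_fn: Callable[[str], float]):
        self._score_fn = score_fn

    def score_sentences(self, sentences: List[str], **kwargs) -> List[float]:
        # Assumption: extra keyword arguments can safely be ignored.
        # Returns one score per input string, in the same order.
        return [float(self._score_fn(sentence)) for sentence in sentences]


def toy_log_likelihood(sentence: str) -> float:
    # Placeholder, NOT a language model: each word costs one unit of log-likelihood.
    return -float(len(sentence.split()))


toy_scorer = CallableScorer(toy_log_likelihood)
scores = toy_scorer.score_sentences(["the cat sat", "the cat sat on the mat"])
print(scores)
```

To use a real model, replace `toy_log_likelihood` with a function that returns the model's (pseudo-)log-likelihood for a string.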

The second argument is the name of a [spaCy](https://spacy.io/) pipeline; the available pipelines are listed at https://spacy.io/models. The pipeline is used for part-of-speech tagging, which we track for analysis. The tags are included in a dataframe with all the generated distractors.

In [None]:
en_tmaze = tmaze.tmaze(scorer,'en_core_web_sm',WORK_PATH,pickle_dict)

The next code block produces the experimental materials and saves them as a JavaScript file so that they can easily be plugged into PCIbex.

`ibex_prep.compile_all_sent_items_from_dict` actually determines the distractors. It takes two objects built earlier in this notebook: the TMaze object (i.e. `en_tmaze`) and an attribute of the loaded experimental materials.

The third parameter is the number of potential distractors checked before returning the best one for any given word. In other words, with this set to 100, we find the best of 100 potential distractors. Increasing this number makes the function take longer to run, but the distractors it returns may be of higher quality.

The last parameter is the number of top distractors that are saved in the `pandas` dataframe returned by the function. In this case, we save the top 3 distractors for every word. If the chosen distractor is unsuitable for any reason, we can replace it with the second or even third best distractor. We save the dataframe as a CSV file in case we want to reload it after generating our distractors.
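
The interplay of these two parameters can be pictured as scoring a pool of candidates and keeping the best few. The sketch below is illustrative only: it assumes each candidate has a numeric score where lower means a better distractor, which is not necessarily TMaze's actual criterion, and `top_distractors` and the scores are made up.

```python
import heapq
from typing import Dict, List


def top_distractors(candidate_scores: Dict[str, float], num_saved: int) -> List[str]:
    """Return the num_saved best candidates, best first."""
    # Assumption for this sketch: a lower score marks a better distractor.
    return heapq.nsmallest(num_saved, candidate_scores, key=candidate_scores.get)


# Hypothetical pool of checked candidates (the third parameter controls its size).
candidates = {"table": -9.1, "swim": -14.7, "purple": -12.3, "often": -7.8}

# Keep the top 3 (the last parameter); the first is used, the rest are fallbacks.
ranked = top_distractors(candidates, 3)
chosen, fallbacks = ranked[0], ranked[1:]
print(chosen, fallbacks)
```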

In addition to the dataframe, the function returns the sentences in JavaScript (JS) formatting, which we then write to a JS file. This file can quickly be plugged into a PCIbex project. We uploaded [this GitHub repository](https://github.com/vboyce/Ibex-with-Maze) to PCIbex and then inserted the content of `boyce_matchedDistractors.js` into the `sample.js` file for our validation experiment in [Heuser, 2022](https://dspace.mit.edu/handle/1721.1/147233).

In [None]:
items_js, dist_df = ibex_prep.compile_all_sent_items_from_dict(en_tmaze,boyce.num_item_pairs,100,3)
# Write the JS-formatted items directly to the output file
with open(f"{WORK_PATH}/boyce_matchedDistractors.js","w") as doc:
  doc.write(items_js)
dist_df.to_csv(f"{WORK_PATH}/boyce100matchedDistractors.csv")

The following commented-out functions let you adjust the language-specific setup to your experimental materials. `delete_nonwords_after` adds words to the list of nonwords, which are not considered as distractors; for example, you might add acronyms or slang words if your experimental materials consist of formal language. `switch_word_cap` changes the stored capitalization of a word: a word that was algorithmically saved as lowercase, like "trump's," can be capitalized because it is more commonly found in that form, and vice versa.

In [None]:
#eng.delete_nonwords_after(["weird_word0","weird_word1"])
#eng.switch_word_cap("Wrong_capitalized","wrong_lowercase")