Code for
Petersen, Erika and Christopher Potts. 2022. Lexical Semantics with Large Language Models: A Case Study of English break. Ms., Stanford University.

- `annotated_break_data.csv`: the annotated dataset.
- `annotated_dataset_study.ipynb`: gets basic stats and tables for the annotated dataset.
- `static.ipynb`: static representations for break in various versions of word2vec, GloVe, and fastText.
- `get_all_reps.ipynb`: gets all the break representations for all the models we consider. These representations are required for the notebooks `probing.ipynb` and `visualizations.ipynb`.
- `probing.ipynb`: probing experiment code.
- `visualizations.ipynb`: t-SNE-based visualizations of the break representations.
- `wordnet.ipynb`: basic analysis of the WordNet hypernym graph for break.
- `break_utils.py`: helper code for many of the notebooks.
- `fig`: directory containing visualizations included in the paper (output from `visualizations.ipynb` and `wordnet.ipynb`).
- `reps`: directory in which representations are stored when `get_all_reps.ipynb` is run.
- `results`: probing results files for the probes reported in the paper.
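
Because the probing and visualization notebooks depend on the stored representations, it can help to check that the `reps` directory has been populated before opening them. The following is a minimal sketch, not part of the repository: `reps_ready` is a hypothetical helper, and it assumes only that running `get_all_reps.ipynb` writes at least one file somewhere under `reps`. It makes no assumption about file names or formats.

```python
from pathlib import Path


def reps_ready(reps_dir: str = "reps") -> bool:
    """Return True if reps_dir exists and contains at least one file.

    Hypothetical helper: the names and format of the stored
    representations are not assumed, only that some file is present.
    """
    path = Path(reps_dir)
    if not path.is_dir():
        return False
    # Any regular file, at any depth, counts as evidence of a prior run.
    return any(p.is_file() for p in path.rglob("*"))


if not reps_ready("reps"):
    print("No stored representations found; run get_all_reps.ipynb first.")
```

A check like this only confirms that something was written to `reps`. It does not verify that representations for every model are present.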