mattelim/interprexis-mit-6.8610-nlp

InterpreXis

tl;dr: We propose InterpreXis, a novel approach to finding human-interpretable concepts inside contextual word embeddings. InterpreXis trains linear classifiers to identify interpretable axis groups, which can then be used for downstream tasks such as text classification and visualization.
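As a rough illustration of the core idea (not the project's actual code), a linear probe over embedding dimensions can reveal which axes carry a concept. Everything in the sketch below is an assumption for illustration: the toy 4-dimensional "embeddings", the labels, and the simple perceptron-style update; the real pipeline uses DistilBERT embeddings and its own training code.

```python
# Toy sketch: train a linear probe on fake "embedding" vectors, then
# inspect the weights to see which axes carry the concept.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_probe(X, y, lr=0.1, epochs=50):
    """Perceptron-style updates; returns the learned weight vector."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            pred = 1 if dot(w, x) > 0 else 0
            w = [wi + lr * (label - pred) * xi for wi, xi in zip(w, x)]
    return w

# Fake 4-dimensional "embeddings": the concept lives mostly in axis 1.
X = [[0.1, 0.9, 0.0, 0.2],
     [0.2, 0.8, 0.1, 0.0],
     [0.9, -0.7, 0.3, 0.1],
     [0.8, -0.9, 0.2, 0.3]]
y = [1, 1, 0, 0]

w = train_linear_probe(X, y)
# The largest-magnitude weight points at the most concept-relevant axis.
top_axis = max(range(len(w)), key=lambda i: abs(w[i]))
print("weights:", w, "top axis:", top_axis)  # top axis: 1
```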

setup instructions

secrets file

  • duplicate secrets_example.json and rename the copy to secrets.json
  • paste your OpenAI API key into secrets.json
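Loading the key at runtime then amounts to reading that JSON file. In the minimal sketch below, the field name "openai_key" is a guess; use whatever field name secrets_example.json actually defines.

```python
# Minimal sketch of reading the key from secrets.json.
# NOTE: the field name "openai_key" is assumed -- check secrets_example.json
# for the actual field name used by the repo.
import json

def load_openai_key(path="secrets.json"):
    with open(path) as f:
        return json.load(f)["openai_key"]
```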

running the pipeline

  • open final_pipeline.ipynb
  • run the first few cells until you reach the classification-selection cell, then change the classification to your desired category (animals, art, cities, clinical)
  • do not run the "create dataset" or "tokenize and create DistilBERT embeddings" sections; scroll past them until you reach the next runnable cell
  • run the rest of the notebook and everything should proceed smoothly!
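Selecting a category likely amounts to setting a single variable in that cell. The variable name and the validation helper below are hypothetical, written only to illustrate the switch between the four supported categories:

```python
# Hypothetical category switch; the variable name `classification` and the
# helper are illustrative, not the notebook's actual code.
VALID_CATEGORIES = {"animals", "art", "cities", "clinical"}

def set_category(name):
    if name not in VALID_CATEGORIES:
        raise ValueError(
            f"unknown category {name!r}; choose from {sorted(VALID_CATEGORIES)}"
        )
    return name

classification = set_category("animals")
```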

dataset

  • the final dataset is located in data/final_data.csv

(if you download the whole repo, it should be automatically detected when running final_pipeline.ipynb)
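The dataset can be inspected with standard tooling. The sketch below shows the loading pattern using the stdlib csv module on a small stand-in string, since the actual column names of data/final_data.csv are not listed here; the real file's schema may differ.

```python
# Illustrative loading pattern for a CSV like data/final_data.csv.
# The sample rows and the "text"/"label" column names are stand-ins.
import csv
import io

sample = "text,label\nthe cat sat,animals\nthe mural glowed,art\n"

# For the real file: rows = list(csv.DictReader(open("data/final_data.csv")))
rows = list(csv.DictReader(io.StringIO(sample)))
# Each row is a dict keyed by column name.
for row in rows:
    print(row["label"], "->", row["text"])
```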

repo structure

the main files you will need to run our final code are described above. a brief summary of the repo structure is included below:

  • data/: this folder contains the various files we used to construct our final dataset, as well as other datasets we experimented with while developing our methodology
  • img/: this folder contains figures generated by our code to show the results of different experiments
  • outputs/: this folder contains files with the textual output from running our code (e.g., llm outputs, statistics, etc.)
  • new-method-exp/: this folder contains files with earlier experiments/iterations of our final methodology
  • old-method-exp/: this folder contains files with the experiments/code needed to run our initial methodology (see Sec. 3 of our paper)

About

InterpreXis: Finding Human-Interpretable Concepts Inside Contextual Word Embeddings (MIT 6.8610 NLP Project)
