conText provides a fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021).

How to Install
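
The install commands are not reproduced in this copy of the README. As a minimal sketch, assuming the package is distributed on CRAN and that the development version lives at the GitHub path shown (both are assumptions to verify against the repository itself), installation would typically look like this:

```r
# Released version (assumption: conText is available on CRAN).
install.packages("conText", repos = "https://cloud.r-project.org")

# Development version (assumption: the repository path below is correct).
# Uncomment to use; requires the devtools package.
# install.packages("devtools", repos = "https://cloud.r-project.org")
# devtools::install_github("prodriguezsosa/conText")
```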



To use conText you will need three objects:

  1. A (quanteda) corpus with the documents and corresponding document variables you want to evaluate.
  2. A set of (GloVe) pre-trained embeddings.
  3. A transformation matrix specific to the pre-trained embeddings.

conText includes sample objects for all three, but these are meant only to illustrate how the functions are used. In this Dropbox folder we have included the raw versions of these objects, including the full Stanford GloVe 300-dimensional embeddings (labeled glove.rds) and the corresponding transformation matrix estimated by Khodak et al. (2018) (labeled khodakA.rds).
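
Once glove.rds and khodakA.rds have been downloaded, they can be read with base R. The helper below is a hypothetical sketch, not part of conText: `load_context_inputs` and its `dir` argument are names made up for illustration. It assumes the embeddings are stored as a matrix with one row per word and one column per dimension, and that the transformation matrix is square with a matching dimension.

```r
# Hypothetical helper (not part of conText): read the two files named above
# from a local folder and check that their dimensions are compatible.
load_context_inputs <- function(dir = ".") {
  glove_path <- file.path(dir, "glove.rds")
  transform_path <- file.path(dir, "khodakA.rds")
  stopifnot(file.exists(glove_path), file.exists(transform_path))

  glove <- readRDS(glove_path)        # assumed: words x dimensions matrix
  khodakA <- readRDS(transform_path)  # assumed: dimensions x dimensions matrix
  stopifnot(is.matrix(glove), is.matrix(khodakA))

  # The transformation matrix must be square and match the embedding dimension.
  d <- ncol(glove)
  stopifnot(nrow(khodakA) == d, ncol(khodakA) == d)

  list(pre_trained = glove, transform_matrix = khodakA)
}
```

Calling `inputs <- load_context_inputs("path/to/downloads")` would then return both objects in a single list, failing early if a file is missing or the dimensions disagree.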

Quick Start Guides

Check out this Quick Start Guide to get going with conText (last updated: 08/04/2023).

Latest Updates

We are hugely thankful to Will Hobbs and Breanna Green for bringing to our attention clear examples where finite sample bias was larger than we had anticipated when implementing our main estimation routine, conText. We are actively collaborating with them to evaluate alternative fixes. In the meantime, we have implemented Jackknife debiasing and recommend using it. Please refer to the Finite Sample Bias vignette for additional information on the issue and for simulation results using various debiasing methods.
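
To give a sense of what jackknife debiasing does in general, here is a self-contained base R sketch. It is a generic illustration and not the routine implemented inside conText: the function names `jackknife_debias` and `sq_norm_of_mean` are hypothetical, and the toy estimator (the squared norm of a mean vector) is chosen only because its small-sample upward bias is easy to see.

```r
# Generic jackknife bias correction (illustrative; not conText's internal code).
jackknife_debias <- function(x, estimator) {
  n <- nrow(x)
  theta_full <- estimator(x)
  # Re-estimate with each observation left out in turn.
  theta_loo <- vapply(
    seq_len(n),
    function(i) estimator(x[-i, , drop = FALSE]),
    numeric(1)
  )
  # Standard jackknife bias-corrected estimate.
  n * theta_full - (n - 1) * mean(theta_loo)
}

# Toy estimator: squared norm of the mean vector. With a true mean of zero
# it is still strictly positive in any finite sample, i.e. biased upward.
sq_norm_of_mean <- function(x) sum(colMeans(x)^2)

set.seed(1)
x <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)  # 20 toy "embeddings", true mean 0

naive <- sq_norm_of_mean(x)
debiased <- jackknife_debias(x, sq_norm_of_mean)
```

For this particular estimator the bias is proportional to 1/n, so the jackknife correction removes it exactly: `debiased` equals `naive` minus the summed column variances divided by the sample size.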

Multilanguage Resources

For those working in languages other than English, we have a set of data and code resources here:


An R package for estimating and doing statistical inference on context-specific word embeddings.