Tokenization and part-of-speech tagging are among the most resource-intensive operations we perform, and several functions (like run_adj_analysis) repeat them on the same document. Memoizing the results within Document the first time they are computed should substantially improve performance with little overhead.
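A rough sketch of what this could look like, assuming Document is a class that holds the raw text. The attribute names (`text`, `tokens`, `tagged_tokens`), the regex tokenizer, and the toy tagger are all placeholders for illustration, not our actual code; the real tokenizer and tagger calls would go in their place. It uses `functools.cached_property` (Python 3.8+), so each result is computed on first access and then stored on the instance.

```python
import re
from functools import cached_property

# Toy lookup standing in for a real part-of-speech tagger (illustration only).
_TOY_ADJECTIVES = {"quick", "lazy"}


class Document:
    """Hypothetical stand-in for our Document; only the caching pattern matters."""

    def __init__(self, text):
        self.text = text
        self.tokenize_calls = 0  # instrumentation for this sketch only

    @cached_property
    def tokens(self):
        # Placeholder tokenizer; the real tokenizer call would go here.
        self.tokenize_calls += 1
        return re.findall(r"\w+", self.text.lower())

    @cached_property
    def tagged_tokens(self):
        # Placeholder tagger; reuses the cached tokens instead of re-tokenizing.
        return [
            (tok, "ADJ" if tok in _TOY_ADJECTIVES else "OTHER")
            for tok in self.tokens
        ]


doc = Document("The quick brown fox jumps over the lazy dog")
first = doc.tagged_tokens   # tokenizes and tags once
second = doc.tagged_tokens  # served from the cache, no recomputation
```

One caveat with this approach: the cached values are not refreshed if `text` is changed after the first access, so it fits best if a Document's text is treated as immutable once created.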