This analysis required two datasets: the list of registered .nyc domains, which the NYC Open Data portal provides, and a list of the 10,000 most frequent words, compiled by GitHub user worldwisdom.
The analysis and visualizations were done in R, as per usual.
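As a rough illustration of how two such lists can be lined up in R, here is a minimal sketch. It is not the actual analysis code: the toy data, the column name `domain_name`, and the variable names are all assumptions made for the example, and the real files may be shaped differently.

```r
# Toy stand-ins for the two datasets (hypothetical values and column name);
# in practice these would be read from the downloaded files.
domains <- data.frame(
  domain_name = c("pizza.nyc", "Coffee.nyc", "joes-bagels.nyc", "music.nyc"),
  stringsAsFactors = FALSE
)
words <- c("the", "pizza", "coffee", "music", "house")

# Normalise the domains: lower-case them and drop the trailing ".nyc"
# so that only the bare label remains.
labels <- sub("\\.nyc$", "", tolower(domains$domain_name))

# Flag each frequent word according to whether it appears as a registered label.
word_status <- data.frame(
  word = words,
  registered = words %in% labels,
  stringsAsFactors = FALSE
)

print(word_status)

# Share of the word list that is registered as a .nyc domain.
mean(word_status$registered)
```

Matching on the bare label with `%in%` keeps the comparison exact, so a domain like `joes-bagels.nyc` does not count as a hit for a shorter word it happens to contain.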