Gender differences in writing: A linguistic cluster analysis

These tools were built for a computational text analysis of gender differences in writing: who uses emoticons, abbreviations, clippings, and similar features the most? The code covers feature extraction, clustering of texts by feature, and plotting of results.

This repository contains:


A Python module used for cluster analysis and inspection. Implements the methodology for general-purpose computer-assisted clustering and conceptualization developed in Grimmer & King. It helps us produce output like this.
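
Inspecting several clusterings of the same texts usually requires some measure of how much two clusterings disagree. The sketch below uses the variation of information, one standard distance between partitions. This choice and the function name `variation_of_information` are assumptions made for illustration; they are not taken from the module itself.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Distance between two clusterings of the same items (0 = identical partitions).

    labels_a[i] and labels_b[i] are the cluster labels of item i under each
    clustering. Label names do not matter, only which items are grouped together.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)               # cluster sizes in clustering A
    count_b = Counter(labels_b)               # cluster sizes in clustering B
    joint = Counter(zip(labels_a, labels_b))  # sizes of the pairwise overlaps
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # VI = -sum p_ab * [log(p_ab / p_a) + log(p_ab / p_b)]
        vi -= p_ab * (log(p_ab / p_a) + log(p_ab / p_b))
    return vi
```

Two clusterings that group the texts identically get distance 0 even if their cluster labels differ, which is why a partition distance is used here rather than a direct comparison of labels.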


A Python module used for extraction of e-grammar features. Implements search algorithms for the features of e-grammar listed in Herring, "Grammar and electronic communication". For instance, it can extract emoticons, abbreviations, or non-standard punctuation.
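
As a rough illustration of this kind of feature extraction, here is a minimal sketch built on regular expressions. The patterns, the abbreviation list, and the function name `extract_features` are hypothetical and far simpler than a real inventory of e-grammar features; they do not reproduce the module's actual search algorithms.

```python
import re

# Hypothetical, deliberately small patterns for illustration only.
# Emoticon: eyes, optional nose, one or more mouth characters, not glued to a word.
EMOTICON_RE = re.compile(r"(?<!\w)[:;=8][\-o']?[)(\]\[DPp/\\|*]+(?!\w)")
# Non-standard punctuation: runs of ! or ?, or three or more dots.
REPEATED_PUNCT_RE = re.compile(r"[!?]{2,}|\.{3,}")
# A tiny made-up abbreviation list; a real study would need a much larger one.
ABBREVIATIONS = {"lol", "omg", "btw", "imo", "brb"}


def extract_features(text):
    """Count a few e-grammar features in one text and return them as a dict."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "emoticons": len(EMOTICON_RE.findall(text)),
        "abbreviations": sum(1 for token in tokens if token in ABBREVIATIONS),
        "nonstandard_punctuation": len(REPEATED_PUNCT_RE.findall(text)),
    }


counts = extract_features("omg that was great :) lol!!! see you later ;-)")
```

The lookbehind and lookahead in the emoticon pattern keep it from matching inside ordinary tokens such as the time `10:30pm`.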


Visualize the study results: which group uses which feature?
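
Before plotting, the per-text feature counts have to be aggregated by group. The sketch below shows one way to do that and renders the result as a plain-text bar chart so that it runs without a plotting library. The function names, the record layout, and the group labels are assumptions, and the numbers are made up for the example; they are not study results.

```python
from collections import defaultdict


def mean_feature_by_group(records, feature):
    """Average count of one feature per group.

    records is an iterable of (group, feature_counts) pairs, where
    feature_counts is a dict such as {"emoticons": 2}.
    """
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for group, feature_counts in records:
        totals[group] += feature_counts.get(feature, 0)
        sizes[group] += 1
    return {group: totals[group] / sizes[group] for group in totals}


def text_bar_chart(means, width=20):
    """Render group means as text bars, scaled so the largest mean fills `width`."""
    peak = max(means.values(), default=0) or 1  # avoid dividing by zero
    lines = []
    for group in sorted(means):
        bar = "#" * round(width * means[group] / peak)
        lines.append(f"{group:<10} {bar} {means[group]:.2f}")
    return "\n".join(lines)


# Made-up example data, not results from the study.
records = [
    ("group_a", {"emoticons": 2}),
    ("group_a", {"emoticons": 0}),
    ("group_b", {"emoticons": 4}),
]
means = mean_feature_by_group(records, "emoticons")
chart = text_bar_chart(means)
```

The same `means` dictionary could be passed to any plotting library to draw a proper bar chart.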

