Script to recursively word count all tex files in all subdirectories.

This repo contains a script that recursively runs latexcount on every tex file in a target directory. The script produces three output files:

  • A scatter plot with a fitted regression line [png]
  • A histogram showing both code words and words [png]
  • A csv file with 3 columns (number of code words, number of words, file name), intended for further analysis

I have written 2 blog posts showing the use of this script (as it has evolved):

  1. Just counting: here.
  2. Regression model: here.


To run the script on a directory (which will recursively search all subdirectories):

./ directory
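
As a rough illustration of the recursive search, the sketch below collects every tex file under a directory using only the standard library. The function name `find_tex_files` is hypothetical and this is not necessarily how the script itself walks the tree; it only shows one way the step could be done.

```python
import os


def find_tex_files(root):
    """Return the sorted paths of all .tex files under root (hypothetical helper)."""
    matches = []
    # os.walk visits root and every subdirectory below it.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".tex"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    for path in find_tex_files("."):
        print(path)
```

Each path found this way could then be handed to the word counter one file at a time.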

To run the script on a csv file (which needs to have two columns of data: number of code words, number of words):

./ -c file.csv
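
For reference, a csv file in that layout could be read as in the sketch below. The function name `read_counts` is hypothetical, and skipping rows whose first two fields are not integers (such as a header line) is an assumption, not documented behaviour of the script. Because only the first two columns are used, the three-column csv written by the script would also be readable this way.

```python
import csv


def read_counts(path):
    """Read (code words, words) pairs from the first two csv columns (hypothetical helper)."""
    code_words, words = [], []
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue  # ignore blank or short rows
            try:
                code, text = int(row[0]), int(row[1])
            except ValueError:
                continue  # assumption: non-numeric rows (e.g. a header) are skipped
            code_words.append(code)
            words.append(text)
    return code_words, words
```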


The script uses matplotlib for the plotting and scipy for the linear regression.
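
A minimal sketch of how those two libraries fit together is given below. It is not the script's own code: the function name `fit_and_plot`, the output file name, and the choice of words on the x axis with code words on the y axis are all assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # render to file without needing a display
import matplotlib.pyplot as plt
from scipy import stats


def fit_and_plot(words, code_words, out_path="scatter.png"):
    """Fit code_words against words and save a scatter plot with the fitted line."""
    # Assumption: words is the explanatory variable, code words the response.
    fit = stats.linregress(words, code_words)

    fig, ax = plt.subplots()
    ax.scatter(words, code_words, label="tex files")
    xs = [min(words), max(words)]
    ys = [fit.intercept + fit.slope * x for x in xs]
    ax.plot(xs, ys, color="red", label="fitted line")
    ax.set_xlabel("number of words")
    ax.set_ylabel("number of code words")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
    return fit
```

The returned object exposes `slope`, `intercept` and `rvalue`, which is enough to report the fitted line alongside the plot.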

License Information

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 license. You are free to:

  • Share: copy, distribute, and transmit the work,
  • Remix: adapt the work

Under the following conditions:

  • Attribution: You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
  • Share Alike: If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.

When attributing this work, please include me.