Automated text analysis of Requests for Comments (RFCs)

This project develops code for automated analysis of the text of Requests for Comments (RFCs) published by the Internet Engineering Task Force, as part of a larger research project studying privacy in technical standard-setting.

For more information, or if you want to use these tools or collaborate on their development, please contact Nick Doty.

Some basic graphs produced with this code are available online.


The scripts are not yet fully parameterized or user-friendly. The current usage pattern is:

  • clone the repository
  • download all RFCs as .txt into an RFC-all directory within the main directory of the repository
  • (optional: download an updated version of rfc-index.xml from the IETF)
  • run the Python analysis script, which will create a file rfc-search.json with the section titles and lengths for every available RFC
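
As an illustration of the last step, here is a minimal sketch of how section titles and lengths could be pulled from plain-text RFCs and written to rfc-search.json. It is not the repository's actual script. The function names (section_lengths, index_directory), the heading pattern, and the JSON layout are all assumptions made for this example, and "length" is taken here to mean the number of non-blank lines in a section.

```python
import json
import re
from pathlib import Path

# Assumed heading pattern: a section number at the left margin
# ("1.", "2.1", "3.2.1.") followed by a title. Real RFCs vary, so
# this will miss some headings and may match a few non-headings.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(\S.*)$")


def section_lengths(text):
    """Return one dict per detected section: number, title, non-blank line count."""
    sections = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            # Start a new section; its body lines are counted below.
            sections.append({
                "number": match.group(1),
                "title": match.group(2).strip(),
                "lines": 0,
            })
        elif sections and line.strip():
            # Count non-blank lines toward the most recent section.
            sections[-1]["lines"] += 1
    return sections


def index_directory(rfc_dir="RFC-all", out_path="rfc-search.json"):
    """Index every .txt file in rfc_dir and write the result as JSON."""
    index = {}
    for path in sorted(Path(rfc_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        index[path.stem] = section_lengths(text)
    Path(out_path).write_text(json.dumps(index, indent=2), encoding="utf-8")
    return index
```

Calling index_directory() from the main directory of the repository would read RFC-all/*.txt and write rfc-search.json next to it. Page headers and footers are counted as section lines in this sketch; a more careful version would strip them first.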

Other functionality:

  • changes to allow for basic string matching against all RFCs (or similar code for all W3C TRs)
  • the graphs/ directory contains d3.js visualizations of some of the measurements
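
The basic string matching mentioned above could look roughly like the following. This is a sketch under assumptions, not the code in this repository: find_matches is a hypothetical name, matching is a plain case-insensitive substring test, and the RFCs are assumed to sit in the RFC-all directory described in the usage steps.

```python
from pathlib import Path


def find_matches(term, rfc_dir="RFC-all"):
    """Map each RFC file name (without .txt) to the 1-based line numbers containing term."""
    needle = term.lower()
    hits = {}
    for path in sorted(Path(rfc_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Case-insensitive substring test on each line.
        lines = [
            number
            for number, line in enumerate(text.splitlines(), start=1)
            if needle in line.lower()
        ]
        if lines:
            hits[path.stem] = lines
    return hits
```

For example, find_matches("privacy") would return a dictionary keyed by RFC file name, listing the lines in each document that mention the term. Pointing rfc_dir at a directory of W3C TRs saved as text would work the same way.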

See also

Bigbang, a toolkit for studying communications data from collaborative projects