Style guide

Kenneth Benoit edited this page Apr 11, 2018 · 7 revisions

quanteda style guide

Style is important, and we want our code to be readable and look great.

In general, we follow the tidyverse style guide, with a few exceptions noted below.

Source files

  • Source files use the extension .R (not .r).
  • In general, we have one function per .R file, although closely related functions (e.g. translation) are grouped in a single .R file.
  • Use meaningful lowercase names in snake_case (words separated by underscores) for source files, e.g. tokens.R, text_model_wordscores.R.

Function and variable names

  • Use snake_case for function names and variable names, following the rOpenSci guidelines.
  • Do not use dot.separated names for anything, except when defining methods for S3 generics.
  • Use short variable names for very local or temporary variables, and longer explanatory names otherwise.
  • Use <-, not =, for assignment.
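
As a minimal sketch of the naming rules above: the function count_tokens() and the class "token_counts" are hypothetical names invented for illustration and are not part of quanteda.

```r
# Hypothetical helper (not a quanteda function): counts the
# space-separated tokens in each element of a character vector.
count_tokens <- function(texts) {
    # `n` is a short name because it is very local
    n <- lengths(strsplit(texts, " ", fixed = TRUE))
    names(n) <- names(texts)
    n
}

# A dot-separated name is acceptable here only because this defines a
# method for the S3 generic print() on the hypothetical class "token_counts".
print.token_counts <- function(x, ...) {
    cat("Token counts for", length(x), "texts\n")
    invisible(x)
}

# Longer, explanatory name for a non-local object; assignment uses <-
token_counts <- count_tokens(c(d1 = "a b c", d2 = "d e"))
class(token_counts) <- "token_counts"
print(token_counts)
```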


An opening curly brace should never go on its own line and should always be followed by a new line; a closing curly brace should always go on its own line, unless followed by else. Always indent the code inside the curly braces. See the examples in Hadley Wickham's book Advanced R.
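
The brace rules above might look like this in practice. The function describe_length() is a hypothetical example, not quanteda code.

```r
# Hypothetical function for illustration only.
# Each opening brace ends its line; each closing brace starts its own line,
# except where it is followed by `else`.
describe_length <- function(x) {
    if (length(x) == 0) {
        "empty"
    } else if (length(x) == 1) {
        "single"
    } else {
        "multiple"
    }
}
```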


Put spaces around operators to aid the readability of expressions: see sections 3 and 4 here and the section Syntax: Spacing in Advanced R.
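
A short sketch of the spacing rule, using made-up variable names; the commented-out line shows the cramped form to avoid.

```r
x <- c(2, 4, 6)

# Preferred: spaces around assignment, arithmetic, comparison,
# and logical operators
average <- sum(x) / length(x)
scaled <- (x - average) * 2
is_large <- scaled > 0 & x != 4

# Avoid: the same expression without spaces is harder to scan
# scaled<-(x-average)*2
```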

In quanteda, we use 4 spaces (spaces, never tabs) for indentation. Why?

  1. It aids readability (and space is cheap!).
  2. It reminds us of Python.
  3. Mickey Mouse has four fingers, not two.
  4. Because otherwise you will be water-boarded.
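
To illustrate the four-space rule, here is a small sketch in which each nesting level adds exactly four spaces. The function count_long_words() and its min_chars argument are hypothetical, written only for this example.

```r
# Hypothetical helper (not a quanteda function): counts the words in each
# text that have at least `min_chars` characters.
count_long_words <- function(texts, min_chars = 5) {
    counts <- integer(length(texts))
    for (i in seq_along(texts)) {
        words <- strsplit(texts[i], " ", fixed = TRUE)[[1]]
        for (word in words) {
            if (nchar(word) >= min_chars) {
                counts[i] <- counts[i] + 1L
            }
        }
    }
    counts
}
```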