Skip to content

Coding Style and Idioms

Bob Carpenter edited this page Jan 4, 2018 · 31 revisions

We generally follow the Google Style Guide except for the exceptions listed in the following sections. We use a few tools to automate style and formatting checking, but they don't completely capture our formatting and style guidelines.

Automated style checking tools

cpplint

cpplint.py is a Python (2.7) script that checks for these guidelines. Here is a make target that will check the source code for any instances that do not conform to this standard:

> make cpplint

If your default python version is python 3, the cpplint.py script fails silently. Once you install python 2.x, you can add this to your make/local file to use python 2.x to run the python script:

RUN_CPPLINT = python2 stan/lib/cpplint_4.45/cpplint.py

The make target includes Stan-specific options and limits the tests to the src/stan directory.

clang-format

clang-format is a tool that will automatically format C++ code according to formatting rules in the .clang-format file. Our file is just a few changes from the vanilla Google Style guide. To see which rules are different, you can run this command:

clang-format --style=google -dump-config > google
diff google .clang-format 

And here's the current output:

15,16c15,16
< AllowShortIfStatementsOnASingleLine: true
< AllowShortLoopsOnASingleLine: true
---
> AllowShortIfStatementsOnASingleLine: false
> AllowShortLoopsOnASingleLine: false
36c36
< BreakBeforeBinaryOperators: None
---
> BreakBeforeBinaryOperators: All
86c86
< SortIncludes:    true
---
> SortIncludes:    false

You can see the rule definitions here. clang-format doesn't deal with many of our exceptions to the Style guide, below, as they are difficult to automate or not strictly formatting related.

Setup

First, install clang-format. If you're on a mac and using homebrew, please install clang-format@2017-06-22 with this command:

brew install https://raw.githubusercontent.com/Homebrew/homebrew-core/574560408e2ea6787a0af716e91d11468e878b31/Formula/clang-format.rb

If you're on Ubuntu Linux, you can use if you're on Ubuntu you can use update-alternatives:

sudo update-alternatives --install /usr/bin/clang-format clang-format /usr/bin/clang-format-5.0 100
Editor setup

Emacs - there's a plugin/script called google-c-style. You can find it here or just install from MELPA.

Spacemacs - you can add google-c-style to dotspacemacs-additional-packages.

Vim- check out https://github.com/google/vim-codefmt

Git hook

There is a pre-commit hook in hooks/pre-commit, and if you run bash hooks/install_hooks.sh it will be installed for you. Going forward, it will run clang-format -i (which implicitly uses the .clang-format file in the repo) on any changed files to format them. If you need to disable this for some reason (though your tests will fail if you haven't formatted things correctly) you can use git commit --no-verify.

Exceptions to the Google Style Guide

Listed below are Stan's exceptions to the Google Style Guide. These rules that correspond to running cpplint with a filter argument are marked with [category/rule].

These rules are ordered in the same order as the Google Style Guide.

Header files should be self-contained and end in .h. Files that 
are meant for textual inclusion, but are not headers, should end 
in .inc. Separate -inl.h headers are disallowed.

Our header files are .hpp files.

We'll have trouble with "Nonmember functions should not depend on external variables" due to our use of globals for memory management

Not sure about how this impacts our memory management:

Static or global variables of class type are forbidden: they cause hard-to-find bugs due to indeterminate order of construction and destruction. However, such variables are allowed if they are constexpr: they have no dynamic initialization or destruction.

What they suggest as a workaround is this:

If you need a static or global variable of a class type, consider initializing a pointer (which will never be freed), from either your main() function or from pthread_once().

The issue with that is that it'll have to be done in the interfaces (PyStan, RStan, CmdStan).

Because of our autodiff, we can't comply with:

Objects with static storage duration, including global variables, static variables, static class member variables, and function static variables, must be Plain Old Data (POD): only ints, chars, floats, or pointers, or arrays/structs of POD.

Otherwise, we should follow it.

This won't work for our autodiff type var() and fvar(), but otherwise sounds like a good idea:feasible:

Use the C++ keyword explicit for constructors callable with one argument.

I don't quite think our uses match their exception:

Classes that are intended to be transparent wrappers around other classes are also exceptions.

Inline class definition

I think this is because they're inline by default, but I don't think it matters because the compiler's going to decide what to inline and there's also semantics for inline (can be redefined in multiple translation units):

Do not put large method definitions inline in the class definition. Usually, only trivial or performance-critical, and very short, methods may be defined inline.

Reference Arguments [runtime/references]

I strongly disagree with this one, which they claim is a "very strong convention in Google code"

All parameters passed by reference must be labeled const.

They want us to use pointers, the reason being:

References can be confusing, as they have value syntax but pointer semantics.

So much for their injunction to assume readers of the code will know C++.

I'm not sure if our use of var will allow this:

Friend declarations should always be in the private section.

We will allow exceptions.

We have to violate this for the existing exception hierarchy in throwing exceptions with line numbers from Stan programs (otherwise I agree).

Avoid using Run Time Type Information (RTTI).

No, no, no. they want printf of all things (which causes runtime errors).

Use streams only for logging.

I think this is wrong because of size_t, and because of the crazy __float128 extension:

Of the built-in C++ integer types, the only one used is int.

If a program needs a variable of a different size, use a precise-width integer type from <stdint.h>, such as int16_t.

On Unsigned Integers

Although I agree with the motivation to avoid subtle bugs, this messes up our other error checking that Daniel's been so careful to stomp (signed vs. unsigned comparisons):

So, document that a variable is non-negative using assertions. Don't use an unsigned type.

Too late for this (and they hate enable_if, but make an exception for general Boost packages like Boost Spirit):

Avoid complicated template programming.

Of course, we use none of their approved modules (though Boost Spirit's not on it despite being mentioned elsewhere), but we use lots of other ones.

Use only approved libraries from the Boost library collection.

No, I like Boost and Eigen here, because they follow C++ rather than C conventions:

C++ files should end in .cc and header files should end in .h

We went with Boost and Stroustroup, not Eigen and Google:

Type names start with a capital letter and have a capital letter for each new word, with no underscores: MyExcitingClass, MyExcitingEnum.

Though we haven't been entirely consistent (e.g., VectorView).

We do this everywhere, including struct declarations.

The names of variables and data members are all lowercase, with underscores between words. Data members of classes (but not structs) additionally have trailing underscores.

We went with the all-caps convention, not this:

Use a k followed by mixed case, e.g., kDaysInAWeek, for constants defined globally or within a class.

Again, we followed Stroustroup, not Google:

Regular functions have mixed case; accessors and mutators match the name of the variable: MyExcitingFunction(), MyExcitingMethod(), my_exciting_member_variable(), set_my_exciting_member_variable().

I think it's too late to change all of this as it'd be a massive change. And I really hate that they capitalize --- they should at least follow Java's camel case, which would be "myExcitingFunction()" to distinguish from types.

I disagree with

Use either the // or /* */ syntax, as long as you are consistent.

We should always use // other than for doc comments. The problem with /* ... */ is that you can't comment it out when you're debugging.

File Comments [legal/copyright]

I think we should avoid this:

Start each file with license boilerplate, followed by a description of its contents.

I just hate all the boilerplate.

They make author lines optional, and I think we should stay away from them. We have Git blame, after all.

Just say no to this kind of redundancy:

Every file should have a comment at the top describing its contents.

Follow the authorial rule: show, don't tell!