Coding style conventions
This document is mainly intended for Chebfun developers. The aim is to provide guidelines for coding style in Chebfun. While it is true that there are concrete programming principles which are regarded as useful by a vast majority of experienced programmers, it is also true that a great number of programming “principles” are a matter of taste. We hope that with the passage of time, these guidelines will evolve and become a list of principles derived from reasons rather than tastes and a consensus on these principles among the current Chebfun team (at least) will be reached.
A good Chebfun developer has to be a good Matlab programmer. Therefore, while some guiding principles may be peculiar to Chebfun, others apply generally to Matlab.
Disclaimer: No claim of originality is being made in this document. Most principles and maxims are adopted from various books and on-line resources. A substantial but incomplete list of references is provided at the end.
Terminology: The terms matrix and array are used interchangeably in this document.
The purpose of formatting is to assist readability of the code.
Each level of code will be indented by 4 spaces. In some editors, a single tab character should do this, but make sure that your editor converts a single tab character to 4 spaces. A typical Matlab editor provides the option of converting a tab to a specified number of spaces, which by default is set to 4 spaces. This is to make sure that the code is displayed uniformly in various text editors which display the tab character differently.
Lines with more than 80 characters should be avoided. This improves readability and portability of the code.
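As a minimal sketch of keeping a long statement under 80 characters, the snippet below breaks an expression with Matlab's `...` continuation. The variable names and values are hypothetical, and indenting the continuation line by one extra level is a choice made for this sketch rather than a rule stated in this document.

```matlab
% Hypothetical values, used only to illustrate the layout.
nRows = 4;
nCols = 3;
scaleFactor = 2;

% The ellipsis (...) continues the statement on the next line. The
% continuation line is indented by one extra level (4 spaces) here.
totalSize = scaleFactor*nRows*nCols + scaleFactor*nRows + ...
    scaleFactor*nCols;
```

Breaking after a binary operator, as here, makes it clear at the end of the first line that the statement is not yet complete.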
- Logical operators and the equality sign: The `=` sign should have a single space before and after its occurrence. Similarly, a single space should surround binary logical and relational operators such as `&`, `&&`, `==`, `~=` etc.
- Comma & semicolon: For putting a space after a comma, the following conventions are adopted. If a comma is being used to separate arguments of a method, then it should be followed by a space. If, on the other hand, a comma is being used to separate indices of a matrix, no space should follow. For example, `A(iRow,jCol)` should be used when `A` is a matrix and `fun(m, n)` should be used for a method `fun`.
  Another instance where commas and semicolons are frequently used is the assignment or initialization of vectors and matrices. Elements of vectors, matrices or cell arrays will always be separated by commas (or semicolons) and the commas (semicolons) will be followed by a single space. For example:
  rowV = [ 1, 2, 3 ];
  colV = [ a; b; c ];
  matA = [ colV, rowV' ];
  matB = [ rowV; colV' ];
  Commas and semicolons will never be preceded by a space.
  The use of white space padding at the start and end of the vector is left to the discretion of the author. For example, both of the following are valid:
  rowVa = [ 1, 2, 3 ];
  rowVb = [1, 2, 3];
- Terminating statements: Semicolons are frequently used to terminate a statement. However, if a statement is silent, giving no output on the command line, then it should not be terminated by a semicolon. For example, instead of
  plot(x, y);
  hold on;
  shg;
  one should write
  plot(x, y)
  hold on
  shg
- Brackets: For spacing around brackets, we adopt the convention of putting no space before and after a bracket if the brackets are being used to enclose the indices of a matrix, or the arguments of a method. If, on the other hand, brackets are being used to group together various tests of a conditional, we should surround the brackets with single spaces. For example, `A{iRow,jCol} + B(iRow,jCol)` should be used for arrays `A` and `B`. Similarly, `fun(m, n)` should be used for a method `fun`. However, for a conditional statement, we write:
  if ( a == b || a == c )
  Similarly, for matrix, vector or cell array assignment/initialization, the opening bracket will be followed by a single space while the closing bracket will be preceded by a single space. Here is an example:
  A = { A1; A2; A3 };
- Spacing for binary operators: The binary operators `+` and `-` should be surrounded by single spaces. For example, an expression may look like
  c = a + b - c - d + e
  Spaces around operators `*`, `.*`, `/`, `./`, `\`, and so on, are optional and are left for the programmer to decide. For example:
  c = a*b + 2./x + nRows - nCols*nRows./(y + 1);
  A = [ 1, 3, 4 ] * c/2;
There should be no blank lines at the end of a method. [TODO] More to be added by Anthony regarding trailing spaces at the end of a line.
Only one statement should be written in a single line.
These mainly involve branching and looping. Statements like `if`, `switch`, `for`, `while`, etc., all fall into this category. Any control structure should be written with proper indentation, placing a single statement on each line. This in particular means that one-line `if` statements or one-line loops should be avoided. It is true that there are situations when this rule will seem too verbose, but to keep clarity and uniformity in the code we have agreed to adopt this principle. Here is an example:
for counter = 1:5
    if ( a == b )
        a = 2*b;
    else
        a = 3*b;
    end
end
It is also suggested that for `if` and `while` the conditions which are being tested are always bracketed, even if it is a single condition as in the example above.
If there are multiple conditionals within an `if` statement, then conditionals involving a relational operator such as `==` or `>=` will be enclosed in an extra set of parentheses. Here is an example. We write
if ( (a == b) || (a > c) && all(x) )
instead of
if a == b || a > c && all(x)
Similarly, we write
if ( (nargin < 3) || isempty(x) )
instead of
if ( nargin < 3 || isempty(x) )
This again adds clarity and avoids possible bugs in some situations.
While assigning the logical outcome of multiple conditionals to a variable, we use the format
isHappy = (a == b) && all(c)
instead of
isHappy = a == b && all(c)
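One way to see why the explicit parentheses help: in Matlab, `&&` takes precedence over `||`, so an unbracketed mix of the two may not group the way a reader expects. The following is a small sketch with hypothetical values (`a`, `b`, `c` and `x` are chosen only so that the two possible groupings give different answers).

```matlab
% Hypothetical values chosen only to make the two groupings disagree.
a = 1;
b = 1;
c = 5;
x = [ 1, 0 ];

% Without parentheses, && is evaluated before ||, so this is read as
% (a == b) || ((a > c) && all(x)).
implicitGrouping = a == b || a > c && all(x);

% The same test with the grouping spelled out.
explicitGrouping = (a == b) || ((a > c) && all(x));

% A reader might instead have expected this grouping, which differs.
otherGrouping = ((a == b) || (a > c)) && all(x);
```

With these values the first two variables are TRUE and the third is FALSE, which is the kind of silent disagreement the bracketing rule is meant to prevent.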
The `%%` sign breaks a Matlab file into blocks which are called cells. When the cursor is moved within a single cell in the Matlab editor, the cell is highlighted, creating a visual aid to help concentrate on a particular part of the code. Chebfun programmers are encouraged to divide their code into cells (typically less than 20 lines). This improves modularization and readability within a single file, not only at the visual level but also at the logical level.
Here is a crude template for a general method file in Chebfun Version 5.
function y = foo(x)
%FOO   FOO computes the foo of a chebfun X and returns the result
%   in chebfun Y. Notice the spacing. There are three spaces after
%   the function name in the first line and three spaces before
%   the start of each line in the help text. A blank line is then
%   inserted before the "See also" section. The "See also" section
%   has only one space after the % symbol. Same is true for the
%   copyright section which follows the "See also" section.
%
% See also FOOBAR, WHATNOT.

% Copyright 2013 by The University of Oxford and The Chebfun Developers.
% See http://www.chebfun.org/ for Chebfun information.

%%
% First block of code with comments.
y = 0;
w = x + y;

%%
% Second block of code with comments.
z = y;
w = w + z;

end
Hale and Austin have written a detailed layout for `classdef` files. This can be accessed through our shared Dropbox folder.
Good naming conventions are very useful in making the code self explanatory. Appropriately chosen names also act as meta-data of a code and allow the reader to extract useful information about the context, data type, and the actual data associated with a variable.
Within the Chebfun team, there is a consensus on at least one principle for naming variables: descriptive but at the same time not too verbose. These apparently contradictory demands already give us a hint that choosing an appropriate name for a variable might prove to be trickier than it seems. There are mainly three types of items that we need to name:
- Variables or objects
- Functions or methods
- Classes
Let us start with variable names first.
CamelCase notation is the practice of choosing compound words or phrases as variable names, where parts of the compound word are joined without spaces and are capitalized within the compound according to a certain rule, for example `getThisValue()`, `MatSize` or `funVal`. This notation is not widely used by Matlab toolbox programmers, but it is growing more popular.
In Chebfun, we make use of this notation at various levels. For example, the number of Chebyshev points used by a `fun` is accessed by `fun.n` in Version 4. In Chebfun Version 5, this changes to `fun.nPts`.
Chebfun relies heavily on matrices, and this requires a lot of indexing variables and variables determining row and column sizes etc.
- Matrices vs. Vectors: A matrix should be named with a capital letter and a vector with a small letter. This fits naturally with `Ax = b`. However, this also entails the loss of distinction between scalars and vectors.
- Operators: Linear operators should be denoted by `L` explicitly or by using `L` as a prefix, while non-linear operators by using `N`. For operator arithmetic, we may always use capital letters, e.g. `A, B, C`, and define methods like `C = plus(A, B)`.
- Index Variables: In Matlab, `i` and `j` are used to represent the imaginary unit `sqrt(-1)`. An expression of the sort `i + j` can be accordingly confusing (for some people at least). Also, when there is a series of loops within a single file, using `i,j` can be even more confusing because if `i` or `j` appear later in the code, they might already have been defined with some unwanted values.
  We should also avoid the use of the variable name `idx` as a shortened version for the word 'index'. This usage appears in many places in Chebfun Version 4, but it can be confusing; automatic differentiation people in particular may think of it as the ID of some `x`.
  The preferred solution is to use `i`, `j`, and `k` as prefixes of an indicative name. Here is an example:
  for iRow = 1:nRows
      for jCol = 1:nCols
          A(iRow,jCol) = i + j;
      end
  end
- Row-Column Sizes: Variables determining the total number of rows and columns of a matrix are extensively used in Chebfun. One should use the names `nRows` and `nCols` in cases when a single matrix is involved. When multiple matrices are involved, `nRows` or `nCols` can be used as prefixes. For example
  if ( nRowsCheb == nColsDiff )
      display('Good to go!');
  else
      display('Multiplication not defined');
  end
- Imaginary Numbers: The imaginary number `z = sqrt(-1)` will be written as `z = 1i`. Other examples include `z = exp(2i*pi/n)` etc.
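The following sketch illustrates why the literal `1i` is safer than the bare name `i`. It deliberately uses `i` as a loop counter, which the index-variable guideline discourages, only to show the side effect; the names `shadowed`, `w` and the value of `n` are hypothetical.

```matlab
% Using i as a loop counter (discouraged above) overwrites the built-in
% imaginary unit for the rest of the workspace.
for i = 1:3
    % Intentionally empty; only the final value of i matters here.
end

% i is now the real number 3, so this is 3 + 1 rather than 1 + 1i.
shadowed = i + 1;

% The literal 1i is unaffected by the assignment to i.
z = 1i;

% Hypothetical example: the first of the n-th roots of unity.
n = 8;
w = exp(2i*pi/n);
```

Because `1i` and `2i` are parsed as numeric literals, they keep their meaning regardless of what has been assigned to `i` or `j` earlier in the file.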
Chebfun extends many Matlab functions designed for discrete vectors to their continuous analogues. For example, the Matlab function (method) `min` gives the minimum of a vector, while Chebfun's `min` command gives the minimum of a function. Since method names in Matlab always start with small letters and do not use camel-case notation, it seems that Chebfun does not really have a choice when it comes to naming methods. Since our target audience is Matlab users, we want to make sure that the users are able to guess the correct name of the corresponding Chebfun method most of the time. Therefore, we should use lower-case, simple, short, easy to remember names for functions.
Classes should also have names that are short but meaningful.
Commenting is perhaps the single most important feature of any piece of code to improve its readability. We all agree that comments should be there. In fact, we should have two tiers of comments intertwined within the code: one tier explaining what the particular piece of code does, and the other, also known as documentation comment blocks, explaining the code from an object-oriented, class hierarchy and input-output point of view. In this section, we only discuss the former kind of commenting, i.e. the one which explains what the code is supposed to do. The latter would be dealt with in the “Documentation” sections.
Comments should be descriptive in nature and one of the aims of the code review process is to ensure this.
All comments will use the English alphabet and other characters normally available on a standard keyboard. No accents or other special characters are allowed. Comments are allowed in both British and US spellings.
All comment lines will either be English sentences, starting with a capital letter and ending with a full stop or English phrases ending with a colon.
- Branching Statements. This is how our standard `if` statement should look:
  % This if statement does this and that.
  if ( a == b )
      % Do this if that.
      a = b;
  else
      % Do that if this.
      a = c;
  end % End of if.
- Loops. A simple `for` loop:
  % This for loop loops and loops.
  for iRow = 1:nRows
      % Loop through rows and do X.
      A(iRow,:) = (iRow^iCol) * ones(1, nCols);
  end % End of for.
It is not necessary to comment every single line of an `if` statement or a `for` loop. In particular, if the variable names are chosen wisely, it will often be obvious what the code is doing and such comments would seem redundant.
Any occurrence of a variable name or a method name in the introductory help text of a method or any other file will be fully capitalized. An exception to this rule is made when an example code snippet is provided in the help text to explain the usage of the method. For example
function yOut = foo(xIn)
%FOO   The function FOO takes XIN as the input and
%   gives YOUT as the output.
%   Example:
%       yOut = foo(xIn);
%
% See also MYFOO, YOURFOO.
When a local variable is referenced in comments explaining the code, the identifier will be used exactly and no extra capitalization will be done. However, method names, whether local or non-local, will be capitalized and followed by an empty pair of parentheses in order to differentiate them from variable names. It is important to remember that, following MATLAB conventions, no such distinction is made in the help text. Here is an example:
function yOut = foo(xIn)
%FOO   The function FOO takes XIN as the input and
%   gives YOUT as the output.
%   Example:
%       yOut = foo(xIn);
%
% See also MYFOO, YOURFOO.

% The code begins here.
yOut = xIn;         % yOut is the same as xIn.
yOut = max(xIn);    % MAX() is used to assign the maximum of xIn to yOut.

end
While referring to `Inf` or `NaN` in comments, capitalization will never be used. On the other hand, within comments the key-words `true` and `false` will always be referred to as TRUE and FALSE.
When we're referring to a mathematical interval in a comment, we write [a,b], not [a, b].
We sometimes come across situations where we find a partial fix or an ingenious but obscure solution to a problem. The fragility and (or) reliability of such constructs should be made clear by using `TODO` or `FIXME` phrases in nearby comment lines. Here is an example:
% [FIXME] The following is a kludge and should be improved.
x = xx.^(x(end-1:-1:1))./y;
We now illustrate how the guidelines will affect existing Chebfun code. We take a very simple Chebfun file, `@chebfun/sin.m`. This is how the file looks in Chebfun Version 4:
function Fout = sin(F)
% SIN Sine of a chebfun.

% Copyright 2011 by The University of Oxford and The Chebfun Developers.
% See http://www.maths.ox.ac.uk/chebfun/ for Chebfun information.

for k = 1:numel(F)
    if any(get(F(k),'exps')<0), error('CHEBFUN:sin:inf',...
        'SIN is not defined for functions which diverge to infinity'); end
end

Fout = comp(F, @(x) sin(x));
for k = 1:numel(F)
    Fout(k).jacobian = anon(['diag1 = diag(cos(F)); der2 = diff(F,u,''linop'');' ...
        'der = diag1*der2; nonConst = ~der2.iszero;'],{'F'},{F(k)},1,'sin');
    Fout(k).ID = newIDnum;
end
This is how the same code would look in Chebfun Version 5:
function Fout = sin(F)
%SIN   Sine of a chebfun. F is a quasimatrix and
%   FOUT is a quasimatrix of the same dimension. Each chebfun in the
%   quasimatrix FOUT is the sine of the corresponding chebfun in F.
%
% See also COS, TAN.

% Copyright 2013 by The University of Oxford and The Chebfun Developers.
% See http://www.chebfun.org/ for Chebfun information.

%%
% Loop through the columns of the quasimatrix and rule out singularities.
for k = 1:numel(F)
    % If the current chebfun has singularities, report an error.
    if ( any(get(F(k), 'exps') < 0) )
        error('CHEBFUN:sin:inf', ...
            'SIN is not defined for functions which diverge to infinity');
    end % End of if.
end % End of for.

%%
% The output function is a composition of the input function with the
% sine function.
Fout = comp(F, @(x) sin(x));

%%
% Update the Jacobian info in each chebfun within the quasimatrix Fout.
for k = 1:numel(F)
    % Update the current chebfun with the Jacobian of sin.
    Fout(k).jacobian = anon(...
        ['diag1 = diag(cos(F)); der2 = diff(F, u, ''linop'');' ...
        'der = diag1*der2; nonConst = ~der2.iszero;'], ...
        {'F'}, {F(k)}, 1, 'sin');
    % Update the ID of the current chebfun.
    Fout(k).ID = newIDnum;
end % End of for loop.

end