Coding guideline
- Library structure
- File structure and naming
- Spacing, Indention, Alignment, and Naming
- Functions
- Metafunctions - "functions" always evaluated at compile time
- Exceptions
- Thread Safety
There are three structures or hierarchies in the library:
- Filesystem hierarchy
- Module hierarchy (almost identical to filesystem hierarchy)
- Namespace hierarchy
- The library is structured into modules and submodules, represented by directories.
- Modules are subdirectories of the top-level include directory.
- Submodules are all subdirectories of module directories, except subfolders called `detail` (see below).
- Modules and submodules may contain a `detail` folder that does not constitute a submodule; it belongs to the (sub-)module it is in and contains implementation detail (the headers inside are not part of the API and may change at any time).
- There are no sub-sub-modules; a third-level folder is only allowed if it is a `detail` folder.
- All (sub-)modules must provide a "meta-header", called `all.hpp`.
  - The meta-header includes all headers of the module (except those with `_detail` in their name) and all of its submodules' meta-headers (but not explicitly any content from a `detail` subdirectory).
  - Unless otherwise required, the order of includes shall be alphabetical with submodules before files (this is also the filesystem's order).
  - There is a top-level "meta-header" that includes all modules.
  - The meta-header defines a Doxygen module of its name for the documentation.
  - Library headers should never include `all.hpp` from other (sub-)modules; always include exactly the header required.
- All documented entities that are not members (files, free functions, classes...) shall set the documentation property `\ingroup MODULENAME` to their respective (sub-)module, i.e., the name of the folder they are in, except where this is called `detail` (in that case it is its parent).
- We should include as few headers as possible in the headers.
  - This applies to both `std::` and `seqan3::`.
  - In snippets, we can be more verbose, e.g., including `<vector>`, even though it is already included by some seqan3 header.
  - In projects: `hpp` should only include what is needed in the header. `cpp` should include what is additionally needed. But it would also make sense to use as few as possible in the `cpp`.
  - The reasoning for `hpp` is that you want to include as few as possible since multiple translation units might include the header, but the "implementation" is provided via some `cpp` which then might be linked.
seqan3/alphabet/aminoacid/all.hpp // meta-header of the aminoacid sub-module of the alphabet module
seqan3/alphabet/aminoacid/... // aminoacid sub-module of the alphabet module
seqan3/alphabet/.../... // other sub-modules
seqan3/alphabet/detail/... // implementation detail of the alphabet module
seqan3/alphabet/.../... // other sub-modules
seqan3/alphabet/all.hpp // meta-header of the alphabet module
seqan3/alphabet/... // alphabet module
seqan3/.../... // other modules and sub-modules
seqan3/all.hpp // global meta-header that provides all of the API
- `namespace seqan3`: everything, except the following exceptions
- `namespace seqan3::detail`: all free/global functions, metafunctions, static variables and class definitions that are considered implementation detail and not part of the API (and not a private/protected member of a class)
- `namespace seqan3::view`: seqan-defined views (usually in `seqan3/range/view`)
- `namespace std`: overloads of `std::` functions like `std::begin` (rarely) (We write `a std::...` as we read it as `[stood]::...`.)
- nothing should be in the global namespace
- never have `using namespace ...` in a header file, except inside the test framework
- braces go on newline
- closing braces shall be documented
- there is no indention!
- for better readability and because we don't indent, nested namespaces are not declared inside parent, but separately and with full name
namespace seqan3::detail
{

void my_non_public_function()
{
    // ...
}

} // namespace seqan3::detail

namespace seqan3
{

void my_public_function()
{
    detail::my_non_public_function();
    // ...
}

} // namespace seqan3
- header-only: SeqAn is a header-only library and all functionality is implemented in header files.
- extension: Header files have the extension `.hpp`.
- file-names:
  - all-lower snake_case
  - only lower case standard characters `[a-z]` and underscore!
  - generalised singular (`concept.hpp` instead of `concepts.hpp`)
  - if all the content of a file is inside the `namespace seqan3::detail`, the filename shall end in `_detail.hpp` or the file be placed in a `detail` subdirectory
- UTF-8: All files are expected to be UTF-8 encoded. Special characters are allowed in documentation and string literals, but not in regular code (function names etc.) and file names.
- self-contained: every header shall include every other header that it needs.
- `seqan3/core/platform.hpp`: every header that doesn't include another SeqAn3 header shall include `seqan3/core/platform.hpp`.
- visibility: every header that is neither in a `detail` subfolder nor contains `_detail` in its name is considered part of the API and may be directly included by users of the library.
- Copyright notice:
  - Copy from the license file or another header.
  - Update the year if necessary.
  - Don't add an Author line, instead add it to doc (see below).
- File-Documentation:
  - `\file`: this tells Doxygen that the documentation block belongs to this file
  - `\brief` + one-line description starting with upper case and ending in `.`
  - `\author your name <your AT mail.com>`
  - (files don't get this!) `\ingroup MODULENAME`
  - optionally a longer description
- Single-inclusion: `#pragma once` (more information); we don't use header guards.
- Names and order of includes: (an empty line between each block, sorted alphabetically within)
  - C system library (rarely!)
  - C++ Standard Library, e.g.
    - `#include <vector>` and
    - `#include <seqan3/std/ranges>`; in the future this header will be the same as `#include <ranges>`, so it is ordered as if the prefix `seqan3/std` were not there.
  - SDSL
  - Ranges-V3 (always `#include <seqan3/std/ranges>` before including any Ranges-V3 header)
  - SeqAn3
  - Cereal
  - Lemon
  - All headers are always included with `<header>`, not with `"header"`, even SeqAn3!
  - The reasoning for this order is System → required Dependencies → SeqAn → optional Dependencies (because the inclusion of optional dependencies might depend on values/macros from other headers, especially `platform.hpp`)
  - Of course there are exceptions to this rule, but they should be very well argued!
- rest of file (likely starts with a namespace opening)
// -----------------------------------------------------------------------------------------------------
// Copyright (c) 2006-2020, Knut Reinert & Freie Universität Berlin
// Copyright (c) 2016-2020, Knut Reinert & MPI für molekulare Genetik
// This file may be used, modified and/or redistributed under the terms of the 3-clause BSD-License
// shipped with this file and also available at: https://github.com/seqan/seqan3/blob/master/LICENSE.md
// -----------------------------------------------------------------------------------------------------

/*!\file
 * \brief Contains many nice things.
 * \author Hannes Hauswedell <hannes.hauswedell AT fu-berlin.de>
 * \ingroup alphabet
 */

#pragma once

#include <any>
#include <type_traits>
#include <vector>

#include <range/v3/view/drop.hpp>
#include <range/v3/view/take.hpp>

#include <seqan3/range/container/concept.hpp>

// here comes the code
In some cases, you need headers only in DEBUG mode, e.g., when `static_assert`ing a concept. In these cases, you may include the header inside the DEBUG block, but only if this actually improves readability of the file.
10 different DEBUG blocks which each include different headers do not improve readability. In this case, please move everything to an extra `_detail.hpp`.
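A minimal sketch of a debug-only include. It assumes that "DEBUG mode" means the standard `NDEBUG` macro is not defined; the library may use its own macro instead. `checked_value` is a hypothetical type that exists only for this illustration.

```cpp
#ifndef NDEBUG
#include <type_traits> // only required for the static_assert below
#endif

// Hypothetical wrapper, not part of SeqAn3.
template <typename value_type>
struct checked_value
{
#ifndef NDEBUG
    // Assumption: this check is only wanted in debug builds.
    static_assert(std::is_default_constructible_v<value_type>,
                  "value_type must be default constructible.");
#endif

    value_type value{};
};
```

A single block like this keeps the file readable; once several such blocks accumulate, the rule above asks for a separate `_detail.hpp`.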
- always indent by four spaces, no tabs allowed
- indention larger than four may be used to achieve alignment (see below)
- indent every scope, except namespaces
- always place opening brace on newline
- no trailing whitespace. EVER.
- maximum line length is 120.
Some basic rules:
- semicolon, comma `;`, `,`: never a space before, always a newline or space after
- arithmetic `+`, `-`, `*`, `/`: always a space before, always a newline or space after
- logical `&&`, `||`: always a space before, always a newline or space after (do not use alternative `and`, `or`)
- comparison `==`, `!=`, `<`, `>`, `<=`, `>=`: always a space before, always a newline or space after
- bitshift/stream `<<`, `>>`: always a space before, always a newline or space after
- subscript `[`, `]`: never a space before either, never a space after `[`
- references `&`, `&&`: always a space before, always a space after (unless variable omitted in function declaration)
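The spacing rules above can be sketched in one small function. `sum_in_range` is a hypothetical helper made up for this illustration; it is not part of the library.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: sums all values that lie in the closed interval [low, high].
constexpr int sum_in_range(std::array<int, 4> const & values, int const low, int const high)
{
    int sum{0};

    for (std::size_t i = 0; i < values.size(); ++i) // no space before ';', one space after
    {
        // No space before '[' or ']'; spaces around '>=', '&&' and '<='.
        if (values[i] >= low && values[i] <= high)
            sum = sum + values[i]; // spaces around arithmetic operators
    }

    return sum;
}
```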
Parentheses `(` and `)`:
- do not use for casts, use C++ style casts instead (usually `static_cast<>()`)
- do not use for initialization, use brace-initialization instead `{}`
- `for`, `if`, `while`
  - space between keyword and `(`, no space after `(`
  - no space before `)`, newline after `)`
- function declarations, definitions, and calls: no space before, no space after `(`
- if you don't close an open parenthesis, align the next lines after opening `(`
Splitting code over multiple lines:
- in general, put operators at the end of the line and begin a word on the next line aligned with the corresponding word of the current line, e.g.:
if (foo &&
    bar &&
    bax)
    //...

func(foo,
     bar,
     bax);

my_enum e = my_enum::FOO |
            my_enum::BAR |
            my_enum::BAX;
- An exception to this rule is the pipe-symbol in the context of range and views where it is put on the beginning of the line, either aligned with a pipe-symbol on the previous line, or with the assignment operator:
auto v = foo | view::bar
             | view::bax;

auto v = view::myvee(foo)
       | view::myvee2
       | view::myvee3;
General:
- opening braces always go to the beginning of a new line
- only exception: tiny lambdas that completely fit into one line (including, e.g., surrounding function)
- always lead to indention of contents, except for namespaces
- empty bodies can be closed on the same line as opening
- otherwise, the closing brace goes on a newline as well
- always balance braces, i.e., if you have `if` and `else` and one body has braces, the other must, too.
General:
- for all types that are not built-in arithmetic types, use brace initialization if at all possible (not `()` or `=`)
- initialise all variables upon declaration, unless you really know what you are doing (if in doubt, initialise with empty `{}`)
`const`-ness:
- when possible, make variables `constexpr` or `const` (in that order)
- always use "east-const", i.e., put `const` on the right of the type that is modified; see http://slashslash.info/eastconst/ for more info
- if a variable is `constexpr`, put the `constexpr` on the left of the type (west-constexpr)
Global variables:
- should be `inline` and `constexpr`
// brace-initialize, don't use =
int i{7};
int & k{i};
float f{4.5};

// assignment
i = 8;
f = 3.4;

// loops
for (size_t j = 0; j < i; ++j)
    std::cout << j << '\n';

// linebreaks and alignment for readability
// in this case add braces, even for one-line body
for (size_t j = 0;
     (j < i) && some_very_long_condition_or_call();
     ++j)
{
    std::cout << j << '\n';
}

// always balance braces
if (i < 7)
{
    i = 21; // just one line
}
else
{
    i = 9;
    f = 13.3;
}

// tiny lambda may go in one line
auto increment = [] (int & i) { ++i; };

// long lambda must not
std::for_each(v.begin(), v.end(), [] (int & i)
{
    i += 17;
    // ...
});
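The `const`-ness and global-variable rules above are not covered by the snippet, so here is a minimal sketch of them. The names `max_line_length` and `fits_line` are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <string_view>

// Global variable: inline and constexpr, with constexpr left of the type (west-constexpr).
inline constexpr std::size_t max_line_length{120};

// Hypothetical helper that checks a line against the limit above.
// East-const: const is placed to the right of the type it modifies.
constexpr bool fits_line(std::string_view const line)
{
    std::size_t const length{line.size()}; // const, because it never changes after initialisation

    return length <= max_line_length;
}
```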
General:
- use return type auto only if it actually improves readability
- use trailing return type only when strictly necessary
Spacing (independent of declaration, definition, or invocation):
- no space before, no space after `(`
- always a newline/`;` after `)` except when using trailing return type
Line breaks:
- you may always put different arguments on individual lines for improved readability, especially in function definitions
- if you put one argument on an extra line, put each on its separate line
- if you do, also do this for template arguments if you have a function template
- if all arguments are on individual lines, but you still exceed line length 120, move `constexpr` or `inline` and the return value to its separate line
Alignment:
- if you have line breaks, align lines after opening `(`
- place `inline` or `constexpr` before the return type of the function (note that in contrast to variable definitions, the `constexpr` keyword does not influence the return type of the function, so it doesn't go to the right of the type)
Empty lines in function body:
- No double-newlines, ever.
- Always empty newline before new scopes if they are not part of `if` or a loop.
- Always empty newline after `}` that closes a scope.
- Newline after for-loop body that doesn't have `{}` is highly recommended
- No strict rules otherwise; attempt to improve readability.
- e.g., if a function only has three statements, it may be OK to not have any empty lines. If it is longer, group statements.
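As a sketch of the line-break, alignment and empty-line rules for functions, consider the following hypothetical function (`clamp_add` is invented for this example):

```cpp
#include <cstddef>

// constexpr goes before the return type; every argument sits on its own line,
// aligned after the opening parenthesis.
constexpr std::size_t clamp_add(std::size_t const value,
                                std::size_t const increment,
                                std::size_t const upper_bound)
{
    std::size_t result{value + increment};

    if (result > upper_bound)
        result = upper_bound;

    return result;
}
```

The body is grouped into three blocks separated by single empty lines: initialisation, adjustment, and return.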
TODO (old guide: Classes and Structs)
General:
- use `typename`, not `class`
- don't use short concept forms
- always put the `requires` clause on its separate line
- indent requires clause by four spaces
- if one part of the function header is aligned, align also the rest
Spacing:
- no space after opening `<` and no space before closing `>`
- template declaration has space before opening
- template usage or specialization has no space before opening
- no space between multiple closing `>`
Line breaks:
- you may always put template parameters on individual lines for improved readability
- you are required to do this for function templates where the function parameters are also on individual lines
Alignment:
- if you have line breaks, align lines after opening `<`
// small function, readable
inline void my_free_function(int const i, float const f)
{
    // ...
}

// larger header -> introduce linebreaks and alignment for readability
template <typename my_type,
          typename my_type2>
    requires std::is_integral_v<my_type> &&
             std::is_floating_point_v<my_type2>
inline void my_free_function(my_type const i,
                             my_type2 const f)
{
    // ...
}
We distinguish between
- member functions of a class or struct, also called methods
- free functions in the scope of namespace (not class/struct), also called global functions
First read the Chapter on Functions in the CoreGuidelines!
Basics:
- in-parameters are parameters that are only read from in the function
  - shall be `type variable` (copy) if you want a copy inside function
  - shall be `type const variable` for small built-ins, i.e., mainly arithmetic types
  - shall be `type const & variable` for specific class types
  - shall be `type && variable` for parameters with type deduction (templated parameters)
  - shall be `type const & /**/` for type-only parameters / tags
- in-out-parameters are parameters that are both read from, and written to
  - shall be `&` in all cases
- out-parameters are parameters that are only written to
  - shall be return values (if multiple return values, put them in `std::tuple<>`)
  - in case you need to specialize over the type of the out-parameter, treat it as in-out
- ordering shall be 1. in-out, 2. in, 3. in (tags and pure type parameters)
void foobar(type3 & in_out, type4 const & in, tag_type const & /**/)
Reasons:
- in-parameters:
  - If you plan on copying the argument inside your function, you should instead copy it in the signature because this enables usage of the move-constructor, saving the copy operation if the function is passed a temporary. It also eases writing exception-safe code. [See also the copy and swap idiom]
  - The specified types are smaller than references.
  - Don't copy, since you don't have to; use `const` protection because you can.
  - Since the type before the `&&` is subject to type deduction, the type is not an rvalue reference, but a forwarding reference. This implies that it can resolve to `&`, `&&` and also `const &` so it is more generic than only `const &`. This is especially important for objects that are not `const`-iterable like certain ranges.
  - If you are not going to use an argument's value, omit the variable to enable compiler optimization.
- out-parameters
  - Since C++17 there is guaranteed copy elision on return values, so we don't need to worry about it and just return. There are also so-called structured bindings to easily access the return values.
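The parameter rules and their reasons can be sketched as follows. `shifted_min_max` is a hypothetical function invented for this example; it takes a class-type in-parameter by `const &`, a small built-in by `const` value, and returns two results in a `std::tuple` instead of using out-parameters.

```cpp
#include <array>
#include <tuple>

// Hypothetical function: returns the minimum and maximum of values, each shifted by offset.
constexpr std::tuple<int, int> shifted_min_max(std::array<int, 4> const & values, int const offset)
{
    int min{values[0]};
    int max{values[0]};

    for (int const value : values)
    {
        if (value < min)
            min = value;

        if (value > max)
            max = value;
    }

    return std::tuple<int, int>{min + offset, max + offset};
}
```

A caller can unpack the result with structured bindings, e.g. `auto [min, max] = shifted_min_max(values, 0);`.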
Number of arguments:
- should be ≤ 5
- use ranges instead of individual begin + end iterators (if applicable)
- use `std::pair`s and `std::tuple`s instead of individual value parameters (if applicable)
- use traits instead of individual type parameters (if applicable)
There is no strict policy on the order of arguments (e.g., "output before input"), use the following guideline:
- "data arguments" (e.g., a string that is being processed)
- "input data arguments" come before "output data arguments" (but often the latter are return values anyway)
- "option arguments" (e.g., how the string shall be processed)
- type-only "option arguments" (e.g., tags or traits)
Also keep in mind that if you want to default some parameters, they need to be at the end. Execution policies are a special case and always come first.
Design your function signature so that there aren't too many possible interfaces, ideally 1-2, but not more than 3-5 (with and without defaults).
1. always constrain your template parameters!
2. choose the least-constrained concept that works for your algorithm
3. but enforce all the requirements that you actually have!
4. use forwarding reference `&&` instead of `const &`, also for read-only parameters (see above)
Examples for 2.:
- do not require a `container_concept` if a `forward_range_concept` is sufficient
- do not require a `random_access_sequence_concept` if a `sequence_concept` is sufficient
Examples for 3.:
- if you do require random access, make sure that you include the corresponding requirements!
TODO?
TODO?
We distinguish
- value metafunctions are functions or other constructs that return a value at compile time
- type metafunctions are constructs that return a type
There are different ways to implement value metafunctions in C++:
- `struct` templates with `enum` definitions
- `struct` templates with `static const` or `static constexpr` data members
- free `constexpr` functions (only meta if evaluated in `constexpr` context)
- global `constexpr` variable templates
In SeqAn3 we use the style of the STL which is:
- a (possibly constrained) `struct` template with `static constexpr` `value` member; and
- a shortcut of the same name, suffixed with `_v` as a `constexpr` variable template
Example:
template <typename alphabet_type>
    requires detail::internal_alphabet_concept<alphabet_type>
struct alphabet_size
{
    static constexpr underlying_integral_t<alphabet_type> value = alphabet_type::value_size;
};

template <typename alphabet_type>
constexpr underlying_integral_t<alphabet_type> alphabet_size_v = alphabet_size<alphabet_type>::value;
Note that internally, you may of course use `constexpr` functions or other forms of metaprogramming, but the public interfaces shall be as specified here.
There are different ways to implement type metafunctions in C++:
- `struct` templates with `typedef` or `using` declarations
- global templatised `using` declarations
- calling `decltype()` on (`constexpr`) functions
In SeqAn3 we use the style of the STL which is:
- a (possibly constrained) `struct` template with a local `type` alias; and
- a global shortcut of the same name, suffixed with `_t` as a templatised using declaration
Example:
template <typename alphabet_type>
    requires detail::internal_alphabet_concept<alphabet_type>
struct underlying_integral
{
    using type = typename alphabet_type::integral_type;
};

template <typename alphabet_type>
using underlying_integral_t = typename underlying_integral<alphabet_type>::type;
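The two examples above depend on library internals, so here is a self-contained sketch of the same STL style that compiles on its own. All names (`dna4_like`, `rank_type_of`, `size_of_alphabet`) are hypothetical, and the constraint is omitted for brevity.

```cpp
#include <type_traits>

// Hypothetical alphabet type, only for illustration.
struct dna4_like
{
    using rank_type = unsigned char;
    static constexpr rank_type value_size{4};
};

// Type metafunction: struct template with a local type alias ...
template <typename alphabet_type>
struct rank_type_of
{
    using type = typename alphabet_type::rank_type;
};

// ... and a shortcut suffixed with _t as a templatised using declaration.
template <typename alphabet_type>
using rank_type_of_t = typename rank_type_of<alphabet_type>::type;

// Value metafunction: struct template with a static constexpr value member ...
template <typename alphabet_type>
struct size_of_alphabet
{
    static constexpr rank_type_of_t<alphabet_type> value{alphabet_type::value_size};
};

// ... and a shortcut suffixed with _v as a constexpr variable template.
template <typename alphabet_type>
inline constexpr rank_type_of_t<alphabet_type> size_of_alphabet_v{size_of_alphabet<alphabet_type>::value};
```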
There are different ways to specialize type metafunctions:
- partial template specialization
- stronger constraints
The first case is especially handy for template subclassing, but it has the drawback that it does not work if regular inheritance is used (which is now more often the case because we rely on concepts in other places):
template <typename type>
struct is_foo : std::false_type
{};

template <typename ...types>
struct is_foo<foo_impl<types...>> : std::true_type
{};

// is_foo<foo_impl<int>> == true_type

template <typename t>
struct my_type : foo_impl<t>
{
    //...
};

// is_foo<my_type<int>> == false_type

The desired behaviour can be achieved with a `template template` parameter and constraints:

template <typename type>
struct is_foo : std::false_type
{};

template <template <typename ...> typename type, typename ...types>
    requires std::is_base_of_v<foo_impl<types...>, type<types...>>
struct is_foo<type<types...>> : std::true_type
{};

// is_foo<foo_impl<int>> == true_type

template <typename t>
struct my_type : foo_impl<t>
{
    //...
};

// is_foo<my_type<int>> == true_type
- Always guarantee at least the basic exception guarantee (2)!
- If you can, enforce the strong exception guarantee (3)
- move construction, move assignment and swap should always be no-throw
See section Exception-Safety for details on exception safety.
When do we use `noexcept`:
- If we can ensure that everything within the function body can never throw
- If it is critical that the function does not throw (move semantics)
  - Attempt to always make move construction, move assignment and swap `noexcept`!
  - Use the `noexcept()`-operator if necessary
- If there is a measurable performance gain (tests!)
Note: Since explicitly defaulted constructors are `noexcept` if they can, do not explicitly declare them `noexcept`, except if you want to enforce this.
See section The noexcept specifier (C++11) for details on `noexcept`.
Related issue: #45 Related design discussion: 2020-03-30
Safety-Guarantee
1. none or unknown
2. basic (invariants of the component are preserved, and no resources are leaked)
3. strong (if an exception is thrown there are no effects)
4. no-throw (the code will never ever throw)
Adding `noexcept` to your function declaration will tell the compiler: This function never throws!
void my_function() noexcept // "will never throw"
{
    // ...
}
- Ensures the no-throw exception guarantee (see above) -> can be used accordingly (e.g., when using it in another function to ensure a strong exception guarantee)
- The compiler may optimize your code (e.g., efficient move with `std::move_if_noexcept`)
What happens if you throw from a `noexcept` function? Your program terminates, usually without unwinding the stack.
- Terminating the program isn't the nicest way to report an error to your user.
- Terminating prevents any error handling
- Removing `noexcept` can break the API
Take home message: Use `noexcept` if you are confident, avoid if in doubt.
If you are uncertain if something throws, you can use a conditional `noexcept`:
template <typename t>
int my_function(t & v) noexcept(noexcept(std::declval<t &>().size()))
{
    return v.size();
}
- Functions that are declared `noexcept`
- Construction of trivial types (e.g., `int`)
- Explicitly defaulted constructor and assignment operator `foo() = default;` (C++11) are implicitly noexcept and constexpr if they can (see stack overflow which references the standard)
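The claim about explicitly defaulted members can be checked with type traits. The following is a sketch with a hypothetical `record` type; it relies on `std::string` having non-throwing move operations and a potentially throwing copy constructor.

```cpp
#include <string>
#include <type_traits>

// Hypothetical type whose special members are all explicitly defaulted and not marked noexcept.
struct record
{
    record() = default;
    record(record const &) = default;
    record(record &&) = default;
    record & operator=(record const &) = default;
    record & operator=(record &&) = default;

    std::string name{};
    int id{};
};

// The defaulted move constructor is implicitly noexcept, because moving every member is.
static_assert(std::is_nothrow_move_constructible_v<record>);

// The defaulted copy constructor is not, because copying a std::string may allocate and throw.
static_assert(!std::is_nothrow_copy_constructible_v<record>);
```

This is why the note above advises against writing `noexcept` on defaulted members: the compiler already derives it, and an explicit `noexcept` is only needed to enforce the guarantee.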
- https://blog.quasardb.net/when-noexcept-2/
- https://visualstudiomagazine.com/articles/2016/10/01/noexcept.aspx
- https://www.modernescpp.com/index.php/c-core-guidelines-the-noexcept-specifier-and-operator
- https://akrzemi1.wordpress.com/2014/04/24/noexcept-what-for/
- https://isocpp.org/blog/2014/09/noexcept-optimization
There are 4 categories for a function:
1. not thread-safe or unknown
2. does not modify data (safe to be called from multiple threads, as long as no other function modifies the data)
3. modifies, but re-entrant (safe to be called from multiple threads, as long as the data is different, i.e., different parameters or member function on different object)
4. thread-safe (always safe)
Some rules-of-thumb:
- All const member functions should be 2 or 4.
- All non-const member functions should be 3 or 4.
- All free functions shall be 2 or 4 (if they take only copy or `const &` parameters) or 3 or 4 (otherwise)
Every function starts with 1., but should at least guarantee 2.
TODO: maybe rename to "data races"? Use other definitions?