add some docs
djmott committed Jul 11, 2016
1 parent abf5dfd commit d314216
Showing 7 changed files with 476 additions and 0 deletions.
19 changes: 19 additions & 0 deletions docs/Header-Only-Apps.md
Header-Only Apps
================
The modern buzz in library development is the 'header-only' format, in which the goal is to provide all the code in inline headers rather than separating declarations and definitions into different files. Taking this one step further, I wish to submit that the benefits of header-only design extend to application development as well and that it is a wise practice to adopt. Separating declarations from definitions in classic C++ was a workaround for the resource limitations of computers in an era that has long passed. Old habits die hard, but the split is no longer necessary. Arguably, separating code into .h and .c or .hpp and .cpp files is a bad practice by today's standards.

The basic C++ compilation process passes each implementation unit through the preprocessor to generate a temporary compilation unit. The compilation unit is then passed through the compiler to produce an intermediate object. Typically, all of the application's .c/.cpp files are processed this way to produce a number of intermediate files, which are finally collected by the linker to produce the resulting binary. It was necessary in ancient times to break compilation down this way because systems simply didn't have enough resources to compile all of a project's code at once.

Each file included in a compilation unit must pass through the tokenizer and parser before reaching the compiler. Tokenizing and parsing can account for the majority of compile time, depending on the project, so precompiled headers (PCH) were introduced to help. PCH is a solution to a problem that was itself created by the solution to another problem, one which no longer exists. And though PCH can improve compile times for binaries with separate declarations and definitions, neither is necessary by today's standards.

I submit that the best format for today's application code is a single .cpp file which contains the application's entry point, with all other code contained in inline headers. Adopting this practice feels a bit strange at first, but the gains quickly make it preferable to the classic format.
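
As a minimal illustration of the layout, here is a complete application in this format: one inline header plus one .cpp file holding the entry point. The file and class names are invented for the example.

```{.cpp}
/////////////////// greeter.hpp ///////////////////
// declarations and definitions together in one inline header
#pragma once
#include <iostream>
#include <string>

class greeter {
  std::string _name;
public:
  explicit greeter(std::string name) : _name(std::move(name)) {}
  //defined inline; no matching greeter.cpp exists anywhere in the project
  void greet() const { std::cout << "hello, " << _name << std::endl; }
};

/////////////////// main.cpp ///////////////////
// the application's single compilation unit
#include "greeter.hpp"

int main() {
  greeter g("world");
  g.greet();
  return 0;
}
```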

Reduced code is always a win. When all the code is contained in inline headers, the separate declarations are eliminated. This amounts to reduced technical debt: if an interface or parameter needs to change, it only needs to change in one place.

With the reduction in code comes a reduction in files, half of them to be exact. There's no longer a need to bounce between multiple files to modify a single logical unit of code. Classic C++ suggested declarations in one file and definitions in another; cutting file maintenance in half is a big win in the technical debt department.

Compile times can be significantly faster than with separated declarations/definitions, and a single compilation unit also builds faster than a PCH-enabled binary. PCH works by caching a copy of the AST to disk after a compilation unit has passed through the tokenizer and parser, then re-using that AST when another compilation unit requests the same headers. This skips the tokenizing and parsing passes for code that has already been encountered. A single compilation unit gets tokenized, parsed and compiled only once, so there's no room for PCH to improve on it. Enabling PCH for such a binary would only add the extra step of writing the AST cache to disk, where it would never be used anyway.

Increased productivity comes with less code, fewer files and faster compile times. The bean counters are always happy about increased output. But productivity also benefits from the faster binaries that result from the build.

A final relic which is no longer needed is 'link-time code generation' (LTCG), another 'fix' for the original problem which no longer exists. Multiple compilation units mean inefficiencies in the generated intermediate code. Modern linkers try to work around some of these inefficiencies by using LTCG to recompile portions of the intermediate objects. LTCG reduces many of the inefficiencies, but many more can still wind up in the resulting executable binary. The most efficient binary results from giving everything to the compiler in a single compilation unit: when the compiler has complete visibility into all the executable code it optimizes more aggressively, and the resulting binary is often smaller too.
23 changes: 23 additions & 0 deletions docs/LICENSE.md.md
# Boost Software License - Version 1.0 - August 17th, 2003

Permission is hereby granted, free of charge, to any person or organization
obtaining a copy of the software and accompanying documentation covered by
this license (the "Software") to use, reproduce, display, distribute,
execute, and transmit the Software, and to prepare derivative works of the
Software, and to permit third-parties to whom the Software is furnished to
do so, all subject to the following:

The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software, in whole or in part, and
all derivative works of the Software, unless such copies or derivative
works are solely in the form of machine-executable object code generated by
a source language processor.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
77 changes: 77 additions & 0 deletions docs/Parsing.md
Parsing
=======

Parsing is an extensive subject and a frequent stumbling block for even experienced engineers. Due to the repetitive nature of parsing code, the sheer number of failure points and the frequently changing formats of inputs, parsing code is often a significant resource sink for software projects.

Parsing is frequently outlined with a grammar specification in [Backus-Naur Form (BNF)](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form), which defines the terminals and non-terminals that must be tagged in the input stream. Some tools accept the BNF grammar directly as input and generate all the parsing code as output. These tools are called parser-generators and they can save a lot of pain and time. Though they are not limited to accepting BNF exclusively, and some parser-generators have custom language specifications or decorate BNF with various text-handling code, the basic product is normally the same: a language specification is taken as input and all the boilerplate parsing code is generated as output. In most cases the generated parser includes interfaces for the library consumer to interrogate the parsed sources. Frequently, the parsed inputs are then fed as input to another stage which produces an [Abstract Syntax Tree (AST)](https://en.wikipedia.org/wiki/Abstract_syntax_tree). The AST is most often the primary model that applications use to work with the inputs.

Some applications use neither a generated parser nor an AST. Some apps do not even abstract the parsing code from the processing code and will instead parse the inputs directly in-line with application logic. For all but the most trivial applications this is a recipe for disaster; abstracting the two tasks is almost always the preferred method. However, doing things 'right' and adhering to best practice is also time consuming. For example, using the traditional GNU tools to handle parsing is an entire discipline of its own that complicates the build process and requires a modicum of domain expertise before any application logic is ever addressed. Modern tools such as Boost::Spirit and Antlr suffer from the same problem of requiring a detailed study of the library before a simple parse task can be accomplished.

When a tool that is intended to simplify a problem instead makes the problem more complicated, the tool loses its utility. Since these tools require a significant investment to learn and use properly, it's no surprise that so many developers opt to hand-roll a parser rather than learn yet another library. With all that said, I introduce yet another library.

Introduction
------------

XTL::parse uses template meta-programming techniques to generate parsers from a grammar specification. The grammar specification is written in C++ templates. The library is header-only and the parser is generated at compile time, so there are no libraries to link to and no external tools to run as part of the build process. A unique feature of XTL::parse is that the grammar specification gets instantiated as the AST when the parse is successful, which eliminates the additional import step that similar tools frequently require. XTL::parse is a simple LL(k) parser that encourages embedding the grammar specification in the AST for simplicity and clarity.

To illustrate the entire process, here's a simple BNF grammar describing the command line syntax of example_parse1.cpp:

~~~{.cpp}
//terminals
<red> := 'red'
<green> := 'green'
<blue> := 'blue'
<one> := '1'
<three> := '3'
<five> := '5'
<dash_color> := '--color='
<dash_prime> := '--prime='
//rules
<rgb> := <red> | <green> | <blue>
<prime_num> := <one> | <three> | <five>
<color_param> := <dash_color> <rgb>
<prime_param> := <dash_prime> <prime_num>
<parameter> := <color_param> | <prime_param>
~~~
> If the intention of this BNF syntax is unclear there are plenty of tutorials on the web.

This is the sort of BNF frequently encountered in RFCs, white papers, and programming language and protocol specifications. It maps into an XTL::parse grammar specification as:

~~~{.cpp}
//terminals
STRING(red, "red");
STRING(green, "green");
STRING(blue, "blue");
STRING(one, "1");
STRING(three, "3");
STRING(five, "5");
STRING(dash_color, "--color=");
STRING(dash_prime, "--prime=");
//rules
using rgb = or_<red, green, blue>;
using prime_num = or_<one, three, five>;
using color_param = and_<dash_color, rgb>;
using prime_param = and_<dash_prime, prime_num>;
using parameter = or_<color_param, prime_param>;
~~~

A `parameter` may be either `--color=<rgb>` or `--prime=<prime_num>`, where `<rgb>` may be `red`, `green` or `blue` and `<prime_num>` may be `1`, `3` or `5`. This example uses a mixture of preprocessor macros and template aliases to define the grammar. The C++ representation is more verbose due to C++ language requirements, but it maps to the BNF line-for-line.

Using this specification to parse command line parameters is a matter of passing the start rule and string to a parser:

~~~{.cpp}
#include <string>
#include <xtd/xtd.hpp> //project-wide include

int main(int argc, char * argv[]){
  if (argc < 2){
    //no parameter supplied, show usage
    return -1;
  }
  std::string sParam = argv[1];
  auto oAST = xtd::parser<parameter>::parse(sParam.begin(), sParam.end());
  if (!oAST){
    //parse failed, show usage or error
  }else{
    //work with parsed parameters
  }
  return 0;
}
~~~
Done and done.

AST Generation
--------------
An AST is instantiated and returned by `xtd::parser<>::parse()` when the parse is successful. The AST is an object model that represents the parsed grammar.
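
As a conceptual sketch only, using hypothetical names rather than XTL's actual types, the idea of grammar rules doubling as AST nodes can be pictured like this:

~~~{.cpp}
#include <memory>
#include <string>
#include <vector>

//hypothetical base for every matched rule; a successful parse yields a tree of these
struct rule_base {
  std::vector<std::shared_ptr<rule_base>> children; //the sub-rules this rule matched
  virtual ~rule_base() = default;
};

//a terminal keeps the exact text it consumed from the input
struct terminal : rule_base {
  std::string matched;
  explicit terminal(std::string text) : matched(std::move(text)) {}
};
~~~

Because the grammar types themselves become the nodes, walking the AST amounts to walking the children of the start rule returned by `xtd::parser<>::parse()`.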
96 changes: 96 additions & 0 deletions docs/README.md
eXtended Template Library
=========================
[![Open Hub project report](https://www.openhub.net/p/libxtl/widgets/project_thin_badge.gif)](https://www.openhub.net/p/libxtl)
[![Travis](https://img.shields.io/travis/djmott/xtl.svg?style=plastic)](https://travis-ci.org/djmott/xtl)
[![Coveralls branch](https://img.shields.io/coveralls/djmott/xtl.svg?style=plastic)](https://coveralls.io/github/djmott/xtl)
[![SonarQube Tech Debt](https://img.shields.io/sonar/https/sonarqube.com/xtl/tech_debt.svg)](https://sonarqube.com/overview?id=xtl)
[![SonarQube Quality Gate](http://nemo.sonarqube.org/api/badges/gate?key=xtl&blinking=true)](https://sonarqube.com/overview?id=xtl)
[![Boost License](https://img.shields.io/badge/license-Boost_Version_1.0-green.svg?style=plastic)](http://www.boost.org/LICENSE_1_0.txt)

XTL is a public release of portions from a much larger private set of libraries which I've maintained over the years and used in a number of projects. It's primarily a series of C++ template metaprogramming patterns, idioms, algorithms and libraries that solve a variety of programming tasks. It supplements, extends and cooperates with the STL by providing some frequently used components that are otherwise absent from the standard. A short list of some of the more notable headers:

|Header |Description|
|--------------------|-----------|
|callback.hpp |single producer notifies multiple consumers of an event|
|dynamic_library.hpp |load and invoke methods in a dynamic library|
|parse.hpp |text parsing and AST generation|
|socket.hpp |general purpose socket communication|
|source_location.hpp |maintains info about locations within source code|
|spin_lock.hpp |simple user mode spin lock based on std::atomic|
|string.hpp |advanced and common string handling|
|tuple.hpp |manipulate and generate tuples|
|unique_id.hpp |global unique identifier / universal unique identifier data type|
|var.hpp |multi-type variant using type-erasure|

### Getting started

XTL works with modern C++11 compilers and has been tested with MinGW, GCC, Intel C++, Cygwin and Microsoft Visual C++. In many cases the library can be used out-of-the-box by simply including the desired header, since most components are header-only. A few components require linking against run-time code, so they will need to be compiled.

### Requirements

* [CMake](http://www.cmake.org) is required to configure
* [libiconv](https://www.gnu.org/software/libiconv/) is optional for Unicode support on Posix platforms.
* [libuuid](https://sourceforge.net/projects/libuuid/) is optional for UUID/GUID support on Posix platforms. (This library has bounced around several locations over the years; some documentation says it's included in the modern Linux kernel sources while other documentation says it's in the e2fsprogs package. Most modern Linux distros provide a version through their package managers.)

### Obtaining

XTL is hosted on GitHub and is available at https://github.com/djmott/xtl
Check out the repo with git:

```
git clone https://github.com/djmott/xtl.git
```

### Compiling

For the most part XTL is a 'header-only' library, so compilation isn't necessary. Nonetheless, it must be configured for use with the compiler and operating system with [CMake](https://cmake.org/). From within the top-level directory:

```
mkdir build
cd build
cmake ..
```
The compilation step is not always necessary, depending on which components will be used. The method used to compile the run-time code is specific to the platform, toolchain and CMake configuration. For Linux, Cygwin and MinGW makefiles, just run `make`.

### Using

Several configuration options are available with CMake; for most purposes the default configuration should work fine. Applications should add the `include` folder to the search path. The CMake configuration detects the compiler toolchain and target operating system, then produces the primary include file. For most applications, just including the project header will go a long way:
```{.cpp}
#include <xtd/xtd.hpp>
```

### Testing

XTL uses the [Google Test](https://github.com/google/googletest) framework for unit tests and system tests. From within the build directory:
```
make unit_tests
```
The unit tests and system tests are contained in the same resulting binary at `tests/unit_tests`. The `coverage_tests` build target is only available for GCC:
```
make coverage_tests
```
This will produce the binary `tests/coverage_tests`, which is identical to the `tests/unit_tests` binary but has additional instrumentation enabled for gcov.

### Documentation

[Doxygen](http://www.doxygen.org) is used to generate source documentation, and the code is fairly well marked up for it. After the project has been configured with CMake, build the documentation with:

```
make docs
```
This will extract the source comments and generate browsable documentation in the `docs/html` folder. The project [wiki](https://github.com/djmott/xtl/wiki) is also available.

### Feedback and Issues

Submit a [ticket](https://github.com/djmott/xtl/issues) on GitHub if a bug is found. Effort will be made to fix it ASAP.

### Contributing

Contributions are appreciated. To contribute, [fork](https://github.com/djmott/xtl/fork) the project, add some code and submit a [pull request](https://github.com/djmott/xtl/pulls). In general, contributions should:
* Clear around 80% in code coverage tests
* Pass SonarQube quality gateway
* Pass unit and system tests
* Pass tests through Valgrind memcheck or some other dynamic analysis with no resource leaks or other significant issues

### License

XTL is copyright by David Mott and licensed under the Boost Version 1.0 license agreement. See [LICENSE.md](LICENSE.md) or [http://www.boost.org/LICENSE_1_0.txt](http://www.boost.org/LICENSE_1_0.txt) for license details.
17 changes: 17 additions & 0 deletions docs/Sockets.md
Sockets
=======
The XTL socket library provides low-level and high-level abstractions around sockets that can be mixed and matched to achieve a multitude of interfaces. The various socket behaviors are decomposed into independent policies which are composed at compile time into concrete types. Sockets are well suited to a hierarchy generation pattern because the various socket types share behaviors in unique ways. The hierarchy generation pattern permits the composition of constituent behaviors without resorting to multiple inheritance, and it allows each concrete composition to contain only the interface elements that make sense for a particular socket type.

This concept is probably best explained from the highest level interfaces that will most commonly be used in applications. Here's a pre-defined typedef for an IPv4 UDP socket:

```{.cpp}
using ipv4_udp_socket = socket_base<ipv4address, socket_type::datagram, socket_protocol::udp, ip_options>;
```

The four constituent components that compose an `ipv4_udp_socket` are `ipv4address`, `socket_type::datagram`, `socket_protocol::udp` and `ip_options`. The `socket_base` template composes these individual behavioral components in a linear object hierarchy that avoids multiple inheritance. Some of these components are used in the IPv4 TCP socket:

```{.cpp}
using ipv4_tcp_stream = socket_base<ipv4address, socket_type::stream, socket_protocol::tcp, ip_options, connectable_socket, bindable_socket, listening_socket, selectable_socket>;
```

Additional behavioral policies can be added or removed as desired to achieve a variety of custom interfaces. For example, the `connectable_socket` behavior produces a `connect` method, typically for TCP clients, while `bindable_socket` and `listening_socket` provide `bind` and `listen` respectively, typically for TCP servers. So the predefined `ipv4_tcp_stream` type can be used as both a client and a server. If desired, these behaviors could be declared in separate compositions to produce independent client and server socket types. A rough sketch of the composition pattern follows.
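
As a rough sketch of the pattern, not XTL's actual implementation and with invented names, behaviors expressed as mixin templates compose into a single linear inheritance chain:

```{.cpp}
//the root of the generated hierarchy owns the raw descriptor
struct socket_root {
  int handle = -1;
};

//each behavior is a mixin that derives from whatever is composed beneath it
template <typename Base>
struct connectable : Base {
  void connect(/*address*/) { /*would call ::connect on this->handle*/ }
};

template <typename Base>
struct bindable : Base {
  void bind(/*address*/) { /*would call ::bind on this->handle*/ }
};

template <typename Base>
struct listening : Base {
  void listen(int /*backlog*/) { /*would call ::listen on this->handle*/ }
};

//the composition unrolls into one single-inheritance chain,
//listening -> bindable -> connectable -> socket_root, with no multiple inheritance
using tcp_server_socket = listening<bindable<connectable<socket_root>>>;
```

A composition that omits `listening` and `bindable` would expose only the client-side interface, which is how policy selection keeps each concrete socket type's interface limited to what makes sense for it.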
