ggeorgiev edited this page Feb 14, 2016 · 12 revisions

Welcome to the dbs wiki!

I am going to put some random thoughts here and systematize them as the project moves forward.

General

Why a development build system? As a software developer with more than two decades of experience, I am always stunned by how easily build engineers push developers' needs into second place and focus on releases and other needs. Even today, in 2015, build systems that lack IDE integration, reliable incremental builds, and fast, efficient dependency checking are not only common but dominant in the industry. With this project, starting from the name, I would like development to be the clear priority.

Building process

Dbs does not recognize, favor, or follow any particular build model or process. In dbs there are two types of objects: objectives and sources. Objectives can be of different types, from simple ones like an executable or a library file, through executing unit, integration, or automation tests, to static and dynamic code analysis, performance tests and benchmarking, and even code improvements with tools like clang-modernize, iwyu, and similar. Sources also vary, from simple files like a cpp or header file to independent 3rd party components.

Note that the type of an object does not determine whether it is a source or an objective. For example, if you have a project that produces a set of cpp and header files from thrift or protobuf definitions, those files are objectives. It is also important that when an executable is built from cpp files that were produced from some kind of idl files, those cpp files are not sources for it. Sources are only what you do not obtain during the build process. Even if you submit auto-generated files to the scm, that still does not make them sources.
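A minimal sketch of that rule, assuming a build described as a plain mapping from each produced item to the items it is produced from. The file names and the `produced_from`, `is_source`, and `sources_of` names are hypothetical and are not part of dbs:

```python
# Hypothetical build description: each key is produced from the listed items.
# Anything that never appears as a key is not obtained by the build,
# so it counts as a source.
produced_from = {
    "foo_types.h": ["foo.thrift"],
    "foo_types.cpp": ["foo.thrift"],
    "app": ["main.cpp", "foo_types.cpp", "foo_types.h"],
}


def is_source(name, produced_from):
    """A source is anything the build process does not produce."""
    return name not in produced_from


def sources_of(objective, produced_from):
    """Walk through generated items down to the real sources of an objective."""
    result = set()
    for item in produced_from[objective]:
        if is_source(item, produced_from):
            result.add(item)
        else:
            result |= sources_of(item, produced_from)
    return result
```

Here `foo_types.cpp` has the shape of a source file but is an objective, and the sources of `app` are `main.cpp` and `foo.thrift` only.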

Build system language

Most build systems use procedural languages to define the build process. This seems reasonable for the goal most of them have: to build the dependency tree, then satisfy it. It is completely reasonable for a build process that works in sessions - you modify, you build, you explore, and repeat. One big part of the dbs goals is to eliminate the need to re-evaluate the whole dependency tree, and instead to correct it based on changes observed with systems like watchman, for example.
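The idea of correcting the tree from observed changes, rather than re-evaluating it, could look roughly like the sketch below. It assumes change notifications arrive as plain file paths (for instance from a file watcher) and that the graph is kept in memory between builds. The `DependencyGraph` class and the artifact names are illustrative assumptions, not the dbs implementation:

```python
from collections import defaultdict, deque


class DependencyGraph:
    """Long-lived graph that is patched on change events, not rebuilt."""

    def __init__(self):
        self.dependents = defaultdict(set)  # input -> items built from it
        self.dirty = set()

    def add_edge(self, target, dependency):
        self.dependents[dependency].add(target)

    def on_change(self, path):
        """Mark only what is reachable from `path`; return the newly dirty items."""
        newly_dirty = set()
        queue = deque([path])
        while queue:
            current = queue.popleft()
            for target in self.dependents.get(current, ()):
                if target not in self.dirty:
                    self.dirty.add(target)
                    newly_dirty.add(target)
                    queue.append(target)
        return newly_dirty


# Hypothetical artifacts for a small program linked against one library.
deps = DependencyGraph()
deps.add_edge("file1.o", "file1.cpp")
deps.add_edge("file1.o", "file1.h")
deps.add_edge("libFoo.a", "file1.o")
deps.add_edge("main.o", "main.cpp")
deps.add_edge("Foo", "main.o")
deps.add_edge("Foo", "libFoo.a")
```

A change to `main.cpp` touches only `main.o` and `Foo`; the library branch is never visited.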

To be able to do that, note that the main source of changes to the tree is not the cpp or idl files but the build scripts themselves. To avoid reloading and rebuilding the whole build object model with every change in any of the build scripts, dbs uses an object definition language.

For example:

cpp_library Foo {
    cpp_code {
        file1.cpp
    }
    cpp_header {
        file1.h
    }
}

gtest Foo {
    cpp_code {
        main-gtest.cpp
        file1-gtest.cpp
    }
    cpp_library {
        Foo
    }
}

cpp_program Foo {
    cpp_code {
        main.cpp
    }
    cpp_library {
        Foo
    }
}
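Because definitions like these are plain declarations, two versions of a build script can be compared object by object, and only the objects that differ need to be reloaded. The sketch below assumes the definitions have already been parsed into dictionaries keyed by `(kind, name)`; that representation, the `diff_definitions` helper, and the added `file2.cpp` are assumptions made for illustration:

```python
def diff_definitions(old, new):
    """Return the (kind, name) keys that were added, removed, or changed."""
    added = sorted(key for key in new if key not in old)
    removed = sorted(key for key in old if key not in new)
    changed = sorted(key for key in new if key in old and new[key] != old[key])
    return added, removed, changed


# Parsed form of the example, before and after adding a hypothetical file2.cpp.
old_model = {
    ("cpp_library", "Foo"): {"cpp_code": ["file1.cpp"], "cpp_header": ["file1.h"]},
    ("cpp_program", "Foo"): {"cpp_code": ["main.cpp"], "cpp_library": ["Foo"]},
}
new_model = {
    ("cpp_library", "Foo"): {
        "cpp_code": ["file1.cpp", "file2.cpp"],
        "cpp_header": ["file1.h"],
    },
    ("cpp_program", "Foo"): {"cpp_code": ["main.cpp"], "cpp_library": ["Foo"]},
}
```

Only the `cpp_library Foo` object is reported as changed, so the definition of `cpp_program Foo` can stay loaded as is.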

How an objective is achieved is an implementation detail of the build system.

There might be more than one algorithm to achieve the same objective. As long as the algorithm is not represented in the dbs model, it is easy to switch between them and pick the most performant.

Features

Supported programming languages

Dbs supports all languages known and unknown to humanity, given the expectations it places on the compilers, code generators, test executors, and any other tools that might be involved in the development process. I will start with cpp because it seems to be the most challenging programming language for a build system.

Distribution

Nowadays almost anything that does not anticipate needs exceeding one machine is outdated before it is born. Dbs provides a distributed service through which developers benefit from the infrastructure as much as any other build engineering need does.

Cache

Because of its unique strategy of building targets from top to bottom, the cache mechanism of dbs is very efficient: it does not require obtaining any intermediate targets to obtain the final objective. To illustrate this, let's compare it with the behavior of a bottom-to-top cache for one cxx_program executable. Satisfying such an objective bottom to top requires obtaining all obj files for the cxx files, then all static libraries, and then the executable based on them. Building or even just obtaining all those files takes time. With the top-to-bottom strategy, all we need to establish is that none of the sources of the executable has changed. Note that the obj files and static libraries are not sources for the executable - only the cpp and header files are, and we already have all of them from syncing the project. With this in place, if the executable for this set of cpp and header files is already cached, we can simply obtain it directly.

Now, what happens if it is not cached? The process moves down to produce all the static libraries that go into the executable. Here again the check is based only on the cpp and header files, so even in this phase we do not need to build or obtain all the obj files. If one of the libraries is not in the cache, we go down only for that library. It is easy to see that if there are big branches of code that we do not touch, we obtain their top-level artifacts directly, which saves all kinds of resources and time.

Another important aspect of this approach is the lowered expectations on the compiler tools. We are not going to ignore a cached artifact simply because the tools we use produce slightly different binaries from run to run. All we care about is that the tools produce semantically the same results. The differences could involve timestamps, a different order of objects, or randomly generated uuids. Because we base the cache on the source files, if the artifacts produced at some point are semantically the same as what we would have achieved by running the tools again (and this is a basic expectation for every tool), we are good to go.
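The top-to-bottom lookup can be sketched as follows, assuming the cache key of an objective is a hash over the contents of its transitive source files and nothing else. The graph layout, the in-memory `cache` dictionary, and the `obtain` function are assumptions made for illustration, and the actual tool invocation is replaced by a placeholder string:

```python
import hashlib


def transitive_sources(objective, graph):
    """Names that are not keys of `graph` are treated as source files."""
    found = set()
    for child in graph[objective]:
        if child in graph:
            found |= transitive_sources(child, graph)
        else:
            found.add(child)
    return found


def source_key(objective, graph, contents):
    """Key an objective by its sources only, never by intermediate artifacts."""
    digest = hashlib.sha256(objective.encode())
    for path in sorted(transitive_sources(objective, graph)):
        digest.update(b"\0" + path.encode() + b"\0")
        digest.update(hashlib.sha256(contents[path]).digest())
    return digest.hexdigest()


def obtain(objective, graph, contents, cache, built):
    """Try the cache at the top first; descend only into branches that miss."""
    key = source_key(objective, graph, contents)
    if key in cache:
        return cache[key]
    for child in graph[objective]:
        if child in graph:
            obtain(child, graph, contents, cache, built)
    built.append(objective)  # placeholder for running the real tool
    cache[key] = f"{objective}@{key[:8]}"
    return cache[key]


# Hypothetical project: one executable and two static libraries.
graph = {
    "app": ["main.cpp", "libfoo", "libbar"],
    "libfoo": ["foo.cpp", "foo.h"],
    "libbar": ["bar.cpp"],
}
contents = {
    "main.cpp": b"int main() {}",
    "foo.cpp": b"// foo",
    "foo.h": b"// foo.h",
    "bar.cpp": b"// bar",
}
```

With a warm cache, asking for `app` performs a single lookup. After a change to `foo.cpp`, only `libfoo` and `app` are rebuilt, while `libbar` is taken from the cache without looking at any obj file.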

IDE integration

Dbs provides what is necessary to translate its targets into IDE projects where developers are able to enjoy code completion and the other goodies that IDEs provide nowadays, without necessarily aiming for completely functional projects.

Componentization

Dbs is designed to support multi-component projects that are easy to develop together or separately. Dbs supports tracking dependencies between different components. It supports top-to-bottom and bottom-to-top dependency tracking to ensure the best performance. But most importantly, it is designed to handle humongous flat projects.

Build priorities

For dbs, priorities are not based on dependencies. Every objective may be added with a different priority. Because of the client/server architecture of dbs, commands from multiple terminals will not step on each other fighting for resources; they will be prioritized accordingly. For example:

terminal 1: dbs test component1 
terminal 2: dbs build component2
terminal 3: dbs --top build test component3

will start testing component1. When the second command is received, building component2 is added to the queue. When the command in terminal 3 is executed, building component3 is added to the top of the queue, with testing of component3 immediately under it. Note that the current objectives do not need to finish before that; sooner or later, depending on the currently active task, they will free the resources for the tasks with priority.
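The queue ordering in this example can be sketched as below. This is only a model of the ordering, assuming a single shared queue on the server side; the `ObjectiveQueue` class is hypothetical, and scheduling, preemption, and resource handover are left out:

```python
from collections import deque


class ObjectiveQueue:
    """Shared queue of objectives; position in the queue is the priority."""

    def __init__(self):
        self._queue = deque()

    def submit(self, objectives, top=False):
        if top:
            # extendleft reverses its argument, so reverse first to keep
            # the objectives of one command in their original order.
            self._queue.extendleft(reversed(objectives))
        else:
            self._queue.extend(objectives)

    def pending(self):
        return list(self._queue)


queue = ObjectiveQueue()
queue.submit(["test component1"])                              # terminal 1
queue.submit(["build component2"])                             # terminal 2
queue.submit(["build component3", "test component3"], top=True)  # terminal 3
```

After the three commands, the component3 objectives sit in front of the objectives from terminals 1 and 2, in the order they were given.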

Incremental builds

Dbs does not build incrementally; it builds continuously. The client/server/servers architecture allows for unsolicited observation of the development process, which can continuously trigger background build tasks.

Before and after the line objectives

When using dbs, you can have objectives that you would like to wait for and observe on the terminal, and objectives that you would like executed in the background, where the feedback can be provided through other channels.

terminal: dbs build component1 test component1 ! clang-tidy component1

This command builds and tests component1 with priority, and then the command exits. Dbs then continues running clang-tidy in the background.

Note: background commands will utilize only idle system resources.
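Splitting a command at the line could be as simple as the sketch below. It assumes the `!` token is the only separator and leaves the interpretation of the objectives themselves to later stages; `split_at_line` is a hypothetical helper, not a dbs function:

```python
def split_at_line(args):
    """Split command arguments into (before the line, after the line)."""
    if "!" in args:
        index = args.index("!")
        return args[:index], args[index + 1:]
    return list(args), []


command = "build component1 test component1 ! clang-tidy component1"
foreground, background = split_at_line(command.split())
```

The client would wait for everything in `foreground` and hand `background` to the server before exiting.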

terminal: dbs clang-tidy component1 

When this command is executed at some later point, it benefits from the already evaluated subtasks, and if all of them are already done it is instantaneous. Note that the cached result of a command is not limited to the output file(s). The stderr is preserved so it can be shown instantaneously for tasks whose sources have not changed, instead of re-evaluating them simply to achieve the same result. In some cases even the stdout could be preserved, say when the compiler returns a warning and is not set to treat it as an error.
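A cached result that carries its diagnostics along with its output files might look like this sketch. The `CachedResult` record, the `run_cached` helper, and the sample warning text are assumptions for illustration; the key is expected to be derived from the sources of the task:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedResult:
    outputs: dict     # output path -> file contents
    stderr: str = ""  # diagnostics to replay on a cache hit
    stdout: str = ""


def run_cached(key, cache, run_tool, emit):
    """Run the tool only on a miss, but always show its diagnostics."""
    result = cache.get(key)
    if result is None:
        result = run_tool()
        cache[key] = result
    emit(result.stderr)
    return result


runs = []      # records each real tool invocation
shown = []     # records what the user would see on the terminal


def fake_compiler():
    runs.append("compile file1.cpp")
    return CachedResult(
        outputs={"file1.o": b"\x7fELF"},
        stderr="file1.cpp:3: warning: unused variable 'x'",
    )


result_cache = {}
run_cached("key-for-file1", result_cache, fake_compiler, shown.append)
run_cached("key-for-file1", result_cache, fake_compiler, shown.append)
```

The compiler stand-in runs once, yet the warning is shown both times, so a cache hit looks the same on the terminal as a real run.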