How to: versioning

Marina Golosova edited this page Sep 29, 2020 · 6 revisions

DKB system versioning

The version of the whole system is a consistent combination of the versions of all its components, i.e. a combination in which all components can operate together. A change in any component changes the version of the whole system, but the system version should only be updated once all the components are consistent with each other again.

There are three main components of the DKB architecture:

  1. Integration process (data4es).

    The integration process consists of:

    • combination of stages;
    • management utility (to start/stop/check the process).

    -- and a change in either of these components changes the version of the integration process.

    Comment: In fact, there can be many integration processes within the DKB, each developed independently of the others [1].

    [1] Or not exactly independently, if one process relies on another process' results; but it only means that their versions must be coordinated.

  2. pyDKB library.

    This library is used by the integration process steps. Which means:

    • when a new library version is introduced to the process -- the process' version also changes (because its stages change);
    • after some changes in the library the integration process may require changes as well.
  3. API server.

    The API server is developed independently of the rest of the code and depends on the integration process only through the storage scheme: what data are integrated and how they can be accessed. This means:

    • some changes in the integration process may require changes in the API server code.

Currently established convention on version tracking

Comment: some items are marked as (new) because they were not clearly discussed before this document was written, but from now on they are to be treated as "established convention". These items can be altered after discussion (see the next section).

  1. The data4es process:

    • versions are currently tracked by the repository tags;
    • tags are annotated (with changelog);
    • new tag is created:
      • manually;
      • by the administrator of the DKB instance at CERN;
      • before switching the instance to the new version of the process;
    • tag looks like v<MAJOR>.<MINOR>-<PATCH>;
    • <MAJOR> number is increased when the process' execution result changes significantly -- so that integrated data must be reindexed and/or the new version breaks compatibility with the API server;
    • <MINOR> number is increased when the process changes semantically (starts integrating new data from already used or new sources);
    • <PATCH> number is increased when the process' implementation was improved, but the results of its work are still the same.
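
The tag format above can be checked and incremented mechanically. The following is a minimal sketch, not part of the DKB code: the names `ProcessVersion`, `parse_tag` and `bump` are hypothetical, and resetting the lower-order numbers to 0 after a bump is an assumption, since the convention does not say what happens to them.

```python
import re
from typing import NamedTuple

# Tag format from the convention: v<MAJOR>.<MINOR>-<PATCH>
TAG_RE = re.compile(r"v(\d+)\.(\d+)-(\d+)")


class ProcessVersion(NamedTuple):
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"v{self.major}.{self.minor}-{self.patch}"


def parse_tag(tag: str) -> ProcessVersion:
    """Parse a tag such as 'v1.2-3'; raise ValueError on any other shape."""
    match = TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a data4es version tag: {tag!r}")
    return ProcessVersion(*(int(part) for part in match.groups()))


def bump(version: ProcessVersion, change: str) -> ProcessVersion:
    """Return the next version for a 'major', 'minor' or 'patch' change.

    Assumption: lower-order numbers restart from 0 after a bump.
    """
    if change == "major":
        return ProcessVersion(version.major + 1, 0, 0)
    if change == "minor":
        return ProcessVersion(version.major, version.minor + 1, 0)
    if change == "patch":
        return version._replace(patch=version.patch + 1)
    raise ValueError(f"unknown change kind: {change!r}")
```

For example, `str(bump(parse_tag("v1.2-3"), "minor"))` gives `v1.3-0` under the reset assumption. The tag itself is still created manually by the instance administrator; the sketch only computes its name.
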
  2. The pyDKB library:

    • versions are tracked by the Utils/Dataflow/pyDKB/VERSION file;
    • version is updated:
      • in each PR where the library code is changed;
      • by the PR author;
      • (new) in a separate commit (and annotated with changelog);
      • (new) once per PR (the PR's commit history should be rewritten if necessary);
    • version looks like <MAJOR>.<MINOR>.<PATCH>;
    • <MAJOR> number is increased if the changes break backwards compatibility (the integration process that used to work with the previous version will fail with the new one);
    • (new) <MINOR> number is increased if some new functionality is introduced (the integration process should probably be adapted to make use of it, but can keep on working as it used to);
    • (new) <PATCH> number is increased if changes improve the library, but do not introduce any new functionality.

    Comment: it means that a <MINOR> number cannot relate to more than one new feature; but since we do not add new features too often, this should be fine.
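
The once-per-PR update of the VERSION file could be scripted roughly as below. This is a sketch under assumptions: the file is assumed to hold a single `<MAJOR>.<MINOR>.<PATCH>` line, the function names are hypothetical, and lower-order numbers are assumed to restart from 0 after a bump (the convention does not specify this).

```python
from pathlib import Path


def next_version(current: str, change: str) -> str:
    """Compute the next <MAJOR>.<MINOR>.<PATCH> string for the given change."""
    major, minor, patch = (int(part) for part in current.strip().split("."))
    if change == "major":    # backwards compatibility is broken
        major, minor, patch = major + 1, 0, 0
    elif change == "minor":  # new functionality is introduced
        minor, patch = minor + 1, 0
    elif change == "patch":  # improvement without new functionality
        patch += 1
    else:
        raise ValueError(f"unknown change kind: {change!r}")
    return f"{major}.{minor}.{patch}"


def bump_version_file(path: Path, change: str) -> str:
    """Rewrite the version file in place and return the new version."""
    new_version = next_version(path.read_text(), change)
    path.write_text(new_version + "\n")
    return new_version
```

A PR author would call something like `bump_version_file(Path("Utils/Dataflow/pyDKB/VERSION"), "minor")` and commit the result separately, with the changelog in the commit message.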

  3. The API server:

    • versions are tracked by the __version__ variable in the Utils/API/server/lib/dkb/api/__init__.py file;
    • version is updated:
      • in each PR where the API server code is changed (even if it changes only the main WSGI application code):
        • (new) PR to master: first commit sets version with -dev* suffix, last one -- removes it;
        • (new) PR to api: last commit updates -dev* suffix (once per PR; commit history may be rewritten if necessary);
      • by the PR author;
      • (new) in a separate commit (and annotated with changelog);
    • version looks like <MAJOR>.<MINOR>.<PATCH>[-devYYYYMMDD[a]];
    • <MAJOR> number is increased if changes break backwards compatibility with the integration process (previously used storage scheme is no longer supported);
    • (new) <MINOR> number is increased if changes introduce new functionality (e.g. a new method added);
    • (new) <PATCH> number is increased if changes improve the library, including improvements in the existing method's response;
    • (new) suffix -dev*:
      • is set to the date of commit formatted as devYYYYMMDD;
      • may be extended with additional symbols like dev20200917 -> dev20200917a -> dev20200917b if more than one update happened on the same date;
      • added in the api branch at the beginning of a new development iteration (the commit log lists the set of planned changes, after which the new release is expected to be published/merged to master);
      • updated in each PR merged to api (with proper changelog);
      • removed before merging api to master (with full changelog from the last non-dev version).
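
The -dev* suffix rules can be sketched as follows. This illustration is not taken from the API server code: `update_dev_suffix` and `release_version` are hypothetical names, and restricting the extra symbol to one lowercase letter is an assumption based on the dev20200917a / dev20200917b example.

```python
import re
from datetime import date

# <MAJOR>.<MINOR>.<PATCH>[-devYYYYMMDD[a]]
VERSION_RE = re.compile(r"(\d+\.\d+\.\d+)(?:-dev(\d{8})([a-z]?))?")


def update_dev_suffix(version: str, today: date) -> str:
    """Return the version with its -dev* suffix updated for a commit made today."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    base, stamp, letter = match.groups()
    new_stamp = today.strftime("%Y%m%d")
    if stamp != new_stamp:
        # No suffix yet, or the last update was on another date.
        return f"{base}-dev{new_stamp}"
    # Another update on the same date: no letter -> 'a', 'a' -> 'b', ...
    if letter == "z":
        raise ValueError("ran out of same-day suffix letters")
    next_letter = chr(ord(letter) + 1) if letter else "a"
    return f"{base}-dev{new_stamp}{next_letter}"


def release_version(version: str) -> str:
    """Drop the -dev* suffix, as done before merging api to master."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    return match.group(1)
```

The sketch leaves the `<MAJOR>.<MINOR>.<PATCH>` part untouched; choosing the target version at the start of an iteration remains a decision of the PR author.
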
  4. The whole system:

    • is not versioned right now at all.

Discussion on improving version tracking

https://trello.com/c/FyFoGWST