Change to a hybrid event sourcing model for CAs and Repository. (#426)
This allows us to keep the full history of semantically important events, while not spamming the history and avoiding excessive use of disk space. See issues #370 and #423.

This is a substantial change. Highlights follow:
* Added a developer documentation section
* No longer using events for manifest/crl generation (#370)
* No longer using events for publication deltas (#423)
* Removed pre 0.6.0 migration code - people will have to upgrade to at least 0.6.0 first
* Added migration code for upgrading from 0.6.0-0.8.1 to this version
* Migrate repository by doing a keyroll. (#370)
* Remove archiving code for commands (no longer applicable)

Minor other fixes:
* Use a swap file when writing (avoid corrupt json if disk is full) (#370)
* Make removing publisher content idempotent for publishers already removed.
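The swap-file fix mentioned above can be sketched roughly as follows. This is an illustrative example, not Krill's actual implementation; the function name and file paths are made up:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Write `contents` to `path` via a temporary swap file, then rename.
/// A rename within the same directory is atomic on most platforms, so a
/// full disk or a crash mid-write leaves the old file intact instead of
/// leaving behind a truncated, corrupt JSON file.
fn write_atomically(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?; // flush to disk before the rename
    }
    fs::rename(&tmp, path)
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("krill_demo.json");
    write_atomically(&path, b"{\"serial\": 1}")?;
    assert_eq!(fs::read_to_string(&path)?, "{\"serial\": 1}");
    println!("wrote {} atomically", path.display());
    Ok(())
}
```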

Co-authored-by: Ximon Eighteen <3304436+ximon18@users.noreply.github.com>
Co-authored-by: Jasper den Hertog <jasper@plainspace.com>
3 people committed Mar 17, 2021
1 parent cf1f8a9 commit e662c15
Showing 1,449 changed files with 24,778 additions and 59,541 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -33,7 +33,7 @@ intervaltree = "0.2.6"
jmespatch = { version = "^0.3", features = ["sync"], optional = true }
libflate = "^1.0"
log = "^0.4"
-openidconnect = { version = "^2.0.0-alpha", optional = true, default_features = false }
+openidconnect = { version = "^2.0.0-alpha.3", optional = true, default_features = false }
openssl = { version = "^0.10", features = ["v110"] }
oso = { version = "^0.8", optional = true }
regex = { version = "^1.4", optional = true }
28 changes: 0 additions & 28 deletions defaults/krill-pubd.conf
@@ -109,34 +109,6 @@
# #
######################################################################################

# Archive publication events after X days. If you do NOT set this
# value then Krill will not do any archiving.
#
# If enabled this option will make sure that the republish events,
# where your CA simply generates a new Manifest and CRL are archived
# after the given number of days.
#
# If you run Krill as a Publication Server then this option will
# enable the archiving of publication deltas received by your server
# from CAs.
#
# Archived commands (containing e.g. details of when the change happened),
# and following events will be moved to an "archived" subdirectory under
# your data directory as follows:
# $data_dir/pubd/0/archived <-- for Publication Server
# $data_dir/cas/ca/archived <-- for a CA named 'ca'
#
# If you want to save space you can delete the files from these archived
# directories, e.g. from cron. However, you could also archive them in
# a different way: e.g. compress and move to long term storage. Krill will
# no longer need this data, but if you ever wanted to see these details in
# history you would need to move them back into the parent directory of
# the 'archived' directory.
#
# To enable this set the following key value pair.
#
### archive_threshold_days = 7

# Restrict size of messages sent to the API
#
# Default 256 kB
93 changes: 93 additions & 0 deletions doc/development/01_daemon.md
@@ -0,0 +1,93 @@
Krill Daemon Setup
==================

Overview
--------

Here we will explain how the Krill daemon is layered and handles requests. This layering works the
same way whether the daemon is running as a Certification Authority, Publication Server, or both. The
components described here are responsible for:
* Parsing configuration
* Starting Krill
* Triggering and executing data migrations on upgrade
* Handling HTTPS requests
* Handling authorization
* Background jobs

Ultimately, the actual requests coming from either the API or background jobs are dispatched to the
`CaServer` or the `RepositoryManager`, which are set up using the provided config (e.g. instructing these
components where their data is stored). In theory those components could also be wrapped in a different way in
the future, e.g. to support serverless setups using AWS Lambda functions, provided of course that authorization,
configuration, and concurrency are handled.

Binaries
--------

The project includes two binaries which can be used to start a Krill daemon. These binaries are fairly
thin executables which are responsible for parsing a configuration file, setting the operation mode, and
then starting the `HTTPS Server` which includes the real `KrillServer`.

Typically the `krill` binary is used to start Krill as a Certification Authority server, while `krillpubd`
is used to start it as a dedicated Publication Server. That said, mixed operation is also possible as we
will explain below.


HTTPS Server
------------

Krill uses [hyper](https://hyper.rs/) as an HTTPS server. The setup for this is done in the `start_krill_daemon`
function in `src/daemon/http/server.rs`. This function performs the following steps:

* Creates the PID file.
* Verifies that the configured data directory is usable.
* Runs 'pre-start' upgrades (e.g. data structure migrations) before state is built.
* Instantiates a `KrillServer`, which will guard all state.
* Creates a self-signed TLS certificate, unless one was prepared earlier.
* Builds a `hyper` server which then connects to the configured port and handles connections.
* This server keeps running until the Krill binary is terminated.

Note that the `hyper` server itself is stateless. Instead it relies on an `Arc<KrillServer>` which can
be cloned cheaply whenever a request is processed. So, we use hyper for the following:
* Extract authentication/authorization information from the request (headers or cookies, depending on config).
* Serve static content for the Krill UI.
* Map requests to API code in `KrillServer` and serve responses.

> Note that for higher level testing we bypass the Krill binaries, and call the function to start the
> HTTPS server directly, with appropriate configuration settings. Have a look at `tests/functional.rs`
> for an example.
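The cheap-clone pattern described above can be illustrated with a minimal sketch. The `KrillServer` stand-in and handler below are hypothetical, not Krill's real types:

```rust
use std::sync::Arc;

// Illustrative stand-in for KrillServer; the real type lives in Krill itself.
struct KrillServer {
    name: String,
}

fn handle_request(server: Arc<KrillServer>) -> String {
    // Each request handler receives its own clone of the Arc. Cloning an
    // Arc only bumps a reference count; the KrillServer state is shared,
    // never copied.
    format!("handled by {}", server.name)
}

fn main() {
    let server = Arc::new(KrillServer { name: "krill".to_string() });
    // Simulate two requests, each getting a cheap clone of the Arc.
    let r1 = handle_request(Arc::clone(&server));
    let r2 = handle_request(Arc::clone(&server));
    assert_eq!(Arc::strong_count(&server), 1); // clones dropped after handling
    println!("{} / {}", r1, r2);
}
```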

KrillServer
-----------

This is the main daemon component that runs Krill. It does not do the actual processing itself; rather, it is responsible
for running the following components and mapping calls to them (we will describe each component in more detail later):

| Element | Code Path | Responsibility |
| ------------------- | ----------------------------- | ----------------------------------------------------------- |
| `CaServer` | src/daemon/ca/server.rs | Manages Krill CAs. |
| `RepositoryManager` | src/pubd/pubserver.rs | Manages access to and content of the repository. |
| `Scheduler` | src/daemon/scheduler.rs | Schedules and executes background jobs. |
| `Authorizer` | src/daemon/auth/authorizer.rs | Verifies authentication and authorization for API requests. |
| `BgpAnalyser` | src/commons/bgp/analyser.rs | Compares authorizations to BGP, downloads RIS whois dumps. |


KrillMode
---------

The `KrillServer` elements are initialized based on which `KrillMode` is selected. The following modes are possible:

| KrillMode | Operation                                                                          |
| --------- | ---------------------------------------------------------------------------------- |
| Pubd      | The KrillServer will have Some(PubServer), but no (None) CaServer                  |
| Ca        | The KrillServer will have Some(CaServer), but no (None) PubServer                  |
| Mixed     | The KrillServer will have both a CaServer and a PubServer                          |
| Testbed   | Krill runs in test mode. It will have a PubServer, CaServer **AND** an embedded TA |

If Krill is started with the `krillpubd` binary, then the mode will always be `KrillMode::Pubd`. If it is started with the
`krill` binary, then the mode will *normally* be `KrillMode::Ca`. However, for backward compatibility with existing deployments,
the KrillServer will change this mode to `KrillMode::Mixed` if it finds that a data directory exists for an initialized
Publication Server with at least one active `Publisher`. `KrillMode::Testbed` can be enabled by setting the rsync and RRDP
base URIs for the testbed Publication Server through the environment variables `KRILL_TESTBED_RSYNC` and `KRILL_TESTBED_RRDP`.
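A rough sketch of this mode-selection logic follows. The function and parameter names are hypothetical; the real logic lives in Krill's server setup code:

```rust
#[derive(Debug, PartialEq)]
enum KrillMode {
    Pubd,
    Ca,
    Mixed,
    Testbed,
}

// Hypothetical condensation of the selection rules described above.
fn select_mode(started_as_pubd: bool, pubd_data_exists: bool, testbed_env_set: bool) -> KrillMode {
    if started_as_pubd {
        // The krillpubd binary always runs a dedicated Publication Server.
        KrillMode::Pubd
    } else if testbed_env_set {
        // KRILL_TESTBED_RSYNC and KRILL_TESTBED_RRDP were both set.
        KrillMode::Testbed
    } else if pubd_data_exists {
        // Backward compatibility: an initialized Publication Server with
        // at least one active publisher upgrades Ca mode to Mixed.
        KrillMode::Mixed
    } else {
        KrillMode::Ca
    }
}

fn main() {
    assert_eq!(select_mode(true, false, false), KrillMode::Pubd);
    assert_eq!(select_mode(false, true, false), KrillMode::Mixed);
    assert_eq!(select_mode(false, false, false), KrillMode::Ca);
    println!("mode selection ok");
}
```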


30 changes: 30 additions & 0 deletions doc/development/02_cli.md
@@ -0,0 +1,30 @@
Krill Command Line Client Setup
===============================

There are two CLI binaries included in Krill: `krillc` is intended to manage Certification
Authorities, and `krillpubc` is used to manage a Publication Server.

Essentially, the CLIs are a thin, convenient way to access the Krill API and present responses
to the user. They parse command line arguments and/or files supplied by the user (where applicable),
and query or post (JSON) to the appropriate API end-point. Responses can be displayed as JSON or
plain text.

From a development point of view it's important to know that the argument parsing by the CLIs
is tested manually. This can lead to issues, as there is no strong typing enforced by the `clap`
library that we use. So: check carefully whenever the argument parsing changes.

What **is** tested properly is the underlying code used by the CLIs to submit data and process
server responses. Our test code bypasses the command line parsing, but it uses the same underlying
code in the higher level tests such as `tests/functional.rs` in order to interact with a running
Krill instance.

The code can be found under `src/cli`. An overview of the most important elements follows:

| Element | Code Path | Responsibility |
|---------------------|------------------------------|----------------------------------------------------------------------|
| `KrillClient` | src/cli/client.rs | The client code for Krill CA operations. |
| `KrillPubdClient` | src/cli/client.rs | The client code for Krill Publication Server operations. |
| `Command` | src/cli/options.rs | Enum for the intended CA command. |
| `PublishersCommand` | src/cli/options.rs | Enum for the intended Publication Server command. |
| `ApiResponse` | src/cli/report.rs | Structure to represent API responses. |

60 changes: 60 additions & 0 deletions doc/development/03_es_concepts.md
@@ -0,0 +1,60 @@
Event Sourcing Overview
=======================

The Audit is the Truth
----------------------

Event Sourcing is a technique based on the concept that the current state of a thing is
derived from a full record of past events, which can be replayed, rather than saved and loaded
using something like object-relational mapping or serialization.

A commonly used example to explain this concept is bookkeeping. In bookkeeping one tracks
all financial transactions, like money coming in and going out of an account. The current
state of an account is determined by the accumulation of all these changes over time. Using
this approach, and assuming that no one altered the historical records, we can be 100% certain
that the current state reflects reality. Conveniently, we also get a full audit trail that
can be shown.
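The bookkeeping example can be sketched in a few lines of Rust (illustrative only): the balance is never stored, it is always replayed from the transaction history.

```rust
// A minimal event-sourced account: the current balance is derived by
// replaying all past transactions rather than loading a stored value.
enum Transaction {
    Deposit(i64),
    Withdrawal(i64),
}

fn replay(events: &[Transaction]) -> i64 {
    events.iter().fold(0, |balance, e| match e {
        Transaction::Deposit(amount) => balance + amount,
        Transaction::Withdrawal(amount) => balance - amount,
    })
}

fn main() {
    // The full history doubles as the audit trail.
    let history = vec![
        Transaction::Deposit(100),
        Transaction::Withdrawal(30),
        Transaction::Deposit(5),
    ];
    assert_eq!(replay(&history), 75);
    println!("balance = {}", replay(&history));
}
```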


CQRS and Domain Driven Design
-----------------------------

Event Sourcing is often combined with CQRS: Command Query Responsibility Segregation. The
idea here is that there is a separation between intended changes (commands) sent to your
entity, the internal state of the entity, and its external representation.

By separating these concerns we can also borrow heavily from ["Domain Driven Design"](https://en.wikipedia.org/wiki/Domain-driven_design).
In DDD, related structures are organized into so-called "aggregates". At
their root they are joined by an "aggregate root".

Commands, Events and Data
-------------------------

This separation means that the aggregate is a bit like a black box to the outside world,
but in a positive sense. Users of the code just need to know what messages of intent (commands)
they can send to it. This interface, like an API, should be fairly stable.

When a command is sent, the first thing that happens is that the aggregate is retrieved from
storage so that it can receive the command. The state of the aggregate is the result of all past
events - but, because replaying all of them can get slow, aggregate snapshots are often used for
efficiency. If a snapshot does not include the latest events, then they are simply re-applied to it.

When an aggregate receives a command, it determines whether the change can be applied. It is fully
in charge of its own consistency - if it isn't, then the aggregate root is usually at the
wrong level. The result of the command is either an error - the command is rejected - or
a number of events representing state changes, which are applied to the aggregate.
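A minimal sketch of this command/event split, using the bookkeeping example again (the names are illustrative, not Krill's):

```rust
// Sketch of command handling: the aggregate validates the command and
// either rejects it or returns events, which are then applied.
#[derive(Debug, PartialEq)]
enum Event {
    Withdrawn(i64),
}

struct Account {
    balance: i64,
}

impl Account {
    // Command phase: validate and return events. No mutation happens here,
    // so the aggregate stays fully in charge of its own consistency.
    fn process_withdraw(&self, amount: i64) -> Result<Vec<Event>, String> {
        if amount > self.balance {
            Err("insufficient funds".to_string())
        } else {
            Ok(vec![Event::Withdrawn(amount)])
        }
    }

    // Apply phase: events mutate state. In a real system they would also
    // be saved here so they can be replayed later (persistence omitted).
    fn apply(&mut self, event: &Event) {
        match event {
            Event::Withdrawn(amount) => self.balance -= amount,
        }
    }
}

fn main() {
    let mut account = Account { balance: 50 };
    let events = account.process_withdraw(20).unwrap();
    for e in &events {
        account.apply(e);
    }
    assert_eq!(account.balance, 30);
    assert!(account.process_withdraw(100).is_err()); // command rejected
    println!("balance = {}", account.balance);
}
```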

When events are applied they are also saved, so that they can be replayed later. Furthermore,
implementations often use message queues where events are posted as well. This allows other
components in the code to be triggered by changes in an aggregate.

In its purest form (hint: we don't do this), these messages can then be used to (re-)generate
one or more data representations that users can *query*. E.g. you could populate data tables
in a SQL database if that floats your boat.

Credits / Read More
-------------------

This combination of techniques has been championed by various people, most notably Greg Young
and Martin Fowler. You can do your own internet search to find out much more about how this can
work, and how it is done in other projects.
