Change to a hybrid event sourcing model for CAs and Repository. (#426)
This allows us to keep the full history of semantically important events, while not spamming the history and avoiding excessive use of disk space. See issues #370 and #423.

This is a substantial change. Highlights follow:
* Added a developer documentation section
* No longer using events for manifest/crl generation (#370)
* No longer using events for publication deltas (#423)
* Removed pre 0.6.0 migration code - people will have to upgrade to at least 0.6.0 first
* Added migration code for upgrading from 0.6.0-0.8.1 to this version
* Migrate repository by doing a keyroll. (#370)
* Remove archiving code for commands (no longer applicable)

Minor other fixes:
* Use a swap file when writing (avoid corrupt json if disk is full) (#370)
* Make removing publisher content idempotent for publishers already removed.
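The swap-file fix mentioned above can be sketched roughly as follows. This is an illustrative example, not Krill's actual implementation; the function name and file paths are made up:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Write `contents` to `path` via a temporary swap file, then rename.
/// A rename within the same directory is atomic on most platforms, so a
/// full disk or a crash mid-write leaves the old file intact instead of
/// leaving behind a truncated, corrupt JSON file.
fn write_atomically(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?; // flush to disk before the rename
    }
    fs::rename(&tmp, path)
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("krill_demo.json");
    write_atomically(&path, b"{\"serial\": 1}")?;
    assert_eq!(fs::read_to_string(&path)?, "{\"serial\": 1}");
    println!("wrote {} atomically", path.display());
    Ok(())
}
```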

Co-authored-by: Ximon Eighteen <3304436+ximon18@users.noreply.github.com>
Co-authored-by: Jasper den Hertog <jasper@plainspace.com>
3 people committed Mar 17, 2021
1 parent cf1f8a9 commit e662c15
Showing 1,449 changed files with 24,778 additions and 59,541 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -33,7 +33,7 @@ intervaltree = "0.2.6"
jmespatch = { version = "^0.3", features = ["sync"], optional = true }
libflate = "^1.0"
log = "^0.4"
-openidconnect = { version = "^2.0.0-alpha", optional = true, default_features = false }
+openidconnect = { version = "^2.0.0-alpha.3", optional = true, default_features = false }
openssl = { version = "^0.10", features = ["v110"] }
oso = { version = "^0.8", optional = true }
regex = { version = "^1.4", optional = true }
28 changes: 0 additions & 28 deletions defaults/krill-pubd.conf
@@ -109,34 +109,6 @@
# #
######################################################################################

# Archive publication events after X days. If you do NOT set this
# value then Krill will not do any archiving.
#
# If enabled this option will make sure that the republish events,
# where your CA simply generates a new Manifest and CRL are archived
# after the given number of days.
#
# If you run Krill as a Publication Server then this option will
# enable the archiving of publication deltas received by your server
# from CAs.
#
# Archived commands (containing e.g. details of when the change happened),
# and following events will be moved to an "archived" subdirectory under
# your data directory as follows:
# $data_dir/pubd/0/archived <-- for Publication Server
# $data_dir/cas/ca/archived <-- for a CA named 'ca'
#
# If you want to save space you can delete the files from these archived
# directories, e.g. from cron. However, you could also archive them in
# a different way: e.g. compress and move to long term storage. Krill will
# no longer need this data, but if you ever wanted to see these details in
# history you would need to move them back into the parent directory of
# the 'archived' directory.
#
# To enable this set the following key value pair.
#
### archive_threshold_days = 7

# Restrict size of messages sent to the API
#
# Default 256 kB
93 changes: 93 additions & 0 deletions doc/development/01_daemon.md
@@ -0,0 +1,93 @@
Krill Daemon Setup
==================

Overview
--------

Here we will explain how the Krill daemon is layered and handles requests. This layering works the
same way whether the daemon is running as a Certification Authority, Publication Server, or both. The
components described here are responsible for:
* Parsing configuration
* Starting Krill
* Triggering and executing data migrations on upgrade
* Handling HTTPS requests
* Handling authorization
* Background jobs

Ultimately, the actual requests coming from either the API or background jobs are dispatched to the
`CaServer` or the `RepositoryManager`, which are set up using the provided config (e.g. instructing these
components where their data is stored). In theory those components could also be wrapped in a different way in
the future, e.g. to support serverless setups using AWS Lambda functions, provided of course that authorization,
configuration, and concurrency are handled.

Binaries
--------

The project includes two binaries which can be used to start a Krill daemon. These binaries are fairly
thin executables which are responsible for parsing a configuration file, setting the operation mode, and
then starting the `HTTPS Server` which includes the real `KrillServer`.

Typically the `krill` binary is used to start Krill as a Certification Authority server, while `krillpubd`
is used to start it as a dedicated Publication Server. That said, mixed operation is also possible as we
will explain below.


HTTPS Server
------------

Krill uses [hyper](https://hyper.rs/) as an HTTPS server. The setup for this is done in the `start_krill_daemon`
function in `src/daemon/http/server.rs`. This function performs the following steps:

* Creates the PID file.
* Verifies that the configured data directory is usable.
* Runs 'pre-start' upgrades (e.g. data structure migrations) before state is built.
* Instantiates a `KrillServer`, which will guard all state.
* Creates a self-signed TLS certificate, unless one was prepared earlier.
* Builds a `hyper` server which then connects to the configured port and handles connections.
* This server keeps running until the Krill binary is terminated.

Note that the `hyper` server itself is stateless. Instead it relies on an `Arc<KrillServer>` which can
be cloned cheaply whenever a request is processed. So, we use hyper for the following:
* Extract authentication/authorization information from the request (headers or cookies, depending on config).
* Serve static content for the Krill UI.
* Map requests to API code in `KrillServer` and serve responses.

> Note that for higher level testing we bypass the Krill binaries, and call the function to start the
> HTTPS server directly, with appropriate configuration settings. Have a look at `tests/functional.rs`
> for an example.
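The cheap-clone pattern described above can be illustrated with a minimal sketch. The `KrillServer` stand-in and handler below are hypothetical, not Krill's real types:

```rust
use std::sync::Arc;

// Illustrative stand-in for KrillServer; the real type lives in Krill itself.
struct KrillServer {
    name: String,
}

fn handle_request(server: Arc<KrillServer>) -> String {
    // Each request handler receives its own clone of the Arc. Cloning an
    // Arc only bumps a reference count; the KrillServer state is shared,
    // never copied.
    format!("handled by {}", server.name)
}

fn main() {
    let server = Arc::new(KrillServer { name: "krill".to_string() });
    // Simulate two requests, each getting a cheap clone of the Arc.
    let r1 = handle_request(Arc::clone(&server));
    let r2 = handle_request(Arc::clone(&server));
    assert_eq!(Arc::strong_count(&server), 1); // clones dropped after handling
    println!("{} / {}", r1, r2);
}
```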

KrillServer
-----------

This is the main daemon component that runs Krill. It does not do the actual processing itself; rather, it is responsible
for running the following components and mapping calls to them (we will describe each component in more detail later):

| Element | Code Path | Responsibility |
| ------------------- | ----------------------------- | ----------------------------------------------------------- |
| `CaServer` | src/daemon/ca/server.rs | Manages Krill CAs. |
| `RepositoryManager` | src/pubd/pubserver.rs | Manages access to and content of the repository. |
| `Scheduler` | src/daemon/scheduler.rs | Schedules and executes background jobs. |
| `Authorizer` | src/daemon/auth/authorizer.rs | Verifies authentication and authorization for API requests. |
| `BgpAnalyser` | src/commons/bgp/analyser.rs | Compares authorizations to BGP, downloads RIS whois dumps. |


KrillMode
---------

The `KrillServer` elements are initialized based on which `KrillMode` is selected. The following modes are possible:

| KrillMode | Operation                                                                          |
| --------- | ---------------------------------------------------------------------------------- |
| Pubd      | The KrillServer will have Some(PubServer), but no (None) CaServer                  |
| Ca        | The KrillServer will have Some(CaServer), but no (None) PubServer                  |
| Mixed     | The KrillServer will have both a CaServer and a PubServer                          |
| Testbed   | Krill runs in test mode. It will have a PubServer, CaServer **AND** an embedded TA |

If Krill is started with the `krillpubd` binary, then the mode will always be `KrillMode::Pubd`. If it is started with the
`krill` binary, then the mode will *normally* be `KrillMode::Ca`. However, for backward compatibility with existing deployments,
the KrillServer will change this mode to `KrillMode::Mixed` if it finds that a data directory exists for an initialized
Publication Server with at least one active `Publisher`. `KrillMode::Testbed` can be enabled by setting the rsync and RRDP
base URIs for the testbed Publication Server through the environment variables `KRILL_TESTBED_RSYNC` and `KRILL_TESTBED_RRDP`.
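A rough sketch of this mode-selection logic follows. The function and parameter names are hypothetical; the real logic lives in Krill's server setup code:

```rust
#[derive(Debug, PartialEq)]
enum KrillMode {
    Pubd,
    Ca,
    Mixed,
    Testbed,
}

// Hypothetical condensation of the selection rules described above.
fn select_mode(started_as_pubd: bool, pubd_data_exists: bool, testbed_env_set: bool) -> KrillMode {
    if started_as_pubd {
        // The krillpubd binary always runs a dedicated Publication Server.
        KrillMode::Pubd
    } else if testbed_env_set {
        // KRILL_TESTBED_RSYNC and KRILL_TESTBED_RRDP were both set.
        KrillMode::Testbed
    } else if pubd_data_exists {
        // Backward compatibility: an initialized Publication Server with
        // at least one active publisher upgrades Ca mode to Mixed.
        KrillMode::Mixed
    } else {
        KrillMode::Ca
    }
}

fn main() {
    assert_eq!(select_mode(true, false, false), KrillMode::Pubd);
    assert_eq!(select_mode(false, true, false), KrillMode::Mixed);
    assert_eq!(select_mode(false, false, false), KrillMode::Ca);
    println!("mode selection ok");
}
```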


30 changes: 30 additions & 0 deletions doc/development/02_cli.md
@@ -0,0 +1,30 @@
Krill Command Line Client Setup
===============================

There are two CLI binaries included in Krill: `krillc` is intended to manage Certification
Authorities, and `krillpubc` is used to manage a Publication Server.

Essentially, the CLIs are a thin, convenient way to access the Krill API and present responses
to the user. They parse command line arguments and/or files supplied by the user (where applicable),
and query or post (JSON) to the appropriate API end-point. Responses can be displayed as JSON or
plain text.

From a development point of view it's important to know that the argument parsing by the CLIs
is tested manually. This can lead to issues, as there is no strong typing enforced by the `clap`
library that we use. So: check carefully whenever the argument parsing changes.

What **is** tested properly is the underlying code used by the CLIs to submit data and process
server responses. Our test code bypasses the command line parsing, but it uses the same underlying
code in the higher level tests such as `tests/functional.rs` in order to interact with a running
Krill instance.

The code can be found under `src/cli`. An overview of the most important elements follows:

| Element | Code Path | Responsibility |
|---------------------|------------------------------|----------------------------------------------------------------------|
| `KrillClient` | src/cli/client.rs | The client code for Krill CA operations. |
| `KrillPubdClient` | src/cli/client.rs | The client code for Krill Publication Server operations. |
| `Command` | src/cli/options.rs | Enum for the intended CA command. |
| `PublishersCommand` | src/cli/options.rs | Enum for the intended Publication Server command. |
| `ApiResponse` | src/cli/report.rs | Structure to represent API responses. |

60 changes: 60 additions & 0 deletions doc/development/03_es_concepts.md
@@ -0,0 +1,60 @@
Event Sourcing Overview
=======================

The Audit is the Truth
----------------------

Event Sourcing is a technique based on the concept that the current state of a thing is
derived from a full record of past events, which can be replayed, rather than saved and loaded
using something like object-relational mapping or serialization.

A commonly used example to explain this concept is bookkeeping. In bookkeeping one tracks
all financial transactions, like money coming in and going out of an account. The current
state of an account is determined by the accumulation of all these changes over time. Using
this approach, and assuming that no one altered the historical records, we can be 100% certain
that the current state reflects reality. Conveniently, we also get a full audit trail that
can be shown.
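The bookkeeping example can be sketched in a few lines of Rust (illustrative only): the balance is never stored, it is always replayed from the transaction history.

```rust
// A minimal event-sourced account: the current balance is derived by
// replaying all past transactions rather than loading a stored value.
enum Transaction {
    Deposit(i64),
    Withdrawal(i64),
}

fn replay(events: &[Transaction]) -> i64 {
    events.iter().fold(0, |balance, e| match e {
        Transaction::Deposit(amount) => balance + amount,
        Transaction::Withdrawal(amount) => balance - amount,
    })
}

fn main() {
    // The full history doubles as the audit trail.
    let history = vec![
        Transaction::Deposit(100),
        Transaction::Withdrawal(30),
        Transaction::Deposit(5),
    ];
    assert_eq!(replay(&history), 75);
    println!("balance = {}", replay(&history));
}
```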


CQRS and Domain Driven Design
-----------------------------

Event Sourcing is often combined with CQRS: Command Query Responsibility Segregation. The
idea here is that there is a separation between intended changes (commands) sent to your
entity, the internal state of the entity, and its external representation.

By separating these concerns we can also borrow heavily from ["Domain Driven Design"](https://en.wikipedia.org/wiki/Domain-driven_design).
In DDD, related structures are organized into so-called "aggregates". At
their root they are joined by an "aggregate root".

Commands, Events and Data
-------------------------

This separation means that the aggregate is a bit like a black box to the outside world,
but in a positive sense. Users of the code just need to know what messages of intent (commands)
they can send to it. This interface, like an API, should be fairly stable.

When a command is sent, the first thing that happens is that the aggregate is retrieved from
storage so that it can receive the command. The state of the aggregate is the result of all past
events - but, because replaying all of them can get slow, aggregate snapshots are often used for
efficiency. If a snapshot does not include the latest events, then they are simply re-applied to it.

When an aggregate receives a command, it determines whether the change can be applied. It is fully
in charge of its own consistency - if it isn't, then the aggregate root is usually at the
wrong level. The result of the command is either an error - the command is rejected - or
a number of events representing state changes, which are applied to the aggregate.
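A minimal sketch of this command/event split, using the bookkeeping example again (the names are illustrative, not Krill's):

```rust
// Sketch of command handling: the aggregate validates the command and
// either rejects it or returns events, which are then applied.
#[derive(Debug, PartialEq)]
enum Event {
    Withdrawn(i64),
}

struct Account {
    balance: i64,
}

impl Account {
    // Command phase: validate and return events. No mutation happens here,
    // so the aggregate stays fully in charge of its own consistency.
    fn process_withdraw(&self, amount: i64) -> Result<Vec<Event>, String> {
        if amount > self.balance {
            Err("insufficient funds".to_string())
        } else {
            Ok(vec![Event::Withdrawn(amount)])
        }
    }

    // Apply phase: events mutate state. In a real system they would also
    // be saved here so they can be replayed later (persistence omitted).
    fn apply(&mut self, event: &Event) {
        match event {
            Event::Withdrawn(amount) => self.balance -= amount,
        }
    }
}

fn main() {
    let mut account = Account { balance: 50 };
    let events = account.process_withdraw(20).unwrap();
    for e in &events {
        account.apply(e);
    }
    assert_eq!(account.balance, 30);
    assert!(account.process_withdraw(100).is_err()); // command rejected
    println!("balance = {}", account.balance);
}
```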

When events are applied they are also saved, so that they can be replayed later. Furthermore,
implementations often use message queues where events are posted as well. This allows other
components in the code to be triggered by changes in an aggregate.

In its purest form (hint: we don't do this), these messages can then be used to (re-)generate
one or more data representations that users can *query*. E.g. you could populate data tables
in a SQL database if that floats your boat.

Credits / Read More
-------------------

This combination of techniques has been championed by various people, most notably Greg Young
and Martin Fowler. You can do your own internet search to find out much more about how this can
work, and how it is done in other projects.
