docs: rework mdbook

this commit overhauls the ClairV4 documentation

Signed-off-by: ldelossa <ldelossa@redhat.com>
quay · Sep 14, 2020 · 4f35fd0
1 parent f41fba5
Showing 39 changed files with 3,228 additions and 409 deletions.
diff --git a/Documentation/SUMMARY.md b/Documentation/SUMMARY.md
@@ -1,8 +1,24 @@
 # Summary
 
-- [About](./TODO.md)
-- [Operation](./operation.md)
-- [API](./TODO.md)
-  - [Internal Endpoints](./api_internal.md)
+- [What is ClairV4](./whatis.md)
+- [How Tos](./howto.md)
+  - [Getting Started With ClairV4](./howto/getting_started.md)
+  - [Deployment Models](./howto/deployment.md)
+  - [API Definition](./howto/api.md)
+  - [Testing ClairV4](./howto/testing.md)
+- [Concepts](./concepts.md)
+  - [Authentication](./concepts/authentication.md)
+  - [Internal Endpoints](./concepts/api_internal.md)
+  - [Notifications](./concepts/notifications.md)
 - [Contribution](./TODO.md)
   - [Releases](./contribution/releases.md)
+  - [Commit Style](./contribution/commit_style.md)
+- [Reference](./reference.md)
+  - [Api](./reference/api.md)
+  - [Clairctl](./reference/clairctl.md)
+  - [Config](./reference/config.md)
+  - [Indexer](./reference/indexer.md)
+  - [Matcher](./reference/matcher.md)
+  - [Notifier](./reference/notifier.md)
+
+
diff --git a/Documentation/api_internal.md b/Documentation/api_internal.md
diff --git a/Documentation/clairv4_arch.png b/Documentation/clairv4_arch.png
diff --git a/Documentation/concepts.md b/Documentation/concepts.md
@@ -0,0 +1,6 @@
+# Concepts
+
+The following sections give a conceptual overview of how ClairCore works internally.
+
+- [Authentication](./concepts/authentication.md)
+- [Internal Endpoints](./concepts/api_internal.md)
diff --git a/Documentation/concepts/api_internal.md b/Documentation/concepts/api_internal.md
@@ -0,0 +1,24 @@
+# Internal
+
+Internal endpoints are underneath `/api/v1/internal` and are meant for
+communication between Clair microservices. If Clair is operating in combo mode,
+these endpoints may not exist. Any API ingress should prevent clients
+from reaching these endpoints.
+
+We do not formally expose these APIs in our OpenAPI spec.
+Further information on their usage is left as an exercise for the reader.
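One way an ingress or front-end proxy could enforce the rule above is sketched below in Go. This is a minimal illustration, not code shipped with Clair: `isInternal`, `denyInternal`, the 404 response, and the public example path are all assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// internalPrefix is the path prefix the text reserves for service-to-service calls.
const internalPrefix = "/api/v1/internal"

// isInternal reports whether a request path targets an internal endpoint.
func isInternal(path string) bool {
	return path == internalPrefix || strings.HasPrefix(path, internalPrefix+"/")
}

// denyInternal wraps next and answers 404 for internal paths (hypothetical helper).
func denyInternal(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isInternal(r.URL.Path) {
			http.NotFound(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Stand-in for the real upstream; an ingress would proxy to Clair here.
	upstream := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok\n"))
	})
	h := denyInternal(upstream)

	// Exercise the guard in memory with one public and one internal path.
	for _, p := range []string{"/api/v1/some_public_endpoint", "/api/v1/internal/update_diff/"} {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, p, nil))
		fmt.Println(p, rec.Code)
	}
}
```

Matching on the prefix plus a trailing slash avoids blocking an unrelated path that merely starts with the same characters.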
+
+## Update Diffs
+
+The `update_diff/` endpoint exposes the API for diffing two update operations.
+The notifier uses it to determine which vulnerabilities were added and removed by a security database update.
+
+## Update Operation
+
+The `update_operation` endpoint exposes the API for viewing updater activity.
+The notifier uses it to determine whether new updates have occurred, and then triggers an update diff to see what has changed.
+
+## AffectedManifest
+
+The `affected_manifest` endpoint exposes the API for retrieving affected manifests given a list of Vulnerabilities.
+This is used by the notifier to determine the manifests that need to have a notification generated.
diff --git a/Documentation/operation.md → Documentation/concepts/authentication.md b/Documentation/operation.md → Documentation/concepts/authentication.md
@@ -1,49 +1,4 @@
-# Operation
-
-## Releases
-
-All of the source code needed to build clair is packaged as an archive and
-attached to the release. Releases are tracked at the [github releases].
-
-[github releases]: https://github.com/quay/clair/releases
-
-## Official Containers
-
-Clair is officially packaged and released as a container at
-[quay.io/projectquay/clair]. The `latest` tag tracks the git development branch,
-and version tags are built from the corresponding release.
-
-[quay.io/projectquay/clair]: https://quay.io/repository/projectquay/clair
-
-## Architecture
-
-Clair is structured so that it can be easily scaled with demand. It can be
-broken up into up to 3 microservices as needed ([Indexer], [Matcher], and
-[Notifier]) or run as a single monolith. Each process talks to separate tables
-in the database and is responsible for disparate API endpoints.
-
-[Indexer]: #indexer
-[Matcher]: #matcher
-[Notifier]: #notifier
-
-### Indexer
-
-Responsible for ...
-
-### Matcher
-
-Responsible for ...
-
-### Notifier
-
-Responsible for ...
-
-## Ingress
-
-One recommended configuration is to use some sort of service ingress to route
-API endpoints to the component responsible for servicing it.
-
-## Authentication
+# Authentication
 
 Previous versions of Clair used [jwtproxy] to gate authentication. For ease of
 building and deployment, v4 handles authentication itself.

diff --git a/Documentation/concepts/indexing.md b/Documentation/concepts/indexing.md
@@ -0,0 +1,25 @@
+# Indexing
+
+The [Indexer](../reference/indexer.md) service is responsible for "indexing a manifest".
+
+Indexing involves taking a manifest representing a container image and computing its constituent parts. The indexer tries to determine what packages exist in the image, what distribution the image is derived from, and what package repositories are used within the image. Once this information is computed, it is persisted in an IndexReport.
+
+The IndexReport is an intermediate data structure describing the contents of a container image. This report can be fed to a [Matcher](../reference/matcher.md) node for vulnerability analysis.
+
+## Content Addressability
+
+ClairV4 treats all manifests and layers as [content addressable](https://en.wikipedia.org/wiki/Content-addressable_storage). In the context of ClairV4, this means that once we index a specific manifest we will not index it again unless it is required. The same applies to individual layers. This greatly reduces the amount of work.
+
+For example, consider how many images in a registry may use "ubuntu:artful" as a base layer. It could be a large majority of images if the developers prefer basing their images off Ubuntu. Treating the layers and manifests as content addressable means we will only fetch and scan the Ubuntu base layer once.
+
+There are, of course, conditions where ClairV4 should re-index a manifest.
+
+When ClairV4 updates an internal component such as a package scanner and is then asked to index a layer, it will know to perform the scan with the new package scanner. ClairV4 understands that a component has changed and that the IndexReport may be different this time around.
+
+A client can track ClairV4's `index_state` endpoint to understand when an internal component has changed, and subsequently issue re-indexes. See our [api](../howto/api.md) guide to learn how to view our API specification.
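The behavior described in this section can be illustrated with a toy Go model. It is a sketch under stated assumptions, not ClairCore's implementation: the `Indexer` type, its fields, and the scanner version strings are hypothetical, and a "scan" is reduced to incrementing a counter.

```go
package main

import "fmt"

// scanKey identifies one unit of work: a layer digest scanned by a given scanner version.
type scanKey struct {
	layer   string
	scanner string
}

// Indexer is a toy stand-in that remembers which (layer, scanner) pairs it has scanned.
type Indexer struct {
	scanner string           // current package scanner version (hypothetical)
	seen    map[scanKey]bool // completed scans
	Scans   int              // number of layer scans actually performed
}

// NewIndexer returns an Indexer with no recorded scans.
func NewIndexer(scanner string) *Indexer {
	return &Indexer{scanner: scanner, seen: make(map[scanKey]bool)}
}

// Index scans only the layers not yet scanned by the current scanner.
func (ix *Indexer) Index(layers []string) {
	for _, l := range layers {
		k := scanKey{layer: l, scanner: ix.scanner}
		if ix.seen[k] {
			continue // same digest and same scanner: the earlier result still holds
		}
		ix.seen[k] = true
		ix.Scans++
	}
}

func main() {
	ix := NewIndexer("pkg-scanner-v1")
	ix.Index([]string{"sha256:base", "sha256:app-a"})
	ix.Index([]string{"sha256:base", "sha256:app-b"}) // shared base layer is skipped
	fmt.Println(ix.Scans)                             // 3

	ix.scanner = "pkg-scanner-v2" // an internal component changed
	ix.Index([]string{"sha256:base", "sha256:app-a"})
	fmt.Println(ix.Scans) // 5
}
```

Keying the cache on both the digest and the scanner version is what lets a component change invalidate earlier results without any explicit flush.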
+
+## Summary
+
+In summary you should understand that Indexing is the process ClairV4 uses to understand the contents of a container.
+
+For a more in-depth look at indexing, check out the [ClairCore Documentation](https://quay.github.io/claircore/).
diff --git a/Documentation/concepts/matching.md b/Documentation/concepts/matching.md
@@ -0,0 +1,15 @@
+# Matching
+
+A [Matcher](../reference/matcher.md) node is responsible for matching vulnerabilities to a provided IndexReport. 
+
+By default, Matchers are also responsible for keeping the database of vulnerabilities up to date. Matchers will typically run a set of Updaters which periodically probe their data sources for new content, writing new vulnerabilities to the database when discovered.
+
+The matcher API is designed to be called often and will always provide the most up-to-date VulnerabilityReport when queried. This VulnerabilityReport summarizes both the container's contents and any vulnerabilities affecting the container image.
+
+See our [api](../howto/api.md) guide to learn how to view our api specification and work with the Matcher api.
+
+## Summary
+
+In summary you should understand that a Matcher node provides vulnerability reports given the output of an Indexing process. By default it will also run background Updaters keeping the vulnerability database up-to-date.
+
+For a more in-depth look at matching, check out the [ClairCore Documentation](https://quay.github.io/claircore/).
diff --git a/Documentation/concepts/notifications.md b/Documentation/concepts/notifications.md
@@ -0,0 +1,136 @@
+# Notifications
+
+ClairV4 implements a notification system.
+
+The notifier service keeps track of new security database updates and informs an interested client if new or removed vulnerabilities affect an indexed manifest.
+
+The interested client can subscribe to notifications via several substrates:
+* Webhook delivery
+* AMQP delivery
+* STOMP delivery
+
+The notifier is configured via the YAML configuration.
+
+See the "Notifier" object in our [config reference](../reference/config.md).
+
+## A Notification
+
+When the notifier becomes aware of new vulnerabilities affecting an indexed manifest, it will inform the client of the change. The client will receive a single notification expressing the **most severe** vulnerability inducing the change. This avoids the situation where a client gets bombarded with notifications for the same security database update.
+
+Once a client receives this notification, they should issue a new request against the [matcher](../reference/matcher.md) to receive an up-to-date vulnerability report.
+
+A notification is the JSON-marshalled form of the following schema:
+
+```go
+// Reason indicates the catalyst for a notification
+type Reason string
+const (
+	Added   Reason = "added"
+	Removed Reason = "removed"
+	Changed Reason = "changed"
+)
+type Notification struct {
+	ID            uuid.UUID        `json:"id"`
+	Manifest      claircore.Digest `json:"manifest"`
+	Reason        Reason           `json:"reason"`
+	Vulnerability VulnSummary      `json:"vulnerability"`
+}
+type VulnSummary struct {
+	Name           string                  `json:"name"`
+	Description    string                  `json:"description"`
+	Package        *claircore.Package      `json:"package,omitempty"`
+	Distribution   *claircore.Distribution `json:"distribution,omitempty"`
+	Repo           *claircore.Repository   `json:"repo,omitempty"`
+	Severity       string                  `json:"severity"`
+	FixedInVersion string                  `json:"fixed_in_version"`
+	Links          string                  `json:"links"`
+}
+```
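As an illustration of consuming this schema, the following Go sketch decodes one notification. It makes simplifying assumptions: plain strings stand in for `uuid.UUID` and `claircore.Digest`, only a few `VulnSummary` fields are kept, and the sample payload (ID, digest, and vulnerability values) is invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Notification mirrors the schema above with plain strings standing in for
// uuid.UUID and claircore.Digest.
type Notification struct {
	ID            string      `json:"id"`
	Manifest      string      `json:"manifest"`
	Reason        string      `json:"reason"`
	Vulnerability VulnSummary `json:"vulnerability"`
}

// VulnSummary keeps only a subset of the fields shown above.
type VulnSummary struct {
	Name           string `json:"name"`
	Severity       string `json:"severity"`
	FixedInVersion string `json:"fixed_in_version"`
}

// decodeNotification parses one JSON-encoded notification.
// Fields not declared above are ignored by encoding/json.
func decodeNotification(data []byte) (Notification, error) {
	var n Notification
	err := json.Unmarshal(data, &n)
	return n, err
}

func main() {
	// Invented sample payload, for illustration only.
	sample := []byte(`{
	  "id": "5e4b387e-88d3-4364-86fd-063447a6fad2",
	  "manifest": "sha256:0000000000000000000000000000000000000000000000000000000000000000",
	  "reason": "added",
	  "vulnerability": {
	    "name": "EXAMPLE-0001",
	    "description": "sample entry",
	    "severity": "High",
	    "fixed_in_version": "1.2.3",
	    "links": ""
	  }
	}`)

	n, err := decodeNotification(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(n.Reason, n.Vulnerability.Name, n.Vulnerability.Severity)
}
```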
+
+## Webhook Delivery
+*See the "Notifier.Webhook" object in our [config reference](../reference/config.md) for complete configuration details.*
+
+When you configure the notifier for webhook delivery, you provide the service with the following pieces of information:
+* A target URL where the webhook will fire
+* The callback URL where the notifier may be reached including its API path
+    * e.g. "http://clair-notifier/api/v1/notifications"
+
+When the notifier determines that a security database update has changed the affected status of an indexed container, it delivers the following JSON body to the configured target:
+```json
+{
+  "notification_id": {uuid_string},
+  "callback": {url_to_notifications}
+}
+```
+
+On receipt the client can immediately browse to the URL provided in the callback field.
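A client-side receiver for this webhook might look like the Go sketch below. It assumes the key is spelled `notification_id` and that both values arrive as JSON strings; `parseCallback`, `webhookHandler`, the 1 MiB body limit, and the sample values are hypothetical choices for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// Callback mirrors the webhook body shown above.
type Callback struct {
	NotificationID string `json:"notification_id"`
	Callback       string `json:"callback"`
}

// parseCallback decodes a webhook body and rejects incomplete payloads.
func parseCallback(body []byte) (Callback, error) {
	var cb Callback
	if err := json.Unmarshal(body, &cb); err != nil {
		return Callback{}, err
	}
	if cb.NotificationID == "" || cb.Callback == "" {
		return Callback{}, fmt.Errorf("webhook body missing notification_id or callback")
	}
	return cb, nil
}

// webhookHandler is what a client could mount at its target URL (hypothetical).
func webhookHandler(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20)) // cap the body at 1 MiB
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	cb, err := parseCallback(body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// A real client would now fetch cb.Callback; here we only log it.
	fmt.Printf("notification %s ready at %s\n", cb.NotificationID, cb.Callback)
	w.WriteHeader(http.StatusOK)
}

func main() {
	// Invented sample body, for illustration only.
	sample := []byte(`{"notification_id":"5e4b387e-88d3-4364-86fd-063447a6fad2","callback":"http://clair-notifier/api/v1/notifications/5e4b387e-88d3-4364-86fd-063447a6fad2"}`)
	cb, err := parseCallback(sample)
	fmt.Println(cb.NotificationID, err)

	// To serve real traffic: http.HandleFunc("/hook", webhookHandler)
	_ = webhookHandler
}
```

Keeping the parsing in a plain function separate from the handler makes the validation easy to test without starting a server.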
+
+### Pagination
+
+The URL returned in the callback field brings the client to a paginated result.
+
+The callback endpoint specification follows:
+
+```
+GET /api/v1/notification/{id}?[page_size=N][next=N]
+{
+  page: {
+    size: int,    // maximum number of notifications in the response
+    next: string, // if present, the next id to fetch
+  }
+  notifications: [ Notification… ] // array of notifications; max len == page.size
+}
+```
+The GET callback request implements a simple bare-minimum paging mechanism.
+
+The "page_size" URL parameter controls how many notifications are returned in a single page.
+If not provided, a default of 500 is used.
+
+The "next" URL parameter tells Clair which set of paged notifications to return next. If not provided, the first page is returned.
+
+A page object is returned which specifies "next" and "size" fields.
+
+The "next" field returned in the page must be provided as the subsequent request's "next" url parameter to retrieve the next set of notifications.
+
+The "size" field will simply echo back the requested page_size parameter.
+
+When the final page is served to the client the returned "page" data structure will not contain a "next" member.
+
+Therefore the following loop is valid for obtaining all notifications for a notification id in pages of a specified size.
+
+```
+{ page, notifications } = http.Get("http://clairv4/api/v1/notifications/{id}?page_size=1000")
+
+while (page.Next != None) {
+    { page, notifications } = http.Get("http://clairv4/api/v1/notifications/{id}?next={page.Next}&page_size=1000")
+}
+```
+
+*Note: If the client specifies a custom page_size it must specify this page_size on every request for accurate responses.*
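The paging loop described above can be sketched in Go as follows. This is a minimal client-side illustration, not part of Clair: the `Page`, `Response`, `collectAll`, and `httpFetch` names are hypothetical, "next" is assumed to be a JSON string, and notifications are kept as raw JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// Notification is kept as raw JSON in this sketch.
type Notification = json.RawMessage

// Page mirrors the "page" object; Next is nil on the final page.
type Page struct {
	Size int     `json:"size"`
	Next *string `json:"next,omitempty"`
}

// Response is one page of results.
type Response struct {
	Page          Page           `json:"page"`
	Notifications []Notification `json:"notifications"`
}

// fetchFunc retrieves one page; next is empty for the first request.
type fetchFunc func(next string) (Response, error)

// collectAll follows "next" until the final page and returns every notification.
func collectAll(fetch fetchFunc) ([]Notification, error) {
	var all []Notification
	next := ""
	for {
		resp, err := fetch(next)
		if err != nil {
			return nil, err
		}
		all = append(all, resp.Notifications...)
		if resp.Page.Next == nil {
			return all, nil
		}
		next = *resp.Page.Next
	}
}

// httpFetch builds a fetchFunc for a callback URL, sending page_size on every request.
func httpFetch(callback string, pageSize int) fetchFunc {
	return func(next string) (Response, error) {
		var out Response
		u, err := url.Parse(callback)
		if err != nil {
			return out, err
		}
		q := u.Query()
		q.Set("page_size", strconv.Itoa(pageSize))
		if next != "" {
			q.Set("next", next)
		}
		u.RawQuery = q.Encode()
		res, err := http.Get(u.String())
		if err != nil {
			return out, err
		}
		defer res.Body.Close()
		if res.StatusCode != http.StatusOK {
			return out, fmt.Errorf("unexpected status %s", res.Status)
		}
		err = json.NewDecoder(res.Body).Decode(&out)
		return out, err
	}
}

func main() {
	// In-memory stand-in for the server: two pages, then no "next".
	second := "2"
	pages := map[string]Response{
		"":  {Page: Page{Size: 2, Next: &second}, Notifications: make([]Notification, 2)},
		"2": {Page: Page{Size: 2}, Notifications: make([]Notification, 1)},
	}
	all, err := collectAll(func(next string) (Response, error) { return pages[next], nil })
	fmt.Println(len(all), err) // 3 <nil>

	// Against a live notifier, pass httpFetch(callbackURL, 1000) instead.
	_ = httpFetch
}
```

Building the query with `url.Values` joins parameters with `&` and resends `page_size` on every request, as the note above requires.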
+
+### Deleting Notifications
+
+While not mandatory, the client may delete the notification via the DELETE API. See [api](../howto/api.md) to view the delete API.
+
+Deleting a notification ID cleans up resources in the notifier more quickly. Otherwise, the notifier will wait a predetermined length of time before clearing delivered notifications from its database.
+
+## AMQP Delivery
+*See the "Notifier.AMQP" object in our [config reference](../reference/config.md) for complete configuration details.*
+
+The notifier also supports delivering to an AMQP broker. With AMQP delivery you can control whether a callback is delivered to the broker or whether notifications are directly delivered to the queue.
+
+This allows the developer of the AMQP consumer to determine the logic of notification processing.
+
+### Direct Delivery
+
+If the notifier's configuration specifies "direct=true" for AMQP, notifications will be delivered directly to the configured exchange.
+
+When "direct=true", the "rollup" property may be set to instruct the notifier to send a maximum number of notifications in a single AMQP message. This allows a balance between the size of each message and the number of messages delivered to the queue.
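The trade-off behind "rollup" can be illustrated with a small batching function in Go. This is only a sketch of the idea, not the notifier's code: the `rollup` function name, the use of strings as notifications, and the handling of a size below 1 are assumptions.

```go
package main

import "fmt"

// rollup splits notifications into batches of at most size items,
// one batch per AMQP message. A size below 1 is treated as 1 (an assumption).
func rollup(notifications []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var batches [][]string
	for len(notifications) > 0 {
		n := size
		if len(notifications) < n {
			n = len(notifications)
		}
		batches = append(batches, notifications[:n])
		notifications = notifications[n:]
	}
	return batches
}

func main() {
	msgs := rollup([]string{"n1", "n2", "n3", "n4", "n5"}, 2)
	fmt.Println(len(msgs), msgs) // 3 [[n1 n2] [n3 n4] [n5]]
}
```

A larger rollup value means fewer but bigger messages; a value of 1 sends every notification as its own message.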
+
+## Testing and Development
+
+The notifier has a testing mode enabled when it sees the "NOTIFIER_TEST_MODE" environment variable set. It can be set to any value as we only check to see if it exists.
+
+When this environment variable is discovered the notifier will begin sending fake notifications to the configured delivery mechanism every "poll_interval" interval. This provides an easy way to implement and test new or existing deliverers.
+
+The notifier will run in this mode until the environment variable is cleared and the service is restarted.
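The "set to any value" rule above corresponds to checking for the variable's presence rather than its content. A minimal Go sketch of that check follows; whether the notifier uses `os.LookupEnv` internally is an assumption, and `testModeEnabled` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
)

// testModeVar is the environment variable named in the text above.
const testModeVar = "NOTIFIER_TEST_MODE"

// testModeEnabled reports whether the variable exists, regardless of its value.
// The lookup function is a parameter so the check can be exercised without
// touching the real environment; pass os.LookupEnv in production code.
func testModeEnabled(lookup func(string) (string, bool)) bool {
	_, ok := lookup(testModeVar)
	return ok
}

func main() {
	fmt.Println("test mode:", testModeEnabled(os.LookupEnv))
}
```

Note that `os.Getenv` alone cannot distinguish an unset variable from one set to the empty string, which is why a presence check uses `os.LookupEnv`.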
diff --git a/Documentation/concepts/operation.md b/Documentation/concepts/operation.md
@@ -0,0 +1,31 @@
+# Operation
+
+
+## Architecture
+
+Clair is structured so that it can be easily scaled with demand. It can be
+broken up into up to 3 microservices as needed ([Indexer], [Matcher], and
+[Notifier]) or run as a single monolith. Each process talks to separate tables
+in the database and is responsible for disparate API endpoints.
+
+[Indexer]: #indexer
+[Matcher]: #matcher
+[Notifier]: #notifier
+
+### Indexer
+
+Responsible for ...
+
+### Matcher
+
+Responsible for ...
+
+### Notifier
+
+Responsible for ...
+
+## Ingress
+
+One recommended configuration is to use some sort of service ingress to route
+API endpoints to the component responsible for servicing it.
+
diff --git a/Documentation/contribution/commit_style.md b/Documentation/contribution/commit_style.md
@@ -0,0 +1,42 @@
+# Commit Style
+
+The Clair and ClairCore projects use well-structured commits.
+We suggest signing off on your commits as well.
+
+A typical commit will take on the following structure:
+
+```
+<scope>: <subject>
+
+<body>
+Fixes #1
+Pull Request #2
+
+Signed-off-by: <name> <email>
+```
+
+The header of the commit is checked against a regexp before commit, and your commit will be rejected if it does not conform.
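To give a feel for such a check, here is a Go sketch that validates the `<scope>: <subject>` header shape. The pattern is an assumption made for illustration; the project's actual regexp is not shown in this document and may be stricter or looser.

```go
package main

import (
	"fmt"
	"regexp"
)

// headerRE is an assumed pattern for "<scope>: <subject>": a lowercase scope,
// a colon, one space, then a non-empty subject.
var headerRE = regexp.MustCompile(`^[a-z0-9_/-]+: \S.*$`)

// headerOK reports whether a commit header matches the assumed pattern.
func headerOK(header string) bool {
	return headerRE.MatchString(header)
}

func main() {
	for _, h := range []string{"docs: rework mdbook", "rework mdbook", "docs:rework mdbook"} {
		fmt.Printf("%q -> %v\n", h, headerOK(h))
	}
}
```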
+
+## Scope
+
+This is the section of code this commit influences. 
+
+You will often see scopes such as "notifier", "auth", "chore", "cicd".
+
+We use this field to group commits by scope in our automated changelog generation.
+
+It would be wise to take a look at our changelog before contributing to get an idea of the common scopes we use.
+
+## Subject
+
+Subject is a short and concise summary of the change the commit is introducing.
+
+## Body
+
+Body should be full of detail.
+
+Explain what this commit is doing and why it is necessary.
+
+You may include references to issues and pull requests as well. Our automated changelog process will discover references prefixed with "Fixes", "Closed", and "Pull Request".
+
+