JEP-212: External Logging API plugin

Abstract

On large-scale Jenkins instances, master disk and network I/O can become bottlenecks. Build logging and task reporting are among the most intensive I/O consumers, so it would be useful to redirect them to an external system. This is a continuation of the original story we had back in 2016 (see the public design document here). JEP-TODO defines requirements for the external logging system and Core APIs.

The External Logging API plugin is a separate plugin which offers shared functionality to External Logging implementations:

  • Configuration UI

  • Extension Points for External Logging implementations

  • Base classes for external logging, which simplify implementations

  • Other shared utility classes

Specification

Event class

Event is an atomic entry of the external logging. In the data model it includes the following fields:

  • id (long) - ID of the message. It must be unique within a single LoggingMethod writer

  • message (string) - Contains the message body. It may be a single-line or a multi-line message. The message may also contain annotations.

  • timestamp (long) - Timestamp of the message: the difference, measured in milliseconds, between the time of the message and midnight, January 1, 1970 UTC

  • data (Map<String, Serializable>) - additional metadata. This metadata will be used to categorize event entries.
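
A minimal Java sketch of this data model may help make the fields concrete. Only the four fields and their types come from the list above; the class layout, the constructor, and the accessor names (`getData`, `setData`, and so on) are assumptions made for illustration, not the plugin's actual API.

```java
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only: the fields follow the list above, everything else is assumed. */
final class Event implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long id;         // unique within a single LoggingMethod writer
    private final String message;  // single-line or multi-line body
    private final long timestamp;  // milliseconds since midnight, January 1, 1970 UTC
    private final HashMap<String, Serializable> data = new HashMap<>();

    Event(long id, String message, long timestamp) {
        this.id = id;
        this.message = message;
        this.timestamp = timestamp;
    }

    long getId() { return id; }
    String getMessage() { return message; }
    long getTimestamp() { return timestamp; }

    /** Adds or replaces one metadata entry (hypothetical helper). */
    void setData(String key, Serializable value) { data.put(key, value); }

    /** Read-only view of the metadata. */
    Map<String, Serializable> getData() { return Collections.unmodifiableMap(data); }
}
```

Keeping the core fields immutable and exposing the metadata through a read-only view is one possible design choice; it makes an event safe to hand between the writer and any filtering logic.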

⚠️

(svanoort) Timestamp when first part of message generated, or last? Or perhaps both?

⚠️

(svanoort) Why not enforce something simpler than Serializable for (meta)data which can be consumed by non-Java clients potentially, i.e. JSON perhaps?

Better yet: JSON (or Map<String, JsonType> that gets converted to a JSON document) for most things, and a Map<String, Serializable> of JenkinsInternal stuff - with as little as possible in the latter category.

To do cloud-native effectively, we need to have something consumable directly from the browser or that can potentially be consumed by a non-Java source. (ab)use of java serialization is generally discouraged at this point - we should only be doing it where 100% required.

Actually yeah, here’s my proposal: we have a JSON field that can be trivially converted by Java since types will map over. Then we have a Map<String, Serializable> with serializables stored as base64, which can be ignored by non-Java consumers.

Some metadata is mandatory. Required fields are defined below.

Event metadata

The following metadata entries are required to be present for all event entries:

  • jenkinsId - Instance ID of the Jenkins instance

  • loggableType - Type of the Loggable entry, e.g. run

  • loggableId - Identifier of the Loggable entry
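
As a rough illustration, a check for these mandatory entries could look like the sketch below. The three key names come from the list above; the `RequiredMetadata` class and its `missingKeys` method are hypothetical helpers invented for this example.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which mandatory event metadata keys are absent. */
final class RequiredMetadata {
    /** Keys required for all event entries, as listed in this section. */
    static final List<String> EVENT_KEYS =
            Arrays.asList("jenkinsId", "loggableType", "loggableId");

    /** Returns the required keys that are missing (or mapped to null), in declaration order. */
    static List<String> missingKeys(Map<String, ? extends Serializable> data) {
        List<String> missing = new ArrayList<>();
        for (String key : EVENT_KEYS) {
            if (data.get(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

A writer could call such a check before sending an event and fail fast if the returned list is not empty. The same pattern would extend to the Run and Pipeline entries by adding their keys to the list.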

⚠️

(svanoort) Re: jenkinsId - We probably need an API on the master that provides this easily (or at least there wasn’t one the last time I checked, you had to kind of duplicate the method used for web requests).

Run metadata

The following metadata entries are required to be present for all hudson.model.Run event entries:

  • All event metadata

  • jobId - ID of the Job, should be provided by Unique ID Plugin

Pipeline metadata

  • All Run metadata

  • stepId - Identifier of the step ("node")

⚠️

(svanoort) I think you probably mean flowNodeId, since that’s the actual object that is mapped to where a log originates from (1 or more flownodes per step - an important key point).

Extension Points

The plugin should offer the following extension points:

  • ExternalLoggingMethodFactory - produces ExternalLoggingMethod instances for Runs and, eventually, other objects. This extension point implements Describable and should be configurable via UI.

  • ExternalLogBrowserFactory - Same as above, but for ExternalLogBrowser

ExternalLoggingMethod and subclasses

This class defines how the logs should be sent to the storage. Logging method generally does not define the storage itself, because it may be pointing to intermediate log collectors like Fluentd or Logstash.

ExternalLoggingMethod is a new extension point on top of LoggingMethod:

  • It implements LoggingMethod interfaces

  • It provides a new createWriter() method, which produces an ExternalLoggingEventWriter class instance.

    • This method automatically injects the mandatory metadata for events

    • An abstract _createWriter() method is offered for implementation by downstream implementations

  • Compared to LoggingMethod, the implementation is designed to always perform logging from the agent side

ExternalLoggingEventWriter class:

  • The class is Serializable. It will be sent to the agent side.

  • The class offers the following abstract methods:

    • void writeEvent(Event event) throws IOException

      • writing event to the remote storage

  • The class also stores metadata, which may be injected into the events

    • The class stores a Map of Serializable metadata entries

    • The class offers an API that allows setting the metadata. This API will be used by ExternalLoggingMethod implementations and other logic to provide additional metadata if required
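
The relationship between createWriter(), _createWriter() and the writer's metadata can be sketched as a template method. This is a self-contained illustration, not the plugin's code: `SketchEvent`, `SketchEventWriter`, `SketchLoggingMethod`, `addMetadataEntry` and `send` are hypothetical names, and the real createWriter() would presumably derive the mandatory metadata from the Run or other Loggable rather than take strings.

```java
import java.io.IOException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the Event class, reduced to what this sketch needs. */
final class SketchEvent implements Serializable {
    final long id;
    final String message;
    final Map<String, Serializable> data = new HashMap<>();

    SketchEvent(long id, String message) {
        this.id = id;
        this.message = message;
    }
}

/** Serializable writer that would be sent to the agent; it carries metadata to inject. */
abstract class SketchEventWriter implements Serializable {
    private final HashMap<String, Serializable> metadata = new HashMap<>();

    /** Hypothetical API for setting metadata on the writer. */
    void addMetadataEntry(String key, Serializable value) {
        metadata.put(key, value);
    }

    /** Copies stored metadata into the event without overwriting its own entries, then writes it. */
    final void send(SketchEvent event) throws IOException {
        for (Map.Entry<String, Serializable> entry : metadata.entrySet()) {
            event.data.putIfAbsent(entry.getKey(), entry.getValue());
        }
        writeEvent(event);
    }

    /** Writes the event to the remote storage; provided by implementations. */
    abstract void writeEvent(SketchEvent event) throws IOException;
}

abstract class SketchLoggingMethod {
    /** Template method: obtains the implementation's writer and injects the mandatory metadata. */
    final SketchEventWriter createWriter(String jenkinsId, String loggableType, String loggableId) {
        SketchEventWriter writer = _createWriter();
        writer.addMetadataEntry("jenkinsId", jenkinsId);
        writer.addMetadataEntry("loggableType", loggableType);
        writer.addMetadataEntry("loggableId", loggableId);
        return writer;
    }

    /** Implemented by downstream plugins, which only decide how events are transported. */
    protected abstract SketchEventWriter _createWriter();
}
```

With this split, a downstream implementation cannot forget the mandatory metadata, because it never constructs the final writer configuration itself.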

⚠️

(svanoort) Re: Map of Serializable metadata entries Similarly, consider a non-Java map of metadata that can be consumed by other sources, plus a smaller Java serialized grouping.

This makes parsing easier for non-java consumers too

ExternalLogBrowser and subclasses

The Log Browser class defines how the logs on the remote storage are accessed.

It should offer the following methods:

  • AnnotatedLargeText<T> overallLog() - Get large text for the entire execution/run

  • AnnotatedLargeText<T> stepLog(@CheckForNull String stepId, boolean completed) - Get large text for a particular step

Some implementations should also be moved from Run and generalized. The class will provide default convenience methods, which can be overridden by implementations for better performance:

  • InputStream getLogInputStream() throws IOException - gets the log as an input stream

  • Reader getLogReader() throws IOException - get the log as a Reader

  • String getLog() throws IOException - gets the entire log as a single String

    • This method is deprecated in hudson.model.Run, and it should remain deprecated

  • List<String> getLog(int maxLines) throws IOException - gets a number of log lines as a list of strings

  • File getLogFile() throws IOException - Compatibility method, which retrieves the log as a File.

    • By default a temporary file will be created, unless an implementation offers something better
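
The idea of default convenience methods layered on one primitive can be sketched as follows. `SketchLogBrowser` is a hypothetical stand-in class; the UTF-8 charset and the "last maxLines lines" semantics of getLog(int) are assumptions for this example, not something the JEP fixes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in: one abstract primitive, defaults built on top of it. */
abstract class SketchLogBrowser {
    /** The only method an implementation must provide in this sketch. */
    abstract InputStream getLogInputStream() throws IOException;

    /** Default: decode the stream, assuming UTF-8. */
    Reader getLogReader() throws IOException {
        return new InputStreamReader(getLogInputStream(), StandardCharsets.UTF_8);
    }

    /** Default: return at most maxLines lines from the end of the log (assumed semantics). */
    List<String> getLog(int maxLines) throws IOException {
        if (maxLines <= 0) {
            return new ArrayList<>();
        }
        Deque<String> tail = new ArrayDeque<>();
        try (BufferedReader reader = new BufferedReader(getLogReader())) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (tail.size() == maxLines) {
                    tail.removeFirst(); // drop the oldest line to keep the window bounded
                }
                tail.addLast(line);
            }
        }
        return new ArrayList<>(tail);
    }
}
```

The default getLog(int) reads the whole log to find its tail, which is exactly the kind of method a storage-specific implementation would want to override with a bounded query.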

ExternalLogBrowser will also provide an abstraction layer for eventual consistency management. This layer will be determined during reference implementation polishing.

Pipeline Bridge extensions

The plugin should also implement Pipeline LogStorage and LogStorageFactory extension points so that it transparently supports Pipeline with existing API.

Pipeline Storage JEP is documented in JEP-TODO.

JSON-based external-logging layer

The API Plugin should offer a convenience layer in order to support a number of most common logging providers like Logstash, Fluentd, Elasticsearch, AWS CloudWatch, etc.

This layer should provide the following features:

  • Utility classes for reading and writing JSON Events, including parsing/writing ConsoleNote objects

  • Base classes for constructing JSON queries for fetching data
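
To give a feel for what a JSON event writer has to do, here is a dependency-free sketch that serializes the event fields into one flat JSON object. `JsonEventSketch`, `quote` and `toJson` are hypothetical names, and the flat layout (metadata keys next to id, timestamp and message) is an assumption; a real utility would more likely use a JSON library and would also need to handle ConsoleNote objects, which this sketch does not cover.

```java
import java.util.Map;

/** Hypothetical sketch of writing one event as a flat JSON object. */
final class JsonEventSketch {
    /** Wraps a string in quotes, escaping it as a JSON string literal. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c)); // other control characters
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    /** Integers, longs and booleans are written bare; every other value is written as a string. */
    static String toJson(long id, long timestamp, String message, Map<String, ?> data) {
        StringBuilder json = new StringBuilder("{");
        json.append("\"id\":").append(id);
        json.append(",\"timestamp\":").append(timestamp);
        json.append(",\"message\":").append(quote(message));
        for (Map.Entry<String, ?> entry : data.entrySet()) {
            Object value = entry.getValue();
            json.append(',').append(quote(entry.getKey())).append(':');
            if (value instanceof Long || value instanceof Integer || value instanceof Boolean) {
                json.append(value);
            } else {
                json.append(quote(String.valueOf(value)));
            }
        }
        return json.append('}').toString();
    }
}
```

Multi-line messages survive as a single JSON string because newlines are escaped, which is what lets a stack trace travel as one event.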

⚠️

(svanoort) Consider GraphQL for an event querying API?

Other convenience layers will be defined during prototyping.

Configuration

The External Logging API plugin should be fully configurable via the web UI and the Configuration-as-Code Plugin. This includes:

  • Selection of LoggingMethod and LogStorage factories

  • Configuration of built-in External Logging and Log Browser factories

  • Any other configuration options

Reference implementation

As a reference implementation of the External Logging API plugin, a new External Logging for Elasticsearch Plugin will be implemented. Other implementations may also be created.

Motivation

JEP-207 introduces a new API in the core for adding External logging features, but it provides neither a configuration UI nor a convenient API for implementing these storage engines. This API plugin does that. One may say that all bits in the current design could be implemented as a part of the Jenkins core. That is true, but detaching the plugin has the following motivation:

  • The plugin will have a separate release cycle so that changes in it can be delivered and backported independently from the core’s release cycle

  • The approach allows keeping the patches on the core’s side minimal

  • The approach allows integrating with Pipeline Log Storage API introduced in Pipeline plugins (JEP-200)

All External Log Storage implementations are expected to extend this plugin instead of just using the API provided by the Core. Core APIs may still be used to define custom LoggingMethodLocator implementations, e.g. to define custom logger allocation logic.

Reasoning

Why are ExternalLoggingMethod and ExternalLogBrowser separated?

After the initial prototyping it was decided to split Logging Method and LogBrowser into separate pluggable entities. This differs from how Pipeline LogStorage is implemented in this pull request.

Reasons for such approach:

  • ExternalLoggingMethod does not define where logs will be actually stored. For example, logging to Fluentd or Logstash may end up in various storages depending on their configuration (e.g. in Elasticsearch, Redis, AWS CloudWatch, etc.)

  • Log browsing logic may be shared. E.g. with the current design logs can be browsed from Elasticsearch independently of how the logs get there (Logstash or direct push)

  • It gives more flexibility to Jenkins admins and plugin developers

Originally the separation was done inside the Jenkins Core as a part of JEP-207, but then it was decided to move it to External Logging API.

Why do we introduce the Event layer?

Jenkins project usually operates with logs as data streams and lines, especially on the agent side. On the other hand, modern log storage systems operate with "events" - atomic objects which may include multi-line strings and various metadata. Example: exception stacktraces may go to log storage as a single event and then they can be processed by external systems like Logstash if needed. The idea in this plugin is to offer bridge logic which converts stream-based logging into event-based logic.
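
The simplest form of such a bridge is an OutputStream that buffers bytes and emits one event per completed line. The sketch below is only an illustration of that idea: `LineToEventStream` is a hypothetical name, the sink receives bare message strings rather than full Event objects, UTF-8 is assumed, and grouping several lines (such as a stack trace) into one event is deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

/** Hypothetical bridge: turns a byte stream into one message per line. */
final class LineToEventStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Consumer<String> eventSink;

    LineToEventStream(Consumer<String> eventSink) {
        this.eventSink = eventSink;
    }

    @Override
    public void write(int b) throws IOException {
        if ((b & 0xFF) == '\n') {
            emit(); // a newline completes the current event
        } else {
            buffer.write(b);
        }
    }

    /** Emits any trailing partial line so nothing is lost when the stream ends. */
    @Override
    public void close() throws IOException {
        if (buffer.size() > 0) {
            emit();
        }
    }

    private void emit() {
        String line = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        buffer.reset();
        if (line.endsWith("\r")) {
            line = line.substring(0, line.length() - 1); // tolerate CRLF line endings
        }
        eventSink.accept(line);
    }
}
```

Buffering raw bytes and decoding only at line boundaries avoids splitting a multi-byte character across two events.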

Jesse Glick has raised the concern that the Event layer may not be helpful in the current state, because a lot of code would need to be updated so that the events get captured properly. The opinion of the JEP sponsor is that this JEP offers a foundation layer so that it may be implemented. Reworking the entire Jenkins API to events is NOT an objective for this JEP-200, but it may be added in subsequent JEPs.

Why do we need JSON layer?

Many popular log storage engines store events in a JSON format: Fluentd, Logstash, Elasticsearch, AWS CloudWatch, etc. Offering a JSON layer as a part of the API plugin could greatly simplify such implementations.

Eventually-consistent storages

Some target storages are eventually consistent. One cannot just write the data to remote storage and then reliably read it back. This is critical for log browsers:

  • When a run finishes, querying data does not guarantee we get all the data

  • Loggable#isCompleted() call is not enough, some entries may be missing for "completed" entries

This issue explains why we need a special watch layer, with its own logic and queries, to determine whether the log is actually complete.

⚠️

(svanoort) Polling-based or notification-based watch layer? This strikes me as a wrinkle that may be deceptively complex.

Full logic for the layer should be determined during reference implementation.
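
Purely to make the problem concrete, one polling-based option is sketched below. It assumes the browser knows how many events were written (for example from the last event ID) and can ask the storage how many are currently visible; both assumptions, the `CompletionPoller` name and its parameters are hypothetical and not part of this JEP.

```java
import java.util.function.LongSupplier;

/** Hypothetical polling helper for eventually-consistent storages. */
final class CompletionPoller {
    /**
     * Polls until the storage reports at least expectedEvents visible events
     * or maxAttempts queries have been made.
     *
     * @return true if the log was observed to be complete, false on timeout
     */
    static boolean awaitComplete(LongSupplier visibleEventCount, long expectedEvents,
                                 int maxAttempts, long sleepMillis) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (visibleEventCount.getAsLong() >= expectedEvents) {
                return true;
            }
            if (attempt < maxAttempts - 1) {
                Thread.sleep(sleepMillis); // wait before querying the storage again
            }
        }
        return false;
    }
}
```

A caller that gets false back would still have to decide whether to show a partial log or keep waiting, which is part of what the reference implementation needs to settle.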

Backwards Compatibility

The External Logging API plugin will follow the compatibility requirements defined in the upstream JEP-TODO for the core. It will also offer API for plugins, which will allow reporting incompatibilities.

Security

There are no special security requirements defined at this level. JEP-207 defines top-level security requirements.

Infrastructure requirements

There are no special infrastructure requirements defined for this JEP. Subsequent JEPs for the implementations may define such infrastructure requirements.

Testing

Functional testing

All tests will be implemented using Jenkins Test Harness or Acceptance Test Harness (ATH) frameworks.

The following use-cases must be covered:

  • Backward compatibility

  • Upgradeability - upgraded instances use the Filesystem Storage by default

  • Smoke tests - Logging Method locators are invoked for new runs

Integration testing

Once JENKINS-TODO is implemented, integration tests with External Task Logging for Logstash Plugin and other reference implementations should be added to the essentialsTest() run.

Load Testing

There are no special load testing requirements for this story. External Logging API and its implementations are responsible for executing performance and load testing, if deemed necessary.

References