Permalink
Switch branches/tags
Nothing to show
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
539 lines (403 sloc) 22.3 KB

JEP-207: External Build Logging support in the Jenkins Core

Abstract

On large-scale Jenkins instances master Disk and Network I/O become bottlenecks in particular cases. Build logging and Task reporting are one for the most intensive I/O consumers, hence it would be great to somehow redirect them to an external system. This is a continuation of the original story we had back in 2016 (see the public design document here).

This Jenkins enhancement proposal documents changes in the core required to achieve the external build logging objective (see Reasoning for explanation of the scope). Support of other logging types will be documented in subsequent JEPs.

Specification

Top-level requirements

  • All or almost all build logging is moved from Jenkins filesystem to External Build Log Storage

    • ON: “Task” logging is not considered as a requirement after the scope change

  • Minimize log traffic between Jenkins agents and the master, logs should be reported from slaves directly when it is possible

  • External Build Log Storage has an extensible and pluggable architecture

  • Reference implementations:

    • File-system-based storage (current implementation)

    • One external build log storage based on the open-source stack, e.g. Elasticsearch-Logstash-Kibana (ELK)

    • Nice 2 have: one Cloud-focused implementation, e.g. for AWS CloudWatch

  • Freestyle and Jenkins Pipeline projects are supported out of the box

  • Run console logs should be provided using standard Jenkins interfaces

  • Data migration flow for upgrading instances is technically possible

    • It is OK if it requires special actions outside Jenkins

  • Reference External Build logging implementation provides a feasible performance and fault-tolerance compared to the original filesystem based solution

  • All Global Configurations are designed in a way that they can be configured via Jenkins Configuration-as-Code Plugin

Design decisions

This section contains a not-so-sorted top-level description (aka “braindump”) of use-cases and concerns we need to address in the design.

Events

In this story we consider Build logs as a sequence of events, which need to be registered in the system.

  1. We cannot handle all kinds of executions in Jenkins since plugins may have their specific implementations

  2. For console logs the minimal atomic item for Event is a line (for I/O Stream implementations) or event

Such events may also include metadata so that they can be queried by Jenkins (e.g. "get all log entries for a build step") or other log consumers.

External Build Logging System

  1. We do not implement new external logging system on our own, we want to integrate with existing open-source and/or proprietary systems

  2. The build reporting system should support….

    • Reporting of events - the build log will be splitted to multiple (potentially thousands) events. These timestamped events may be delivered to the storage systems in a random order (e.g. in parallel() builds for Pipeline)

    • The reporting may be performed in parallel

      • Reporting of log data from master and agents will not be synchronized

      • Reporting from a single instance (e.g. Master) may be also parallel

  3. Multiple Jenkins masters may be connected to the same External Log storage

Build logging destinations

Depending on the environment, different build logging destinations may be used. The solution should be generic enough in order to support common destination types The following storage types should be supportable:

  • FileSystem-based storage (default implementation)

  • Industry-standard External Build Logging and storage systems: Fluentd, Logstash, Elasticsearch, etc.

  • SQL-based storages

  • No-SQL storages: Key-value storages, Document-based storages

Number of Logging System destinations

  • We will support different External Build Logging system for different builds

    • It allows updating without data migration

    • It allows configuring different loggers[n][o][p]

Requirements:

  • We implement the new “LogStorageFactory” extension point, which allows tweaking logging strategies

  • By now we do not provide specific implementations excepting reference ones, but we can tweak logging destination via JobProperty or NodeProperty later

    • Pipeline step / declarative will be complicated since we may lose some logging info (self-configuring logging within Pipeline, like JENKINS-41929) Secret handling during Log reporting

  • Logging should be performed on both master and slave

  • Secrets should be shaded on both sides ⇒ password suppression rules should be executed on both master and slaves side

  • This suppression rules should be passed to the node. It causes a potential security [q][r][s]risk if the implementation does not capture secrets properly, because they may go to location Jenkins admin does not control (external storage)

    • JG: If a secret is defined in an environment variable, we are already sending it to the agent via RemoteLaunchCallable.env. So having the ConsoleLogFilter also include the same information is not an issue.

Design decisions:

  • Agent <⇒ Master communication should be always performed via encrypted protocol when we use external build logging (ideally needs a NodeMonitor)

  • We should pass secret filtering options to the remote launcher when we invoke it

Log annotations

  • According to the “Indexing” approach, we have binary and text annotations

  • ConsoleNote is technically a binary one, which is being encoded to a string with a prefix to the output stream

Design decisions:

  • Log annotations should be performed on the master and agent side

  • Binary annotations (ConsoleNote classes) should be encoded into HEX representation and stored as additional annotation fields[t]

    • They will be decoded by Jenkins master[u][v][w] only when it displays it

Log browsing

  • Log browsing should support both local and remote Logging systems

  • The interface should support…

    • Querying and Filtering logs

    • Progressive log output (for running builds and tasks)

    • Annotation visualization in console log

Design decisions:

  • Annotations should be stored in the external storage

  • Storage format is defined by the external log storage implementation

  • If the log storage can store objects, it is recommended to store annotations separately from the text

Log rotation

  • Log rotation is performed as for any other components within Jenkins builds

  • Currently log deletion is implemented as a part of the build deletion

Design decisions:

  • New API should be introduced to support deletion of logs

  • External logging APIs should provide methods for deletion of logs

  • These APIs may implement log deletion…​ or not. In the latter case Jenkins should be able to produce a warning, but it should not impact its operation

  • External log browser implementations should be able to explicitly indicate that there is no logs available

Design

The following new API entities will be introduced:

  • Loggable - interface for objects supporting external logging

  • LogStorage - objects defining log reporting and browsing logic

  • LogStorageFactory - extension point for locating LogStorage

Implementations:

  • File-based LogStorage - logging to the local FileSystem, implements compatibility mode

  • No-op LogStorage - Fallback implementations for reporting errors

The introduced entities are described below.

NEW: Loggable interface

This is a new interface, which will mark all objects supporting external logging. In the current design this interface will be implemented only by Run instances, but other log types may be supported in further implementations.

Loggable interface should provide the following methods:

  • Getters for the LogStorage being used in the object

    • Default implementation - consult with LogStorageFactory extensions

  • Getters for default LogStorage

    • These getters will be used if there is no LogStorage configured for the item

    • For example, `Run`s will be referring File-based storage to retain compatibility

  • boolean isLoggingFinished() - indicates that there is no new logging being performed

  • Charset getCharset() - method, which defines the charset to be used

    • Some instances like Run allow setting charsets explicitly.

    • By this method this requirement is propagated to logging methods

  • getLogFileCompatLocation - provides file path to be used by the File-based storage

    • This method is needed, because instances like Runs have complex logic which defines the storage location

NEW: LogStorage abstract class

LogStorage is a central class which represents the log storage being used for a particular Loggable instance. It defines API for reporting logs and retrieving them.

LogStorage is an @ExportedBean, so its instances can be exported to the REST API.

Methods to be offered:

  • BuildListener createBuildListener() throws IOException, InterruptedException - Build Listener provider.

    • This listener will receive build events and put them to the storage

    • Implementations are responsible to consult with Jenkins security logic like ConsoleLogFilter extension points

  • TaskListener createTaskListener() throws IOException, InterruptedException - Same as createBuildListener(), but for tasks. This is a stub for other task types support in the future

  • Launcher decorateLauncher(@Nonnull Launcher original, @Nonnull Run<?,?> run, @Nonnull Node node) - Launcher decorator for logging. It allows altering the launcher logic in builds, e.g. to inject custom environment. This logic may be invoked by core and plugins (see JENKINS-52914 for limitations).

  • AnnotatedLargeText<T> overallLog() - Get large text for the entire execution/run

Some implementations should be also moved from Run and generalized. Jenkins core or External Logging API will provide default convenience implementations which can be overridden by implementations for better performance.

  • InputStream getLogInputStream() throws IOException - gets the log as an input stream

  • Reader getLogReader() throws IOException - get the log as a Reader

  • String getLog() throws IOException - gets the entire log as a single String

    • This method is deprecated in hudson.model.Run, and it should remain deprecated

  • List<String> getLog(int maxLines) throws IOException - gets a number of log lines as a list of strings

  • File getLogFile() throws IOException - Compatibility method, which retrieves the log as a File.

    • By default a temporary file will be created, unless an implementation offers something better

NEW: LogStorageFactory extension point

This is a low-level extension point, which allows locating LogStorage to be used for a particular Loggable item.

This extension point should offer static methods which consult with all implementations and provide proper extensions. If there is no LogStorageFactory providing implementation, fallback FileLogStorage should be used.

NEW: FileLogStorage

These classes implement extension points and contain the original logic for the Filesystem logging. All Filesystem-specific logic from hudson.model.Run and other such classes should be moved to these implementations.

Patches in Run and AbstractBuild

Integration with Loggable:

  • Run instance should implement Loggable

  • Run stores LogStorage references in fields. These fields can be persisted on the disk

  • Run#onLoad() method restores references to the owner which are stored by LogStorage

  • All methods in Run and child classes implement new APIs used by LogStorage`

  • Run offers a getLogStorage() method which is @Exported

File operations:

  • File logging operations are moved to FileLogStorage

  • Run#getLogFile() method should be deprecated, all usages in the Jenkins core should be cleaned up. The method will be still invoking the compatibility layer from LogStorage so read-only API users do not lose the compatibility

Motivation

The default build logging in Jenkins is known to be a performance and scalability bottleneck at large-scale instances.

  • Build logging from agents goes through master. It produces loads on the network and master’s memory/CPU, especially in the case of massive parallel builds

  • Build browsing goes through master. Every time logs are displayed to users, a request is sent to the Jenkins master in order to load the data

  • Logs are stored on the disk in raw format. It consumes a lot of storage space.

Externalization of Build logging could allow improving the situation a lot, but Core API patches are required to support external logging in AbstractProject-based job types. This JEP is needed in order to specify such core changes. Other changes are documented in subsequent JEPs.

Reasoning

Being compared to the original design in 2016, this design limits the scope of work so that it can be implemented and delivered in a reasonable timeframe.

LogStorage vs. separate LoggingMethod and LogBrowser

The original design in this JEP proposed to keep independent implementations for log reporting and log browsing functionality in order to increase configuration flexibility of implementations.

After the discussion in Cloud Native SIG, it was decided to move this separation to the External Logging API Plugin (JEP-212).

Steps browsing support in the core

In the original JEP it was proposed to support Log browsing for particular steps. This functionality is needed to browse Pipeline FlowNode logs, but it may be also used to browse other segmented logs.

After the review it was decided to NOT add this API to the core. Instead of that, External Logging API implements it for now. If there is a need to support logging of steps, such feature can be added in future core versions in a compatible way (implicit override).

Log migration

During the original discussions in 2016, the log migration topic has been raised. When a logging system is configured, one may expect the logs to be moved (e.g. from filesystem to the external storage).

  • We will NOT implement migration for old builds

  • We are going to provide multiple `LogStorage`s in parallel on a single instance according to the current design

  • We will show logs from the file system till they get log-rotated

Justification:

  1. Not required since we offer smooth migration. All logs on the disk on old instances will be rotated eventually

  2. It would be complicated since we may have multiple log sources.

  3. We would also have to take ConsoleNote annotations into account

Encoding of log files

Currently Jenkins does not set limitations for encoding while doing logging. Any charsets may be used on agent and master sides, and it is hard to manage them. Some implementations also rely on the default encoding in master or agent JVMs, and these encodings may be different. This behavior should be retained, because it is a default one for Freestyle projects.

Although it is expected that all logs eventually switch to UTF-8 (see the JEP-206 proposal for Pipeline), in meantime external logging may be performed in different encodings.

  • Loggable implementations can define the charset to be used

  • LogStorage implementations may implement support of charsets or reject them, it is up to the implementation

  • If the implementation does not support the requested charset, LogStorageFactory may apply a compatibility layer or skip the Log Storage

In the current design, the encoding is up to the LogStorage implementation. The default FileLogStorage implementation must support the default encoding.

Client-side-only Log Browsing

  • We investigated Kibana usage for client-only log browsing during ELK prototyping, and we were able to create it for non-authenticated instances

  • For real-world there are limitations of things to consider:

    • Master-provided logs may be required by CLI, REST API, or by plugins relying on the current master-side implementations (like BlueOcean)

    • Isolation. The log storage (e.g. Elasticsearch or AWS Cloudwatch) may be inaccessible to users at all. Services may have some kind of access tokens for it, but we should not expect any Jenkins user to have such access

    • Network isolation. The services may be just unreachable for user machines

In the current design it was decided that log browsing by default will go through the master. Client-side logging may be implemented via custom RunAction implementations. Support of client-side log browsing may be added in a subsequent JEP.

User Interface

In order to minimize the implementation on the core side, there will be no User Interface for log management in the Jenkins core.

Log management interface will be implemented in the External Logging API plugin.

Backwards Compatibility

Storage compatibility

This JEP guarantees full compatibility of Jenkins instances when they are upgraded and keep the legacy Filesystem-based storage.

On the other hand, some incompatibilities may be introduced for the new external logging modes.

  • Logging of non-UTF-8 charsets

  • Application of non-serializable ConsoleLogFilter implementations

  • etc.

External logging implementations will be responsible to document known incompatibilities and to warn users about it. Some checks will be performed at the External Logging API plugin level.

Run API compatibility

hudson.model.Run offers File getLogFile() method and several other methods, which cannot be universally mapped to external storages.

In order to support them, all LogStorage implementations are expected to provide a File toLogFile() method which ensures compatibility with such old API. It may be done via creating temporary files, so that read-only calls to Run#getLogFile() remain compatible.

Such caching approach implies a performance hit, but the raw File-based APIs are deprecated by this design anyway. There will be no performance overhead on the built-in File-based storage.

Also, caching does not prevent from compatibility issues if one of the plugins invokes Run#getLogFile() and then performs modification of such file. Such logic will be considered as incompatible for new External Logging implementations.

API Extensibility

The designed API can be extended in the future. Although this JEP addresses only external logging for runs, the API is designed in a way which allows supporting other log types later.

Security

This JEP defines the following security requirements:

  • All newly introduced methods should follow the Jenkins security model and perform user and queue authentication permission checks where necessary

  • Existing sensitive information masking logic should be executed on master and agent BEFORE logs are submitted to the external storage. External log storage should not expose secrets

  • The following sensitive data must be masked by default

    • Environment variables and parameters marked as sensitive

    • Credentials contributed by Credentials Binding plugin

    • ConsoleLogFilter implementations if they are Serializable (the most of Pipeline-compatible implementations are already serializable)

  • ConsoleAnnotator-based secret masking (e.g. Mask Passwords plugin) should be implementable in plugins

This Jenkins Enhancement Proposal does not define strong security requirements for external storage implementations. These implementations are responsible to define their security model.

Infrastructure requirements

There is no special infrastructure requirements defined for this JEP. Subsequent JEPs for the implementations may define such infrastructure requirements.

Testing

Functional testing

All tests will be implemented using Jenkins Test Harness or Acceptance Test Harness (ATH) frameworks.

The following use-cases must be covered:

  • Backward compatibility

  • Upgradeability - upgraded instances use the Filesystem Storage by default

  • Smoke tests - logging Method locators are invoked for new runs

Integration testing

Jenkins core will provide Logging methods and browsers only for the File System Log storage. This storage will be covered by existing tests for jobs.

External Logging implementations are expected to implement integration tests using DockerRule or similar technologies, if the target log storage allows it.

Once JENKINS-TODO is implemented, integration tests with External Task Logging API Plugin and one of the reference implementations should be added to the essentialsTest() run.

Load Testing

There is no special log testing requirements for this story. External Logging API and its implementations are responsible to execute performance and load testing, if deemed necessary.