---
title: Monitor
perex: |
  evitaDB's monitoring facilities are designed to help you monitor running evitaDB instances as well as to help you optimise your application during development. All monitoring facilities are based on our operational experience and development of e-commerce projects.
date: '17.1.2023'
author: 'Ing. Jan Novotný'
proofreading: 'done'
preferredLang: 'java'
---

Work in progress

This article will describe evitaDB's monitoring facilities - whether exposed directly via Prometheus or via OpenTelemetry. It should also cover how to log slow queries and how to spot other problems within the application (logging). The functionality is not finalized - see issue #18.

Logging

evitaDB uses the SLF4J logging facade for logging both application log messages and access log messages. By default, only application log messages are enabled; access log messages must be explicitly enabled in the configuration.

Access log

If the accessLog property is set to true in the configuration, the server will log access log messages for all APIs using the SLF4J logging facade. These messages are logged at the INFO level and carry the ACCESS_LOG marker, which you can use to separate standard messages from access log messages.

Access log messages can be further categorized using the UNDERTOW_ACCESS_LOG and GRPC_ACCESS_LOG markers. This is because evitaDB uses the Undertow web server for the REST and GraphQL APIs and a separate web server for gRPC. It can sometimes be useful to log these separately because, even though both use the same log format, the gRPC server doesn't support all the properties that Undertow does.

Server Logback utilities

The evitaDB server ships with several custom utilities that make it easier to configure the logged data.

Note: These utilities are only available in the evitaDB server because the rest of the evitaDB codebase doesn't rely on a concrete implementation of the SLF4J logging facade. If evitaDB is used as an embedded instance, the following tools are not available, but they can serve as a reference for a custom implementation in your chosen logging framework - a sketch of such a filter is shown below.
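For illustration only, here is a minimal sketch of what such a marker-based filter could look like in an embedded deployment. It assumes Logback 1.3 or newer (the getMarkerList() API) and matches the marker by the ACCESS_LOG name mentioned above; the class itself is hypothetical and not part of the evitaDB codebase.

import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.filter.Filter;
import ch.qos.logback.core.spi.FilterReply;
import org.slf4j.Marker;

// Hypothetical custom filter: accepts only events carrying the ACCESS_LOG marker.
public class CustomAccessLogFilter extends Filter<ILoggingEvent> {

    @Override
    public FilterReply decide(ILoggingEvent event) {
        // getMarkerList() is available in Logback 1.3+; older versions expose getMarker()
        if (event.getMarkerList() != null) {
            for (Marker marker : event.getMarkerList()) {
                if (marker.contains("ACCESS_LOG")) {
                    return FilterReply.ACCEPT;
                }
            }
        }
        return FilterReply.DENY;
    }
}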

Log filters

The basic utilities are two ready-to-use Logback filters that make it easy to separate access log messages from app log messages.

The io.evitadb.server.log.AccessLogFilter filter logs only access log messages. It can be used as follows:

<appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <filter class="io.evitadb.server.log.AccessLogFilter"/>
    <file>/path/to/access.log</file>
    <encoder>
        <pattern>%msg%n</pattern>
    </encoder>
</appender>

The io.evitadb.server.log.AppLogFilter filter logs only standard log messages. It can be used as follows:

<appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <filter class="io.evitadb.server.log.AppLogFilter"/>
    <file>/evita/logs/evita_server.log</file>
    <encoder>
        <pattern>%date %level [%thread] %logger{10} [%file:%line] -%kvp- %msg%n</pattern>
    </encoder>
</appender>

This filter exists because, when access logs are enabled, messages with the ACCESS_LOG marker aren't filtered out of the standard log by default.

Tooling for log aggregators

If a log aggregator is used to consume evitaDB log messages, it is often useful to log app log messages as one-line JSON objects. Therefore, there is a ready-to-use Logback layout that logs app log messages as JSON objects and makes sure that everything is properly escaped, even newline characters in log messages (e.g. stack traces).

The layout is io.evitadb.server.log.AppLogJsonLayout and can be used as follows:

<configuration>
    <!-- ... -->
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <target>System.out</target>
        <filter class="io.evitadb.server.log.AppLogFilter" />
        <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
            <level>INFO</level>
        </filter>
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="io.evitadb.server.log.AppLogJsonLayout"/>
        </encoder>
    </appender>
    <!-- ... -->
</configuration>

Readiness and liveness probes

The evitaDB server provides endpoints for Kubernetes readiness and liveness probes. The liveness probe is also configured as a healthcheck by default in our Docker image. By default, the health check waits 30 seconds before it starts checking the server health; for larger databases you may need to increase this value using the HEALTHCHECK_START_DELAY environment variable so that they have enough time to be loaded into memory.

When you change the system API port, don't forget to set the SYSTEM_API_PORT environment variable

The healthcheck in the Docker image is configured to use the default system API port, which is 5557. If you change the port, the health check will immediately report an unhealthy container because it won't be able to reach the probe endpoint. You need to specify the new port using the SYSTEM_API_PORT environment variable of the Docker container.

Both probes are available in the system API and are accessible at the following endpoints:

Readiness probe

curl -k "http://localhost:5557/system/readiness" \
     -H 'Content-Type: application/json'

The probe will return 200 OK if the server is ready to accept traffic, otherwise it will return 503 Service Unavailable. Internally, the probe calls all enabled APIs via HTTP on the server side to check whether they are ready to serve traffic. Example response:

{
  "status": "READY",
  "apis": {
    "rest": "ready",
    "system": "ready",
    "graphQL": "ready",
    "lab": "ready",
    "observability": "ready",
    "gRPC": "ready"
  }
}

The overall status may be one of the following constants:

STARTING
At least one API is not yet ready.
READY
The server is ready to serve traffic.
STALLING
At least one API that was ready is not ready anymore.
SHUTDOWN
Server is shutting down. None of the APIs are ready.

Each of the enabled APIs has its own status so that you can see which particular API is not ready in case of STARTING or STALLING status.
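For example, a startup script or an integration test can wait for the server to become ready by polling this endpoint. The following is a minimal sketch using Java's built-in HTTP client; the host and port are assumptions matching the default configuration shown above.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class WaitForReadiness {

    public static void main(String[] args) throws Exception {
        // default system API port - adjust if you changed SYSTEM_API_PORT
        final URI readiness = URI.create("http://localhost:5557/system/readiness");
        final HttpClient client = HttpClient.newHttpClient();
        final HttpRequest request = HttpRequest.newBuilder(readiness)
            .timeout(Duration.ofSeconds(2))
            .GET()
            .build();

        for (int attempt = 0; attempt < 30; attempt++) {
            try {
                final HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
                // 200 OK means all enabled APIs reported "ready"
                if (response.statusCode() == 200) {
                    System.out.println("Server is ready: " + response.body());
                    return;
                }
            } catch (java.io.IOException e) {
                // server is not reachable yet - keep polling
            }
            Thread.sleep(1_000);
        }
        throw new IllegalStateException("Server did not become ready in time");
    }
}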

Liveness probe

curl -k "http://localhost:5557/system/liveness" \
     -H 'Content-Type: application/json'

If the server is healthy, the probe will return 200 OK. Otherwise, it will return 503 Service Unavailable. Example response:

{
  "status": "healthy",
  "problems": []
}

If the server is unhealthy, the response will list the problems.

MEMORY_SHORTAGE
Signaled when the consumed memory never drops below 85% of the maximum heap size and the GC has tried to free the old generation at least once. This leads to repeated attempts at expensive old-generation GC and puts pressure on the host CPUs.
INPUT_QUEUES_OVERLOADED
Signaled when the input queues are full and the server is not able to process incoming requests. The problem is reported when the ratio of rejected tasks to accepted tasks is >= 2. The flag is cleared when the rejection ratio drops below this threshold, which signals that the server is able to process incoming requests again.
JAVA_INTERNAL_ERRORS
Signaled when Java internal errors occur. These errors are usually caused by the server itself and are not related to the client's requests. Java errors signal fatal problems inside the JVM.
EXTERNAL_API_UNAVAILABLE
Signaled when the readiness probe reports that at least one external API that is configured to be enabled doesn't respond to the internal HTTP check call.

Client and request identification

In order to monitor which requests each client executes against evitaDB, each client and each request can be identified by a unique identifier. In this way, evitaDB calls can be grouped by requests and clients. This may be useful, for example, to see if a particular client is executing queries optimally and not creating unnecessary duplicate queries.

Both identifiers are provided by the client itself. The client identifier is expected to be a constant for a particular client, e.g. a Next.js application, and groups together all calls to evitaDB from that client. The request identifier is expected to be a UUID, but can be any string value, and groups together all evitaDB calls with this request identifier for a particular client. What a request represents is up to the client to decide; for example, a single request for a JavaScript client may group together all evitaDB calls for a single page render.

Configuration

This mechanism is not part of the evitaQL language. Check the documentation for your specific client for more information.

If you are using the Java remote client, it is recommended to provide the clientId in io.evitadb.driver.config.EvitaClientConfiguration for all requests. The requestId is then provided by wrapping your code in a lambda passed to the executeWithRequestId method of the io.evitadb.api.EvitaSessionContract interface, as sketched below.
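A minimal sketch of how this might look with the Java remote client follows. The builder and method names are derived from the classes referenced above, but the exact signatures, host, port and catalog name are assumptions that may differ in your version and deployment - treat this as an illustration rather than a verbatim recipe.

import io.evitadb.driver.EvitaClient;
import io.evitadb.driver.config.EvitaClientConfiguration;

import java.util.UUID;

public class ClientIdentificationExample {

    public static void main(String[] args) {
        // Hedged sketch - verify builder/method names against your evitaDB version.
        EvitaClientConfiguration configuration = EvitaClientConfiguration.builder()
            .host("localhost")                  // assumption: local server
            .port(5556)                         // assumption: gRPC port of your deployment
            .clientId("Next.js application")    // constant identifier of this application
            .build();

        try (EvitaClient evita = new EvitaClient(configuration)) {
            evita.queryCatalog(
                "evita",                        // assumption: catalog name
                session -> {
                    // group all evitaDB calls of this logical request under one requestId
                    session.executeWithRequestId(
                        UUID.randomUUID().toString(),
                        () -> {
                            // ... execute your queries within this session here ...
                        }
                    );
                    return null;
                }
            );
        }
    }
}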

Provide the client and request ids to the server

If you use the embedded variant of the evitaDB server, there is no point in providing a clientId, since there is only one client. The requestId is then provided in the same way as described above.

The GraphQL API uses GraphQL extensions to pass request identification. To pass request identification information to evitaDB, include the following JSON object in the extensions property of a GraphQL request:

"clientContext": {
  "clientId": "Next.js application",
  "requestId": "05e620b2-5b40-4932-b585-bf3bb6bde4b3"
}

Both identifiers are optional.
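To make the placement concrete, the following sketch builds such a request body and posts it with Java's built-in HTTP client. The endpoint URL, catalog name and the query itself are placeholders for a hypothetical local deployment, not values defined by this document.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GraphQLClientContextExample {

    public static void main(String[] args) throws Exception {
        // the clientContext object travels in the standard GraphQL "extensions" property,
        // next to the "query" property of the request body
        final String body = """
            {
              "query": "{ ...your GraphQL query... }",
              "extensions": {
                "clientContext": {
                  "clientId": "Next.js application",
                  "requestId": "05e620b2-5b40-4932-b585-bf3bb6bde4b3"
                }
              }
            }
            """;

        final HttpRequest request = HttpRequest.newBuilder(
                // hypothetical endpoint - adjust host, port and catalog name to your deployment
                URI.create("https://localhost:5555/gql/evita")
            )
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        final HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}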

The REST API uses HTTP headers to pass request identification. To pass request identification information to evitaDB, set the following HTTP headers:

X-EvitaDB-ClientID: Next.js application
X-EvitaDB-RequestID: 05e620b2-5b40-4932-b585-bf3bb6bde4b3

Both headers are optional.

Logging

These identifiers can also be used to group application log messages by client and request, which makes it easier to debug errors that occurred during a specific request. This is done using MDC support: evitaDB passes the client and request identifiers to the MDC context under the clientId and requestId names.

The specific usage depends on the chosen implementation of the SLF4J logging facade. For example, in Logback this can be done using the %X{clientId} and %X{requestId} patterns in the log pattern:

<encoder>
    <pattern>%d{HH:mm:ss.SSS} %-5level %logger{10} C:%X{clientId} R:%X{requestId} - %msg%n</pattern>
</encoder>