prep release: v1.16.0 #3032

abernix · 2023-05-03T17:03:34Z

Note

When approved, this PR will merge into the 1.16.0 branch which will — upon being approved itself — merge into main.

Things to review in this PR:

Changelog correctness (There is a preview below, but it is not necessarily the most up to date. See the Files Changed tab for the authoritative version.)

Version bumps

That it targets the right release branch (1.16.0 in this case!).

🚀 Features

Add ability to transmit un-redacted errors from federated traces to Apollo Studio

When using subgraphs which are enabled with Apollo Federated Tracing, the error messages within those traces will be redacted by default.

New configuration (telemetry.apollo.errors.subgraph.all.redact, which defaults to true) enables or disables the redaction mechanism. Similar configuration (telemetry.apollo.errors.subgraph.all.send, which also defaults to true) enables or disables the transmission of the error to Studio altogether.

The error messages returned to the clients are not changed or redacted from their previous behavior.

To send subgraphs' federated trace error messages to Studio without redaction, you can set the following configuration:

telemetry:
  apollo:
    errors:
      subgraph:
        all:
          send: true # (true = Send to Studio, false = Do not send; default: true)
          redact: false # (true = Redact full error message, false = Do not redact; default: true)

It is also possible to configure this per-subgraph using a subgraphs map at the same level as all in the configuration, much like other sections of the configuration which have subgraph-specific capabilities:

telemetry:
  apollo:
    errors:
      subgraph:
        all:
          send: true
          redact: false # Disable redaction as a default.  The `accounts` service enables it below.
        subgraphs:
          accounts: # Applies to the `accounts` subgraph, overriding the `all` global setting.
            redact: true # Redact messages from the `accounts` service.
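
The override behavior described above can be sketched in a few lines. This is only an illustration of the semantics, not the Router's implementation: `ErrorSettings`, `effective_settings`, and the `products` subgraph are hypothetical names, and the sketch assumes that any key a subgraph entry omits falls back to the value under `all`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ErrorSettings:
    send: bool = True    # changelog default: errors are sent to Studio
    redact: bool = True  # changelog default: messages are redacted


def effective_settings(all_cfg: dict, subgraphs_cfg: dict, name: str) -> ErrorSettings:
    """Start from the defaults, apply `all`, then the per-subgraph entry."""
    settings = replace(ErrorSettings(), **all_cfg)
    # Assumption: keys missing from the subgraph entry inherit from `all`.
    return replace(settings, **subgraphs_cfg.get(name, {}))


# Mirrors the YAML example: redaction disabled globally, enabled for `accounts`.
config = {
    "all": {"send": True, "redact": False},
    "subgraphs": {"accounts": {"redact": True}},
}

accounts = effective_settings(config["all"], config["subgraphs"], "accounts")
products = effective_settings(config["all"], config["subgraphs"], "products")
print(accounts)
print(products)
```

Here `accounts` ends up with redaction enabled while `products`, which has no entry of its own, keeps the `all` values.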

By @bnjjj in #3011

Introduce `response.is_primary` Rhai helper for working with deferred responses (Issue #2935) (Issue #2936)

A new Rhai response.is_primary() helper returns false when the chunk being processed is a deferred response chunk, that is, a follow-up to the initial primary response sent while fulfilling a @defer'd fragment in a larger operation. For the initial response, is_primary() returns true. This aims to provide the right primitives so users can write more defensive error checking. The helper relates to a bug fix noted in the Fixes section below.

By @garypen in #2945

Time-based forced hot-reload for "chaos" testing

For testing purposes, the Router can now artificially be forced to hot-reload (as if the configuration or schema had changed) at a configured time interval. This can help reproduce issues like reload-related memory leaks. We don't recommend using this in any production environment. (If you are compelled to use it in production, please let us know about your use case!)

The new configuration section for this "chaos" testing is (and will likely remain) marked as "experimental":

experimental_chaos:
  force_hot_reload: 1m

By @SimonSapin in #2988

Provide helpful console output when using "preview" features, just like "experimental" features

This expands on the existing mechanism that was originally introduced in #2242, which supports the notion of an "experimental" feature, and makes it compatible with the notion of "preview" features.

When preview or experimental features are used, an INFO-level log is emitted during startup to notify of which features are used and shows URLs to their GitHub discussions, for feedback. Additionally, router config experimental and router config preview CLI sub-commands list all such features in the current Router version, regardless of which are used in a given configuration file.

For more information about launch stages, please see the documentation here: https://www.apollographql.com/docs/resources/product-launch-stages/

By @o0ignition0o, @abernix, and @SimonSapin in #2960

Report `operationCountByType` counts to Apollo Studio (PR #2979)

This adds the ability for Studio to track operation counts broken down by type of operations (e.g., query vs mutation). Previously, we only reported total operation count.
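
As a rough illustration of the difference between a total count and a per-type breakdown, consider the sketch below. The list of executed operations is made-up data, and the sketch says nothing about the actual report format sent to Studio.

```python
from collections import Counter

# Hypothetical sample: each entry is the type of one executed operation.
executed = ["query", "query", "mutation", "query", "subscription"]

total_count = len(executed)        # a single total, as reported previously
count_by_type = Counter(executed)  # the additional per-type breakdown

print(total_count)
print(dict(count_by_type))
```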

By @bnjjj in #2979

🐛 Fixes

Update to Federation v2.4.2

This update to Federation v2.4.2 fixes a potential bug when an @interfaceObject type has a @requires. This might be encountered when an @interfaceObject type has a field with a @requires and the query requests that field only for some specific implementations of the corresponding interface. In this case, the generated query plan was sometimes invalid and could result in an invalid query to a subgraph. If the subgraph was an Apollo Server implementation, this led to the subgraph producing a "The _entities resolver tried to load an entity for type X, but no object or interface type of that name was found in the schema" error.

By @abernix in #2910

Fix handling of deferred response errors from Rhai scripts (Issue #2935) (Issue #2936)

If a Rhai script raised an error while processing a deferred response (i.e., an operation which uses @defer), the Router ignored the error and returned None in the stream of results. This had two unfortunate aspects:

the error was not propagated to the client
the stream was terminated (silently)

With this fix we now capture the error and still propagate the response to the client. This fix also adds support for the is_primary() method which may be invoked on both supergraph_service() and execution_service() responses. It may be used to avoid implementing exception handling for header interactions and to determine if a response is_primary() (i.e., first) or not.

e.g.:

    if response.is_primary() {
        print(`all response headers: ${response.headers}`);
    } else {
        print(`don't try to access headers`);
    }

vs

    try {
        print(`all response headers: ${response.headers}`);
    }
    catch(err) {
        if err == "cannot access headers on a deferred response" {
            print(`don't try to access headers`);
        }
    }

Note
This is a minimal example for purposes of illustration which doesn't exhaustively check all error conditions. An exception handler should always handle all error conditions.
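
The stream-level difference described in this fix can be modeled with two generators. This is a language-neutral sketch rather than Router code: `run_script`, `old_behavior`, and `new_behavior` are hypothetical names, the chunk and error shapes are invented for illustration, and the sketch assumes that later chunks keep flowing after a script error.

```python
def run_script(chunk: dict) -> dict:
    """Hypothetical stand-in for a Rhai callback that fails on one chunk."""
    if chunk.get("label") == "bad":
        raise RuntimeError("script failed")
    return chunk


def old_behavior(chunks, script):
    """Before the fix: a script error ends the stream with no error payload."""
    for chunk in chunks:
        try:
            yield script(chunk)
        except RuntimeError:
            return  # silent termination, the client never sees the error


def new_behavior(chunks, script):
    """After the fix: a script error is captured and sent to the client."""
    for chunk in chunks:
        try:
            yield script(chunk)
        except RuntimeError as err:
            # Assumption: the stream continues after reporting the error.
            yield {"errors": [{"message": str(err)}]}


chunks = [{"label": "primary"}, {"label": "bad"}, {"label": "later"}]
before = list(old_behavior(chunks, run_script))
after = list(new_behavior(chunks, run_script))
print(before)
print(after)
```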

By @garypen in #2945

Fix incorrectly placed "message" in Rhai JSON-formatted logging (Issue #2777)

This fixes a bug where Rhai logging incorrectly put the message of the log into the out attribute when serialized as JSON. Previously, the message field showed rhai_{{level}} (e.g., rhai_info), despite there being a separate level field in the JSON structure.

The impact of this fix can be seen in this example where we call log_info() in a Rhai script:

  log_info("this is info");

Previously, this would result in a log as follows, with the text of the message set within out, rather than message.

{"timestamp":"2023-04-19T07:46:15.483358Z","level":"INFO","message":"rhai_info","out":"this is info"}

After the change, the message is correctly within message. The level continues to be available at level. We've also added a target property which shows the file which produced the log entry:

{"timestamp":"2023-04-19T07:46:15.483358Z","level":"INFO","message":"this is info","target":"src/rhai_logging.rhai"}

By @garypen in #2975

Deferred responses now utilize compression, when requested (Issue #1572)

We previously had to disable compression on deferred responses due to an upstream library bug. To fix this, we've replaced tower-http's CompressionLayer with a custom stream transformation. This is necessary because tower-http uses async-compression under the hood, which buffers data until the end of the stream, analyzes it, then writes it, which yields better compression. However, this is wholly incompatible with a core concept of the multipart protocol for @defer, which requires chunks to be sent as soon as possible. To support that, we need to compress chunks independently.

This extracts parts of the codec module of async-compression, which so far is not public, and makes a streaming wrapper above it that flushes the compressed data on every response within the stream.
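
The idea of flushing compressed data after every response can be shown with Python's standard `zlib` module. This is a sketch of the general technique only: the Router's implementation is in Rust on top of async-compression, the function name `compress_stream` and the sample payloads are made up, and the sketch uses one shared compressor with a sync flush per chunk.

```python
import zlib


def compress_stream(chunks):
    """Compress chunks with one shared compressor, flushing after each one."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        # Z_SYNC_FLUSH emits everything buffered so far, so the receiver can
        # decode this chunk without waiting for the end of the stream.
        yield compressor.compress(chunk) + compressor.flush(zlib.Z_SYNC_FLUSH)
    # The final flush writes the end-of-stream trailer.
    yield compressor.flush()


# Hypothetical primary and deferred payloads.
chunks = [
    b'{"data":{"me":{"id":"1"}},"hasNext":true}',
    b'{"incremental":[{"data":{"name":"Ada"},"path":["me"]}],"hasNext":false}',
]

# Decode each compressed piece as soon as it "arrives".
decompressor = zlib.decompressobj()
decoded = [decompressor.decompress(piece) for piece in compress_stream(chunks)]
print(decoded[: len(chunks)] == chunks)
```

Without the per-chunk flush, the compressor would be free to hold data back until the stream ends, which is the buffering behavior the changelog entry describes as incompatible with @defer.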

By @Geal in #2986

Update the `h2` dependency to fix a potential Denial-of-Service (DoS) vulnerability

Proactively addresses the advisory in https://rustsec.org/advisories/RUSTSEC-2023-0034, though we have no evidence that suggests it has been exploited on any Router deployment.

By @Geal in #2982

Rate limit errors emitted from OpenTelemetry (Issue #2953)

When a batch span exporter is unable to accept a span because the buffer is full, it will emit an error. These errors can be very frequent and could potentially impact performance. To mitigate this, OpenTelemetry errors are now rate limited to one every ten seconds, per error type.
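
A per-error-type rate limit of this kind can be sketched as follows. The class name, the error type strings, and the injectable clock are all hypothetical and chosen for illustration; this is not the Router's code.

```python
import time


class ErrorRateLimiter:
    """Allow at most one emission per error type per interval."""

    def __init__(self, interval_seconds: float = 10.0, clock=time.monotonic):
        self._interval = interval_seconds
        self._clock = clock
        self._last_emitted = {}  # error type -> time of last emission

    def should_emit(self, error_type: str) -> bool:
        now = self._clock()
        last = self._last_emitted.get(error_type)
        if last is not None and now - last < self._interval:
            return False  # still inside the quiet period for this type
        self._last_emitted[error_type] = now
        return True


# Drive the limiter with a fake clock so the example is deterministic.
fake_now = [0.0]
limiter = ErrorRateLimiter(clock=lambda: fake_now[0])

results = []
for seconds, kind in [(0, "buffer_full"), (1, "buffer_full"),
                      (2, "export_failed"), (11, "buffer_full")]:
    fake_now[0] = float(seconds)
    results.append(limiter.should_emit(kind))
print(results)
```

Keying the timestamps by error type means a burst of one kind of error does not suppress the first occurrence of a different kind.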

By @bryncooke in #2954

Improved messaging when a request is received without an operation (Issue #2941)

The message that is displayed when a request has been sent to the Router without an operation has been improved. This is a developer experience improvement, since users (especially those using GraphQL for the first time) might send a request to the Router using a tool that isn't GraphQL-aware, or might just have their API tool of choice misconfigured.

Previously, the message stated "missing query string", but now more helpfully suggests sending either a POST or GET request and specifying the desired operation as the query parameter (i.e., either in the POST data or in the query string parameters for GET queries).

By @kushal-93 in #2955

Traffic shaping configuration fix for global `experimental_enable_http2`

We've resolved a case where the experimental_enable_http2 feature wouldn't properly apply when configured with a global configuration.

Huge thanks to @westhechiang, @leggomuhgreggo, @vecchp and @davidvasandani for discovering the issue and finding a reproducible test case!

By @o0Ignition0o in #2976

Limit the memory usage of the `apollo` OpenTelemetry exporter (PR #3006)

We've added a new LRU cache in place of a Vec for sub-span data to avoid keeping all events for a span in memory, since we don't need it for our computations.
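
To illustrate why an LRU cache bounds memory where a growing list does not, here is a minimal sketch. The `LruCache` class, its capacity, and the span keys are hypothetical; the Router's exporter is written in Rust and its cache may differ in every detail.

```python
from collections import OrderedDict


class LruCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._entries = OrderedDict()

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)  # refresh recency on update
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # drop the oldest entry

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # a read also counts as a use
        return self._entries[key]

    def __len__(self):
        return len(self._entries)


cache = LruCache(capacity=2)
cache.put("span-1", {"events": 3})
cache.put("span-2", {"events": 1})
cache.get("span-1")                 # span-1 is now the most recently used
cache.put("span-3", {"events": 7})  # evicts span-2
print(len(cache), cache.get("span-2"))
```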

By @bnjjj in #3006

prep release: v1.16.0

7d73e71

abernix requested review from a team, Geal, SimonSapin and bnjjj May 3, 2023 17:03

abernix changed the title ~~prep release: v~~ prep release: v1.16.0 May 3, 2023

apollo-bot2 assigned abernix May 3, 2023

abernix changed the base branch from dev to 1.16.0 May 3, 2023 17:04

abernix requested review from garypen and BrynCooke May 3, 2023 17:04

abernix marked this pull request as ready for review May 3, 2023 17:04

abernix requested a review from StephenBarlow as a code owner May 3, 2023 17:04

Remove trailing empty section

18a5a7c

Geal reviewed May 3, 2023

CHANGELOG.md: 4 review comments (outdated, resolved)

abernix commented May 3, 2023

docs/source/federation-version-support.mdx: 1 review comment (resolved)

Apply suggestions from code review

3d205ea

Co-authored-by: Geoffroy Couprie <apollo@geoffroycouprie.com>

garypen reviewed May 3, 2023

CHANGELOG.md: 4 review comments (outdated, resolved)

docs/source/federation-version-support.mdx: 1 review comment (resolved)

Geal approved these changes May 3, 2023

abernix and others added 2 commits May 3, 2023 20:29

Apply suggestions from code review

1ef747d

Co-authored-by: Gary Pennington <gary@apollographql.com>

Update CHANGELOG.md

c6db5b5

Co-authored-by: Gary Pennington <gary@apollographql.com>

abernix requested a review from garypen May 3, 2023 17:30

garypen approved these changes May 3, 2023

abernix enabled auto-merge (squash) May 3, 2023 17:41

abernix merged commit 60285d1 into 1.16.0 May 3, 2023
1 check passed

abernix deleted the prep-1.16.0 branch May 3, 2023 17:46

This was referenced May 3, 2023

release: v1.16.0 #3033

Merged

Reconcile dev after merge to main for v1.16.0 #3034

Merged

abernix added a commit that referenced this pull request May 4, 2023

Reconcile dev after merge to main for v1.16.0 (#3034)

bdf87fc

Merely merges `main` into `dev` after the v1.16.0 release in #3032
