Skip to content

Conversation

jfreden
Copy link
Contributor

@jfreden jfreden commented Sep 4, 2025

This PR adds signing configuration for cross-cluster API keys.

The actual signing, verification and linking key identities to cross cluster api keys will be added in subsequent PRs.

The new configurations are:

cluster.remote.<cluster_alias>.signing.keystore.alias
cluster.remote.<cluster_alias>.signing.keystore.path
cluster.remote.<cluster_alias>.signing.keystore.secure_password
cluster.remote.<cluster_alias>.signing.keystore.algorithm
cluster.remote.<cluster_alias>.signing.keystore.type
cluster.remote.<cluster_alias>.signing.keystore.secure_key_password
cluster.remote.<cluster_alias>.signing.key.path
cluster.remote.<cluster_alias>.signing.key.secure_passphrase
cluster.remote.<cluster_alias>.signing.certificate.path
  • All new cluster settings can be dynamically updated through the cluster settings API
  • All new secure settings can be reloaded through the reload_secure_settings API
  • All files pointed to by the new settings are hot reloadable and the signing config will automatically be updated when the files change

@jfreden jfreden force-pushed the rcs2.1/conf_signing_cert branch from 970496f to 20ca640 Compare September 5, 2025 14:19
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered using the existing X509KeyPairSettings here but it proved difficult because:

  • They're not dynamic
  • If made dynamic, their default values are null, which is not supported for dynamic settings
  • Both default value and extra properties could be passed as parameters, but that just pollutes the code more than what makes sense IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered using the existing SSLConfigurationReloader, but it's hardcoded against SSLService and refactoring it to be more generic would impact a lot of other code so I opted to bake all the various update paths into one reloader.

logger.error(Strings.format("No signing credentials found in signing config for cluster [%s]", clusterAlias));
}
} catch (Exception e) {
// Since this can be called by the settings applier we don't want to surface an error here
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not ideal from a client perspective. You can mess up your config through the cluster settings API and it would only be visible from the logs (the settings update would return 200).

The question is how much validation should we do when accepting a configuration through the API? If we start to validate these settings, the order of update operations start to matter. For example the keystore has to contain any password before adding a path to a keystore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it wasn't for the keystore issue, then I'd definitely argue to make it a failure.

We've got three options, I think:

  1. Don't support changing the both password and filename. That seems limiting, but it's an option.
  2. Require that you update in a particular order and we have code to handle that. Probably that means update and reload the keystore first, and we cater for the case where we need to keep using the old password until the old settings are change.
  3. Be lenient in some ways.

I don't think we should be as lenient as you are here. I believe this will silently accept missing files, etc, which seems like it's too lax.

Also, I think we should consider retaining the old signing setting if the new one can't be loaded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved things around a little so the reloader is responsible for the settings and the signer is only for the signing configs. This enables me to throw exceptions freely in the signer, which in turn means that we can prevent a node to start with bad settings while being more lenient when the dynamic cluster settings are updated through the API.

Also, I think we should consider retaining the old signing setting if the new one can't be loaded.

I agree, I've changed it to work like that.

I believe this will silently accept missing files, etc, which seems like it's too lax.

I'm struggling with how to validate this. To resolve the file path I think we need the path to the config dir, which is not available at validation time. Validating when applying the setting triggers this assertion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling with how to validate this.

For some reason the version of AbstractScopedSettings.addAffixGroupUpdateConsumer that takes a list of Setting doesn't accept a validator (but the version that takes 2 Settings does).
I think we want to change that so you can run a validator as well.

But, looking at

  • public synchronized void addAffixGroupUpdateConsumer(List<Setting.AffixSetting<?>> settings, BiConsumer<String, Settings> consumer) {
    List<SettingUpdater> affixUpdaters = new ArrayList<>(settings.size());
    for (Setting.AffixSetting<?> setting : settings) {
    ensureSettingIsRegistered(setting);
    affixUpdaters.add(setting.newAffixUpdater((a, b) -> {}, logger, (a, b) -> {}));
    }
    addSettingsUpdater(new SettingUpdater<Map<String, Settings>>() {
    @Override
    public boolean hasChanged(Settings current, Settings previous) {
    return affixUpdaters.stream().anyMatch(au -> au.hasChanged(current, previous));
    }
    @Override
    public Map<String, Settings> getValue(Settings current, Settings previous) {
    Set<String> namespaces = new HashSet<>();
    for (Setting.AffixSetting<?> setting : settings) {
    SettingUpdater affixUpdaterA = setting.newAffixUpdater((k, v) -> namespaces.add(k), logger, (a, b) -> {});
    affixUpdaterA.apply(current, previous);
    }
    Map<String, Settings> namespaceToSettings = Maps.newMapWithExpectedSize(namespaces.size());
    for (String namespace : namespaces) {
    Set<String> concreteSettings = Sets.newHashSetWithExpectedSize(settings.size());
    for (Setting.AffixSetting<?> setting : settings) {
    concreteSettings.add(setting.getConcreteSettingForNamespace(namespace).getKey());
    }
    namespaceToSettings.put(namespace, current.filter(concreteSettings::contains));
    }
    return namespaceToSettings;
    }
    @Override
    public void apply(Map<String, Settings> values, Settings current, Settings previous) {
    for (Map.Entry<String, Settings> entry : values.entrySet()) {
    consumer.accept(entry.getKey(), entry.getValue());
    }
    }
    });
    }

It reports a setting change for every key that gets changed.

We could use a simpler version with a method signature of

    public synchronized void addAffixGroupUpdateConsumer(
        List<Setting.AffixSetting<?>> settings,
        Consumer<Map<String, Settings>> consumer, 
        Consumer<Map<String, Settings>> validator) {
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the suggested addAffixGroupUpdateConsumer and I'm now using it to validate that the files exist.

We could use a simpler version with a method signature of

By simpler I assume you mean that we could have a single call to consume and validate instead of one per group? I went with the same logic that is already there to not break existing callers. I can try to change the behaviour so the consumer and validator is only called once for all groups if you feel strongly.

@jfreden jfreden added :Security/Security Security issues without another label >enhancement labels Sep 5, 2025
@jfreden jfreden marked this pull request as ready for review September 5, 2025 14:44
@elasticsearchmachine elasticsearchmachine added the Team:Security Meta label for security team label Sep 5, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

@elasticsearchmachine
Copy link
Collaborator

Hi @jfreden, I've created a changelog YAML for you.

@tvernum
Copy link
Contributor

tvernum commented Sep 8, 2025

At a high level this looks good to me. I'll do a thorough review tomorrow.

logger.error(Strings.format("No signing credentials found in signing config for cluster [%s]", clusterAlias));
}
} catch (Exception e) {
// Since this can be called by the settings applier we don't want to surface an error here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it wasn't for the keystore issue, then I'd definitely argue to make it a failure.

We've got three options, I think:

  1. Don't support changing the both password and filename. That seems limiting, but it's an option.
  2. Require that you update in a particular order and we have code to handle that. Probably that means update and reload the keystore first, and we cater for the case where we need to keep using the old password until the old settings are change.
  3. Be lenient in some ways.

I don't think we should be as lenient as you are here. I believe this will silently accept missing files, etc, which seems like it's too lax.

Also, I think we should consider retaining the old signing setting if the new one can't be loaded.

elasticsearchmachine pushed a commit that referenced this pull request Sep 11, 2025
…s loaded (#134349)

The pattern to load secure settings for later use during secure settings
reload has been repeated across the code base and is now needed by
#134137. This PR extracts
the functionality to a `InMemoryClonedSecureSettings ` utility class.
@jfreden jfreden requested a review from tvernum September 11, 2025 12:48
@jfreden jfreden requested a review from a team as a code owner September 15, 2025 07:46
"cluster.remote." + remoteCluster + ".signing.keystore.path",
getDataPath("/org/elasticsearch/xpack/security/signature/signing." + (inFipsJvm() ? "bcfks" : "jks"))
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're missing some signing tests.

We have some in the integ tests, but I'd like to see some low level "does signing with this type of key work" tests for RCA/EC, PKCS#12/BCFKS, etc.

We can hold off on those until we have the signature validation in place if you think it will be easier to write complete tests cases then (when we can verify that signatures validate but mismatched keys don't, and changed bytes don't, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. My plan is to add those tests when we have the validation in place.

@jfreden jfreden merged commit d235a74 into elastic:main Sep 16, 2025
41 checks passed
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Sep 17, 2025
* Add signing configuration for cross cluster api keys

* Update docs/changelog/134137.yaml

* Add support for validator in addAffixGroupUpdateConsumer

* Use validator

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
gmjehovich pushed a commit to gmjehovich/elasticsearch that referenced this pull request Sep 18, 2025
* Add signing configuration for cross cluster api keys

* Update docs/changelog/134137.yaml

* Add support for validator in addAffixGroupUpdateConsumer

* Use validator

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
elasticsearchmachine pushed a commit that referenced this pull request Oct 10, 2025
This PR is a follow up to
#134137 and
#134893. It adds
serialization and verification of the new header: 

- The signing configurations are now used to generate a signature that's passed as a header for cross cluster api keys.
- The signature headers are deserializaed and validated on the server side and auth fails if the validation fails. 

This PR does not use the certificate identity that was added in
#134604 to verify that the
identity in the passed leaf certificate belongs to the signed cross
cluster API key by matching it against the API key certificate identity
pattern. That will be done in a follow up PR to keep the scope of this
PR manageable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :Security/Security Security issues without another label Team:Security Meta label for security team v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants