@OriolMunoz-da
Contributor

Fixes excessive refreshes that were triggered even while the BFT guarantees were still ensured (see #1141 (comment))

[ci]

Signed-off-by: Oriol Muñoz <oriol.munoz@digitalasset.com>
Signed-off-by: Oriol Muñoz <oriol.munoz@digitalasset.com>
Contributor Author

The alternative would be to drop the whole retry mechanism, but then the list would only be refreshed every ~10m (which is otherwise a good-enough default).

val failed = connections.failed
val total = connections.totalNumber
val f = connections.f
if (connections.failed > connections.f)
Contributor

Hm, should this be f or 2f+1? I thought we want 2f+1 working connections so that we can read from them and tolerate f failures.
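
For reference, the numbers in question can be worked out with a small sketch. It assumes the usual BFT sizing in which f is the largest integer such that n >= 3f + 1; how connections.f is actually derived is not shown in this thread, and BftThresholds and its field names are hypothetical.

```scala
// Hypothetical helper, not part of the PR: threshold arithmetic for n connections.
// Assumption: f is the largest integer with n >= 3f + 1, i.e. f = (n - 1) / 3.
final case class BftThresholds(total: Int) {
  require(total >= 1, "need at least one connection")

  // Number of faulty nodes tolerated under the assumption above.
  val f: Int = (total - 1) / 3

  // f + 1 agreeing responses must include at least one non-faulty node.
  val readQuorum: Int = f + 1

  // 2f + 1 working connections still leave f + 1 after f of them fail.
  val fullQuorum: Int = 2 * f + 1
}

object BftThresholdsExample extends App {
  Seq(1, 4, 7, 10).foreach { n =>
    val t = BftThresholds(n)
    println(s"n=$n f=${t.f} f+1=${t.readQuorum} 2f+1=${t.fullQuorum}")
  }
}
```

With n = 4 this gives f = 1, f + 1 = 2 and 2f + 1 = 3, which are the candidate thresholds being compared in this comment.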

Contributor Author

actually that was stupid, refresh already does the check:

val defaultCallConfig = BftCallConfig.default(connections)
// Most but not all calls will use the default config.
// Fail early if there are not enough Scans for the default config
if (!defaultCallConfig.enoughAvailableScans) {
  throw io.grpc.Status.FAILED_PRECONDITION
    .withDescription(
      s"There are not enough Scans to satisfy f=${connections.f}. Will be retried. State: $newState"
    )
    .asRuntimeException()
} else {
  connections
}

Contributor Author

fix'd

Contributor Author

(where the condition is also: connections.size >= targetSuccess, i.e. connections.size >= f + 1)
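
A minimal sketch of how the two conditions mentioned in this thread can disagree. ScanConnectionsSketch and its fields are hypothetical stand-ins (the real connection state and BftCallConfig are not reproduced here); only targetSuccess = f + 1 and the failed > f comparison are taken from the comments above.

```scala
// Hypothetical stand-in for the connection state; not the project's real type.
final case class ScanConnectionsSketch(open: Int, failed: Int, f: Int) {
  // From the comment above: targetSuccess is f + 1.
  val targetSuccess: Int = f + 1

  // Mirrors the described condition connections.size >= targetSuccess.
  def enoughAvailableScans: Boolean = open >= targetSuccess

  // The comparison from the reviewed snippet: failed > f.
  def failedExceedsF: Boolean = failed > f
}

object ScanConnectionsSketchExample extends App {
  // Assumed scenario: 4 scans in total, f = 1, two of them currently failing.
  val state = ScanConnectionsSketch(open = 2, failed = 2, f = 1)
  println(s"enoughAvailableScans=${state.enoughAvailableScans}") // true: 2 >= 2
  println(s"failedExceedsF=${state.failedExceedsF}")             // true: 2 > 1
}
```

In this assumed scenario the f + 1 check still passes while failed > f is already true, so the two conditions are not interchangeable.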

Signed-off-by: Oriol Muñoz <oriol.munoz@digitalasset.com>
Contributor

@moritzkiefer-da moritzkiefer-da left a comment

thanks!

Contributor

@rautenrieth-da rautenrieth-da left a comment

Thanks!

Would it make sense to add a test, something like the existing "periodically refresh the list of scans" in BftScanConnectionTest?

@OriolMunoz-da OriolMunoz-da merged commit ae01265 into main Jun 18, 2025
60 checks passed
@OriolMunoz-da OriolMunoz-da deleted the oriol/bft-less-refresh branch June 18, 2025 10:19
4 participants