graph-chain-ethereum: Avoid adapters with errors #4468

Merged
merged 4 commits into from
Mar 28, 2023

mangas
Contributor

@mangas mangas commented Mar 17, 2023

  • Wire EndpointMetrics for rpc endpoints
  • Use a percentage of traffic to retest errored adapters
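The two bullets above can be illustrated with a minimal sketch. It is not the code from this PR: `AdapterStats`, `pick_adapter`, `roll`, and `retest_percent` are hypothetical names, and the caller is assumed to supply `roll` from a random source. Most requests go to the adapter with the fewest recorded errors, while a small share of traffic may land on any adapter so that errored ones get retested.

```rust
/// Hypothetical stand-in for an RPC adapter and its current error count.
#[derive(Debug, Clone, PartialEq)]
pub struct AdapterStats {
    pub provider: String,
    pub error_count: u64,
}

/// Picks an adapter for the next request.
///
/// `roll` is a caller-supplied value in 0..100 (for example from an RNG) and
/// `retest_percent` is the share of traffic allowed to reach any adapter,
/// including errored ones. Both parameters are assumptions of this sketch.
pub fn pick_adapter(
    adapters: &[AdapterStats],
    roll: u8,
    retest_percent: u8,
) -> Option<&AdapterStats> {
    if adapters.is_empty() {
        return None;
    }
    if roll < retest_percent {
        // Retest path: ignore error counts so a failed adapter can recover.
        return adapters.get(roll as usize % adapters.len());
    }
    // Normal path: prefer the adapter with the fewest recorded errors.
    adapters.iter().min_by_key(|a| a.error_count)
}
```

Passing the roll in, rather than generating it inside the function, keeps the sketch free of external crates and makes both paths easy to test.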

@mangas mangas force-pushed the filipe/rpc-avoid-failed-adapters branch 3 times, most recently from ee5ece0 to 249f64a Compare March 22, 2023 10:29
@mangas mangas marked this pull request as ready for review March 22, 2023 10:31
@mangas mangas requested review from neysofu and leoyvens March 22, 2023 10:32
@mangas mangas force-pushed the filipe/rpc-avoid-failed-adapters branch from 249f64a to 10d915e Compare March 22, 2023 10:35
@@ -841,7 +838,7 @@ impl EthereumAdapter {
 #[async_trait]
 impl EthereumAdapterTrait for EthereumAdapter {
     fn url_hostname(&self) -> &str {
Collaborator

I see only one user of this method, and it's for logs. I believe the intention is for all logs to use the provider name, so let's remove this and replace it with the provider.

Contributor Author

removed


pub fn current_error_count(&self) -> u64 {
    self.endpoint_metrics
        .get_count(&self.adapter.url.as_str().into())
Collaborator

Can we use provider instead to identify the adapter, just to keep the url private?

Contributor Author

For adapters we can. It leaks a bit of the implementation details into the config, so I added comments where relevant.

  pub capabilities: NodeCapabilities,
- adapter: Arc<EthereumAdapter>,
+ pub adapter: Arc<EthereumAdapter>,
Collaborator

This pub seems unnecessary.

Contributor Author

removed

@mangas mangas force-pushed the filipe/rpc-avoid-failed-adapters branch 2 times, most recently from 8f1ac22 to b101e9d Compare March 27, 2023 15:09
@mangas mangas requested a review from leoyvens March 27, 2023 15:13
Collaborator

@leoyvens leoyvens left a comment

My suggestion would be that we switch to always identifying providers by name rather than URL, for metrics purposes. URLs can contain sensitive information such as auth data, so it's best to restrict the places where they are exposed. So instead of ChainSection::provider_urls we'd have ChainSection::providers exposing just the provider names for the metrics.
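A rough sketch of that suggestion, under stated assumptions: the `Provider` and `ChainSection` types below are simplified, hypothetical stand-ins rather than the real graph-node config types, and the field names are invented for illustration. The point is only that the accessor used for metrics returns names, so a URL carrying credentials never reaches a metric label.

```rust
/// Simplified, hypothetical provider entry: a public name plus a URL that
/// may embed credentials.
pub struct Provider {
    pub label: String,
    pub url: String,
}

/// Simplified, hypothetical chain config section.
pub struct ChainSection {
    pub configured: Vec<Provider>,
}

impl ChainSection {
    /// Returns only provider names, suitable for use as metric labels.
    /// URLs are deliberately not exposed here.
    pub fn providers(&self) -> Vec<String> {
        self.configured.iter().map(|p| p.label.clone()).collect()
    }
}
```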

@mangas
Contributor Author

mangas commented Mar 27, 2023

Provider names won't work for firehose because we append a number to the name while creating the conn_pool, so providerX becomes providerX-N, where N ranges over 0..total number of connections. Hence, we can't dedup based on the name, only on the URL. We could have a "parent provider" which returns providerX for all of them, but it's quite easy to make a small change to this and just break the metrics.
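To make the naming issue concrete, here is a small sketch. `PooledConn`, `conn_label`, and `distinct_providers` are hypothetical names, not graph-node APIs. Per-connection labels of the form `providerX-N` are all distinct, so deduplicating on them does not work, whereas an explicit parent name collapses a pool back to one provider.

```rust
use std::collections::BTreeSet;

/// Per-connection label in the `providerX-N` form described above.
pub fn conn_label(provider: &str, index: usize) -> String {
    format!("{}-{}", provider, index)
}

/// Hypothetical pooled connection that keeps its parent provider explicit
/// instead of encoding it only inside the label string.
pub struct PooledConn {
    pub parent: String,
    pub index: usize,
}

impl PooledConn {
    /// Label unique to this connection, e.g. for logs.
    pub fn label(&self) -> String {
        conn_label(&self.parent, self.index)
    }
}

/// Deduplicates connections by parent provider name.
pub fn distinct_providers(conns: &[PooledConn]) -> BTreeSet<&str> {
    conns.iter().map(|c| c.parent.as_str()).collect()
}
```

Keeping the parent as a separate field avoids parsing the `-N` suffix back out of the label, which would be fragile if a provider name itself ended in a number.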

@leoyvens
Collaborator

leoyvens commented Mar 27, 2023

Having the 'parent provider' seems like a solution. You are concerned that the user changes the provider name in the config and loses the continuity of the metrics? The same could be said if the URL changes, I don't think one is more likely to change than the other.

@mangas
Contributor Author

mangas commented Mar 27, 2023

> Having the 'parent provider' seems like a solution. You are concerned that the user changes the provider name in the config and loses the continuity of the metrics? The same could be said if the URL changes, I don't think one is more likely to change than the other.

My concern is more about the code allowing a dev to do the wrong thing inadvertently, since the details of the conn pool leak into different components like the metrics and deduplication.

@mangas mangas force-pushed the filipe/rpc-avoid-failed-adapters branch 3 times, most recently from 057f16d to a42b4e2 Compare March 28, 2023 10:38
@mangas mangas force-pushed the filipe/rpc-avoid-failed-adapters branch from a42b4e2 to 9cd8458 Compare March 28, 2023 15:28
- &format!("{}-{}", provider.label, i),
+ // This label needs to be the original label so that the metrics
+ // can be deduped.
+ &provider.label,
Collaborator

This is a functional change, right? Instead of segregating the metrics by individual conns, they are grouped for the whole pool? Makes sense to me, but just confirming.

Contributor Author

The metrics will be tagged with provider/host, result, request type and so on. I changed from hosts to providers, which means that the dedup needs to happen at the provider level (e.g., if one conn of providerA fails, all of the providerA conns should be de-prioritised until checked again).
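The provider-level behaviour described here could look roughly like the sketch below. `ProviderErrors` and its methods are hypothetical and are not the `EndpointMetrics` type from this PR. Error counts are keyed by provider name, so a failure on any single connection lowers the priority of every connection that shares that name.

```rust
use std::collections::HashMap;

/// Hypothetical error tracker keyed by provider name rather than by
/// individual connection.
#[derive(Default)]
pub struct ProviderErrors {
    counts: HashMap<String, u64>,
}

impl ProviderErrors {
    /// Records a failure against the provider, whichever conn reported it.
    pub fn record_failure(&mut self, provider: &str) {
        *self.counts.entry(provider.to_string()).or_insert(0) += 1;
    }

    /// Current error count for a provider; unknown providers count as 0.
    pub fn error_count(&self, provider: &str) -> u64 {
        self.counts.get(provider).copied().unwrap_or(0)
    }

    /// Orders providers with the fewest errors first. The sort is stable,
    /// so providers with equal counts keep their input order.
    pub fn prioritise<'a>(&self, providers: &[&'a str]) -> Vec<&'a str> {
        let mut ordered = providers.to_vec();
        ordered.sort_by_key(|p| self.error_count(p));
        ordered
    }
}
```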

Collaborator

That makes sense to me.

@mangas mangas merged commit 7f67bb8 into master Mar 28, 2023
@mangas mangas deleted the filipe/rpc-avoid-failed-adapters branch March 28, 2023 18:56