Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make initial connection check optional #75

Closed
wants to merge 2 commits into from

Conversation

pk910
Copy link
Contributor

@pk910 pk910 commented Sep 23, 2023

I've added an additional parameter which can be used to disable the initial connection check:

client, err := http.New(ctx,
	http.WithAddress(endpoint),
	http.WithConnectionCheck(false),
)

This is useful when interacting with clients in early testnets, that might not fully follow the specs yet.
Even with these inconsistencies, I'd like to use these clients for other queries and would like to do this kind of error handling myself when necessary.
The forced initial check currently prevents these clients from being used via this library.

I've also made the DVT check optional and changed visibility of the optional code for max flexibility :)
This way, everyone can decide himself which checks are necessary if the built in defaults don't fit the individual needs.
The default behavior of this library is still the same, as the connection & DVT checks are enabled by default.

@mcdee
Copy link
Contributor

mcdee commented Oct 21, 2023

After reviewing the code, it potentially makes sense for this to be part of the multi package, as that has the ability to retry connections built into it. This could be enabled if passed addresses rather than node connections, and would be an extension of the active/inactive system that exists inside it today. Would this meet the requirement?

@mcdee
Copy link
Contributor

mcdee commented Feb 29, 2024

I have looked at this as part of a rewrite of the connection handling system, and the latest master now has the ability for nodes to come and go.

To enable it, add the parameter WithAllowDelayedStart(true) to the service startup. You can query the status of the client with IsActive(ctx) and IsSynced(ctx).

This code impacts a number of critical areas, so won't be released for a while, but if you can test it out against master and provide any feedback it would be appreciated.

@pk910
Copy link
Contributor Author

pk910 commented Feb 29, 2024

It's unfortunately the same "problem" as for the initial commit.

I'm doing all that error handling and liveness checks on my side to have better control over client connections.
I'm effectively using go-eth2-client for types & as a RPC layer only, all these isActive & isSynced checks are making the situation a lot worse.
Now I have to wait for your liveness check to mark the connection alive again after an outage, and can't even ask it to do a ping now...
Not even talking about the immense amount of unnecessary requests that are done due to the connection check.
(But honestly, that was the same with the initial connection check you had before)

I'm actually not sure what you're trying to achieve with these forced extra layers of checks..
Please pass through errors as they appear on RPC level and let the applications above handle that properly.
It should be at least an optional feature to disable these additional checks.

All that liveness checks are really bad for situations where you hold like 40 or 50 client connections.
Not only that it does recurring unnecessary pings to all of these, it's also blocking me from using several endpoints that would work even on unsynced / broken clients - just because your lib says no for no reason...

@mcdee
Copy link
Contributor

mcdee commented Feb 29, 2024

Now I have to wait for your liveness check to mark the connection alive again after an outage, and can't even ask it to do a ping now...

I can expose the method to update the status if you like.

Not even talking about the immense amount of unnecessary requests that are done due to the connection check.

At current this should be 1 request every 30 seconds, to an endpoint that is returning a small amount of data. Are you seeing more than that?

Please pass through errors as they appear on RPC level and let the applications above handle that properly.

When you say RPC level you mean the REST responses? These should be returned already, assuming that the connection is present. Also, I want to keep the protocol somewhat abstracted so don't want to be passing back connection-level errors (for example) beyond what is obtained during the setup phase.

Not only that it does recurring unnecessary pings to all of these, it's also blocking me from using several endpoints that would work even on unsynced / broken clients - just because your lib says no for no reason...

Each endpoint has a check in place, which requires the connection to either be active or synced. Are there any calls you are seeing that have a synced check when it should just be an active check?

@pk910
Copy link
Contributor Author

pk910 commented Feb 29, 2024

I can expose the method to update the status if you like.

I think that'd help a lot 👍
With the ping method exposed the application can still get the connection up on purpose.

It might be worth adding a callback so the application can react to connection losses, but polling the exposed IsActive / IsSynced should be fine too.

At current this should be 1 request every 30 seconds, to an endpoint that is returning a small amount of data. Are you seeing more than that?

Yea, you're right. I've thought you were doing the whole connection check as done before (including static values checks). which would be very heavy when done periodically.
I agree, only calling NodeSyncing is a big improvement on that :)

Also, I want to keep the protocol somewhat abstracted so don't want to be passing back connection-level errors

I think that's the root case of our discussion :D
I do see your point of these abstractions, but it kind of makes the library harder to use when relying on these low level errors in the application logic above.

However, after reviewing your changes again, I think that's actually a good middle way.
I'll try updating my stuff accordingly and will see how it works :)

@mcdee
Copy link
Contributor

mcdee commented Feb 29, 2024

Thanks for the additional info and thoughts.

I have added a public CheckConnectionState() and also added hooks for the various state transitions. I'm a bit happier with where I'm passing contexts around compared to the last version, but do want to carry out some more testing so the API might change a bit there.

I'm considering having different check intervals depending on the state of the connection, for example if the connection is down/desynced then ping it more frequently to find out if it is up/synced. That said, exposing CheckConnectionState() means that the existing system can be supplemented by external code anyway, so I may leave this as a "good enough" default.

@mcdee
Copy link
Contributor

mcdee commented Mar 25, 2024

Functionality now available in the library.

@mcdee mcdee closed this Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants