Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace NodeRpcClient with direct NodeClient implementation for the client #192

Merged
merged 2 commits into from
May 23, 2024

Conversation

dastansam
Copy link
Contributor

@dastansam dastansam commented May 13, 2024

closes #27

Notes

SubspaceRpcApiServer

Implementation is mostly borrowed from SubspaceRpcApiServer implementation. I removed deny_unsafe feature, since this implementation is not exposed to outside and modified subscription functions.

MaybeNodeClient

I am still using the old way of instantiating a NodeClient implementation, i.e first default and then inject it once inner value is instantiated, to ensure it is easy to follow the changes. Should we change this, i.e don't wrap it in struct and ArcSwapOption?

Dynamic client type

I spent some time trying to implement your suggestion about dynamic FullClientT trait, however, couldn't manage to do it. The main blocker was the ProvideRuntimeApi which has an associated Api type which should be specified to make the custom trait object safe:

pub trait FullClientT: ProvideRuntimeApi<Block, Api = SomeApi> + HeaderBackend<Block> + ...;

impl<T> FullClientT for T where ProvideRuntimeApi<Block, Api = SomeApi> + HeaderBackend<Block> + ...;

// and this was the main issue
pub struct InnerNodeClient {
    client: Arc<dyn FullClientT>
}

maybe I was missing something, so if you have some suggestion for this, I can tackle it again

.subscribe()
.map(handle_slot_notification);

Ok(Box::pin(stream))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had to modify subscription functions, so extra careful look here, please. This used pipe_from_stream(pending, stream) function and it seems it's impossible to instantiate PendingSubscriptionSink without using jsonrpsee proc-macro magic.

self.subscription_executor.spawn(
    "subspace-slot-info-subscription",
    Some("rpc"),
    pipe_from_stream(pending, stream).boxed(),
);

},
);

Ok(Box::pin(stream))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment about subscription functions

@dastansam dastansam marked this pull request as ready for review May 13, 2024 16:09
@dastansam dastansam requested a review from nazar-pc May 13, 2024 16:09
@dastansam dastansam self-assigned this May 13, 2024
Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good, left a few notes and in-line comments:

Implementation is mostly borrowed from SubspaceRpcApiServer implementation. I removed deny_unsafe feature, since this implementation is not exposed to outside and modified subscription functions.

If this is the case, it is always nicer to create a file and drop the original in there with no changes, writing in the commit message where it was taken from. Then modifying it in a separate commit such that I see exactly where it is coming from.

I am still using the old way of instantiating a NodeClient implementation, i.e first default and then inject it once inner value is instantiated, to ensure it is easy to follow the changes. Should we change this, i.e don't wrap it in struct and ArcSwapOption?

If this is still necessary then I do not understand why you removed MaybeNodeClient. You are basically doing the same thing, you just renamed it and created InnerNodeClient exposed to the outside.

Inner data structures are usually private, so I'd vote for a similar setup to what was used prior: MaybeNodeClient + a client, in this case something like DirectNodeClient or something along those lines.

I spent some time trying to implement your suggestion about dynamic FullClientT trait, however, couldn't manage to do it. The main blocker was the ProvideRuntimeApi which has an associated Api type which should be specified to make the custom trait object safe:

I wish you talked to me about this. You can have fully generic node client, then wrap it into Box<> or Arc<> (depending on what works better, they both allocate on the heap) and then you can use it as Box<dyn NodeClient> without worrying about generic types of the struct that implements it.

So ideally something like this:

  1. Modify MaybeNodeClient to work with Box<dyn NodeClient> or Arc<dyn NodeClient> or something similar depending on what works best
  2. Copy RPC code with no changes (so I don't have to review it at all), probably the whole file literally
  3. Change RPC code to implement NodeClient instead of SubspaceRpcApiServer and use with MaybeNodeClient

src/backend/farmer/node_client.rs Outdated Show resolved Hide resolved
senders: Vec<async_oneshot::Sender<RewardSignatureResponse>>,
}

/// In-memory cache of last archived segment, such that when request comes back right after
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be necessary for RPC server. For in-process client you should be able to access pieces directly. Might be okay for now, but eventually this is definitely suboptimal.

We might want to change the API for example such that subscribe_archived_segment_headers gives you a Box<dyn Something> that you can extract segment header from, but then also iterate over all pieces in that segment. And on drop it'll do the acknowledgemnet automatically. Then we'll be able to more or less return a thin wrapped around ArchivedSegmentNotification instead of doing all of the tracking and matching that RPC server has to do. While RPC client can still wrap the same logic it does today behind nicer API, have alrady done this with Farm and other traits on farmer.

I understand this might seem outside of the scope, but this is the right way of thinking about the problem. Code is not static, we can change and improve APIs when they don't seem to match what we're trying to do particularly well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will create a separate issue for this

Cargo.toml Outdated Show resolved Hide resolved
@dastansam
Copy link
Contributor Author

dastansam commented May 16, 2024

I am mostly done with resolving the comments except this one:

Modify MaybeNodeClient to work with Box or Arc or something similar depending on what works best

do you mean having a dynamic NodeClient like this:

/// Wrapper around `NodeClient`
#[derive(Clone, Default)]
pub(in super::super) struct MaybeNodeClient {
    inner: Arc<dyn NodeClient>,
}

or

/// Inner node client that is injected into wrapper `MaybeNodeClient` once it is fully initialized
pub(in super::super) struct DirectNodeClient {
    client: Box<dyn FullClientT>,
    
// where `FullClientT` is super trait of all traits needed for a client

The first one fails because NodeClient trait does not have Sized trait constraint, second one I explained in the PR description

@nazar-pc
Copy link
Member

I meant the first one, but since it is a MaybeNodeClient, it'll be something like this instead:

#[derive(Debug, Clone, Default)]
pub(in super::super) struct MaybeNodeRpcClient {
    inner: Arc<ArcSwapOption<dyn NodeClient>>,
}

or this, or whatever similar thing works:

#[derive(Debug, Clone, Default)]
pub(in super::super) struct MaybeNodeRpcClient {
    inner: Arc<ArcSwapOption<Box<dyn NodeClient>>>,
}

Basically we need the same logic we had before, just with dyn trait instead of concrete type of the client so you don't have to mess with generics if you don't want to (generics are still an option though).

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming MaybeNodeRpcClient into MaybeNodeClient makes sense to me, but putting DirectNodeClient into the same module is not. I'd expect to see almost the same MaybeNodeRpcClient in the same module with slight modifications for dyn trait (which would work with NodeRpcClient as well, so should be done in the first commit) and new node client to be added in a separate module as described with last commit tying everyting together.

Does it make sense? If you could do that and resolve merge conflict with rebased code that'd be perfect.

src/backend/farmer/maybe_node_client.rs Outdated Show resolved Hide resolved
src/backend/farmer/maybe_node_client.rs Outdated Show resolved Hide resolved
@dastansam dastansam force-pushed the replace-rpc-node-client branch 4 times, most recently from cc82f29 to 5a5419e Compare May 21, 2024 00:29
@dastansam
Copy link
Contributor Author

dastansam commented May 21, 2024

ok, finally managed to squash everything into two commits. First commit makes MaybeNodeRpcClient work with dynamic NodeClientExt and updates upstream dependencies. Also adds SubspaceRpc copy which is not compiled but necessary for commit diff. Second commit adds DirectNodeClient and completely removes NodeRpcClient dependency

@nazar-pc
Copy link
Member

You shouldn't need NodeClientExt, just NodeClient. NodeClientExt provides a single additional method that can be implemented with SegmentHeadersStore directly where it is used (it is intentionally NodeClientExt for this reason because only CLI farmer will actually need it, not Space Acres).

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't review src/backend/farmer/direct_node_client.rs carefully yet, but this is in line with what I expected. I have a suspicion that rebasing takes quite a bit of effort for you, I can jump on a call and show how to do it very easily.

src/backend/networking.rs Outdated Show resolved Hide resolved
@@ -473,7 +473,7 @@ pub(super) async fn create_consensus_node(
force_new_slot_notifications: false,
subspace_networking: SubspaceNetworking::Reuse {
node,
bootstrap_nodes: dsn_bootstrap_nodes,
bootstrap_nodes: dsn_bootstrap_nodes.clone(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bootstrap nodes via RPC are used for network stack initialization, but here network stack is already initialized and there is no need to clone them and pass through to the DirectNodeClient. Even here you can actually already set it to Vec<> because it is only used for RPC that we'll be disabling anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general, I tried to remove all the irrelevant properties/parts of the RPC in the direct node client. Specifically for dsn_bootstrap_nodes, I was hesitant since it is used in the farmer_info()

src/backend/farmer.rs Show resolved Hide resolved
src/backend/farmer/direct_node_client.rs Show resolved Hide resolved
src/backend/farmer/direct_node_client.rs Show resolved Hide resolved
@dastansam
Copy link
Contributor Author

some comments in general:

  • NodeClientExt is used in create_network() function to fetch the last segment headers. Doesn't this make it necessary?
  • I am starting to get a lot more comfortable with squashing and rebasing, but tips for making it easier would help for sure
  • sometimes compiler hangs for a while and for some reason if I add a whitespace and remove it, it starts compiling again. so, sometimes those whitespaces accidentally end up in the commit. But I will be careful and run fmt before pushing :)

@nazar-pc
Copy link
Member

NodeClientExt is used in create_network() function to fetch the last segment headers. Doesn't this make it necessary?

Yes, the only reason it is there is to provide segment headers, but segment headers will come from SegmentHeadersStore. Not a blocker though, can always be refactored later.

I am starting to get a lot more comfortable with squashing and rebasing, but tips for making it easier would help for sure

JetBrains IDEs have excellent UI for them. Very easy to cherry-pick, squash, rebase, reorder commits, I'll show you.

sometimes compiler hangs for a while and for some reason if I add a whitespace and remove it, it starts compiling again. so, sometimes those whitespaces accidentally end up in the commit. But I will be careful and run fmt before pushing :)

I would recommend configuring IDE to always remove all whitespaces (except trailing new line) on save, so it never ends up being committed. Never had compiler getting stuck though 🤔

@dastansam
Copy link
Contributor Author

JetBrains IDEs have excellent UI for them. Very easy to cherry-pick, squash, rebase, reorder commits, I'll show you.

will take a look, but it's quite hard to switch IDEs once you are used to it :)

I would recommend configuring IDE to always remove all whitespaces (except trailing new line) on save, so it never ends up being committed. Never had compiler getting stuck though

I have all of this. Most of the times it does all the fixes and everything is good. But my machine struggles a little bit during the context/branch switches. And compiler getting stuck is probably due to couple of cargo processes running at the same time. but yeah, will be more cautious for sure

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a bunch of small comments, but overall makes sense and does what it should, thanks for patience with me 🙂

src/backend/farmer/direct_node_client.rs Outdated Show resolved Hide resolved
src/backend/farmer/direct_node_client.rs Outdated Show resolved Hide resolved
src/backend/farmer/direct_node_client.rs Outdated Show resolved Hide resolved
src/backend/farmer/direct_node_client.rs Outdated Show resolved Hide resolved
src/backend/farmer/direct_node_client.rs Show resolved Hide resolved
debug!(%piece_index, "Re-creating genesis segment on demand");

// Try to re-create genesis segment on demand
match recreate_genesis_segment(&*self.client, self.kzg.clone()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is problematic in async context, please wrap it in spawn_blocking and await here. In RPC it was a blocking RPC endpoint and didn't need to be wrapped because of it.

Copy link
Contributor Author

@dastansam dastansam May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was busy with another task, just got myself some time to fix this, sorry.

I tried to do this, but it seems I have to make Kzg struct thread safe. Currently it is not Send so can't have it inside the spawn_blocking closure. Do you have any suggestions to make this as less invasive as possible? Maybe wrap in Arc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is 100% Send and Sync. Just make sure you move a clone of Kzg and not moving self

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, my bad, issue was cached_archived_segment and MutexGuard.

@dastansam
Copy link
Contributor Author

Left a bunch of small comments, but overall makes sense and does what it should, thanks for patience with me 🙂

I force pushed the changes to last comments, didn't want to create new commit for nits. hope that's ok


async fn piece(&self, piece_index: PieceIndex) -> Result<Option<Piece>, Error> {
let cached_segment = {
let cached_archived_segment = self.cached_archived_segment.lock();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lock is not async, which means this code will block async executor for the duration of segment re-creation below. Moreover, in original code lock was intentionally held for the duration of the method, while new version doesn't, meaning that during concurrent access it is possible that you'll re-create archived segment unnecessarily twice.

Instead of taking lock twice like you did here, you should have fixed clippy complain by switching to async mutex instead (from async-lock crate), then code would be basically the same as it was before except a few .await points. Also you don't rally need to replace cached segment inside of blocking task, run just recreate_genesis_segment inside of blocking task and everything else can live outside of it just like it did before, we only want to move very expensive segment creation operation out of the async thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, resolved. Using async_lock::Mutex now. This affected subscribe_archived_segment_headers, so please take a look at it as well

@dastansam dastansam force-pushed the replace-rpc-node-client branch 2 times, most recently from 52bdcae to 38250f4 Compare May 23, 2024 11:56
@nazar-pc nazar-pc merged commit a2aa980 into main May 23, 2024
8 checks passed
@nazar-pc nazar-pc deleted the replace-rpc-node-client branch May 23, 2024 19:11
@nazar-pc nazar-pc mentioned this pull request May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace RPC node client and disable RPC on the node
2 participants