Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kitsune Sharded Gossip #834

Merged
merged 94 commits into from
Aug 5, 2021
Merged

Kitsune Sharded Gossip #834

merged 94 commits into from
Aug 5, 2021

Conversation

freesig
Copy link
Contributor

@freesig freesig commented Jun 18, 2021

No description provided.

kitsune_space: Arc<kitsune_p2p::KitsuneSpace>,
since_ms: u64,
until_ms: u64,
arc_set: Arc<kitsune_p2p_types::dht_arc::DhtArcSet>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was scanning this code and it occurred to me that there's quite a surprising overlap between names coming up in the code here, given Atomic Reference Counting = Arc, and then the DHT related implementation is dealing in Arc's. The only thing that came to my mind about this is possibly leaving simple and short code comments to distinguish that the Arc<> on the outside is different than the ArcSet on the inside for future code reviewers and Holochain contributors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps. I mean DhtArc is hopefully clearly different to Arc. Where would you put the comment though? This type is used in a lot of places. I hope that anyone familiar enough with rust would see Arc and know what that is. The only other thing to do would be to change the name from DhtArc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's really nothing to be done for it thats useful. I suppose I just thought it was amusing in a way, and unlikely, and pictured some poor reader down the road :p nvm me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could call it a Darc for short?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distinction seems clear to me in code, but then when we talk about it more colloquially and in comments, there is this unfortunate name collision.

I still think some variant of "arc" is the best name for it, since it is really helpful to think of it as an arc of a circle, and that is the correct geometrical term.

The only other geometrical term that could make sense is a "sector", which includes the arc points but also includes the area between the two radii. This seems less precise, since really we are only dealing with the one-dimensional wrapping space of DHT locations, and I don't know what the area inside the circle would represent.

Base automatically changed from kitsune-arcsets-2 to develop June 25, 2021 20:45
Comment on lines +452 to +453
// FIXME: This sucks we have to iterate through the whole vec just to add Arcs.
// Are arcs really saving us that much?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, this is a case where I thought you were talking about DhtArcs here, not sync::Arcs, and was trying to wrap my head around it.

Well, the primitive type that kitsune deals with is basically Arc. So perhaps p2p_gossip_query_agents should be wrapping them in Arcs before returning. Is there any other reason not to? Other than the slight overhead of Arcs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, I take it you saw my comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the distinction seems clear to me in code, but then when we talk about it more colloquially and in comments, there is this unfortunate name collision

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arc are a pain when you don't need them and you want an owned or mut value because you either need to clone the underlying value or try_unwrap it which is hard to get right so I'd prefer that the db didn't return arcs but I'm not going to object to it if everyone thinks it's the way to go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some comments in the code thinking the same thing, especially for op hashes, I think it's wasteful to put them all in Arcs especially when there could be thousands of them.

PutMetricDatum { .. } | QueryMetrics { .. } => {
// Same goes for metrics
| QueryGossipAgents { .. }
| FetchOpHashesForConstraints { .. }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just rename this QueryOpHashes to match the others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query returns the agents and their arcs though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or you mean rename the fetch op hashes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, rename fetch. "fetch X for constraints" just sounds like a long way to say "query X" to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it, let me know if you object

Copy link
Member

@maackle maackle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still reading through, here's my comments so far

Comment on lines +378 to +383
// TODO: Check if the node is a successful peer or not and set the timeout accordingly.
// TODO: Actually I think this is the wrong time out? This is
// how long we wait to timeout a round.
round_timeout: self
.tuning_params
.gossip_peer_on_success_next_gossip_delay_ms,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should either of these TODOs be addressed now, especially the second one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh maybe we just have a completely different tuning param for round timeouts? I don't think it has anything to do with if there was an error in the past or not

@maackle
Copy link
Member

maackle commented Aug 2, 2021

Hmm, the sanity_run_for_five_seconds is sometimes stalling. It just happened in CI and happened once locally, but now I've run it a dozen times locally and it always passes.

Copy link
Member

@maackle maackle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all my comments. I'll approve after this is wrapped up. The sanity_run_for_five_seconds is also sometimes hanging which we should probably look into as part of this.

Comment on lines +1 to +2
//! This module is the ideal interface we would have for the conductor (or other store that kitsune uses).
//! We should update the conductor to match this interface.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I was wondering what this module is for, I just noticed this comment. What do you mean though by updating the conductor, do you mean updating the whole KitsuneP2pEvent api?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh exactly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//! This module is the ideal interface we would have for the conductor (or other store that kitsune uses).
//! We should update the conductor to match this interface.
//! This module is the ideal interface we would have for the conductor (or other store that kitsune uses).
//! We should update the KitsuneP2pEvent API to match this interface.

Comment on lines +122 to +131
// maackle: this kind of sucks. We have to await the handler task in order
// to join any panics from that task into the test to cause a failure.
// However, if there were no panics, then the task will never end,
// so we need to wait "long enough" for any panics to occur
match tokio::time::timeout(tokio::time::Duration::from_secs(3), task).await {
// Err means timeout, which means no panic, which means the test passed
Err(_) => (),
// Ok means the task ended, which means there was a panic, which means the test should fail
Ok(err) => panic!("handler task panicked: {:?}", err),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need an adequate solution to this mess. @neonphog let's continue the discussion somewhere sometime.

space: &Arc<KitsuneSpace>,
agents: impl Iterator<Item = &Arc<KitsuneAgent>>,
op_hashes: Vec<Arc<KitsuneOpHash>>,
) -> KitsuneResult<Vec<(Arc<KitsuneOpHash>, Vec<u8>)>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems silly to put the op hash in an Arc and not the op data itself, which is larger. Seems to speak to the question of if whether want to always put our kitsune hash types in Arcs, or if we want to do it case-by-case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could put the data in an Arc too... but originally the hashes were in various HashSets so they actually got cloned, where the data was not. Once simple_bloom started keeping that sparse data map, perhaps it would have made sense to put an Arc around the data, because it was getting cloned, but the new gossip doesn't have a data map, right? so the data never gets cloned, and the Arc wouldn't help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh I don't think we should put this in an arc

fetch_op_hashes_for_constraints -> query_op_hashes
find_remote_agent_within_arc -> find_remote_agent_within_arcset
fetch_op_hash_data -> fetch_op_data
Copy link
Contributor

@neonphog neonphog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm comfortable merging this when others are ready.

Copy link
Member

@maackle maackle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥳

@freesig freesig merged commit 9ed2a32 into develop Aug 5, 2021
@freesig freesig deleted the kitsune-gossip branch August 5, 2021 23:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants