-
Notifications
You must be signed in to change notification settings - Fork 210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Libp2p integration #2801
Conversation
tryBuild succeeded: |
bors try |
tryBuild failed: |
in the last test kibana failed to setup on time, but otherwise tests look stable |
lp2p/bootstrap/bootstrap.go
Outdated
if !ok { | ||
panic("event must be of type EventHandshakeComplete") | ||
} | ||
if _, exist := peers[hs.PID]; exist { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this mean? is it expected to be notified multiple times of the same event? or that the node maybe complete handshake with the same peer multiple times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, connection with the same peer is not prohibited, the likelihood of it very low, but there is no explicit mechanism to enforce single connection if two peers will try to establish connection simultaneously, or if peer will create multiple connections by intention
lp2p/peerexchange/store.go
Outdated
} | ||
} | ||
|
||
w, err := os.Create(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wonder if we should create a tmp file and swap after the write is successful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be correct, i will open an issue after merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the explanations.
tortoisebeacon/tortoise_beacon.go
Outdated
// NOTE(dshulyak) moved to goroutine because self-broadcast is applied synchronously | ||
tb.eg.Go(func() error { | ||
if err := tb.publisher.Publish(ctx, protocol, serialized); err != nil { | ||
return fmt.Errorf("broadcast: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the application logic point of view, a single failure should not stop the tortoise beacon from working, regardless of how unlikely Publish should fail.
return nil | ||
} | ||
|
||
if err := wc.net.Broadcast(ctx, GossipProtocol, broadcast); err != nil { | ||
if err := wc.publisher.Publish(ctx, GossipProtocol, msg); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually i think it may be a good practice to simply put all Publish on goroutine, since it validate synchronously.
lp2p/peerexchange/crawler.go
Outdated
} | ||
pending++ | ||
|
||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
understood. maybe leave a TODO?
lp2p/host.go
Outdated
if err != nil { | ||
return nil, fmt.Errorf("can't create peer addr from %s: %w", p, err) | ||
} | ||
cm.Protect(addr.ID, "bootstrap") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed during your walk-through, i think we should not special-case bootnode
lp2p/peerexchange/addrbook.go
Outdated
a.logger.Debug("skipping adding non routable address%v", netAddr.String()) | ||
// XXX: this makes tests work with unroutable addresses(loopback) | ||
func (a *addrBook) updateAddress(addr, src *addrInfo) { | ||
if !IsRoutable(addr.IP) && IsRoutable(src.IP) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still don't understand the rationale. does that mean we won't allow src as one of the peers?
say there is a private network, and it has a public node acting as its only gateway to the public network, we will not connect to it?
just want to understand. i understand its old code.
bors try |
tryBuild succeeded: |
bors merge |
## Motivation ## Open issues ### Internal queuing Gossipsub uses non blocking channels for internal queueing. In general this is reasonable approach, but we could use priority queue in place. Approach that i am thinking about requires small change in the codebase to make queue instantion replaceable. I will do some experimentation and then open a discussion in go-libp2p-pubsub repo. ```go type Queue interface { Next(context.Context) (*RPC, error) Send(context.Context, *RPC) error } type NewQueue() func(buffer int) Queue ``` And use this interface in [router](https://github.com/libp2p/go-libp2p-pubsub/blob/628353661b1292a682ddd898da1594f9795e7670/pubsub.go#L659-L662). ### No delayed confirmations Gossipsub launches async validation in a separate goroutine (limited by a parameter), and it expects goroutine to return before processing a message. Some data can be validated only in certain time window. This time window is weakly synchronized between peers. So it is natural that some peers will send data earlier than others and dropping this data is not an option. We want to buffer such data and validate it after time window arrives. Buffering it using goroutines is not optimal. Instead we could use delayed confirmation using callback (similar to what is done in current codebase). We use this approach for tortoise beacon proposals. I don't see blockers in the codebase that will prevent us from doing it, but requires more work. ### Library doesn't wait for goroutines to exit It doesn't mean that it leaks them, but it definitely doesn't wait for exit. More important resources (e.g. sockets) are cleaned up appropriately, ### Global logging We will need to overwrite logger configuration in ipfs/go-log library which will affect all libp2p subsystems. ## Changes <!-- Please describe in detail the changes made --> ## Test Plan <!-- Please specify how these changes were tested (e.g. unit tests, manual testing, etc.) --> ## TODO - [x] save identity into the file currently only private key is saved - [x] tests refactoring some interfaces are very different, try to find common interface that is easy to mock. - [x] activation - [x] tortoise beacon - [x] syncer - [x] miner - [x] fetch - [x] layerfetcher - [x] api - [x] cmd - [x] hare - [x] nodes are connecting too eagerly with other peers in devnet all nodes have 49 connections (with 50 nodes in total) modify peerexchange.Bootstrap to accept required number of peers, terminate early if connections were eastablished during the round. maybe thats not an issue actually, redundant connections are purged using another component (connection manager) based on separate configuration manager. and the gossip size is bounded. - [x] revert changes to p2p tests i removed some of the redundant tests for a faster debug, need to check if they are needed and re-enable them if thats the case. - [x] tests for bootstrap.Peers and bootstrap.Bootstrap - [x] submit sync binary removal as a separate request - [x] submit http server refactoring as a separate request ## DevOps Notes <!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases --> - [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources) - [x] This PR does not affect public APIs - [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.) - [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)
Pull request successfully merged into develop. Build succeeded: |
Motivation
Open issues
Internal queuing
Gossipsub uses non blocking channels for internal queueing. In general this is reasonable approach, but we could use priority queue in place.
Approach that i am thinking about requires small change in the codebase to make queue instantion replaceable. I will do some experimentation and then open a discussion in go-libp2p-pubsub repo.
And use this interface in router.
No delayed confirmations
Gossipsub launches async validation in a separate goroutine (limited by a parameter), and it expects goroutine to return before processing a message.
Some data can be validated only in certain time window. This time window is weakly synchronized between peers. So it is natural that some peers will send data earlier than others and dropping this data is not an option. We want to buffer such data and validate it after time window arrives. Buffering it using goroutines is not optimal.
Instead we could use delayed confirmation using callback (similar to what is done in current codebase).
We use this approach for tortoise beacon proposals.
I don't see blockers in the codebase that will prevent us from doing it, but requires more work.
Library doesn't wait for goroutines to exit
It doesn't mean that it leaks them, but it definitely doesn't wait for exit. More important resources (e.g. sockets) are cleaned up appropriately,
Global logging
We will need to overwrite logger configuration in ipfs/go-log library which will affect all libp2p subsystems.
Changes
Test Plan
TODO
save identity into the file
currently only private key is saved
tests refactoring
some interfaces are very different, try to find common interface that is easy to mock.
nodes are connecting too eagerly with other peers
in devnet all nodes have 49 connections (with 50 nodes in total)
modify peerexchange.Bootstrap to accept required number of peers, terminate early if connections were eastablished during the round.
maybe thats not an issue actually, redundant connections are purged using another component (connection manager) based on separate configuration manager. and the gossip size is bounded.
revert changes to p2p tests
i removed some of the redundant tests for a faster debug, need to check if they are needed and re-enable them if thats the case.
tests for bootstrap.Peers and bootstrap.Bootstrap
submit sync binary removal as a separate request
submit http server refactoring as a separate request
DevOps Notes