Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Libp2p integration #2801

Closed
wants to merge 102 commits into from

Conversation

dshulyak
Copy link
Contributor

@dshulyak dshulyak commented Sep 20, 2021

Motivation

Open issues

Internal queuing

Gossipsub uses non blocking channels for internal queueing. In general this is reasonable approach, but we could use priority queue in place.

Approach that i am thinking about requires small change in the codebase to make queue instantion replaceable. I will do some experimentation and then open a discussion in go-libp2p-pubsub repo.

type Queue interface {
    Next(context.Context) (*RPC, error)
    Send(context.Context, *RPC) error
}

type NewQueue() func(buffer int) Queue

And use this interface in router.

No delayed confirmations

Gossipsub launches async validation in a separate goroutine (limited by a parameter), and it expects goroutine to return before processing a message.

Some data can be validated only in certain time window. This time window is weakly synchronized between peers. So it is natural that some peers will send data earlier than others and dropping this data is not an option. We want to buffer such data and validate it after time window arrives. Buffering it using goroutines is not optimal.

Instead we could use delayed confirmation using callback (similar to what is done in current codebase).

We use this approach for tortoise beacon proposals.

I don't see blockers in the codebase that will prevent us from doing it, but requires more work.

Library doesn't wait for goroutines to exit

It doesn't mean that it leaks them, but it definitely doesn't wait for exit. More important resources (e.g. sockets) are cleaned up appropriately,

Global logging

We will need to overwrite logger configuration in ipfs/go-log library which will affect all libp2p subsystems.

Changes

Test Plan

TODO

  • save identity into the file
    currently only private key is saved

  • tests refactoring
    some interfaces are very different, try to find common interface that is easy to mock.

    • activation
    • tortoise beacon
    • syncer
    • miner
    • fetch
    • layerfetcher
    • api
    • cmd
    • hare
  • nodes are connecting too eagerly with other peers
    in devnet all nodes have 49 connections (with 50 nodes in total)
    modify peerexchange.Bootstrap to accept required number of peers, terminate early if connections were eastablished during the round.

    maybe thats not an issue actually, redundant connections are purged using another component (connection manager) based on separate configuration manager. and the gossip size is bounded.

  • revert changes to p2p tests
    i removed some of the redundant tests for a faster debug, need to check if they are needed and re-enable them if thats the case.

  • tests for bootstrap.Peers and bootstrap.Bootstrap

  • submit sync binary removal as a separate request

  • submit http server refactoring as a separate request

DevOps Notes

  • This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
  • This PR does not affect public APIs
  • This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
  • This PR does not make changes to log messages (which monitoring infrastructure may rely on)

@bors
Copy link

bors bot commented Oct 22, 2021

try

Build succeeded:

@dshulyak
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Oct 22, 2021
@bors
Copy link

bors bot commented Oct 22, 2021

try

Build failed:

@dshulyak
Copy link
Contributor Author

in the last test kibana failed to setup on time, but otherwise tests look stable

if !ok {
panic("event must be of type EventHandshakeComplete")
}
if _, exist := peers[hs.PID]; exist {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this mean? is it expected to be notified multiple times of the same event? or that the node maybe complete handshake with the same peer multiple times?

Copy link
Contributor Author

@dshulyak dshulyak Oct 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, connection with the same peer is not prohibited, the likelihood of it very low, but there is no explicit mechanism to enforce single connection if two peers will try to establish connection simultaneously, or if peer will create multiple connections by intention

lp2p/bootstrap/bootstrap.go Outdated Show resolved Hide resolved
lp2p/bootstrap/bootstrap.go Outdated Show resolved Hide resolved
lp2p/bootstrap/peers.go Outdated Show resolved Hide resolved
lp2p/bootstrap/peers.go Outdated Show resolved Hide resolved
lp2p/peerexchange/crawler.go Outdated Show resolved Hide resolved
lp2p/peerexchange/discovery.go Outdated Show resolved Hide resolved
lp2p/peerexchange/discovery.go Outdated Show resolved Hide resolved
lp2p/peerexchange/protocol.go Outdated Show resolved Hide resolved
}
}

w, err := os.Create(path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if we should create a tmp file and swap after the write is successful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would be correct, i will open an issue after merge

Copy link
Contributor

@countvonzero countvonzero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explanations.

// NOTE(dshulyak) moved to goroutine because self-broadcast is applied synchronously
tb.eg.Go(func() error {
if err := tb.publisher.Publish(ctx, protocol, serialized); err != nil {
return fmt.Errorf("broadcast: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the application logic point of view, a single failure should not stop the tortoise beacon from working, regardless of how unlikely Publish should fail.

return nil
}

if err := wc.net.Broadcast(ctx, GossipProtocol, broadcast); err != nil {
if err := wc.publisher.Publish(ctx, GossipProtocol, msg); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually i think it may be a good practice to simply put all Publish on goroutine, since it validate synchronously.

lp2p/bootstrap/bootstrap.go Outdated Show resolved Hide resolved
}
pending++

go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood. maybe leave a TODO?

lp2p/host.go Outdated
if err != nil {
return nil, fmt.Errorf("can't create peer addr from %s: %w", p, err)
}
cm.Protect(addr.ID, "bootstrap")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed during your walk-through, i think we should not special-case bootnode

a.logger.Debug("skipping adding non routable address%v", netAddr.String())
// XXX: this makes tests work with unroutable addresses(loopback)
func (a *addrBook) updateAddress(addr, src *addrInfo) {
if !IsRoutable(addr.IP) && IsRoutable(src.IP) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still don't understand the rationale. does that mean we won't allow src as one of the peers?
say there is a private network, and it has a public node acting as its only gateway to the public network, we will not connect to it?

just want to understand. i understand its old code.

@dshulyak
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Oct 25, 2021
@bors
Copy link

bors bot commented Oct 25, 2021

try

Build succeeded:

@dshulyak
Copy link
Contributor Author

bors merge

bors bot pushed a commit that referenced this pull request Oct 25, 2021
## Motivation

## Open issues

### Internal queuing 

Gossipsub uses non blocking channels for internal queueing. In general this is reasonable approach, but we could use priority queue in place.

Approach that i am thinking about requires small change in the codebase to make queue instantion replaceable. I will do some experimentation and then open a discussion in go-libp2p-pubsub repo.

```go
type Queue interface {
    Next(context.Context) (*RPC, error)
    Send(context.Context, *RPC) error
}

type NewQueue() func(buffer int) Queue
```

And use this interface in [router](https://github.com/libp2p/go-libp2p-pubsub/blob/628353661b1292a682ddd898da1594f9795e7670/pubsub.go#L659-L662).

### No delayed confirmations

Gossipsub launches async validation in a separate goroutine (limited by a parameter), and it expects goroutine to return before processing a message.

Some data can be validated only in certain time window. This time window is weakly synchronized between peers. So it is natural that some peers will send data earlier than others and dropping this data is not an option. We want to buffer such data and validate it after time window arrives. Buffering it using goroutines is not optimal.

Instead we could use delayed confirmation using callback (similar to what is done in current codebase).

We use this approach for tortoise beacon proposals. 

I don't see blockers in the codebase that will prevent us from doing it, but requires more work.

### Library doesn't wait for goroutines to exit

It doesn't mean that it leaks them, but it definitely doesn't wait for exit. More important resources (e.g. sockets) are cleaned up appropriately,

### Global logging

We will need to overwrite logger configuration in ipfs/go-log library which will affect all libp2p subsystems.


## Changes
<!-- Please describe in detail the changes made -->

## Test Plan
<!-- Please specify how these changes were tested 
(e.g. unit tests, manual testing, etc.) -->

## TODO

- [x] save identity into the file
  currently only private key is saved
- [x] tests refactoring
  some interfaces are very different, try to find common interface that is easy to mock.
  - [x] activation
  - [x] tortoise beacon
  - [x] syncer
  - [x] miner
  - [x] fetch
  - [x] layerfetcher
  - [x] api
  - [x] cmd
  - [x] hare
- [x] nodes are connecting too eagerly with other peers
  in devnet all nodes have 49 connections (with 50 nodes in total)
  modify peerexchange.Bootstrap to accept required number of peers, terminate early if connections were eastablished during the round.

  maybe thats not an issue actually, redundant connections are purged using another component (connection manager) based on separate configuration manager. and the gossip size is bounded.
- [x] revert changes to p2p tests
  i removed some of the redundant tests for a faster debug, need to check if they are needed and re-enable them if thats the case.
- [x] tests for bootstrap.Peers and bootstrap.Bootstrap
- [x] submit sync binary removal as a separate request
- [x] submit http server refactoring as a separate request


## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)
@bors
Copy link

bors bot commented Oct 25, 2021

Pull request successfully merged into develop.

Build succeeded:

@bors bors bot changed the title Libp2p integration [Merged by Bors] - Libp2p integration Oct 25, 2021
@bors bors bot closed this Oct 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants