Replace the Seeder #3881

Open
zebambam opened this issue Mar 12, 2019 · 39 comments
Labels: I-SECURITY Problems and improvements related to security. replace_the_seeder


@zebambam

The existing Zcash seeder is written in C (a type- and memory-unsafe language), and securing it to an appropriate standard as-is would be more expensive than replacing it with a more modern equivalent.

Additionally, it does not provide cryptographic authenticity or integrity over the data it provides to clients, and so clients rely on the data not being modified in transit.

We should replace the seeder with one that provides end-to-end cryptographic authenticity and integrity over the data it serves, and write it in Rust (a language that provides type and memory safety, and one that the ECC team has experience with).

The new version should provide data over a web framework instead of DNS, and take advantage of ECC’s CloudFlare distribution in order to cache results to make the system less susceptible to denial-of-service attacks.

Distribution of extra data types, such as notary data (ticket number #FILL_IN_THIS_TICKET_NUMBER), time data, or other future data, could be handled by this system quite simply, without requiring protocol additions (as would be the case with a seeder using the DNS protocol).

Without a change to the peer networking protocol between Zcash peers, the seeding system cannot provide coverage against eclipse and Sybil attacks. However, this ticket will need to be completed before such coverage can be provided, and the resulting system will be compatible with any such peer-to-peer network upgrades.

ECC does not intend to remain the sole authoritative supplier of network nodes for seeding purposes, so another requirement of this work is that it be distributable among other seeder nodes that users could decide to trust via a configuration change.

@softminus
Contributor

I'm currently trying to code up a seeder in Rust using Zebra's networking crates: https://github.com/superbaud/zcash-cotyledon

@softminus
Contributor

I have it connecting to nodes, though it exits immediately afterward: softminus/zcash-cotyledon@fa66767

@softminus
Contributor

softminus commented May 17, 2022

Per the Zebra book, the zebra-network crate does crawl the network for additional peers (the number of peers etc. can be configured; we'll need to set that sufficiently high), but we will need to implement an inbound service that replies to incoming messages. We can't directly use the one Zebra uses out of the box, because it also handles requests for blocks/headers/transactions/etc., which doesn't make sense for a seeder since we won't be keeping up with the chain.
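
To make that concrete, here's a minimal sketch of the shape such an inbound service could take, using tower's Service trait. The SeederRequest/SeederResponse enums are placeholders standing in for zebra_network's real Request/Response types, so treat this as the pattern rather than drop-in code:

use std::task::{Context, Poll};
use tower::Service;

#[derive(Debug)]
enum SeederRequest {
    Peers,
    Other,
}

#[derive(Debug)]
enum SeederResponse {
    Peers(Vec<std::net::SocketAddr>),
    Nil,
}

// Clone matters: zebra_network::init() requires its inbound service to be Clone.
#[derive(Clone, Default)]
struct Inbound {
    known_peers: Vec<std::net::SocketAddr>,
}

impl Service<SeederRequest> for Inbound {
    type Response = SeederResponse;
    type Error = Box<dyn std::error::Error + Send + Sync>;
    type Future = std::future::Ready<Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(()))
    }

    fn call(&mut self, req: SeederRequest) -> Self::Future {
        match req {
            // The only request a seeder really needs to answer: advertise known peers.
            SeederRequest::Peers => {
                std::future::ready(Ok(SeederResponse::Peers(self.known_peers.clone())))
            }
            // Blocks, headers, transactions, etc. are out of scope for a seeder.
            _ => std::future::ready(Ok(SeederResponse::Nil)),
        }
    }
}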

@softminus
Contributor

Also, ZF has a DNS seeder written in Go: https://github.com/ZcashFoundation/dnsseeder (putting this down so I don't forget).

@teor2345
Contributor

teor2345 commented May 24, 2022

Hey @superbaud, I'm one of the Zebra engineers; I've done a bit of work on zebra-network.

Here are some things that might help, feel free to ping me on the Zcash discord if you have questions!

i have it connecting to nodes; though it exits immediately afterward. superbaud/zcash-cotyledon@fa66767

It looks like you're blocking on the result of zebra_network::init():
https://github.com/superbaud/zcash-cotyledon/blob/main/src/main.rs#L54

That result is a:

Future<Output = (
    Buffer<BoxService<Request, Response, BoxError>, Request>,
    Arc<std::sync::Mutex<AddressBook>>,
)>

So the zcash-cotyledon call rt.block_on(x) returns:

(
    Buffer<BoxService<Request, Response, BoxError>, Request>,
    Arc<std::sync::Mutex<AddressBook>>,
)

https://github.com/superbaud/zcash-cotyledon/blob/main/src/main.rs#L54

The zebra-network stack keeps crawling the network in the background. But your program exits because there's nothing left to do.

So you'll need some way to access the background tasks:

  1. Keep a reference to the BoxService, which holds tokio::JoinHandles for the network crawl and dial tasks
  2. Check AddressBook::sanitized() for new peers

You can monitor progress via the AddressBook::address_metrics_watcher(), or via zebra-network's logs (it uses the tracing crate).

You can also check for permanent failures via <BoxService as ServiceExt>::ready(). But we hardly ever see them in Zebra, so I wouldn't worry about it that much.
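
The overall control flow might look like the sketch below: keep the process alive after init() returns and periodically read the shared address book. AddressBook here is a stand-in type (the real zebra_network AddressBook and its sanitized() method have their own signatures), so this only shows the structure:

use std::net::SocketAddr;
use std::sync::{Arc, Mutex};
use std::time::Duration;

// Stand-in for zebra_network's AddressBook; only the shape matters here.
#[derive(Default)]
struct AddressBook {
    peers: Vec<SocketAddr>,
}

impl AddressBook {
    // Stand-in for AddressBook::sanitized(): the peers that are safe to advertise.
    fn sanitized(&self) -> Vec<SocketAddr> {
        self.peers.clone()
    }
}

#[tokio::main]
async fn main() {
    // In the real seeder this Arc<Mutex<AddressBook>> comes from
    // zebra_network::init(), which also spawns the background crawl tasks.
    let address_book = Arc::new(Mutex::new(AddressBook::default()));

    // Don't let main() return: the crawl keeps running in the background,
    // and we poll the address book for newly-discovered peers on an interval.
    let mut interval = tokio::time::interval(Duration::from_secs(60));
    loop {
        interval.tick().await;
        let peers = address_book.lock().unwrap().sanitized();
        println!("currently know {} advertisable peers", peers.len());
    }
}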

Per the zebra book, the zebra-network crate does crawl the network for additional peers (and the number of peers etc can be configured, we'll need to set that sufficiently high),

The default 75 outbound connection limit should be ok. You might want to set it higher if some peers are dropped from the active list, because Zebra has reached the connection limit with stable connections. (Zebra filters peers from sanitized if we haven't been connected to them for 3 hours.)

We could also implement some kind of peer churn, or add a method with a different time filter. Let us know if we can help here!

@softminus
Contributor

softminus commented May 25, 2022

Thank you very much; this gives a good overall view of the features of the codebase that i'm gonna want to hook into!

I do have an issue I'm a bit stuck on -- I pushed some new code (basically a version of Inbound that only answers zn::Request::Peers) and I'm getting a compilation error that looks like:

error[E0277]: the trait bound `Inbound: Clone` is not satisfied
   --> src/main.rs:233:49
    |
233 |     let (peer_set, address_book) = init(config, inbound, NoChainTip).await;
    |                                    ----         ^^^^^^^ the trait `Clone` is not implemented for `Inbound`
    |                                    |
    |                                    required by a bound introduced by this call
    |
note: required by a bound in `zebra_network::init`
   --> /Users/sasha_work/.cargo/git/checkouts/zebra-d0d93d7ded255019/2f636bf/zebra-network/src/peer_set/initialize.rs:85:66
    |
85  |     S: Service<Request, Response = Response, Error = BoxError> + Clone + Send + 'static,
    |                                                                  ^^^^^ required by this bound in `zebra_network::init`

and it's a bit confusing because I can see that in Zebra, the result of let inbound = ServiceBuilder::new().load_shed().buffer(inbound::downloads::MAX_INBOUND_CONCURRENCY).service(Inbound::new(setup_rx)); does derive Clone (I tested that by adding a .clone() call and didn't see any compilation errors -- there's probably a better way to test!), but when I do it in my code, it doesn't seem to work.

Obviously I'm going to keep trying to figure this out but if you see something obvious I'd be grateful if you could let me know!

@teor2345
Contributor

teor2345 commented May 25, 2022

tower::Buffer implements Clone, even if the underlying service does not.

So if you wrap your inbound service in a Buffer, it should become Clone. I'd also recommend adding a load_shed(), to avoid weird errors under load. This is particularly important during startup.

let inbound = ServiceBuilder::new().load_shed().buffer(MAX_INBOUND_CONCURRENCY).service(Inbound::new(setup_rx));

As an alternative, you could replace the oneshot<T> channel with a watch<Option<T>> channel, which implements Clone. Then you can derive Clone on Inbound.
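
A tiny sketch of that watch-channel alternative, with Setup as a placeholder for whatever state the oneshot used to deliver:

use tokio::sync::watch;

// Placeholder for whatever state the oneshot channel used to deliver.
#[derive(Clone, Debug, Default)]
struct Setup;

// watch::Receiver is Clone, so Inbound can simply derive Clone.
#[derive(Clone)]
struct Inbound {
    setup: watch::Receiver<Option<Setup>>,
}

fn make_inbound() -> (watch::Sender<Option<Setup>>, Inbound) {
    // Start with None; the setup task sends Some(Setup) once startup finishes.
    let (setup_tx, setup_rx) = watch::channel(None);
    (setup_tx, Inbound { setup: setup_rx })
}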

@softminus
Contributor

Ohhhhh, I had removed the Buffer part of things, thinking it wouldn't break anything and that it was unnecessary (for compilation), like load_shed() was. Thanks for pointing that out; now I get a different set of compile errors (which I think I can figure out on my own).

@softminus
Contributor

Got it compiling and replying to zn::Request::Peers requests, thank you very much!

@teor2345
Contributor

We just wrote up a Zebra startup doc, you might find the peer handling section useful.

@softminus
Contributor

So it doesn't quite work yet for bootstrapping as a "fixed seed node" (tested by removing peers.dat, invoking zcashd with -dnsseed=0 -port=12345, and adding

static SeedSpec6 pnSeed6_main[] = {
    {{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xff,0xff,0x7f,0x00,0x00,0x01}, 8233}
};

to src/chainparamsseeds.h).

I do see the zn::Request::Peers requests but it seems to hang, presumably because zcashd expects the fixed seed nodes to be able to answer other requests. Out of curiosity I'm going to try adding some code to print out which IP/port the requests are coming from (because it gets lots of transaction/block/etc requests from other peers) so I can see what zcashd expects the fixed seed nodes to be able to do.

If we're going with the plan to use HTTP/gRPC-style seeding we shouldn't actually need to figure this out, but I am curious why zcashd isn't able to use the peer IPs it gets from the fixed seed node (or if there's another problem with the way I'm using zebra-network).

@teor2345
Contributor

teor2345 commented Jun 1, 2022

If we're going with the plan to use HTTP/gRPC-style seeding we shouldn't actually need to figure this out but I am curious why zcashd isn't able to use the peer IPs it gets from the fixed seed node (or if there's another problem with the way i'm using zebra-network).

It's possible there is a bug in Zebra's Peers responses; I'm not sure if we've ever tested this particular scenario.

A bug like that could also explain this known Zebra issue:

@softminus
Contributor

Oh interesting, thanks for pointing me at that. I'm going to try and look deeper at what zcashd does on startup when I'm pointing it at a zebra-network-based node that only answers Peers requests.

@teor2345
Contributor

teor2345 commented Jun 1, 2022

Please let us know if you find anything, it could be a really simple fix on our end.

@str4d
Contributor

str4d commented Jun 3, 2022

I had just started hacking on a Rust DNS seeder myself this evening for some downtime relaxation, and then while also trying to figure out zebra_network details I discovered the traffic on this issue 😄 I'll leave this to the people who actually have time to hack on it properly.

I do see the zn::Request::Peers requests but it seems to hang, presumably because zcashd expects the fixed seed nodes to be able to answer other requests. Out of curiosity I'm going to try adding some code to print out which IP/port the requests are coming from (because it gets lots of transaction/block/etc requests from other peers) so I can see what zcashd expects the fixed seed nodes to be able to do.

@superbaud I think you are misunderstanding how the current seeders operate: they are not in-band seed nodes. That is, zcashd does not anticipate the existence of network peers that only respond to getaddr messages, because that is not how the seeders interact with the network. If such a peer type were desired, it would need a corresponding service bit (to indicate to peers that it does not serve as a regular peer).

The current DNS seeders work by crawling in-band for peers, and then seeding those peers out-of-band, specifically via DNS. The OP suggests seeding via other more secure avenues, but the actual medium doesn't really matter too much for the purpose of structuring the seeder.

I'll specifically describe how zcash-seeder works; it's the seeder this issue was opened about, and its functionality needs to be replicated in order to replace it:

  • The seeder maintains a database of nodes it has learned about. The nodes are structured like so:
                seen nodes
               /          \
    (a) banned nodes       available nodes--------------
                          /       |                     \
                  tracked nodes   (b) unknown nodes   (e) active nodes
                 /           \
        (d) good nodes   (c) non-good nodes
    
    • (b) are new node addresses (IPs, domains etc.) that the seeder has learned about via crawling but has not yet tried to reach.
    • (e) is the set of nodes being actively tested, pulled from (b, c, d).
    • (d) (and the subset of (e) that was pulled from (d)) are the nodes that the seeder will advertise.
  • The seeder bootstraps itself by filling (b) with known fixed public peers (e.g. mainnet.z.cash).
    • This was mainly relevant for starting the Zcash network from scratch; nowadays we could also feed in other data sources, like advertised IPs from other Zcash DNS seeders.
  • A bunch of crawler threads are started, that each run a "get peer, test peer, record result" loop. The DNS listener is also spun up.
    • This can be done more nicely with async logic here.
  • Each crawler thread selects a node to query, either "randomly" from (b), or the least-recently-tested node from (c, d).
    • This moves the node to (e).
    • There's some additional decision logic that you can find here.
  • Each selected node is tested by running the handshake, and then conditionally fetching addresses:
    crawler      node
    --------------------
    version --->
            <--- version
            <--- verack
     verack --->
    If it has been less than a day since we last successfully queried
    this peer, close the connection after handshake is complete.
    
    Set a timeout of 120s for Tor, and 30s otherwise.
    getaddr --->
            <--- addr
            <--- addr (if more than 1000 addresses to return)
            <--- addr (if more than 2000 addresses to return)...
             ..
    Connection is closed at timeout (to capture all addr messages).
    
  • If the test fails (e.g. errors during the handshake like oversized messages or invalid headers), or if there have been too many successive failures, the node is added to (a) (banned, won't ever be retried).
    • This is important to avoid polluting the crawler with peers from e.g. code forks or chain forks of Zcash.
  • If the test is successful, then the node is added to (c, d) (the set of tracked peers). The node is specifically added to (d) (treated as a "good" node) if it additionally satisfies a list of conditions:
    • The peer's port must be 8233 on mainnet or 18233 on testnet.
      • DNS only gives out IPs, so the port needs to be predictable.
    • The NODE_NETWORK service bit must be set.
    • The peer's IP must be routable.
    • The peer's client version must be at least REQUIRE_VERSION (currently 170015 in the repo, 170100 on my own seeder, I'll be making a PR shortly).
    • The peer's block height must be at least 0 (we could bump this to require peers to be closer to the tip, but don't currently do so).
    • The peer must be reliably reachable (this is implemented by tracking the success or failure of each connection attempt, with an exponential decay of each attempt's weight).
  • When a DNS request is received, the DNS listener returns a subset of IPs from the "good" set (d) + good (e).

zcashd and zebrad nodes then make DNS queries to the known DNS seeder domain names when they start up (and possibly also when they run out of nodes they can connect to, I forget). It is this second part that is what bambam was proposing be replaced by HTTP or gRPC.

I will note that an "advantage" of DNS seeding is that you get "free" caching throughout the DNS ecosystem (IDK if this is something upstream thought about when deciding to use it, but it's worth noting that a Cloudflare-based system is not providing caching as a new feature, only potential DoS resistance).

In any case, if the new seeder is designed in a modular way, then any number of seeding components can be offered for serving node IPs, that are all fed by the data obtained from the crawler component.
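
For illustration, here's a rough sketch of the "good node" test and the reliability tracking described above. The constants and the decay factor are illustrative, not the exact values zcash-seeder uses:

const MAINNET_PORT: u16 = 8233;
const TESTNET_PORT: u16 = 18233;
const REQUIRE_VERSION: u32 = 170_100;
const NODE_NETWORK: u64 = 1; // service bit 0

struct PeerStats {
    port: u16,
    services: u64,
    protocol_version: u32,
    block_height: u32,
    ip_routable: bool,
    reliability: f64, // exponentially-decayed success rate in [0.0, 1.0]
}

impl PeerStats {
    // Record one connection attempt, decaying the weight of older attempts.
    fn record_attempt(&mut self, success: bool, decay: f64) {
        let outcome = if success { 1.0 } else { 0.0 };
        self.reliability = decay * self.reliability + (1.0 - decay) * outcome;
    }

    fn is_good(&self, testnet: bool, min_height: u32) -> bool {
        let default_port = if testnet { TESTNET_PORT } else { MAINNET_PORT };
        self.port == default_port                      // DNS can't advertise ports
            && (self.services & NODE_NETWORK) != 0     // must serve the full chain
            && self.ip_routable                        // no unroutable addresses
            && self.protocol_version >= REQUIRE_VERSION
            && self.block_height >= min_height         // currently min_height = 0
            && self.reliability > 0.8                  // "reliably reachable"
    }
}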

@softminus
Contributor

Thanks for the detailed writeup about the extant seeder; it does clarify what the code is doing!

That is, zcashd does not anticipate the existence of network peers that only respond to getaddr messages, because that is not how the seeders interact with the network. If such a peer type were desired, it would need a corresponding service bit (to indicate to peers that it does not serve as a regular peer).

That makes sense. I understand that it's not what the https://github.com/zcash/zcash-seeder code does, but I had seen the SeedSpec6 pnSeed6_main[] stuff in https://github.com/zcash/zcash/blob/master/src/chainparamsseeds.h#L10 that gets referred to in https://github.com/zcash/zcash/blob/master/src/net.cpp#L1505-L1512 and was wondering (not to replicate something existing, but to see if I could get it working / understand why it doesn't work) if I could get that sort of in-band seeding working with the zebra-network crate (which didn't work out).

The crawling you described in the extant zcash-seeder code seems similar enough to the crawler in the zebra-network crate, though there's some extra filtering (based on port number at least) that doesn't seem to be done in zebra-network -- for instance, I noticed a lot of connections to hosts on port 16125 (which is the port that ZelCash/flux uses; @daira pointed out in the node-dev discord channel that they didn't change the network magic):

tcp        0      0 10.138.0.5:40486        176.146.199.77:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:39492        89.58.38.42:16125       ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:54940        89.58.40.217:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:37404        45.142.177.186:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:41984        89.58.42.122:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:54828        65.108.218.66:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:39344        65.108.122.94:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:34742        173.212.207.239:16125   ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:46120        89.58.31.184:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:36688        154.12.243.16:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:44626        5.161.89.222:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:55110        159.69.15.231:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:39064        194.13.80.131:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:49842        185.211.6.189:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:45720        89.58.43.107:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:40334        202.61.245.109:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:46834        167.86.97.119:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:60868        37.120.179.240:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:49424        65.21.141.217:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:45526        46.38.232.149:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:47126        5.189.175.166:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:42890        204.10.194.100:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:38664        164.132.148.129:16125   ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:49606        89.58.42.126:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:34390        185.193.17.206:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:56698        45.157.176.55:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:38512        5.45.110.134:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:36270        89.58.37.158:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:40536        65.21.177.90:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:41530        89.58.41.210:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:55124        89.58.6.56:16125        ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:33098        37.120.190.115:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:33642        65.108.103.213:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:36656        192.99.37.70:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:35648        91.244.197.42:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:36916        193.26.159.101:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:52390        38.242.197.204:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:53898        5.161.83.99:16125       ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:50638        5.161.88.253:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:35604        89.58.15.4:16125        ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:50856        149.202.65.132:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:54346        173.249.35.197:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:50970        46.38.232.30:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:47514        37.120.169.236:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:53900        154.53.55.108:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:45420        65.108.192.30:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:43040        37.120.179.211:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:38990        209.195.10.228:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:33560        5.161.68.109:16125      ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:43780        95.216.10.239:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:58086        185.78.165.158:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:51154        185.78.165.110:16125    ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:50328        202.61.251.18:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:58784        46.38.255.184:16125     ESTABLISHED 3874542/target/debu
tcp        0      0 10.138.0.5:45386        89.58.41.235:16125      ESTABLISHED 3874542/target/debu

@str4d
Contributor

str4d commented Jun 4, 2022

I had seen the SeedSpec6 pnSeed6_main[] stuff in https://github.com/zcash/zcash/blob/master/src/chainparamsseeds.h#L10 that gets referred to in https://github.com/zcash/zcash/blob/master/src/net.cpp#L1505-L1512

Ah, that makes sense. And it doesn't work as an address-only seeder because that field exists to hard-code some full nodes in the network that can bootstrap new nodes - they "seed" both addresses and blocks. Any fixed nodes added there need to be nodes that the DNS seeders would have considered "good". It's also equivalent to telling people to connect to the network by adding addnode=mainnet.z.cash to their config, so we tended to do the latter instead.

The crawling you described in the extant zcash-seeder code seems similar enough to the crawler in the zebra-network crate, though there's some extra filtering (based on port number at least) that doesn't seem to be done in zebra-network

The reason for that port filtering is because the seeders can't advertise ports over DNS, so clients need to be able to predict it. It would make sense for a more capable seeder to try and identify good peers that are on any port, and then sub-select only default-port peers for DNS seeding (if configured).

The main issue I see with reusing the zebra-network crawler is that for a seeder, we want to crawl the network but not remain actively connected to it, which necessitates a different peer set strategy than zebra-network currently implements. This is for two reasons: we don't want to consume connection resources that other nodes could be using, and we want to advertise peers that others can reach and connect to, which means the seeder needs to repeatedly re-measure "can I connect to this peer".

@softminus
Contributor

The reason for that port filtering is because the seeders can't advertise ports over DNS, so clients need to be able to predict it. It would make sense for a more capable seeder to try and identify good peers that are on any port, and then sub-select only default-port peers for DNS seeding (if configured).

I'm wondering how easy it'd be (without needing to make the seeder a full syncing node by itself) to verify that the good peers are actually synced to the zcash blockchain rather than a different blockchain that's using the same network magic and version strings that look vaguely similar (like MagicBean:6.0.0). Perhaps by manually (or by querying known-good nodes that ECC/ZF operates) figuring out what a decently recent (more recent than any significant chain forks) block is and periodically hardcoding that into the seeder config and querying potential peers for it?
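
A sketch of that checkpoint idea, where peer_serves_block is hypothetical glue (a handshake plus a block-by-hash request with a short timeout) and the checkpoint would be refreshed periodically from a trusted node:

use std::net::SocketAddr;

// A recent, post-fork block the seeder expects real Zcash peers to serve.
struct Checkpoint {
    height: u32,
    hash: [u8; 32],
}

// Placeholder: connect, request the block by hash, verify what comes back.
async fn peer_serves_block(_addr: SocketAddr, _checkpoint: &Checkpoint) -> Result<bool, String> {
    Ok(true)
}

// A peer only counts as "on the Zcash chain" if it can return the checkpoint block.
async fn passes_chain_check(addr: SocketAddr, checkpoint: &Checkpoint) -> bool {
    peer_serves_block(addr, checkpoint).await.unwrap_or(false)
}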

The main issue I see with reusing the zebra-network crawler is that for a seeder, we want to crawl the network but not remain actively connected to it, which necessitates a different peer set strategy than zebra-network currently implements. This is for two reasons: we don't want to consume connection resources that other nodes could be using, and we want to advertise peers that others can reach and connect to which means the seeder needs to be repeatedly re-measuring "can I connect to this peer".

Makes sense; I think this week I'll try and figure out how to implement a more appropriate peer set strategy in rust.

@teor2345
Contributor

teor2345 commented Jun 6, 2022

The main issue I see with reusing the zebra-network crawler is that for a seeder, we want to crawl the network but not remain actively connected to it, which necessitates a different peer set strategy than zebra-network currently implements. This is for two reasons: we don't want to consume connection resources that other nodes could be using, and we want to advertise peers that others can reach and connect to which means the seeder needs to be repeatedly re-measuring "can I connect to this peer".

Makes sense; I think this week I'll try and figure out how to implement a more appropriate peer set strategy in rust.

We could also modify zebra-network so it can be configured to disconnect from peers more rapidly.

For example, we could pass a disconnection function on peer set initialisation. Then before each request, we call the disconnection function with the peer address, version, and connection creation time.

Then for a seeder, you tell zebra-network to disconnect from peers after 30 seconds. (Or whatever time you want.)
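
A hypothetical shape for that hook (nothing like this exists in zebra-network today): the peer set would call the policy before routing each request, and drop any peer it returns true for.

use std::net::SocketAddr;
use std::time::{Duration, Instant};

// Arguments: peer address, negotiated protocol version, connection creation time.
type ShouldDisconnect = Box<dyn Fn(SocketAddr, u32, Instant) -> bool + Send + Sync>;

// A seeder could pass something like disconnect_after(Duration::from_secs(30))
// at peer set initialisation.
fn disconnect_after(max_age: Duration) -> ShouldDisconnect {
    Box::new(move |_addr, _version, connected_at| connected_at.elapsed() > max_age)
}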

@teor2345
Contributor

teor2345 commented Jun 6, 2022

@superbaud are these zebra-network changes something that would be helpful?

(They'd be really quick for us to do.)

@softminus
Contributor

For example, we could pass a disconnection function on peer set initialisation. Then before each request, we call the disconnection function with the peer address, version, and connection creation time.

By "before each request" you mean "triggered when there's a message from the peer" or "before each message initiated internally towards the peer"? If it's the latter, how would we make sure that the disconnection function would get called frequently enough?

@teor2345
Contributor

teor2345 commented Jun 6, 2022

For example, we could pass a disconnection function on peer set initialisation. Then before each request, we call the disconnection function with the peer address, version, and connection creation time.

By "before each request" you mean "triggered when there's a message from the peer" or "before each message initiated internally towards the peer"? If it's the latter, how would we make sure that the disconnection function would get called frequently enough?

The peer set would check all peers, before it routes any request to any peer. (We do our existing peer readiness and version checks this way, and they work well.)

Since zebra-network crawls the peer set for new addresses every minute, every peer would get checked at least that often. (You can adjust this interval using Config.crawl_new_peer_interval.)

We could also add a specific Nil request that lets you check the peer set whenever you want.

@teor2345
Contributor

teor2345 commented Jun 8, 2022

@superbaud are these zebra-network changes something that would be helpful?

(They'd be really quick for us to do.)

I tried to do this, but it seems like I need to rewrite it as a trait, because Rust's function generics are a bit tricky.

I'm going to be busy for a few days, but I should be able to get back to it next week.

@softminus
Contributor

@superbaud are these zebra-network changes something that would be helpful?

I'm not sure, to be honest, I think for the seeder itself it'd be easier to use zebra_network::connect_isolated_tcp_direct and manually connect/query/disconnect from peers instead of trying to get the zebra peer set algorithm to behave drastically differently.
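
A sketch of that connect/query/disconnect loop; probe_peer is a hypothetical helper that would wrap zebra_network::connect_isolated_tcp_direct plus a single Peers request (its signature here is not real zebra-network API):

use std::collections::VecDeque;
use std::net::SocketAddr;
use std::time::Duration;

// Placeholder: handshake, send a getaddr/Peers request, collect addr responses
// until a short timeout, then drop the connection.
async fn probe_peer(addr: SocketAddr) -> Result<Vec<SocketAddr>, String> {
    let _ = addr;
    Ok(Vec::new())
}

async fn crawl(mut queue: VecDeque<SocketAddr>) {
    while let Some(addr) = queue.pop_front() {
        match tokio::time::timeout(Duration::from_secs(30), probe_peer(addr)).await {
            Ok(Ok(new_addrs)) => {
                // Record `addr` as reachable and enqueue the peers it told us about
                // (a real crawler would deduplicate and rate-limit here).
                queue.extend(new_addrs);
            }
            Ok(Err(_)) | Err(_) => {
                // Handshake failure or timeout: record the failure for this peer.
            }
        }
    }
}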

@teor2345
Contributor

@superbaud are these zebra-network changes something that would be helpful?

I'm not sure, to be honest, I think for the seeder itself it'd be easier to use zebra_network::connect_isolated_tcp_direct and manually connect/query/disconnect from peers instead of trying to get the zebra peer set algorithm to behave drastically differently.

No worries, happy to do whatever is easiest for you.

Please let us know if you run into any issues!

@softminus
Contributor

There's some peer metadata which we need access to (from the connect_isolated-style functions) in order to reproduce the seeder filtering algorithms, and I'm unsure how to make that happen:

  • The value of the PeerServices bitset
  • The numeric protocol version number (the stuff that looks like 170100)
  • The textual user-agent string (the stuff that looks like /MagicBean:5.0.0/)

I see there's a negotiate_version in handshake.rs that seems to do the version handshaking and it returns a tuple (remote_version, remote_services, remote_canonical_addr) (which is almost all we want; it's missing the textual user-agent string).

The place where negotiate_version is called is impl<S, PeerTransport, C> Service<HandshakeRequest<PeerTransport>> for Handshake<S, C>; but I'm not sure how the latter gets called exactly, and it doesn't return/pass through the remote version/services info.

@teor2345
Contributor

teor2345 commented Jun 22, 2022

There's some peer metadata which we need access to (from the connect_isolated-style functions) in order to reproduce the seeder filtering algorithms

I'll see if I can change connect_isolated* to return that data, it should be a pretty quick change.

@softminus
Contributor

Also, would it be reasonable to write up a patch to augment MetaAddr to keep track of the numeric protocol version number and the textual user-agent string?

@teor2345
Contributor

Also, would it be reasonable to write up a patch to augment MetaAddr to keep track of the numeric protocol version number and the textual user-agent string?

In Zebra we try to track serialized data, per-connection data, and inbound/outbound connection data separately.

The current design is roughly:

  • AddrV1 and AddrV2 - serialized outbound connection address in addr and addr2 messages
  • MetaAddr - outbound peer connection addresses and last connection status, serialized from AddrV1 and AddrV2
  • LoadTrackedClient - contains current protocol version for an active inbound or outbound connection

So I think we might want to do something like:

  • split the Version out of LoadTrackedClient into a new PeerConnectionInfo struct
  • add whatever other fields we want
  • add PeerConnectionInfo to LoadTrackedClient and MetaAddr (as last_connection_info)
  • return LoadTrackedClient from connect_isolated* (and rename that struct to ConnectedClient)

Would that work for how you want to use MetaAddr?
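
Spelled out, that might look roughly like the sketch below; names and field types are illustrative, not existing zebra-network definitions:

// PeerConnectionInfo carries the handshake metadata the seeder needs.
struct PeerConnectionInfo {
    negotiated_version: u32, // numeric protocol version, e.g. 170_100
    services: u64,           // the PeerServices bitset
    user_agent: String,      // e.g. "/MagicBean:5.0.0/"
}

// LoadTrackedClient (renamed ConnectedClient) would carry one for its live connection...
struct ConnectedClient {
    connection_info: PeerConnectionInfo,
    // ...plus the existing load-tracking fields.
}

// ...and MetaAddr would remember the last one seen for each address.
struct SeenPeer {
    last_connection_info: Option<PeerConnectionInfo>,
    // ...plus the existing MetaAddr address and last-seen fields.
}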

@softminus
Contributor

I think so (I hadn't seen LoadTrackedClient yet, thanks for pointing me at it)! If connect_isolated* can give us the peer services bitmask, the numeric protocol version number, and the textual user-agent string one way or another, we should be able to get this to work.

The reason I suggested MetaAddr is that I found myself considering reimplementing some code that'd look suspiciously like MetaAddr and AddressBook to keep track of discovered peers -- which would be silly to do if we can add the relevant fields to MetaAddr and thus take advantage of the work that's been done with the AddressBook structure and functions.

@teor2345
Contributor

teor2345 commented Jul 4, 2022

Just letting you know that I'm starting on this today.

Sorry it has taken longer than I thought, we've been tuning up Zebra to deal with the extra transaction load on the network.

@softminus
Contributor

Just letting you know that I'm starting on this today.

Sorry it has taken longer than I thought, we've been tuning up Zebra to deal with the extra transaction load on the network.

No worries and thanks; let me know if I can help out or give feedback or anything.

@teor2345
Contributor

teor2345 commented Aug 2, 2022

I just opened a draft PR with these changes today:
ZcashFoundation/zebra#4870

Here's what's in the PR:

  • return peer metadata from the connect_isolated functions

Here's what we still have to do:

  • add extra peer metadata to MetaAddr
  • add unit tests

Once you're happy with the metadata, it should be pretty easy to add it to MetaAddr as part of our handshake address book updates.

Sorry this has taken a while, network load and Zcon have been taking up most of my time!

@teor2345
Contributor

teor2345 commented Aug 2, 2022

Oh and if you need any extra AddressBook methods, just let me know.

@softminus
Contributor

I'm going to look at that PR today and give feedback, thank you so much for working on it!

@softminus
Contributor

Added my feedback comment on the PR itself: ZcashFoundation/zebra#4870 (comment)

@teor2345
Contributor

teor2345 commented Feb 6, 2023

@softminus just wondering how the DNS seeder is going?

The Zebra team might have some time to work on the DNS seeder itself or zebra-network API changes over the next few months. But we're still at the proposal/scope stage right now.

Would you like some help finishing it off? How much work is left, do you think?

I'm happy to chat over Discord or my zfnd.org email if that's easier, I'm teor on both.

@softminus
Contributor

@softminus just wondering how the DNS seeder is going?

The Zebra team might have some time to work on the DNS seeder itself or zebra-network API changes over the next few months. But we're still at the proposal/scope stage right now.

Would you like some help finishing it off? How much work is left, do you think?

I'm happy to chat over Discord or my zfnd.org email if that's easier, I'm teor on both.

Hey, sorry I didn't get around to replying to this until now; https://github.com/softminus/zcash-cotyledon is the repo I'm working on. It does sort of work, but I'm in the middle of a refactor to make the probe/criteria parameters adjustable in a config file rather than hardcoded.

I would definitely be up to working together to finish it off or even just chatting about it, let me know when would work for you.

@teor2345
Contributor

it does sort of work but I am in the middle of a refactor to try and make it so the probe/criteria parameters can be adjusted in a config file vs hardcoded.

Sounds great!

I would definitely be up to working together to finish it off or even just chatting about it, let me know when would work for you.

We're still in the planning stages ourselves, we might end up with slightly different ways of scanning the network and putting it all into an application.

Because we know zebra-network and zebrad really well, it might be easier for us to tweak their existing scanning and application behaviour to get what we want. (Or it might not!) So we might be interested in a separate crate that takes peer addresses + info, and selects the addresses to put into DNS.

We haven't decided on the DNS server we're going to use yet. Here's one possible DNS server; do you know of any others?
