bug: autosharding resolves content topics to wrong shard #2538
You are correct that resolving to […]. The configuration you provide lacks […]. I suggest erroring if […].
Indeed a problem. Wondering though if it makes sense to configure just 1 shard that is […]. In summary: […]
I might be wrong, but also in the context of static and auto sharding I was thinking that: […]
Urgh. This is messy in our config and I can understand the confusion.
Based on the discussion, we can define the following configurations: […]
These topic resolutions are based on what I saw implemented in […].
Please lmk if this is how it should work.
The problem we face is that the network specs for cluster id != 1 are undefined. A couple of points:
Maybe adding a […]. The reasoning behind autosharding for cluster != 1 was to make it easier to debug and test. If this is not the case, then what are we doing?
It does make the testing easier.
I feel like we should not test cluster != 1, since it's undefined behavior (in the context of autosharding); that would simplify things greatly, no? Since cluster != 1 is for testing purposes only, should it really be tested? Also, when you say […]. We need a simpler way to config Nwaku. Seems like this should be simple, but it's not, and it makes testing difficult.
Can you share that spec please? Maybe I'm missing something. We started testing autosharding on cluster != 1 mainly because on cluster 1 RLN is enabled by default and the testing frameworks are not configured for relay yet; even if they were, I feel the rate limiting will be a problem for tests where we send lots of messages in parallel. Also, as discussed with @alrevuelta in this thread, autosharding on other clusters should be possible, not only for testing purposes.
TWN -> https://rfc.vac.dev/spec/64/#network-shards and autosharding -> https://rfc.vac.dev/spec/51/#automatic-sharding We can make changes if some part is unclear.
I see
Yes, it is possible. Maybe we could define autosharding on cluster != 1 as having shards 0-1023, just like static shards? https://rfc.vac.dev/spec/51/#static-sharding
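For context, a statically sharded pubsub topic per the RFC linked above is simply named by cluster id and shard index, with shard indices ranging over 0-1023. A minimal sketch (the helper name is mine, not a Waku API):

```python
def static_pubsub_topic(cluster_id: int, shard: int) -> str:
    """Build a named sharded pubsub topic such as /waku/2/rs/2/3.

    Static sharding (RFC 51) allows shard indices 0-1023 per cluster.
    """
    if not 0 <= shard <= 1023:
        raise ValueError("static shard indices range over 0-1023")
    return f"/waku/2/rs/{cluster_id}/{shard}"
```

For example, `static_pubsub_topic(2, 3)` yields the `/waku/2/rs/2/3` form that appears later in this thread.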
Revisiting this issue. I know this contradicts a bit what I said earlier, but @SionoiS makes some good points:
In short, my suggestion: […]
WDYT of this suggestion? cc @fbarbu15 @SionoiS @alrevuelta @richard-ramos
Thanks @jm-clius, that makes sense to me and it would simplify testing for autosharding.
Unsure what you mean by this. A network is defined by its nodes. You can already do this by running nodes like this […] and connecting them together.
Sure. Guess we would need a network flag for this (auto vs static). But tbh, if Status is not using static sharding, why have auto vs static? Just one type of sharding and that's it (=auto).
Only if we assume that a node subscribes to all shards.
Yes, by not allowing autosharding for every cluster we don't have to deal with the edge cases just yet. 👍
Within the context of a configuration, RLN membership set and autosharding, the "network" definition is a combination of cluster ID and shards (with implied generation, membership set, rate limit and contract address). We have a single RLN membership for the network defined by cluster ID 1. If I have a node participating in the Waku Network and this node is encapsulated in an application that decides to use autosharding, any content topic the node publishes to should map according to a hash function that is "aware" of the 8 shards that define Gen 0 of the network. This should be true, whether the node is subscribed to all 8 pubsub topics or not. In fact, the subscriptions of the node should not affect the hashing of content topics to pubsub topics at all (by specification).
I don't think strictly-speaking Waku has autosharding - it's just a convenience API on top of a Waku network. We only have "static" sharding on the Waku protocol level. Autosharding provides a convenient application level API that automatically populates the shard/pubsub topic based on content topic. For that it needs to know what the hash space looks like, which depends on a cluster + num_of_shards (and generation) definition. We've only defined TWN 8 shards so far. My proposal is to define one more such network for testing purposes. I don't necessarily think we'd need to add configuration - autosharding is an implied use case if the application uses an API without specifying a pubsub topic.
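To make the hash-space point concrete, here is a sketch of the Gen 0 content-topic-to-shard mapping as I understand it from RFC 51 and nwaku's sharding code. The function name and parsing are mine, and the exact byte handling should be treated as an assumption rather than a reference implementation:

```python
import hashlib

GEN0_SHARD_COUNT = 8  # the 8 shards defined for Gen 0 of The Waku Network

def gen0_shard(content_topic: str, num_shards: int = GEN0_SHARD_COUNT) -> int:
    """Map /{application}/{version}/{topic-name}/{encoding} to a shard index.

    Assumed rule: sha256 over application+version only, last 8 bytes read
    as a big-endian uint64, reduced modulo the shard count. The node's
    subscriptions play no role here, matching the point made above.
    """
    application, version = content_topic.split("/")[1:3]
    digest = hashlib.sha256((application + version).encode()).digest()
    return int.from_bytes(digest[-8:], "big") % num_shards
```

Whatever the precise byte handling, the key property is that the result depends only on the content topic and the agreed shard count, never on which pubsub topics the node happens to subscribe to.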
imho that's not how it works. This […] subscribes to all because that's the default.
But this […] only subscribes to […].
Well, I'm even more confused now. I thought […]
This is not the final solution, but a node should know the available shards of the network, either directly (topics) or implicitly (like […]). I would deprecate it, but I'm afraid that will break some things that by now I'm unaware how to deal with.
Right. I notice that this is used to initialize the sharding. The only place this number of shards is used seems to be for autosharding. In retrospect, this seems to me to be the wrong approach, especially because we also subscribe to all […]. Everything else should be deprecated, IMO.
I think there are 2 issues here:
1st: content topics resolve to a very big shard number if --pubsub-topic is not present in the docker flags
EX:
```shell
docker run -i -t -p 34696:34696 -p 34697:34697 -p 34698:34698 -p 34699:34699 -p 34700:34700 harbor.status.im/wakuorg/nwaku:latest --listen-address=0.0.0.0 --rest=true --rest-admin=true --websocket-support=true --log-level=DEBUG --rest-relay-cache-capacity=100 --websocket-port=34698 --rest-port=34696 --tcp-port=34697 --discv5-udp-port=34699 --rest-address=0.0.0.0 --nat=extip:172.18.139.12 --peer-exchange=true --discv5-discovery=true --cluster-id=2 --content-topic=/toychat/2/huilong/proto --relay=true --filter=true
```
Will resolve to:
/waku/2/rs/2/58355
While it should resolve to:
/waku/2/rs/2/3
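A side observation of mine (speculation, not confirmed anywhere in this thread): the oversized index is consistent with the hash value simply not being reduced modulo the 8 Gen 0 shards, because applying that reduction to the reported value recovers exactly the expected shard:

```python
reported = 58355  # shard that nwaku resolved /toychat/2/huilong/proto to
expected = 3      # shard the reporter says it should resolve to
# 58355 = 8 * 7294 + 3, so a missing "mod 8" step would explain the symptom
assert reported % 8 == expected
```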
2nd: any content topic resolves to shard 0 if --pubsub-topic is present in the docker flags
EX:
```shell
docker run -i -t -p 34696:34696 -p 34697:34697 -p 34698:34698 -p 34699:34699 -p 34700:34700 harbor.status.im/wakuorg/nwaku:latest --listen-address=0.0.0.0 --rest=true --rest-admin=true --websocket-support=true --log-level=DEBUG --rest-relay-cache-capacity=100 --websocket-port=34698 --rest-port=34696 --tcp-port=34697 --discv5-udp-port=34699 --rest-address=0.0.0.0 --nat=extip:172.18.139.12 --peer-exchange=true --discv5-discovery=true --cluster-id=2 --pubsub-topic=/waku/2/rs/2/2 --content-topic=/toychat/2/huilong/proto --content-topic=/statusim/1/community/cbor --content-topic=/waku/2/content/test.js --relay=true --filter=true
```
I can see in the logs: […]
While /toychat/2/huilong/proto should resolve to shard 3, /statusim/1/community/cbor to 4, and /waku/2/content/test.js to 1.
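Echoing the earlier point that subscriptions must not affect resolution: under the assumed Gen 0 rule (sha256 over application+version, last 8 bytes big-endian, mod 8; the helper name is mine), the shard depends only on each content topic's application and version fields, so a --pubsub-topic flag cannot legitimately collapse every topic to shard 0:

```python
import hashlib

def gen0_shard(content_topic: str, num_shards: int = 8) -> int:
    # Assumed mapping: only the application and version fields feed the hash
    application, version = content_topic.split("/")[1:3]
    digest = hashlib.sha256((application + version).encode()).digest()
    return int.from_bytes(digest[-8:], "big") % num_shards

topics = [
    "/toychat/2/huilong/proto",
    "/statusim/1/community/cbor",
    "/waku/2/content/test.js",
]
# Nothing about the node's configuration enters the computation, so adding
# --pubsub-topic flags cannot legitimately change any of these results.
shards = {t: gen0_shard(t) for t in topics}
```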