Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: triggerDiscovery not implemented for handleNewRelayTopicSubscription #1044

Closed
romanzac opened this issue Feb 26, 2024 · 10 comments
Closed
Assignees
Labels
bug Something isn't working

Comments

@romanzac
Copy link
Collaborator

romanzac commented Feb 26, 2024

Description

Discovery in not started when new topic is subscribed and is in an unhealthy state. When I try to process additional event "relay.PEER_JOINED", the health of topic still won't change. It is not clear how the new topic becomes healthy.

To Reproduce

Checkout 0091a52
cd go-waku/waku/v2/peermanager
go test -run TestHandlePeerTopicEvent

Expected behavior

Discovery implemented for the situation new topic + unhealthy state. Or description on how the discovery should be conducted otherwise.

@romanzac romanzac added the bug Something isn't working label Feb 26, 2024
@romanzac
Copy link
Collaborator Author

romanzac commented Feb 26, 2024

More info visible at b8b39a0 Despite peerstore having the information, the topic members won't get updated.

call pm.host.Peerstore().(*wps.WakuPeerstoreImpl).PeersByPubSubTopic(pubSubTopic) - OK
call peerTopic.topic.ListPeers() - No peers after event handling is done.

Could be the reason relay protocol (relay.WakuRelayID_v200) I use in test?

// getPeers for getting all the peers for a given protocol
// since peerMap is only used in peerManager that's why it is unexported
func (slots *ServiceSlots) getPeers(proto protocol.ID) *peerMap {
	if proto == relay.WakuRelayID_v200 {
		return nil
	}
	slots.mu.Lock()
	defer slots.mu.Unlock()
	if slots.m[proto] == nil {
		slots.m[proto] = newPeerMap()
	}
	return slots.m[proto]
}

@chaitanyaprem
Copy link
Collaborator

chaitanyaprem commented Mar 4, 2024

This is not implemented, as it was not required.

The peer discovery gets triggered due to the peerManager connectivityLoop which keeps checking for health of a topic and initiates or discovers peers to connect to in order to improve topicHealth.

The connectivityLoop starts as part of PeerManager Start functionality.

@romanzac
Copy link
Collaborator Author

romanzac commented Mar 4, 2024

This is not implemented, as it was not required.

The peer discovery gets triggered due to the peerManager connectivityLoop which keeps checking for health of a topic and initiates or discovers peers to connect to in order to improve topicHealth.

The connectivityLoop starts as part of PeerManager Start functionality.

I understand. I have unpacked pm.Start() function to execute "go pm.peerEventLoop(ctx)" as my TestHandlePeerTopicEvent body. I believe I have all pieces of code together. I have 1+4 hosts connected (connectedness=1) with one pubSubTopic. Still I cannot get non empty list of peers calling topic.topic.ListPeers() thus checkAndUpdateTopicHealth() won't upgrade health of the topic. Could you help please to identify what is missing here ?

f74bb4e

I spend so much energy on this because it is related to reliability. Hope we all could benefit.

@romanzac
Copy link
Collaborator Author

romanzac commented Mar 4, 2024

More debug info. I have 4 IDS peers with DirOutbound connection from one query yet getting zero connections from another query. More in commit 4b6f6d2

2024-03-04T19:59:54.765+0800 INFO gowaku peermanager/topic_event_handler_test.go:289 IDS peers {"in ": "0", "out": "4"}
2024-03-04T19:59:54.765+0800 INFO gowaku peermanager/topic_event_handler_test.go:292 IDS peers {"not connected": "0"}
2024-03-04T19:59:54.765+0800 INFO gowaku.peer-manager peermanager/peer_manager.go:290 ensureMinRelayConnsPerTopic {"pubSubTopic": "/waku/2/go/pm/test"}
2024-03-04T19:59:54.765+0800 INFO gowaku.peer-manager peermanager/peer_manager.go:298 subscribed topic is not sufficiently healthy, initiating more connections to maintain health {"pubSubTopic": "/waku/2/go/pm/test", "connectedPeerCount": 0, "optimumPeers": 4}

@chaitanyaprem
Copy link
Collaborator

Took a look at the test code.

I am making some assumptions here, but IIUC you want to verify peerTopicHealth being reported correctly based on peerCount subscribed to a pubsubTopic.
There is already a unit test i wrote which takes care of that. Refer

func TestConnectionStatusChanges(t *testing.T) {

Also, if a peer is not subscribed to the pubsubTopic, then even if the node is connected to that peer topicHealth wouldn't change as that peer wouldn't be counted for health of this particular topic.

It would help if you can indicate object of your test so i can suggest what needs to be done.

@romanzac
Copy link
Collaborator Author

romanzac commented Mar 5, 2024

Took a look at the test code.

I am making some assumptions here, but IIUC you want to verify peerTopicHealth being reported correctly based on peerCount subscribed to a pubsubTopic. There is already a unit test i wrote which takes care of that. Refer

func TestConnectionStatusChanges(t *testing.T) {

Also, if a peer is not subscribed to the pubsubTopic, then even if the node is connected to that peer topicHealth wouldn't change as that peer wouldn't be counted for health of this particular topic.

It would help if you can indicate object of your test so i can suggest what needs to be done.

Thanks for the example. Test objective is to test topic event handlers in Peer Manager. Particularly to see how handlerPeerTopicEvent() could process "relay.PEER_JOINED" and "relay.PEER_LEFT" events with regard to topic health. We could possibly simplify it to just check on wether the host has left the lip2p2 topic or not. I agree health state change is even more complex to verify. My point was to see interaction Peer Manager <> libp2p works well in regard to health.

@chaitanyaprem
Copy link
Collaborator

Took a look at the test code.
I am making some assumptions here, but IIUC you want to verify peerTopicHealth being reported correctly based on peerCount subscribed to a pubsubTopic. There is already a unit test i wrote which takes care of that. Refer

func TestConnectionStatusChanges(t *testing.T) {

Also, if a peer is not subscribed to the pubsubTopic, then even if the node is connected to that peer topicHealth wouldn't change as that peer wouldn't be counted for health of this particular topic.
It would help if you can indicate object of your test so i can suggest what needs to be done.

Thanks for the example. Test objective is to test topic event handlers in Peer Manager. Particularly to see how handlerPeerTopicEvent() could process "relay.PEER_JOINED" and "relay.PEER_LEFT" events with regard to topic health. We could possibly simplify it to just check on wether the host has left the lip2p2 topic or not. I agree health state change is even more complex to verify. My point was to see interaction Peer Manager <> libp2p works well in regard to health.

I am wondering if it makes sense to simulate so much code just to test this. Rather can't we test it from a high-level API similar to what i wrote for healthCheck?

@romanzac
Copy link
Collaborator Author

romanzac commented Mar 6, 2024

Took a look at the test code.
I am making some assumptions here, but IIUC you want to verify peerTopicHealth being reported correctly based on peerCount subscribed to a pubsubTopic. There is already a unit test i wrote which takes care of that. Refer

func TestConnectionStatusChanges(t *testing.T) {

Also, if a peer is not subscribed to the pubsubTopic, then even if the node is connected to that peer topicHealth wouldn't change as that peer wouldn't be counted for health of this particular topic.
It would help if you can indicate object of your test so i can suggest what needs to be done.

Thanks for the example. Test objective is to test topic event handlers in Peer Manager. Particularly to see how handlerPeerTopicEvent() could process "relay.PEER_JOINED" and "relay.PEER_LEFT" events with regard to topic health. We could possibly simplify it to just check on wether the host has left the lip2p2 topic or not. I agree health state change is even more complex to verify. My point was to see interaction Peer Manager <> libp2p works well in regard to health.

I am wondering if it makes sense to simulate so much code just to test this. Rather can't we test it from a high-level API similar to what i wrote for healthCheck?

I agree, if the complexity of the unit test goes up to the level of a complete stack, it would make sense to just test that stack. Do you mean test https://github.com/waku-org/go-waku/blob/master/cmd/waku/server/rest/health.go ? Or which healthCheck do you refer to ?

@chaitanyaprem
Copy link
Collaborator

Took a look at the test code.
I am making some assumptions here, but IIUC you want to verify peerTopicHealth being reported correctly based on peerCount subscribed to a pubsubTopic. There is already a unit test i wrote which takes care of that. Refer

func TestConnectionStatusChanges(t *testing.T) {

Also, if a peer is not subscribed to the pubsubTopic, then even if the node is connected to that peer topicHealth wouldn't change as that peer wouldn't be counted for health of this particular topic.
It would help if you can indicate object of your test so i can suggest what needs to be done.

Thanks for the example. Test objective is to test topic event handlers in Peer Manager. Particularly to see how handlerPeerTopicEvent() could process "relay.PEER_JOINED" and "relay.PEER_LEFT" events with regard to topic health. We could possibly simplify it to just check on wether the host has left the lip2p2 topic or not. I agree health state change is even more complex to verify. My point was to see interaction Peer Manager <> libp2p works well in regard to health.

I am wondering if it makes sense to simulate so much code just to test this. Rather can't we test it from a high-level API similar to what i wrote for healthCheck?

I agree, if the complexity of the unit test goes up to the level of a complete stack, it would make sense to just test that stack. Do you mean test https://github.com/waku-org/go-waku/blob/master/cmd/waku/server/rest/health.go ? Or which healthCheck do you refer to ?

Referring to this test

func TestConnectionStatusChanges(t *testing.T) {

@romanzac
Copy link
Collaborator Author

romanzac commented Mar 8, 2024

I've reduced a scope of unit tests in #1035 Is there any other test required from me to write ? If not, I would suggest to close this issue for your original comment "This is not implemented, as it was not required." We can test health later at interop level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

No branches or pull requests

3 participants