Fix Flakey Tests on CI for PubSub examples #1540

Closed
maschad opened this issue Jan 10, 2023 · 4 comments · Fixed by #1549
Labels
need/triage Needs initial labeling and prioritization

Comments

@maschad
Member

maschad commented Jan 10, 2023

Severity:

Low

Description:

On occasion the test-example (pubsub) job fails because messages are not received. I suspect we could try increasing the delay.

Steps to reproduce the error:

Not easily reproducible, as it doesn't happen frequently.

@maschad maschad added the need/triage Needs initial labeling and prioritization label Jan 10, 2023
@tabcat
Contributor

tabcat commented Jan 10, 2023

I had similar issues and fixed them by listening for subscription-change events to make sure all nodes were peered: hldb/welo@3f60808

For some reason, even with long timeouts, some nodes would never peer.
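
A minimal sketch of the event-based wait described above, not the code from the linked commit. It assumes the pubsub object is an `EventTarget` that emits `subscription-change` events whose `detail` has the shape `{ peerId, subscriptions: [{ topic, subscribe }] }`; the helper name `waitForSubscriptionEvents` and its parameters are hypothetical.

```javascript
// Hypothetical helper: resolves once a 'subscription-change' event has been
// seen for every peer in `peerIds` subscribing to `topic`.
// Assumption: `evt.detail` is `{ peerId, subscriptions: [{ topic, subscribe }] }`.
function waitForSubscriptionEvents (pubsub, topic, peerIds, timeoutMs = 5000) {
  // Compare peers by their string form so PeerId objects and strings both work
  const pending = new Set(peerIds.map(String))

  return new Promise((resolve, reject) => {
    if (pending.size === 0) {
      resolve()
      return
    }

    // Fail loudly instead of hanging if a peer never subscribes
    const timer = setTimeout(() => {
      pubsub.removeEventListener('subscription-change', onChange)
      reject(new Error(`Timed out waiting for subscribers to ${topic}: ${[...pending].join(', ')}`))
    }, timeoutMs)

    function onChange (evt) {
      const { peerId, subscriptions } = evt.detail

      if (subscriptions.some(s => s.topic === topic && s.subscribe)) {
        pending.delete(String(peerId))
      }

      if (pending.size === 0) {
        clearTimeout(timer)
        pubsub.removeEventListener('subscription-change', onChange)
        resolve()
      }
    }

    pubsub.addEventListener('subscription-change', onChange)
  })
}
```

The listener only sees events fired after it is attached, so in a test it would need to be registered before the other nodes subscribe.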

@achingbrain
Member

I think that, instead of increasing the delay, a better fix in the message filtering test might be for node1 to wait until it sees node2 and node3 in its peer list for the 'fruit' topic before publishing the messages?
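
A rough sketch of that idea, polling the subscriber list rather than sleeping for a fixed delay. It assumes the pubsub object exposes `getSubscribers(topic)` returning a list of peer IDs; the helper name `waitForSubscribers` and its options are hypothetical and not taken from the example code.

```javascript
// Hypothetical helper: polls `pubsub.getSubscribers(topic)` until every
// expected peer appears in the list, or rejects once `timeoutMs` has elapsed.
// Assumption: `getSubscribers(topic)` returns an array of peer IDs.
async function waitForSubscribers (pubsub, topic, expectedPeers, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const expected = expectedPeers.map(String)
  const deadline = Date.now() + timeoutMs

  while (true) {
    const current = new Set(pubsub.getSubscribers(topic).map(String))
    const missing = expected.filter(id => !current.has(id))

    if (missing.length === 0) {
      return
    }

    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for ${topic} subscribers: ${missing.join(', ')}`)
    }

    // Wait a little before checking the subscriber list again
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
}
```

In the test this might be called as `await waitForSubscribers(node1.pubsub, 'fruit', [node2.peerId, node3.peerId])` before publishing, assuming the nodes expose `pubsub` and `peerId` properties.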

@maschad
Member Author

maschad commented Jan 11, 2023

I was originally confused by the logs saying `Error: Not enough banana messages - received 1, expected 2` even though two banana messages had been received.

> I think instead of increasing the delay a better fix in the message filtering test might be for node1 to wait until it sees node2 and node3 in its peer list for the 'fruit' topic before publishing the messages?

What's interesting is that this approach doesn't seem to work unless node3 dials node1 directly, i.e. node3 doesn't show up in node1's subscribers otherwise (see 7a6bff3#diff-9483cec2def56f6a8975c97c997d70741698e22fb8a971785d83502f7169bbf1R42).

achingbrain added a commit that referenced this issue Jan 14, 2023
Instead of waiting an arbitrary amount of time for subscriptions to propagate before sending messages, ensure that node1 has node2's subs and node2 has node3's subs.

Closes #1540

3 participants