Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multidim Interop tests are broken in master #1809

Closed
achingbrain opened this issue Jun 14, 2023 · 10 comments · Fixed by #1845
Closed

Multidim Interop tests are broken in master #1809

achingbrain opened this issue Jun 14, 2023 · 10 comments · Fixed by #1845
Assignees
Labels
kind/test Testing work P0 Critical: Tackled by core team ASAP

Comments

@achingbrain
Copy link
Member

This PR was merged with green CI but now these tests fail in master.

It looks like master runs more test variations there than there were on the PR:

Master:

image

https://github.com/libp2p/js-libp2p/actions/runs/5251996577

#1788:

image

https://github.com/libp2p/js-libp2p/actions/runs/5201404450

CC @MarcoPolo @maschad

@achingbrain achingbrain added the need/triage Needs initial labeling and prioritization label Jun 14, 2023
@maschad maschad added P0 Critical: Tackled by core team ASAP need/triage Needs initial labeling and prioritization kind/test Testing work and removed need/triage Needs initial labeling and prioritization labels Jun 15, 2023
@maschad
Copy link
Member

maschad commented Jun 15, 2023

It looks like master runs more test variations there than there were on the PR:

I believe that's because the workflow file is going to run the tests against the latest JS versions in test plans but @MarcoPolo can confirm this. Regardless, I don't running those variations is a drawback as that provides better coverage.

It seems like #1804 introduced some inconsistency in the master CI runs (given that the workflow on that PR check passed) and that previous runs tested against this webrtc and webrtc-direct matrix were passing prior to this PR being merged on master, and subsequent runs post-merge of #1804 are failing.

@achingbrain
Copy link
Member Author

Hmm, interesting. The multidim tests passed for the #1804 PR though so that might be a red herring?

@maschad
Copy link
Member

maschad commented Jun 15, 2023

Hmm, interesting. The multidim tests passed for the #1804 PR though so that might be a red herring?

Yah that's what I was referring to by stating (given that the workflow on that PR check passed) but I don't think it's a red herring because all subsequent master CI Multidim interop runs have failed post-merge of #1804 so that seems to be consistent. The error also seems to be the same in that the streams are aborting early (there is also another error related to monorepo refactor but seems that will be fixed by #1828)

But these tests pass locally when I run them, perhaps it's a timeout issue ?

@MarcoPolo
Copy link
Collaborator

The extra tests that happened in master are because I merged this PR on test-plans libp2p/test-plans#189 that introduced js-libp2p v0.45 to the test-plans repo so that every implementation (including js-libp2p) tests against this released version of js-libp2p. #1788 Was merged before libp2p/test-plans#189 was, so it didn't know about that released version yet.

The timelime:

@MarcoPolo
Copy link
Collaborator

I wonder if the test passed in #1804 because dependency resolution worked out then, but failed after because of a different dependency resolution (just a guess). I'm going to try running the multidim CI for 1804 and see if it still passes.

@achingbrain
Copy link
Member Author

If I'm reading this correctly that means the tests that are run are controlled by whatever is in master of https://github.com/libp2p/test-plans?

Is there a way to depend on a specific version of the multidim tests instead? Otherwise commits in an unrelated repo can break CI in this repo which makes the build non-deterministic and leads to confusion like above.

Previously this was handled by having releases of https://github.com/libp2p/interop with semver and all that.

@achingbrain
Copy link
Member Author

Ah, wait - we can control that here.

@maschad maschad assigned achingbrain and unassigned maschad Jun 19, 2023
@p-shahi
Copy link
Member

p-shahi commented Jun 19, 2023

@achingbrain
It looks like all of the interop tests are now failing as a result of failing to import modules post the monorepo consolidation: https://github.com/libp2p/js-libp2p/actions/runs/5315540166/jobs/9624024987

Error [ERR_MODULE_NOT_FOUND]: Cannot find package '@libp2p/crypto' imported from /app/packages/libp2p/dist/src/libp2p.js\n" +

Ah there's a PR in flight that will likely address this #1831

@achingbrain
Copy link
Member Author

achingbrain commented Jun 20, 2023

#1842 fixes that particular error but the tests still fail so there's more work to do.

One error I'm seeing in the logs right now is:

unexpected exception in ping test: Error: The operation was aborted

This seems like a timeout to me. @maschad has had success running these tests locally so this could be the test suite overwhelming the weaker CI machines.

The other is:

unexpected exception in ping test: Error: stream ended before 1 bytes became available

This happens when the remote closes the stream before it sends any data (e.g. before multistream select has a chance to run). It's symptomatic of the issue described here and is probably related to the above.

@MarcoPolo
Copy link
Collaborator

MarcoPolo commented Jun 20, 2023

WebRTC tests seem flaky to me. I don't think this is a resource issue in CI. Running these tests locally (in compose and npm test) also sometimes fail for me.

Let's remove the webrtc tests for now, and add them back in when they're not flaky.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/test Testing work P0 Critical: Tackled by core team ASAP
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants