refactor(swamp/tests)skip tests #2802

Closed
wants to merge 2 commits into from

Conversation

vgonkivs
Member

@vgonkivs vgonkivs commented Oct 3, 2023

Overview

Checklist

  • New and updated code has appropriate documentation
  • New and updated code has new and/or updated testing
  • Required CI checks are passing
  • Visual proof for any user facing features like CLI or documentation updates
  • Linked issues closed with keywords

@vgonkivs vgonkivs added the kind:refactor Attached to refactoring PRs label Oct 3, 2023
@vgonkivs vgonkivs self-assigned this Oct 3, 2023
@codecov-commenter

codecov-commenter commented Oct 3, 2023

Codecov Report

Merging #2802 (fb74eb2) into main (20c7476) will decrease coverage by 0.01%.
Report is 1 commits behind head on main.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #2802      +/-   ##
==========================================
- Coverage   51.81%   51.80%   -0.01%     
==========================================
  Files         161      161              
  Lines       10827    10827              
==========================================
- Hits         5610     5609       -1     
- Misses       4721     4723       +2     
+ Partials      496      495       -1     

see 6 files with indirect coverage changes

@vgonkivs vgonkivs force-pushed the skip_flaky_tests branch 2 times, most recently from 73dce97 to bdbada0 Compare October 3, 2023 10:50
@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

TestBlobModule
https://github.com/celestiaorg/celestia-node/actions/runs/6393884735/job/17354064855?pr=2802#step:5:726
    full_node.go:181: tearing down testnode
    testing.go:1465: race detected during execution of test

@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

https://github.com/celestiaorg/celestia-node/actions/runs/6393510820/job/17352897557?pr=2802#step:4:116
--- FAIL: TestFullDiscoveryViaBootstrapper (0.07s)
    testing.go:50:
        	Error Trace:	/home/runner/go/pkg/mod/github.com/celestiaorg/celestia-app@v1.0.0-rc17/test/util/testnode/full_node.go:178
        	            				/home/runner/work/celestia-node/celestia-node/core/testing.go:50
        	            				/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:81
        	            				/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/p2p_test.go:73
        	Error:      	Received unexpected error:
        	            	listen tcp 127.0.0.1:38939: bind: address already in use
        	Test:       	TestFullDiscoveryViaBootstrapper
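
The `bind: address already in use` error above is what the OS returns when a test tries to listen on a TCP port that something else already holds. One common way to avoid this class of failure is to bind to port 0 and let the kernel pick a free port. The following is a minimal sketch of that idea only; `listenOnFreePort` is a hypothetical helper, and whether the celestia-app `testnode` package can accept a pre-opened listener or a chosen port is not shown in this thread.

```go
package main

import (
	"fmt"
	"net"
)

// listenOnFreePort (hypothetical helper) asks the kernel for an unused TCP
// port on loopback by binding to port 0. It returns the open listener and the
// port that was assigned. Keeping the listener open, instead of closing it and
// re-binding later, avoids the window in which another process could take the
// same port.
func listenOnFreePort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, fmt.Errorf("listen on free port: %w", err)
	}
	port := ln.Addr().(*net.TCPAddr).Port
	return ln, port, nil
}
```

Two listeners obtained this way and held open at the same time get different ports, so tests that run in parallel do not collide on a fixed address.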

@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

with -race flag

https://github.com/celestiaorg/celestia-node/actions/runs/6393679580/job/17353406273?pr=2802#step:5:227
--- FAIL: TestSyncAgainstBridge_NonEmptyChain (17.79s)
    swamp.go:107:
        	Error Trace:	/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:107
        	            				/home/runner/go/pkg/mod/golang.org/x/exp@v0.0.0-20230817173708-d852ddb80c63/maps/maps.go:90
        	            				/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:106
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1169
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1347
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1589
        	Error:      	Received unexpected error:
        	            	node: failed to stop within timeout(2m0s): context deadline exceeded
        	Test:       	TestSyncAgainstBridge_NonEmptyChain
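
The `context deadline exceeded` suffix in the error above is the standard text of `context.DeadlineExceeded`, which suggests that a shutdown call was given a context with a deadline and did not finish in time. How celestia-node actually builds this message is not shown in the thread; the sketch below only illustrates the general pattern, and `stopWithin` and `hangingStop` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// stopWithin (hypothetical) calls stop with a context that expires after
// timeout and wraps any returned error in a message shaped like the log line.
func stopWithin(timeout time.Duration, stop func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := stop(ctx); err != nil {
		return fmt.Errorf("node: failed to stop within timeout(%s): %w", timeout, err)
	}
	return nil
}

// hangingStop stands in for a component whose shutdown never completes on its
// own; it returns only once the context is cancelled or its deadline passes.
func hangingStop(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}
```

With a two-minute timeout, `stopWithin(2*time.Minute, hangingStop)` would block for the full two minutes before returning, which is one reason a stuck teardown makes a test run slow as well as red.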

@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

https://github.com/celestiaorg/celestia-node/actions/runs/6393884735/job/17354064855?pr=2802#step:5:726
--- FAIL: TestShrexNDFromLights (10.95s)
    full_node.go:181: tearing down testnode
    testing.go:1465: race detected during execution of test

@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

https://github.com/celestiaorg/celestia-node/actions/runs/6393884735/job/17354064855?pr=2802#step:5:5885
--- FAIL: TestSyncLightWithTrustedPeers (25.79s)
    swamp.go:107:
        	Error Trace:	/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:107
        	            				/home/runner/go/pkg/mod/golang.org/x/exp@v0.0.0-20230817173708-d852ddb80c63/maps/maps.go:90
        	            				/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:106
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1169
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1347
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1589
        	Error:      	Received unexpected error:
        	            	node: failed to stop within timeout(2m0s): context deadline exceeded
        	Test:       	TestSyncLightWithTrustedPeers
    full_node.go:181: tearing down testnode
    testing.go:1465: race detected during execution of test

@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

https://github.com/celestiaorg/celestia-node/actions/runs/6394231754/job/17355184018?pr=2802#step:5:4763
--- FAIL: TestSyncStartStopLightWithBridge (14.26s)
    swamp.go:107:
        	Error Trace:	/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:107
        	            				/home/runner/go/pkg/mod/golang.org/x/exp@v0.0.0-20230817173708-d852ddb80c63/maps/maps.go:90
        	            				/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:106
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1169
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1347
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1589
        	Error:      	Received unexpected error:
        	            	node: failed to stop within timeout(2m0s): context deadline exceeded
        	Test:       	TestSyncStartStopLightWithBridge
    full_node.go:181: tearing down testnode
    testing.go:1465: race detected during execution of test

@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

https://github.com/celestiaorg/celestia-node/actions/runs/6394231754/job/17355184018?pr=2802#step:5:5452
--- FAIL: TestSyncLightAgainstFull (16.87s)
    swamp.go:107:
        	Error Trace:	/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:107
        	            				/home/runner/go/pkg/mod/golang.org/x/exp@v0.0.0-20230817173708-d852ddb80c63/maps/maps.go:90
        	            				/home/runner/work/celestia-node/celestia-node/nodebuilder/tests/swamp/swamp.go:106
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1169
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1347
        	            				/opt/hostedtoolcache/go/1.21.1/x64/src/testing/testing.go:1589
        	Error:      	Received unexpected error:
        	            	node: failed to stop within timeout(2m0s): context deadline exceeded
        	Test:       	TestSyncLightAgainstFull
    full_node.go:181: tearing down testnode
    testing.go:1465: race detected during execution of test

@vgonkivs
Member Author

vgonkivs commented Oct 3, 2023

https://github.com/celestiaorg/celestia-node/actions/runs/6394494464/job/17356459336#step:5:1357
--- FAIL: TestBridgeNodeAsBootstrapper (5.88s)
    full_node.go:181: tearing down testnode
    testing.go:1465: race detected during execution of test

@ramin
Collaborator

ramin commented Jan 15, 2024

@vgonkivs I'm going to close this one and pick up where you left off in another branch, as part of the current work on cleaning up test flakiness.

@ramin ramin closed this Jan 15, 2024
@ramin ramin mentioned this pull request Jan 18, 2024
ramin added a commit that referenced this pull request Jan 23, 2024
Building off @vgonkivs's [work to skip
tests](#2802), this
attempts to get a baseline "green" CI for us. A couple of tests still
flake VERY intermittently in CI, but this way they will stand out WHEN
they happen and we can track them down, rather than the entire run
always being red.

This does quite a few things:

- introduces build tags on the swamp tests in each named file to allow us
to run parts of the swamp/integration tests independently (i.e. `go test
./... -tags=blob`)
- adds an `integration` tag so that all of them can still be run together
- splits the integration tests into their own workflow file
(`integration-tests.yml`), which is now triggered from `go-ci.yml`
- splits each swamp/integration test to run as its own job so we can
see more explicitly which are failing or flaky
- utilizes Go's `-short` test flag to skip a few swamp/integration tests
that are consistently failing. Currently we are skipping
`TestFullReconstructFromFulls`, `TestFullReconstructFromLights`, and
`TestSyncStartStopLightWithBridge`, which is fewer than we were
originally skipping in #2802
- plugs our verbose/debug output into the integration tests in addition
to the unit tests, which had it before
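
As a rough illustration of the `-short` skipping described in the list above: the usual idiom is a call to `testing.Short()` at the top of a test, followed by `t.Skip`. The sketch below wraps that idiom in a small helper so the skipped tests are listed in one place. The helper names `shouldSkip` and `skipIfFlaky` and the central map are assumptions made for this example, not necessarily how the referenced commit does it; only the three test names come from the description.

```go
package main

import "testing"

// skippedInShort lists the tests that the description says are skipped when
// `go test -short` is used. Keeping them in one map is an assumption of this
// sketch; the real tests may each call testing.Short() directly.
var skippedInShort = map[string]bool{
	"TestFullReconstructFromFulls":     true,
	"TestFullReconstructFromLights":    true,
	"TestSyncStartStopLightWithBridge": true,
}

// shouldSkip reports whether the named test is skipped for the given -short
// setting. It is a pure function so the decision can be checked in isolation.
func shouldSkip(name string, short bool) bool {
	return short && skippedInShort[name]
}

// skipIfFlaky (hypothetical helper) is meant to be the first call in a test
// body. testing.Short() is only valid inside a `go test` binary.
func skipIfFlaky(t *testing.T) {
	t.Helper()
	if shouldSkip(t.Name(), testing.Short()) {
		t.Skip("known flaky in CI; skipped under -short")
	}
}
```

With this in place, `go test -short ./...` would skip the listed tests while a plain `go test ./...` still runs them, so the flaky cases stay available for anyone working on a fix.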

Unit tests 

- splits some of the unit tests that were "race flaky" into
`*_no_race_test.go` files and adds a `!race` tag to some others so that
the unit tests pass consistently when run with the `-race` flag
- macos-latest unit tests still fail the race run, but ONLY in GitHub
Actions CI 🤷‍♂️
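
For readers unfamiliar with the `!race` tag mentioned in the list above: the Go toolchain sets the `race` build tag when building with `-race`, so a file that starts with `//go:build !race` is left out of race-enabled builds. A minimal sketch of such a file follows; the file name, `sumTo`, and `TestSumToNoRace` are hypothetical stand-ins rather than code from the repository.

```go
//go:build !race

// Hypothetical file name: example_no_race_test.go
// This file is compiled only when the race detector is off, so the test in it
// does not run under `go test -race`.
package main

import "testing"

// sumTo is a stand-in for the code under test.
func sumTo(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i
	}
	return total
}

// TestSumToNoRace runs under a plain `go test` but is excluded from race runs
// by the build constraint at the top of the file.
func TestSumToNoRace(t *testing.T) {
	if got := sumTo(10); got != 55 {
		t.Fatalf("sumTo(10) = %d, want 55", got)
	}
}
```

The trade-off is that an excluded test gives no race coverage at all, which is presumably why the next steps include filing issues for the tests handled this way.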

Next Steps

- create issues for each test skipped in short runs
- create issues for any tests that fail under race for reasons OTHER
than the race issue in upstream cosmos-sdk
- create an issue for the macos-latest race failure in GitHub Actions
- create an issue for the macos-latest intermittent failure

I think we let this run for a while. Once we see it stay consistently
green for a bit and have isolated any more flakes that pop up, we can
toggle on more branch requirements; then, by fixing the above issues, we
can be green. [Being green is not
easy](https://www.youtube.com/watch?v=51BQfPeSK8k).
Labels
kind:refactor Attached to refactoring PRs

3 participants