HBASE-23899 [Flakey Test] Stabilizations and Debug #1212
Closed
💔 -1 overall (this message was automatically generated)
saintstack force-pushed the HBASE-23899 branch from 1ffe465 to bcae55c on February 26, 2020 23:47
saintstack force-pushed the HBASE-23899 branch from bcae55c to fcddbdf on February 27, 2020 22:02
After fixing checkstyle and findbugs issues, I merged this to branch-2 and master.
A miscellany. Add extra logging to a bunch of tests to help w/ debug.
Fix some issues, particularly where we ran into mismatched filesystem
complaints. Some modernizations, removal of unnecessary deletes
(especially after seeing tests fail in table delete), and cleanup.
Recategorized one test from medium to large because it starts four
clusters in the one JVM. Finally, the zk standalone server won't come
up on occasion; added debug and thread dumping to help figure out why
(manifests as a test failing in startup saying master didn't launch).
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshot.java
Fixes occasional mismatched filesystems where the difference is file:// vs file:///
or we pick up the hdfs scheme when it is a local fs test. Had to vet
how we make a Path qualified in a few places, not just here, as a few
tests failed with this same issue. Code in here is used by a lot of
tests that each in turn suffered this mismatch.
Refactor for clarity.
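To illustrate the kind of mismatch described above, here is a minimal JDK-only sketch. It does not use Hadoop's Path or FileSystem and is not the check Hadoop performs; the class name `FsUriCheck`, the helper `sameFileSystem`, and the example paths are hypothetical. It only shows that comparing URI strings treats `file:/` and `file:///` as different, while comparing scheme and authority does not, and that an hdfs URI never matches a local one.

```java
import java.net.URI;
import java.util.Objects;

public class FsUriCheck {
    /**
     * Hypothetical helper: true when both URIs name the same filesystem,
     * i.e. same scheme (case-insensitive) and same authority.
     */
    static boolean sameFileSystem(URI a, URI b) {
        String sa = a.getScheme();
        String sb = b.getScheme();
        boolean sameScheme = (sa == null) ? (sb == null) : sa.equalsIgnoreCase(sb);
        return sameScheme && Objects.equals(a.getAuthority(), b.getAuthority());
    }

    public static void main(String[] args) {
        URI oneSlash = URI.create("file:/tmp/export/a");
        URI threeSlashes = URI.create("file:///tmp/export/a");
        URI hdfs = URI.create("hdfs://localhost:8020/tmp/export/a");

        // Naive string comparison sees two different locations.
        System.out.println("strings equal: "
            + oneSlash.toString().equals(threeSlashes.toString()));
        // Comparing scheme and authority sees the same local filesystem.
        System.out.println("same fs (file:/ vs file:///): "
            + sameFileSystem(oneSlash, threeSlashes));
        // A path qualified against hdfs does not belong to a local fs test.
        System.out.println("same fs (file vs hdfs): "
            + sameFileSystem(threeSlashes, hdfs));
    }
}
```

In a test, the practical upshot is to qualify every Path against the one filesystem the test actually intends to use, rather than mixing paths qualified in different ways.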
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshotV1NoCluster.java
Remove unused import.
hbase-procedure/src/test/java/org/apache/hadoop/hbase/procedure2/store/wal/TestWALProcedureStore.java
This test fails if the tmp dir is not where it expects because it tries
to make the rootdir there. Give it a rootdir under the test data dir.
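As a rough sketch of the idea, assuming nothing about the HBase test utilities: the class `TestDataDir`, the helper `newTestDir`, and the `target/test-data` base are all hypothetical. The point is only that the rootdir is created under a directory the test controls instead of a fixed location under the system tmp dir.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

public class TestDataDir {
    /**
     * Hypothetical helper: creates a unique directory for one test under
     * the given base, so the test never depends on where tmp happens to be.
     */
    static Path newTestDir(Path base, String testName) throws IOException {
        Path dir = base.resolve(testName + "-" + UUID.randomUUID());
        return Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: per-module test data lives under target/test-data.
        Path base = Paths.get("target", "test-data");
        Path rootDir = newTestDir(base, "TestWALProcedureStore");
        System.out.println("rootdir=" + rootDir.toAbsolutePath());
    }
}
```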
hbase-server/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
This change is probably useless. I think the issue is actually
a problem addressed later where our test for zk server being
up gets stuck and never times out.
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestSplitOrMergeStatus.java
Move off deprecated APIs.
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/BalancerTestBase.java
Log why when we fail the balance check, to help w/ debug. Currently it just says 'false'.
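A sketch of what "log why" can look like, not the BalancerTestBase code: the class `BalanceCheck`, both method names, and the balance rule used here (every server holds between floor and ceiling of the average region count) are assumptions made for illustration.

```java
import java.util.Arrays;

public class BalanceCheck {
    /**
     * Hypothetical check: returns null when balanced, else a description of
     * the first server whose load falls outside [floor(avg), ceil(avg)].
     */
    static String explainImbalance(int[] regionsPerServer) {
        int servers = regionsPerServer.length;
        if (servers == 0) {
            return null; // nothing to balance
        }
        int total = Arrays.stream(regionsPerServer).sum();
        int min = total / servers;
        int max = (total % servers == 0) ? min : min + 1;
        for (int i = 0; i < servers; i++) {
            int load = regionsPerServer[i];
            if (load < min || load > max) {
                return "server " + i + " has " + load + " regions; expected between "
                    + min + " and " + max + " (total=" + total + ", servers=" + servers + ")";
            }
        }
        return null;
    }

    /** Same boolean answer as before, but the reason is logged on failure. */
    static boolean isBalanced(int[] regionsPerServer) {
        String why = explainImbalance(regionsPerServer);
        if (why != null) {
            System.err.println("Balance check failed: " + why);
        }
        return why == null;
    }

    public static void main(String[] args) {
        System.out.println(isBalanced(new int[] {3, 3, 4}));
        System.out.println(isBalanced(new int[] {1, 9}));
    }
}
```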
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestSplitWALProcedure.java
Fix NPEs on the way out when setup has failed.
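The general pattern is a null-guarded teardown. The sketch below is illustrative only; `GuardedTeardown` and its nested `Resource` are hypothetical stand-ins for the test class and whatever it creates in setup.

```java
public class GuardedTeardown {
    /** Hypothetical stand-in for something created in setup, e.g. a mini cluster. */
    static final class Resource implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Stays null if setup() throws before the assignment.
    private Resource resource;

    void setup(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated setup failure");
        }
        resource = new Resource();
    }

    /** Safe to call whether or not setup() succeeded. */
    void teardown() {
        if (resource != null) {
            resource.close();
            resource = null;
        }
    }

    public static void main(String[] args) {
        GuardedTeardown test = new GuardedTeardown();
        try {
            test.setup(true);
        } catch (IllegalStateException e) {
            System.out.println("setup failed: " + e.getMessage());
        }
        // Without the null guard this call would throw a NullPointerException
        // and hide the original setup failure.
        test.teardown();
        System.out.println("teardown completed");
    }
}
```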
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
Add logging when an assert fails to help w/ debug.
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerAbortTimeout.java
Don't bother removing stuff on teardown. It all gets thrown away anyway.
Saw a few hangs in here in the teardown where hdfs was down before
expected, messing up shutdown.
hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
Add timeout on socket; was seeing the check for the zk server being up
getting stuck and never timing out (test times out in startup).
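A minimal sketch of a bounded "is the server up" probe, using only JDK sockets. This is not the MiniZooKeeperCluster code: the class `ServerUpProbe` and method `isServerUp` are hypothetical, and the use of the `stat` four-letter command and the `Zookeeper version:` reply prefix is an assumption about the probe being made. The relevant part is the pair of timeouts: one on connect and one on read, so a server that accepts the connection but never answers cannot hang the caller.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ServerUpProbe {
    /**
     * Hypothetical probe: returns true only if the server answers within
     * timeoutMs. Any I/O problem, including a read timeout, counts as "not up".
     */
    static boolean isServerUp(String host, int port, int timeoutMs) {
        try (Socket sock = new Socket()) {
            // Bound the connect itself.
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            // Bound every blocking read; without this, readLine() can block forever.
            sock.setSoTimeout(timeoutMs);

            OutputStream out = sock.getOutputStream();
            out.write("stat".getBytes(StandardCharsets.US_ASCII)); // assumed command
            out.flush();

            BufferedReader reader = new BufferedReader(
                new InputStreamReader(sock.getInputStream(), StandardCharsets.US_ASCII));
            String line = reader.readLine();
            return line != null && line.startsWith("Zookeeper version:"); // assumed reply
        } catch (IOException e) {
            // SocketTimeoutException is an IOException, so timeouts land here too.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isServerUp("localhost", 2181, 1000));
    }
}
```

A caller would typically retry this probe in a loop with its own overall deadline, so that a single stuck attempt costs at most one timeout.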
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshotWithTemporaryDirectory.java
Write to test data dir instead.
Be careful about how we make qualified paths.
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableInputFormatScanBase.java
Remove snowflake configs.
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationStatus.java
Add a hacky pause. Tried adding barriers but that didn't work. Needs a
deep dive.
hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
Remove code copied from zk and use zk methods directly instead.
A general problem is that the zk cluster occasionally doesn't come up,
but no clue why. Add thread dumping and state check.
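For reference, a thread dump on startup failure can be produced with the JDK alone. The sketch below is not the HBase utility used in the patch; the class `ThreadDump` and method `dumpAllThreads` are hypothetical. It prints each live thread's name, daemon flag, state, and stack.

```java
import java.util.Map;

public class ThreadDump {
    /** Hypothetical helper: renders all live threads and their stacks as text. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" daemon=").append(t.isDaemon())
              .append(" state=").append(t.getState())
              .append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Typical use: call this just before failing a startup check, so the
        // log shows what every thread was doing when the server did not come up.
        System.err.println(dumpAllThreads());
    }
}
```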