HADOOP-18641. Cut excess dependencies from cloud connectors. #5429

Conversation

steveloughran
Contributor

@steveloughran steveloughran commented Feb 23, 2023

  • Exclude imports which come in with hadoop-common
  • Add explicit import of hadoop's org.codehaus.jettison declaration to hadoop-aliyun
  • Cut duplicate and inconsistent hbase-server declarations from hadoop-project

How was this patch tested?

  • building and looking at imports; verifying compilation worked.
  • testing of azure in progress.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@steveloughran
Contributor Author

Also, it looks like there are jars in the distro which are not in our license. Other than the JDK one, all come from the aliyun SDK:

aliyun-java-sdk-core-4.5.10.jar
aliyun-java-sdk-kms-2.11.0.jar
aliyun-java-sdk-ram-3.1.0.jar
aliyun-sdk-oss-3.13.0.jar
ini4j-0.5.4.jar
jdk.tools-1.8.jar
jdom2-2.0.6.jar
junit-4.13.2.jar
opentracing-api-0.33.0.jar
opentracing-noop-0.33.0.jar
opentracing-util-0.33.0.jar
org.jacoco.agent-0.8.5-runtime.jar
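
A check like this can be scripted. The following is only a minimal sketch, assuming that the license file lists each library on its own line as `group:artifact:version` and that jar names follow the `artifact-version.jar` pattern. The class `LicenseCheckSketch`, the method `unlisted` and the sample license line are hypothetical and are not part of the Hadoop build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LicenseCheckSketch {

  // Splits "artifact-1.2.3.jar" into artifact and version at the first "-<digit>".
  // A classifier such as "-runtime" ends up as part of the version string.
  private static final Pattern JAR = Pattern.compile("^(.+?)-(\\d[^/]*)\\.jar$");

  /**
   * Returns the jar names with no matching ":artifact:version" entry in the
   * license lines. Jar names that do not parse are reported as unlisted.
   */
  public static List<String> unlisted(List<String> jarNames, List<String> licenseLines) {
    List<String> missing = new ArrayList<>();
    for (String jar : jarNames) {
      Matcher m = JAR.matcher(jar);
      if (!m.matches()) {
        missing.add(jar);
        continue;
      }
      // Assumed license entry format: group:artifact:version
      String suffix = ":" + m.group(1) + ":" + m.group(2);
      boolean found = false;
      for (String line : licenseLines) {
        if (line.trim().endsWith(suffix)) {
          found = true;
          break;
        }
      }
      if (!found) {
        missing.add(jar);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Two jar names from the list above; the license line is made-up sample data.
    List<String> jars = Arrays.asList("aliyun-sdk-oss-3.13.0.jar", "ini4j-0.5.4.jar");
    List<String> license = Arrays.asList("com.aliyun.oss:aliyun-sdk-oss:3.13.0");
    System.out.println(unlisted(jars, license)); // prints [ini4j-0.5.4.jar]
  }
}
```

Matching on the `:artifact:version` suffix means a jar whose version was bumped without a license update is reported too, not only jars that are missing entirely.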

@steveloughran steveloughran changed the title from "HADOOP-1864. Cut excess dependencies from cloud connectors." to "HADOOP-18641. Cut excess dependencies from cloud connectors." Feb 23, 2023
* Exclude imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
  to hadoop-aliyun
* Cut duplicate and inconsistent hbase-server declarations from hadoop-project

Change-Id: I529b4243cf31f389b3ad67069559c9c3b1f2efde
@steveloughran steveloughran force-pushed the build/HADOOP-18642-excess-dependencies branch from 9e80cd0 to d880c84 Compare February 23, 2023 18:25
Change-Id: Iffaa4c6ebfed5aed6275981ea2dfc25f892d5ccf
Change-Id: Iee1e1182fcbdee190a16e1fe0b75e05d37aec74b
@steveloughran
Contributor Author

this is a blocker for 3.3.5; please review ASAP. thx

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 10m 11s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ branch-3.3.5 Compile Tests _
+0 🆗 mvndep 4m 50s Maven dependency ordering for branch
+1 💚 mvninstall 44m 39s branch-3.3.5 passed
+1 💚 compile 18m 33s branch-3.3.5 passed
+1 💚 mvnsite 2m 22s branch-3.3.5 passed
+1 💚 javadoc 2m 10s branch-3.3.5 passed
+1 💚 shadedclient 99m 9s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 25s Maven dependency ordering for patch
+1 💚 mvninstall 1m 29s the patch passed
+1 💚 compile 17m 56s the patch passed
+1 💚 javac 17m 56s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 2m 20s the patch passed
+1 💚 javadoc 2m 0s the patch passed
+1 💚 shadedclient 30m 16s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 26s hadoop-project in the patch passed.
+1 💚 unit 2m 12s hadoop-azure in the patch passed.
+1 💚 unit 0m 30s hadoop-aliyun in the patch passed.
+1 💚 unit 1m 1s hadoop-azure-datalake in the patch passed.
+1 💚 asflicense 0m 51s The patch does not generate ASF License warnings.
166m 50s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/1/artifact/out/Dockerfile
GITHUB PR #5429
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint
uname Linux 10f01049a65a 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-3.3.5 / 9e80cd0
Default Java Private Build-1.8.0_352-8u352-ga-1~18.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/1/testReport/
Max. process+thread count 601 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-azure hadoop-tools/hadoop-aliyun hadoop-tools/hadoop-azure-datalake U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/1/console
versions git=2.17.1 maven=3.6.0
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 39s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ branch-3.3.5 Compile Tests _
+0 🆗 mvndep 10m 59s Maven dependency ordering for branch
+1 💚 mvninstall 31m 53s branch-3.3.5 passed
+1 💚 compile 17m 45s branch-3.3.5 passed
+1 💚 mvnsite 3m 2s branch-3.3.5 passed
+1 💚 javadoc 2m 49s branch-3.3.5 passed
+1 💚 shadedclient 89m 57s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 30s Maven dependency ordering for patch
+1 💚 mvninstall 1m 36s the patch passed
+1 💚 compile 17m 1s the patch passed
+1 💚 javac 17m 1s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 2m 59s the patch passed
+1 💚 javadoc 2m 37s the patch passed
+1 💚 shadedclient 30m 4s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 36s hadoop-project in the patch passed.
+1 💚 unit 2m 33s hadoop-azure in the patch passed.
+1 💚 unit 0m 40s hadoop-aliyun in the patch passed.
+1 💚 unit 1m 13s hadoop-azure-datalake in the patch passed.
+1 💚 asflicense 0m 59s The patch does not generate ASF License warnings.
148m 5s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/2/artifact/out/Dockerfile
GITHUB PR #5429
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint
uname Linux 0d52878bde6b 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-3.3.5 / d880c84
Default Java Private Build-1.8.0_352-8u352-ga-1~18.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/2/testReport/
Max. process+thread count 725 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-azure hadoop-tools/hadoop-aliyun hadoop-tools/hadoop-azure-datalake U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/2/console
versions git=2.17.1 maven=3.6.0
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@mukund-thakur
Contributor

@steveloughran Why is just reverting https://issues.apache.org/jira/browse/HADOOP-18590 not enough?

Contributor

@ashutoshcipher ashutoshcipher left a comment

Changes look good to me.

Pending the results of the azure testing in progress; if they are good, I am +1.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 10m 7s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 shelldocs 0m 0s Shelldocs was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ branch-3.3.5 Compile Tests _
+0 🆗 mvndep 4m 52s Maven dependency ordering for branch
-1 ❌ mvninstall 33m 46s /branch-mvninstall-root.txt root in branch-3.3.5 failed.
+1 💚 compile 18m 58s branch-3.3.5 passed
+1 💚 mvnsite 25m 56s branch-3.3.5 passed
-1 ❌ javadoc 5m 44s /branch-javadoc-root.txt root in branch-3.3.5 failed.
+1 💚 shadedclient 30m 15s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 25s Maven dependency ordering for patch
+1 💚 mvninstall 26m 17s the patch passed
+1 💚 compile 18m 6s the patch passed
+1 💚 javac 18m 6s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 21m 35s the patch passed
+1 💚 shellcheck 0m 0s No new issues.
-1 ❌ javadoc 6m 42s /results-javadoc-javadoc-root.txt root generated 702 new + 5850 unchanged - 0 fixed = 6552 total (was 5850)
+1 💚 shadedclient 31m 51s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 714m 26s /patch-unit-root.txt root in the patch passed.
+1 💚 asflicense 1m 25s The patch does not generate ASF License warnings.
941m 9s
Reason Tests
Failed junit tests hadoop.hdfs.TestRollingUpgrade
hadoop.fs.viewfs.TestViewFileSystemLinkMergeSlash
hadoop.hdfs.server.balancer.TestBalancerRPCDelay
hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/3/artifact/out/Dockerfile
GITHUB PR #5429
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint shellcheck shelldocs
uname Linux 815357d386db 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-3.3.5 / 5251a14
Default Java Private Build-1.8.0_352-8u352-ga-1~18.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/3/testReport/
Max. process+thread count 2146 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-azure hadoop-tools/hadoop-aliyun hadoop-tools/hadoop-azure-datalake . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/3/console
versions git=2.17.1 maven=3.6.0 shellcheck=0.4.6
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 10m 45s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 shelldocs 0m 0s Shelldocs was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ branch-3.3.5 Compile Tests _
+0 🆗 mvndep 4m 54s Maven dependency ordering for branch
+1 💚 mvninstall 35m 15s branch-3.3.5 passed
+1 💚 compile 18m 40s branch-3.3.5 passed
+1 💚 mvnsite 25m 49s branch-3.3.5 passed
+1 💚 javadoc 7m 2s branch-3.3.5 passed
+1 💚 shadedclient 31m 23s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 26m 2s the patch passed
+1 💚 compile 18m 0s the patch passed
+1 💚 javac 18m 0s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 21m 26s the patch passed
+1 💚 shellcheck 0m 0s No new issues.
+1 💚 javadoc 6m 44s the patch passed
+1 💚 shadedclient 31m 31s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 713m 2s /patch-unit-root.txt root in the patch passed.
+1 💚 asflicense 1m 22s The patch does not generate ASF License warnings.
941m 33s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
hadoop.hdfs.tools.TestDFSAdmin
hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/4/artifact/out/Dockerfile
GITHUB PR #5429
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint shellcheck shelldocs
uname Linux 3024f4877558 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-3.3.5 / 5251a14
Default Java Private Build-1.8.0_352-8u352-ga-1~18.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/4/testReport/
Max. process+thread count 2244 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-azure hadoop-tools/hadoop-aliyun hadoop-tools/hadoop-azure-datalake . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5429/4/console
versions git=2.17.1 maven=3.6.0 shellcheck=0.4.6
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

HDFS failures, all of which I consider to be race conditions/timing issues and so not blockers.

TestDataNodeRollingUpgrade.deleteAndEnsureInTrash

java.lang.AssertionError
	at org.junit.Assert.fail(Assert.java:87)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.junit.Assert.assertTrue(Assert.java:53)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.deleteAndEnsureInTrash(TestDataNodeRollingUpgrade.java:141)
	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade.testWithLayoutChangeAndRollback(TestDataNodeRollingUpgrade.java:420)

The failure is in an assert made after triggering some heartbeats:

    triggerHeartBeats();
    assertTrue(trashFile.exists());  // here
    assertFalse(blockFile.exists());

The file was deleted; it just hadn't shown up in the trash when the assert ran.

TestBalancerWithHANameNodes.testBalancerWithObserverWithFailedNode: timeout

org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithObserverWithFailedNode

Failing for the past 1 build (Since #4 )
Took 3 min 0 sec.
Error Message
test timed out after 180000 milliseconds
Stacktrace
org.junit.runners.model.TestTimedOutException: test timed out after 180000 milliseconds

TestDFSAdmin.testAllDatanodesReconfig

race condition; created https://issues.apache.org/jira/browse/HDFS-16934

TestFsDatasetImpl.testReportBadBlocks

Failing for the past 1 build (Since #4 )
Took 7.4 sec.
Error Message
expected:<1> but was:<0>
Stacktrace
java.lang.AssertionError: expected:<1> but was:<0>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)

The assert comes after a 3s sleep waiting for reports to come in, so it is going to be brittle against delays. Creating a JIRA; LambdaTestUtils.eventually() should be used around this assert:

      dataNode.reportBadBlocks(block, dataNode.getFSDataset()
          .getFsVolumeReferences().get(0));
      Thread.sleep(3000);                                           // 3s sleep
      BlockManagerTestUtil.updateState(cluster.getNamesystem()
          .getBlockManager());
      // Verify the bad block has been reported to namenode
      Assert.assertEquals(1, cluster.getNamesystem().getCorruptReplicaBlocks());  // here
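
To illustrate the retry idea without pulling in a Hadoop test dependency, here is a minimal self-contained sketch. The `eventually` helper below is a hypothetical stand-in written for this example, not Hadoop's `LambdaTestUtils.eventually()`, and the `corruptReplicas` counter only simulates a late bad-block report. The probe is re-evaluated until it stops failing or the timeout expires, instead of sleeping for a fixed 3s and asserting once.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class EventuallySketch {

  /**
   * Hypothetical retry helper: call the probe until it returns without
   * throwing, sleeping between attempts; once the timeout has passed,
   * rethrow the last failure.
   */
  public static <T> T eventually(long timeoutMillis, long intervalMillis,
      Callable<T> probe) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (true) {
      try {
        return probe.call();
      } catch (Exception | AssertionError e) {
        if (System.currentTimeMillis() >= deadline) {
          throw e;
        }
        Thread.sleep(intervalMillis);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Simulated state: the count only becomes 1 after a delay.
    AtomicInteger corruptReplicas = new AtomicInteger(0);
    Thread reporter = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      corruptReplicas.set(1);
    });
    reporter.start();

    // Retry the assertion for up to 5s rather than asserting once after a fixed sleep.
    int seen = eventually(5000, 50, () -> {
      int count = corruptReplicas.get();
      if (count != 1) {
        throw new AssertionError("expected:<1> but was:<" + count + ">");
      }
      return count;
    });
    System.out.println("corrupt replicas: " + seen);
    reporter.join();
  }
}
```

With this shape a fast machine passes as soon as the report arrives, and a slow one gets the full timeout before the test is declared failed.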

@steveloughran
Contributor Author

Why is just reverting https://issues.apache.org/jira/browse/HADOOP-18590 not enough?

Different issues. This one is that some downstream builds importing hadoop-cloud-storage were bringing in stuff they didn't need, and reviewing that showed some jar updates hadn't updated the LICENSE-binary with the new version numbers and new transitive dependencies.

All test failures are unrelated; I created a couple of HDFS JIRAs for fixing the obvious ones.

@steveloughran
Contributor Author

@omalley could you look at this if you get a chance? It is just a build and license fixup.

Contributor

@mukund-thakur mukund-thakur left a comment

LGTM +1. ignoring yetus test failures as those are not related

@steveloughran steveloughran merged commit 72f8c2a into apache:branch-3.3.5 Feb 25, 2023
@steveloughran steveloughran deleted the build/HADOOP-18642-excess-dependencies branch February 25, 2023 09:37
@ayushtkn
Member

I think the commit message has the wrong JIRA ID. It mentions HADOOP-18641, but it is actually HADOOP-18642
https://issues.apache.org/jira/browse/HADOOP-18642
HADOOP-18641 is cyclonedx stuff
https://issues.apache.org/jira/browse/HADOOP-18641

steveloughran added a commit to steveloughran/hadoop that referenced this pull request Feb 27, 2023
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
  to hadoop-aliyun
* Tune aliyun jars imports
* Cut duplicate and inconsistent hbase-server declarations from
  hadoop-project
* Update LICENSE-binary for the current set of libraries in the
  hadoop 3.3.5 release.

Contributed by Steve Loughran
steveloughran added a commit to steveloughran/hadoop that referenced this pull request Feb 27, 2023
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
  to hadoop-aliyun
* Tune aliyun jars imports
* Cut duplicate and inconsistent hbase-server declarations from
  hadoop-project
* Update LICENSE-binary for the current set of libraries.

Contributed by Steve Loughran

Change-Id: I0c3ed9ac8ebf4e9842563931bb6339946e215676
steveloughran added a commit that referenced this pull request Feb 28, 2023
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
  to hadoop-aliyun
* Tune aliyun jars imports
* Update LICENSE-binary for the current set of libraries.

Contributed by Steve Loughran
asfgit pushed a commit that referenced this pull request Feb 28, 2023
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
  to hadoop-aliyun
* Tune aliyun jars imports
* Cut duplicate and inconsistent hbase-server declarations from
  hadoop-project
* Update LICENSE-binary for the current set of libraries in the
  hadoop 3.3.5 release.

Contributed by Steve Loughran
@steveloughran
Contributor Author

ooh, no, i've just done that across the board. let me update the jira

ferdelyi pushed a commit to ferdelyi/hadoop that referenced this pull request May 26, 2023
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
  to hadoop-aliyun
* Tune aliyun jars imports
* Update LICENSE-binary for the current set of libraries.

Contributed by Steve Loughran