BIGTOP-3706. Bump Hadoop to 3.3.3. #916

Merged
merged 2 commits into from Jun 22, 2022

iwasakims
Member

https://issues.apache.org/jira/browse/BIGTOP-3706

@iwasakims
Member Author

I ran smoke tests of Hadoop.

$ ./docker-hadoop.sh \
     --create 3 \
     --image bigtop/puppet:trunk-rockylinux-8 \
     --memory 8g \
     --repo file:///bigtop-home/output \
     --disable-gpg-check \
     --stack hdfs,yarn,mapreduce \
     --smoke-tests hdfs,yarn,mapreduce

The distcp and mapreduce tests failed. I am looking into the cause. If it takes time, I would like to address the test failures in follow-up JIRAs.

@iwasakims
Member Author

Previously, YARN applications submitted by the root user went to the "root.root" queue by default. Now "root" seems to be used as the default queue name, which is invalid because it is not a leaf queue.

[root@04603225dae7 /]# yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 1 1000
...
java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1655533034162_0001 to YARN : root is not a leaf queue
	at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:346)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:251)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1571)

Explicitly specifying the queue name fixes this. Users other than root should not be affected.

# yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi -Dmapred.job.queue.name=root.root 1 1000
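
For scripted runs, the workaround above could be wrapped so that the queue is always passed explicitly. This is a minimal sketch and not part of this patch: the helper names `queue_for_user` and `pi_command` are hypothetical, and the `root.<user>` mapping only mirrors the "root.root" behavior described above. Whether such a queue exists depends on the scheduler configuration. The script prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Bigtop or Hadoop.

# Map a user name to a per-user leaf queue under root (assumption:
# mirrors the "root.root" queue that the root user got previously).
queue_for_user() {
  printf 'root.%s\n' "$1"
}

# Print (do not run) the pi example command with an explicit queue.
pi_command() {
  printf 'yarn jar %s pi -Dmapred.job.queue.name=%s 1 1000\n' \
    /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    "$(queue_for_user "$1")"
}

pi_command root
```

For the root user this prints the same command line as the one shown above.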

@iwasakims
Member Author

Smoke-tests of hdfs,yarn,mapreduce passed with the last commit.

We used FairScheduler to address a failure in the Oozie smoke tests (BIGTOP-3406). I will address that later by tuning the CapacityScheduler configuration. I prefer CapacityScheduler for tests since it is the default and is more actively maintained than FairScheduler.
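
Which scheduler a node is configured with can be read from the `yarn.resourcemanager.scheduler.class` property in `yarn-site.xml`. The following is a rough sketch, assuming the common layout where `<value>` sits on the same line as `<name>` or on the line after it. The function name `scheduler_class` is hypothetical, and the config path in the usage comment is an assumption about the deployment.

```shell
#!/bin/sh
# Hypothetical helper: print the configured scheduler class from a
# yarn-site.xml-style file. Prints nothing if the property is not set,
# in which case the YARN default applies.
scheduler_class() {
  # Take the matching <name> line plus the following line, then
  # extract whatever is between <value> and </value>.
  grep -A1 '<name>yarn.resourcemanager.scheduler.class</name>' "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Usage (the path is an assumption, adjust to the actual config dir):
#   scheduler_class /etc/hadoop/conf/yarn-site.xml
```

A line-based filter like this is fragile against unusual XML formatting, so it is only meant as a quick check during testing.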

@iwasakims
Member Author

Building Hadoop fails if ZooKeeper jars built by Bigtop exist in the local Maven repository (~/.m2/repository).

$ ./gradlew zookeeper-pkg hadoop-pkg
...
[ERROR] Found artifact with unexpected contents: '/home/rocky/srcs/bigtop/build/hadoop/rpm/BUILD/hadoop-3.3.3-src/hadoop-client-modules/hadoop-client-minicluster/target/hadoop-client-minicluster-3.3.3.jar'
    Please check the following and either correct the build or update
    the allowed list with reasoning.

    edu/
    edu/umd/
    edu/umd/cs/
    edu/umd/cs/findbugs/
    edu/umd/cs/findbugs/annotations/
    edu/umd/cs/findbugs/annotations/NonNull.class
    edu/umd/cs/findbugs/annotations/CheckForNull.class
    edu/umd/cs/findbugs/annotations/DefaultAnnotationForFields.class
    ...

Running rm -rf ~/.m2/repository/org/apache/zookeeper before building Hadoop fixes this, so the ZooKeeper jar built by Bigtop appears to contain unexpected classes that do not appear in the jars pulled from public Maven repositories.

The patch of #890 fixes this issue.
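
To check whether a locally built jar carries the classes flagged in the error above, its entry listing can be filtered for the `edu/umd/cs/findbugs/` prefix. This is only a sketch: the function name `findbugs_entries` is hypothetical, and the jar path in the usage comment is a placeholder, not a path taken from this build.

```shell
#!/bin/sh
# Hypothetical helper: read jar entry names on stdin (one per line)
# and print only those under the findbugs package prefix.
# "|| true" keeps the exit status zero when nothing matches.
findbugs_entries() {
  grep '^edu/umd/cs/findbugs/' || true
}

# Example with a hand-written listing instead of a real jar:
printf '%s\n' \
  'org/apache/zookeeper/ZooKeeper.class' \
  'edu/umd/cs/findbugs/annotations/NonNull.class' | findbugs_entries

# With a real jar (placeholder path), the listing could come from `jar tf`:
#   jar tf path/to/zookeeper.jar | findbugs_entries
```

An empty result for a given jar would suggest that it does not bundle these classes.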

@guyuqi
Member

guyuqi commented Jun 21, 2022

Thanks for working on this. LGTM.

Built Arm64 debian-10 Docker images (protobuf-3.7.1) based on PR #915.
Verified the Hadoop 3.3.3 build inside the Docker image built above.

I also ran the smoke tests, with the following change to the provisioner config:

index ad77d34a..d7bcdf8e 100644
--- a/provisioner/docker/config_debian-10.yaml
+++ b/provisioner/docker/config_debian-10.yaml
@@ -20,5 +20,5 @@ docker:
 repo: "http://repos.bigtop.apache.org/releases/3.0.0/debian/10/$(ARCH)"
 distro: debian
 components: [hdfs, yarn, mapreduce]
-enable_local_repo: false
+enable_local_repo: true
 smoke_test_components: [hdfs, yarn, mapreduce]

Smoke tests:

./docker-hadoop.sh -C config_debian-10.yaml -c 3 -s -d
Environment check...
Check docker:
Docker version 20.10.15, build fd82621
Check docker-compose:
Docker Compose version v2.6.0
Check ruby:
ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [aarch64-linux-gnu]
No cluster exists!
[+] Running 4/4
 ⠿ Network 20220621_163341_r8800_default     Created                                                                                    0.1s
 ⠿ Container 20220621_163341_r8800-bigtop-3  Started                                                                                    2.2s
 ⠿ Container 20220621_163341_r8800-bigtop-1  Started                                                                                    2.2s
 ⠿ Container 20220621_163341_r8800-bigtop-2  Started                                                                                    2.2s
......
...
.
Successfully started process 'Gradle Test Executor 4'

org.apache.bigtop.itest.hadoop.mapreduce.TestHadoopExamples STANDARD_ERROR
    22/06/21 08:50:01 INFO lang.Object: Failed to unpack jar resources.  Attemting to use bigtop sources
    22/06/21 08:50:01 INFO lang.Object: MAKING DIRECTORIES ..................... examples examples-output

Gradle Test Executor 4 finished executing tests.

> Task :bigtop-tests:smoke-tests:mapreduce:test
Finished generating test XML results (0.002 secs) into: /bigtop-home/bigtop-tests/smoke-tests/mapreduce/build/test-results/test
Generating HTML test report...
Finished generating test html results (0.017 secs) into: /bigtop-home/bigtop-tests/smoke-tests/mapreduce/build/reports/tests/test
Now testing...
:bigtop-tests:smoke-tests:mapreduce:test (Thread[Execution worker for ':' Thread 58,5,main]) completed. Took 6 mins 2.258 secs.

BUILD SUCCESSFUL in 19m 29s
41 actionable tasks: 14 executed, 27 up-to-date
Stopped 1 worker daemon(s).
+ rm -rf buildSrc/build/test-results/binary
+ rm -rf /bigtop-home/.gradle

Let me run the tests on Fedora-35/Rockylinux-8 (RPM packaging and smoke tests) before merging this.

@guyuqi
Member

guyuqi commented Jun 22, 2022

+1.
Verified the package builds and smoke tests on Fedora-35/Rockylinux-8.

@guyuqi guyuqi merged commit d120a92 into apache:master Jun 22, 2022
@guyuqi
Member

guyuqi commented Jun 22, 2022

Thanks, @iwasakims

sekikn pushed a commit to sekikn/bigtop that referenced this pull request Jul 4, 2022
* BIGTOP-3706. Bump Hadoop to 3.3.3.

* using CapacityScheduler to make smoke-tests work run by root user with the default configuration.