KAFKA-9857:Failed to build image ducker-ak-openjdk-8 on arm #8489


jiameixie
Contributor

The default OpenJDK base image is openjdk:8. Building the image on ARM fails
with a "no matching manifest for linux/arm64/v8 in the manifest list entries"
error. On ARM, the default OpenJDK image should be set to arm64v8/openjdk:8.

Change-Id: Ib450a36b3977a167743c24476ec1810f4830b66b
Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>


@jiameixie
Contributor Author

@guozhangwang @ijuma @junrao PTAL, thanks

@junrao
Contributor

junrao commented Apr 23, 2020

@cmccabe : Does this look good to you? Thanks.

@cmccabe
Contributor

cmccabe commented Apr 23, 2020

This looks reasonable. What testing have we done so far?

@jiameixie
Contributor Author

@cmccabe I ran it on ARM and it works fine.

@jiameixie
Contributor Author

@cmccabe I got some failures when running "bash tests/docker/run_tests.sh" on ARM. Most of the failing tests pass after some modification, except for some tests related to version compatibility. Here is the summary:

  1. Failed due to timeout. Increasing the timeout makes them pass.
    kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_metadata_upgrade.to_version=2.1.1.from_version=0.10.2.2
    kafkatest.tests.core.replica_scale_test.ReplicaScaleTest.test_clean_bounce.partition_count=34.topic_count=500.replication_factor=3
    kafkatest.tests.core.group_mode_transactions_test.GroupModeTransactionsTest.test_transactions.failure_mode=clean_bounce.bounce_target=clients
    kafkatest.tests.streams.streams_broker_bounce_test.StreamsBrokerBounceTest.test_broker_type_bounce.num_threads=1.sleep_time_secs=120.failure_mode=clean_bounce.broker_type=controller
    kafkatest.tests.streams.streams_broker_bounce_test.StreamsBrokerBounceTest.test_broker_type_bounce.num_threads=1.sleep_time_secs=120.failure_mode=clean_bounce.broker_type=leader
    kafkatest.tests.streams.streams_broker_bounce_test.StreamsBrokerBounceTest.test_broker_type_bounce.num_threads=1.sleep_time_secs=120.failure_mode=clean_shutdown.broker_type=controller
  2. Version compatibility
    kafkatest.tests.core.downgrade_test.TestDowngrade.test_upgrade_and_downgrade.version=2.0.1.security_protocol=SASL_SSL.compression_types=.snappy
    kafkatest.tests.core.upgrade_test.TestUpgrade.test_upgrade.from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy
    kafkatest.tests.core.compatibility_test_new_broker_test.ClientCompatibilityTestNewBroker.test_compatibility.consumer_version=0.9.0.1.producer_version=0.9.0.1.compression_types=.snappy.timestamp_type=LogAppendTime
    kafkatest.tests.core.compatibility_test_new_broker_test.ClientCompatibilityTestNewBroker.test_compatibility.consumer_version=0.9.0.1.producer_version=dev.compression_types=.snappy.timestamp_type=CreateTime
    kafkatest.tests.core.compatibility_test_new_broker_test.ClientCompatibilityTestNewBroker.test_compatibility.consumer_version=2.0.1.producer_version=2.0.1.compression_types=.snappy.timestamp_type=CreateTime
    kafkatest.tests.core.compatibility_test_new_broker_test.ClientCompatibilityTestNewBroker.test_compatibility.consumer_version=dev.producer_version=0.9.0.1.compression_types=.snappy.timestamp_type=None
    kafkatest.tests.core.compatibility_test_new_broker_test.ClientCompatibilityTestNewBroker.test_compatibility.compression_types=.none.timestamp_type=None.producer_version=dev.new_consumer=False.consumer_version=0.9.0.1
    kafkatest.tests.core.upgrade_test.TestUpgrade.test_upgrade.from_kafka_version=0.9.0.1.to_message_format_version=None.compression_types=.snappy
    kafkatest.tests.core.upgrade_test.TestUpgrade.test_upgrade.from_kafka_version=2.0.1.to_message_format_version=None.compression_types=.snappy
  3. Wrong parameter for StreamsSmokeTestJobRunnerService constructor. It has been fixed upstream.
    kafkatest.tests.streams.streams_broker_bounce_test.StreamsBrokerBounceTest.test_all_brokers_bounce.failure_mode=clean_bounce.num_failures=3
    kafkatest.tests.streams.streams_broker_bounce_test.StreamsBrokerBounceTest.test_all_brokers_bounce.failure_mode=hard_bounce.num_failures=3
  4. network_degrade isn't ready in time. I have a PR for it.
    kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.rate_limit_kbit=1000000.device_name=eth0.task_name=rate-1000-latency-50.latency_ms=50
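
To re-run one of the failures listed above in isolation, the dotted test ID can be turned into the `file.py::Class.method` form that ducktape accepts. The following is a minimal sketch: `test_id_to_path` is a hypothetical helper, not part of the Kafka repository, and it assumes that module segments are lowercase and that the class segment starts with an uppercase letter.

```shell
# Hypothetical helper: convert a dotted ducktape test ID into
# "tests/<module path>.py::<Class>.<method>". Parameter suffixes such as
# ".partition_count=34" are dropped.
test_id_to_path() {
    rest="$1"
    path="tests"
    while [ -n "${rest}" ]; do
        seg="${rest%%.*}"
        case "${seg}" in
            [[:upper:]]*)
                # Class segment reached; the next segment, if any, is the method.
                case "${rest}" in
                    *.*)
                        after="${rest#*.}"
                        echo "${path}.py::${seg}.${after%%.*}"
                        ;;
                    *)
                        echo "${path}.py::${seg}"
                        ;;
                esac
                return 0
                ;;
        esac
        # Still inside the module path: append the segment as a directory or file name.
        path="${path}/${seg}"
        case "${rest}" in
            *.*) rest="${rest#*.}" ;;
            *)   rest="" ;;
        esac
    done
    return 1
}

# Possible usage, assuming the TC_PATHS variable described in Kafka's tests README:
#   TC_PATHS="$(test_id_to_path kafkatest.tests.core.replica_scale_test.ReplicaScaleTest.test_clean_bounce)" \
#       bash tests/docker/run_tests.sh
```

This only selects the test method; it does not reproduce the specific parameter combination from the failing ID.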

@jiameixie
Contributor Author

@cmccabe @junrao There were Jenkins errors. Does my PR need any further changes? Thanks.

@OneCricketeer

Since Java 8 is EOL, why not upgrade to openjdk 11?

The default OpenJDK base image is openjdk:8. Building the image on ARM fails
with a "no matching manifest for linux/arm64/v8 in the manifest list entries"
error. On ARM, the default OpenJDK image should be set to arm64v8/openjdk:8.
Java 8 is end of life, so upgrade to OpenJDK 11.

Change-Id: Ib450a36b3977a167743c24476ec1810f4830b66b
Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
@jiameixie
Contributor Author

@OneCricketeer Yes, you're right. I updated the PR and upgraded OpenJDK 8 to OpenJDK 11.

@@ -42,7 +42,14 @@ docker_run_memory_limit="2000m"
default_num_nodes=14

# The default OpenJDK base image.
default_jdk="openjdk:8"

So, you could've kept this

But 11

default_jdk="openjdk:8"
case "$(uname -m)" in
aarch64)
default_jdk="arm64v8/openjdk:11"

Then do default_jdk=arm64v8/${default_jdk}

default_jdk="arm64v8/openjdk:11"
;;
*)
default_jdk="openjdk:11"

Then the wild card wouldn't be necessary
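
The review suggestion above (keep one default and only add the `arm64v8/` prefix on aarch64) could look roughly like the sketch below. `pick_default_jdk` is a hypothetical wrapper used here so the logic can be exercised with an explicit architecture string; the script in this PR assigns `default_jdk` directly.

```shell
# Hypothetical wrapper around the reviewer's suggestion: start from a single
# default image and prefix it on aarch64, so no wildcard branch is needed.
pick_default_jdk() {
    jdk="openjdk:11"
    case "$1" in
        aarch64) jdk="arm64v8/${jdk}" ;;
    esac
    echo "${jdk}"
}

# Select the base image for the machine running the script.
default_jdk="$(pick_default_jdk "$(uname -m)")"
```

Keeping the image tag in one place means a later version bump touches a single line.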

The default OpenJDK base image is openjdk:8. Building the image on ARM fails
with a "no matching manifest for linux/arm64/v8 in the manifest list entries"
error. On ARM, the default OpenJDK image should be set to arm64v8/openjdk:8.
Java 8 is end of life, so upgrade to OpenJDK 11. openjdk:11 is a multi-arch image.

Change-Id: Ib450a36b3977a167743c24476ec1810f4830b66b

Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
@jiameixie
Contributor Author

@OneCricketeer I tested it just now. openjdk:11 is a multi-arch image, so x86 and ARM use the same image name.

@jiameixie
Contributor Author

@OneCricketeer openjdk:11 is multi-arch and openjdk:8 is not. Running "docker run -it openjdk:11 bash" works on both ARM and x86, while running "docker run -it openjdk:8 bash" produces an error on ARM.
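
One way to check the multi-arch claim without pulling the image is to look at the manifest list. The sketch below uses hypothetical helper names (`docker_arch`, `manifest_has_arch`) and does a simple text match on the JSON instead of parsing it. It assumes the manifest output contains `"architecture": "<name>"` entries, as `docker manifest inspect` prints for manifest lists.

```shell
# Map `uname -m` output to the architecture names used in image manifests.
# Only the two architectures discussed in this thread are handled explicitly.
docker_arch() {
    case "$1" in
        aarch64|arm64) echo "arm64" ;;
        x86_64|amd64)  echo "amd64" ;;
        *)             echo "$1" ;;
    esac
}

# Read manifest JSON on stdin; succeed if it lists the given architecture.
manifest_has_arch() {
    grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Possible usage (needs a Docker CLI with the `manifest` subcommand and
# network access to the registry):
#   docker manifest inspect openjdk:11 \
#       | manifest_has_arch "$(docker_arch "$(uname -m)")" \
#       && echo "image is available for this architecture"
```

A non-zero exit status from the pipeline corresponds to the "no matching manifest" situation described at the top of this PR.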

@jiameixie
Contributor Author

@OneCricketeer So if we update to OpenJDK 11, there is no need to set default_jdk according to the machine architecture. Thanks for the reminder.

@jiameixie
Contributor Author

@cmccabe I updated it to openjdk:11. Is that ok?

@jiameixie
Contributor Author

@cmccabe What tests should be done? Thanks

@JunHe77

JunHe77 commented Sep 18, 2020

@junrao @cmccabe Ping... is there any way we can help you merge this? Thanks.

@chia7712
Contributor

There is an existing bug after updating the base image from openjdk:8 to openjdk:11 (see #9324). Feel free to merge the fix into your PR :)

…openjdk:11. Merge remote-tracking branch 'chia/MINOR-9324' into Failed-to-build-image-ducker-ak-openjdk-8-on-arm
@jiameixie
Contributor Author

@chia7712 Thanks. I have merged it.

@lizthegrey

Honeycomb doesn't directly have an interest in this as we use Confluent's packages, but we do run Confluent's Kafka distro on ARM64 (successfully).

@JunHe77

JunHe77 commented Oct 28, 2020

Really appreciate the feedback, @lizthegrey. Glad to know there are actual use cases for running and testing Kafka on Arm64. It seems that merging this will help users and developers from a community perspective.

@martin-g
Member

I've tested this PR on my ARM64 machine and on TravisCI. In both environments it fails with:

...
Collecting pynacl>=1.0.1
  Downloading PyNaCl-1.4.0.tar.gz (3.4 MB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3 /usr/local/lib/python3.7/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-wmhbzpt1/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=40.8.0' wheel 'cffi>=1.4.1; python_implementation != '"'"'PyPy'"'"''
       cwd: None
  Complete output (14 lines):
  Traceback (most recent call last):
    File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
      "__main__", mod_spec)
    File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
      exec(code, run_globals)
    File "/usr/local/lib/python3.7/dist-packages/pip/__main__.py", line 23, in <module>
      from pip._internal.cli.main import main as _main  # isort:skip # noqa
    File "/usr/local/lib/python3.7/dist-packages/pip/_internal/cli/main.py", line 5, in <module>
      import locale
    File "/usr/lib/python3.7/locale.py", line 16, in <module>
      import re
    File "/usr/lib/python3.7/re.py", line 143, in <module>
      class RegexFlag(enum.IntFlag):
  AttributeError: module 'enum' has no attribute 'IntFlag'
  ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 /usr/local/lib/python3.7/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-wmhbzpt1/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=40.8.0' wheel 'cffi>=1.4.1; python_implementation != '"'"'PyPy'"'"'' Check the logs for full command output.
WARNING: You are using pip version 20.2.2; however, version 20.3.3 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
The command '/bin/sh -c pip3 install --upgrade cffi virtualenv pyasn1 boto3 pycrypto pywinrm ipaddress enum34 && pip3 install --upgrade ducktape==0.8.0' returned a non-zero code: 1
docker failed
ducker-ak up failed

@martin-g
Member

I've created #9794. It applies the changes from this PR and also fixes the issue in the comment above.
It also adds an extra Travis job to run the build and the tests on ARM64 on an AWS Graviton2 node.

@jiameixie jiameixie closed this Dec 23, 2021