[bitnami/kafka] Could not find or load main class kafka.tools.StorageTool #25954

Open
eht16 opened this issue May 16, 2024 · 6 comments
Labels: kafka, stale (15 days without activity), tech-issues (The user has a technical issue about an application), triage (Triage is needed)

eht16 commented May 16, 2024

Name and Version

bitnami/kafka 28.2.4

What architecture are you using?

amd64

What steps will reproduce the bug?

Install the Bitnami Kafka chart with:

helm upgrade --install --namespace kafka-test kafka-test oci://registry-1.docker.io/bitnamicharts/kafka --values custom-values.yaml

The contents of custom-values.yaml are listed below.

Are you using any custom parameters or values?

kubeVersion: "1.25"
image:
  debug: true

controller:
  replicaCount: 1
  controllerOnly: true
  persistence:
    enabled: false

broker:
  replicaCount: 0

volumePermissions:
  enabled: true

kraft:
  enabled: true
  clusterId: "(redacted)"

zookeeper:
  enabled: false

listeners:
  client:
    protocol: "PLAINTEXT"
  controller:
    protocol: "PLAINTEXT"
  interbroker:
    protocol: "PLAINTEXT"
  external:
    protocol: "PLAINTEXT"

What is the expected behavior?

Kafka service is starting up

What do you see instead?

The init container seems to run fine even though it has no output.
The "kafka" container crashes shortly after startup with the following output:

kafka kafka 14:51:49.49 INFO  ==>
kafka kafka 14:51:49.49 INFO  ==> Welcome to the Bitnami kafka container
kafka kafka 14:51:49.49 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
kafka kafka 14:51:49.49 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
kafka kafka 14:51:49.49 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CV
kafka kafka 14:51:49.49 INFO  ==>
kafka kafka 14:51:49.49 INFO  ==> ** Starting Kafka setup **
kafka kafka 14:51:49.57 INFO  ==> Initializing KRaft storage metadata
kafka kafka 14:51:49.57 INFO  ==> Formatting storage directories to add metadata...
kafka Error: Could not find or load main class kafka.tools.StorageTool
kafka Caused by: java.lang.ClassNotFoundException: kafka.tools.StorageTool

The StatefulSet then restarts the container, which keeps failing until the pod reaches the CrashLoopBackOff state.

Additional information

I wonder what causes this crash.
Since few other users seem to be affected, it is more likely a configuration problem than a bug.
Still, I could reproduce it in two different Kubernetes clusters.

The above provided custom values are a stripped down version of my actual setup which originates from the Sentry Helm chart (sentry-kubernetes/charts#1241) but both show the same error.

What I've tried so far:

  • setting kubeVersion and leaving it empty
  • toggling controller.controllerOnly
  • toggling volumePermissions.enabled
  • changing kraft.clusterId
  • using a PVC via controller.persistence.enabled: true and controller.persistence.existingClaim

Kubernetes version: 1.25

eht16 added the tech-issues label May 16, 2024
github-actions bot added the triage label May 16, 2024

eht16 commented May 17, 2024

It seems it is not related to the custom values but rather to the K8S cluster.

Out of curiosity, I tested the above configuration with a local "k3d" cluster (K8S 1.22), and Kafka starts up properly there.

Back in my production cluster, I opened a shell in the Kafka pod (after changing the container command to "bash -c sleep 3600") and saw:

I have no name!@kafka-test-controller-0:/$ /opt/bitnami/scripts/kafka/setup.sh
kafka 06:40:15.41 INFO  ==> Initializing KRaft storage metadata
kafka 06:40:15.41 INFO  ==> Formatting storage directories to add metadata...
Error: Could not find or load main class kafka.tools.StorageTool
Caused by: java.lang.ClassNotFoundException: kafka.tools.StorageTool

I have no name!@kafka-test-controller-0:/$ /opt/bitnami/scripts/kafka/run.sh
kafka 06:40:21.10 INFO  ==> ** Starting Kafka **
Error: Could not find or load main class kafka.Kafka
Caused by: java.lang.ClassNotFoundException: kafka.Kafka

Running the scripts this way is surely not the proper way but I think it's good enough to see that there might be a problem with loading the Java classes.

Some more information:

I have no name!@kafka-test-controller-0:/$ uname -a
Linux kafka-test-controller-0 4.18.0-147.5.1.6.h766.eulerosv2r9.x86_64 #1 SMP Sat May 28 09:00:28 UTC 2022 x86_64 GNU/Linux

I have no name!@kafka-test-controller-0:/$ java -version
openjdk version "17.0.11" 2024-04-16 LTS
OpenJDK Runtime Environment (build 17.0.11+10-LTS)
OpenJDK 64-Bit Server VM (build 17.0.11+10-LTS, mixed mode, sharing)

Kubernetes version: 1.25.6
Container Runtime: containerd 1.6.14
Node OS: EulerOS 2.0 (seems like a CentOS based distro)

What could cause this?
I don't see how a ClassNotFoundException in Java relates to the underlying K8S cluster. I would have guessed that everything related to the Java runtime is in the Docker image. And after all, both clusters run on the amd64 platform.

eht16 commented May 17, 2024

I think I got it:

controller.podSecurityContext.seccompProfile.type is the key to success.
When I set "type" to Unconfined, so that no Seccomp profile is applied, it works and Kafka starts up.
This is also relevant for broker.podSecurityContext, provisioning.podSecurityContext and maybe more.

So, the Seccomp profile "RuntimeDefault" seems to be different between the Kubernetes clusters I have tested.

To verify, I used "amicontained" (https://github.com/genuinetools/amicontained) in my production cluster and in the local "k3d" test cluster.
I didn't find a better way to compare the Seccomp profiles, no idea if they can be read via the Kubernetes API.
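
On the Kubernetes API question: the pod spec records which profile type was requested (securityContext.seccompProfile.type, at pod or container level), but as far as is documented it does not expose the syscall list behind RuntimeDefault, which is left to the container runtime. The following is a minimal sketch, not part of the chart, that reads the requested type per container from a pod manifest in JSON form. The function name and the sample manifest (including its init container override) are hypothetical and only illustrate the lookup order: a container-level setting takes precedence over the pod-level one.

```python
import json


def effective_seccomp_profiles(pod: dict) -> dict[str, str]:
    """Map each container name to the seccomp profile type requested for it."""
    spec = pod.get("spec", {})
    # Pod-level setting, used when a container does not set its own profile.
    pod_profile = (spec.get("securityContext") or {}).get("seccompProfile")
    result = {}
    for key in ("initContainers", "containers"):
        for container in spec.get(key) or []:
            ctx = container.get("securityContext") or {}
            profile = ctx.get("seccompProfile") or pod_profile
            result[container["name"]] = profile["type"] if profile else "(not set)"
    return result


# Hypothetical, heavily trimmed manifest for illustration only.
SAMPLE_POD = json.loads("""
{
  "spec": {
    "securityContext": {"seccompProfile": {"type": "RuntimeDefault"}},
    "initContainers": [
      {"name": "volume-permissions",
       "securityContext": {"seccompProfile": {"type": "Unconfined"}}}
    ],
    "containers": [
      {"name": "kafka", "securityContext": {"runAsNonRoot": true}}
    ]
  }
}
""")

profiles = effective_seccomp_profiles(SAMPLE_POD)
for name, profile_type in profiles.items():
    print(f"{name}: {profile_type}")
```

In a real cluster the input could come from kubectl get pod kafka-test-controller-0 --namespace kafka-test -o json. This only shows what was requested; comparing the actual filters still needs something like "amicontained" inside the container.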

The result is pretty similar but the default profile in my production cluster had six more blocked syscalls:

  • NAME_TO_HANDLE_AT
  • PKEY_ALLOC
  • PKEY_FREE
  • PKEY_MPROTECT
  • PROCESS_VM_READV
  • PROCESS_VM_WRITEV

My guess is that the Seccomp profile is defined by the container runtime, which is "containerd" in my production cluster and "Docker" in the "k3d" test cluster.

I think I will continue with controller.podSecurityContext.seccompProfile.type: Unconfined, knowing that this gives up the protection that the Seccomp filter provides.

For reference the "amicontained" output from my production cluster:

Container Runtime: kube
Has Namespaces:
	pid: true
	user: false
AppArmor Profile: unconfined
Capabilities:
Seccomp: filtering
Blocked Syscalls (70):
	MSGRCV SYSLOG SETUID SETGID SETSID SETREUID SETREGID SETGROUPS SETRESUID SETRESGID USELIB USTAT SYSFS VHANGUP PIVOT_ROOT _SYSCTL CHROOT ACCT SETTIMEOFDAY MOUNT UMOUNT2 SWAPON SWAPOFF REBOOT SETHOSTNAME SETDOMAINNAME IOPL IOPERM CREATE_MODULE INIT_MODULE DELETE_MODULE GET_KERNEL_SYMS QUERY_MODULE QUOTACTL NFSSERVCTL GETPMSG PUTPMSG AFS_SYSCALL TUXCALL SECURITY LOOKUP_DCOOKIE CLOCK_SETTIME VSERVER MBIND SET_MEMPOLICY GET_MEMPOLICY KEXEC_LOAD ADD_KEY REQUEST_KEY KEYCTL MIGRATE_PAGES FUTIMESAT UNSHARE MOVE_PAGES UTIMENSAT PERF_EVENT_OPEN FANOTIFY_INIT NAME_TO_HANDLE_AT OPEN_BY_HANDLE_AT SETNS PROCESS_VM_READV PROCESS_VM_WRITEV KCMP FINIT_MODULE KEXEC_FILE_LOAD BPF USERFAULTFD PKEY_MPROTECT PKEY_ALLOC PKEY_FREE
Looking for Docker.sock

And the output from the "k3d" cluster:

Container Runtime: not-found
Has Namespaces:
	pid: true
	user: false
AppArmor Profile: unconfined
Capabilities:
Seccomp: filtering
Blocked Syscalls (64):
	MSGRCV SYSLOG SETUID SETGID SETSID SETREUID SETREGID SETGROUPS SETRESUID SETRESGID USELIB USTAT SYSFS VHANGUP PIVOT_ROOT _SYSCTL CHROOT ACCT SETTIMEOFDAY MOUNT UMOUNT2 SWAPON SWAPOFF REBOOT SETHOSTNAME SETDOMAINNAME IOPL IOPERM CREATE_MODULE INIT_MODULE DELETE_MODULE GET_KERNEL_SYMS QUERY_MODULE QUOTACTL NFSSERVCTL GETPMSG PUTPMSG AFS_SYSCALL TUXCALL SECURITY LOOKUP_DCOOKIE CLOCK_SETTIME VSERVER MBIND SET_MEMPOLICY GET_MEMPOLICY KEXEC_LOAD ADD_KEY REQUEST_KEY KEYCTL MIGRATE_PAGES FUTIMESAT UNSHARE MOVE_PAGES UTIMENSAT PERF_EVENT_OPEN FANOTIFY_INIT OPEN_BY_HANDLE_AT SETNS KCMP FINIT_MODULE KEXEC_FILE_LOAD BPF USERFAULTFD
Looking for Docker.sock
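
The comparison of the two "amicontained" reports can be reproduced mechanically. The sketch below copies the two blocked-syscall lists from the outputs above into sets and prints the names that appear in only one of them. The helper name parse_names is made up for this example; nothing here is part of "amicontained" itself.

```python
def parse_names(text: str) -> set[str]:
    """Split a whitespace-separated list of syscall names into a set."""
    return set(text.split())


# Blocked syscalls reported in the production cluster (70 entries).
production = parse_names("""
MSGRCV SYSLOG SETUID SETGID SETSID SETREUID SETREGID SETGROUPS SETRESUID
SETRESGID USELIB USTAT SYSFS VHANGUP PIVOT_ROOT _SYSCTL CHROOT ACCT
SETTIMEOFDAY MOUNT UMOUNT2 SWAPON SWAPOFF REBOOT SETHOSTNAME SETDOMAINNAME
IOPL IOPERM CREATE_MODULE INIT_MODULE DELETE_MODULE GET_KERNEL_SYMS
QUERY_MODULE QUOTACTL NFSSERVCTL GETPMSG PUTPMSG AFS_SYSCALL TUXCALL SECURITY
LOOKUP_DCOOKIE CLOCK_SETTIME VSERVER MBIND SET_MEMPOLICY GET_MEMPOLICY
KEXEC_LOAD ADD_KEY REQUEST_KEY KEYCTL MIGRATE_PAGES FUTIMESAT UNSHARE
MOVE_PAGES UTIMENSAT PERF_EVENT_OPEN FANOTIFY_INIT NAME_TO_HANDLE_AT
OPEN_BY_HANDLE_AT SETNS PROCESS_VM_READV PROCESS_VM_WRITEV KCMP FINIT_MODULE
KEXEC_FILE_LOAD BPF USERFAULTFD PKEY_MPROTECT PKEY_ALLOC PKEY_FREE
""")

# Blocked syscalls reported in the local "k3d" cluster (64 entries).
k3d = parse_names("""
MSGRCV SYSLOG SETUID SETGID SETSID SETREUID SETREGID SETGROUPS SETRESUID
SETRESGID USELIB USTAT SYSFS VHANGUP PIVOT_ROOT _SYSCTL CHROOT ACCT
SETTIMEOFDAY MOUNT UMOUNT2 SWAPON SWAPOFF REBOOT SETHOSTNAME SETDOMAINNAME
IOPL IOPERM CREATE_MODULE INIT_MODULE DELETE_MODULE GET_KERNEL_SYMS
QUERY_MODULE QUOTACTL NFSSERVCTL GETPMSG PUTPMSG AFS_SYSCALL TUXCALL SECURITY
LOOKUP_DCOOKIE CLOCK_SETTIME VSERVER MBIND SET_MEMPOLICY GET_MEMPOLICY
KEXEC_LOAD ADD_KEY REQUEST_KEY KEYCTL MIGRATE_PAGES FUTIMESAT UNSHARE
MOVE_PAGES UTIMENSAT PERF_EVENT_OPEN FANOTIFY_INIT OPEN_BY_HANDLE_AT SETNS
KCMP FINIT_MODULE KEXEC_FILE_LOAD BPF USERFAULTFD
""")

only_in_production = sorted(production - k3d)
only_in_k3d = sorted(k3d - production)

print("blocked only in production:", only_in_production)
print("blocked only in k3d:", only_in_k3d)
```

The difference is the six syscalls listed earlier in this comment, and nothing is blocked in "k3d" that is allowed in production. The diff alone does not say which of the six the JVM actually needs.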

@eht16 eht16 closed this as completed May 17, 2024

eht16 commented May 17, 2024

Maybe it's worth mentioning this in the README of this chart?
I'm not yet sure whether the extra blocked syscalls in my cluster really originate from the Seccomp profile of "containerd" or whether it might be a modification of the cloud provider.

Anyway, in case other users are affected similarly it might be worth a note in the README and/or in the values.yaml.

eht16 reopened this May 17, 2024
github-actions bot removed the solved label May 17, 2024
javsalgar changed the title from "Kafka: Could not find or load main class kafka.tools.StorageTool" to "[bitnami/kafka] Could not find or load main class kafka.tools.StorageTool" May 20, 2024
@javsalgar
Contributor

Hi!

Thank you so much for letting us know! I will forward it to the documentation team, but as you discovered the issue, would you like to add a small note in the configuration section of the chart?

eht16 commented May 20, 2024

> Thank you so much for letting us know! I will forward it to the documentation team

Thank you.

> but as you discovered the issue, would you like to add a small note in the configuration section of the chart?

I'd rather not. I'm not yet sure which seccomp profile is used, or where and how it is set; I need to figure this out first.

github-actions bot commented Jun 5, 2024

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions bot added the stale label Jun 5, 2024

3 participants