Normalization: arm64 image bad exec format #15498

Closed
yagehu opened this issue Aug 10, 2022 · 23 comments

@yagehu

yagehu commented Aug 10, 2022

Environment

  • Airbyte version: 0.39.42-alpha
  • OS Version / Instance: AWS EKS-managed node group with AL2_ARM_64 image.
  • Deployment: Deployed EKS cluster with ARM64 node group
  • Source Connector and version: Official GitHub source connector
  • Destination Connector and version: Official alpha PostgreSQL destination connector
  • Step where error happened: Sync with basic normalization

Current Behavior

  • Run sync with normalization turned off -> OK
  • Run sync with normalization turned on -> Error

If I check the logs for the normalization pod, the error message points to an arch mismatch:

$ kubectl -n airbyte logs -f normalization-normalize-6-0-etiuw main
standard_init_linux.go:228: exec user process caused: exec format error

My EKS cluster worker nodes are all ARM64, and the cluster has been able to pull the correct-architecture image for everything else. The problem occurs only with the airbyte/normalization image.
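
A rough way to check for this kind of mismatch is to compare the host's architecture with the architecture recorded in the image metadata. The sketch below is only an illustration: `to_docker_arch` is a hypothetical helper name, and the `docker image inspect` call is left as a comment because it needs Docker and a pulled image. Matching metadata does not by itself prove that the binaries inside the image were built for that architecture.

```shell
#!/bin/sh
# Hypothetical helper: map `uname -m` output to Docker's architecture names.
to_docker_arch() {
  case "$1" in
    aarch64|arm64) echo "arm64" ;;
    x86_64|amd64)  echo "amd64" ;;
    *)             echo "$1" ;;   # pass anything else through unchanged
  esac
}

host_arch=$(to_docker_arch "$(uname -m)")
echo "host architecture: $host_arch"

# Compare with what the image claims (requires docker and a pulled image):
#   docker image inspect --format '{{.Architecture}}' airbyte/normalization
```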

Logs

java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Running the launcher normalization-orchestrator failed
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:320) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.LauncherWorker.run(LauncherWorker.java:90) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:155) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: io.airbyte.workers.exception.WorkerException: Running the launcher normalization-orchestrator failed
	at io.airbyte.workers.temporal.sync.LauncherWorker.lambda$run$3(LauncherWorker.java:181) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:315) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	... 3 more
Caused by: io.airbyte.workers.exception.WorkerException: Non-zero exit code!
	at io.airbyte.workers.temporal.sync.LauncherWorker.lambda$run$3(LauncherWorker.java:165) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:315) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	... 3 more
2022-08-09 23:36:43 INFO i.a.w.t.TemporalAttemptExecution(get):131 - Stopping cancellation check scheduling...
2022-08-09 23:36:43 INFO i.a.w.t.TemporalUtils(withBackgroundHeartbeat):291 - Stopping temporal heartbeating...
2022-08-09 23:36:43 WARN i.t.i.a.POJOActivityTaskHandler(activityFailureToResult):307 - Activity failure. ActivityId=5a42ea10-999f-38aa-9687-292c9cd3c436, activityType=Normalize, attempt=1
java.lang.RuntimeException: io.temporal.serviceclient.CheckedExceptionWrapper: java.util.concurrent.ExecutionException: java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Running the launcher normalization-orchestrator failed
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:289) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.NormalizationActivityImpl.normalize(NormalizationActivityImpl.java:75) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
	at io.temporal.internal.activity.POJOActivityTaskHandler$POJOActivityInboundCallsInterceptor.execute(POJOActivityTaskHandler.java:214) ~[temporal-sdk-1.8.1.jar:?]
	at io.temporal.internal.activity.POJOActivityTaskHandler$POJOActivityImplementation.execute(POJOActivityTaskHandler.java:180) ~[temporal-sdk-1.8.1.jar:?]
	at io.temporal.internal.activity.POJOActivityTaskHandler.handle(POJOActivityTaskHandler.java:120) ~[temporal-sdk-1.8.1.jar:?]
	at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:204) ~[temporal-sdk-1.8.1.jar:?]
	at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:164) ~[temporal-sdk-1.8.1.jar:?]
	at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:93) ~[temporal-sdk-1.8.1.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: io.temporal.serviceclient.CheckedExceptionWrapper: java.util.concurrent.ExecutionException: java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Running the launcher normalization-orchestrator failed
	at io.temporal.serviceclient.CheckedExceptionWrapper.wrap(CheckedExceptionWrapper.java:56) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.temporal.internal.sync.WorkflowInternal.wrap(WorkflowInternal.java:448) ~[temporal-sdk-1.8.1.jar:?]
	at io.temporal.activity.Activity.wrap(Activity.java:51) ~[temporal-sdk-1.8.1.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.get(TemporalAttemptExecution.java:135) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.NormalizationActivityImpl.lambda$normalize$3(NormalizationActivityImpl.java:103) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:284) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	... 14 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Running the launcher normalization-orchestrator failed
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) ~[?:?]
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) ~[?:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.get(TemporalAttemptExecution.java:129) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.NormalizationActivityImpl.lambda$normalize$3(NormalizationActivityImpl.java:103) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:284) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	... 14 more
Caused by: java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Running the launcher normalization-orchestrator failed
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:320) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.LauncherWorker.run(LauncherWorker.java:90) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:155) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	... 1 more
Caused by: io.airbyte.workers.exception.WorkerException: Running the launcher normalization-orchestrator failed
	at io.airbyte.workers.temporal.sync.LauncherWorker.lambda$run$3(LauncherWorker.java:181) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:315) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.LauncherWorker.run(LauncherWorker.java:90) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:155) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	... 1 more
Caused by: io.airbyte.workers.exception.WorkerException: Non-zero exit code!
	at io.airbyte.workers.temporal.sync.LauncherWorker.lambda$run$3(LauncherWorker.java:165) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:315) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.LauncherWorker.run(LauncherWorker.java:90) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:155) ~[io.airbyte-airbyte-workers-0.39.38-alpha.jar:?]
	... 1 more

Steps to Reproduce

  1. Create a connection with the GitHub source and the PostgreSQL destination.
  2. Turn on normalization.
  3. Run sync.

Are you willing to submit a PR?

Happy to, but I have not been able to determine where in the GitHub workflow the image gets pushed.

@yagehu
Author

yagehu commented Aug 10, 2022

I think this is where the image gets built and pushed:

if [[ "airbyte/normalization" == "${image_name}" ]]; then
  echo "Publishing normalization images (version: $versioned_image)"
  GIT_REVISION=$(git rev-parse HEAD)
  # We use a buildx docker container when building multi-stage builds from one docker compose file
  # This works because all the images depend only on already public images
  docker buildx create --name connector-buildx --driver docker-container --use
  # Note: "buildx bake" needs to be run within the directory
  local original_pwd=$PWD
  cd airbyte-integrations/bases/base-normalization
  VERSION=$image_version GIT_REVISION=$GIT_REVISION docker buildx bake \
    --set "*.platform=$build_arch" \
    -f docker-compose.build.yaml \
    --push
  VERSION=latest GIT_REVISION=$GIT_REVISION docker buildx bake \
    --set "*.platform=$build_arch" \
    -f docker-compose.build.yaml \
    --push
  docker buildx rm connector-buildx
  cd $original_pwd
else
  # We have to go arch-by-arch locally (see https://github.com/docker/buildx/issues/59 for more info) due to our base images (e.g. airbyte-integrations/bases/base-java)
  # Alternative local approach @ https://github.com/docker/buildx/issues/301#issuecomment-755164475
  # We need to use the regular docker buildx driver (not docker container) because we need these intermediate containers to be available for later build steps
  for arch in $(echo $build_arch | sed "s/,/ /g")
  do
    echo "building base images for $arch"
    docker buildx build -t airbyte/integration-base:dev --platform $arch --load airbyte-integrations/bases/base
    docker buildx build -t airbyte/integration-base-java:dev --platform $arch --load airbyte-integrations/bases/base-java
    local arch_versioned_image=$image_name:`echo $arch | sed "s/\//-/g"`-$image_version
    echo "Publishing new version ($arch_versioned_image) from $path"
    docker buildx build -t $arch_versioned_image --platform $arch --push $path
    docker manifest create $latest_image --amend $arch_versioned_image
    docker manifest create $versioned_image --amend $arch_versioned_image

It seems like airbyte/normalization is built differently than other images with --set "*.platform=$build_arch". This might be the reason.
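
To make the difference between the two branches concrete, here is a small sketch of how the arch-by-arch branch in the excerpt derives its per-architecture tags. The values of `build_arch`, `image_name`, and `image_version` are illustrative assumptions rather than values from the real publish job, and `arch_tags` is a hypothetical function name; only the two `sed` expressions are taken from the quoted script.

```shell
#!/bin/sh
# Illustrative values (assumptions, not taken from the real publish job).
build_arch="linux/amd64,linux/arm64"
image_name="example/connector"
image_version="0.0.1"

# Hypothetical helper: print one arch-specific tag per platform in the
# comma-separated list, mirroring the sed expressions in the quoted script.
arch_tags() {
  for arch in $(echo "$1" | sed "s/,/ /g"); do
    echo "$image_name:$(echo "$arch" | sed "s/\//-/g")-$image_version"
  done
}

arch_tags "$build_arch"
```

In the normalization branch, by contrast, the whole comma-separated list is handed to a single `docker buildx bake` invocation through `--set "*.platform=$build_arch"`, so there is no per-architecture tag step.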

@natalyjazzviolin natalyjazzviolin changed the title airbyte/normalization arm64 image bad exec format Normalization: arm64 image bad exec format Aug 10, 2022
@natalyjazzviolin
Contributor

Hi @yagehu, thank you for bringing this to our attention! I have triaged it to the appropriate team!

@yagehu
Author

yagehu commented Aug 10, 2022

Great! A quick note: this is only reproducible on ARM64 Linux and not on M1 MacBooks, presumably because of Rosetta.

@yagehu
Author

yagehu commented Aug 10, 2022

Simplest repro:

  1. Spin up an ARM64 EC2 instance
  2. Install docker
  3. docker pull airbyte/normalization
  4. Run it with docker run.

@evantahler
Contributor

Is this still an issue with our latest normalization images? We've been publishing arm64 normalization containers since June 10, and we made a slight change to this build process, so the image you were using at the time was likely not built with the updated process.

@michael-baraboo-cnr

@evantahler I just tried the repro instructions above on one of our arm64 ec2 instances using the most recent tag and the issue remains:

$ lscpu | head -n1
Architecture:        aarch64
$ docker run airbyte/normalization:0.2.23
standard_init_linux.go:228: exec user process caused: exec format error

@naxels

naxels commented Nov 17, 2022

I can confirm that this doesn't work on my ARM processor (same error as above, exec format error), while it works fine on x64.

@evantahler
Contributor

A summary of some research:

The solution is to not use the https://hub.docker.com/r/fishtownanalytics/dbt base image, and build the deps (python) which we need ourselves.

@evantahler evantahler added the team/destinations Destinations team's backlog label Nov 17, 2022
@evantahler
Contributor

cc @airbytehq/jdbc-connectors

@kggx

kggx commented Apr 12, 2023

Any updates on this? Still facing the same issues on 0.4.0

$ lscpu | head -n1
Architecture:                    aarch64
$ docker run airbyte/normalization:0.4.0
exec /airbyte/entrypoint.sh: exec format error

Many thanks! Willing to help if needed but I would need some guidance where to start.

@evantahler
Contributor

evantahler commented Apr 12, 2023

We are working on removing our separate normalization images and will instead perform these same operations (still using dbt at first) as part of the normal destination containers. That means that our regular docker images (which are already multi-arch) will be used, and this problem should be indirectly solved.

We are targeting early Q4 2023 for this change.

@ajagnanan

ajagnanan commented May 22, 2023

I'm running into this issue too. Is there a way to configure the image used via the Helm chart? We can then build a custom image and push it to our repo, and use that instead while the migration happens.

@amankesharwani7

Team, I am also facing the same issue, and this is a blocker for our use case. Any ETA on a resolution?

@stultus

stultus commented Jul 5, 2023

I'm also facing this issue.

@lavData

lavData commented Aug 4, 2023

I'm also facing this issue when building a custom airbyte-worker:0.50.11.

@zachloertscher

zachloertscher commented Aug 24, 2023

We're also facing this issue. Using an ARM based architecture on our EC2 would cut our CPU by 50% - but we are getting the same error 😢

Airbyte version: 0.44.3
snowflake destination version: 2.1.1

@evantahler
Contributor

evantahler commented Aug 25, 2023

Starting next week, normalization will be removed from destination-snowflake and destination-bigquery with the Destinations V2 launch. By the end of this year, all dbt normalization containers will be removed from Airbyte.

@evantahler
Contributor

@zachloertscher - I believe we are publishing destination-snowflake with an ARM build: https://hub.docker.com/r/airbyte/destination-snowflake/tags. Is something not working?

@zachloertscher

@evantahler yes we are still getting this error: exec user process caused: exec format error

This forum post more clearly illustrates the issue (runs fine on ARM with Rosetta, not on Amazon Linux 2). We also tried to run it on an Amazon Linux 2 instance, specifically the r6g.2xlarge EC2 instance type.

@cedstrom

+1. I see this with the clickhouse connector on arm64 and confirmed that the clickhouse-destination image is a multiarch image on docker hub.

@cedstrom

cedstrom commented Aug 31, 2023

@evantahler The issue exists because the file tree is an x86_64 tree even though the image is arm64, so yes, while the arm images are being built, they're being built incorrectly.

$ uname -i
aarch64

$ docker run --name tmp --entrypoint='/bin/bash' airbyte/normalization-clickhouse:0.4.3
exec /bin/bash: exec format error

$ docker inspect airbyte/normalization-clickhouse:0.4.3  | grep Arch
        "Architecture": "arm64",

$ mkdir tmp
$ docker cp tmp:/ tmp/

$ file tmp/bin/bash
tmp/bin/bash: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=31c321f9f0c1f86a379f7efaaeb75f707998f27f, for GNU/Linux 3.2.0, stripped

This also explains why it works on Mx Macs: Rosetta will happily translate those binaries.
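
Where the `file` utility is not installed, the same check can be approximated by reading the ELF header directly. The sketch below is a minimal illustration, not a replacement for `file`: `elf_machine` is a hypothetical helper name, and it assumes a little-endian ELF file, which covers x86-64 and typical aarch64 builds. It reads the 16-bit `e_machine` field at byte offset 18, where 62 identifies x86-64 and 183 identifies aarch64.

```shell
#!/bin/sh
# Hypothetical helper: report the target machine of an ELF file by reading
# the 16-bit e_machine field at byte offset 18 of the ELF header.
# Assumes a little-endian ELF file.
elf_machine() {
  # The first four bytes of any ELF file are 0x7f 'E' 'L' 'F'.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  if [ "$magic" != "7f454c46" ]; then
    echo "not-elf"
    return 1
  fi
  # Read the two bytes of e_machine as unsigned decimals, low byte first.
  set -- $(od -An -tu1 -j18 -N2 "$1")
  case $(( ${1:-0} + ${2:-0} * 256 )) in
    62)  echo "x86-64" ;;
    183) echo "aarch64" ;;
    *)   echo "other" ;;
  esac
}

# Usage, after copying the container filesystem out as shown above:
#   elf_machine tmp/bin/bash
```

For the file examined above, this would be expected to agree with the `file` output and report x86-64, even though the image metadata says arm64.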

@evantahler
Contributor

Ah, that makes sense (it comes from what we had been doing to shoehorn in a dbt build image which does not have an arm build). Starting this week, we are rolling out Destinations V2 (#26028), which doesn't use dbt normalization and is built properly from a base image that does have arm support. Destinations Snowflake v3.x and BigQuery 2.x are available today, and other destinations will be updated this year.

We will not be fixing these images with dbt, as dbt is being actively removed from within Airbyte destinations. Destinations V2 images should work properly on arm hosts.

@cgardens
Contributor

cgardens commented Feb 9, 2024

We aren't going to pick this up. We want to stay away from orchestrating dbt. We recommend using Airflow or Dagster to do this orchestration. Docs.

@cgardens cgardens closed this as completed Feb 9, 2024