Airbyte Temporal - "Failed to resolved name" (Wrong temporal link used by airbyte-cron) #20152

Closed
gajus opened this issue Dec 6, 2022 · 8 comments

Comments

@gajus

gajus commented Dec 6, 2022

Getting error:

WARNING: [Channel<359>: (airbyte-temporal:7233)] Failed to resolve name. status=Status{code=UNAVAILABLE, description=Unable to resolve host airbyte-temporal, cause=java.lang.RuntimeException: java.net.UnknownHostException: airbyte-temporal
	at io.grpc.internal.DnsNameResolver.resolveAddresses(DnsNameResolver.java:223)
	at io.grpc.internal.DnsNameResolver.doResolve(DnsNameResolver.java:282)
	at io.grpc.grpclb.GrpclbNameResolver.doResolve(GrpclbNameResolver.java:63)
	at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:318)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1589)
Caused by: java.net.UnknownHostException: airbyte-temporal
	at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:952)
	at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1658)
	at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1524)
	at io.grpc.internal.DnsNameResolver$JdkAddressResolver.resolveAddress(DnsNameResolver.java:631)
	at io.grpc.internal.DnsNameResolver.resolveAddresses(DnsNameResolver.java:219)
	... 6 more

Looking at our Kubernetes deployment, it appears that the correct host should be contra-airbyte-temporal:

$ kubectl get svc -n airbyte
NAME                                                  TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
airbyte-minio-svc                                     ClusterIP   10.16.13.54    <none>        9000/TCP   92m
contra-airbyte                                        ClusterIP   10.16.4.140    <none>        80/TCP     8h
contra-airbyte-airbyte-connector-builder-server-svc   ClusterIP   10.16.5.16     <none>        8003/TCP   8h
contra-airbyte-airbyte-server-svc                     ClusterIP   10.16.15.130   <none>        8001/TCP   8h
contra-airbyte-airbyte-webapp-svc                     ClusterIP   10.16.5.76     <none>        80/TCP     8h
contra-airbyte-temporal                               ClusterIP   10.16.8.1      <none>        7233/TCP   8h

I believe this is because our Helm deployment name is contra-airbyte.
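As a rough illustration of the naming pattern visible in the service listing above, the sketch below builds the Temporal address from a release name. The function name `temporal_host_for_release` is hypothetical, and the `<release>-temporal` pattern is inferred only from this listing, not from the chart source.

```shell
# Hypothetical helper: build the Temporal address for a Helm release,
# ASSUMING the Service is named "<release>-temporal" as in the listing above.
# Arguments: release name, optional port (defaults to 7233, the port listed).
temporal_host_for_release() {
  release="$1"
  port="${2:-7233}"
  printf '%s-temporal:%s\n' "$release" "$port"
}

temporal_host_for_release contra-airbyte   # prints contra-airbyte-temporal:7233
```

With a release named plain `airbyte`, the same pattern would yield `airbyte-temporal:7233`, which matches the host in the error message.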

Our configuration:

airbyte:
  externalDatabase:
    database: airbyte
    host: '...'
    password: '...'
    port: 5432
    user: postgres
  logs:
    minio:
      enabled: true
    s3:
      enabled: false
  minio:
    auth:
      rootPassword: minio123
      rootUser: minio
    enabled: true
    persistence:
      accessMode: ReadWriteOnce
      size: 160Gi
  postgresql:
    enabled: false
  serviceAccount:
    name: airbyte-admin
app:
  deployment:
    version: '0.0.0'

Deployed as:

apiVersion: v2
dependencies:
  - name: airbyte
    repository: https://airbytehq.github.io/helm-charts
    version: '0.42.4'
name: contra-airbyte
version: '1.0.0'

How can I fix the host that airbyte-cron is using?

@gajus gajus added the needs-triage and type/bug labels Dec 6, 2022
@gajus
Author

gajus commented Dec 7, 2022

CC @dizel852, as this is a pretty major issue and I am looking for a workaround.

@gajus
Author

gajus commented Dec 7, 2022

I was trying to read the Helm chart code, but so far I cannot even tell where this host is defined.

Looking at the chart code, I thought I could override it with env_vars, but that does not have the desired effect.

What I tried:

env_vars:
  TEMPORAL_HOST: contra-airbyte-temporal

cron:
  env_vars:
    TEMPORAL_HOST: contra-airbyte-temporal

cron:
  extraEnv:
    - name: TEMPORAL_HOST
      value: contra-airbyte-temporal

Despite adding these, I don't see them reflected in the resulting pod, pod/contra-airbyte-cron-6fc69b87c-k8jgn.

@gajus
Author

gajus commented Dec 7, 2022

It is worth emphasizing that in the contra-airbyte-airbyte-env ConfigMap, the value defined is correct:

TEMPORAL_HOST: contra-airbyte-temporal:7233

So it seems like the issue is somewhere else.

@gajus
Author

gajus commented Dec 7, 2022

I cannot tell where this value is even coming from.

Looking at kubectl exec -it -n airbyte pod/contra-airbyte-cron-6fc69b87c-gpxsr -- env, I don't see any mention of airbyte-temporal.
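The two checks described in the comments above (the value in the ConfigMap versus the environment inside the running pod) could be wrapped as below. This is only a sketch: the function names are hypothetical, `kubectl` access to the cluster is assumed, and the namespace, ConfigMap, and pod names would be the ones quoted in this thread.

```shell
# Print the TEMPORAL_HOST value stored in a ConfigMap.
# Arguments: namespace, ConfigMap name.
configmap_temporal_host() {
  kubectl get configmap "$2" -n "$1" -o jsonpath='{.data.TEMPORAL_HOST}'
}

# Print the TEMPORAL_HOST line from the environment of a running pod.
# Prints nothing (and returns non-zero) when the variable is not set there.
# Arguments: namespace, pod reference (for example pod/<name>).
pod_temporal_host() {
  kubectl exec -n "$1" "$2" -- env | grep '^TEMPORAL_HOST='
}
```

For example, `configmap_temporal_host airbyte contra-airbyte-airbyte-env` reads the ConfigMap mentioned above, and `pod_temporal_host airbyte pod/contra-airbyte-cron-6fc69b87c-gpxsr` inspects the cron pod. If the first prints a value and the second prints nothing, the pod is not receiving the variable from that ConfigMap.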

@gajus
Author

gajus commented Dec 7, 2022

A few things:

  1. The extraEnv approach does actually work.
  2. TEMPORAL_HOST needs to include the port.

That is, this does work:

airbyte:
  cron:
    extraEnv:
      - name: TEMPORAL_HOST
        value: contra-airbyte-temporal:7233

It still feels like a bug in the chart that I had to override it.
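Since the second point in this comment (the value must include a port) is easy to miss, a small pre-deploy check along these lines might help. It is a minimal sketch; `has_port` is a hypothetical helper, not part of the chart or of Airbyte.

```shell
# Return 0 when the argument looks like "host:port" with a numeric port,
# and 1 otherwise (no colon, empty port, or non-digit characters in the port).
has_port() {
  port="${1##*:}"                    # text after the last colon
  [ "$port" != "$1" ] || return 1    # unchanged means there was no colon
  case "$port" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

has_port "contra-airbyte-temporal:7233" && echo "ok"
has_port "contra-airbyte-temporal" || echo "missing port"
```

The first call corresponds to the override that worked, and the second to the earlier attempts that omitted the port.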

@gajus gajus closed this as completed Dec 7, 2022
@ramonvermeulen
Contributor

ramonvermeulen commented Dec 8, 2022

@gajus Thanks a lot for finding this out; it saved me from hours of debugging this evening.
After I upgraded the Helm deployment from 0.40.X to version 0.42.4, I ran into exactly the same issue. Eventually I found this ticket, and indeed, setting TEMPORAL_HOST to the name of the k8s Temporal service endpoint fixed the issue for me.
However, I would expect the cron deployment to retrieve this value from the k8s env ConfigMap.

I have the feeling that there are quite a few things in the charts that you have to set yourself, and that they change constantly between releases. However, there isn't much documentation about this, nor a changelog (as far as I can find). Having one would make me more confident about upgrading to a newer version of the chart without running into too many issues.

@eabrouwer3

+1 on getting this fixed in the chart. This is another odd thing in the charts that you have to set yourself. @gajus, I'd consider reopening this so that it hopefully gets fixed, but that is up to you, of course.

@gajus gajus reopened this Dec 9, 2022
@ramonvermeulen
Contributor

ramonvermeulen commented Dec 9, 2022

#20299 I will update this later (currently quite busy), but I think this will fix it?

@sajarin sajarin added the temporal, area/platform, team/prod-eng, and team/platform-enablement labels and removed the needs-triage, team/tse, and autoteam labels Jan 2, 2023
@sajarin sajarin closed this as completed Jan 2, 2023
@sajarin sajarin changed the title from "Wrong temporal link used by airbyte-cron" to 'Airbyte Temporal - "Failed to resolved name" (Wrong temporal link used by airbyte-cron)' Jan 2, 2023