
Get pod logs when multiple containers are used #38

Closed
etorres opened this issue Apr 27, 2021 · 7 comments
Labels
bug Something isn't working

Comments


etorres commented Apr 27, 2021

In short:

Hi,

We are using KubernetesLegacyJobOperator to manage our Spark-based applications with Airflow. In our setup we are allocating two containers (spark-kubernetes-driver and fluent-bit) within the same pod and this is causing the following error:

airflow_kubernetes_job_operator.kube_api.queries.GetPodLogs, Bad Request: a container name must be specified for pod application-name, choose one of: [spark-kubernetes-driver fluent-bit]

We managed to make it work by setting the value of the get_logs parameter to False, but we would like to have access to the logs in the airflow task.

Is there any way to get this? I looked around but had no luck finding an answer.

Thanks in advance!

More information

A fragment of the stack trace:

kubernetes.client.exceptions.ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'cef97a1d-ee58-4bc1-b220-63338af4f3ee', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Mon, 26 Apr 2021 11:14:05 GMT', 'Content-Length': '259'})
HTTP response body: {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'a container name must be specified for pod application-y761881y, choose one of: [spark-kubernetes-driver fluent-bit]', 'reason': 'BadRequest', 'code': 400}
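For context, the Kubernetes log endpoint only defaults the container when a pod runs exactly one; with two or more, every log request must name its target (in the official Python client, `CoreV1Api.read_namespaced_pod_log` takes this as its `container` parameter). A minimal sketch of resolving which containers need their own log query, using an illustrative in-memory pod manifest rather than a live cluster:

```python
def containers_needing_log_queries(pod: dict) -> list:
    """Return the container names that must be passed explicitly to the
    pod log endpoint. With one container the API defaults to it; with
    several, each container needs its own query."""
    return [c["name"] for c in pod["spec"]["containers"]]


# Illustrative pod spec matching the error message above.
pod = {
    "spec": {
        "containers": [
            {"name": "spark-kubernetes-driver"},
            {"name": "fluent-bit"},
        ]
    }
}

print(containers_needing_log_queries(pod))
# → ['spark-kubernetes-driver', 'fluent-bit']
```

With a single-container pod this list has one entry and the log call can omit the container name, which is why the error only surfaces in multi-container setups like the Spark driver + fluent-bit pod above.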


LamaAni commented Apr 27, 2021

Hi, Thank you for posting!

I would love to help, but I would need an example DAG with a minimal configuration, if you can provide one.

This seems like an error where a pod has multiple containers. Is that correct?

I'm not sure about the timeline, but I'm pretty confident we can resolve this issue.


LamaAni commented Apr 27, 2021

Hi, I was able to reproduce your error with the following DAG (testpod.py):

from airflow import DAG
from airflow_kubernetes_job_operator import KubernetesJobOperator
from airflow.utils.dates import days_ago

default_args = {
    "depends_on_past": False,
    "start_date": days_ago(1),
}

dag = DAG(dag_id="test_multi_container", default_args=default_args, schedule_interval=None)

with dag:
    KubernetesJobOperator(
        task_id="test-multi-container",
        body_filepath="./testpod.yaml",
    )

if __name__ == "__main__":
    dag.schedule_interval = None
    dag.clear()
    dag.run()

Job configuration (testpod.yaml):

apiVersion: v1
kind: Pod
metadata:
  name: 'multi-container-test'
  labels:
    app: 'multi-container-test'
spec:
  restartPolicy: Never
  containers:
    - name: container1
      image: 'alpine:latest'
      command:
        - sh
        - -c
        - |
          echo starting sleep...
          sleep 10
          echo end
      resources:
        limits:
          cpu: 200m
          memory: 500Mi
        requests:
          cpu: 100m
          memory: 200Mi
    - name: container2
      image: 'alpine:latest'
      command:
        - sh
        - -c
        - |
          echo starting sleep...
          sleep 10
          echo end
      resources:
        limits:
          cpu: 200m
          memory: 500Mi
        requests:
          cpu: 100m
          memory: 200Mi

@LamaAni LamaAni self-assigned this Apr 27, 2021
@LamaAni LamaAni added the bug Something isn't working label Apr 27, 2021

LamaAni commented Apr 27, 2021

It may be a few days before I can get to this; apologies for that. The error lies in the Kubernetes API call, which does not specify that logs should be loaded from all executing containers inside a pod. This, of course, can be changed. The error location is here:

Log reader needs to be able to define a container here:

Watcher needs to define a reader for each container here:

This is a nice catch. Thank you.
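The fix direction described above can be sketched roughly as follows. Note the class and function names here are hypothetical stand-ins, not the operator's real API; the sketch only illustrates spawning one log reader per container instead of a single pod-wide query:

```python
from dataclasses import dataclass


@dataclass
class PodLogReader:
    """Hypothetical stand-in for a per-container log query: reads logs
    for exactly one container inside a pod."""

    pod: str
    namespace: str
    container: str

    def describe(self) -> str:
        return f"{self.namespace}/{self.pod}[{self.container}]"


def make_readers(pod: str, namespace: str, containers: list) -> list:
    # One reader per container, instead of a single pod-wide query that
    # the API rejects when the pod runs more than one container.
    return [PodLogReader(pod, namespace, c) for c in containers]


readers = make_readers(
    "application-name", "default", ["spark-kubernetes-driver", "fluent-bit"]
)
for r in readers:
    print(r.describe())
```

A watcher built this way can also prefix each emitted log line with the reader's container name, so interleaved output from the driver and sidecar stays attributable.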


LamaAni commented Apr 27, 2021

Please see resolution PR here:
#39


LamaAni commented Apr 27, 2021

Please see the resolution release:
https://github.com/LamaAni/KubernetesJobOperator/releases/tag/1.0.19

Please close this issue once validated. Thank you!


etorres commented Apr 29, 2021

I'm very grateful for your time. We switched to the new version and it works like a charm. I'm closing the issue.

@etorres etorres closed this as completed Apr 29, 2021

LamaAni commented Apr 29, 2021

Awesome, if you can add a testimonial, that would be fantastic. Here: #40
