This repository has been archived by the owner on Jan 23, 2024. It is now read-only.

A Few Issues found on Kubernetes #27

Open
ericxu10101 opened this issue Aug 8, 2020 · 5 comments

@ericxu10101

I am following the Kubernetes part of https://uber.github.io/fiber/getting-started/ and ran into the following issues:

  1. It seems that a k8s 'Job' only works in the 'default' namespace?
    When I try a different namespace, the master pod keeps failing and being recreated.

  2. The poolwork pods terminate with 'Failed' status, while the master pod returns 'Success'. Is there any way to address that?

  3. It looks like the k8s 'Job' must have an explicit 'name' instead of 'generateName'; otherwise the master pod throws a 'Pod not found' error. Is this a known issue?

Thanks

@calio
Collaborator

calio commented Aug 11, 2020

Hi @ericxu10101, let me reply to those questions inline:

  1. It seems that a k8s 'Job' only works in the 'default' namespace? When I try a different namespace, the master pod keeps failing and being recreated.

That's true. Currently, k8s jobs only work in the 'default' namespace. It's a limitation of the current version. I'm planning to add support for other namespaces in the next version. Or, if you have time, you can submit a PR for that.
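
For readers hitting the same limitation: a process running inside a pod can usually discover its own namespace from the service-account file that Kubernetes mounts into the container. The sketch below only illustrates that lookup. It is not taken from Fiber or from the later fix, and `current_namespace` is a hypothetical helper name; it also assumes the service-account mount has not been disabled for the pod.

```python
from pathlib import Path

# Standard location of the namespace file in a pod's service-account mount.
NAMESPACE_FILE = Path("/var/run/secrets/kubernetes.io/serviceaccount/namespace")


def current_namespace(namespace_file=NAMESPACE_FILE, fallback="default"):
    """Return the namespace stored in namespace_file, or fallback if unavailable.

    Hypothetical helper for illustration; not part of Fiber's API.
    """
    try:
        text = Path(namespace_file).read_text().strip()
    except OSError:
        # Not running in a pod, or the service-account mount is disabled.
        return fallback
    return text or fallback


if __name__ == "__main__":
    print(current_namespace())
```

Passing the detected value, rather than a hard-coded 'default', to namespaced API calls is one way a library could follow the Job into whichever namespace it was submitted to.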

  2. The poolwork pods terminate with 'Failed' status, while the master pod returns 'Success'. Is there any way to address that?

The pods most likely failed because of the daemon thread inside each worker pod, which terminates the worker when it loses its connection to the master. It doesn't mean the whole job failed; it only means the worker exited in a special way, and it shouldn't affect the final result (if you notice otherwise, please create a new issue for it). I'm planning to address this in the next version so that the exit status of worker pods is set correctly.
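
To make the mechanism concrete, here is a minimal sketch of such a watchdog. It is not Fiber's actual implementation: `start_watchdog`, `master_alive`, and `on_master_lost` are hypothetical names chosen for illustration. The relevant Kubernetes behavior is that a pod whose container exits with a non-zero code is reported as 'Failed', so the exit code chosen in the callback decides what status the worker pod ends up with.

```python
import threading
import time


def start_watchdog(master_alive, on_master_lost, interval=0.05):
    """Start a daemon thread that calls on_master_lost() once master_alive() is false.

    Hypothetical sketch: master_alive is any zero-argument callable that reports
    whether the connection to the master still exists.
    """
    def watch():
        while master_alive():
            time.sleep(interval)
        on_master_lost()

    thread = threading.Thread(target=watch, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    import os

    connected = threading.Event()
    connected.set()
    # Exiting with code 0 here lets the container terminate successfully;
    # any non-zero code would make Kubernetes mark the pod as Failed.
    start_watchdog(connected.is_set, lambda: os._exit(0))
    connected.clear()  # simulate losing the master
    time.sleep(1)
```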

  3. It looks like the k8s 'Job' must have an explicit 'name' instead of 'generateName'; otherwise the master pod throws a 'Pod not found' error. Is this a known issue?

Currently 'name' is required for Fiber on K8s to work. Do you have a use case for 'generateName'? If it's general enough, it can be supported later.
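
For reference, the difference between the two fields can be sketched with plain dictionaries. With 'name' the caller fixes the object name up front; with 'generateName' the API server appends a random suffix, so the final name is only known from the object returned after creation. The helper names and the 'fiber-pi' values below are hypothetical, and the response dictionary is hand-written for illustration rather than captured from a cluster.

```python
def job_metadata(name=None, generate_name=None):
    """Build the metadata section of a Job manifest (hypothetical helper).

    Exactly one of name or generate_name must be given.
    """
    if (name is None) == (generate_name is None):
        raise ValueError("set exactly one of name or generate_name")
    if name is not None:
        return {"name": name}
    return {"generateName": generate_name}


def created_name(api_response):
    """Read the final object name from the object the API server returns."""
    return api_response["metadata"]["name"]


if __name__ == "__main__":
    print(job_metadata(name="fiber-pi"))            # name known before creation
    print(job_metadata(generate_name="fiber-pi-"))  # suffix chosen by the server
    # Hand-written stand-in for a creation response, for illustration only.
    fake_response = {"metadata": {"generateName": "fiber-pi-", "name": "fiber-pi-x7k2q"}}
    print(created_name(fake_response))
```

Any code that needs to look up the resulting pods by name would have to use the name from the creation response in the 'generateName' case.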

@calio
Collaborator

calio commented Aug 21, 2020

@ericxu10101 issues 1 and 2 have been fixed in #29 and #28. Feel free to install the newest version from the current master branch and test it out. Regarding issue 3, can you explain the issue in more detail?

@ericxu10101
Author

ericxu10101 commented Sep 1, 2020

@calio thanks a lot. That's awesome! 3 is not actually an issue; I just wanted to confirm the behavior. So it seems that having to give an explicit name instead of using 'generateName' is a current limitation.

I think supporting 'generateName' could be a future feature. It would be handy when scheduling the Job.

Feel free to close this issue. And let me know where and how to track feature requests. Thanks.

@calio
Collaborator

calio commented Sep 3, 2020

Hi @ericxu10101, currently all features are tracked with GitHub issues. You can create a new issue with the tag "enhancement" and track progress there.

@kiran-italiya

I'm getting a urllib3 "unable to establish connection" error in process.py with the pi-estimation example, and I am unable to resolve it. I tried granting all the permissions, but nothing seems to work. I'm also getting the following error when running the fiber CLI:

  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/fiber/cli.py", line 395, in run
    job = k8s_backend.create_job(job_spec)
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/fiber/kubernetes_backend.py", line 171, in create_job
    raise e
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/fiber/kubernetes_backend.py", line 168, in create_job
    self.default_namespace, body
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/kubernetes/client/api/core_v1_api.py", line 7320, in create_namespaced_pod
    return self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)  # noqa: E501
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/kubernetes/client/api/core_v1_api.py", line 7429, in create_namespaced_pod_with_http_info
    collection_formats=collection_formats)
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 353, in call_api
    _preload_content, _request_timeout, _host)
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 184, in __call_api
    _request_timeout=_request_timeout)
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 397, in request
    body=body)
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/kubernetes/client/rest.py", line 280, in POST
    body=body)
  File "/home/help_myjournal/.local/lib/python3.7/site-packages/kubernetes/client/rest.py", line 233, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (501)
Reason: Unsupported method ('POST')
HTTP response headers: HTTPHeaderDict({'Server': 'BaseHTTP/0.3 Python/2.7.13', 'Date': 'Sun, 21 Feb 2021 11:26:45 GMT', 'Connection': 'close', 'Content-Type': 'text/html'})
HTTP response body: <head>
<title>Error response</title>
</head>
<body>
<h1>Error response</h1>
<p>Error code 501.
<p>Message: Unsupported method ('POST').
<p>Error code explanation: 501 = Server does not support this operation.
</body>

Does anyone have an idea what is happening here? Is it a bug?
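
One detail in the response above may help narrow this down: the 'Server' header reads 'BaseHTTP/0.3 Python/2.7.13', which is how Python's built-in HTTP server identifies itself, and the 501 "Unsupported method ('POST')" page is the reply that server sends for methods its handler does not implement. That suggests the request reached some Python HTTP server rather than a Kubernetes API server, although the thread does not confirm the cause. The sketch below reproduces the same kind of reply locally using only the standard library; `QuietHandler`, `post_status`, and `demo` are illustrative names, and the URL path is just an example.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class QuietHandler(BaseHTTPRequestHandler):
    """Defines no do_POST, so POST requests get the built-in 501 reply."""

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def post_status(url):
    """POST a small body to url and return (status code, Server header)."""
    # Bypass any proxy settings so the request goes straight to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, data=b"{}", method="POST")
    try:
        with opener.open(request, timeout=5) as response:
            return response.status, response.headers.get("Server")
    except urllib.error.HTTPError as error:
        return error.code, error.headers.get("Server")


def demo():
    """Start a throwaway local server, POST to it, and return the result."""
    server = HTTPServer(("127.0.0.1", 0), QuietHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return post_status("http://127.0.0.1:%d/api/v1/namespaces/default/pods" % port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

If the reply matches this pattern, checking which host and port the active kubeconfig context points to is a reasonable next step.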
