kube-task process stuck and pod never terminates #6

Open
rumi-spock opened this issue Jun 22, 2020 · 11 comments

Comments

@rumi-spock

Hi

We are using kube-tasks to create Jenkins backups, following the guidelines from the Jenkins Helm chart. Generally it works great, but I have noticed that our backup process very often gets stuck and never exits, leaving the pod in the Running state. Unless this stuck pod is deleted, no new backup job is triggered.

Another thing I noticed is that it always gets stuck when copying slave node files, and it does throw an error:

2020/06/19 02:30:35 error in Stream: command terminated with exit code 1  src: file: engineering/eng-jenkins-74775f7d68-285n8/jenkins/var/jenkins_home/nodes/jenkins.bugfix-ompl-1144.7-bq5dp-vbjn9/config.xml
2020/06/19 02:30:35 [011414/109652] done: k8s://engineering/eng-jenkins-74775f7d68-285n8/jenkins/var/jenkins_home/nodes/jenkins.bugfix-ompl-1144.7-bq5dp-vbjn9/config.xml -> s3://jenkins-engineering-tools-backup/20200619020006/nodes/jenkins.bugfix-ompl-1144.7-bq5dp-vbjn9/config.xml

These are the last lines in the logs.

One solution would be an exclude-paths option, so that we can pass another parameter with a list of paths to be excluded.
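
To make the exclude-paths idea concrete, here is a minimal sketch in Python of filtering an already gathered file list against glob patterns before any copy starts. It only illustrates the concept: nothing in this thread shows that kube-tasks has such an option, and `filter_excluded`, the sample paths, and the patterns are all hypothetical.

```python
from fnmatch import fnmatchcase


def filter_excluded(paths, exclude_patterns):
    """Return the paths that match none of the glob-style exclude patterns.

    Hypothetical helper for illustration; not part of kube-tasks.
    """
    return [
        path
        for path in paths
        if not any(fnmatchcase(path, pattern) for pattern in exclude_patterns)
    ]


if __name__ == "__main__":
    # Hypothetical paths relative to a Jenkins home directory.
    gathered = [
        "config.xml",
        "nodes/agent-1/config.xml",
        "jobs/demo/config.xml",
        "support/support_2020-10-08_12.53.03.zip",
    ]
    # In fnmatch-style globs, "*" also matches "/" so "nodes/*" covers
    # everything below the nodes directory.
    for path in filter_excluded(gathered, ["nodes/*", "support/*"]):
        print(path)
```

Filtering the list up front, rather than during the copy, would keep short-lived files such as agent node configs out of the backup entirely, so they could not fail mid-transfer.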
@cmcga1125

I just experienced the same error.

@victtsl

victtsl commented Oct 9, 2020

I'm experiencing the same problem.

2020/10/08 16:02:56 error in Stream: command terminated with exit code 1 src: file: jenkins/jenkins-7cff7d695d-8k5h4/jenkins/var/jenkins_home/support/support_2020-10-08_12.53.03.zip

I restarted the jenkins-backup pod, and the backup process got stuck again on a different file.

2020/10/09 16:43:34 error in Stream: command terminated with exit code 1 src: file: jenkins/jenkins-7cff7d695d-8k5h4/jenkins/var/jenkins_home/support/support_2020-10-09_12.59.03.zip

@rumi-spock How do you pass a parameter with a list of paths to be excluded?

@taneishamitchell

It seems this project has been adopted by @maorfr; the repo can be found here.

@victtsl

victtsl commented Oct 10, 2020

Unless I'm missing something, it doesn't look like there's a way to submit an issue to @maorfr's repo.

@maorfr
Contributor

maorfr commented Oct 11, 2020

hello!

PRs are welcome!

@YakobovLior

Hey @maorfr
I am also experiencing this issue, and it seems the cause is files being deleted by the Jenkins job history lifecycle.
I got the error on build number 13, which no longer exists, so I guess the backup job got stuck because the file had been deleted.
Is there any way to avoid this behavior? I would really like to make this work instead of coming up with a workaround backup job (creating a tar.gz of the Jenkins home dir and uploading it to S3 on my own).
If it were possible to add a flag such as skip_changed or something similar, it would be very helpful.
I believe most people would prefer to not back up the changed/deleted files rather than lose the entire backup.

Thanks

@sunoce

sunoce commented Apr 26, 2022

@ALL
In case this is relevant to any of you:

Since this repo no longer seems to be maintained, I created my own fork and added a flag that allows you to skip files that produce errors. For me the issue was that files got deleted while the backup job was running. Since the backup job gathers all files at the start, this leads to errors when copying files and terminates the job.

Therefore I added the flag, so that errors are logged but the job keeps running.
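
The behaviour described above (gather the file list once, then log and skip files that fail instead of aborting) can be sketched roughly as follows. This is a local-filesystem Python illustration, not the fork's actual code; `copy_files` and its `skip_errors` parameter are hypothetical names chosen for the example.

```python
import logging
import shutil
from pathlib import Path


def copy_files(src_root, dst_root, relative_paths, skip_errors=False):
    """Copy each listed file and return the relative paths that were skipped.

    Hypothetical helper for illustration. With skip_errors=False the first
    failure is re-raised, which mirrors a job that stops on one bad file.
    """
    skipped = []
    for rel in relative_paths:
        src = Path(src_root) / rel
        dst = Path(dst_root) / rel
        try:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(src, dst)
        except OSError as err:
            # Covers files deleted after the list was gathered
            # (FileNotFoundError is a subclass of OSError).
            if not skip_errors:
                raise
            logging.warning("skipping %s: %s", rel, err)
            skipped.append(rel)
    return skipped
```

Returning the skipped paths lets the caller decide afterwards whether a backup with gaps is acceptable, instead of silently treating it as complete.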

If anyone is interested, use the fork:
https://github.com/sunoce/kube-tasks

You can find the docker image at:
https://hub.docker.com/r/sunoce/kube-tasks/tags

If the maintainer sees this and wants me to create a PR, comment here and I will create one.

EDIT:

Ah, I missed the comment from @maorfr, so I will create a PR for this. For this I have to adjust the skbn module.

EDIT:

I created the PRs: maorfr#6

@sudhirnikhade

sudhirnikhade commented Dec 6, 2022

Hey @sunoce ,
Thank you for this fix.
I am also experiencing the same problem in my backup job. Can your Docker image https://hub.docker.com/r/sunoce/kube-tasks/tags be used directly in the backup jobs, since your code is still not merged into the maorfr repo?

Thanks for your reply in advance.

@sunoce

sunoce commented Dec 6, 2022

Hey @sudhirnikhade

You can use the image. If you use the full repo, you will have to use the fork as well.

But I can also finish the pull request; I just forgot about it.

Kind regards

@sudhirnikhade

Hi @sunoce ,
We implemented this with https://github.com/sunoce/kube-tasks and it is working. Thank you for that.
But I have one question: is there any option to exclude folders or files while copying data between k8s and an S3 bucket? For example, the builds folder in Jenkins.
Thanks for your time!

Thanks,
Sudhir

@danielmorillas

Hi,

I am using this image (https://github.com/amerello/kube-tasks), which solves the problem:
"error in Stream: command terminated with exit code 1"

https://hub.docker.com/r/amerello/kube-tasks

It skips this kind of error and carries on; those files are empty.

Hope that helps!!

9 participants