Handle tombstones that were written before kubexit started. #11

Open · wants to merge 13 commits into master
Conversation

@karlkfi (Owner) commented Aug 12, 2020

This avoids a race condition where death dependencies that were already dead did not cause kubexit to exit.

Fixes #8
Replaces #10

@karlkfi (Owner, Author) commented Aug 12, 2020

@RemcodM can you please verify that this fixes your problem?

@RemcodM (Contributor) commented Aug 13, 2020

@karlkfi Sadly no, as this change introduces a new issue on side A (in the context of #8). When there is a timeout waiting for birth dependencies, no tombstone is written: ShutdownNow() returns an error, which causes fatalf to shut down immediately without writing a tombstone. Since no tombstone is written, there is no way to reproduce the environment in which #8 can occur.

Logs:

```
2020-08-13T05:29:32.169289349Z Name: a
2020-08-13T05:29:32.169306084Z Graveyard: /tools
2020-08-13T05:29:32.169309021Z Tombstone: /tools/a
2020-08-13T05:29:32.169311267Z Birth Deps: b
2020-08-13T05:29:32.169313371Z Death Deps: N/A
2020-08-13T05:29:32.169315510Z Birth Timeout: 30s
2020-08-13T05:29:32.169317603Z Grace Period: 30s
2020-08-13T05:29:32.169319668Z Pod Name: <POD_NAME_REDACTED>
2020-08-13T05:29:32.169321898Z Namespace: default
2020-08-13T05:29:32.169568257Z Watching pod updates...
2020-08-13T05:29:32.185562385Z Event Type: ADDED
2020-08-13T05:29:32.960856704Z Event Type: MODIFIED
2020-08-13T05:30:02.169724892Z Error: timed out waiting for birth deps to be ready: 30s
2020-08-13T05:30:02.169741025Z Error: failed to shutdown child process: child process not running
```
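
To make the failure path concrete, here is a self-contained sketch of the flow as I read it (the names follow this thread, but the bodies are simplified stand-ins, not kubexit's actual code):

```go
package main

import (
	"errors"
	"log"
	"os"
)

// Illustrative stand-in: the child was never started because the birth
// deps timed out, so there is nothing to shut down.
func shutdownNow(childStarted bool) error {
	if !childStarted {
		return errors.New("child process not running")
	}
	return nil
}

// Illustrative stand-in for the tombstone write.
func recordDeath() { log.Println("tombstone written") }

// fatalf as described above: an error from shutdownNow exits
// immediately, so recordDeath (the tombstone write) is never reached.
func fatalf(childStarted bool, msg string) {
	log.Println("Error:", msg)
	if err := shutdownNow(childStarted); err != nil {
		log.Println("Error: failed to shutdown child process:", err)
		os.Exit(1) // exits here, before any tombstone is written
	}
	recordDeath()
	os.Exit(1)
}

func main() {
	// Reproduces the log lines above: timeout, shutdown failure, exit.
	fatalf(false, "timed out waiting for birth deps to be ready: 30s")
}
```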

@RemcodM (Contributor) commented Aug 13, 2020

I am not entirely sure what would be a good fix. You could always write a tombstone, even if ShutdownNow() returns an error. However, after ShutdownNow() has returned an error it is not always safe to call waitForChildExit(child): if ShutdownNow() failed because it could not deliver a SIGKILL to the process, waitForChildExit(child) might hang forever.

However, writing a tombstone without calling waitForChildExit(child) might miss out on an exit code. If waitForChildExit(child) is failing because the process is not running, it might be the case that the process has run before. In that case waitForChildExit(child) would still return the child's exit code, and without calling it, the exit code might not be written correctly.

Even though I am not completely sure about a good fix, I would propose always writing a tombstone. If fatalf is called, kubexit needs to exit regardless, and writing a tombstone should always happen, whatever the status of the child process.
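
Roughly what I have in mind, as a standalone sketch (shutdownNow, waitForChildExit, and recordDeath are simplified stand-ins for the real functions, not a patch against main.go):

```go
package main

import (
	"errors"
	"log"
	"os"
)

type child struct{ started bool }

func shutdownNow(c *child) error {
	if !c.started {
		return errors.New("child process not running")
	}
	return nil // pretend the signal was delivered
}

func waitForChildExit(c *child) (int, error) { return 0, nil }

func recordDeath(exitCode int) error {
	log.Printf("tombstone written (exit code %d)", exitCode)
	return nil
}

// Sketch: always write the tombstone, and only wait for the child when
// shutdownNow succeeded, so we can never hang on a process we failed
// to signal.
func fatalf(c *child, msg string) {
	log.Println("Error:", msg)

	exitCode := -1 // unknown unless we can collect it below
	if err := shutdownNow(c); err != nil {
		log.Println("Error: failed to shutdown child process:", err)
	} else if code, err := waitForChildExit(c); err == nil {
		exitCode = code
	}

	// Record death regardless of the child's status, so death
	// dependencies still see that this container is gone.
	if err := recordDeath(exitCode); err != nil {
		log.Println("Error: failed to record death:", err)
	}
	os.Exit(1)
}

func main() {
	fatalf(&child{started: false}, "timed out waiting for birth deps to be ready: 30s")
}
```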

@RemcodM (Contributor) commented Aug 13, 2020

I checked the code paths once more, and it seems there is only one case (at the moment, at least) in which fatalf can be called while the child process has already exited (but actually ran before): when the child process has been successfully started but recording the birth fails (https://github.com/karlkfi/kubexit/blob/master/cmd/kubexit/main.go#L158).

If recording the birth fails, there might be a race where the child has also finished immediately after startup. In that case fatalf would call ShutdownNow(), which would fail with "child process not running". It would still be nice to record the exit code of the child in this case (if that is possible at all, since recording the birth already failed).
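
For illustration, a standalone sketch of recovering the exit code in that race; collectExitCode is a hypothetical helper, not kubexit code:

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// collectExitCode: even after the child has already exited, Wait can
// still report its status as long as the process hasn't been reaped
// yet. Returns -1 when no exit code is recoverable.
func collectExitCode(cmd *exec.Cmd) int {
	if cmd == nil || cmd.Process == nil {
		return -1 // never started: nothing to collect
	}
	if err := cmd.Wait(); err != nil {
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) {
			return exitErr.ExitCode() // child ran and exited non-zero
		}
		return -1 // already waited on, or Wait failed for another reason
	}
	return 0 // child exited cleanly
}

func main() {
	cmd := exec.Command("false") // exits immediately with code 1
	_ = cmd.Start()
	fmt.Println("exit code:", collectExitCode(cmd)) // prints 1
}
```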

@karlkfi (Owner, Author) commented Aug 14, 2020

> When there is a timeout waiting for birth dependencies, no tombstone is written: ShutdownNow() returns an error, which causes fatalf to shut down immediately without writing a tombstone. Since no tombstone is written, there is no way to reproduce the environment in which #8 can occur.

Doesn't that mean the original race condition is solved, because it can't happen any more? If the process never started, there shouldn't need to be a tombstone: the process never started, so it never died. Is there actually a case that might need one, other than a transitive dependency?

> If recording the birth fails, there might be a race where the child has also finished immediately after startup. In that case fatalf would call ShutdownNow(), which would fail with "child process not running". It would still be nice to record the exit code of the child in this case (if that is possible at all, since recording the birth already failed).

The writes probably need retry loops, if Go doesn't do that already within the std lib.
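
Go's standard library won't retry failed writes on its own, so the retry would have to be explicit. A minimal sketch of what such a loop could look like (writeFileWithRetry is a made-up helper, and the tombstone content here is a placeholder):

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// writeFileWithRetry retries a write a few times with a growing delay
// before giving up, wrapping the last error for the caller.
func writeFileWithRetry(path string, data []byte, attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = os.WriteFile(path, data, 0o644); err == nil {
			return nil
		}
		time.Sleep(time.Duration(i+1) * 100 * time.Millisecond)
	}
	return fmt.Errorf("writing %s failed after %d attempts: %w", path, attempts, err)
}

func main() {
	if err := writeFileWithRetry("/tools/a", []byte("placeholder tombstone\n"), 3); err != nil {
		fmt.Println("Error:", err)
	}
}
```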

@RemcodM (Contributor) commented Aug 15, 2020

> Doesn't that mean the original race condition is solved, because it can't happen any more? If the process never started, there shouldn't need to be a tombstone: the process never started, so it never died. Is there actually a case that might need one, other than a transitive dependency?

Wouldn't that defeat the purpose of kubexit in the first place? If service A has a birth dependency on B, and B fails to start within the timeout, A would stop and never write a tombstone (because the process never started). B then starts after the timeout, cannot detect that A has stopped, and will continue to run.

Kubexit was created with Kubernetes Jobs with sidecar containers in mind. If B is the 'sidecar' container here, B will never stop if it has not started in time. The Kubernetes Job will run forever, which in my humble opinion defeats the purpose of this project in the first place.

@karlkfi (Owner, Author) commented Aug 18, 2020

You're right. Thanks for spelling it out.

We need to write a tombstone even if the process didn't start, so that others can still have a death dependency on it.

I think we can also simulate this in a test with a sleep before running kubexit.
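
A sketch of what that test could look like (the KUBEXIT_* env var names follow kubexit's README as I understand it; the binary being on PATH and the tombstone file content are assumptions):

```go
package main

import (
	"os"
	"os/exec"
	"path/filepath"
	"testing"
	"time"
)

// TestPreexistingTombstone writes b's tombstone into the graveyard
// before a's kubexit even starts, then asserts that kubexit notices
// the pre-existing death and exits instead of running forever.
func TestPreexistingTombstone(t *testing.T) {
	graveyard := t.TempDir()

	// b "died" before a's kubexit started. Stand-in content: the real
	// tombstone format may differ.
	tomb := filepath.Join(graveyard, "b")
	if err := os.WriteFile(tomb, []byte("died: true\n"), 0o644); err != nil {
		t.Fatal(err)
	}

	cmd := exec.Command("kubexit", "sleep", "60") // assumes kubexit on PATH
	cmd.Env = append(os.Environ(),
		"KUBEXIT_NAME=a",
		"KUBEXIT_GRAVEYARD="+graveyard,
		"KUBEXIT_DEATH_DEPS=b",
	)
	if err := cmd.Start(); err != nil {
		t.Fatal(err)
	}

	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()
	select {
	case <-done:
		// kubexit exited because b was already dead: the fix works.
	case <-time.After(10 * time.Second):
		t.Fatal("kubexit did not exit despite a pre-existing tombstone")
	}
}
```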

This avoids a race condition where death dependencies that were already dead did not cause kubexit to exit.
- Switch to JSON log formatting
- Add log timestamps
- Add container_name to all log messages
- Add support for Stackdriver severity field
- Pass the logger (with "global" fields) through the context
@RemcodM (Contributor) commented Aug 26, 2020

Any progress on this PR?

@karlkfi (Owner, Author) commented Aug 27, 2020

Sorry, man. I got a test added, and it's failing for the reasons you specified. I need to either adopt your change or figure out an alternative. But I'm going on vacation till next week, when I'm starting a new job. So you're probably gonna need to run your own patch for a while. Hopefully I can get back to it in a week or so.

Thanks so much for your feedback and PRs!

@karlkfi closed this Jul 2, 2021
@karlkfi reopened this Jul 2, 2021
@agrrh commented Jul 9, 2021

Just wanted to say that we've faced the same issue.

How it looks to me:

  • For some reason, the primary process starts seconds after the dependent sidecar process
  • The sidecar process never exits 😞

Should we add some kind of wait-for loop for dependent processes, to ensure the primary's tombstone is there?
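
Something like this is what I mean, as a rough sketch (waitForTombstone is a hypothetical helper, not part of kubexit):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// waitForTombstone polls the graveyard for a dependency's tombstone
// until it appears or the timeout elapses, instead of relying only on
// file-watch events that may have fired before we started.
func waitForTombstone(graveyard, name string, timeout time.Duration) (bool, error) {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if _, err := os.Stat(filepath.Join(graveyard, name)); err == nil {
			return true, nil // tombstone exists
		} else if !os.IsNotExist(err) {
			return false, err // unexpected filesystem error
		}
		time.Sleep(time.Second)
	}
	return false, nil // timed out waiting
}

func main() {
	found, err := waitForTombstone("/graveyard", "a", 30*time.Second)
	fmt.Println("tombstone found:", found, "err:", err)
}
```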

@tekumara commented

@RemcodM @karlkfi How did you go? I'd be happy to test this out if it helps.

Successfully merging this pull request may close these issues.

Kubexit does not shutdown child when tombstone is written before it is started