Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Exec provider potential deadlock #1457

Merged

Conversation

avin3sh
Copy link
Contributor

@avin3sh avin3sh commented Nov 18, 2023

The current use of WaitForExit() before trying to read standard output stream is problematic and may cause deadlock scenario. WaitForExit() should be generally called after all other methods have been called on Process (ref).

In existing scenario, if the child process has already written enough output to the stream, the parent process will be keep on waiting for the child process to exit and the child process would not exit as the stream has not been read from completely.

WaitForExit should have been called after all other methods are called on the process
Copy link

linux-foundation-easycla bot commented Nov 18, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot
Copy link
Contributor

Welcome @avin3sh!

It looks like this is your first PR to kubernetes-client/csharp 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-client/csharp has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 18, 2023
@IvanJosipovic
Copy link
Contributor

Hey,

I think this PR might cause some issues. The ReadToEnd() call is blocking and will not end if only an STD Error is returned as the timeout is ignored.

See this comment and thread on why this was built this way, #1267 (comment)

@avin3sh
Copy link
Contributor Author

avin3sh commented Nov 18, 2023

Let me go through some test scenarios and get back with any tweaks if required. Exec hanging is what motivated this PR :)

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 20, 2023
@avin3sh
Copy link
Contributor Author

avin3sh commented Nov 20, 2023

The ReadToEnd() call is blocking and will not end if only an STD Error is returned as the timeout is ignored

I think I understood the concern pointed here. The TimeOut was indeed not being honored. @IvanJosipovic what do you think of the current design ? I had to read the stream asynchronously, I don't like the idea of using global event handler here but we are already doing it for exec error, so did not want to deviate by introducing thread.

This should also handle the case of streaming output that was pointed out in one of the PRs I went through.

The workflow diagram in #1267 (comment) was very handy and I looked at the KubeUI usage to understand the requirement better :)

@rrudduck
Copy link

rrudduck commented Nov 26, 2023

I can confirm that these changes fix the deadlock issue I was experiencing. Prior to the changes in this PR, KubernetesClientConfiguration.BuildDefaultConfig() would hang and eventually throw a timeout related exception. With these changes, it works as expected.

@IvanJosipovic
Copy link
Contributor

IvanJosipovic commented Nov 28, 2023

I tried this approach while working on #1267, using process.OutputDataReceived can cause some data to be lost as the BeginOutputReadLine call happens after the process is started. :(

See, dotnet/runtime#18789

@avin3sh
Copy link
Contributor Author

avin3sh commented Nov 28, 2023

@IvanJosipovic FWIW, I have been using changes from this PR extensively for past couple of days, and I haven't run into the issue. The linked runtime issue does suggest "correct" way to do this - i.e. calling the other overload afterwards. I will (1) verify this is indeed a problem here, and if it so (2) include the additional changes and verify the stability.

There is also dotnet/runtime#32456 which I will ensure we don't step into from my changes

Let me know if you have any other concerns.

@codecov-commenter
Copy link

codecov-commenter commented Dec 4, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (master@1ace761). Click here to learn what that means.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #1457   +/-   ##
=========================================
  Coverage          ?   70.59%           
=========================================
  Files             ?       89           
  Lines             ?     2663           
  Branches          ?      555           
=========================================
  Hits              ?     1880           
  Misses            ?      781           
  Partials          ?        2           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@avin3sh
Copy link
Contributor Author

avin3sh commented Dec 13, 2023

/easycla

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Dec 13, 2023
@avin3sh
Copy link
Contributor Author

avin3sh commented Dec 13, 2023

@tg123 @brendandburns @IvanJosipovic and anyone following

I am done with my changes and have been using it locally for a while - I haven't faced any problems and the issue with Exec freezing until the timeout has gone away. I do not have any more changes to make.

@tg123
Copy link
Member

tg123 commented Dec 13, 2023

overall LGTM
@IvanJosipovic @brendanburns any more suggestions?

@avin3sh
Copy link
Contributor Author

avin3sh commented Jan 3, 2024

@IvanJosipovic @brendanburns any more suggestions?

Happy new year all! 😄 Friendly ping for comments/approval 🤞

@avin3sh
Copy link
Contributor Author

avin3sh commented Jan 11, 2024

I am using my local build of the kubernetes client for monitoring our Windows Kubernetes Cluster. Our dependency is increasing and it would be nice to see this fix in a production release. I also believe this fix potentially resolves #1346.
@tg123 please let me know if there is anything else needed in my PR :)

@tg123
Copy link
Member

tg123 commented Jan 13, 2024

@avin3sh lets allow @IvanJosipovic @brendanburns more time to approve

@brendandburns
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: avin3sh, brendandburns

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 1, 2024
@k8s-ci-robot k8s-ci-robot merged commit 140d7de into kubernetes-client:master Feb 1, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants