-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Early loop detector: dependency loop remover from cache #12
Conversation
Can I double Williams suggestion to squash commits, it will be much easier to review and understand them. Moreover it will make project history clean and maintainable. Please try give a try to |
@williamh, how can I print errors outside of librc? It requires to change it's API/ABI to pass back error strings. Is that ok? |
@qnikst, I've squashed all commits. |
Thanks. But all commits was definitely an overkill. |
@qnikst, I can break to a few commits. But I don't know how, because this commit presents only one feature and logically cannot be posed by parts. |
Then everything is ok, thanks. |
I've prepared presentation in PDF: https://github.com/xaionaro/documentation/raw/master/openrc/earlyloopdetector/early-loop-detection.pdf |
I hope @heroxbd will pass this patch into Debian's OpenRC. Also I hope that such test will convince Gentoo Developers about the patch efficiency :) |
I've updated the solver. The presentation is expected soon. I hope this's the last fix. |
I've updated the presentation [1]. |
19:00:56 < heroxbd> * Solving the loop by breaking single u> rmnologin. 19:00:56 < heroxbd> * Solving the loop by breaking single a> rmnologin.
Added option "rc_loopsolver_enable" option to be able to enable or disable the early dependency loop solver.
Here are my thoughts about taking this upstream. I see a lot of brace nesting that is not part of control structures like if/while/for/etc, why is that done? Adding the rc.conf switches the way they are implemented breaks the public API by changing the call to rc_deptree_update. Also, I don't know of a reason you would want this to be configurable. I recommend always having it on. Librc code uses fprintf(stderr, ...) to print some messages, so you can use that to print out warnings for the dependency issues; you don't need to include einfo.h I am not familiar with search.h, so is it available in environments other than glibc? What about on the bsds and Minix? |
Yes. Ok, you right. I'll do it.
It's just a way to:
I can remove it. It doesn't matter. It's just my vision of good coding practice :).
You mean option
Ok. :)
It's "SVr4, POSIX.1-2001" according to the manpage of "man 3 tsearch". |
I am referring to rc_loopsolver_enable and rc_loopsolver_warnings. Dependency issues can be severe enough that they can lock up your boot process if rc_parallel=yes, so I pretty strongly believe that this functionality should always be on so people get the warnings when the cache is built. |
One more thing I thought about. Please do not automatically break any dependencies. Instead, warn about them. I would rather let users break dependencies themselves than make the decision about how to break them. |
FYI, rc_loopsolver_warnings is introduced when there are too many loops in Debian LSB scripts around $all dependency. The boot console gets cluttered. It was a request from zigo. |
Closes Debian-bugs #763681, #770459.
It's not required anymore.
Here are more suggestions for cleaning up this code:
|
There is another option we could consider besides breaking loops |
Hm. It's harder. The patchs works only on stage of preparing deps cache. How to fail a service on this stage? It's required to add a code into the boot stage to recheck loops and fail services. I can do that in this year. However as for me we should leave possibility to solve loops automatically. I can make it configurable (no breaks / break only "uses" / break "uses" and "afters"). |
Hmm, let's not worry about trying to throw out the loops then. WRT breaking loops automatically, I'm still not quite with you, because breaking loops will definitely leave things in an inconsistent state. If a has after b and b has after a for example, there is no possible way to determine whether a or b should start first, so if you start a you could be wrong, and vice versa. The same is true for any other loop. |
I understand examples. Can you give a specific (concrete?) example when the loop solver will harm?
Disagree. Usually (in my experience) loops contains all dep types. So the "use" dependency will be broken, while "after" and "need" will be kept.
I think our discussion about this issue is not constructive. We both didn't changed our minds not a jot. May be the language barrier doesn't allow me to clearly explain my vision. Or may be I'm just to stupid to understand your vision. IDK. |
P.S.: I think you just should decide to remove the loop solver and don't waste your time on me. Honestly. :) I mean it's obvious that you known what is good for OpenRC much-much better, anyway. And you shouldn't waste you time on to proving something to somebody. I'll respect any dicision. |
I don't feel like I'm wasting time; we both agree that there is definitely a problem that needs to be fixed, so hang in there with me. :-) Have you been able to put together the text description of how your code works? That may make it make more sense to me. William |
@xaionaro, Hey Dimitry, long time no see. MIssed you a lot. Glad to tell that we have OpenRC 0.18.3 in Debian. Your updates has resolved all the build issues on ppc64el and kFreeBSD. @williamh, William, sorry for being idle for long. The point of breaking a loop is not only for Debian, but to make OpenRC more robust against configuration mistakes. Such a mistake as dependency loop can be fetal (unable to boot). Dimitry has made the point that the loop solver breaks the weakest dependence in the loop, which makes sense. The only concern for me is the code complexity, which I think, yes, it should be addressed cooperatively. |
Hello. Sorry that I still didn't fix the coding style of the patch as @williamh requested long time ago. I just have no time. It's quite bad years in my university right now (partly due to geopolitical problems), so I have to work a lot and more. :( |
I suppose we should do something with this pull request. |
I'm against trying to remove loops behind the sceens; we have no way of knowing for sure what the relationships of the services should be. I would rather just warn about them and allow users to fix them. |
For me it's ok but I worry about "rc_parallel=no" by default. It's not normal for modern init system. For my opinion, you can reject this pull request. |
I can add an option to not break loops automatically, today if required. On the one hand this patch was written for Debian and it works there well enough, so as for me mission accomplished. On the other hand, I'd like to be able to say people that OpenRC can safely boot in parallel mode (and do not hang), so I'd like to see this patch be merged to the upstream code. |
At least it's good functional. But I don't know what thinking @williamh about it.
If init system don't know about relationships then who knows? My main goal enables parallel by default. |
@xaionaro I don't want to break loops at all, so please just have this patch print error messages if loops are detected; I think loops indicate bugs which should be fixed by the init script authors. Also, please rebase the patch on current master. I'm definitely interested in this, so I'll take a look once it is updated. @stalkerg What I meant above is, if we get into a deadlock situation, there is not really a smart way to resolve it automatically. We don't really know what services do, so we can't really decide which service to start first to break the deadlock. I think it is a better choice to warn that there is a loop when we are building the deptree but let the script author take care of it. |
Maybe it's destiny but I have catch deadlock with parallel=on after upgrade to 0.24. |
I am uncomfortable with this pr because we should warn admins about loops and make them resolve the loops themselves rather than randomly drop dependencies. If you want to do that, feel free to update this and re-open it. Also, please avoid breaking the API. Thanks much. |
Related to: https://bugs.gentoo.org/show_bug.cgi?id=391945
This detector just calculates loops, solves weak loops (with at least one use or after dependency), and print information about all found loops (including unsolvable).
Solving is just breaking the loop with removing one or few use/after dependencies.
This detector cannot solve loops fully based on "need" dependencies. To prevent hang up in this case, it's need to use the later loop detector (#13) or solve it manually (sysadmin will be notified about any loop).
Presentation in PDF: https://github.com/xaionaro/documentation/raw/master/openrc/earlyloopdetector/early-loop-detection.pdf