-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[decisions] sequences of kdbGet and kdbSet operations #4574
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As stated previously in #4554 (comment), I think this decision must also define the meaning for all allowed sequences. Basically, we just need to define a call kdbSet(..., ks, parent)
always means replace everything below parent
with the contents of ks
(below parent
), modulo mountpoints (*). We use a more complex definition, but this covers the current situation and is pretty clear.
(*) To be technically accurate have to take mountpoints into account. Only where a mountpoint exists will data be replaced, everything that falls outside a mountpoint is ignored or causes an error.
IMO option 2 or 3 is the best in terms of effort-to-benefit ratio. Option 1 is easy, but also puts a lot of burden on developers using Elektra and allows easy mistakes. Option 4 might be the best overall IMO, but it is a lot of effort to implement.
@markus2330 what are your thoughts about this issue? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I still don't fully understand the problem. There were some inconsistencies with terminology, which distracted me. And an example would be very valuable (of concrete KeySets (a), (b) and (c) and the changesets). Please also specifically explain why the change set is calculated from the KeySet as passed from the user and not as received from the plugin.
doc/decisions/operation_sequences.md
Outdated
@@ -0,0 +1,92 @@ | |||
# KDB operation sequences |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be same as file. I think Illegal Sequences
are a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not. This decision is not pureley about illegal sequences, but as @kodebach mentiones also about the semantical meaning of allowed sequences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this is a good idea. If there is also unclear semantics, this should be in separate decisions. Are these semantics also of relevance for #4554? It is important that we find out the connections between the decisions, so that we solve them in the right order.
Btw. you are doing a great job here. This is really an operation in the heart of Elektra. 💟
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my recent response in the root thread of this PR. The "semantics" don't refer to the semantics of kdbGet
/kdbSet
on their own, but purely to the semantics of sequences, i.e. how they interact when used on the same handle.
If this is not the case, `kdbSet` will abort and report an error in `parentKey`. | ||
Developers might still wrongly mix the sequences, but they will get an error and have to fix it. | ||
|
||
- Adding a `Key * lastGetParent` to `struct _KDB` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe give a reason as well? Just expressing a feeling doesn't help anybody.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that we get too much state/too many illegal sequences. To e.g. force the same parent key or keyset breaks the whole idea of separating these 3 classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not too bad. If you always do a kdbGet
, then modify the keyset and then do kdbSet
it works fine. In fact reducing the time between kdbGet
and kdbSet
is good for avoiding conflicts anyway. There just can't be an unrelated kdbGet
in between.
However, I agree, if there is a better solution, we should avoid additional restrictions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it is, from the given options, the least-annoying restriction. But IIUC kdbSet would always need a ksDup on the user-side, so that a 3-way merge can be done (and IIUC we normally won't have failures anymore in kdbSet but a successful kdbGet would tell us about conflicts...).
@atmaxinger can you please give an example how you imagined normal kdbSet usage (with conflict handling) to be with this restriction? Are my "IIUC" statements correct?
In fact reducing the time between kdbGet and kdbSet is good for avoiding conflicts anyway.
I wouldn't recommend it, as it only decreases the likelyhood but doesn't give you any guarantee.
What is recommended (and should be better documented) is:
- always immediately call kdbGet on external changes (e.g. dbus notifications)
- always immediately call kdbSet on internal changes (e.g. the user changed the config)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So conflict handling would become more complicated, as there are two scenarios.
I don't see what changed. Everything you described should also apply equally to the current situation and to the new situation using option 2 or 3 from this decision. The only difference is in the "new situation" there is a new way for kdbSet
to fail. When the keyset/parentKey doesn't match. But that is resolved just like the existing conflicts you described. Call kdbGet
again (this time use the correct keyset) and try kdbSet
once more.
See also #4574 (comment)
It is wrong as it is, as the loop only makes sense if kdbSet failed because of an conflict, which isn't necessarily the case in this code.
But that doesn't actually make the while
wrong. That only means there is a missing if (hasConflict)
around the while
.
But there are even two ksDup in this example:
Line 63 in 664bc1d
theirs = ksDup (ours); And this second one is needed to have theirs and ours.
- This comment chain is still on option 2, where the
parentKey
must match the previouskdbSet
. Which makes the solution much simpler than for option 3. But again, see [decisions] sequences of kdbGet and kdbSet operations #4574 (comment) for a solution that should also work for this option 2. - I'm 99% sure in the current
master
situation the example doesn't need thisksDup
at all. At least not for 3-way merge reasons. It could just betheirs = ksNew (0, KS_END)
ifkdbGet
would work as expected (It doesn't seedoc/decisions/internal_cache.md
).
We couldn't get it within the loop if we would have done the kdbGet already before kdbSet, at least in situation (1) of process examples above.
I think #4574 (comment) should also cover this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very messy discussion (also my fault), @atmaxinger please update the decision and let us go from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good example of why I suggested in #4562 (comment) that longer discussions should be moved to the root thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh god ... where to even begin with summarizing this ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh god ... where to even begin with summarizing this ...
This is why it is so important to constantly update and also lead discussions to directions that actually clarify questions relevant for you.
- Doing `keyCopy (handle->lastGetParent, parentKey, KEY_CP_NAME)` in `kdbGet` | ||
- And checking `keyIsSameOrBelow (handle->lastGetParent, parentKey) == 1` in `kdbSet` | ||
|
||
3. Enforce that `ks` used in `kdbSet` is the same as in the last `kdbGet` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ties a KDB to one ks, which is imho a too strong coupling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it just ties kdbSet
to the KeySet returned by the previous kdbGet
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ties a KDB to one ks, which is imho a too strong coupling.
Can you give more details on this opinion? IMO tied a KDB
to exactly one KeySet *
(via option 4) make things much simpler for everybody and makes the API very clear to users. The data is owned by the KDB
instance, so you shouldn't expect anything that KDB
doesn't tell you.
Are there specific use cases where you using a separate KDB
per KeySet
is not possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course it would be always possible if performance and memory does not matter. One problem is that kdbOpen is expensive.
The other problem is: It is quite often the case to have two different configs (or more), e.g., one which might get written (could be called state) and another one only read at startup. I think it is a nice design if these different configs are actually in different KeySets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is quite often the case to have two different configs (or more), e.g., one which might get written (could be called state) and another one only read at startup. I think it is a nice design if these different configs are actually in different KeySets.
There is nothing preventing you from doing this in this alternative. You only have to call kdbGet
for your state immediately (*) before you call kdbSet
to write it, that's all.
I think you both are mixing this up with alternative 4 .... (you commented on alternative 3)
(*) by immediately I mean that you can not do a kdbGet
for another part of the KDB in the meantime
@markus2330 the gist of the problem is, that plugins may (and in the case of notification plugins DO) assume that a So in reality, as illustrated by the example, |
Btw. no need for explanations in the comments. Please rewrite and I will reread everything. |
You've asked this question now multiple times. I since I tried to give some explanations previously, I don't think I understand the question. @markus2330 Can you maybe explain what you want to know here? Of course the change-set must be based on the data passed to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@atmaxinger Can you please check, whether using KeySet * keys
in the struct _BackendData
values of the KeySet * backends
in struct _KDB
is possible?
I think before the first backendsDivide
in the keys in those keysets should still be a deepdup copy of the data from the last kdbGet
(divided by backend). If I remember correctly, those keys are replaced during kdbGet
and kdbSet
, but only below the parentKey
. So (if I'm correct) in the sequence
Key * keyA = ...;
Key * keyB = ...;
// assume that neither A below B nor B below A
KeySet * a = ...;
KeySet * b = ...;
kdbGet(kdb, a, keyA); // (A)
kdbGet(kdb, b, keyB); // (B)
kdbSet(kdb, a, keyA); // (C)
At (C) we should see the keys loaded by (A) and (B), with (B) overwriting any overlapping parts. If that's the case, we can use the data from all the KeySet * keys
(with a backendsMerge
) before the first backendsDivide
to know what was read during kdbGet
.
- Doing `keyCopy (handle->lastGetParent, parentKey, KEY_CP_NAME)` in `kdbGet` | ||
- And checking `keyIsSameOrBelow (handle->lastGetParent, parentKey) == 1` in `kdbSet` | ||
|
||
3. Enforce that `ks` used in `kdbSet` is the same as in the last `kdbGet` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ties a KDB to one ks, which is imho a too strong coupling.
Can you give more details on this opinion? IMO tied a KDB
to exactly one KeySet *
(via option 4) make things much simpler for everybody and makes the API very clear to users. The data is owned by the KDB
instance, so you shouldn't expect anything that KDB
doesn't tell you.
Are there specific use cases where you using a separate KDB
per KeySet
is not possible?
If this is not the case, `kdbSet` will abort and report an error in `parentKey`. | ||
Developers might still wrongly mix the sequences, but they will get an error and have to fix it. | ||
|
||
- Adding a `Key * lastGetParent` to `struct _KDB` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe give a reason as well? Just expressing a feeling doesn't help anybody.
Yes, obviously the changeset calucation algorithm uses the data from the user in the case of kdbSet. My question is of which relevance the KeySet passed from the user in the case of kdbGet is? Isn't only data as emitted from the storage plugin of relevance? This should be clearly explained in the problem. |
The data passed into But from "the storage plugin" is also wrong, because in all likelihood there are multiple backends involved and the data may change in However, we do need to know the data that was returned by the Of course we could define differently, how the diff is calculated (e.g. between |
As this seems to get nowhere, I just openly propose to do option Nr 1 - document that it isn't allowed to do |
About a week ago I said that the problem in both decisions are still unclear. I am still waiting for you to correct it and inform me when it is done.
We didn't have change tracking up to now, which is a big difference. Please investigate carefully of what internal and external notification does, what the differences to your proposed change tracking are, how a single algorithm could serve as basis for all three and which consequences this has on various parts of the framework. Right decisions in the framework is the center of your thesis. I'll definitely not agree to any rash decisions here. The process is quite short (two iteration rounds) but it obviously depends on clearly written decisions, which simply was not the case when I read this decision the last time. |
I think it might help everyone, if you ask specific questions about the things that are (still) unclear. Especially we both have tried to explain a few things since you said this (e.g. what KeySet the changes are calculated from). I have to side with atmaxinger here, because from my past experience with other decisions, it can be quite hard to work with some of your reviews. If you just state "I don't fully understand, things are unclear" without any specific questions, we don't really have any idea what to explain more. We can only randomly guess what to explain more. In this particular case, I think it would be very helpful if this #4574 (comment) answered your questions. I guess because you didn't respond to this at all, @atmaxinger said "this seems to get nowhere". Maybe this #4574 (comment) also helps... |
I wrote quite a few specific points where I want clarification. As said in #4554 (comment) I need to refuse to read any explanations, otherwise the decisions will not get into a state where it is understandable by anyone except the ones who participated in the discussions. We want to avoid documentation where external information is needed to understand what is written. If it is unclear to @atmaxinger what to correct, it makes more sense if he asks me directly. |
True, but I had to chime in, because I've had pretty much the same issue with you in the past.
AFAICT we addressed everything you asked for. Not in the decision file. That's true. But like I said, it would be really helpful, if you could just read the comments and say "Yes, that explains it. That needs to be in the decision file" or "No, it's still unclear". In particular, I think you should say whether this suggestion is good enough, or you need more.
To use a positive example from the past as illustration: In #4246 (comment) I suggested a new sentence to clear up some confusion. You said, "yep, that's good" and I knew and I can just add that information to the file. |
@markus2330 I think the major issue in this decision is, that neither I nor @kodebach know what is unclear for you. For me, it is clearly stated that currently there can occur problematic sequences of We have also provided a good, minimal example showcasing such a problematic sequence, and added a text that explains in detail what goes wrong, where and why. From my POV we can't make it anymore obvious.
The only thing I could find where you had a specific question was the textual explaination of the example. And yes, that has been updated. For your other comments to the alternatives ... "Don't like this" is a fine and valid comment, but I have no clue on how to address this? It's a considered alternative, so we put that in there. But there is really nothing I can do to address "Don't like this".
Yes we did. It was implemented multiple times in the
That's what I'm doing currently. I even replaced the current change-tracking implementations in |
This is the nature of something being unclear. As long as I don't understand what you understand, I could only give interpretations of what you want to say. This way you would learn what I think, and not how to think/speak more clearly yourself. So I don't do it. I don't require you to proof the absence of problematic sequences. I mainly requested a complete example of what exactly is broken in which situations. This shouldn't be too difficult. It is like writing a bug report. And I asked that this gets updated in the PR, not said in discussions. I assumed that you will inform me when it is done. To avoid the assumption: Please inform me when I should reread.
My fault, I should have stopped reading after the unclear problem. |
This is absolute and utter bullshit. Sorry to be so blunt, but it's true... We don't know what is unclear to you, so we can only guess and add random extra bits of information that may or may not help. Worst case the extra info confuses even more and adding it was a massive waste of time. You, however, can tell us much more than you are doing.
"unclear" is just a far too vague word to help anyone. What would you do, if I just told you Also, your idea, of "I don't tell you what I understood, because then you'll just learn to explain things to me and not be more clear in general" sounds good, but is bullshit. First of all, there are already two people (atmaxinger and I) who understand the current text. So unless there somebody else reads the text and also doesn't understand it, you are the one we need to cater to. Therefore, we actually need to know how you think, because clearly it's different from how we think. Second, we need to know what the readers (including you) interpreted from the text, to learn what we need to improve. Finally, the idea is basically the same as a compiler printing only "Syntax Error" without any context and expecting the programmer to fix it.
There is already one example in the document that shows a problematic sequence. This was always there. Since your last review, it has been expanded a bit. That was the suggestion (#4574 (comment)) I made and you refused to comment on. @atmaxinger has merged this suggestion since then and even told you in his last comment about it:
Honestly, I'm starting to believe you want to make us suffer and intentionally only read half of the comments.
NO, oh god no. Stopping to read doesn't help at all. Reading further is a good thing. Maybe the order of the text is just a bit wrong and there things get clearer when you read further. And if you do stop at some point. At least tell us, where you stopped so we know the problem is before that point. |
@markus2330 I think we have now clarified the example a bit better. I have also illustrated it with a new example. I have also added a short text to explain WHY plugins may choose to store the result of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are slowly coming to an understandable problem. Several points, and in particular also scope are still unclear.
abda4a0
to
fe976d1
Compare
@markus2330 I have now updated the decision with all your remarks. I also included real, runnable code (including setup setps) that exaclty reproduce the problem as described. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One specific problem is now clear but the presentation of the problem still could be better. Scope is still not clear, why is it not specific to dbus, what would need to be changed that it would work with dbus (as @kodebach requested something like this for 5.) and how would a framework, which implements this fix, look like?
To address the 3-way merge use case in general: AFAICT only option 3 and 4 might affect the 3-way merge case. For option 4 (internal keyset and // tries to do a kdbSet
// on conflict uses internal data in kdb to get base for merge then calls kdbGet to retrieve theirs
int kdbMerge (KDB * kdb, KeySet * ours, Key * parent, int strategy); In fact creating such a function would be possible and might make sense for all options. For option 3, 3-way merge also works. But the code needs to be slightly different. Here is a pseudocode example: kdbGet (kdb, data, parent);
base = ksDup(data);
// do other things, may include kdbGet with other keyset
// and modify data, e.g.
ksAppendKey (data, newKey);
kdbSet (kdb, data, parent);
// Note: conflict could also be kdbGet with different keyset inbetween
if (has_conflict) {
ours = ksDup (data);
while (should_merge) {
kdbGet (kdb, data, parent);
theirs = ksDup (data);
merged = elektraMerge (ours, theirs, base, parent, strategy, parent);
// move from merged to data
ksClear (data);
ksAppend (data, merged);
kdbSet (kdb, data, parent);
// not done in the current example, but replacing `ours` with `merged` and
// replacing `base` with `theirs` at this point might make sense
}
} I may have missed something, but basically the trick is the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is clear to me now (always was, but I think the text also makes it very clear now). I have few minor corrections, but IMO this can be merged now.
@markus2330 What do you think? Is the problem clear enough that this PR can be merged?
271cf18
to
3223b04
Compare
Co-authored-by: Markus Raab <markus2330@users.noreply.github.com> Co-authored-by: Klemens Böswirth <23529132+kodebach@users.noreply.github.com>
3223b04
to
5524826
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, problem is very clearly described now! Should we first merge this or #4632?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice and detailed description! 👍
I think it's best we merge and move further discussion to a new PR.
Co-authored-by: Markus Raab <markus2330@users.noreply.github.com>
Any reason why this is still draft? |
@markus2330 no, I fixed it now |
@markus2330 who merges this? |
Actually nothing should be merged as the builds fail now and errors can creep in easily.. Help for #4627 is very much appreciated. Furthermore, I still didn't get an answer if #4632 should be merged first. (#4574 (review)) But as you seemingly feel that you are blocked by this, I merge it. |
Basics
(added as entry in
doc/news/_preparation_next_release.md
whichcontains
_(my name)_
)Please always add something to the release notes.
(first line should have
module: short statement
syntax)close #X
, are in the commit messages.doc/news/_preparation_next_release.md
scripts/dev/reformat-all
Checklist
(not in the PR description)
Review
Labels