-
Notifications
You must be signed in to change notification settings - Fork 936
RELEASE the list only when list HEAD reference count is 1 #944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@rppendya I think this looks fine. I'd only ask that you add a comment in the code that describes the behavior, and include something about why we are checking for a refcount of 1 instead of 0. |
|
This basically makes the LIST_RELEASE equivalent to a simple OBJ_RELEASE if the list has multiple refcounts. This double behavior is extremely confusing especially for those that pay attention to the refcounts of their lists (but do not prevent other layers from holding references to them. The scenario you describes in the PR seems to indicate that you try to find a way to not use the refcount for the objects in the list, but to rely on the refcount of the list itself. I suggest that instead of altering the default behavior of the OPAL list, you create your own class derived from opal_list_t, but where the dectructor releases the individual objects (instead of doing nothing as the opal_list_t). |
|
Hmmm...I see your point, but I'm not sure it is quite correct to say the LIST_RELEASE is equivalent to OBJ_RELEASE for opal_list_t. We have always treated opal_list_t somewhat differently than we have other objects in that it doesn't release anything on the list when the opal_list_t is itself released. The only thing that LIST_RELEASE did was to provide a convenience macro for those cases where you do in fact want to release all items on the list as well as the list object itself. So this PR doesn't change anything with respect to that difference. It actually restores the default behavior of our object system - i.e., that the list shouldn't be released until the refcount goes to zero. The fact that LIST_RELEASE wasn't conforming to that default behavior was the source of the confusion, and this PR is intended to resolve that. Bottom line is that this PR doesn't alter the default behavior of OPAL list - it restores it in those cases where LIST_RELEASE is used. It has no impact whatsoever outside of those cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use the terminology "HEAD" to describe OPAL lists, and I'm not quite sure what you mean by "static lists." I'd change the new part of this comment to:
This macro will examine the reference count of the list itself. If it is 1, each object in the list will be OBJ_RELEASEd. Finally, the list itself will be OBJ_RELEASEd.
|
@rppendya Can you please squash down to one commit? @bosilca I tend to agree with @rhc54's comments: this is a convenience macro that was, in fact, incorrect: it always OBJ_RELEASEd items on the list (regardless of whether the list was to be released or not). I'd say that was a bug. If you don't want your list items to be release, you can just OBJ_RELEASE the list (and not use this OPAL_LIST_RELEASE macro). |
|
@jsquyres and @rhc54 I consider your scenario as a broken usage. I do expect a convenience macro where the name contains RELEASE to decrease the ref count of the list itself while also removing (and decreasing the refcount) of all the objects on the list. The result after a call to OPAL_LIST_RELEASE is that the list becomes empty. With the proposed modification the list remains valid as long as there are references on it. In this case the proposed implementation is wrong. Instead, as suggested above, one should modify the destructor of the list and release the elements in the list only when the destructor is called. |
|
@bosilca I guess I disagree with your assessment as I believe you are incorrectly defining "the list". The list consists of an opal_list_t struct plus all the objects attached to it. This is a complete entity and should be treated that way. So when I refount the opal_list_t, I am refounting the complete entity - not just the struct at the top. We could create another convenience macro that has the corrected behavior, or rename this one, but that just means we'll wind up doing a global search_replace to catch the places where it is currently used. I'm not sure what that accomplishes. Perhaps a longer comment that more fully explains the behavior as described in these comments would resolve the concern? |
|
I really think neither the original nor your approach are the correct ones. Each object that gets added to the list, should have its refcount increased by one, and when it gets removed it's refcount should be decreased. Then the list object itself is detached from the objects that are inside, until it gets destroyed and it release it's refcount on the objects contained inside. I am not a fan of making complicated comments to describe corner-cases of objects behavior. Instead I would rather support a more generic fix of the list class itself. |
|
Guess I'm a tad confused. We already do adjust the refcount of the objects being added to a list - that isn't the issue we are addressing here. The problem we are trying to solve is the list object itself disappearing while objects are still "attached" to it. So perhaps I'm misunderstanding your proposal. Are you suggesting that we increment the refcount on opal_list_t whenever something is added to it, and decrease its refcount when things are removed? I can see that as making sense, I guess - if a list has all its objects removed, then we release it once someone calls OBJ_RELEASE on it. Might take a little safety checking in case someone calls release on the opal_list_t while objects are still attached, but that could be done. Is that your proposal? |
|
No, this is not what I proposed. I propose that instead of making another explicit call to remove (and decrease their refcount) for all objects from a list, we put this in the list destructor. As we already handle correctly the object refcount when they get added and removed from the list, when the last reference to the list is released, all objects will be automatically detached and their refcount decreased. If nobody else has a reference to them they will therefore disappear. |
|
Hmmm...when we originally created LIST_RELEASE, you and Brian were opposed because you said there are places in the code where you attach static list items to lists. In those cases, you needed to call OBJ_RELEASE on the opal_list_t, but it absolutely couldn't call release on the individual list items. If we do as you suggest, I agree that it will work for me as we just remove LIST_RELEASE and replace it with OBJ_RELEASE. However, it would require that you explicitly remove any static items from the list prior to calling release. Looking at one example in the code, it looks like you do that anyway - but is that going to be okay everywhere? If so, then I'm fine with changing OBJ_RELEASE for opal_list_t - we were trying to avoid impacting everyone else. |
|
We were cautious because of the possibility to add static objects to these lists. However, if one uses these lists correctly, aka. don't put inside static objects, things would work as expected. I have the impression that by removing the LIST_RELEASE and replacing it with OBJ_RELEASE we are getting a more traditional usage scenario for these objects. I volunteer to make the required changes in the OMPI layer. |
|
kewl - thanks! @rppendya could you please revise this change? Basically, we take the loop in LIST_RELEASE that releases the attached items and move it to the opal_list_t destructor. We can then get rid of the LIST_RELEASE and LIST_DESTRUCT macros, maybe letting people replace it over time so this doesn't get to be too huge of an effort right away. |
…ks-calc-fix v1.10: usnic: fix calculation for number of blocks
To enhance the OPAL_LIST_RELEASE API functionality, we will release the LIST HEAD and it's objects only if the List HEAD's reference count is 1. This helps us to retain the list by increasing the LIST HEAD's reference count in multiple places in code and ultimately release the LIST completely when the LIST HEAD's reference count is 1