HDDS-3249: renew ContainerCache.INSTANCE in order to test it in a fresh state #705
Hi, thank you for figuring out the problem and filing a JIRA for it. However, I am completely -1 on this approach to fixing it, at least if I understand the problem properly, so please let me know if I have misunderstood something. (It would be nice to have a description in the PR; we have a standard format for that, which you can follow to help us better understand what the PR is aiming for.) My understanding is that TestContainerCache fails in some environments when it runs after TestBlockDeletingService. Is that correct? For me, TestContainerCache does not fail and runs fine on its own, so I do not see why it should be modified at all, and I don't see any reason to change ContainerCache itself either. If the problem is that TestBlockDeletingService messes up the state before TestContainerCache runs, then TestBlockDeletingService should clean up that state properly in its teardown.
…erPersistence also set defaultCache to null on BlockUtils.shutdownCache
Hi, thank you for your comprehensive response. Sorry for my mistake in creating the PR; I'm new to this community.
Oh, that gives me a better understanding of the problem, which is the size of the cache. Thank you for enlightening me on that, and for the follow-up. Also, thank you for adding the PR description. This approach is much better. I still have some concerns, but their roots are not in the fix you provided.
From here on, this is just my opinion, and I would like to understand what you think about it. I would question whether we need this test at all. My argument is that in these cases we are testing that LRUMap behaves as documented, which is not in the scope of our test. Our TestContainerCache should ensure that we extended the functionality of LRUMap correctly, so we should test the removeLRU method. That is basically a test of ReferenceCountedDB.cleanup, which again is not in scope for this class. However, a test that a reference we did not close is not removable is good to have, to ensure this invariant holds even if the functionality changes.
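The invariant mentioned above (an entry whose reference was not released must not be evicted) can be illustrated with a small JDK-only sketch. This is not the Ozone ContainerCache, LRUMap, or ReferenceCountedDB code: `CountedHandle` and `RefCountedLruCache` are hypothetical names, and the eviction policy shown (skip referenced entries, evict the least recently used unreferenced one) is an assumption made for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a reference-counted DB handle. */
class CountedHandle {
    private int refCount;
    private boolean closed;

    void acquire() { refCount++; }

    void release() { if (refCount > 0) refCount--; }

    /** Closes the handle only if nobody holds a reference; returns true on success. */
    boolean cleanup() {
        if (refCount > 0) {
            return false;
        }
        closed = true;
        return true;
    }

    boolean isClosed() { return closed; }
}

/** Hypothetical size-bounded LRU cache that refuses to evict referenced handles. */
class RefCountedLruCache {
    private final int capacity;
    // accessOrder = true: iteration goes from least to most recently used.
    private final LinkedHashMap<String, CountedHandle> map =
        new LinkedHashMap<>(16, 0.75f, true);

    RefCountedLruCache(int capacity) { this.capacity = capacity; }

    /** Returns the handle for the key with its reference count incremented. */
    CountedHandle get(String key) {
        CountedHandle handle = map.get(key);
        if (handle == null) {
            evictIfFull();
            handle = new CountedHandle();
            map.put(key, handle);
        }
        handle.acquire();
        return handle;
    }

    /** Evicts the least recently used entry whose cleanup succeeds, if the cache is full. */
    private void evictIfFull() {
        if (map.size() < capacity) {
            return;
        }
        Iterator<Map.Entry<String, CountedHandle>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue().cleanup()) {
                it.remove();
                return;
            }
        }
        // Every entry is still referenced: nothing is evicted and the map grows.
    }

    boolean contains(String key) { return map.containsKey(key); }

    int size() { return map.size(); }
}
```

A test in this spirit would acquire a handle without releasing it, fill the cache, and then check that the referenced key is still present while an unreferenced one was evicted and closed.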
…s to chaos testing. (apache#438)
Your description is very helpful to me, thanks a lot. I agree with you that testing isFull for ContainerCache is somewhat weird, as it relates to LRUMap, which is out of scope for this test.
Great, thank you for the follow-up. +1 for the current state (non-binding). I completely agree that we can fix this problem by just removing the isFull assertion, as that tests LRUMap functionality, and follow up in further tickets for additional improvements. About LoadingCache: with a LoadingCache that has just a size restriction, we can have the same cache functionality if we get rid of reference counting, which I think we do not need. We use the instances only for short periods of time, so most of the time they sit in the cache with a zero reference count and can be evicted. If we still want to handle the extreme case where the cache is under pressure and all items in it are referenced, we can use weakValues() in the Guava cache to prevent eviction in this case, but this does not seem relevant given how the cached items are used.
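To make the "size restriction only, no reference counting" idea concrete, here is a rough sketch that uses only the JDK's `LinkedHashMap` as a stand-in for Guava's LoadingCache, so that it runs without extra dependencies. `BoundedLoadingCache` and its methods are hypothetical names, not Guava or Ozone API; with Guava itself, the size bound would be set on the cache builder instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical size-bounded loading cache without reference counting. */
class BoundedLoadingCache<K, V> {
    private final Function<K, V> loader;
    private final LinkedHashMap<K, V> map;

    BoundedLoadingCache(final int maxSize, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict purely on size; no entry is pinned by a reference count.
                return size() > maxSize;
            }
        };
    }

    /** Returns the cached value, loading and caching it on a miss. */
    synchronized V get(K key) {
        V value = map.get(key);
        if (value == null) {
            value = loader.apply(key);
            map.put(key, value);
        }
        return value;
    }

    synchronized boolean isCached(K key) { return map.containsKey(key); }

    synchronized int size() { return map.size(); }
}
```

The trade-off in this sketch is the one discussed above: a caller may still be using a value after it has been evicted, which is only acceptable if values are held for short periods or remain safe to use after leaving the cache.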
…out SCM. (apache#656)" This reverts commit 281faf3.
Thanks for your explanation. This would be great. I opened a ticket in Jira here:
…OzoneChaosCluster. (apache#711)
…tainer commands to Datanode. (apache#712)
I mistakenly rebased master onto this branch, so I think it would be better to create another PR: #737
What changes were proposed in this pull request?
In BlockUtils.shutdownCache(ContainerCache), set the default cache instance in ContainerCache to null.
Clean up ContainerCache in the teardown of TestContainerPersistence and TestBlockDeletingService.
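The effect of nulling the default instance on shutdown can be sketched as follows. This is an illustration only, not the actual ContainerCache or BlockUtils code: `SingletonCacheSketch`, `getInstance(int)`, and `shutdown()` are hypothetical names, and the assumption is that the shared instance is created lazily with whatever size the first caller requested.

```java
/** Hypothetical lazily created singleton cache; not the actual Ozone ContainerCache. */
final class SingletonCacheSketch {
    private static SingletonCacheSketch instance;
    private final int capacity;

    private SingletonCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Creates the shared instance on first use; later calls ignore the requested capacity. */
    static synchronized SingletonCacheSketch getInstance(int capacity) {
        if (instance == null) {
            instance = new SingletonCacheSketch(capacity);
        }
        return instance;
    }

    /** Drops the shared instance so the next getInstance call builds a fresh one. */
    static synchronized void shutdown() {
        instance = null;
    }

    int capacity() {
        return capacity;
    }
}
```

Under these assumptions, a test that asks for a small cache after another test has already created a large one gets the stale large instance, unless the earlier test called `shutdown()` in its teardown.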
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-3249