Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEODE-6652 : Change in checks for next dir in disk store #3729

Merged
merged 3 commits into from Jun 26, 2019

Conversation

alb3rtobr
Copy link
Contributor

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • [n/a ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

Note:

Please ensure that once the PR is submitted, check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.

Extra info:

After implementing the test, I added some logs to PersistenOplogSet which are not included in the commit:

diff --git a/geode-core/src/main/java/org/apache/geode/internal/cache/PersistentOplogSet.java b/geode-core/src/main/java/org/apache/geode/internal/cache/PersistentOplogSet.java
index a59bde6552..47949fa813 100644
--- a/geode-core/src/main/java/org/apache/geode/internal/cache/PersistentOplogSet.java
+++ b/geode-core/src/main/java/org/apache/geode/internal/cache/PersistentOplogSet.java
@@ -610,13 +610,17 @@ public class PersistentOplogSet implements OplogSet {
         dirCounter = (++dirCounter) % parent.dirLength;
         // if the current directory has some space, then quit and
         // return the dir
+        logger.info("GEODE-6652 Evaluating dirHolder : " + dirHolder.toString());
+        logger.info("GEODE-6652 minAvailableSpace=" + minAvailableSpace);
         if (dirHolder.getAvailableSpace() >= minAvailableSpace) {
           if (checkForWarning && !parent.isDirectoryUsageNormal(dirHolder)) {
             if (logger.isDebugEnabled()) {
               logger.debug("Ignoring directory {} due to insufficient disk space", dirHolder);
             }
+            logger.info("GEODE-6652 Ignoring : " + dirHolder.toString());
           } else {
             selectedHolder = dirHolder;
+            logger.info("GEODE-6652 selectedHolder : " + selectedHolder.toString());
             break;
           }
         }

With this info, I saw the following info when the test failed:

[info 2019/06/19 15:02:19.877 CEST <Test worker> tid=0xb] GEODE-6652 Evaluating dirHolder : dir=/tmp/junit6240170980751459624/dir2 maxSpace=5242880 usedSpace=3777220 availableSpace=1465660
[info 2019/06/19 15:02:19.877 CEST <Test worker> tid=0xb] GEODE-6652 minAvailableSpace=53
[info 2019/06/19 15:02:19.877 CEST <Test worker> tid=0xb] GEODE-6652 selectedHolder : dir=/tmp/junit6240170980751459624/dir2 maxSpace=5242880 usedSpace=3777220 availableSpace=1465660
[info 2019/06/19 15:02:19.879 CEST <Test worker> tid=0xb] Created oplog#8 drf for disk store testDiskStore.
[error 2019/06/19 15:02:19.880 CEST <Test worker> tid=0xb] A DiskAccessException has occurred while writing to the disk for region /__PR/_B__testRegion_96. The cache will be closed.
org.apache.geode.cache.DiskAccessException: For DiskStore: testDiskStore: Could not pre-allocate file /tmp/junit6240170980751459624/dir2/BACKUPtestDiskStore_8.crf with size=1887436, caused by java.io.IOException: not enough space left to pre-blow, available=1465400, required=1887436
	at org.apache.geode.internal.cache.Oplog.preblow(Oplog.java:1045)

Here you can see that the dir2 is wrongly selected, because although it has enough space to store the required space (minAvailableSpace), when it tries to create the new oplog file,
an exception is raised due to the available free space is smaller than the oplog files being created (defined by max oplog size).

Copy link
Contributor

@jhuynh1 jhuynh1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change looks fine. I'm not as familiar with this part of the code base... @upthewaterspout or @dschneider-pivotal do you know who would be better at approving this (if not yourselves).

Copy link
Contributor

@jhuynh1 jhuynh1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this change but I'll wait to give others (with more context) a chance to review.

Copy link
Contributor

@upthewaterspout upthewaterspout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this a good improvement!

I agree with Jason that it would be nice to at least have a comment in the test explaining what the assertions are doing.

I wonder if we could just use FileUtils.size() on each directory to assert that dir2 only contains < 4 MBs of oplogs. Just a thought, I think the current test shows that as well.

DirectoryHolder dirHolder = null;
DirectoryHolder selectedHolder = null;
if (minAvailableSpace < parent.getMaxOplogSizeInBytes()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks okay but I don't like us checking SET_IGNORE_PREALLOCATE here since this is a testing only flag.
I also don't like the old contract on this method passing in minAvailableSpace. This is basically always going to be a small number (MIN_DIR_SIZE or the size of one operation). So I think it is fair to say we only want to use a directory if it has room for one oplog (which is parent.getMaxOplogSizeInBytes()). Keep in mind that if we can't find a directory like that we then call this method again with checkForWarning set to false in which case it will just use the next directory (if compaction is enabled (which is the default)). If we end up with no directories with room for one oplog and compaction is disabled then it should fail with "disk is full" exception.

So I'd recommend that you get rid of the minAvailableSpace parameter and always use parent.getMaxOplogSizeInBytes() as the minAvailableSpace in this method.

But I'm also okay with you just keeping your current changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont like checking that flag either, as you said we are changing our code due to tests... I think the code should be changed to get rid of that flag, but I think that could be done in other PR.
Also, I will write another ticket for the removal of minAvailableSpace: I tried removing it, and it breaks more than 30 integration tests, so I think it deserves its own PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really true that minAVailableSpace will always be smaller than the maxOplogSize? It looks like that param comes from the size of the value, which I think could in theory be bigger than the oplog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can happen, this is why I want to separate the removal from this PR, and check if it can be done. I would bet I saw a test where a big operation was performed and it was split between several oplog files, so I would say in that case minAvailableSpace was bigger. But I want to double check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing it in another PR sounds like a good idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method with an op size is only used if we have already decided to create a new oplog. I think Dan is right that it is possible for the first record of an oplog to be bigger than the max oplog size. In that case I think we just write it out and allow the oplog file to be bigger than the configured max. Users can configure a smaller max oplog size and if they do this and have large values then that first record could exceed the max size. Hopefully we test this. I don't know if any logic in the oplog code to split an operation and write it to two oplogs. I know we have logic that says if the op can not fit in the current oplog file (based on its max size) then start a new oplog.

@upthewaterspout upthewaterspout merged commit 384e90f into apache:develop Jun 26, 2019
@upthewaterspout
Copy link
Contributor

Merged this change in. Thanks for the contribution!

@alb3rtobr alb3rtobr deleted the feature/GEODE-6652 branch February 17, 2020 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants