Call FSDataOutputStream.setDropBehind for WAL files #3076
dlmarion wants to merge 2 commits into apache:main
Conversation
See the description of the HDFS property dfs.datanode.drop.cache.behind.writes for a full explanation of what this does. It tells the datanode to drop this file from the page cache when it is done writing.
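As a sketch of the write-side pattern, the snippet below shows the shape of the call. It assumes Hadoop's FSDataOutputStream.setDropBehind(Boolean) API; the FakeWalStream class here is a stand-in stub (not a real Hadoop class) so the example is self-contained:

```java
// Stand-in stub for org.apache.hadoop.fs.FSDataOutputStream so this sketch is
// self-contained; the real class exposes setDropBehind(Boolean) as a
// per-stream override of dfs.datanode.drop.cache.behind.writes.
// (The real method also declares IOException.)
class FakeWalStream {
    boolean dropBehind = false;

    void setDropBehind(Boolean drop) {
        dropBehind = drop != null && drop;
    }
}

public class WalDropBehindSketch {
    // Hint that the WAL's written pages can be evicted from the OS page
    // cache, since a WAL is normally never read back after writing. The hint
    // is best-effort: a stream that doesn't support it may throw
    // UnsupportedOperationException, which we swallow.
    static FakeWalStream openWalStream() {
        FakeWalStream out = new FakeWalStream();
        try {
            out.setDropBehind(Boolean.TRUE);
        } catch (UnsupportedOperationException e) {
            // Not supported by the underlying stream; proceed without the hint.
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(openWalStream().dropBehind); // true
    }
}
```

In the real PR the call would be made on the WAL's output stream right after it is created in DfsLogger.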
ctubbsii left a comment:
Seems fine, but I'm curious if this was related to an observed performance issue. The documentation says this only has an effect if the native libraries are used. I think they mean libhdfs.so. The only native libraries I've used are those for compression, which I think are in libhadoop.so. I'd be curious to know what the measurable impact of this change is for a typical deployment.
If this is intended to address an observed performance issue, I recommend changing the base branch to apply this to 2.1 instead of main. We can merge it into main later.
server/tserver/src/main/java/org/apache/accumulo/tserver/log/DfsLogger.java
This change is not in response to a newly reported issue, but to something I have seen on busy clusters in the past, where the page cache flush process is very active. I think there is an opportunity here to help alleviate that cache pressure by telling the operating system that it doesn't need to cache the WAL after we write it, because we aren't going to read it back. In fact, there is likely another commit I could make for the opposite side of this: we can tell the operating system to drop, or not cache, the WAL during recovery, since we are going to read it once and then be done with it. IIRC, under the hood this calls posix_fadvise, and the native code is located here.
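The read-side corollary could be sketched similarly. This assumes Hadoop's FSDataInputStream also supports setDropBehind(Boolean), and that the drop-behind hint maps to posix_fadvise with POSIX_FADV_DONTNEED in the native code; FakeRecoveryStream is a stand-in stub, not a real Hadoop class:

```java
// Stand-in stub for org.apache.hadoop.fs.FSDataInputStream; the real class
// exposes setDropBehind(Boolean), which ends up issuing
// posix_fadvise(..., POSIX_FADV_DONTNEED) so already-read pages are dropped
// from the page cache. (The real method also declares IOException.)
class FakeRecoveryStream {
    boolean dropBehind = false;

    void setDropBehind(Boolean drop) {
        dropBehind = drop != null && drop;
    }
}

public class WalRecoverySketch {
    // During log recovery the WAL is read exactly once, so caching its pages
    // buys nothing; hint that they can be dropped as we read.
    static FakeRecoveryStream openForRecovery() {
        FakeRecoveryStream in = new FakeRecoveryStream();
        try {
            in.setDropBehind(Boolean.TRUE);
        } catch (UnsupportedOperationException e) {
            // Best-effort hint; ignore if unsupported.
        }
        return in;
    }

    public static void main(String[] args) {
        System.out.println(openForRecovery().dropBehind); // true
    }
}
```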
ctubbsii left a comment:
I was going to suggest the same corollary.
I'm not sure what the failing test is all about, but it seems like we would want these changes in 2.1.1, assuming it's not the cause of the test failure.
To change the base branch for the PR, click the "Edit" button near the top right, next to the PR title, and select the 2.1 branch from the now-enabled drop-down below the title. If that switch causes GitHub to try to bring along commits from the main branch, you may need to cherry-pick or rebase/replay your commits for this PR onto the 2.1 branch and force push to the branch that is backing this PR.
The failed test is due to a connection timeout while downloading a Maven dependency.
IIRC, in previous situations like this, you had suggested that changes should be made in the most recent version and then cherry-picked backwards.
The circumstances here differ from that suggestion in several important ways.
Under circumstances like this, it's a matter of considering what is most efficient for tooling. If this were applied to the main branch, we'd cherry-pick the changes directly to the 2.1 branch, then merge forward back into the main branch, resulting in two identical commits in the history of the main branch. There's no reason to do that here, since the end result is the same. It's better under these circumstances to apply the change to the 2.1 branch and do one merge into the main branch, leaving only the one substantive commit and one trivial merge commit.
Also, we don't need to label 3.0.0 as a target version, because unless 3.0.0 is released prior to 2.1.1, everything in a patch release is always included in the next release.
Closed in favor of #3077
FWIW, if you just force push to your branch, you don't need to create a new PR.