Call FSDataOutputStream.setDropBehind for WAL files #3076
dlmarion wants to merge 2 commits into apache:main
Conversation
See the description of the HDFS property dfs.datanode.drop.cache.behind.writes for a full explanation of what this does. It tells the datanode to drop this file from the page cache when it is done writing.
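As a sketch of the write-side pattern, the snippet below shows the shape of the call. It assumes Hadoop's FSDataOutputStream.setDropBehind(Boolean) API; the FakeWalStream class here is a stand-in stub (not a real Hadoop class) so the example is self-contained:

```java
// Stand-in stub for org.apache.hadoop.fs.FSDataOutputStream so this sketch is
// self-contained; the real class exposes setDropBehind(Boolean) as a
// per-stream override of dfs.datanode.drop.cache.behind.writes.
// (The real method also declares IOException.)
class FakeWalStream {
    boolean dropBehind = false;

    void setDropBehind(Boolean drop) {
        dropBehind = drop != null && drop;
    }
}

public class WalDropBehindSketch {
    // Hint that the WAL's written pages can be evicted from the OS page
    // cache, since a WAL is normally never read back after writing. The hint
    // is best-effort: a stream that doesn't support it may throw
    // UnsupportedOperationException, which we swallow.
    static FakeWalStream openWalStream() {
        FakeWalStream out = new FakeWalStream();
        try {
            out.setDropBehind(Boolean.TRUE);
        } catch (UnsupportedOperationException e) {
            // Not supported by the underlying stream; proceed without the hint.
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(openWalStream().dropBehind); // true
    }
}
```

In the real PR the call would be made on the WAL's output stream right after it is created in DfsLogger.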
ctubbsii left a comment:
Seems fine, but I'm curious if this was related to an observed performance issue. The documentation says this only has an effect if the native libraries are used. I think they mean libhdfs.so. The only native libraries I've used are those for compression, which I think are in libhadoop.so. I'd be curious to know what the measurable impact of this change is for a typical deployment.
If this is intended to address an observed performance issue, I recommend changing the base branch to apply this to 2.1 instead of main. We can merge it into main later.
server/tserver/src/main/java/org/apache/accumulo/tserver/log/DfsLogger.java
This change is not in response to a newly reported issue, but to something I have seen on busy clusters in the past, where the page cache flush process is very active. I think there is an opportunity here to help alleviate that cache pressure by telling the operating system that it doesn't need to cache the WAL after we write it, because we aren't going to read it back. In fact, there is likely another commit I could make for the opposite side of this: we can tell the operating system to drop, or not cache, the WAL during recovery, since we are going to read it once and then be done with it. IIRC, under the hood this calls posix_fadvise, and the native code is located here.
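The read-side corollary could be sketched similarly. This assumes Hadoop's FSDataInputStream also supports setDropBehind(Boolean), and that the drop-behind hint maps to posix_fadvise with POSIX_FADV_DONTNEED in the native code; FakeRecoveryStream is a stand-in stub, not a real Hadoop class:

```java
// Stand-in stub for org.apache.hadoop.fs.FSDataInputStream; the real class
// exposes setDropBehind(Boolean), which ends up issuing
// posix_fadvise(..., POSIX_FADV_DONTNEED) so already-read pages are dropped
// from the page cache. (The real method also declares IOException.)
class FakeRecoveryStream {
    boolean dropBehind = false;

    void setDropBehind(Boolean drop) {
        dropBehind = drop != null && drop;
    }
}

public class WalRecoverySketch {
    // During log recovery the WAL is read exactly once, so caching its pages
    // buys nothing; hint that they can be dropped as we read.
    static FakeRecoveryStream openForRecovery() {
        FakeRecoveryStream in = new FakeRecoveryStream();
        try {
            in.setDropBehind(Boolean.TRUE);
        } catch (UnsupportedOperationException e) {
            // Best-effort hint; ignore if unsupported.
        }
        return in;
    }

    public static void main(String[] args) {
        System.out.println(openForRecovery().dropBehind); // true
    }
}
```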
ctubbsii left a comment:
I was going to suggest the same corollary.
I'm not sure what the failing test is all about, but it seems like we would want these changes in 2.1.1, assuming it's not the cause of the test failure.
To change the base branch for the PR, click the "Edit" button near the top right, next to the PR title, and select the 2.1 branch from the now-enabled drop-down below the title. If that switch causes GitHub to try to bring along commits from the main branch, you may need to cherry-pick or rebase/replay your commits for this PR onto the 2.1 branch and force push to the branch that is backing this PR.
The failed test is due to a connection timeout while downloading a Maven dependency.
IIRC, in previous situations like this, you had suggested that changes should be made in the most recent version and then cherry-picked backwards.
The circumstances here differ from that suggestion in several important ways.
Under circumstances like this, it's a matter of considering what is most efficient for tooling. If this were applied to the main branch, we'd cherry-pick the changes directly to the 2.1 branch, then merge forward back into the main branch, resulting in two identical commits in the history of the main branch. There's no reason to do that here, since the end result is the same. It's better under these circumstances to apply the change to the 2.1 branch and do one merge into the main branch, leaving only the one substantive commit and one trivial merge commit.
Also, we don't need to label 3.0.0 as a target version, because unless 3.0.0 is released prior to 2.1.1, everything in a patch release is always included in the next release.
Closed in favor of #3077
FWIW, if you just force push to your branch, you don't need to create a new PR.