Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keep more metaslabs loaded #9197

Merged
merged 6 commits into from Aug 29, 2019
Merged

keep more metaslabs loaded #9197

merged 6 commits into from Aug 29, 2019

Conversation

pcd1193182
Copy link
Contributor

Motivation and Context

With the other metaslab changes loaded onto a system, we can significantly reduce the memory usage of each loaded metaslab and unload them on demand if there is memory pressure. However, none of those changes actually result in us keeping more metaslabs loaded. If we don't keep more metaslabs loaded, we will still have to wait for demand-loading to finish when no loaded metaslab can satisfy our allocation, which can cause ZIL performance issues. In addition, performance is traditionally measured by IOs per unit time, while unloading is currently done on a txg-count basis. Txgs can take a widely varying range of times, from tenths of a second to several seconds. This can result in confusing, hard to predict behavior.

Description

This change simply adds a time-based component to metaslab unloading. A metaslab will remain loaded for one minute and 8 txgs (by default) after it was last used, unless it is evicted due to memory pressure.

How Has This Been Tested?

zfs-test and zloop, and performance testing to verify no regressions.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

Signed-off-by: Paul Dagnelie <pcd@delphix.com>
@pcd1193182 pcd1193182 added Status: Code Review Needed Ready for review and testing Type: Performance Performance improvement or performance problem labels Aug 21, 2019
@ahrens
Copy link
Member

ahrens commented Aug 21, 2019

What would you think about taking this even further and including DLPX-65016? I think that would make sense assuming this lands after #9181.

@codecov
Copy link

codecov bot commented Aug 22, 2019

Codecov Report

Merging #9197 into master will decrease coverage by 1.39%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #9197     +/-   ##
=========================================
- Coverage   79.22%   77.82%   -1.4%     
=========================================
  Files         400      389     -11     
  Lines      122001   121530    -471     
=========================================
- Hits        96656    94585   -2071     
- Misses      25345    26945   +1600
Flag Coverage Δ
#kernel 77.13% <100%> (-2.62%) ⬇️
#user 66.92% <87.5%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a9ebdfd...0556e8a. Read the comment docs.

@pcd1193182
Copy link
Contributor Author

@ahrens I'll fold in both of those changes as well.

module/zfs/metaslab.c Show resolved Hide resolved
man/man5/zfs-module-parameters.5 Outdated Show resolved Hide resolved
man/man5/zfs-module-parameters.5 Outdated Show resolved Hide resolved
man/man5/zfs-module-parameters.5 Outdated Show resolved Hide resolved
module/zfs/metaslab.c Outdated Show resolved Hide resolved
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
man/man5/zfs-module-parameters.5 Outdated Show resolved Hide resolved
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
@behlendorf behlendorf requested a review from ahrens August 27, 2019 19:26
Copy link
Member

@ahrens ahrens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll merge this after #9181 ?

@@ -15,7 +15,7 @@
#

#
# Copyright (c) 2018 by Delphix. All rights reserved.
# Copyright (c) 2018, 2019 by Delphix. All rights reserved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] looks like you can revert this

@behlendorf
Copy link
Contributor

We'll merge this after #9181 ?

I talked with Paul about this and he felt it would be fine to merge them in either order. So I suggest we go ahead with this one once the last nits have been resolved.

Signed-off-by: Paul Dagnelie <pcd@delphix.com>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Aug 28, 2019
@behlendorf behlendorf merged commit eef0f4d into openzfs:master Aug 29, 2019
@behlendorf behlendorf added this to New OpenZFS 2.0 Features (0.8->2.0) in OpenZFS 2.0 Feb 10, 2020
allanjude pushed a commit to KlaraSystems/zfs that referenced this pull request Apr 28, 2020
With the other metaslab changes loaded onto a system, we can
significantly reduce the memory usage of each loaded metaslab and
unload them on demand if there is memory pressure. However, none
of those changes actually result in us keeping more metaslabs loaded.
If we don't keep more metaslabs loaded, we will still have to wait
for demand-loading to finish when no loaded metaslab can satisfy our
allocation, which can cause ZIL performance issues. In addition,
performance is traditionally measured by IOs per unit time, while
unloading is currently done on a txg-count basis. Txgs can take a
widely varying range of times, from tenths of a second to several
seconds. This can result in confusing, hard to predict behavior.

This change simply adds a time-based component to metaslab unloading.
A metaslab will remain loaded for one minute and 8 txgs (by default)
after it was last used, unless it is evicted due to memory pressure.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
External-issue: DLPX-65016
External-issue: DLPX-65047
Closes openzfs#9197
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested) Type: Performance Performance improvement or performance problem
Projects
No open projects
OpenZFS 2.0
  
New OpenZFS 2.0 Features (0.8->2.0)
Development

Successfully merging this pull request may close these issues.

None yet

3 participants