OpenZFS 9337 - zfs get all is slow due to uncached metadata #7668

Closed
wants to merge 1 commit

Conversation

Contributor

@alek-p alek-p commented Jun 29, 2018

This is a port of the metadata dbuf cache.

Description

OpenZFS-issue: https://illumos.org/issues/9337
OpenZFS-commit: openzfs/openzfs@7dec52f

Porting notes:

I've added a couple of kstats to track the metadata cache size and overflows. Since overflow is now tracked in kstats, the dbuf_metadata_cache_overflow global was removed during the port.
You'll also see that an unrelated function, dataset_name_hidden(), was pulled out of two different files and added to the zcommon utils file. The compiler was complaining that "static declaration of dataset_name_hidden() follows non-static declaration", so I fixed that.

What follows is the description from the OpenZFS commit:

This project's goal is to make read-heavy channel programs and zfs(1m)
administrative commands faster by caching all the metadata that they will
need in the dbuf layer. This will prevent the data from being evicted, so
that any future call to e.g. zfs get all won't have to go to disk (very
much). There are two parts:

1. The dbuf_metadata_cache. We identify what to put into the cache based
   on the object type of each dbuf.

2. Caching the objset properties os_{version,normalization,utf8only,
   casesensitivity} in the objset_t. The reason these needed to be cached
   is that although they are queried frequently, they aren't stored in a
   dbuf type which we can easily recognize and cache in the dbuf layer;
   instead, we have to explicitly store them. There's already existing
   infrastructure for maintaining cached properties in the objset setup
   code, so I simply used that.

Performance Testing:

 - Disabled kmem_flags
 - Tuned dbuf_cache_max_bytes very low (128K)
 - Tuned zfs_arc_max very low (64M)
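On a ZFS on Linux system, equivalent tuning could be applied at runtime through module parameters (a sketch assuming the zfs module is loaded and these sysfs paths exist; values mirror the test setup above):

```shell
# Shrink the regular dbuf cache to 128K.
echo 131072 > /sys/module/zfs/parameters/dbuf_cache_max_bytes

# Cap the ARC at 64M.
echo 67108864 > /sys/module/zfs/parameters/zfs_arc_max
```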

Created test pool with 400 filesystems, and 100 snapshots per filesystem.
Later on in testing, added 600 more filesystems (with no snapshots) to make
sure scaling didn't look different between snapshots and filesystems.

Results:

    | Test                   | Time (trunk / diff) | I/Os (trunk / diff) |
    +------------------------+---------------------+---------------------+
    | zpool import           |     0:05 / 0:06     |    12.9k / 12.9k    |
    | zfs get all (uncached) |     1:36 / 0:53     |    16.7k / 5.7k     |
    | zfs get all (cached)   |     1:36 / 0:51     |    16.0k / 6.0k     |

Motivation and Context

The project's goal is to make read-heavy channel programs and zfs(1m) administrative commands faster by caching all the metadata that they will need in the dbuf layer. This will prevent the data from being evicted, so that any future call to e.g. zfs get all won't have to go to disk (very much).

How Has This Been Tested?

See above for the performance testing done in OpenZFS.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

Contributor

@behlendorf behlendorf left a comment

Thanks for porting this! A few comments inline.

Note when you refresh this you'll want to rebase on master. This will resolve the ENOSPC test failures and should result in a cleaner test run.

dbuf_metadata_cache_max_bytes) {
DBUF_STAT_BUMP(metadata_cache_overflow);
DTRACE_PROBE1(dbuf__metadata__cache__overflow,
dmu_buf_impl_t *, db);

/include/zfs/sys/trace.h:46:2: error: implicit declaration of function 'trace_zfs_dbuf__metadata__cache__overflow'; did you mean 'trace_zfs_dbuf__evict__one_enabled'? [-Werror=implicit-function-declaration]
    trace_zfs_##name((arg1))

You'll need to add a new tracepoint to trace_dbuf.h to address the kernel.org built-in failure which enables all the tracepoints. Or alternately, we could just drop the tracepoint since you've added a kstat for this.

@@ -4444,6 +4590,10 @@ MODULE_PARM_DESC(dbuf_cache_lowater_pct,
"Percentage below dbuf_cache_max_bytes when the evict thread stops "
"evicting dbufs.");

module_param(dbuf_metadata_cache_max_bytes, ulong, 0644);
MODULE_PARM_DESC(dbuf_metadata_cache_max_bytes,
"Maximum size in bytes of the dbuf metadata cache.");

Let's also add dbuf_metadata_cache_shift as a module parameter and document it.

@alek-p alek-p force-pushed the zfs-41 branch 2 times, most recently from 7d29d57 to a56f290 Compare July 3, 2018 21:40
Contributor Author

alek-p commented Jul 4, 2018

seems like ztest is showing a real issue?

db->db_caching_status == DB_NO_CACHE (0x0 == 0xffffffffffffffff)
ASSERT at ../../module/zfs/dbuf.c:2601:dbuf_destroy()/usr/sbin/ztest[0x40999f]

I probably won't have time to look at this until next week, though.

dbuf_evict_notify();
}
Contributor

@behlendorf behlendorf Jul 5, 2018

The !evicting condition needs to be dropped here, which likely explains the ztest failures you were seeing. There was a conflict that needed to be resolved here since we pulled 1fac63e before this change; the two were applied in the reverse order relative to OpenZFS.


if (db->db_caching_status == DB_DBUF_CACHE) {
        dbuf_evict_notify();
}

@alek-p alek-p force-pushed the zfs-41 branch 3 times, most recently from f20af5a to 8496320 Compare July 10, 2018 17:10
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>

OpenZFS-issue: https://illumos.org/issues/9337
OpenZFS-commit: openzfs/openzfs@7dec52f

codecov bot commented Jul 11, 2018

Codecov Report

Merging #7668 into master will increase coverage by 0.03%.
The diff coverage is 99.09%.


@@            Coverage Diff             @@
##           master    #7668      +/-   ##
==========================================
+ Coverage   78.11%   78.15%   +0.03%     
==========================================
  Files         368      368              
  Lines      111830   111890      +60     
==========================================
+ Hits        87354    87443      +89     
+ Misses      24476    24447      -29
Flag      Coverage Δ
#kernel   78.71% <97.08%> (-0.02%) ⬇️
#user     67.12% <90.32%> (+0.08%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 00c405b...2b41120.
