Skip to content

Conversation

@sdimitro
Copy link
Contributor

@sdimitro sdimitro commented Jun 3, 2020

Motivation

A previous commit enabled the tracking of object allocations
in Linux-backed caches from the SPL layer for debuggability.
The commit is: 9a170fc

Unfortunately, it also introduced minor performance regressions
that were highlighted by the ZFS perf test-suite. Within Delphix
we found that the regression would be from -1%, all the way up
to -8% for some workloads.

Description

This commit brings performance back up to par by creating a
separate counter for those caches and making it a percpu in
order to avoid lock-contention.

Testing

The initial performance testing was done by myself, and the
final round was conducted by @tonynguien who was also the one
that discovered the regression and highlighted the culprit.

@richardelling
Copy link
Contributor

LGTM, but wondering why you're using the Linux-only percpu_* rather than aggsum_* as we're using in arc.c?

@amotin
Copy link
Member

amotin commented Jun 3, 2020

It seems this code uses the counter almost solely for write, reading it only sporadically on user request and not requiring atomicity. For such usage aggsum is a waste of resources, since it does provide atomic reads. I think there should be introduced some platform-independent KPI for write-only counters, likely just a wrapper over whatever provided by each platform.

@sdimitro sdimitro changed the title Use percpu_counter for obj_alloc counter of Linux-backed caches [DRAFT] Use percpu_counter for obj_alloc counter of Linux-backed caches Jun 3, 2020
@sdimitro
Copy link
Contributor Author

sdimitro commented Jun 3, 2020

I did the mistake and opened this as a proper review. I should have opened it as a draft since it seems like I'd need to iterate a bit more to make it compile in all environments.

@richardelling I did think about using the aggsum code but that code is licensed under CDDL and the code that I'm changing is in the SPL which is GPL. I think not many people have touched the aggsum code so we could relicense it if we wanted to but it just seemed too much for this change.

@amotin I'd be interested in working with others in such counters. That said this undertaking is tangential to this PR.

@amotin
Copy link
Member

amotin commented Jun 3, 2020

For reference this is a counter(9) KPI available on FreeBSD to be used in write-mostly cases: https://www.freebsd.org/cgi/man.cgi?query=counter&sektion=9&manpath=freebsd-release-ports

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Jun 3, 2020
@codecov
Copy link

codecov bot commented Jun 4, 2020

Codecov Report

Merging #10397 into master will decrease coverage by 16.02%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #10397       +/-   ##
===========================================
- Coverage   79.55%   63.53%   -16.03%     
===========================================
  Files         391      309       -82     
  Lines      123830   106544    -17286     
===========================================
- Hits        98518    67691    -30827     
- Misses      25312    38853    +13541     
Flag Coverage Δ
#kernel ?
#user 63.53% <ø> (-2.00%) ⬇️
Impacted Files Coverage Δ
module/zfs/objlist.c 0.00% <0.00%> (-100.00%) ⬇️
module/zfs/pathname.c 0.00% <0.00%> (-100.00%) ⬇️
include/sys/dmu_redact.h 0.00% <0.00%> (-100.00%) ⬇️
include/sys/dmu_traverse.h 0.00% <0.00%> (-100.00%) ⬇️
module/zfs/zfs_rlock.c 0.00% <0.00%> (-96.36%) ⬇️
module/lua/ltablib.c 2.34% <0.00%> (-95.32%) ⬇️
module/zfs/bqueue.c 0.00% <0.00%> (-94.45%) ⬇️
module/zcommon/zfs_deleg.c 0.00% <0.00%> (-92.46%) ⬇️
module/zfs/dmu_diff.c 0.00% <0.00%> (-87.88%) ⬇️
module/zfs/zcp_set.c 0.00% <0.00%> (-87.10%) ⬇️
... and 237 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 221e670...81e2ff3. Read the comment docs.

@ahrens ahrens marked this pull request as draft June 8, 2020 15:19
@behlendorf behlendorf added Status: Work in Progress Not yet ready for general review and removed Status: Code Review Needed Ready for review and testing labels Jun 16, 2020
@sdimitro sdimitro force-pushed the ozfs_fix_perf_counters branch from 49a41f2 to b48052d Compare June 25, 2020 01:08
@sdimitro sdimitro marked this pull request as ready for review June 25, 2020 06:15
@sdimitro sdimitro added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review labels Jun 25, 2020
@sdimitro sdimitro requested review from ahrens and behlendorf June 25, 2020 06:17
@sdimitro
Copy link
Contributor Author

cc @tonynguien

A previous commit enabled the tracking of object allocations
in Linux-backed caches from the SPL layer for debuggability.
The commit is: 9a170fc

Unfortunately, it also introduced minor performance regressions
that were highlighted by the ZFS perf test-suite. Within Delphix
we found that the regression would be from -1%, all the way up
to -8% for some workloads.

This commit brings performance back up to par by creating a
separate counter for those caches and making it a percpu in
order to avoid lock-contention.

The initial performance testing was done by myself, and the
final round was conducted by @tonynguien who was also the one
that discovered the regression and highlighted the culprit.

Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
@sdimitro sdimitro force-pushed the ozfs_fix_perf_counters branch from b48052d to 81e2ff3 Compare June 25, 2020 06:35
@sdimitro sdimitro changed the title [DRAFT] Use percpu_counter for obj_alloc counter of Linux-backed caches Use percpu_counter for obj_alloc counter of Linux-backed caches Jun 25, 2020
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jun 26, 2020
@behlendorf behlendorf merged commit ec1fea4 into openzfs:master Jun 27, 2020
happyaron pushed a commit to happyaron/zfs that referenced this pull request Dec 29, 2020
A previous commit enabled the tracking of object allocations
in Linux-backed caches from the SPL layer for debuggability.
The commit is: 9a170fc

Unfortunately, it also introduced minor performance regressions
that were highlighted by the ZFS perf test-suite. Within Delphix
we found that the regression would be from -1%, all the way up
to -8% for some workloads.

This commit brings performance back up to par by creating a
separate counter for those caches and making it a percpu in
order to avoid lock-contention.

The initial performance testing was done by myself, and the
final round was conducted by @tonynguien who was also the one
that discovered the regression and highlighted the culprit.

Backported-by: Aron Xu <aron@debian.org>
Signed-off-by: Aron Xu <aron@debian.org>
Closes openzfs#10397
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
A previous commit enabled the tracking of object allocations
in Linux-backed caches from the SPL layer for debuggability.
The commit is: 9a170fc

Unfortunately, it also introduced minor performance regressions
that were highlighted by the ZFS perf test-suite. Within Delphix
we found that the regression would be from -1%, all the way up
to -8% for some workloads.

This commit brings performance back up to par by creating a
separate counter for those caches and making it a percpu in
order to avoid lock-contention.

The initial performance testing was done by myself, and the
final round was conducted by @tonynguien who was also the one
that discovered the regression and highlighted the culprit.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes openzfs#10397
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
A previous commit enabled the tracking of object allocations
in Linux-backed caches from the SPL layer for debuggability.
The commit is: 9a170fc

Unfortunately, it also introduced minor performance regressions
that were highlighted by the ZFS perf test-suite. Within Delphix
we found that the regression would be from -1%, all the way up
to -8% for some workloads.

This commit brings performance back up to par by creating a
separate counter for those caches and making it a percpu in
order to avoid lock-contention.

The initial performance testing was done by myself, and the
final round was conducted by @tonynguien who was also the one
that discovered the regression and highlighted the culprit.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes openzfs#10397
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Accepted Ready to integrate (reviewed, tested)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants