Skip to content

Conversation

@tonyhutter
Copy link
Contributor

Motivation and Context

Patchset for zfs-0.7.12

Description

0.7.12 will fix the list_del corruption issue (#7933) for some people, and will support the latest kernels.

Requires-spl: openzfs/spl#710

How Has This Been Tested?

Test build

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

Tom Caputi and others added 2 commits October 26, 2018 01:22
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes openzfs#7147
Closes openzfs#7388
When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching.  We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().

This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59292
Closes openzfs#7736
@codecov
Copy link

codecov bot commented Nov 1, 2018

Codecov Report

Merging #8078 into zfs-0.7-release will increase coverage by 0.03%.
The diff coverage is 90.44%.

@@                 Coverage Diff                 @@
##           zfs-0.7-release    #8078      +/-   ##
===================================================
+ Coverage            72.49%   72.52%   +0.03%     
===================================================
  Files                  289      289              
  Lines                89767    89802      +35     
===================================================
+ Hits                 65077    65132      +55     
+ Misses               24690    24670      -20

log_must zdb -d $TESTPOOL/$TESTFS
log_must zdb -d $TESTPOOL/$TESTFS@snap

log_must zpool checkpoint $TESTPOOL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.7 doesn't have checkpoint functionality, so test fails.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to just remove this block from tests/zfs-tests/tests/functional/mmp/mmp_on_zdb.ksh

log_must zpool checkpoint $TESTPOOL
log_must zdb -kd $TESTPOOL
log_must zdb -kd $TESTPOOL/
log_must zdb -kd $TESTPOOL/$TESTFS
log_must zdb -kd $TESTPOOL/$TESTFS@snap
log_must zpool checkpoint -d $TESTPOOL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That block is there purely to test that zdb does the right thing when checkpoints are present. Since they cannot be present in 0.7, because the functionality does not exist there, that test scenario has no purpose.

Also, removal of that block should not affect what comes after it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is trivial to check if a pool property or feature exists and do the right thing, see function get_pool_prop

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is trivial to check if a pool property or feature exists and do the right thing, see function get_pool_prop

It's good to know about get_pool_prop. In this case, since it's a back port to a branch that we know will never have checkpoints, my thinking was that adding the conditional logic to the test would be misleading, in that it would imply that checkpoints might exist. Do you think that's the wrong choice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also have the test respond with skip, instead of pass/fail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really have an opinion either way, so I'll defer to @ofaaland and delete the section of code, since it's his commit.

ofaaland and others added 9 commits November 1, 2018 18:19
Since zdb opens the pools read-only, it cannot damage the pool in the
event the pool is already imported either on the same host or on
another one.

If the pool vdev structure is changing while zdb is importing the
pool, it may cause zdb to crash.  However this is unlikely, and in any
case it's a user space process and can simply be run again.

For this reason, zdb should disable the multihost activity test on
import that is normally run.

This commit fixes a few zdb code paths where that had been overlooked.
It also adds tests to ensure that several common use cases handle this
properly in the future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gu Zheng <guzheng2331314@163.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes openzfs#7797
Closes openzfs#7801
When handling a 32-bit statfs() system call the returned fields,
although 64-bit in the kernel, must be limited to 32-bits or an
EOVERFLOW error will be returned.

This is less of an issue for block counts since the default
reported block size in 128KiB. But since it is possible to
set a smaller block size, these values will be scaled as
needed to fit in a 32-bit unsigned long.

Unlike most other filesystems the total possible file counts
are more likely to overflow because they are calculated based
on the available free space in the pool. In order to prevent
this the reported value must be capped at 2^32-1. This is
only for statfs(2) reporting, there are no changes to the
internal ZFS limits.

Reviewed-by: Andreas Dilger <andreas.dilger@whamcloud.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#7927
Closes openzfs#7122
Closes openzfs#7937
Bandwidth and iops are average per second while *_wait are averages
per request for latency or, for queue depths, an instantaneous
measurement at the end of an interval (according to man zpool).

When calculating the first two it makes sense to do
x/interval_duration (x being the increase in total bytes or number of
requests over the duration of the interval, interval_duration in
seconds) to 'scale' from amount/interval_duration to amount/second.

But applying the same math for the latter (*_wait latencies/queue) is
wrong as there is no interval_duration component in the values (these
are time/requests to get to average_time/request or already an
absulute number).

This bug leads to the only correct continuous *_wait figures for both
latencies and queue depths from 'zpool iostat -l/q' being with
duration=1 as then the wrong math cancels itself (x/1 is a nop).

This removes temporal scaling from latency and queue depth figures.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Closes openzfs#7945 
Closes openzfs#7694
torvalds/linux@59b57717f ("blkcg: delay blkg destruction until
after writeback has finished") added a refcount_t to the blkcg
structure. Due to the refcount_t compatibility code, zfs_refcount_t
was used by mistake.

Resolve this by removing the compatibility code and replacing the
occurrences of refcount_t with zfs_refcount_t.

Reviewed-by: Franz Pletz <fpletz@fnordicwalking.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes openzfs#7885
Closes openzfs#7932
Recent changes in the Linux kernel made it necessary to prefix
the refcount_add() function with zfs_ due to a name collision.

To bring the other functions in line with that and to avoid future
collisions, prefix the other refcount functions as well.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes openzfs#7963
Update arc_release to use arc_buf_size().  This hunk was accidentally
dropped when porting compressed send/recv, 2aa3438.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#8000
It's helpful if there are pools with same names,
but you need to use only one of them.

Main case is twin servers, meanwhile some software
requires the same name of pools (e.g. Proxmox).

Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Igor ‘guardian’ Lidin of Moscow, Russia
Closes openzfs#8052
In CentOS 7.5 the kernel provided a compatibility wrapper to support
O_TMPFILE.  This results in the test setup script correctly detecting
kernel support.  But the ZFS module was built without O_TMPFILE
support due to the non-standard CentOS kernel interface.

Handle this case by updating the setup check to fail either when
the kernel or the ZFS module fail to provide support.  The reason
will be clearly logged in the test results.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#7528
From, https://lintlyci.github.io/Flake8Rules/rules/W605.html

As of Python 3.6, a backslash-character pair that is not a valid
escape sequence now generates a DeprecationWarning. Although this
will eventually become a SyntaxError, that will not be for several
Python releases.

Note 'float_pobj' was simply removed from arcstat.py since it
was entirely unused.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#8056
This adds a BuildRequires for gcc, make, and elfutils-libelf-devel
into our spec files.  gcc has been a packaging requirement for
awhile now:

https://fedoraproject.org/wiki/Packaging:C_and_C%2B%2B

These additional BuildRequires allow us to mock build in
Fedora 29.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Tony Hutter <hutter2@llnl.gov>
Closes openzfs#8095
Closes openzfs#8102
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>

Requires-spl: refs/pull/710/head
@tonyhutter
Copy link
Contributor Author

@tonyhutter tonyhutter closed this Nov 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants