Fix 'zfs send/recv' hang with 16M blocks
When using 16MB blocks the send/recv queues aren't quite big
enough.  This change leaves the default 16M queue size, which is a
good value for most pools, but it additionally ensures that the
queue sizes are at least twice the allowed zfs_max_recordsize.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7365 
Closes #7404
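
The sizing rule described in the commit message can be sketched in isolation as follows. This is a minimal user-space illustration, not the kernel code: `effective_queue_length`, `max_u64`, and the `MIB` constant are hypothetical names introduced here, and the arguments stand in for the `zfs_send_queue_length` / `zfs_recv_queue_length` and `zfs_max_recordsize` tunables.

```c
#include <stdint.h>

/* Illustrative constant, not taken from the ZFS sources. */
#define MIB (1024ULL * 1024ULL)

/* Plays the role of the kernel's MAX() macro, written as a function. */
static uint64_t
max_u64(uint64_t a, uint64_t b)
{
	return (a > b ? a : b);
}

/*
 * Hypothetical helper: the queue size actually used is the configured
 * length, raised if necessary to twice the largest allowed record size.
 */
static uint64_t
effective_queue_length(uint64_t queue_length, uint64_t max_recordsize)
{
	return (max_u64(queue_length, 2 * max_recordsize));
}
```

Under these assumptions, a 16 MiB configured length with a 1 MiB maximum record size stays at 16 MiB, while the same length with a 16 MiB maximum record size is raised to 32 MiB.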
behlendorf authored Apr 9, 2018
1 parent 7b47628 commit 3b0d992
Showing 2 changed files with 37 additions and 4 deletions.
25 changes: 25 additions & 0 deletions man/man5/zfs-module-parameters.5
@@ -1919,6 +1919,31 @@ Allow sending of corrupt data (ignore read/checksum errors when sending data)
Use \fB1\fR for yes and \fB0\fR for no (default).
.RE

.sp
.ne 2
.na
\fBzfs_send_queue_length\fR (int)
.ad
.RS 12n
The maximum number of bytes allowed in the \fBzfs send\fR queue. This value
must be at least twice the maximum block size in use.
.sp
Default value: \fB16,777,216\fR.
.RE

.sp
.ne 2
.na
\fBzfs_recv_queue_length\fR (int)
.ad
.RS 12n
.sp
The maximum number of bytes allowed in the \fBzfs receive\fR queue. This value
must be at least twice the maximum block size in use.
.sp
Default value: \fB16,777,216\fR.
.RE

.sp
.ne 2
.na
16 changes: 12 additions & 4 deletions module/zfs/dmu_send.c
@@ -61,8 +61,8 @@

/* Set this tunable to TRUE to replace corrupt data with 0x2f5baddb10c */
int zfs_send_corrupt_data = B_FALSE;
-int zfs_send_queue_length = 16 * 1024 * 1024;
-int zfs_recv_queue_length = 16 * 1024 * 1024;
+int zfs_send_queue_length = SPA_MAXBLOCKSIZE;
+int zfs_recv_queue_length = SPA_MAXBLOCKSIZE;
/* Set this tunable to FALSE to disable setting of DRR_FLAG_FREERECORDS */
int zfs_send_set_freerecords_bit = B_TRUE;

@@ -1142,7 +1142,8 @@ dmu_send_impl(void *tag, dsl_pool_t *dp, dsl_dataset_t *to_ds,
goto out;
}

-err = bqueue_init(&to_arg.q, zfs_send_queue_length,
+err = bqueue_init(&to_arg.q,
+MAX(zfs_send_queue_length, 2 * zfs_max_recordsize),
offsetof(struct send_block_record, ln));
to_arg.error_code = 0;
to_arg.cancel = B_FALSE;
@@ -3831,7 +3832,8 @@ dmu_recv_stream(dmu_recv_cookie_t *drc, vnode_t *vp, offset_t *voffp,
goto out;
}

-(void) bqueue_init(&rwa->q, zfs_recv_queue_length,
+(void) bqueue_init(&rwa->q,
+MAX(zfs_recv_queue_length, 2 * zfs_max_recordsize),
offsetof(struct receive_record_arg, node));
cv_init(&rwa->cv, NULL, CV_DEFAULT, NULL);
mutex_init(&rwa->mutex, NULL, MUTEX_DEFAULT, NULL);
@@ -4242,4 +4244,10 @@ dmu_objset_is_receiving(objset_t *os)
#if defined(_KERNEL)
module_param(zfs_send_corrupt_data, int, 0644);
MODULE_PARM_DESC(zfs_send_corrupt_data, "Allow sending corrupt data");
+
+module_param(zfs_send_queue_length, int, 0644);
+MODULE_PARM_DESC(zfs_send_queue_length, "Maximum send queue length");
+
+module_param(zfs_recv_queue_length, int, 0644);
+MODULE_PARM_DESC(zfs_recv_queue_length, "Maximum receive queue length");
#endif
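
Because both tunables are registered with `module_param(..., int, 0644)`, Linux exposes them as files under `/sys/module/zfs/parameters/`. The following is a minimal user-space sketch of reading such a value; `read_int_param` is a hypothetical helper introduced for illustration and is not part of ZFS.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a single integer from a parameter file,
 * e.g. /sys/module/zfs/parameters/zfs_send_queue_length.
 * Returns 0 and stores the value on success, -1 on any failure.
 */
static int
read_int_param(const char *path, long *value)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (fp == NULL)
		return (-1);

	/* The file holds one decimal integer followed by a newline. */
	ok = (fscanf(fp, "%ld", value) == 1);
	fclose(fp);

	return (ok ? 0 : -1);
}
```

A caller would pass the sysfs path of the parameter and check the return value before using the result, since the file is absent when the zfs module is not loaded.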

2 comments on commit 3b0d992

@fling- (Contributor) commented on 3b0d992 on Apr 11, 2018

@behlendorf is it fine to cherry-pick the fix on top of zfs-0.7.6 tag?

@andrewhowdencom

@fling- did this get cherry picked on to the 0.7.x series in the end? GitHub only displays it in the 0.8.x series.