Illumos #3552
3552 condensing one space map burns 3 seconds of CPU in spa_sync()
     thread (fix race condition)

References:
  https://www.illumos.org/issues/3552
  illumos/illumos-gate@03f8c36

Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting notes:

This fixes an upstream regression that was introduced in commit
e51be06, which ported the Illumos 3552 changes. This fix was added
to upstream rather quickly, but at the time of the port, no one
spotted it and the race was rare enough that it passed our
regression tests. I discovered this when comparing our metaslab.c
to the illumos metaslab.c.

Without this change it is possible for metaslab_group_alloc() to
consume a large amount of CPU time.  Since this occurs under a
mutex in an RCU critical section, the kernel will log this to the
console as a self-detected CPU stall as follows:

  INFO: rcu_sched self-detected stall on CPU { 0}
  (t=60000 jiffies g=11431890 c=11431889 q=18271)

Closes #1687
Closes #1720
Closes #1731
Closes #1747
grwilson authored and behlendorf committed Oct 18, 2013
1 parent a6ce1ea commit 7a61440
Showing 1 changed file with 17 additions and 10 deletions.
27 changes: 17 additions & 10 deletions module/zfs/metaslab.c
@@ -1388,6 +1388,13 @@ metaslab_group_alloc(metaslab_group_t *mg, uint64_t psize, uint64_t asize,
 				mutex_exit(&mg->mg_lock);
 				return (-1ULL);
 			}
+
+			/*
+			 * If the selected metaslab is condensing, skip it.
+			 */
+			if (msp->ms_map->sm_condensing)
+				continue;
+
 			was_active = msp->ms_weight & METASLAB_ACTIVE_MASK;
 			if (activation_weight == METASLAB_WEIGHT_PRIMARY)
 				break;
@@ -1427,16 +1434,6 @@ metaslab_group_alloc(metaslab_group_t *mg, uint64_t psize, uint64_t asize,
 
 		mutex_enter(&msp->ms_lock);
 
-		/*
-		 * If this metaslab is currently condensing then pick again as
-		 * we can't manipulate this metaslab until it's committed
-		 * to disk.
-		 */
-		if (msp->ms_map->sm_condensing) {
-			mutex_exit(&msp->ms_lock);
-			continue;
-		}
-
 		/*
 		 * Ensure that the metaslab we have selected is still
 		 * capable of handling our request. It's possible that
@@ -1463,6 +1460,16 @@ metaslab_group_alloc(metaslab_group_t *mg, uint64_t psize, uint64_t asize,
 			continue;
 		}
 
+		/*
+		 * If this metaslab is currently condensing then pick again as
+		 * we can't manipulate this metaslab until it's committed
+		 * to disk.
+		 */
+		if (msp->ms_map->sm_condensing) {
+			mutex_exit(&msp->ms_lock);
+			continue;
+		}
+
 		if ((offset = space_map_alloc(msp->ms_map, asize)) != -1ULL)
 			break;
 
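
Illustrative note (not part of the commit): one plausible reading of this change is that a retry after the post-lock "pick again" check can land on the same condensing metaslab again, because selection starts over from the best-weighted candidate. The toy model below is a sketch under that assumption only. The names `toy_metaslab_t`, `toy_select_no_skip()`, `toy_select_skip()`, and `toy_alloc_passes()` are hypothetical and do not exist in ZFS; a weight-sorted array stands in for the metaslab group's tree, and a pass budget stands in for wasted CPU time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for metaslab_t; not the real ZFS type. */
typedef struct toy_metaslab {
	uint64_t weight;	/* larger weight = preferred candidate */
	bool condensing;	/* stands in for msp->ms_map->sm_condensing */
} toy_metaslab_t;

/*
 * Selection without the early check: take the best-weighted candidate
 * if it is large enough, even when it is condensing. The array is
 * assumed to be sorted by descending weight.
 */
static int
toy_select_no_skip(const toy_metaslab_t *ms, size_t n, uint64_t asize)
{
	assert(ms != NULL);
	if (n == 0 || ms[0].weight < asize)
		return (-1);
	return (0);
}

/* Selection with the early check: condensing candidates are skipped. */
static int
toy_select_skip(const toy_metaslab_t *ms, size_t n, uint64_t asize)
{
	assert(ms != NULL);
	for (size_t i = 0; i < n; i++) {
		if (ms[i].weight < asize)
			return (-1);
		if (ms[i].condensing)
			continue;
		return ((int)i);
	}
	return (-1);
}

/*
 * Retry loop: select a candidate, then re-check the flag as the locked
 * path would. Returns the number of selection passes used, or -1 when
 * max_passes is exhausted (the toy analogue of spinning). *picked is
 * set to the chosen index, or -1 if nothing was chosen.
 */
static int
toy_alloc_passes(const toy_metaslab_t *ms, size_t n, uint64_t asize,
    bool skip_early, int max_passes, int *picked)
{
	for (int pass = 1; pass <= max_passes; pass++) {
		int i = skip_early ? toy_select_skip(ms, n, asize) :
		    toy_select_no_skip(ms, n, asize);
		if (i < 0) {
			*picked = -1;
			return (pass);
		}
		if (ms[i].condensing)
			continue;	/* mirrors the post-lock "pick again" */
		*picked = i;
		return (pass);
	}
	*picked = -1;
	return (-1);
}
```

In this model, with the best-weighted entry marked as condensing, the selection that skips early settles on the next candidate in a single pass, while the selection without the early check exhausts whatever pass budget it is given. The real code additionally keeps the locked check, presumably because the flag can change between selection and taking `ms_lock`.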