Skip to content

Commit 2175bf5

Browse files
josefbacikgregkh
authored andcommitted
btrfs: fix possible free space tree corruption with online conversion
commit 2f96e40 upstream. While running btrfs/011 in a loop I would often ASSERT() while trying to add a new free space entry that already existed, or get an EEXIST while adding a new block to the extent tree, which is another indication of double allocation. This occurs because when we do the free space tree population, we create the new root and then populate the tree and commit the transaction. The problem is when you create a new root, the root node and commit root node are the same. During this initial transaction commit we will run all of the delayed refs that were paused during the free space tree generation, and thus begin to cache block groups. While caching block groups the caching thread will be reading from the main root for the free space tree, so as we make allocations we'll be changing the free space tree, which can cause us to add the same range twice which results in either the ASSERT(ret != -EEXIST); in __btrfs_add_free_space, or in a variety of different errors when running delayed refs because of a double allocation. Fix this by marking the fs_info as unsafe to load the free space tree, and fall back on the old slow method. We could be smarter than this, for example caching the block group while we're populating the free space tree, but since this is a serious problem I've opted for the simplest solution. CC: stable@vger.kernel.org # 4.9+ Fixes: a5ed918 ("Btrfs: implement the free space B-tree") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1 parent f343bf1 commit 2175bf5

File tree

3 files changed

+21
-2
lines changed

3 files changed

+21
-2
lines changed

fs/btrfs/block-group.c

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -639,7 +639,15 @@ static noinline void caching_thread(struct btrfs_work *work)
639639
mutex_lock(&caching_ctl->mutex);
640640
down_read(&fs_info->commit_root_sem);
641641

642-
if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE))
642+
/*
643+
* If we are in the transaction that populated the free space tree we
644+
* can't actually cache from the free space tree as our commit root and
645+
* real root are the same, so we could change the contents of the blocks
646+
* while caching. Instead do the slow caching in this case, and after
647+
* the transaction has committed we will be safe.
648+
*/
649+
if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
650+
!(test_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags)))
643651
ret = load_free_space_tree(caching_ctl);
644652
else
645653
ret = load_extent_tree_free(caching_ctl);

fs/btrfs/ctree.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -146,6 +146,9 @@ enum {
146146
BTRFS_FS_STATE_DEV_REPLACING,
147147
/* The btrfs_fs_info created for self-tests */
148148
BTRFS_FS_STATE_DUMMY_FS_INFO,
149+
150+
/* Indicate that we can't trust the free space tree for caching yet */
151+
BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED,
149152
};
150153

151154
#define BTRFS_BACKREF_REV_MAX 256

fs/btrfs/free-space-tree.c

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1152,6 +1152,7 @@ int btrfs_create_free_space_tree(struct btrfs_fs_info *fs_info)
11521152
return PTR_ERR(trans);
11531153

11541154
set_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
1155+
set_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
11551156
free_space_root = btrfs_create_tree(trans,
11561157
BTRFS_FREE_SPACE_TREE_OBJECTID);
11571158
if (IS_ERR(free_space_root)) {
@@ -1173,11 +1174,18 @@ int btrfs_create_free_space_tree(struct btrfs_fs_info *fs_info)
11731174
btrfs_set_fs_compat_ro(fs_info, FREE_SPACE_TREE);
11741175
btrfs_set_fs_compat_ro(fs_info, FREE_SPACE_TREE_VALID);
11751176
clear_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
1177+
ret = btrfs_commit_transaction(trans);
11761178

1177-
return btrfs_commit_transaction(trans);
1179+
/*
1180+
* Now that we've committed the transaction any reading of our commit
1181+
* root will be safe, so we can cache from the free space tree now.
1182+
*/
1183+
clear_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
1184+
return ret;
11781185

11791186
abort:
11801187
clear_bit(BTRFS_FS_CREATING_FREE_SPACE_TREE, &fs_info->flags);
1188+
clear_bit(BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, &fs_info->flags);
11811189
btrfs_abort_transaction(trans, ret);
11821190
btrfs_end_transaction(trans);
11831191
return ret;

0 commit comments

Comments
 (0)