Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

btrfs-convert "no-holes": extent-tree.c:2764: alloc_tree_block: BUG_ON ret triggered, value -28 #123

Closed
denji opened this issue Apr 13, 2018 · 32 comments
Labels
Milestone

Comments

@denji
Copy link

denji commented Apr 13, 2018

❯ btrfs version
btrfs-progs v4.15.1

❯ uname -r
4.15.15-63.current

❯ udisksctl unlock -b /dev/sdb1

❯ udisksctl info -b /dev/dm-4 | grep -E 'Id(Type)?'
    Id:                         by-id-dm-name-luks-b5dc009f-53a6-4484-869d-b5e38e90c31e
    IdLabel:                    500GB
    IdType:                     ext4
    IdUUID:                     ec052715-146c-4f92-9c02-2f7c7d392414
    IdUsage:                    filesystem
    IdVersion:                  1.0

❯ free -m
              total        used        free      shared  buff/cache   available
Mem:          16052        3201       11438          57        1412       12869
Swap:             0           0           0

❯ sudo btrfs-convert -p -O extref,skinny-metadata,no-holes /dev/mapper/luks-b5dc009f-53a6-4484-869d-b5e38e90c31e
create btrfs filesystem:
	blocksize: 4096
	nodesize:  16384
	features:  extref, skinny-metadata, no-holes
creating ext2 image file
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28
btrfs-convert[0x41b2d6]
btrfs-convert(btrfs_alloc_free_block+0x1ff)[0x42097f]
btrfs-convert[0x413aa3]
btrfs-convert(btrfs_search_slot+0x2a8)[0x4149d8]
btrfs-convert(btrfs_csum_file_block+0x48f)[0x42695f]
btrfs-convert[0x40c7a4]
btrfs-convert(main+0x19d1)[0x40bc61]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f7235e8d00a]
btrfs-convert(_start+0x2a)[0x40c34a]
fish: 'sudo btrfs-convert -p -O extref…' terminated by signal SIGABRT (Abort)

https://bugs.launchpad.net/ubuntu/+source/btrfs-progs/+bug/1774794
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854489

@denji denji changed the title btrfs-convert SIGABRT: extent-tree.c:2764: alloc_tree_block: BUG_ON ret triggered, value -28 btrfs-convert "no-holes": extent-tree.c:2764: alloc_tree_block: BUG_ON ret triggered, value -28 Apr 13, 2018
@adam900710
Copy link
Collaborator

This means btrfs can't allocate new tree block due to no available space. (indicated by -ENOSPC).

The good news is, all modification is out of used ext* data, so your fs should be still be completely safe.

Unfortunately, we don't have a good way to know if there is enough free space of ext*.
So what we can do is only to enhance the error handler and remove the BUG_ON.

@adam900710
Copy link
Collaborator

BTW, would you please provide the df output of that ext4 (mounted) fs?

Current btrfs-convert can't handle fragmented free space of ext4 well, due to btrfs metadata/data chunk size limitation.

If you really want to convert it to btrfs, I'd recommend to remove some files in that ext4 filesystem, to ensure enough continuous free space, or try to skip datasum to save some metadata space.

@denji
Copy link
Author

denji commented Aug 3, 2018

This is no longer possible, the test cluster is already formatted.
In addition, the disk was not completely full, still more than ~104GB free.

@elliotclee
Copy link

Just ran into this problem with btrfs-convert 4.17.1 when running it on a 1.3T ext4 filesystem. About 850G of the space was used... Am trying to do a defragment-via-resize2fs to see if that helps at all.

@elliotclee
Copy link

/dev/md126p6 1.3T 821G 424G 66% /repo

resize2fs down to 830G and back to 1.3T does not fix the problem.

I also noticed that there are three messages about "Unable to find block group for 0" before the error.

@adam900710
Copy link
Collaborator

adam900710 commented Sep 5, 2018

@elliotclee Great thanks for the info.

If shrinked then appended doesn't work, it may be the problem of csum taking too many space while file tree operation doesn't get its chance to allocate new chunks.

The "unable to find block group for 0" is just some error message unhanled for such case.

Would you please convert without csum nor inline data? (btrfs-convert -d -n)
This should greatly reduce metadata usage, thus has better chance to finish convert.

Or you could try this branch https://github.com/adam900710/btrfs-progs/tree/chunk_allocation
It will allow chunk allocation as long as the caller isn't extent tree.
It may make convert a little unstable, but the design principle keeps original data untouched, it should still be pretty safe for original fs.

@elliotclee
Copy link

Tried it with just -n and it still errored out. With -d -n it completes successfully.

@elliotclee
Copy link

Still happens with btrfs-progs 4.17.1. In btrfs-progs 4.19.1 from Fedora Rawhide, it gives this error (which looks like the same thing with code that got refactored...)

ctree.c:2244: split_leaf: BUG_ON 1 triggered, value 1
btrfs-convert(+0x16477)[0x555b867b5477]
btrfs-convert(btrfs_search_slot+0xee7)[0x555b867b69d7]
btrfs-convert(btrfs_csum_file_block+0x48f)[0x555b867c856f]
btrfs-convert(+0xe8b4)[0x555b867ad8b4]
btrfs-convert(main+0x18d9)[0x555b867acb89]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7f151979a413]
btrfs-convert(_start+0x2e)[0x555b867ad45e]
Aborted (core dumped)

@elliotclee
Copy link

When I ran 4.19.1 with -d, I got:

ctree.c:2244: split_leaf: BUG_ON 1 triggered, value 1
btrfs-convert(+0x16477)[0x5632a34f3477]
btrfs-convert(btrfs_search_slot+0xee7)[0x5632a34f49d7]
btrfs-convert(btrfs_insert_empty_items+0x98)[0x5632a34f5688]
btrfs-convert(btrfs_insert_inline_extent+0x7c)[0x5632a350603c]
btrfs-convert(+0x11019)[0x5632a34ee019]
btrfs-convert(+0x11eda)[0x5632a34eeeda]
btrfs-convert(main+0xf74)[0x5632a34ea224]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7fec0bb7d413]
btrfs-convert(_start+0x2e)[0x5632a34eb45e]
Aborted (core dumped)

When I ran 4.19.1 with -d -n, I got:
ctree.c:2244: split_leaf: BUG_ON 1 triggered, value 1
btrfs-convert(+0x16477)[0x559a33c4e477]
btrfs-convert(btrfs_search_slot+0xee7)[0x559a33c4f9d7]
btrfs-convert(btrfs_insert_empty_items+0x98)[0x559a33c50688]
btrfs-convert(+0x28583)[0x559a33c60583]
btrfs-convert(btrfs_insert_xattr_item+0xa9)[0x559a33c607d9]
btrfs-convert(+0x1169b)[0x559a33c4969b]
btrfs-convert(+0x12246)[0x559a33c4a246]
btrfs-convert(main+0xf74)[0x559a33c45224]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7f64bde03413]
btrfs-convert(_start+0x2e)[0x559a33c4645e]
Aborted (core dumped)

@elliotclee
Copy link

4.17.1 can do convert with just -d OK, so it seems the bug got even worse in 4.18 or 4.19.

@kdave kdave added this to the v5.0 milestone Mar 5, 2019
@kdave
Copy link
Owner

kdave commented Mar 5, 2019

I've added Qu's fix to devel branch, scheduled for 5.0 release.

kdave pushed a commit that referenced this issue Mar 5, 2019
… extent tree

In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert.  After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explaination on why we only do metadata prealloc for file
trees, but at least from my investigation, it could be related to avoid
nested extent tree modication.

At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.

So change the metadata block group preallocation check from
"root->ref_cow" to "root->root_key.objectid !=
BTRFS_EXTENT_TREE_OBJECTID", and add some comment for it.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
kdave pushed a commit that referenced this issue Apr 15, 2019
… extent tree

In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert.  After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explaination on why we only do metadata prealloc for file
trees, but at least from my investigation, it could be related to avoid
nested extent tree modication.

At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.

So change the metadata block group preallocation check from
"root->ref_cow" to "root->root_key.objectid !=
BTRFS_EXTENT_TREE_OBJECTID", and add some comment for it.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
adam900710 added a commit to adam900710/btrfs-progs that referenced this issue Apr 16, 2019
In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert.  After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explaination on why we only do metadata prealloc for file
trees, but from my educated guess, it could be related to avoid nested
extent/chunk tree modication.

At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.

So adding new condition for metadata preallocate to avoid unexpected
ENOSPC problem.

Issue: kdave#123
Signed-off-by: Qu Wenruo <wqu@suse.com>
@adam900710
Copy link
Collaborator

@elliotclee Would you please try with latest devel branch?

And could you provide the image for us to reproduce?

@elliotclee
Copy link

Hi,

I just downloaded and compiled btrfs-progs from the devel branch as of this evening, and here's what a convert produced:

[root@shiny btrfs-progs-devel]# ./btrfs-convert -L /dev/md126p6
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating ext2 image file
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
ctree.c:2245: split_leaf: BUG_ON 1 triggered, value 1
./btrfs-convert[0x410a4a]
./btrfs-convert[0x414440]
./btrfs-convert(btrfs_search_slot+0x2fc)[0x415001]
./btrfs-convert(btrfs_csum_file_block+0x482)[0x42756c]
./btrfs-convert[0x40ad7e]
./btrfs-convert(main+0x1a59)[0x40cd86]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7f602f0e2f33]
./btrfs-convert(_start+0x2e)[0x40a96e]
Aborted (core dumped)

I also ran it again under gdb and got this stack trace:
#0 0x00007ffff7d36eb5 in raise () from /lib64/libc.so.6
#1 0x00007ffff7d21895 in abort () from /lib64/libc.so.6
#2 0x000000000041517f in bugon_trace (assertion=0x471a9b "1", filename=0x4718c5 "ctree.c", func=0x4721f0 <func.9096> "split_leaf", line=2245, val=1) at kerncompat.h:115
#3 0x000000000041b691 in split_leaf (trans=0x5a7100, root=0x4c0cf0, ins_key=0x7fffffffd1e0, path=0x2e4f110, data_size=4, extend=0) at ctree.c:2245
#4 0x0000000000418e37 in btrfs_search_slot (trans=0x5a7100, root=0x4c0cf0, key=0x7fffffffd1e0, p=0x2e4f110, ins_len=4, cow=1) at ctree.c:1217
#5 0x00000000004387da in btrfs_csum_file_block (trans=0x5a7100, root=0x4c0cf0, alloc_end=41808887808, bytenr=41780957184,
data=0x2c02150 "\234D\221\070\341\232\314\301Ьn\234\265.\256\366\251\rH?\323d\345_\257\244\355\213", <incomplete sequence \350\250>, len=4096) at file-item.c:257
#6 0x000000000040b25a in csum_disk_extent (trans=0x5a7100, root=0x5a2530, disk_bytenr=41674670080, num_bytes=134217728) at convert/main.c:182
#7 0x000000000040b6ec in create_image_file_range (trans=0x5a7100, root=0x5a2530, used=0x7fffffffd408, inode=0x7fffffffd4a0, ino=257, bytenr=41674670080, ret_len=0x7fffffffd3f8, convert_flags=15)
at convert/main.c:320
#8 0x000000000040c8bd in create_image (root=0x5a2530, cfg=0x7fffffffd5d0, cctx=0x7fffffffd750, fd=4, size=1429364277248, name=0x470671 "image", convert_flags=15) at convert/main.c:840
#9 0x000000000040d72b in do_convert (devname=0x7fffffffdd56 "/dev/md126p6", convert_flags=15, nodesize=16384, fslabel=0x7fffffffd850 "&\260be", progress=1, features=320) at convert/main.c:1195
#10 0x000000000040ec2e in main (argc=3, argv=0x7fffffffda88) at convert/main.c:1857

(I also btrfs-convert with the -d -n flags, but it still errored out in btrfs_search_slot .)

Unfortunately this filesystem is 840GB of files on a 1.3T partition, and even if I had a place to upload the image to, it would take 48 days at my current internet connection speed. I might be able to set up a time to provide temporary remote access to someone to debug it that way - let me know if that would be helpful.

@kdave kdave closed this as completed in 4ab95eb May 17, 2019
@elliotclee
Copy link

FWIW this issue shouldn't be closed, as it still occurs in btrfs-progs-5.1, even with -d -n.

@adam900710
Copy link
Collaborator

@elliotclee With your latest output, it still looks like it's csum tree causing the problem.

For that case, we have patch to fix that (at least in theory).

Would you please also show the calltrace using -d and -n?

For the worst case, it's the ext* which has too fragmented free space that btrfs can't take much use of.

It would provide great help if I can access the image.

@elliotclee
Copy link

@adam900710 as mentioned in a previous comment, this fs is too large for me to be able to share it, but I can arrange to provide you with root access to the system if you want.

I just did a resize2fs to defragment the free space. Here is the output of btrfs-convert under gdb:
(gdb) b abort
Breakpoint 1 at 0x40a0e0
(gdb) r -d -n -L /dev/md126p6
Starting program: /root/btrfs-progs-5.1/btrfs-convert -d -n -L /dev/md126p6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-12.fc30.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating ext2 image file
creating btrfs metadata
[New Thread 0x7ffff48a2700 (LWP 4535)]
Unable to find block group for 0064629]
Unable to find block group for 0
Unable to find block group for 0
ctree.c:2245: split_leaf: BUG_ON 1 triggered, value 1
/root/btrfs-progs-5.1/btrfs-convert[0x410a10]
/root/btrfs-progs-5.1/btrfs-convert[0x414409]
/root/btrfs-progs-5.1/btrfs-convert(btrfs_search_slot+0x2fc)[0x414fca]
/root/btrfs-progs-5.1/btrfs-convert(btrfs_insert_empty_items+0x82)[0x41660b]
/root/btrfs-progs-5.1/btrfs-convert(btrfs_inc_extent_ref+0x433)[0x41db73]
/root/btrfs-progs-5.1/btrfs-convert(btrfs_record_file_extent+0x467)[0x423040]
/root/btrfs-progs-5.1/btrfs-convert(record_file_blocks+0x122)[0x40eef2]
/root/btrfs-progs-5.1/btrfs-convert[0x40f3dd]
/root/btrfs-progs-5.1/btrfs-convert[0x410304]
/root/btrfs-progs-5.1/btrfs-convert(main+0x1efd)[0x40d22a]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7ffff7d22f33]
/root/btrfs-progs-5.1/btrfs-convert(_start+0x2e)[0x40a96e]

Thread 1 "btrfs-convert" hit Breakpoint 1, 0x00007ffff7d2176e in abort () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install e2fsprogs-libs-1.44.6-1.fc30.x86_64 libblkid-2.33.2-1.fc30.x86_64 libcom_err-1.44.6-1.fc30.x86_64 libgcc-9.1.1-1.fc30.x86_64 libuuid-2.33.2-1.fc30.x86_64
(gdb) where
#0 0x00007ffff7d2176e in abort () from /lib64/libc.so.6
#1 0x000000000041440e in bugon_trace (val=1, line=2245, func=, filename=0x44d790 "ctree.c", assertion=0x44f3b6 "1") at kerncompat.h:115
#2 split_leaf (trans=trans@entry=0x5bca50, root=root@entry=0x4cabc0, ins_key=ins_key@entry=0x7fffffffceb0, path=path@entry=0x242cda0, data_size=data_size@entry=53, extend=extend@entry=0) at ctree.c:2245
#3 0x0000000000414fca in btrfs_search_slot (trans=, root=root@entry=0x4cabc0, key=key@entry=0x7fffffffceb0, p=p@entry=0x242cda0, ins_len=ins_len@entry=53, cow=cow@entry=1) at ctree.c:1217
#4 0x000000000041660b in btrfs_insert_empty_items (trans=trans@entry=0x5bca50, root=root@entry=0x4cabc0, path=path@entry=0x242cda0, cpu_key=cpu_key@entry=0x7fffffffceb0,
data_size=data_size@entry=0x7fffffffceac, nr=nr@entry=1) at ctree.c:2612
#5 0x000000000041db73 in btrfs_insert_empty_item (data_size=, key=0x7fffffffceb0, path=0x242cda0, root=0x4cabc0, trans=0x5bca50) at ctree.h:2640
#6 insert_extent_data_ref (refs_to_add=1, offset=18446744073640984576, owner=657521, root_objectid=5, parent=0, bytenr=11207245824, path=0x242cda0, root=0x4cabc0, trans=0x5bca50) at extent-tree.c:737
#7 insert_extent_backref (refs_to_add=1, offset=18446744073640984576, owner=657521, root_objectid=5, parent=0, bytenr=11207245824, path=0x242cda0, root=0x4cabc0, trans=0x5bca50) at extent-tree.c:1338
#8 btrfs_inc_extent_ref (trans=trans@entry=0x5bca50, root=root@entry=0x4e5720, bytenr=bytenr@entry=11207245824, num_bytes=num_bytes@entry=134217728, parent=parent@entry=0, root_objectid=5, owner=657521,
offset=18446744073640984576) at extent-tree.c:1409
#9 0x0000000000423040 in __btrfs_record_file_extent (ret_num_bytes=, disk_bytenr=11275812864, file_pos=0, inode=0x7fffffffd2d0, objectid=657521, root=0x4e5720, trans=0x5bca50)
at extent-tree.c:3909
#10 btrfs_record_file_extent (trans=0x5bca50, root=0x4e5720, objectid=657521, inode=0x7fffffffd2d0, file_pos=file_pos@entry=0, disk_bytenr=disk_bytenr@entry=11275812864, num_bytes=4096) at extent-tree.c:3938
#11 0x000000000040eef2 in record_file_blocks (data=data@entry=0x7fffffffd160, file_block=, disk_block=, num_blocks=) at convert/source-fs.c:285
#12 0x000000000040f3dd in ext2_create_file_extents (trans=trans@entry=0x5bca50, root=root@entry=0x4e5720, objectid=objectid@entry=657521, btrfs_inode=btrfs_inode@entry=0x7fffffffd2d0,
ext2_fs=ext2_fs@entry=0x47a8b0, ext2_ino=ext2_ino@entry=657267, convert_flags=12) at convert/source-ext2.c:334
#13 0x0000000000410304 in ext2_copy_single_inode (convert_flags=12, ext2_inode=0x7fffffffd380, ext2_ino=657267, ext2_fs=0x47a8b0, objectid=657521, root=0x4e5720, trans=0x5bca50) at convert/source-ext2.c:758
#14 ext2_copy_inodes (cctx=, root=0x4e5720, convert_flags=12, p=) at convert/source-ext2.c:824
#15 0x000000000040d22a in copy_inodes (p=0x7fffffffd670, convert_flags=12, root=0x4e5720, cctx=0x7fffffffd5a0) at convert/main.c:1217
#16 do_convert (features=, progress=, fslabel=0x7fffffffd860 "", nodesize=, convert_flags=, devname=) at convert/main.c:1217
#17 main (argc=, argv=) at convert/main.c:1857

@elliotclee
Copy link

@adam900710 also as mentioned before, I can get convert with '-d -n' to work successfully with btrfs-progs 4.17.1, but starting with 4.19 even that errors out.

@adam900710
Copy link
Collaborator

@elliotclee Please try the following diff

diff --git a/convert/main.c b/convert/main.c
index 68f76f71..c50ed6cb 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -102,6 +102,7 @@
 #include "convert/common.h"
 #include "convert/source-fs.h"
 #include "fsfeatures.h"
+#include "print-tree.h"
 
 extern const struct btrfs_convert_operations ext2_convert_ops;
 extern const struct btrfs_convert_operations reiserfs_convert_ops;
@@ -864,6 +865,13 @@ out:
 	free_extent_cache_tree(&used_tmp);
 	btrfs_release_path(&path);
 	btrfs_commit_transaction(trans, root);
+
+	printf("%s finished\n", __func__);
+	btrfs_print_bg_usage(root->fs_info);
+	printf("dev tree:\n");
+	btrfs_print_tree(root->fs_info->dev_root->node, 1, 0);
+	printf("chunk tree:\n");
+	btrfs_print_tree(root->fs_info->chunk_root->node, 1, 0);
 	return ret;
 }
 
diff --git a/extent-tree.c b/extent-tree.c
index 8c9cdeff..48b23cc7 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -2523,8 +2523,15 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
 			       search_start, search_end, hint_byte, ins,
 			       trans->alloc_exclude_start,
 			       trans->alloc_exclude_nr, profile);
-	if (ret < 0)
+	if (ret < 0) {
+		printf("%s failed, ret=%d\n", __func__, ret);
+		btrfs_print_bg_usage(root->fs_info);
+		printf("dev tree:\n");
+		btrfs_print_tree(root->fs_info->dev_root->node, 1, 0);
+		printf("chunk tree:\n");
+		btrfs_print_tree(root->fs_info->chunk_root->node, 1, 0);
 		return ret;
+	}
 	clear_extent_dirty(&info->free_space_cache,
 			   ins->objectid, ins->objectid + ins->offset - 1);
 	return ret;
diff --git a/print-tree.c b/print-tree.c
index ab774637..74a14c0f 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -1543,3 +1543,24 @@ void btrfs_print_tree(struct extent_buffer *eb, bool follow, int traverse)
 		bfs_print_children(eb);
 	return;
 }
+
+void btrfs_print_bg_usage(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_block_group_cache *cache;
+	char flags_str[256];
+
+	cache = btrfs_lookup_first_block_group(fs_info, 0);
+	if (!cache)
+		return;
+	while (cache) {
+		memset(flags_str, 0, sizeof(flags_str));
+		bg_flags_to_str(cache->flags, flags_str);
+		printf("bg start=%llu len=%llu flags=%s usage=%f%%\n",
+			cache->key.objectid, cache->key.offset,
+			flags_str,
+			btrfs_block_group_used(&cache->item) * 100.00 /
+				cache->key.offset);
+		cache = btrfs_lookup_first_block_group(fs_info,
+			cache->key.objectid + cache->key.offset);
+	}
+}
diff --git a/print-tree.h b/print-tree.h
index d4721b60..2e56a3a6 100644
--- a/print-tree.h
+++ b/print-tree.h
@@ -39,4 +39,6 @@ void print_chunk_item(struct extent_buffer *eb, struct btrfs_chunk *chunk);
 void print_extent_item(struct extent_buffer *eb, int slot, int metadata);
 void print_objectid(FILE *stream, u64 objectid, u8 type);
 void print_key_type(FILE *stream, u64 objectid, u8 type);
+
+void btrfs_print_bg_usage(struct btrfs_fs_info *fs_info);
 #endif
>objectid, ins->objectid + ins->offset - 1);
        return ret;

The diff will output the chunk and dev extents layout along with bg usage at 2 time points:

  1. After the initial ext2 image creation
  2. When ENOSPC happens

My initial guess is, after the initial ext2 image creation we're already running out of space, if so the diff would show that.
Another guess is, either we're unable to alloc new system chunks or initial chunk layout is already bad to have too little space to spare, either way, the diff would output enough info for us to check.

Please note that, the output may be large, please redirect it and save the output.

With the diff there is no need for gdb output.

@elliotclee
Copy link

@adam900710 Here you go.
convert-output.txt

@adam900710
Copy link
Collaborator

It shows that during that ENOSPC, we have a running transaction which is large and never reached disk.

This means:

  1. the chunk preallocate code doesn't work as expected
    Or we should have new metadata chunks allocated already.
  2. The way we commit current transaction is not proper
    Currently, we use trans->block_used, but that's unreliable.

For problem 1, I need to dig further to pin down the problem, but for 2. would you please try the following diff? (Can be applied separately along previous debug one)

diff --git a/convert/source-ext2.c b/convert/source-ext2.c
index a136e5652898..9e91e723c2f8 100644
--- a/convert/source-ext2.c
+++ b/convert/source-ext2.c
@@ -829,7 +829,7 @@ static int ext2_copy_inodes(struct btrfs_convert_context *cctx,
                pthread_mutex_unlock(&p->mutex);
                if (ret)
                        return ret;
-               if (trans->blocks_used >= 4096) {
+               if (trans->blocks_used >= 4096 || p->cur_copy_inodes % 16 == 0) {
                        ret = btrfs_commit_transaction(trans, root);
                        BUG_ON(ret);
                        trans = btrfs_start_transaction(root, 1);

If it works, please also try without -d/-n to see if it solves the problem.

Please note this is just a proof-of-concept modification, for the real fix we need to take a trade-off between performance and false ENOSPC.

Thanks.

@adam900710
Copy link
Collaborator

adam900710 commented May 23, 2019

Confirmed that after the introduction of delayed ref, we don't have reliable method to know if a block group is near full, so chunk preallocator never works for metadata block group.

The root fix would go that direction. For now, we can commit more frequently as a fast workaround.

@elliotclee
Copy link

@adam900710 The workaround seems to be...working, at least with -d -n flags :-) Is there any other information I can provide?

@adam900710
Copy link
Collaborator

@elliotclee No extra info needed for a while.

As the root cause is located, I'll keep you updated for incoming fix.

Thanks for all your feedback!

adam900710 added a commit to adam900710/btrfs-progs that referenced this issue May 24, 2019
[BUG]
There is a bug report of unexpected ENOSPC from btrfs-convert.
kdave#123

After some debug, even when we have enough unallocated space, we still
hit ENOSPC at btrfs_reserve_extent().

[CAUSE]
Btrfs-progs relies on chunk preallocator to make enough space for
data/metadata.

However after the introduction of delayed-ref, it's no longer reliable
to relie on btrfs_space_info::bytes_used and
btrfs_space_info::bytes_pinned to calculate used metadata space.

For a running transaction with a lot of allocated tree blocks,
btrfs_space_info::bytes_used stays its original value, and will only be
updated when running delayed ref.

This makes btrfs-progs chunk preallocator completely useless. And for
btrfs-convert/mkfs.btrfs --rootdir, if we're going to have enough
metadata to fill a metadata block group in one transaction, we will hit
ENOSPC no matter whether we have enough unallocated space.

[FIX]
This patch will introduce btrfs_space_info::bytes_reserved to trace how
many space we have reserved but not yet committed to extent tree.

To support this change, this commit also introduces the following
modification:
- More comment on btrfs_space_info::bytes_*
  To make code a little easier to read

- Export update_space_info() to preallocate empty data/metadata space
  info for mkfs.
  For mkfs, we only have a temporary fs image with SYSTEM chunk only.
  Export update_space_info() so that we can preallocate empty
  data/metadata space info before we start a transaction.

- Proper btrfs_space_info::bytes_reserved update
  The timing is the as kernel (except we don't need to update
  bytes_reserved for data extents)
  * Increase bytes_reserved when call alloc_reserved_tree_block()
  * Decrease bytes_reserved when running delayed refs
    With the help of head->must_insert_reserved to determine whether we
    need to decrease.

Issue: kdave#123
Signed-off-by: Qu Wenruo <wqu@suse.com>
@adam900710
Copy link
Collaborator

@elliotclee Would you please try my latest branch to see if it solves your problem?

If nothing wrong happens, it would be the root fix.

Thanks.

@elliotclee
Copy link

(This is with the workaround. Will try your latest patch in a minute.)

Hmm, I didn't let it run to completion earlier. Now I ran into the following farther along when it was running:

creating btrfs metadata
copy inodes [O] [ 623/ 1064629]
Failed to find [1428278558720, 168, 16384]
btrfs unable to find ref byte nr 1428278837248 parent 0 root 2 owner 0 offset 0
convert/source-ext2.c:834: ext2_copy_inodes: BUG_ON ret triggered, value -5
./btrfs-convert[0x41096c]
./btrfs-convert(main+0x205d)[0x40d439]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7f6d3261ef33]
./btrfs-convert(_start+0x2e)[0x40a96e]

#0 0x00007f6d32632eb5 in raise () from /lib64/libc.so.6
#1 0x00007f6d3261d895 in abort () from /lib64/libc.so.6
#2 0x0000000000410980 in bugon_trace (val=, line=834, func=, filename=0x44d494 "convert/source-ext2.c", assertion=0x44f559 "ret") at ./kerncompat.h:115
#3 ext2_copy_inodes (cctx=, root=0x1a2d720, convert_flags=12, p=) at convert/source-ext2.c:834
#4 0x000000000040d439 in copy_inodes (p=0x7ffe923c4610, convert_flags=12, root=0x1a2d720, cctx=0x7ffe923c4540) at convert/main.c:1273
#5 do_convert (features=, progress=, fslabel=0x7ffe923c4800 "", nodesize=, convert_flags=, devname=) at convert/main.c:1273
#6 main (argc=, argv=) at convert/main.c:1915

This happens at the place where it commits a transaction every so often when copying inodes.

@elliotclee
Copy link

With your root-cause-fix, it works like a champ.

While you're working in convert, can you look at the patch on /issues/175? It just speeds up the ext2 image creation step of convert and adds a progress indicator for it, mainly useful when data csumming is on.

@adam900710
Copy link
Collaborator

BTW, is that without any other modification just the root fix?

If so, I think it's time for the fix to go upstream.

And a quick glance into the unable to find ref bug, it is more complex than I think.
That is related to delayed ref, and may have some other factors involved.

I'll add extra upstream debug to see what we can do.

@elliotclee
Copy link

Bleah, so the root-cause-fix only worked with -d -n. When I use it without any extra flags, I get:

(gdb) where
#0 0x00007f38dfb3beb5 in raise () from /lib64/libc.so.6
#1 0x00007f38dfb26895 in abort () from /lib64/libc.so.6
#2 0x00000000004145ad in bugon_trace (val=1, line=2245, func=, filename=0x44d7d0 "ctree.c", assertion=0x44ee6e "1") at kerncompat.h:115
#3 split_leaf (trans=trans@entry=0x608650, root=root@entry=0x51b2b0, ins_key=ins_key@entry=0x7ffc306682f0, path=path@entry=0x2eeb320, data_size=data_size@entry=4, extend=extend@entry=0) at ctree.c:2245
#4 0x0000000000415169 in btrfs_search_slot (trans=trans@entry=0x608650, root=root@entry=0x51b2b0, key=key@entry=0x7ffc306682f0, p=p@entry=0x2eeb320, ins_len=ins_len@entry=4, cow=cow@entry=1) at ctree.c:1217
#5 0x0000000000427976 in btrfs_csum_file_block (trans=trans@entry=0x608650, root=0x51b2b0, alloc_end=alloc_end@entry=41808887808, bytenr=bytenr@entry=41793650688,
data=data@entry=0x2bb1000 "\224rY\275\254h\327\272\005c\301\020\232*\002\n\207QZ3\265p.\272\006[\301\020P\242\212V$\220\257=\032?1\006[\302\030\b\006(-\017P:\252\342\316\065\350{DBZ\343\365\373\037",
len=len@entry=4096) at file-item.c:257
#6 0x000000000040ae2d in csum_disk_extent (trans=trans@entry=0x608650, root=root@entry=0x603920, disk_bytenr=disk_bytenr@entry=41674670080, num_bytes=num_bytes@entry=134217728) at convert/main.c:213
#7 0x000000000040cec7 in create_image_file_range (convert_flags=15, ret_len=, bytenr=41674670080, ino=, inode=0x7ffc306686a0, used=0x7ffc30668440, root=0x603920,
trans=0x608650) at convert/main.c:351
#8 create_image (ctx=0x7ffc306684d0, convert_flags=15, name=0x44cd0b "image", size=1429364277248, fd=4, cctx=0x7ffc30668550, cfg=0x7ffc30668740, root=0x603920) at convert/main.c:872
#9 do_convert (features=, progress=, fslabel=0x7ffc30668810 "", nodesize=, convert_flags=, devname=) at convert/main.c:1239
#10 main (argc=, argv=) at convert/main.c:1907

The output with your debugging patch is attached.
convert-output.txt

adam900710 added a commit to adam900710/btrfs-progs that referenced this issue May 24, 2019
In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert.  After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explaination on why we only do metadata prealloc for file
trees, but from my educated guess, it could be related to avoid nested
extent/chunk tree modication.

At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.

So adding new condition for metadata preallocate to avoid unexpected
ENOSPC problem.

Issue: kdave#123
Signed-off-by: Qu Wenruo <wqu@suse.com>
@adam900710
Copy link
Collaborator

Oh, we still need one existing patch for that.

The branch updated to include that patch, to allow csum tree modification to allocate new chunk.

Sorry for the inconvenience, would you please fetch the latest branch and test again?

Thanks.

@elliotclee
Copy link

OK, with the latest, running with no flags, it completes the creation of the filesystem image, but hits that "unable to find ref" bug again when copying inodes. There are also some warnings about leaking reserved space:

[root@shiny btrfs-progs-chunk_preallocator]# ./btrfs-convert -L /dev/md126p6
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating ext2 image file
create filesystem image [O] [ 1331GiB/ 1331GiB]
creating btrfs metadata
WARNING: reserved space leaked, transid=4 flag=0x4 bytes_reserved=32768
WARNING: reserved space leaked, transid=5 flag=0x4 bytes_reserved=32768
WARNING: reserved space leaked, transid=6 flag=0x4 bytes_reserved=32768
Failed to find [1429288337408, 168, 16384]
btrfs unable to find ref byte nr 1429288583168 parent 0 root 2 owner 0 offset 0
convert/source-ext2.c:834: ext2_copy_inodes: BUG_ON ret triggered, value -5
./btrfs-convert[0x410941]
./btrfs-convert(main+0x1fdc)[0x40d3b8]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7f93bb7d2f33]
./btrfs-convert(_start+0x2e)[0x40a96e]

Stack trace:
(gdb) where
#0 0x00007f93bb7e6eb5 in raise () from /lib64/libc.so.6
#1 0x00007f93bb7d1895 in abort () from /lib64/libc.so.6
#2 0x0000000000410955 in bugon_trace (val=, line=834, func=, filename=0x44d454 "convert/source-ext2.c", assertion=0x44ef84 "ret") at ./kerncompat.h:115
#3 ext2_copy_inodes (cctx=, root=0x170b7c0, convert_flags=15, p=) at convert/source-ext2.c:834
#4 0x000000000040d3b8 in copy_inodes (p=0x7ffd9c009ad0, convert_flags=15, root=0x170b7c0, cctx=0x7ffd9c009a00) at convert/main.c:1265
#5 do_convert (features=, progress=, fslabel=0x7ffd9c009cc0 "", nodesize=, convert_flags=, devname=) at convert/main.c:1265
#6 main (argc=, argv=) at convert/main.c:1907

So I think issue #123 is probably fixed, but now that other bug is showing up.

@adam900710
Copy link
Collaborator

@elliotclee Would you please try the latest branch again?
I've added a new debug output, so when you hit the problem again, at least we could have some clue.

Thanks.

@elliotclee
Copy link

elliotclee commented May 27, 2019 via email

kdave pushed a commit that referenced this issue Jun 5, 2019
In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert.  After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explaination on why we only do metadata prealloc for file
trees, but from my educated guess, it could be related to avoid nested
extent/chunk tree modication.

At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.

So adding new condition for metadata preallocate to avoid unexpected
ENOSPC problem.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
kdave pushed a commit that referenced this issue Jun 5, 2019
In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert.  After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explaination on why we only do metadata prealloc for file
trees, but from my educated guess, it could be related to avoid nested
extent/chunk tree modication.

At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.

So adding new condition for metadata preallocate to avoid unexpected
ENOSPC problem.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
kdave pushed a commit that referenced this issue Jun 5, 2019
In github issues, one user reports unexpected ENOSPC error if enabling
datasum druing convert.  After some investigation, it looks like that
during ext2_saved/image creation, we could create large file extent
whose size can be 128M (max data extent size).

In that case, its csum block will be at least 128K. Under certain case
we need to allocate extra metadata chunks to fulfill such space
requirement.

However we only do metadata prealloc if we're reserving extents for fs
trees.  (we use btrfs_root::ref_cows to determine whether we should do
metadata prealloc, and that member is only set for fs trees).

There is no explaination on why we only do metadata prealloc for file
trees, but from my educated guess, it could be related to avoid nested
extent/chunk tree modication.

At least extent reservation for csum tree shouldn't be a problem with
metadata block group preallocation.

So adding new condition for metadata preallocate to avoid unexpected
ENOSPC problem.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
@kdave kdave modified the milestones: v5.1, v5.1.1 Jun 11, 2019
kdave pushed a commit that referenced this issue Jun 18, 2019
[BUG]
There is a bug report of unexpected ENOSPC from btrfs-convert, issue #123.

After some debugging, even when we have enough unallocated space, we
still hit ENOSPC at btrfs_reserve_extent().

[CAUSE]
Btrfs-progs relies on chunk preallocator to make enough space for
data/metadata.

However after the introduction of delayed-ref, it's no longer reliable
to rely on btrfs_space_info::bytes_used and
btrfs_space_info::bytes_pinned to calculate used metadata space.

For a running transaction with a lot of allocated tree blocks,
btrfs_space_info::bytes_used stays its original value, and will only be
updated when running delayed ref.

This makes btrfs-progs chunk preallocator completely useless. And for
btrfs-convert/mkfs.btrfs --rootdir, if we're going to have enough
metadata to fill a metadata block group in one transaction, we will hit
ENOSPC no matter whether we have enough unallocated space.

[FIX]
This patch will introduce btrfs_space_info::bytes_reserved to track how
many space we have reserved but not yet committed to extent tree.

To support this change, this commit also introduces the following
modification:

- More comment on btrfs_space_info::bytes_*
  To make code a little easier to read

- Export update_space_info() to preallocate empty data/metadata space
  info for mkfs.
  For mkfs, we only have a temporary fs image with SYSTEM chunk only.
  Export update_space_info() so that we can preallocate empty
  data/metadata space info before we start a transaction.

- Proper btrfs_space_info::bytes_reserved update
  The timing is the as kernel (except we don't need to update
  bytes_reserved for data extents)
  * Increase bytes_reserved when call alloc_reserved_tree_block()
  * Decrease bytes_reserved when running delayed refs
    With the help of head->must_insert_reserved to determine whether we
    need to decrease.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
kdave pushed a commit that referenced this issue Jul 3, 2019
[BUG]
There is a bug report of unexpected ENOSPC from btrfs-convert, issue #123.

After some debugging, even when we have enough unallocated space, we
still hit ENOSPC at btrfs_reserve_extent().

[CAUSE]
Btrfs-progs relies on chunk preallocator to make enough space for
data/metadata.

However after the introduction of delayed-ref, it's no longer reliable
to rely on btrfs_space_info::bytes_used and
btrfs_space_info::bytes_pinned to calculate used metadata space.

For a running transaction with a lot of allocated tree blocks,
btrfs_space_info::bytes_used stays its original value, and will only be
updated when running delayed ref.

This makes btrfs-progs chunk preallocator completely useless. And for
btrfs-convert/mkfs.btrfs --rootdir, if we're going to have enough
metadata to fill a metadata block group in one transaction, we will hit
ENOSPC no matter whether we have enough unallocated space.

[FIX]
This patch will introduce btrfs_space_info::bytes_reserved to track how
many space we have reserved but not yet committed to extent tree.

To support this change, this commit also introduces the following
modification:

- More comment on btrfs_space_info::bytes_*
  To make code a little easier to read

- Export update_space_info() to preallocate empty data/metadata space
  info for mkfs.
  For mkfs, we only have a temporary fs image with SYSTEM chunk only.
  Export update_space_info() so that we can preallocate empty
  data/metadata space info before we start a transaction.

- Proper btrfs_space_info::bytes_reserved update
  The timing is the as kernel (except we don't need to update
  bytes_reserved for data extents)
  * Increase bytes_reserved when call alloc_reserved_tree_block()
  * Decrease bytes_reserved when running delayed refs
    With the help of head->must_insert_reserved to determine whether we
    need to decrease.

Issue: #123
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants