Unable to run btrfs send after deduplication #50
Hi, to be honest this sounds like some sort of file system corruption. Have you been able to run btrfsck against the disk in question? What does it report? |
btrfs check --repair /dev/sdc I am still able to successfully run send on a snapshot taken immediately prior to running duperemove so the issue certainly seems to be related to the deduplication process. I've tested this several times with the same result. FWIW if I do a Not sure if this is related to the issue I'm having or not but trying to give you as much information to work with as possible. |
The time change means that we did indeed run on the files so that helps, thanks :) So for some reason I didn't realize that this was happening to you right after a dedupe run. That is definitely not expected behavior though I'm not sure it's the duperemove userspace causing it. Give me some time (it's nearing the end of my day) to try to reproduce this locally on my machine. In the meantime, what kernel are you using? The output of 'uname -a' would tell me this. Also if you were to cut and paste the exact commands you use to reproduce this it would speed things up on my end. |
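Since the kernel version keeps coming up in this thread, here is a minimal sketch of how the release string could be collected and compared in a script. The helper name `parse_kernel_release` is made up for illustration; on Linux, `platform.release()` returns the same string that `uname -r` prints.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple:
    """Extract the leading numeric version from a kernel release string.

    Hypothetical helper: "4.2.0-1.el7.elrepo.x86_64" becomes (4, 2, 0).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # Drop the patch level when the string has none (e.g. "4.2-rc7").
    return tuple(int(part) for part in match.groups() if part is not None)


if __name__ == "__main__":
    release = platform.release()
    print(release, parse_kernel_release(release))
```

Tuples compare element by element, so a check such as `parse_kernel_release(release) >= (4, 3)` can be used to gate a script on a minimum kernel version.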
On the Ubuntu 14.04.1 machine: To reproduce the issue on a newly created btrfs volume (mounted at /drive/2) I run the following commands:
At this point running btrfs send on .backup/pre will succeed; however, running it on .backup/post will fail with the error posted above. |
I also have the problem... please tell me how to fix the send/receive. |
OK, I fixed it by doing a full balance... but this can't be right, can it? |
I can also confirm that performing a balance allows me to use send as normal. Hopefully this can provide some clue as to what's occurring... :) |
Personally, this sounds like a problem with the in-kernel implementation of IOC_BTRFS_EXTENT_SAME. Has anyone reported this on linux-btrfs@vger.kernel.org? |
I'm also pretty sure it's a kernel bug; there's nothing userspace should be able to do to make something like this fail. The extent same ioctl uses the clone code beneath some safety checks. Clone is also used to do reflinks, so we could exercise (mostly) the same path just by making 'cp --reflink=always' copies of a file and then trying to btrfs send that subvolume. If you can do that and tell me whether btrfs send still breaks, that would help narrow it down a bit. Re the btrfs list, feel free to send them a bug report. You might want to CC me if you do that so I can help out with it. |
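The reflink test described above could be scripted roughly as follows. This is only a sketch: the function names, the paths, and the file name `original.bin` are placeholders rather than anything from the original report, and the functions only build the command lines without running them.

```python
import shlex


def reflink_repro_commands(subvol, snapshot, copies=3):
    """Return the commands for a reflink-only test, without running them.

    `subvol` and `snapshot` are hypothetical paths; `original.bin` is an
    assumed placeholder for a file that already exists in `subvol`.
    """
    source = f"{subvol}/original.bin"
    commands = []
    for index in range(copies):
        # --reflink=always shares extents through the clone path instead of copying data.
        commands.append(["cp", "--reflink=always", source, f"{subvol}/copy{index}.bin"])
    # btrfs send only accepts read-only snapshots, hence -r.
    commands.append(["btrfs", "subvolume", "snapshot", "-r", subvol, snapshot])
    # The stream itself is discarded; only the exit status and dmesg matter here.
    commands.append(["btrfs", "send", "-f", "/dev/null", snapshot])
    return commands


def as_shell(commands):
    """Render each argv list as a shell-quoted command line."""
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    for line in as_shell(reflink_repro_commands("/drive/2", "/drive/2/.backup/reflink")):
        print(line)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; the lists can be passed to `subprocess.run` on a scratch filesystem once the paths have been checked.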
Hi Mark,
Running btrfs send on .backup/reflink completes without error. Hope this helps narrow things down. |
New info: btrfs balance only works if no other snapshots are present. That's bad... how can I migrate snapshots in btrfs? |
New info: btrfs send and receive now work with btrfs-progs 4.0 on both sides :) even after an (un)successful dedup. |
@mac-linux-free: unfortunately I cannot confirm this! Running on Linux kernel 4.0.1 with btrfs-progs 4.0, I still get the error when trying to send a snapshot of a volume after having run duperemove on that volume. |
ok ... we are waiting for 4.1 :) |
It seems that kernel 4.1 and btrfs-progs 4.1 lead to a successful deduplication... but you have to balance your source btrfs pool first. |
I've just balanced my 6TB backup target... it took weeks :/ |
...hoping I don't have to do it on my 110TB source :\ |
This sounds like it was fixed upstream - and to my knowledge the fix wasn't directly related to dedupe (let me know if otherwise). Going to close for now. |
Sorry to open this again. This is still not fixed with kernel 4.1.2 and btrfs-progs 4.1. After an unsuccessful dedup: Kernel processed data (excludes target files): 32.0G the send / receive does not work: BTRFS error (device vdb): did not find backref in send_root. inode=17588, offset=1703936, disk_byte=20751888384 found extent=20751888384 I fixed send/receive again with a full balance. But how to dedup? |
Same here. Ran a dedup and got the send_root bug. The problem is that balance currently randomly crashes my machine (with a hard lock). I am deleting the offending inodes one by one ... |
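For anyone tracking down the affected inodes, here is a small sketch that pulls the fields out of the kernel message quoted in this thread. The regular expression is written against the message text as it appears in these comments and may need adjusting for other kernel versions; the function name is made up for illustration.

```python
import re

# Pattern based on the dmesg line quoted in this thread; other kernels may differ.
BACKREF_ERROR = re.compile(
    r"BTRFS error \(device (?P<device>[^)]+)\): "
    r"did not find backref in send_root\. "
    r"inode=(?P<inode>\d+), offset=(?P<offset>\d+), disk_byte=(?P<disk_byte>\d+)"
)


def find_backref_errors(log_text):
    """Return one dict per 'did not find backref' message found in log_text."""
    errors = []
    for match in BACKREF_ERROR.finditer(log_text):
        errors.append(
            {
                "device": match["device"],
                "inode": int(match["inode"]),
                "offset": int(match["offset"]),
                "disk_byte": int(match["disk_byte"]),
            }
        )
    return errors


if __name__ == "__main__":
    sample = (
        "BTRFS error (device vdb): did not find backref in send_root. "
        "inode=17588, offset=1703936, disk_byte=20751888384 found extent=20751888384"
    )
    for error in find_backref_errors(sample):
        print(error)
```

The inode numbers can then be mapped back to file names, for example with `btrfs inspect-internal inode-resolve`, before deciding what to do with each file.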
Ooops ok I'm going to keep this one open, with the same comment I gave in issue#87: "Regarding the btrfs error you're seeing during send, there isn't anything that duperemove is doing directly which would cause this behavior. My guess is that send is broken (again) or that the clone code (used in the kernel for dedupe) corrupted something on disk. Have you asked about this on the btrfs list?" I'll try to reproduce this week (on vacation) and take it to the list if I can't figure out immediately what's causing it. |
Ok I tried this a few times on 4.2-rc7 and was not able to reproduce the issue. Here's an example of what I was doing (I tried a few combinations of subvolumes and file trees)
Does something like this reproduce it for you all or did I miss a step? |
Sorry to say it is still not working. Linux fbo-fs-02 4.2.0-1.el7.elrepo.x86_64 #1 SMP Sun Aug 30 21:25:29 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux #dmesg This error occurs on the fileserver which I had deduped before. |
Could it be that the error is due to the previous dedup? By the way, it seems that btrfs-progs 4.1.2 can recover from these errors. I had one filesystem with those, and running a btrfs check --repair removed them. |
No, I did not dedup until kernel 4.2 was finally released. I do a send/receive backup to a different server nightly, and after the dedup my scripted backup failed. I also know how to recover from the error with just an online btrfs balance. |
I tried my test on 4.1 and didn't hit anything. Can anyone here give me a test case which reproduces this from a fresh file system? I doubt I'll be able to find this without a reproducer :( |
I do not understand this. It is happening on one server only. I had 2 new installations last week and these errors did not occur on the new servers. Perhaps I should try the btrfs check --repair option instead of only balancing. How could I debug this? |
I'd start with btrfsck if you haven't already. |
Now I have some more info. You should avoid running btrfs send/receive while duperemove is running (in the background). I also found that OOM errors sometimes occurred on big filesystems. Both of these lead to the error above. |
The problem still exists on kernel 4.2.3 and btrfs-progs 4.2.2... I have many snapshots and run duperemove with the -x switch. Is this the right way to do it, or do I have to run duperemove on the whole pool? |
Perhaps I should dedup at the pool level and not at the mounted subvol (/mnt/btrfs/files instead of /mnt/files)? ... I'm testing. |
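One way to keep a scheduled dedupe run and a scheduled send from overlapping is to have both jobs take the same advisory lock. The sketch below uses `flock` from the Python standard library; the helper name `exclusive_job` and the lock file location are assumptions, and this only serialises jobs that cooperate by using the same lock file. It does not address the underlying kernel problem.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_job(lock_path, blocking=True):
    """Hold an exclusive advisory lock on lock_path while the body runs.

    Hypothetical helper: with blocking=False, BlockingIOError is raised
    if another job already holds the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        fcntl.flock(fd, flags)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        # Closing the descriptor also drops the lock if unlocking was skipped.
        os.close(fd)
```

Both the dedupe wrapper and the backup wrapper would run their command inside `with exclusive_job(lock_path):`, so whichever starts second waits (or, with `blocking=False`, skips that run) until the first has finished.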
I raised a duplicate bug for this issue at: Filipe mentioned that this was fixed in the 4.3 kernel via: |
Closing as this was fixed in the upstream kernel, thanks to Filipe Manana. |
This does not seem to be fixed. At least I just encountered it. I ran duperemove and now send causes OOM which then causes a kernel panic after it kills every single program and cannot kill any more. Running rebalance and hopefully that will get things working again. |
@samcv which kernel version are you using? I'd suggest raising the issue on the linux-btrfs vger.kernel.org mailing list if you're hitting it with a recent kernel. |
@ddiss I am using 4.15.10. I just did a full rebalance and now send/receive works fine. |
Hi there,
After running "duperemove -drv" on a btrfs filesystem and taking a readonly snapshot I am no longer able to back it up off disk using btrfs send. The following error is thrown: "ERROR: send ioctl failed with -5: Input/output error"
Checking dmesg shows a corresponding "BTRFS error (device sdc): did not find backref in send_root. inode=708396, offset=131072, disk_byte=379846254592 found extent=379846254592"
The send command still works for the previous snapshot (prior to deduplication) and btrfs scrub shows no errors.
I have tested under Ubuntu 14.04.1 and 14.10 using btrfs-tools versions 3.12 and 3.14.1, respectively, and the issue persists. I am running duperemove v0.09.1.
Is this a known issue? Is there a workaround? Cheers