Transforming reiserfs can lead to "kernel BUG at fs/reiserfs/journal.c:3039!" or "unable to handle page fault for address" #54

Closed
audibleptr opened this issue Sep 21, 2023 · 1 comment

Steps to reproduce

  1. # apt-get install -y fstransform reiserfsprogs

Repeat the following steps:

  1. Format a partition as ext2 (4 GB is enough).
  2. Mount the partition.
  3. Create a couple of random files on the partition:
     # for i in `seq 1 2`; do head -c 1G /dev/urandom > "random_$i.txt"; done
  4. # fstransform /path/to/partition reiserfs
  5. # fstransform /path/to/partition ext2

The issue doesn't reproduce every time (I'd say approximately 1/30 tries), so the steps need to be repeated.
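
Since the failure only shows up in roughly one run out of thirty, the steps above can be wrapped in a loop. The following is a minimal sketch, not a script from the original report: the device and mount point defaults (`/dev/sdX1`, `/test-mount`), the helper names and the `DRY_RUN` switch are all assumptions made for illustration. With `DRY_RUN=1` (the default here) the commands are only printed, because a real run destroys everything on the device.

```shell
#!/bin/sh
# Hypothetical reproduction loop. DEV and MNT are placeholders (assumptions);
# every real run wipes DEV. DRY_RUN=1 only prints the commands.
DEV="${DEV:-/dev/sdX1}"
MNT="${MNT:-/test-mount}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create one 1 GiB file of random data (needs a redirection, so it
# cannot go through run()).
make_random_file() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ head -c 1G /dev/urandom > $1"
    else
        head -c 1G /dev/urandom > "$1"
    fi
}

# One pass over the steps from the report: ext2 -> reiserfs -> ext2.
repro_once() {
    run mkfs.ext2 -q -F "$DEV" || return 1
    run mount "$DEV" "$MNT" || return 1
    for i in 1 2; do
        make_random_file "$MNT/random_$i.txt" || return 1
    done
    run fstransform "$DEV" reiserfs || return 1
    run fstransform "$DEV" ext2 || return 1
    run umount "$MNT"
}

# Repeat up to $1 times, stopping at the first failing pass.
repro_loop() {
    max="$1"
    n=1
    while [ "$n" -le "$max" ]; do
        echo "attempt $n"
        repro_once || { echo "failed on attempt $n"; return 1; }
        n=$((n + 1))
    done
}
```

To run it for real, something like `DEV=/dev/sdb1 DRY_RUN=0 repro_loop 30` as root would be the idea. Note that fstransform waits for ENTER or CTRL+C when one of its helper commands fails, so the loop may pause there instead of returning.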

Actual result

fstransform exits with an error, because a program it calls exits with "Segmentation fault" or "Killed":

fstransform: starting version 0.9.4, checking environment
fstransform: checking for which...         '/usr/bin/which'
[...]
12:02:32 fstransform: environment check passed.
12:02:32 fstransform: saving output of this execution into /var/tmp/fstransform/fstransform.log.15019
12:02:32 fstransform: preparing to transform device '/dev/sdb1' to file-system type 'ext2'
12:02:32 fstransform: device is mounted at '/test-mount' with file-system type 'reiserfs'
12:02:32 fstransform: device raw size = 4292870144 bytes
12:02:32 fstransform: creating sparse loop file '/test-mount/.fstransform.loop.15019' inside device '/dev/sdb1'...
Killed

12:02:32 ERROR! fstransform: failed to create or truncate '/test-mount/.fstransform.loop.15019' to zero bytes
                maybe device '/dev/sdb1' is full or mounted read-only?

The program can exit at different stages, e.g. at moving '/dev/sdb1' contents into the loop file:

fstransform: starting version 0.9.4, checking environment
fstransform: checking for which...         '/usr/bin/which'
fstransform: checking for expr...         '/usr/bin/expr'
fstransform: checking for id...         '/usr/bin/id'
fstransform: parsing command line arguments
fstransform: checking for stat...         '/usr/bin/stat'
fstransform: checking for mkfifo...         '/bin/mkfifo'
fstransform: checking for blockdev...         '/sbin/blockdev'
fstransform: checking for losetup...         '/sbin/losetup'
fstransform: checking for fsck...         '/sbin/fsck'
fstransform: checking for mkfs...         '/sbin/mkfs'
fstransform: checking for mount...         '/bin/mount'
fstransform: checking for umount...         '/bin/umount'
fstransform: checking for mkdir...         '/bin/mkdir'
fstransform: checking for rmdir...         '/bin/rmdir'
fstransform: checking for rm...         '/bin/rm'
fstransform: checking for dd...         '/bin/dd'
fstransform: checking for sync...         '/bin/sync'
fstransform: checking for fsmove...         '/usr/sbin/fsmove'
fstransform: checking for fsmount_kernel...         '/usr/sbin/fsmount_kernel'
fstransform: checking for fsremap...         '/usr/sbin/fsremap'
fstransform: checking for fsck(source file-system)...        '/sbin/fsck'
fstransform: checking for fsck(target file-system)...        '/sbin/fsck'
fstransform: looking for optional commands
fstransform: checking for sleep...         '/bin/sleep'
fstransform: checking for date...         '/bin/date'
15:38:02 fstransform: environment check passed.
15:38:02 fstransform: saving output of this execution into /var/tmp/fstransform/fstransform.log.6911
15:38:02 fstransform: preparing to transform device '/dev/sdb1' to file-system type 'reiserfs'
15:38:02 fstransform: device is mounted at '/test-mount' with file-system type 'ext2'
15:38:02 fstransform: device raw size = 4292870144 bytes
15:38:02 fstransform: creating sparse loop file '/test-mount/.fstransform.loop.6911' inside device '/dev/sdb1'...
15:38:02 dd: 1+0 records sent
15:38:02 dd: 1+0 records received
15:38:02 dd: 1 byte copied, 0,000177744 s, 5,6 kB/s
15:38:02 fstransform: device file-system block size = 4096 bytes
15:38:02 fstransform: device usable size = 4292870144 bytes
15:38:02 dd: 1+0 records sent
15:38:02 dd: 1+0 records received
15:38:02 dd: 1 byte copied, 0,000110167 s, 9,1 kB/s
15:38:02 fstransform: unmounting device '/dev/sdb1' and remounting it read-only using kernel driver
15:38:02 fstransform: launching '/usr/sbin/fsremap' in simulated mode for pre-validation
15:38:02 fsremap: setting log level to NOTICE
15:38:02 fsremap: starting job 2, persistence data and logs are in '/var/tmp/fstransform/fsremap.job.2'
15:38:02 fsremap: analysis completed: 8.00 kilobytes must be relocated
15:38:03 fsremap: allocated 8.00 kilobytes RAM as memory buffer
15:38:03 fsremap: primary-storage is 1.00 megabytes, initialized and mmapped() to contiguous RAM
15:38:03 fsremap: (simulated) starting in-place remapping. this may take a LONG time ...
15:38:03 fsremap: (simulated) progress: 50.0% done,   8.0 kilobytes still to remap
15:38:03 fsremap: (simulated) clearing 4.00 gigabytes free-space from device ...
15:38:03 fsremap: (simulated) job completed.
15:38:03 fstransform: unmounting device '/dev/sdb1' and remounting it read-write
15:38:03 fstransform: connected loop device '/dev/loop0' to file '/test-mount/.fstransform.loop.6911'
15:38:03 fstransform: formatting loop device '/dev/loop0' with file-system type 'reiserfs'...
15:38:03 mkfs: mkfs.reiserfs 3.6.27
15:38:03 mkfs:
15:38:03 fstransform: mounting loop device '/dev/loop0' on '/tmp/fstransform.loop.6911' ...
15:38:03 fstransform: loop device '/dev/loop0' mounted successfully.
15:38:03 fstransform: preliminary steps completed, now comes the delicate part:
15:38:03 fstransform: fstransform will move '/dev/sdb1' contents into the loop file.

15:38:03 fstransform: WARNING: THIS IS IMPORTANT! if either the original device '/dev/sdb1'
                      or the loop device '/dev/loop0' become FULL,

                       YOU  WILL  LOSE  YOUR  DATA !

                      fstransform checks for enough available space,
                      in any case it is recommended to open another terminal, type
                        watch df /dev/sdb1 /dev/loop0
                      and check that both the original device '/dev/sdb1'
                      and the loop device '/dev/loop0' are NOT becoming full.
                      if one of them is becoming full (or both),
                      you MUST stop fstransform with CTRL+C or equivalent.

15:38:03 fstransform: moving '/dev/sdb1' contents into the loop file.
15:38:03 fstransform: this may take a long time, please be patient...
Killed

15:38:03 ERROR! fstransform: command '/usr/sbin/fsmove -- /test-mount /tmp/fstransform.loop.6911 --exclude /test-mount/.fstransform.loop.6911' failed (exit status 137)
                this is potentially a problem.
                you can either quit now by pressing ENTER or CTRL+C,

                or, if you know what went wrong, you can fix it yourself,
                then manually run the command '/usr/sbin/fsmove -- /test-mount /tmp/fstransform.loop.6911 --exclude /test-mount/.fstransform.loop.6911'
                (or something equivalent)
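
The "exit status 137" in the log above follows the usual shell convention that a command terminated by a signal is reported as 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL), matching the "Killed" message. A "Segmentation fault" would show up as 139 (signal 11, SIGSEGV). A small sketch to decode such a status; the function name is made up for illustration:

```shell
#!/bin/sh
# Decode a shell exit status into a signal number.
# Prints 0 when the status does not indicate death by signal.
signal_from_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    else
        echo 0
    fi
}

signal_from_status 137   # prints 9  (SIGKILL)
signal_from_status 139   # prints 11 (SIGSEGV)
```

That the process is killed at the same moment the kernel logs its error suggests the kernel itself terminated it after the fault, but this is an inference from the logs, not something the report states.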

This results in the following in journalctl:

kernel: BUG: unable to handle page fault for address: ffffffffd60b47cb
kernel: #PF: supervisor read access in kernel mode
kernel: #PF: error_code(0x0000) - not-present page

Full: journalctl-1.log

Or:

kernel: kernel BUG at fs/reiserfs/journal.c:3039!
kernel: invalid opcode: 0000 [#1] PREEMPT SMP PTI

Full: journalctl-2.log
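
To check whether a given attempt hit one of these two kernel errors, the kernel log can be filtered for the two message prefixes quoted above. This is only a sketch: the function name is an assumption, and the patterns cover just the two messages seen in this report.

```shell
#!/bin/sh
# Keep only the lines that match the two kernel errors from this report.
# Reads log text on stdin.
kernel_bug_lines() {
    grep -E 'kernel BUG at|BUG: unable to handle page fault'
}

# Typical use against the kernel messages of the current boot:
#   journalctl -k -b --no-pager | kernel_bug_lines
```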

Expected result

No errors, including kernel errors in journalctl.

Extra information

It's possible this can be reproduced with filesystems other than ext2.

Reproducibility

Tested on ALT Linux Sisyphus and p10; tested on both VMs and real hardware.

It seems this is reproducible only on 6.1 kernels:

  • 5.10.194: not reproducible
  • 6.1.52: reproducible
  • 6.1.53: reproducible
  • 6.4.16: not reproducible
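
For anyone trying to reproduce this, a quick check of whether the running kernel belongs to the 6.1 series may save time. The sketch below is based only on the four versions tested above, so a match means "same series as the failing kernels", not "known to be affected"; the function name is an assumption.

```shell
#!/bin/sh
# Return success if the given kernel release string is in the 6.1 series.
in_6_1_series() {
    case "$1" in
        6.1|6.1.*|6.1-*) return 0 ;;
        *) return 1 ;;
    esac
}

if in_6_1_series "$(uname -r)"; then
    echo "running a 6.1 kernel: same series as the failing ones"
else
    echo "not a 6.1 kernel: the report did not reproduce on 5.10.194 or 6.4.16"
fi
```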

Package versions:

fstransform-0.9.4-alt2_11.x86_64

libreiserfsprogs-3.6.27-alt1.x86_64
reiserfsprogs-3.6.27-alt1.x86_64

Reproducible regardless of whether the following packages are installed:

libprogsreiserfs-0.3.0.5-alt5.x86_64
libreiser4-1.2.1-alt3.x86_64

cosmos72 (Owner) commented Sep 25, 2023

Hello @audibleptr

TL;DR: it's very likely a kernel bug.

Long version:
There are only a few ways for a userspace program to cause "kernel BUG" system messages.

  • The first is an actual kernel bug, either in the kernel core or in some driver/module.
  • The second is to mount a corrupted filesystem, or corrupt a mounted filesystem.
  • There are surely others, but they quickly lead in directions fstransform does not pursue.

I'd rule out the second option, "mount a corrupted filesystem, or corrupt a mounted filesystem":
fstransform never writes to mounted file systems, which excludes "corrupt a mounted filesystem".
It only writes to unmounted file systems, and runs fsck before mounting them.
As long as fsck itself is not buggy, this also excludes "mount a corrupted filesystem".

Also, the fact that only a subset of the kernel versions you tried (6.1.52, 6.1.53) exhibit this problem,
while the others (5.10.194, 6.4.16) are immune to it, points towards a kernel bug.

Finally, it's not unheard of: fstransform has triggered bugs in kernel code for me before,
as it makes heavy use of huge sparse files and random writes into them.
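
To illustrate the kind of I/O pattern meant here (this is not fstransform's actual code, only a hand-written sketch with a made-up function name), a sparse file can be created with `truncate` and then written at scattered block offsets with `dd`:

```shell
#!/bin/sh
# Create a sparse file of the given size in bytes, then write one 4096-byte
# block of random data at each listed block offset. conv=notrunc keeps dd
# from truncating the file at the write position.
make_sparse_with_writes() {
    file="$1"
    size="$2"
    shift 2
    truncate -s "$size" "$file" || return 1
    for blk in "$@"; do
        dd if=/dev/urandom of="$file" bs=4096 seek="$blk" count=1 \
           conv=notrunc 2>/dev/null || return 1
    done
}

# Example: a 64 MiB sparse file with three scattered blocks written.
#   make_sparse_with_writes /tmp/sparse.img 67108864 10 5000 3
```

On a filesystem that supports sparse files, `du` on the result should report far less than the apparent size shown by `ls -l`, since only the written blocks are allocated.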

cosmos72 self-assigned this Sep 25, 2023