Unexpected exception: inode has no file (any) #217

Closed
tpwrules opened this issue Apr 28, 2024 · 8 comments
Assignee: mhx
Labels: bug (Something isn't working)
Milestone: v0.9.9

tpwrules commented Apr 28, 2024

I'm trying to use dwarfs to compress a rather large directory and I'm running into this mysterious error. I tried it twice with the same settings and got the same error both times, so this doesn't seem to be a transient problem. Since the scan alone takes hours, I'm reluctant to fiddle with the arguments; I'm hoping for some input on how best to debug this.

$ mkdwarfs -i /some/path -o out.dwarfs -B 16 --order=revpath                                                                    
I 14:36:59.353254 scanning "/some/path"
I 17:41:35.859071 assigning directory and link inodes...                                                                                                                                                                                    
I 17:41:53.096851 waiting for background scanners...                                                                   
I 17:41:53.096993 scanning CPU time: 1.502h                                                                            
I 17:41:53.097007 finalizing file inodes...                                                                            
I 17:42:49.902984 saved 359.1 GiB / 1.538 TiB in 20168616/29741820 duplicate files                                     
I 17:42:51.334990 assigning device inodes...                                                                           
I 17:42:54.303668 assigning pipe/socket inodes...                                                                      
I 17:42:56.423774 building metadata...                                                                                 
I 17:42:56.423831 building blocks...                                                                                   
I 17:42:56.423873 saving names and symlinks...                                                                         
I 17:42:56.429051 waiting for segmenting/blockifying to finish...                                                      
F 17:42:57.069465 exception thrown in worker thread: dwarfs::runtime_error: inode has no file (any) [inode_manager.cpp:205]
_______
scanning: /some/path/some/file
5097172 dirs, 21640907/0 soft/hard links, 29741820/29741820 files, 0 other                                             
original size: 1.538 TiB, hashed: 776.8 GiB (29614083 files, 2.136 MiB/s)                                              
scanned: 0 B (0 files, 0 B/s), categorizing: 0 B/s                                                                     
saved by deduplication: 359.1 GiB (20168616 files), saved by segmenting: 0 B                                           
filesystem: 0 B in 0 blocks (0 chunks, 9573203 fragments, 9573204/9573204 inodes)                                      
compressed filesystem: 0 blocks/0 B written                                                                            
▏                                                                                                                                                         ▏  0% 🌑
*** Aborted at 1714239777 (Unix time, try 'date -d @1714239777') ***
*** Signal 6 (SIGABRT) (0x17ef10) received by PID 1568528 (pthread TID 0x7f2b11eda6c0) (linux TID 1568801) (maybe from PID 1568528, UID 0) (code: -6), stack trace: ***
    @ 0000000000b2e2e9 (unknown)
    @ 0000000000cee4af (unknown)
    @ 0000000000d1bcbc (unknown)
    @ 0000000000cee42d (unknown)
    @ 0000000000420d59 (unknown)
    @ 00000000004d7d65 (unknown)
    @ 0000000000439f69 (unknown)
    @ 000000000055a7c4 (unknown)
    @ 0000000000c876e3 (unknown)
    @ 0000000000d1a3eb (unknown)
    @ 0000000000d6905b (unknown)
Aborted

I am using the release dwarfs 0.9.8 binary downloaded from GitHub on x86_64-linux. The files are all on a ZFS filesystem which does not report any errors. The machine has 56 CPUs and 128GiB of ECC RAM. No other processes are accessing or modifying the directory I'm trying to compress.

tpwrules changed the title from "Unexpected inode has no file (any) abort" to "Unexpected exception: inode has no file (any)" Apr 28, 2024
mhx (Owner) commented Apr 28, 2024

Ouch, that takes some time to reproduce unfortunately, sorry.

The stack trace sadly isn't of much help since the damage is done earlier, quite likely in the "finalizing file inodes" stage. That's where files are assigned to the inode objects, and at least one of these inode objects ends up with no file at all.

I have a hunch that this could be related either to an optimization introduced between the 0.8.0 and 0.9.0 releases (which skips hashing of large files if they have a unique size), or to a relatively large change that handles file errors more gracefully, introduced between 0.7.x and 0.8.0.

I'd appreciate it if you could check whether this problem is reproducible with dwarfs-0.8.0. If it is, it would be interesting to know whether it also reproduces with dwarfs-0.7.5. Please also add --log-level=verbose for the 0.8.0 release (0.7.5 doesn't support that option yet).

In the meantime, I'll be running mkdwarfs on a multi-TiB directory here to see if I can reproduce the problem myself.

mhx added the bug label Apr 28, 2024
mhx self-assigned this Apr 28, 2024
mhx (Owner) commented Apr 28, 2024

Got it:

root@balrog data # mkdwarfs -i backup -o /dev/null --force --order=revpath --log-level=verbose
V 05:45:46.814369 [mkdwarfs_main.cpp:1221] [--order]
V ...............                            default: revpath
V 05:45:46.814435 [mkdwarfs_main.cpp:1231] [--max-lookback-blocks]
V ...............                            default: 1
V 05:45:46.814447 [mkdwarfs_main.cpp:1241] [--window-size]
V ...............                            default: 12
V 05:45:46.814458 [mkdwarfs_main.cpp:1251] [--window-step]
V ...............                            default: 3
V 05:45:46.814467 [mkdwarfs_main.cpp:1261] [--bloom-filter-size]
V ...............                            default: 4
V 05:45:46.815249 [mkdwarfs_main.cpp:1298] [--compression]
V ...............                            default: zstd [level=22]
I 05:45:46.815908 [scanner.cpp:575] scanning "/raid/data/backup"
I 08:38:58.377506 [scanner.cpp:597] assigning directory and link inodes...
I 08:39:01.342210 [scanner.cpp:607] waiting for background scanners...
I 08:39:01.342283 [scanner.cpp:611] scanning CPU time: 1.593h
I 08:39:01.342316 [scanner.cpp:614] finalizing file inodes...
V 08:39:01.342345 [file_scanner.cpp:403] finalized 0 hardlinks [2.783us]
V 08:39:01.434909 [file_scanner.cpp:429] finalized 176532 unique files [92.54ms]
V 08:39:01.435841 [file_scanner.cpp:429] finalized 0 non-unique files [16.56us]
V 08:39:19.765972 [file_scanner.cpp:429] finalized 5644384 non-unique files [18.33s]
I 08:39:19.791115 [scanner.cpp:626] saved 593.7 GiB / 3.941 TiB in 7327217/13148135 duplicate files
I 08:39:19.996119 [scanner.cpp:647] assigning device inodes...
I 08:39:20.235349 [scanner.cpp:653] assigning pipe/socket inodes...
I 08:39:20.474160 [scanner.cpp:658] building metadata...
I 08:39:20.474215 [scanner.cpp:697] building blocks...
I 08:39:20.474273 [scanner.cpp:661] saving names and symlinks...
I 08:39:20.475152 [scanner.cpp:795] waiting for segmenting/blockifying to finish...
V 08:39:20.624835 [inode_manager.cpp:780] ordering 5820918 inodes by reverse path name...
F 08:39:20.825246 [worker_group.cpp:284] exception thrown in worker thread: dwarfs::runtime_error: inode has no file (any) [inode_manager.cpp:205]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
scanning: /raid/data/backup/tux/tmp/logitechmediaserver-8.2.1-1652959290/lib/Tie/IxHash.pm
1345324 dirs, 81424/0 soft/hard links, 13148135/13148135 files, 0 other
original size: 3.941 TiB, hashed: 1.042 TiB (12971603 files, 14.48 MiB/s)
scanned: 0 B (0 files, 0 B/s), categorizing: 0 B/s
saved by deduplication: 593.7 GiB (7327217 files), saved by segmenting: 0 B
filesystem: 0 B in 0 blocks (0 chunks, 5820917 fragments, 5820918/5820918 inodes)
compressed filesystem: 0 blocks/0 B written
▏                                                                                                                                                         ▏  0% 🌑
*** Aborted at 1714293560 (Unix time, try 'date -d @1714293560') ***
*** Signal 6 (SIGABRT) (0x200c) received by PID 8204 (pthread TID 0x7f05727ee6c0) (linux TID 8483) (maybe from PID 8204, UID 0) (code: -6), stack trace: ***
    @ 00000000009a1470 (not found)
    @ 0000000000d432af (not found)
    @ 0000000000d7154c (not found)
    @ 0000000000d4322d (not found)
    @ 0000000000420ef1 (not found)
    @ 00000000004e6ea5 (not found)
    @ 000000000043b5c9 (not found)
    @ 0000000000571ab4 (not found)
    @ 0000000000cdc4c3 (not found)
    @ 0000000000d6fc7b (not found)
    @ 0000000000dc275b (not found)
Aborted

So don't worry about investigating further; I can run the tests mentioned above myself.

tpwrules (Author) commented Apr 28, 2024

Good to know you can replicate it. Once the problem is figured out, I'd like to be sure that it can't create bogus archives. If it crashes and fails, that's fine. But if it can silently skip files or something, that's a problem.

mhx (Owner) commented Apr 28, 2024

> Good to know you can replicate it. Once the problem is figured out, I'd like to be sure that it can't create bogus archives. If it crashes and fails, that's fine. But if it can silently skip files or something, that's a problem.

Yes, absolutely agree.

mhx (Owner) commented Apr 28, 2024

> I have a hunch that this could be related either to an optimization introduced between the 0.8.0 and 0.9.0 releases (which skips hashing of large files if they have a unique size) [...]

It is definitely that optimization. I can now reproduce this with a small set of data. There's even an assert() that triggers much earlier in a debug build.

The good news is: it's not a race condition and it's not a problem that can occur undetected. If you run into this problem, mkdwarfs will crash.

The problem occurs if there are multiple files of the same size, some of which are identical and some of which are different but share the same first 4 KiB. Before the optimization introduced in 0.9.0, if there was more than one file of the same size, all files of that size would be fully hashed to determine duplicates. This is extremely time consuming when dealing with, e.g., raw images, which tend to be identical in size. With the optimization, the scanner fully hashes only those files that match both in size and in a hash of their first 4 KiB.
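To illustrate, here is a minimal single-threaded sketch of that strategy. It is conceptual, not the actual dwarfs code: `FileInfo`, `start_hash` and `full_hash` are invented stand-ins, and `std::hash` substitutes for whatever hash function the scanner really uses.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the scanner's file object; content is held inline so the
// sketch stays self-contained.
struct FileInfo {
  std::string path;
  std::string content;
};

// Cheap hash over (at most) the first 4 KiB.
static uint64_t start_hash(FileInfo const& f) {
  return std::hash<std::string>{}(f.content.substr(0, 4096));
}

// Expensive hash over the whole file.
static uint64_t full_hash(FileInfo const& f) {
  return std::hash<std::string>{}(f.content);
}

// Group files by (size, start-hash); only groups with more than one
// member are fully hashed.
std::map<uint64_t, std::vector<std::string>>
find_duplicates(std::vector<FileInfo> const& files) {
  std::map<std::pair<uint64_t, uint64_t>, std::vector<FileInfo const*>> groups;
  for (auto const& f : files) {
    groups[{f.content.size(), start_hash(f)}].push_back(&f);
  }
  std::map<uint64_t, std::vector<std::string>> dupes;
  for (auto const& [key, members] : groups) {
    if (members.size() < 2) {
      continue;  // unique (size, start-hash): skip the expensive full hash
    }
    for (auto const* f : members) {
      dupes[full_hash(*f)].push_back(f->path);
    }
  }
  return dupes;
}
```

The point is that a file whose (size, start-hash) group has only one member is never read in full, which is what makes the optimization pay off for large, same-sized-but-distinct files.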

I think I know roughly why this goes wrong, but I need to write some tests first that reliably trigger the issue before attempting a fix.

mhx added this to the v0.9.9 milestone Apr 28, 2024
mhx (Owner) commented Apr 29, 2024

The fix was actually much simpler than I thought.

When I introduced the optimization, I had to change the map that keeps track of "unique file sizes" to instead keep track of unique (size + 4 KiB hash) pairs. The code uses this map to determine whether (see the sketch after this list):

  1. a (size, hash) pair has never been seen before,
  2. the pair has been seen exactly once before (in which case hashing jobs need to be kicked off for both the first file and the current file), or
  3. the pair has been seen more than once before (in which case only the current file needs to be hashed).
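Sketched in code, assuming a hypothetical scanner-side callback and invented names (`file_key`, `on_file_scanned`), the three cases look like this:

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using file_key = std::pair<uint64_t, uint64_t>;  // (size, start-hash)

struct file;  // placeholder for the scanner's file object

// All files seen so far, per (size, start-hash) key.
std::map<file_key, std::vector<file*>> seen_;

// Returns the files whose full-hash jobs need to be kicked off now.
std::vector<file*> on_file_scanned(file_key key, file* f) {
  auto& group = seen_[key];
  group.push_back(f);
  switch (group.size()) {
  case 1:
    return {};             // case 1: pair never seen -- nothing to hash yet
  case 2:
    return {group[0], f};  // case 2: seen once -- hash first *and* current file
  default:
    return {f};            // case 3: seen more than once -- hash current only
  }
}
```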

This part worked just fine. However, there is a second map that keeps track of a "latch" used to synchronize insertion into a third map that tracks all duplicate files. That whole shebang is necessary because inode objects need to be allocated as soon as possible in order to perform additional work (such as categorization or similarity hashing) asynchronously.

So as we discover an unseen (size, hash) pair, we know for sure that this must be a new inode. Any subsequent files with the same (size, hash) pair could refer to the same inode, though. In order to avoid creating an additional inode for a duplicate file, we must ensure that the first file (for which we have already created an inode) is inserted into the duplicate files table first. But since hashing is done asynchronously, hashing of the first file could finish after hashing of subsequent files. Hence the latch, which all subsequent files wait on before checking the duplicate files table, and which is released only after the first file has been added to that table.
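Below is a simplified, hedged sketch of that latch scheme, already using the fixed (size, hash) key. The member name `first_file_hashed_` is borrowed from the commit message further down; the types (including C++20 `std::latch`) are stand-ins, not the actual dwarfs code.

```cpp
#include <cstdint>
#include <latch>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

using file_key = std::pair<uint64_t, uint64_t>;  // (size, start-hash)

struct inode {};  // placeholder for the real inode object

struct scanner_state {
  std::mutex mx;
  // one latch per key, released once the first file's inode is published
  std::map<file_key, std::shared_ptr<std::latch>> first_file_hashed_;
  std::map<uint64_t, std::shared_ptr<inode>> duplicates_;  // full hash -> inode
};

// Called synchronously when a key is seen for the second time, i.e. when
// hashing jobs are kicked off for it.
void register_key(scanner_state& s, file_key key) {
  std::lock_guard lk(s.mx);
  s.first_file_hashed_.emplace(key, std::make_shared<std::latch>(1));
}

static std::shared_ptr<std::latch> latch_for(scanner_state& s, file_key key) {
  std::lock_guard lk(s.mx);
  return s.first_file_hashed_.at(key);
}

static std::shared_ptr<inode> lookup_or_create(scanner_state& s,
                                               uint64_t full_hash) {
  std::lock_guard lk(s.mx);
  auto [it, inserted] = s.duplicates_.try_emplace(full_hash);
  if (inserted) {
    it->second = std::make_shared<inode>();
  }
  return it->second;
}

// Runs on a worker thread once a file's full hash is known.
std::shared_ptr<inode> on_full_hash(scanner_state& s, file_key key,
                                    uint64_t full_hash, bool is_first) {
  if (is_first) {
    // First file for this key: publish its inode, *then* open the gate.
    auto ino = lookup_or_create(s, full_hash);
    latch_for(s, key)->count_down();
    return ino;
  }
  // Subsequent file: wait until the first file's inode is visible.
  latch_for(s, key)->wait();
  return lookup_or_create(s, full_hash);
}
```

The pre-fix code looked up the latch by size alone, so two concurrent groups with equal sizes but different start hashes could share a latch; that is exactly the collision described next.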

The bug was that the latch map was still keyed only on size rather than on (size, hash). After the optimization was added, this meant there could be collisions in that map, and nothing checked for such collisions (there is a check now). As a result, a file could end up waiting on the wrong latch. That "wrong" latch could be released earlier than the "correct" one, so a subsequent file might not yet find its hash in the duplicates map, falsely conclude that it was not a duplicate, and trigger the creation of a new inode object.

Ultimately, despite this, all duplicates will end up correctly in the duplicates map; however, there are now two or more inodes associated with them. Up until the "finalizing" stage, the files know which inode they belong to, but the inodes don't yet know the full list of files. During "finalization", the files from the duplicates map are assigned to the inode referenced by the first file (assuming all files reference the same inode). In the case of the bug, one or more inodes that were created in error end up with no files, and these "empty" inodes eventually trigger the exception.
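Presumably, finalization then contains a sanity check along these lines. This is inferred purely from the error message and source location in the logs above, not taken from the actual dwarfs source; `std::runtime_error` stands in for `dwarfs::runtime_error`.

```cpp
#include <memory>
#include <stdexcept>
#include <vector>

struct file {};

struct inode {
  std::vector<file*> files;  // filled in during finalization
};

// Every inode must have received at least one file by now; an inode
// orphaned by the latch bug trips this check.
void check_inodes(std::vector<std::unique_ptr<inode>> const& inodes) {
  for (auto const& ino : inodes) {
    if (ino->files.empty()) {
      throw std::runtime_error("inode has no file (any)");
    }
  }
}
```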

To reiterate: even in the presence of this bug, no individual files or data were lost. The only consequence was that redundant inode objects were created and some unnecessary extra work was done.

The fix was simply to change the latch map from using size keys to using (size, hash) keys.
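Illustratively, the change amounts to something like this (the member name `first_file_hashed_` comes from the commit message below; the surrounding types are assumptions):

```cpp
#include <cstdint>
#include <latch>
#include <map>
#include <memory>
#include <utility>

using latch_ptr = std::shared_ptr<std::latch>;

// before (buggy): two groups of files with the same size but different
// start hashes collided on the same latch
std::map<uint64_t, latch_ptr> first_file_hashed_buggy_;

// after (fixed): keyed on (size, start-hash), consistent with unique_size_
std::map<std::pair<uint64_t, uint64_t>, latch_ptr> first_file_hashed_;
```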

mhx added a commit that referenced this issue Apr 29, 2024
When introducing an optimization to skip hashing of large files if they
already differ in the first 4 KiB, only `unique_size_` was updated to
be keyed on (size, start_hash). However, the same change is necessary
for `first_file_hashed_`, as there can otherwise be collisions if files
with the same size, but different start hashes are processed at the same
time.
mhx (Owner) commented Apr 29, 2024

> Once the problem is figured out, I'd like to be sure that it can't create bogus archives.

BTW, here's an easy way to check the integrity of a DwarFS image using dwarfsck:

$ cd data
$ diff <(dwarfsck /tmp/data.dwarfs --list | sort) <(find * | sort)
$ dwarfsck /tmp/data.dwarfs --checksum=sha256 | sha256sum --check

dwarfsck --list lists all files/directories in the image; the diff line above can be used to check that the image is complete.

dwarfsck --checksum computes a hash for each regular file from the actual data in the image; the last line can be used to check that all files can be read back correctly.

mhx (Owner) commented Apr 30, 2024

Fixed in v0.9.9.

mhx closed this as completed Apr 30, 2024