David-Sterba/E…
Commits on Mar 15, 2019
-
btrfs: switch extent_buffer::lock_nested to bool
The member tracks a simple status of the lock, so we can use bool for it and make some room for further space reduction in the structure. Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: use assertion helpers for extent buffer write lock counters
Use the helpers where open coded. On non-debug builds, the warnings will not trigger and extent_buffer::write_locks becomes unused and can be moved to the appropriate section, saving a few bytes. Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: add assertion helpers for extent buffer write lock counters
The write_locks member is a simple counter that tracks locking balance and is used to assert tree locks. Add helpers so that it conditionally works only in DEBUG builds. Will be used in follow-up patches. Signed-off-by: David Sterba <dsterba@suse.com>
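A minimal userspace sketch of the idea, not the kernel code: the counter and its helpers exist only when a debug macro is defined, and the helpers compile to empty functions otherwise. The names `EB_DEBUG`, `struct eb_sketch` and the `eb_*` helpers are hypothetical and chosen for illustration only.

```c
#include <stdio.h>

/* Assumption: EB_DEBUG stands in for a debug config option.
 * Remove the define to model a non-debug build. */
#define EB_DEBUG 1

/* Hypothetical stand-in for an extent buffer. */
struct eb_sketch {
#ifdef EB_DEBUG
	int write_locks;	/* balance counter, present only in debug builds */
#endif
	int data;
};

#ifdef EB_DEBUG
/* Called when the write lock is taken. */
static void eb_assert_write_locks_get(struct eb_sketch *eb)
{
	eb->write_locks++;
}

/* Called when the write lock is released; warns on imbalance. */
static void eb_assert_write_locks_put(struct eb_sketch *eb)
{
	if (eb->write_locks <= 0)
		fprintf(stderr, "warning: unbalanced write unlock\n");
	eb->write_locks--;
}

/* Current number of write locks tracked by the counter. */
static int eb_write_locks_held(const struct eb_sketch *eb)
{
	return eb->write_locks;
}
#else
/* Non-debug build: the helpers do nothing and the member is gone. */
static void eb_assert_write_locks_get(struct eb_sketch *eb) { (void)eb; }
static void eb_assert_write_locks_put(struct eb_sketch *eb) { (void)eb; }
static int eb_write_locks_held(const struct eb_sketch *eb) { (void)eb; return 0; }
#endif
```

Callers use the helpers unconditionally, so the locking code itself needs no `#ifdef`.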
-
btrfs: use assertion helpers for extent buffer read lock counters
Use the helpers where open coded. On non-debug builds, the warnings will not trigger and extent_buffer::read_locks becomes unused and can be moved to the appropriate section, saving a few bytes. Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: add assertion helpers for extent buffer read lock counters
The read_locks member is a simple counter that tracks locking balance and is used to assert tree locks. Add helpers so that it conditionally works only in DEBUG builds. Will be used in follow-up patches. Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: use assertion helpers for spinning readers
Use the helpers where open coded. On non-debug builds, the warnings will not trigger and extent_buffer::spinning_readers becomes unused and can be moved to the appropriate section, saving a few bytes. Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: add assertion helpers for spinning readers
Add helpers for conditional DEBUG build to assert that the extent buffer spinning_readers constraints are met. Will be used in followup patches. Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: use assertion helpers for spinning writers
Use the helpers where open coded. On non-debug builds, the warnings will not trigger and extent_buffer::spinning_writers becomes unused and can be moved to the appropriate section, saving a few bytes. Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: add assertion helpers for spinning writers
Add helpers for conditional DEBUG build to assert that the extent buffer spinning_writers constraints are met. Will be used in followup patches. Signed-off-by: David Sterba <dsterba@suse.com>
Commits on Feb 28, 2019
-
Merge branch 'for-next-next-v5.0-20190228' into for-next-20190228
kdave committed Feb 28, 2019
-
Merge branch 'for-next-current-v4.20-20190228' into for-next-20190228
kdave committed Feb 28, 2019
-
Merge branch 'misc-5.2' into for-next-next-v5.0-20190228
# Conflicts:
#	fs/btrfs/disk-io.c
kdave committed Feb 28, 2019
-
Merge branch 'ext/josef/rsv-prop' into for-next-next-v5.0-20190228
kdave committed Feb 28, 2019
-
Merge branch 'ext/qu/pre-commit-check-5.1' into for-next-next-v5.0-20190228
kdave committed Feb 28, 2019
-
Merge branch 'ext/JAILLET/retval-mark-extent-written' into for-next-next-v5.0-20190228
kdave committed Feb 28, 2019
-
Merge branch 'ext/cmason/fix-dirty-writes' into for-next-next-v5.0-20190228
kdave committed Feb 28, 2019
-
Merge branch 'ext/anand/stale-devids-free' into for-next-next-v5.0-20190228
kdave committed Feb 28, 2019
-
Merge branch 'misc-next' into for-next-next-v5.0-20190228
kdave committed Feb 28, 2019
-
Merge branch 'ext/filipe/snapshot-dio-buff-fix-v2' into for-next-current-v4.20-20190228
kdave committed Feb 28, 2019
-
Merge branch 'misc-5.1' into for-next-current-v4.20-20190228
kdave committed Feb 28, 2019
-
btrfs: Fix the return value in case of error in 'btrfs_mark_extent_written()'
We return 0 unconditionally in most of the error handling paths of 'btrfs_mark_extent_written()'. However, 'ret' is set to some error codes in several error handling paths. Return 'ret' instead to propagate the error code. Fixes: 9c8e63d ("Btrfs: kill BUG_ON()'s in btrfs_mark_extent_written") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David Sterba <dsterba@suse.com>
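The shape of this bug can be illustrated with a small self-contained sketch. It is not the body of 'btrfs_mark_extent_written()'; `do_step()`, `mark_written_buggy()` and `mark_written_fixed()` are hypothetical names used only to show the difference between returning 0 and returning 'ret' from a shared exit label.

```c
#include <errno.h>

/* Hypothetical step that may fail: returns 0 or a negative errno. */
static int do_step(int fail)
{
	return fail ? -ENOMEM : 0;
}

/* Buggy shape: the error is stored in ret, but the exit path returns 0. */
static int mark_written_buggy(int fail)
{
	int ret = do_step(fail);

	if (ret)
		goto out;
	/* ... further work would go here ... */
out:
	/* cleanup would go here */
	return 0;	/* error code is lost */
}

/* Fixed shape: the exit path propagates whatever ret holds. */
static int mark_written_fixed(int fail)
{
	int ret = do_step(fail);

	if (ret)
		goto out;
	/* ... further work would go here ... */
out:
	/* cleanup would go here */
	return ret;
}
```

With the buggy variant a caller cannot distinguish failure from success, which is what returning 'ret' corrects.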
-
Btrfs: keep pages dirty when using btrfs_writepage_fixup_worker
For COW, btrfs expects dirty pages to have been through a few setup steps. This includes reserving space for the new block allocations and marking the range in the state tree for delayed allocation. A few places outside btrfs will dirty pages directly, especially when unmapping mmap'd pages. In order for these to properly go through COW, we run them through a fixup worker to wait for stable pages, and do the delalloc prep. 87826df added a window where the dirty pages were cleaned, but still pending more action from the fixup worker. During this window, page migration can jump in and relocate the page. Once our fixup work actually starts, it finds page->mapping is NULL and we end up freeing the page without ever writing it. This leads to crc errors and other exciting problems, since it screws up the whole state machine for waiting for ordered extents. The fix here is to keep the page dirty while we're waiting for the fixup worker to get to work. This also makes sure the error handling in btrfs_writepage_fixup_worker does the right thing with dirty bits when we run out of space. Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: drop uuid_mutex in btrfs_free_extra_devids()
btrfs_free_extra_devids() is called only in the mount context, which traverses fs_devices::devices and frees the orphan devices in the given %fs_devices, if any. As the search for the orphan device is limited to fs_devices::devices, we don't need the global uuid_mutex. There can't be any mount-point based ioctl threads in this context as the mount thread has not yet returned. But there can be a btrfs-control based scan ioctl thread, which calls device_list_add(). Here in the mount thread, fs_devices::opened is incremented well before btrfs_free_extra_devids() is called, and in the scan context an fs_devices that is already opened can neither be freed nor allocated at device_list_add(). But let's say you change the device path and call the scan again: the scan would then update the device path, and this operation could race against the btrfs_free_extra_devids() thread, which might be in the process of freeing the same device. So synchronize it by using the device_list_mutex. This scenario is very much a corner case, and in practice the scan and mount are serialized by usage anyway, so unless the race is instrumented it's very difficult to trigger. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: Do mandatory tree block check before submitting bio
There are at least 2 reports about memory bit flips sneaking into on-disk data.

Currently we only have a relaxed check triggered at btrfs_mark_buffer_dirty() time. As it's not mandatory and only runs on builds with CONFIG_BTRFS_FS_CHECK_INTEGRITY enabled, it doesn't help users detect such problems.

This patch addresses the hole by triggering a comprehensive check on tree blocks before writing them back to disk.

The design points are:

- Timing of the check: Tree block write hook
  This timing is chosen to reduce the overhead. The comprehensive check should be as expensive as csum. Doing a full check at btrfs_mark_buffer_dirty() is too expensive for end users.

- Loose empty leaf check
  Originally, for an empty leaf, tree-checker will report an error if it's not a tree root.

  The problem for such a check at write time is:
  * False alert for tree root created in current transaction
    In that case, the commit root still needs to be written to disk. And since the current root can differ from the commit root, it will cause a false alert. This happens for the log tree.

  * False alert for relocated tree block
    A relocated tree block can be written to disk due to memory pressure. In that case an empty csum tree root can be written to disk and cause a false alert, since the csum root node hasn't been updated.

  Some more reliable empty leaf checks are still kept as is. Namely, essential trees (e.g. extent, chunk) should never be empty.

The example error output will be something like:

  BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 169 0)
  BTRFS error (device dm-3): block=1350630375424 write time tree block corruption detected
  BTRFS: error (device dm-3) in btrfs_commit_transaction:2220: errno=-5 IO failure (Error while writing out transaction)
  BTRFS info (device dm-3): forced readonly
  BTRFS warning (device dm-3): Skipping commit of aborted transaction.
  BTRFS: error (device dm-3) in cleanup_transaction:1839: errno=-5 IO failure
  BTRFS info (device dm-3): delayed_refs has NO entry

Reported-by: Leonard Lausen <leonard@lausen.nl>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: extent_io: Handle error better in extent_writepages()
Do proper cleanup if we hit any error in extent_writepages(), and check the return value of flush_write_bio(). Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages()
Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io()
This function needs some extra checks on locked pages and eb. For error handling we need to unlock locked pages and the eb. Also add a comment for possible return values of lock_extent_buffer_for_io(). There is a rare >0 return value branch, where all pages get locked while the write bio is not flushed. Thankfully it's handled by the only caller, btree_write_cache_pages(), as the later write_one_eb() call will trigger submit_one_bio(). So there shouldn't be any problem. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: extent_io: Handle error better in extent_write_locked_range()
Do proper cleanup if we hit any error in extent_write_locked_range(), and check the return value of flush_write_bio(). Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: extent_io: Kill the dead branch in extent_write_cache_pages()
Since __extent_writepage() will no longer return >0 value, (ret == AOP_WRITEPAGE_ACTIVATE) will never be true. Kill that dead branch. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: extent_io: Handle error better in btree_write_cache_pages()
In btree_write_cache_pages(), we can only get @ret <= 0. Add an ASSERT() for it just in case. Then instead of submitting the write bio even if we got some error, check the return value first. If we have already hit some error, just clean up the corrupted or half-baked bio, and return the error. If there is no error so far, then call flush_write_bio() and return the result. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: extent_io: Handle error better in extent_write_full_page()
Since flush_write_bio() can now return an error, kill the BUG_ON() first. Then don't call flush_write_bio() unconditionally; instead, check the return value from __extent_writepage() first. If __extent_writepage() fails, we do cleanup and return the error without submitting the possibly corrupted or half-baked bio. If __extent_writepage() succeeds, then we call flush_write_bio() and return the result. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
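The control flow described here can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel implementation: `struct bio_ctl_sketch` and every `*_sketch` function are hypothetical stand-ins for the bio state, __extent_writepage(), flush_write_bio() and the cleanup step.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for a pending write bio. */
struct bio_ctl_sketch {
	bool has_bio;	/* a bio has been built but not yet submitted */
	int submitted;	/* number of bios submitted */
	int dropped;	/* number of bios discarded during cleanup */
};

/* Stand-in for the page write step: builds a bio, may report an error. */
static int write_page_sketch(struct bio_ctl_sketch *c, int err)
{
	c->has_bio = true;
	return err;
}

/* Stand-in for the flush step: submits the pending bio, if any. */
static int flush_sketch(struct bio_ctl_sketch *c)
{
	if (c->has_bio) {
		c->submitted++;
		c->has_bio = false;
	}
	return 0;
}

/* Stand-in for cleanup: discards the half-built bio without submitting. */
static void cleanup_sketch(struct bio_ctl_sketch *c)
{
	if (c->has_bio) {
		c->dropped++;
		c->has_bio = false;
	}
}

/* Check the write result first; flush only when the write succeeded. */
static int write_full_page_sketch(struct bio_ctl_sketch *c, int page_err)
{
	int ret = write_page_sketch(c, page_err);

	if (ret < 0) {
		cleanup_sketch(c);
		return ret;
	}
	return flush_sketch(c);
}
```

The point of the ordering is that a bio built by a failed write is never submitted.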
-
btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up
We have a BUG_ON() in flush_write_bio() to handle the return value of submit_one_bio(). Move the BUG_ON() one level up to all its callers. This patch introduces a temporary variable, @flush_ret, to keep the code change minimal. That variable will be cleaned up when enhancing the error handling later. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: disk-io: Show the timing of corrupted tree block explicitly
Just add one extra line to show when the corruption is detected. Currently only read time detection is possible.

The planned distinguishing lines would be:

  read time:
    <detail report>
    block=XXXXX read time tree block corruption detected

  write time:
    <detail report>
    block=XXXXX write time tree block corruption detected

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
-
btrfs: Always output error message when key/level verification fails
We have an internal report of a strange transaction abort due to EUCLEAN without any error message. Since the error message inside verify_level_key() is only enabled for CONFIG_BTRFS_DEBUG, the error message won't be output on most distros. This patch makes the error message mandatory, so when a problem happens we know what's causing it. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
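A rough userspace sketch of a key/level check that always reports the reason before returning EUCLEAN. It assumes the usual btrfs key ordering (objectid, then type, then offset); `struct key_sketch`, `key_cmp_sketch()` and `verify_level_key_sketch()` are hypothetical names, and the message wording is illustrative rather than the kernel's.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value, defined here for non-Linux hosts */
#endif

/* Hypothetical in-memory key: (objectid, type, offset). */
struct key_sketch {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Three-way compare: objectid first, then type, then offset. */
static int key_cmp_sketch(const struct key_sketch *a, const struct key_sketch *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Returns 0 if level and first key match, else prints why and returns -EUCLEAN. */
static int verify_level_key_sketch(uint64_t bytenr,
				   int found_level, const struct key_sketch *found,
				   int want_level, const struct key_sketch *want)
{
	if (found_level != want_level) {
		/* Printed unconditionally, not only in debug builds. */
		fprintf(stderr, "block=%" PRIu64 " level mismatch, wanted %d found %d\n",
			bytenr, want_level, found_level);
		return -EUCLEAN;
	}
	if (key_cmp_sketch(found, want) != 0) {
		fprintf(stderr, "block=%" PRIu64 " first key mismatch, wanted (%" PRIu64
			" %u %" PRIu64 ") found (%" PRIu64 " %u %" PRIu64 ")\n",
			bytenr, want->objectid, (unsigned)want->type, want->offset,
			found->objectid, (unsigned)found->type, found->offset);
		return -EUCLEAN;
	}
	return 0;
}
```

Because the message is emitted on every failing path, an EUCLEAN return always comes with a log line naming the mismatch.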
-
btrfs: use the existing credit for our first prop
We're now reserving an extra item's worth of space for property inheritance. We only have one property at the moment so this covers us, but if we add more in the future this will allow us to not get bitten by the extra space reservation. If we do add more properties in the future we should revisit how the callers calculate their space reservation needs. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>