…rly full

* The reblocking code was incorrectly assuming the cursor would be
  pointing at a valid node element after an unlock/relock sequence,
  when it could actually be pointing at the EOF of a node. This case
  can occur when the filesystem is nearly full (possibly due to the
  reblocking operation itself) while it is also under load from
  unrelated operations.

* This can result in the creation of a corrupted B-Tree leaf node or
  data record.

* Corruption can be checked with hammer checkmap and hammer show (as
  of this rev):

      hammer -f device checkmap

  Should output no B-Tree node records or free space mismatches. You
  will still get the initial volume summary.

      hammer -f device show | egrep '^B' | egrep -v '^BM'

  Should output no records.

* Currently the only recourse if corruption is found is to copy off
  the filesystem, newfs_hammer, and copy it back. Full history and
  snapshots can be retained by using 'hammer -B mirror-read' to copy
  off the filesystem and mirror-write to copy it back. However, please
  remember you must do this for each PFS individually. Make sure you
  have a viable backup before newfsing anything.

Reported-by: Francois Tigeot <email@example.com>,
             Jan Lentfer <Jan.Lentfer@web.de>
Originally queue_lock was an LWKT reader-writer lock, which permitted
multiple locks by the same thread, and in fact there are a few code
paths where such recursive locking is used. Doing the same with a
lockmgr lock without either LK_NOWAIT or LK_CANRECURSE triggers a
panic.
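To illustrate the recursion hazard, here is a minimal userland sketch
(all names hypothetical; the real code uses lockmgr(9) with
LK_CANRECURSE) of an exclusive lock that tracks its owner and either
tolerates or rejects a second acquisition by the same thread:

```c
#include <pthread.h>

/*
 * Hypothetical sketch: an exclusive lock that remembers its owner so
 * a reacquisition by the same thread can be detected. With canrecurse
 * set it just bumps the depth (what LK_CANRECURSE allows); without it
 * the reacquisition is refused, standing in for the lockmgr panic
 * described above.
 */
struct rlock {
	pthread_t owner;	/* valid only while depth > 0 */
	int depth;		/* recursion depth, 0 == unowned */
};

/* Returns 0 on success, -1 where lockmgr would panic. */
int
rlock_acquire(struct rlock *l, int canrecurse)
{
	pthread_t self = pthread_self();

	if (l->depth > 0 && pthread_equal(l->owner, self)) {
		if (!canrecurse)
			return (-1);
		l->depth++;
		return (0);
	}
	/* A real lock would block here while another thread holds it. */
	l->owner = self;
	l->depth = 1;
	return (0);
}

void
rlock_release(struct rlock *l)
{
	l->depth--;
}
```

The LWKT rw lock behaved like the canrecurse case; switching to
lockmgr without the flag turns the recursive paths into the failing
case.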
* This assertion can occur under certain circumstances if a rename
  operation moves a file or directory to a parent directory, due to a
  circular loop in the dependency chain.

* Fix the problem by allowing the case.

Reported-by: Sascha Wildner, Alex Hornung, Venkatesh Srinivas, others
* Use a lockmgr lock for the FQP lock, as some strategy ops can sleep
  while acquiring another lock (the CAM SIM lock, for example).

* Reduce overall locking where it isn't really required, mainly during
  deallocation (losing the last ref) of objects. The locking is only
  explicitly required to protect the internal TAILQs.

* NOTE: this is an _attempt_ to fix some unidentified deadlocks that
  have been reported occasionally. While it shouldn't happen, be aware
  that this might explode.

Reported-by: Antonio Huete, Jan Lentfer
This fixes a number of grave issues on my Sony VAIO VGN-Z51XG, such as
messages about not being able to acquire the global lock, freezes when
ACPI was fully enabled, and a panic at shutdown. BTW, gcc had been
warning us about it for a long time. :)

In-collaboration-with: aggelos
…TUAL). This fixes the VKERNEL build.
Submitted-by: Venkatesh Srinivas <firstname.lastname@example.org>
* Accept another argument for fqp allocation, namely the corresponding
  fqmp. It is stored internally so the fqp can properly remove itself
  from the fqmp list on destruction.

* This parameter is also used to link the fqp into the fqmp list
  automatically on creation, avoiding code duplication and deadlocks.

* Change the destruction refcount to -0x400 instead of -3 to make
  tracking of these cases simpler and avoid confusing them with bad
  refcounting.

* NOTE: this also fixes the longstanding issue of an eventual panic
  after a number of policy switches to/from fq.

Reported-by: Antonio Huete (tuxillo@)
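The -0x400 idea can be sketched as follows (all names here are
hypothetical stand-ins, not the driver's actual code): once the last
reference is dropped, the count is parked far below zero, so a buggy
extra unref yields a small negative like -1 or -3 that is immediately
distinguishable from a legitimately destroyed object.

```c
#define FQP_DESTROYED	(-0x400)	/* hypothetical marker value */

struct fqp_obj {			/* hypothetical stand-in for an fqp */
	int refcount;
	int destroyed;
};

void
fqp_unref(struct fqp_obj *fqp)
{
	if (--fqp->refcount == 0) {
		/*
		 * Park the count at a large negative bias instead of a
		 * small value like -3: small negatives now unambiguously
		 * indicate broken refcounting rather than destruction.
		 */
		fqp->refcount = FQP_DESTROYED;
		fqp->destroyed = 1;
	}
}

/* A count is sane if it is a live count or the destruction bias. */
int
fqp_refcount_sane(int count)
{
	return (count >= 0 || count <= FQP_DESTROYED);
}
```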
* Avoid an int64 overflow when calculating the total disk budget by
  losing bits of precision if needed.

* Note that this might not quite fix the issue yet, as there is one
  other place where the int64 overflow can happen, although it is less
  likely.

* While here, make the rebalancing happen every 0.5s instead of every
  1s, effectively reducing the chance of int64 overflows.

Reported-by: Antonio Huete (tuxillo@)
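The precision-dropping idea can be sketched like this (a hypothetical
muldiv helper, not the actual dsched_fq code): before a 64-bit
multiply that might overflow, shift the multiplicand and the divisor
right in lockstep, trading low-order bits for a product that is
guaranteed to fit.

```c
#include <stdint.h>

/* Number of significant bits in v (v assumed non-negative). */
static int
bitlen(int64_t v)
{
	int n = 0;

	while (v) {
		v >>= 1;
		n++;
	}
	return (n);
}

/*
 * Hypothetical sketch: compute a * b / c without int64 overflow by
 * shifting 'a' and 'c' down together until the product a * b fits in
 * 63 bits. The ratio a/c stays roughly constant, so the result is
 * still a good approximation; only low-order precision is lost.
 */
int64_t
muldiv_lossy(int64_t a, int64_t b, int64_t c)
{
	while (a > 1 && bitlen(a) + bitlen(b) > 62) {
		a >>= 1;
		c >>= 1;
	}
	if (c == 0)
		c = 1;
	return (a * b / c);
}
```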
* Add some strategic KKASSERTs to catch negative values where they
  aren't allowed.

* Avoid certain race conditions by using a local variable instead of
  the generally accessible one (budget vs dpriv->budgetpb). Only set
  the final value once we are ready.
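The local-variable pattern in the second point can be sketched as
follows (the budget/budgetpb names follow the commit, but the
surrounding struct and function are hypothetical; KKASSERT is mimicked
with a userland assert):

```c
#include <assert.h>
#include <stdint.h>

/* Userland stand-in for the kernel's KKASSERT. */
#define KKASSERT(exp)	assert(exp)

struct dpriv_obj {		/* hypothetical stand-in */
	int64_t budgetpb;	/* shared; other threads may read this */
};

/*
 * All intermediate math happens on a local variable, so concurrent
 * readers never observe a half-updated or transiently negative value;
 * the shared field is written exactly once, at the end, after the
 * KKASSERT has vetted the result.
 */
void
budget_recalc(struct dpriv_obj *dpriv, int64_t issued, int64_t refill)
{
	int64_t budget = dpriv->budgetpb;

	budget -= issued;
	budget += refill;
	if (budget < 0)
		budget = 0;

	KKASSERT(budget >= 0);
	dpriv->budgetpb = budget;	/* single final store */
}
```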
* Factor out an fq_drain which will either cancel or dispatch all the
  bios we currently have in all our fqp queues.

* Clean out old comments and code.

* Deal with flushes by not queuing them but rather letting dsched
  handle them. By returning EINVAL, dsched_queue will dispatch the
  flush itself.