req->flushreq is set while walking the work queue with srv->lock held. The original reply is sent while req is still in the work queue, so there is a race in which flushreq is set just after the postprocess function tests it and the reply is discarded. Defer the flushreq reply until after the req has been removed from the work queue.
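A minimal sketch of the ordering described above, not the actual server code. The struct layouts and the helpers `workq_remove`, `flush_mark`, and `req_complete` are hypothetical stand-ins; only the names `flushreq` and `srv->lock` come from the message. The point is that the flag is tested only after the req has left the work queue, under the same lock the flush path uses to set it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request and server types. */
struct req {
    struct req *next;
    bool flushreq;       /* set by the flush path with srv->lock held */
    bool replied;        /* original reply was sent */
    bool flush_replied;  /* flush reply was sent instead */
};

struct srv {
    pthread_mutex_t lock;
    struct req *workq;   /* singly linked work queue */
};

/* Unlink req from the work queue. Caller holds srv->lock. */
static void workq_remove(struct srv *srv, struct req *req)
{
    for (struct req **pp = &srv->workq; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == req) {
            *pp = req->next;
            req->next = NULL;
            return;
        }
    }
}

/* Flush path: mark the target only if it is still in the work queue. */
static bool flush_mark(struct srv *srv, struct req *target)
{
    bool found = false;

    pthread_mutex_lock(&srv->lock);
    for (struct req *r = srv->workq; r != NULL; r = r->next) {
        if (r == target) {
            r->flushreq = true;
            found = true;
            break;
        }
    }
    pthread_mutex_unlock(&srv->lock);
    return found;
}

/* Completion path: dequeue first, then decide which reply to send.
 * Once req is off the queue, flush_mark() can no longer find it,
 * so the value of flushreq read here is final. */
static void req_complete(struct srv *srv, struct req *req)
{
    bool flushed;

    assert(!req->replied && !req->flush_replied); /* one reply per req */

    pthread_mutex_lock(&srv->lock);
    workq_remove(srv, req);
    flushed = req->flushreq;
    pthread_mutex_unlock(&srv->lock);

    if (flushed)
        req->flush_replied = true;  /* stands in for sending the flush reply */
    else
        req->replied = true;        /* stands in for sending the original reply */
}
```

In this sketch a flush that arrives after `req_complete()` simply finds nothing to mark, so no decision is ever made on a stale value of the flag.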
Avoid tickling a bug in the kernel client until we fix it.
This avoids the case where an Rflush is returned for a Treaddir (say) and a Tclunk is then sent for the Treaddir's fid, causing readdir to core dump. (The refcount on the fid doesn't protect the DIR * from being closed/freed underneath it.)
This simplifies error handling a bit: when handling an out-of-memory error, we are less likely to hit another one while sending the error response.
This removes the 'D' state in the dtop request view. Completed, unreplied requests will be included in the 'R' (running) state.
Just chased down a double-close problem and this would have helped. Should never happen unless something is wrong with file descriptor accounting.
This appears to have eliminated some spurious EBADF errors that were occurring when running a recursive find in a diod-backed file system.
These operations, if successfully flushed, need to be undone, at least with regard to fid accounting. See Tflush(9p).
v9fs does not currently handle a flushed Tclunk/Tremove properly. It should behave as though these ops were never sent, meaning the fid remains valid and cannot be reused. However, it unconditionally frees them internally. This must be fixed, but it is useful to be able to interoperate with older clients, so we create this workaround and enable it by default.
Also, further reduce the race between clunked walk/clunk/remove and walk newfid. Introduce a blocking version of np_fid_create that makes the server tolerant of either ordering of flush response and flushed request retirement, with fid accounting.
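One way a blocking fid create could work is sketched below. This is an illustration under assumptions, not the np_fid_create code: `struct fidpool`, `MAX_FIDS`, and the `fid_*` functions are hypothetical names. The blocking variant waits on a condition variable until the old fid has been retired, so a walk to newfid succeeds whether it arrives before or after the previous user of that fid number is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_FIDS 64  /* hypothetical table size, for illustration only */

/* Hypothetical fid table: one in-use flag per fid number. */
struct fidpool {
    pthread_mutex_t lock;
    pthread_cond_t freed;   /* broadcast whenever a fid is released */
    bool inuse[MAX_FIDS];
};

/* Non-blocking create: fails if the fid number is still in use. */
static bool fid_create(struct fidpool *p, unsigned fid)
{
    bool ok = false;

    pthread_mutex_lock(&p->lock);
    if (fid < MAX_FIDS && !p->inuse[fid]) {
        p->inuse[fid] = true;
        ok = true;
    }
    pthread_mutex_unlock(&p->lock);
    return ok;
}

/* Blocking create: waits until the previous holder releases the fid. */
static bool fid_create_blocking(struct fidpool *p, unsigned fid)
{
    if (fid >= MAX_FIDS)
        return false;

    pthread_mutex_lock(&p->lock);
    while (p->inuse[fid])
        pthread_cond_wait(&p->freed, &p->lock);
    p->inuse[fid] = true;
    pthread_mutex_unlock(&p->lock);
    return true;
}

/* Release a fid and wake any creator waiting for it. */
static void fid_destroy(struct fidpool *p, unsigned fid)
{
    assert(fid < MAX_FIDS);

    pthread_mutex_lock(&p->lock);
    assert(p->inuse[fid]);  /* releasing an unused fid is an accounting bug */
    p->inuse[fid] = false;
    pthread_cond_broadcast(&p->freed);
    pthread_mutex_unlock(&p->lock);
}
```

A real server would likely bound the wait or tie it to connection teardown so that a client that never retires the old fid cannot stall a worker indefinitely; the sketch leaves that out.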