uv_fs_readdir and uv_fs_readdir_next should use readdir #1430
Comments
You mean like making uv_fs_readdir an actual stream? The idea with |
I don't enjoy the idea of making the readdir request a handle-like thing at all. We do have
I had @jclulow looking at this earlier, hopefully he will be able to show an example of what I'm thinking of |
We should be exposing the POSIX opendir(), readdir() and closedir() functions as

This pattern is the only way to get a scalable, streams-like interface that allows for reading very large directories without blocking threads for an unreasonable period of time, and/or blowing out the available heap memory in the process.

I'm working on a patch to move to this interface, and to preserve the existing (essentially broken) interfaces by renaming them

We need to get this done before committing to an interface. Releasing a major version with the status quo means libuv is simply unfit for reading large directories.
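For readers less familiar with the pattern being referred to, here is a minimal sketch of the plain POSIX open/read/close loop. It is not libuv code; `count_entries` is a hypothetical helper written only for illustration. The point is that only one entry is held at a time, so memory use does not grow with the size of the directory.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration: counts the entries in `path`
 * (excluding "." and "..") by streaming them one at a time.
 * Returns -1 if the directory cannot be opened. */
static long count_entries(const char *path) {
    assert(path != NULL);

    DIR *dir = opendir(path);          /* step 1: open the stream */
    if (dir == NULL)
        return -1;

    long count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {   /* step 2: pull one entry */
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        count++;
    }

    closedir(dir);                     /* step 3: release the stream */
    return count;
}
```

A caller would use it as `long n = count_entries("/tmp");` and treat a negative result as an open failure.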
We could instead have
Please do so as separate patches / PRs. Also I'd like for us to discuss the API further before you jump into writing code, as I mentioned above I don't like the idea of exposing the 3 APIs much.
It means that version is not ideal. There are many things we can improve, but then we'd wait forever. This was never communicated as a priority, and I really don't want to rush it in at the last minute, sorry.
I don't think it makes sense to conflate the |
There is no use for exposing
I'd like to hear some more opinions here: /cc @indutny @piscisaureus @bnoordhuis
Opening a directory is an operation that has particular failure modes; e.g. the directory does not exist, or you do not have permission to read entries from it. Reading entries from that directory has different failure modes, many of which are less clear; e.g. a particular directory entry could not be read by a process with the current data model (32-bit vs 64-bit), or the directory is partially corrupt, after a certain number of directory entries have been read.

Also, it emphatically does not make sense to produce an iterator (which this is) function where you receive a set of results from two different entry points. You should begin the iteration with some begin function (
So? We'd return the error in the callback as per usual.
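To make the two failure points in this exchange concrete, here is a small sketch in plain POSIX C (not libuv). The `walk_status` codes and `walk_dir` are hypothetical names invented for this illustration. It relies on the POSIX convention that readdir() returns NULL both at end of directory and on error, so errno has to be cleared before each call to tell the two apart.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch (not libuv names). */
enum walk_status { WALK_OK = 0, WALK_OPEN_FAILED, WALK_READ_FAILED };

/* Walks `path`, storing the number of entries seen in *seen.
 * On failure, *err receives the errno value of the failing call. */
static enum walk_status walk_dir(const char *path, long *seen, int *err) {
    assert(path != NULL && seen != NULL && err != NULL);
    *seen = 0;
    *err = 0;

    DIR *dir = opendir(path);
    if (dir == NULL) {
        *err = errno;                 /* e.g. ENOENT or EACCES */
        return WALK_OPEN_FAILED;
    }

    for (;;) {
        errno = 0;                    /* NULL means EOF or error; errno decides */
        struct dirent *ent = readdir(dir);
        if (ent == NULL) {
            if (errno != 0) {
                *err = errno;         /* save before closedir() can change it */
                closedir(dir);
                return WALK_READ_FAILED;
            }
            break;                    /* clean end of directory */
        }
        (*seen)++;
    }

    closedir(dir);
    return WALK_OK;
}
```

Whether these are reported through separate entry points or through one callback is exactly the design question under discussion; the sketch only shows that the underlying calls fail at different stages.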
My proposal already addresses that: |
The spirit of the APIs here is that you have some form of an initialize and then the subsequent operations that happen on that resource, consider

A contrived example to demonstrate the ideal API:

    int readdir_sync() {
      uv_fs_t req;
      uv_dir_t dirh;
      uv_dirent_t entry;

      assert(uv_fs_opendir(uv_default_loop(),
                           &req,
                           &dirh,
                           "/tmp",
                           UV_DIR_FLAGS_NONE,
                           NULL) == UV_OK);

      while (uv_fs_readdir(uv_default_loop(), &req, &dirh, &entry, NULL) != UV_EOF) {
        assert(req.result == UV_OK);
        // do something with entry
      }

      uv_close(&dirh);
      return 0;
    }

    uv_fs_t req;
    uv_dir_t dirh;
    uv_dirent_t entry;

    void readdir_cb(uv_fs_t* req) {
      assert(req->ptr == &entry);
      assert(req->handle == &dirh);

      if (req->result == UV_EOF) {
        uv_close(&dirh);
      } else {
        // do something with entry
        // have the option to do multiple *sync* readdir operations
        // or defer back to the threadpool
        uv_fs_readdir(uv_default_loop(),
                      req,
                      &dirh,
                      &entry,
                      readdir_cb);
      }
    }

    void opendir_cb(uv_fs_t* req) {
      assert(req->ptr == &dirh);
      uv_fs_readdir(uv_default_loop(),
                    req,
                    &dirh,
                    &entry,
                    readdir_cb);
    }

    int readdir_async() {
      uv_fs_opendir(uv_default_loop(),
                    &req,
                    &dirh,
                    "/tmp",
                    UV_DIR_FLAGS_NONE,
                    opendir_cb);
      return 0;
    }

Having
I agree with you on the first part but readdir() and readdir_r() cannot really fail absent libuv bugs. |
I gave this more thought during the weekend and I think I was wrong, exposing the 3 APIs seems to be the sanest thing to do. Here are some things I mentally noted, some obvious:
API:
The
NOTE: Each API call requires a fresh request.
Open questions:
Tested on Linux, MacOS X and SmartOS. Fixes joyent#1430
@tjfontaine @saghul I've submitted an initial PR that implements @tjfontaine's proposal. Further details are in the PR's comments. |
Tested on Linux, MacOS X, SmartOS and Windows. Fixes joyent#1430
@saghul @tjfontaine I submitted #1574, a new PR that fixes this issue and implements @saghul's latest design proposal. |
Closing, work has begun at #1574
The design intent of readdir is to handle streaming iteration of a large directory, but the current implementations of uv_fs_readdir and uv_fs_readdir_next (while allocating less) do not actually fill this design need.

Seems like uv_fs_readdir is the only function that's needed here, and it can be invoked multiple times, both on and off the threadpool (like with the current other implementations). And then you should be able to call uv_close on the readdir resource. We should treat it like it were just another stream.
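One way to picture "invoked multiple times" is a read call that returns a bounded batch of entries per invocation, so no single call runs for long and the caller decides when to ask for more. The sketch below uses plain POSIX calls rather than libuv; `read_batch`, `count_in_batches` and `SKETCH_NAME_LEN` are hypothetical names, and the fixed name length is an assumption made to keep the example short.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NAME_LEN 256   /* assumption: long enough for this sketch */
#define SKETCH_MAX_BATCH 8

/* Reads up to `cap` names from an open directory stream into `names`.
 * Returns the number stored; 0 means end of directory (a fuller
 * version would report read errors separately). */
static size_t read_batch(DIR *dir, char names[][SKETCH_NAME_LEN], size_t cap) {
    assert(dir != NULL);
    size_t n = 0;
    while (n < cap) {
        struct dirent *ent = readdir(dir);
        if (ent == NULL)
            break;
        strncpy(names[n], ent->d_name, SKETCH_NAME_LEN - 1);
        names[n][SKETCH_NAME_LEN - 1] = '\0';
        n++;
    }
    return n;
}

/* Drains `path` in batches of `batch` entries and returns the total
 * number of entries seen, or -1 if the directory cannot be opened. */
static long count_in_batches(const char *path, size_t batch) {
    char names[SKETCH_MAX_BATCH][SKETCH_NAME_LEN];
    assert(batch > 0 && batch <= SKETCH_MAX_BATCH);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long total = 0;
    size_t got;
    while ((got = read_batch(dir, names, batch)) > 0)
        total += (long)got;   /* each iteration is one bounded "read" */

    closedir(dir);
    return total;
}
```

In an event-loop setting, each `read_batch` call would correspond to one unit of work handed to the threadpool, with the close happening once a call returns zero entries.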