Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error codes on windows #2348

Open
TanninOne opened this issue Jun 24, 2019 · 43 comments
Open

error codes on windows #2348

TanninOne opened this issue Jun 24, 2019 · 43 comments

Comments

@TanninOne
Copy link

  • Version: ? (The one in node.js 10.11.0)
  • Platform: Windows 10

I'm using libuv through node.js so sorry for not knowing the exact version.

One of my pet peeves with node.js/libuv is how many errors on windows are mapped to "UNKNOWN" with all details about the error being lost completely, which makes debugging user reports a nightmare.

Concrete example:
require('child_process').spawn(<something that isn't actually an executable, like a text file>)
calls uv_spawn under the hood (not sure about the parameters, sry) which in turn calls CreateProcessW.
The error produced by windows is code 193 - "%1 is not a valid Win32 application" - which is a fairly useful error message.
uv_spawn seems to return UNKNOWN - "unknown error" which isn't.

Or: accessing any file on disk can trigger 1224 - "The requested operation cannot be performed on a file with a user-mapped section open." due to virus scanners locking files. It should probably map to EBUSY so user code can retry the call but instead it maps, again, to UNKNOWN.

Mapping the above cases to something more reasonable is probably easy enough, but they are only examples that we found more by accident than anything else - we currently have dozens of open error reports with code "UNKNOWN" where the users have no idea on how to reproduce and we have no idea what the trigger may be.
It would help tremendously if there was some way to encode the original error code into the message, even if it was just added into the error message, something like code: UNKNOWN, message: "unknown error (code 1224)" the error message would be infinitely more useful.

Considering windows has an absurd number of different error codes, I don't have a lot of hope that all of them will eventually be mapped to sensible "generic" codes, I think a goal should be to make the unknown error case more useful.

@bnoordhuis
Copy link
Member

You're welcome to open pull requests mapping more Windows errors to libuv errors. See commit 11ce5df for an example.

It would help tremendously if there was some way to encode the original error code into the message, even if it was just added into the error message

Libuv passes errors as integers, not strings. Encoding the Windows error code into the libuv error code would break backwards compatibility (all the UV_E* return codes would change.)

@TanninOne
Copy link
Author

The whole point here is this: When libuv returns UNKNOWN there is no way for us to figure out what caused the problem and what error needs to map to what libuv error, because you're dropping the information.
The above cases are completely random examples that we managed to reproduce out of potentially thousands and MS is introducing new ones by adding new features.

Creating a pull-request adding these mappings doesn't get me any closer to identifying all the other "UNKNOWN" errors being reported.

@bnoordhuis
Copy link
Member

Some loss of fidelity is expected when it comes to system errors, that's what you buy into when you use libuv.

If you can think of a way to expose (more information about) the native error without breaking backwards API/ABI compatibility, let me know.

I'll close this out for now but happy to reopen when there's something actionable.

@TanninOne
Copy link
Author

Since I don't use libuv directly I don't know what options you have but one approach would be to treat the error code as two 16bit values instead of one 32bit number - with the lower 16 bits being the translated, generic code and the upper 16bits the original system-specific code.

Node.js can then "&= 0xFFFF" for all programmatic handling of the error code and attach the system error code as is as an additional field to error objects.

With that information in hand, we (application devs) then have a chance to investigate the issues and report back what actual errors happen in the wild and improve the error code mapping.

If that's not feasible, do what GetLastError or errno do: provide a (thread-)global variable containing the last system error associated with error you're returning.
When node.js creates an Error object it fetches that global and again attaches it to the error object in an additional field.

@bnoordhuis
Copy link
Member

Encoding error codes is not an option because of backwards compatibility. I also thought of that hypothetical uv_get_last_error() function, however...

  1. There's room in uv_handle_t and uv_req_t to store the error code, but uv_loop_t might be a problem and I expect thread-local storage wouldn't work.

  2. Threading through the error code so uv_get_last_error() returns the right error at the right time might be trickier than it seems.

  3. It'd be a Windows Special(TM), something only meaningful on Windows. On other platforms it'd simply return the error code that the failing handle/request/call produces. Libuv tries to avoid such APIs.

(Maybe the Haiku or z/OS or a future Fuchsia port could return something meaningful but that needs investigation.)

If you think those issues can be overcome, you're welcome to open a (WIP or otherwise) pull request for further discussion.

@TanninOne
Copy link
Author

Encoding error codes is not an option because of backwards compatibility

so? Then break backwards compatibility. The whole point of having the first number in semantic versioning is so you can signal to your users when you do. What's stopping you from making this an api change for libuv 2.0?

and I expect thread-local storage wouldn't work.

Why wouldn't it?

might be trickier than it seems.

Might be or will be? Why not investigate that before this issue is closed and forgotten?

Libuv tries to avoid such APIs.

Libuv is doing it wrong then. You can't force every platform into the same limited corset of functionality without severely limiting its usefulness.

Right now the platform support for Windows is seriously borked. This here is not a small issue to me and I have a hard time believing I'm the only one.
A fix for that should not fail because of some arbitrary rule. If this is so hard to fix, that's a serious design flaw in libuv. Platforms differ, a cross platform toolkit needs to be able to account for those differences not ignore them.
I've been using Qt and python cross-platform for over 10 years and with neither I had this kind of problems.

you're welcome to open a (WIP or otherwise) pull request

It would be way too much work for me to acquire enough knowledge about the libuv code to do that. From my perspective a better use of my time is to write my own native modules to circumvent libuv as much as possible.

No offense but imho the start page of libuv should say that Windows is not a fully supported platform, that would have saved me a lot of grief.

@bnoordhuis
Copy link
Member

Overly dramatic replies are unlikely to get you what you want.

#1597 goes into a lot of detail why a v2.x release is unlikely to happen anytime soon.

As to your "Windows is not a fully supported platform" quip, there are quite possibly more Windows users than all other platforms combined. Libuv shows up in a ton of places and most people seem pretty happy with it.

To answer your other questions:

  • Thread-local storage: because it's thread-local. :-) Not everyone restricts their event loop usage to a single thread. It's not an endorsed way of using libuv but I'm also not going to break existing uses.

  • uv_get_last_error() trickiness: I'll be less coy: there are a number of thorny issues to deal with. I'd have listed them and suggested solutions if you had picked this up.

@bzoz
Copy link
Member

bzoz commented Jul 3, 2019

uv_translate_sys_error could set some internal uv__last_error that uv_get_last_error would return. We could maybe name it uv_error_details, so other platforms could possibly reuse that in the future.

While adding more error mappings is always good, but there are a lot of Windows error codes. We cannot possibly map them all. I say we reopen this since the issue is real.

@bzoz bzoz reopened this Jul 3, 2019
@TanninOne
Copy link
Author

Thank you bzoz!

For the time being I've written a small native module that uses MS Detours to crowbar something into uv_translate_sys_error that saves away the last windows error that produced "UNKNOWN" and then I fetch it asap from my JS code.

The main issue - like bnoordhuis already pointed out - is that I can't store the error in a way that lets me reliably match it to the Error object so all I get is "this UNKNOWN error happened, also this windows error recently caused an UNKNOWN error, maybe they are the same, maybe not."

I hope this still gives me a better idea of what "unknown" errors are actually happening in the wild using my software.

@TanninOne
Copy link
Author

Saw ERROR_NOT_READY (code 21) happen on a call to fsync - mapped to UNKNOWN.
According to MS this usually indicates a serious hardware error, not sure what it should be mapped to. Either EBUSY so client code can retry the op hoping it's not a hardware failure or EIO?

@TanninOne
Copy link
Author

Another "UNKNOWN" one: ERROR_VIRUS_INFECTED (code 225) happens when writing a file.

Not sure how to map that...

@TanninOne
Copy link
Author

TanninOne commented Aug 9, 2019

ERROR_DEVICE_HARDWARE_ERROR (483) reported on CreateFile.

Windows reports it as "The request failed due to a fatal device hardware error."

This followed immediately after a ERROR_IO_DEVICE which is actually already mapped to EIO.

@saghul
Copy link
Member

saghul commented Aug 9, 2019

We can probably check here as well: https://github.com/chromium/chromium/blob/master/net/base/net_errors_win.cc

@saghul
Copy link
Member

saghul commented Aug 9, 2019

As for the ones you reported already, my 2 cents:

ERROR_NOT_READY -> EAGAIN
ERROR_VIRUS_INFECTED -> EIO
ERROR_DEVICE_HARDWARE_ERROR -> EIO

@TanninOne
Copy link
Author

I kind of wish there a libuv code that was a bit more precise for the "ERROR_VIRUS_INFECTED" case because in that case the user actually may have to take action based on it by white-listing the application or reporting the false positive (if it is one) to the AV company.
EIO doesn't really tell the user what to do with it.

@saghul
Copy link
Member

saghul commented Aug 9, 2019

On Windows, there actual error is available in the uv_fs_t structure:

DWORD sys_errno_; \

While it's part of the "private" fields, you should be ok using it. Now, since you are using Node, you'd need Node to expose this too.

At the end of the the day I don't think it's libuv / node's resposibility to tell you the file was infected, you have the AV software for that.

@TanninOne
Copy link
Author

Develop software for Windows for a bit and you will learn why it doesn't work like that. Some AVs will silently delete/quarantine files without ever reporting it. Many users will click "ok" on every dialog without reading it.
At the end of the day the only thing that matters to me is the feedback my users give me and if the report the framework provides doesn't let me distinguish between a hardware defect and a false positive from the AV, I can't categorize issues, can't provide help, can't figure out if the error is actually something I have to look into.

This is not about "node having a responsibility to tell that a file was virus infected" but if ERROR_VIRUS_INFECTED is reported as EIO then every EIO could be an AV false positive. The more different errors you map to any libuv error code the more fuzzy the libuv code becomes.

@TanninOne
Copy link
Author

If introducing a new error code in libuv is no option, maybe mapping ERROR_VIRUS_INFECTED to EPERM is more appropriate than EIO.

@saghul
Copy link
Member

saghul commented Aug 19, 2019

If introducing a new error code in libuv is no option

That hasn't been discussed. @libuv/collaborators thoughts on adding UV_EVIRUS? Obviously it wouldn't map to anything on Unix. FWIW: Chromium's error mappings define a specific one for this case. I'm personally +0.25.

maybe mapping ERROR_VIRUS_INFECTED to EPERM is more appropriate than EIO.

I thinkthat's even more misleading, because EPERM means there was a permission error, generally.

@TanninOne
Copy link
Author

Yes, but one could argue that the virus scanner here acts like a form of file protection, does it really matter how and for what purpose the file is "protected"?
Just like with classical file permissions the user would use software to allow access to the file (if it was a false positive). The act of white-listing a file in the AV is kind of similar to giving a use account access to a file, more so than replacing a broken hard disk...

More error codes I encountered:
ERROR_CLOUD_FILE_PROVIDER_NOT_RUNNING (362), ERROR_CLOUD_FILE_VALIDATION_FAILED (383) and ERROR_CLOUD_FILE_PROVIDER_TERMINATED (404) all happened when the software tried to read/write a (text-)file in the users documents folder which happened to be on onedrive: C:\Users<username>\OneDrive\

Essentially you can probably get one of an entire cluster of cloud-related error messages (roughly between 362-404) simply by trying to access documents when the user has decided to use onedrive.

@cjihrig
Copy link
Contributor

cjihrig commented Aug 19, 2019

thoughts on adding UV_EVIRUS?

I don't really have strong feelings either way. My only concern is that embedders might need to add a bunch of checks for UV_EVIRUS (or other platform specific code). I'm assuming we would be targeting v1.x for this, meaning that it could impact existing code.

I think I prefer a separate API for getting the Windows specific error (as @bzoz suggested) if it's feasible.

@saghul
Copy link
Member

saghul commented Aug 19, 2019

My only concern is that embedders might need to add a bunch of checks for UV_EVIRUS (or other platform specific code).

How do? We'd take care of the mapping, just like the other libuv errors.

I'm assuming we would be targeting v1.x for this, meaning that it could impact existing code.

Yeah, that's what I had in mind.

I think I prefer a separate API for getting the Windows specific error (as @bzoz suggested) if it's feasible.

This would be really simple since we already have that field there. But this doesn't help current libuv users (Node being one of them), does it? If one sees this one popping up on a backtrace they'd just go and handle it, I guess.

In general, this discussion has bifurcated a bit, let me summarize FTR:

  • we have some (easily fixable) gaps in our Windows error mapping
  • there are Windows errors which don't have an equivalent on Unix: what do we do with those?

I don'think anyone disagrees with the first one, it's just a matter of mapping them to whatever makes sense.

As for the second one... things get a bit tricky. Generally we seem to fold things to the Unix side, but here we have nothing to fold these errors to. We can provide an API, yes, but then we wouldn't be that platform agnostic, would we?

Also important: the need for the latter seems to have come up exactly once :-P

@vtjnash
Copy link
Member

vtjnash commented Aug 19, 2019

One idea might be to call SetLastError just before return (or callback) in libuv, so that the value is visible to the user application if it wants to retrieve it.

@TanninOne
Copy link
Author

we have some (easily fixable) gaps in our Windows error mapping

I don't really fully agree with "some" or "easily fixable" here. There are at least 40 error messages related to OneDrive in the cluster of errors I mentioned above, some indicate temporary network errors, some configuration errors the user has to fix, some might indicate broken data.
Mapping those alone requires an understanding of onedrive that I don't have, I can only make guesses.

I still feel like we're losing a lot of information even when adding the mappings, I hoped that me posting all these error messages I'm getting demonstrated how useful it would be to get at the original error message.

Besides: As our discussion above regarding ERROR_VIRUS_INFECTED demonstrates it's not all that easy to determine how to map errors, whether the mapping should happen based on what the error most closely represents (EIO) or what gives the user/developer the best indication of how to deal with it (EPERM).

I will continue to post error codes I experienced in the wild if that is useful but for the record: No, I don't think that adding more mappings is an appropriate solution.

We can provide an API, yes, but then we wouldn't be that platform agnostic, would we?

It depends on how you want to define platform agnostic, do you want to support only the set of functionality that all platforms support?
libuv or node (tbh I don't really know where the line is) seems to be focused on exposing a unixoid api on every platform and I get why it's doing it but you have to be aware that this is detrimental to the quality on Windows.
I apologize for this long-ass post but let me give another example here: On unix (or node.js) if you want to recursively read a directory you call readdir, then stat on each item, if it's a dir you recurse into that directory.
On Windows the readdir equivalent (NtQueryDirectoryFile) already returns file attributes in the initial call, all the stat calls are superfluous.
A recursive directory listing using readdir&stat on windows is considerably slower (2-3 times) than what a native Windows application using native functions could achieve.
If libuv was a higher-level api (like Qt) it could provide its own "recursive directory listing" function that still has the same api on all platforms, still platform agnostic, but performance-optimized for each.

Obviously libuv can't just be fully redesigned, it is what it is. I just want you to understand that there is a considerable cost to how low-level libuv is.
libuv has obviously been designed by linux/unix users with very little understanding of Windows or at least very little consideration for it. Shoehorning windows into a unixoid api does not make this framework particularly platform-agnostic. The fact that there are so many windows-specific issues with it make this a very leaky abstraction anyway.
Providing an api to get at the original windows error code wouldn't make libuv less platform-agnostic, it would just make the existing, undeniable, unfixable platform differences more manageable.

Please, please just understand that Windows is not unix with CamelCase function names.

One idea might be to call SetLastError just before return (or callback) in libuv, so that the value is visible to the user application if it wants to retrieve it.

I think it would be safer to use a custom api/custom global to ensure the lasterror isn't overwritten by the next call to the windows api, especially when the user application uses libraries (stl, boost, ...) it's practically impossible to know which call might, under the hood, trigger a call to the windows api and it would be an implementation detail that could change between versions. No library makes any promises on whether they affect the platform "last error" variable.

@saghul
Copy link
Member

saghul commented Aug 20, 2019

I don't really fully agree with "some" or "easily fixable" here.

In the immortal words of The Dude: that's just, like, your opinion, man.

There are at least 40 error messages related to OneDrive in the cluster of errors I mentioned above, some indicate temporary network errors, some configuration errors the user has to fix, some might indicate broken data.
Mapping those alone requires an understanding of onedrive that I don't have, I can only make guesses.

So? That's not an impossible to solve problem. It just requires compiling the list (which you have been doing already!) and see what we can map them too, as accurately as possible.

I still feel like we're losing a lot of information even when adding the mappings, I hoped that me posting all these error messages I'm getting demonstrated how useful it would be to get at the original error message.

No, you haven't proved that. The operation would have still failed, and there would be nothing your application could do to fix it, would there? Yes, more descriptive error messages, that much we know.

Besides: As our discussion above regarding ERROR_VIRUS_INFECTED demonstrates it's not all that easy to determine how to map errors, whether the mapping should happen based on what the error most closely represents (EIO) or what gives the user/developer the best indication of how to deal with it (EPERM).

I suggested we add EVIRUS so I don't know what you mean here.

It depends on how you want to define platform agnostic, do you want to support only the set of functionality that all platforms support?

libuv supports what all platforms support, as best as possible.

libuv or node (tbh I don't really know where the line is) seems to be focused on exposing a unixoid api on every platform and I get why it's doing it but you have to be aware that this is detrimental to the quality on Windows.

Really uh? Tell me if the libuv API doesn't look more like IOCP than epoll, I dare you :-)

On unix (or node.js) if you want to recursively read a directory you call readdir, then stat on each item, if it's a dir you recurse into that directory.
On Windows the readdir equivalent (NtQueryDirectoryFile) already returns file attributes in the initial call, all the stat calls are superfluous.
A recursive directory listing using readdir&stat on windows is considerably slower (2-3 times) than what a native Windows application using native functions could achieve.
If libuv was a higher-level api (like Qt) it could provide its own "recursive directory listing" function that still has the same api on all platforms, still platform agnostic, but performance-optimized for each.

Or you could write a proposal and a PR.

libuv has obviously been designed by linux/unix users with very little understanding of Windows or at least very little consideration for it.

This is just not true. Show some damned respect for the people who worked hard to make Windows a first class citizen, which it is. libuv was designed with Windows in mind from the get go: https://tinyclouds.org/iocp-links.html

I'll stop here because I feel trying to help you has been a colosal waste of my time.

@TanninOne
Copy link
Author

In the immortal words of The Dude: that's just, like, your opinion, man.

Nor did I claim it was. You said "I don'think anyone disagrees with the first one" - acting like you were stating a fact. I was just pointing out that I do in fact disagree.

So? That's not an impossible to solve problem. It just requires compiling the list (which you have been doing already!) and see what we can map them too, as accurately as possible.

Have fun doing that and when you're done there is only ~16000 more error codes still not mapped left.

I still feel like we're losing a lot of information even when adding the mappings, I hoped that me posting all these error messages I'm getting demonstrated how useful it would be to get at the original error message.

No, you haven't proved that.

I didn't say I proved that, I said "I still feel like".
But the fact that I encountered half a dozen errors that all map to UNKNOWN in my one application in a couple of weeks, errors that we wouldn't have been able to identify at all if I hadn't hacked in the feature I'm proposing here doesn't give make you think that maybe it is useful?

The operation would have still failed, and there would be nothing your application could do to fix it, would there? Yes, more descriptive error messages, that much we know.

The fact that I can tell the user "this is a problem in your AV, check the configuration" and then stop the application from auto-reporting it as a bug makes a world of difference and will, over time, save my company a lot of money in support.
The fact that the user now gets an indication of what went wrong and can fix the problem instead of - well - stopping to use the software because it's not working for unexplained reasons, too makes a serious difference.
A solvable problem (solvable by the user or the application) being reported with a non-descript message is not a small problem.

I suggested we add EVIRUS so I don't know what you mean here.

No you didn't, you suggested mapping it to EIO. I suggested introducing a new code and you initially argued against it. Scroll up a bit mate.

libuv supports what all platforms support, as best as possible.

No, it supports all platforms as best as it can. That's a difference. A lot more would be possible with a different design, but that's not really the point here.
Essentially libuv is exposing all of the weaknesses of the windows api while not taking advantage of any of it's strength, while basically mirroring the linux api 1:1.
If you're ok with that, that's fine, your opinion is your own but don't tell me I have to like it or that it can't be done better when other cross-platform toolkits have been doing it better for close to 30 years.

Or you could write a proposal and a PR.

What I posted was an example of how the low-level nature of libuv causes problems, not an application to become a libuv contributor. I don't intend to, considering how this issue has been handled less than ever.

This is just not true. Show some damned respect for the people who worked hard to make Windows a first class citizen, which it is. libuv was designed with Windows in mind from the get go: https://tinyclouds.org/iocp-links.html

My beef was with the external api which replicates a low-level posix api, not with the internal implementation. My problem is not with how async is implemented under the hood but with the fact that the exposed api is a pita on Windows, performance issues are one example but lack of descriptive error message is just another side of the same coin.

You post a document from when libuv was introduced but libuv was spawned from node.js which was not designed with windows in mind.
The documentation of the fs module in node.js even states, as the first sentence "The fs module provides an API for interacting with the file system in a manner closely modeled around standard POSIX functions."
Are you claiming that this design of node.js and the fact libuv was initially developed for node.js did not affect the design of libuv?
node.js was ported to windows after it was developed exclusively for unix for the first couple of years.
libuv is part of that port as far as I understand it. That doesn't mean that libuv wasn't constrained from the get go by the focus on unix.

I did not intend to be disrespectful to the people who put work into this project and I don't think I was.
But just because I respect work doesn't mean I can't criticize it. If you can't appreciate constructive criticism that's your problem.

You have a very curious look on how open-source development is supposed to work. Issues are "invalid" unless they come with a PR to fix them and criticism isn't acceptable because someone created the work being criticized (which I guess means criticism isn't ever acceptable)?

@saghul
Copy link
Member

saghul commented Aug 20, 2019

You post a document from when libuv was introduced but libuv was spawned from node.js which was not designed with windows in mind.
The documentation of the fs module in node.js even states, as the first sentence "The fs module provides an API for interacting with the file system in a manner closely modeled around standard POSIX functions."
Are you claiming that this design of node.js and the fact libuv was initially developed for node.js did not affect the design of libuv?
node.js was ported to windows after it was developed exclusively for unix for the first couple of years.
libuv is part of that port as far as I understand it. That doesn't mean that libuv wasn't constrained from the get go by the focus on unix.

You are mistaken. Initially Nodejs used libev on Unix and it gained Windows support later. It was seen that that wasn't very future-proof and a step back was taken to design a platform layer from the ground up (dumping libev in the process too, even though that happened a tad later) and libuv is the result of that.

Abstracting things means we'll walways lose something on each platform in the process, I'm not sure if tha's fixable because then you'd be back to platform specific code.

I did not intend to be disrespectful to the people who put work into this project and I don't think I was.
But just because I respect work doesn't mean I can't criticize it. If you can't appreciate constructive criticism that's your problem.

I think you were disrespectful.

You have a very curious look on how open-source development is supposed to work. Issues are "invalid" unless they come with a PR to fix them and criticism isn't acceptable because someone created the work being criticized (which I guess means criticism isn't ever acceptable)?

Not ever once have I said this is invalid. I've been trying to help you find a way forward.

@TanninOne
Copy link
Author

TanninOne commented Aug 20, 2019

Abstracting things means we'll walways lose something on each platform in the process

So then, can you quote a few examples what functionalities a linux system "loses" (like ACLs on windows)? Performance penalties incurred on a linux system that don't occur on windows (like the aforementioned readdir/stat problem)? Feedback linux provides that gets dropped by libuv (like this issue is about)?

I'm not sure if tha's fixable because then you'd be back to platform specific code

Again, this is mostly because of the low-level nature of libuv.
Take ACLs for example. Yes, Qt too doesn't have direct support for ACLs (NTFS or otherwise), that would be platform specific. It does however provide high-level functions like "QFileInfo::isWritable" that will consider ACLs on platforms that support them.
In libuv - as far as I can tell - you can only use stat which will use only the dos "read-only" attribute and is thus practically useless (for this purpose) on windows.

In the same way: A high-level "make sure user x can read and write file y"-function could be implemented cross-platform cross-fs and do the right thing on each to fulfill the request.
But a posix "chmod(0x744)" may not have the same effect on windows and linux and is thereby a leaky abstraction. It's not "necessary because of platform-agnostic", it is actually exposing platform differences to the dev or - worse - the user and requires the application to deal with them in a very awkward way because the framework provides no support at all.

Yes, some limitations are probably necessary for platform independence but not to the degree libuv does it.

I think you were disrespectful.

If I came across as disrespectful I want to apologize to anyone who thinks so. My intend was simply to expose limitations of the libuv api and explain how they are not "necessary".

Again, the problem this issue is about: If libuv introduces a "get native error code" function that doesn't make the api less "platform agnostic", the way the api behaves differently between platform already means a dev is going to have platform specific code anyway, this new api would just make that code far less awkward.

EDIT: I already use native code for fast directory iteration and ACLs on windows because my application needs both. Now I need another native library just to identify error cases and at least this one I'd like to scrap at some point.

Give devs a way to break out of the platform abstraction, there is no shame in that. If they don't need it: great - but in practice libuv is not going to be able to cover everything.

@vtjnash
Copy link
Member

vtjnash commented Aug 20, 2019

On Windows the readdir equivalent (NtQueryDirectoryFile) already returns file attributes in the initial call, all the stat calls are superfluous.

What stat calls? If you don't want to come across as disrespectful, don't wander off topic at all, and especially don't claim things that are easy to verify without a source link like this one:

libuv/src/win/fs.c

Lines 1546 to 1584 in 1bd7cc5

while (dirent_idx < dir->nentries) {
if (dir->need_find_call && FindNextFileW(dir->dir_handle, find_data) == 0) {
if (GetLastError() == ERROR_NO_MORE_FILES)
break;
goto error;
}
/* Skip "." and ".." entries. */
if (find_data->cFileName[0] == L'.' &&
(find_data->cFileName[1] == L'\0' ||
(find_data->cFileName[1] == L'.' &&
find_data->cFileName[2] == L'\0'))) {
dir->need_find_call = TRUE;
continue;
}
r = uv__convert_utf16_to_utf8((const WCHAR*) &find_data->cFileName,
-1,
(char**) &dirents[dirent_idx].name);
if (r != 0)
goto error;
/* Copy file type. */
if ((find_data->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0)
dent.d_type = UV__DT_DIR;
else if ((find_data->dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0)
dent.d_type = UV__DT_LINK;
else if ((find_data->dwFileAttributes & FILE_ATTRIBUTE_DEVICE) != 0)
dent.d_type = UV__DT_CHAR;
else
dent.d_type = UV__DT_FILE;
dirents[dirent_idx].type = uv__fs_get_dirent_type(&dent);
dir->need_find_call = TRUE;
++dirent_idx;
}
SET_REQ_RESULT(req, dirent_idx);
return;

See? It just uses the native API and returns the name and type.

It does however provide high-level functions like "QFileInfo::isWritable" that will consider ACLs on platforms that support them.

The posix name for this is access, which libuv also has. The presence of ACL's (and the TOCTOU bug in your suggestion) causing problems for programs that use chmod and stat isn't unique to Windows. FWIW, Windows itself exports the _chmod function that you're saying can't be platform-agnostic.

lasterror isn't overwritten by the next call to the windows api

Eh, just don't make any more API calls in the meantime. This way has worked for native Windows and Posix developers for decades. It's a bit of a pain, for sure, but it's been feasible.

@TanninOne
Copy link
Author

TanninOne commented Aug 20, 2019

See? It just uses the native API and returns the name and type.

Did you overlook or not understand this part of my post: "if you want to recursively read a directory you call readdir, then stat on each item, if it's a dir you recurse into that directory." ?
I'm not talking about how readdir is implemented, I'm talking about how you'd use readdir to recursively read a directory.

The posix name for this is access, which libuv also has.

the libuv implementation too doesn't support acls any more than stat so what's the point?

and the TOCTOU bug in your suggestion

a) I'm not making suggestions, - like at all - in regards to the api, I'm merely pointing out limitations.
b) how is there a TOCTOU bug in a single isWritable call? What are you even talking about?

causing problems for programs that use chmod and stat isn't unique to Windows

No and I didn't say it was unique to Windows. I said that Windows is affected by this problem, how does that imply exclusivity?

FWIW, Windows itself exports the _chmod function that you're saying can't be platform-agnostic.

again, I don't see how that's relevant. Yes, Windows has a posix-alike api, that doesn't proof - in any way - that that api isn't limited or limiting.
Afaik MS primarily implemented that api because the FIPS requires it, deducing that that means it's a good idea to use it - for any purpose - is very flawed logic.
Windows 10 also still supports APIs that it deprecated with Windows 95. Still doesn't mean it's a good idea to use them.

Eh, just don't make any more API calls in the meantime. This way has worked for native Windows and Posix developers for decades. It's a bit of a pain, for sure, but it's been feasible.

sigh. I didn't say it's not possible, I said I would prefer a libuv-specific "errno" because then you can trust that nothing but a call to libuv will overwrite it.
If you have a different opinion, fine, no reason to "Eh".

@TanninOne
Copy link
Author

ERROR_TIMEOUT (1460) in file write operation

"The cloud operation cannot be performed on a file with incompatible hardlinks." (396)
I didn't find the symbol for this one and unfortunately the error message had no stack but going by the error message we were trying to create a hardlink on a onedrive-synced directory or something.

"A device which does not exist was specified." (433) in file read operation.
Same, no symbol found

@TanninOne
Copy link
Author

TanninOne commented Sep 26, 2019

ERROR_FILE_OFFLINE (4350) may happen when accessing a file on a network-share. Guess ENOENT?

@TanninOne
Copy link
Author

ERROR_CANT_ACCESS_FILE (1920) - The file cannot be accessed by the system.
Happened accessing a file

Seems to be pretty broad, google results seem to indicate this might either be a filesystem corruption (EIO) but in the context of starting services at least it seems to also possibly indicate permission problems (EACCESS)

@TanninOne
Copy link
Author

ERROR_UNEXP_NET_ERR (59) - An unexpected network error occurred.
Happened calling fsync - I'll assume the file was on a network drive.

@TanninOne
Copy link
Author

ERROR_REPARSE_TAG_MISMATCH (4394) - There is a mismatch between the tag specified in the request and the tag present in the reparse point.

Happened trying to remove a file - potentially a hard-link from a OneDrive directory.

@TheAifam5
Copy link

TheAifam5 commented Nov 5, 2019

I hope you are not gonna post every M$ error code. 🙃

What about removing error code mapping and just return the original value?

Mapping every error code for Linux/Windows/MacOS/Zircon(Fuchsia) will take forever. Have in mind that this library may work on other systems in the future.
Managing and updating the code every time M$ decides to change/remove/add the error codes (unlikely remove, but likely deprecate) will be a real horror.

Otherwise,
What about leave as is and provide a function or something to handle unknown error codes and maybe someone will share his extended „mapper“. libuv could slowly expand the current mapper meanwhile.

@bzoz
Copy link
Member

bzoz commented Nov 5, 2019

Removing the mapping would break backward compatibility.

For most of the fs stuff, the original error is stored in sys_errno_ field of uv_fs_t. We could document that and expose it in Node.

@saghul
Copy link
Member

saghul commented Nov 5, 2019

As long as they don't collide with our own mappings, we could also return a negated error code, as on Unix, maybe?

@bzoz
Copy link
Member

bzoz commented Nov 5, 2019

SGTM

@TanninOne
Copy link
Author

TanninOne commented Nov 5, 2019

ERROR_VOLUME_DIRTY (6851) - The operation could not be completed because the volume is dirty. Please run chkdsk and try again.

Happened writing to a file, probably overwriting an existing file.

ERROR_DISK_OPERATION_FAILED (1127) - While accessing the hard disk, a disk operation failed even after retries.

Happened trying to rename a file.

@TanninOne
Copy link
Author

I hope you are not gonna post every M$ error code. 🙃

What about removing error code mapping and just return the original value?

Mapping every error code for Linux/Windows/MacOS/Zircon(Fuchsia) will take forever. Have in mind that this library may work on other systems in the future.
Managing and updating the code every time M$ decides to change/remove/add the error codes (unlikely remove, but likely deprecate) will be a real horror.

Otherwise,
What about leave as is and provide a function or something to handle unknown error codes and maybe someone will share his extended „mapper“. libuv could slowly expand the current mapper meanwhile.

Sorry I didn't reply before, didn't see this message.
My initial request was for a way to get at the original code, not to map all windows error codes to unix codes.
It was bnoordhuis and saghul who requested I continue listing codes so they can be mapped.
I'm only listing errors my user actually got, so it's not like I'm just going down the list of error codes, they all happened in the wild.
I was hoping this would demonstrate how futile it is to try and map everything because every few weeks at least there is one or two new errors being reported that I hadn't seen before. I've kind of given up because I didn't get any feedback for months but if the plan is to actually map everything, I can probably continue listing new codes like this for years to come.

What I'm currently doing, using my own native lib to dig up the original code, is a massive improvement in how well I can support user problems and deal with these issues but the way I had to implement it it's a terrible hack that I'd reeeeeally like to replace.

@bnoordhuis
Copy link
Member

It was bnoordhuis and saghul who requested I continue listing codes so they can be mapped.

Not quite. You're welcome to open pull requests mapping error codes that you need / run into.

@TanninOne
Copy link
Author

Yes, sorry, I just took that to mean that that is the preferred way to deal with the problem.

bzoz added a commit to JaneaSystems/libuv that referenced this issue Apr 28, 2020
Exposes the original system error of the filesystem syscalls. Adds a new
uv_fs_get_system_error which returns orignal errno on Linux or
GetLastError on Windows.

Ref: libuv#2348
bzoz added a commit to JaneaSystems/libuv that referenced this issue Apr 29, 2020
Exposes the original system error of the filesystem syscalls. Adds a new
uv_fs_get_system_error which returns orignal errno on Linux or
GetLastError on Windows.

Ref: libuv#2348
bzoz added a commit that referenced this issue Apr 29, 2020
Exposes the original system error of the filesystem syscalls. Adds a new
uv_fs_get_system_error which returns orignal errno on Linux or
GetLastError on Windows.

Ref: #2348

PR-URL: #2810
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants