error codes on windows #2348
Comments
You're welcome to open pull requests mapping more Windows errors to libuv errors. See commit 11ce5df for an example.
Libuv passes errors as integers, not strings. Encoding the Windows error code into the libuv error code would break backwards compatibility (all the |
The whole point here is this: When libuv returns UNKNOWN there is no way for us to figure out what caused the problem and what error needs to map to what libuv error, because you're dropping the information. Creating a pull-request adding these mappings doesn't get me any closer to identifying all the other "UNKNOWN" errors being reported.
Some loss of fidelity is expected when it comes to system errors, that's what you buy into when you use libuv. If you can think of a way to expose (more information about) the native error without breaking backwards API/ABI compatibility, let me know. I'll close this out for now but happy to reopen when there's something actionable.
Since I don't use libuv directly I don't know what options you have, but one approach would be to treat the error code as two 16-bit values instead of one 32-bit number, with the lower 16 bits being the translated, generic code and the upper 16 bits the original system-specific code. Node.js can then "&= 0xFFFF" for all programmatic handling of the error code and attach the system error code as is as an additional field to error objects. With that information in hand, we (application devs) then have a chance to investigate the issues and report back what actual errors happen in the wild and improve the error code mapping. If that's not feasible, do what GetLastError or errno do: provide a (thread-)global variable containing the last system error associated with the error you're returning.
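The two-halves idea in the comment above could be sketched roughly as follows. This is only an illustration: `pack_error`, `generic_code` and `system_code` are hypothetical names, not libuv functions, and the sketch assumes both codes fit into 16 bits and are non-negative, which is not true of libuv's actual (negative) error codes.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of libuv. The translated, generic code
 * goes into the low 16 bits, the original system-specific code into the
 * high 16 bits. Assumes both inputs fit into 16 bits. */
static uint32_t pack_error(uint16_t generic, uint16_t system) {
  return ((uint32_t) system << 16) | generic;
}

/* What a consumer would do for programmatic handling: code &= 0xFFFF. */
static uint16_t generic_code(uint32_t packed) {
  return (uint16_t) (packed & 0xFFFFu);
}

/* The original system code, e.g. to attach as an extra field on an error. */
static uint16_t system_code(uint32_t packed) {
  return (uint16_t) (packed >> 16);
}
```

Existing callers that compare the returned value against a fixed constant would see a different number once the upper half is populated, which is the backwards-compatibility concern raised in the replies.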
Encoding error codes is not an option because of backwards compatibility. I also thought of that hypothetical
(Maybe the Haiku or z/OS or a future Fuchsia port could return something meaningful but that needs investigation.) If you think those issues can be overcome, you're welcome to open a (WIP or otherwise) pull request for further discussion.
so? Then break backwards compatibility. The whole point of having the first number in semantic versioning is so you can signal to your users when you do. What's stopping you from making this an api change for libuv 2.0?
Why wouldn't it?
Might be or will be? Why not investigate that before this issue is closed and forgotten?
Libuv is doing it wrong then. You can't force every platform into the same limited corset of functionality without severely limiting its usefulness. Right now the platform support for Windows is seriously borked. This here is not a small issue to me and I have a hard time believing I'm the only one.
It would be way too much work for me to acquire enough knowledge about the libuv code to do that. From my perspective a better use of my time is to write my own native modules to circumvent libuv as much as possible. No offense but imho the start page of libuv should say that Windows is not a fully supported platform, that would have saved me a lot of grief.
Overly dramatic replies are unlikely to get you what you want. #1597 goes into a lot of detail why a v2.x release is unlikely to happen anytime soon. As to your "Windows is not a fully supported platform" quip, there are quite possibly more Windows users than all other platforms combined. Libuv shows up in a ton of places and most people seem pretty happy with it. To answer your other questions:
While adding more error mappings is always good, there are a lot of Windows error codes. We cannot possibly map them all. I say we reopen this since the issue is real.
Thank you bzoz! For the time being I've written a small native module that uses MS Detours to crowbar something into uv_translate_sys_error that saves away the last windows error that produced "UNKNOWN" and then I fetch it asap from my JS code. The main issue - like bnoordhuis already pointed out - is that I can't store the error in a way that lets me reliably match it to the Error object so all I get is "this UNKNOWN error happened, also this windows error recently caused an UNKNOWN error, maybe they are the same, maybe not." I hope this still gives me a better idea of what "unknown" errors are actually happening in the wild using my software.
Saw ERROR_NOT_READY (code 21) happen on a call to fsync - mapped to UNKNOWN.
Another "UNKNOWN" one: ERROR_VIRUS_INFECTED (code 225) happens when writing a file. Not sure how to map that...
ERROR_DEVICE_HARDWARE_ERROR (483) reported on CreateFile. Windows reports it as "The request failed due to a fatal device hardware error." This followed immediately after an ERROR_IO_DEVICE which is actually already mapped to EIO.
We can probably check here as well: https://github.com/chromium/chromium/blob/master/net/base/net_errors_win.cc
As for the ones you reported already, my 2 cents: ERROR_NOT_READY -> EAGAIN
I kind of wish there were a libuv code that was a bit more precise for the "ERROR_VIRUS_INFECTED" case because in that case the user actually may have to take action based on it by white-listing the application or reporting the false positive (if it is one) to the AV company.
On Windows, the actual error is available in the Line 611 in 2c27950
While it's part of the "private" fields, you should be ok using it. Now, since you are using Node, you'd need Node to expose this too. At the end of the day I don't think it's libuv / node's responsibility to tell you the file was infected, you have the AV software for that.
Develop software for Windows for a bit and you will learn why it doesn't work like that. Some AVs will silently delete/quarantine files without ever reporting it. Many users will click "ok" on every dialog without reading it. This is not about "node having a responsibility to tell that a file was virus infected" but if ERROR_VIRUS_INFECTED is reported as EIO then every EIO could be an AV false positive. The more different errors you map to any libuv error code the more fuzzy the libuv code becomes.
If introducing a new error code in libuv is no option, maybe mapping ERROR_VIRUS_INFECTED to EPERM is more appropriate than EIO.
That hasn't been discussed. @libuv/collaborators thoughts on adding UV_EVIRUS? Obviously it wouldn't map to anything on Unix. FWIW: Chromium's error mappings define a specific one for this case. I'm personally +0.25.
I think that's even more misleading, because EPERM generally means there was a permission error.
Yes, but one could argue that the virus scanner here acts like a form of file protection, does it really matter how and for what purpose the file is "protected"? More error codes I encountered: Essentially you can probably get one of an entire cluster of cloud-related error messages (roughly between 362-404) simply by trying to access documents when the user has decided to use OneDrive.
I don't really have strong feelings either way. My only concern is that embedders might need to add a bunch of checks for I think I prefer a separate API for getting the Windows specific error (as @bzoz suggested) if it's feasible. |
How do? We'd take care of the mapping, just like the other libuv errors.
Yeah, that's what I had in mind.
This would be really simple since we already have that field there. But this doesn't help current libuv users (Node being one of them), does it? If one sees this one popping up on a backtrace they'd just go and handle it, I guess. In general, this discussion has bifurcated a bit, let me summarize FTR:
I don't think anyone disagrees with the first one, it's just a matter of mapping them to whatever makes sense. As for the second one... things get a bit tricky. Generally we seem to fold things to the Unix side, but here we have nothing to fold these errors to. We can provide an API, yes, but then we wouldn't be that platform agnostic, would we? Also important: the need for the latter seems to have come up exactly once :-P
One idea might be to call |
I don't really fully agree with "some" or "easily fixable" here. There are at least 40 error messages related to OneDrive in the cluster of errors I mentioned above, some indicate temporary network errors, some configuration errors the user has to fix, some might indicate broken data. I still feel like we're losing a lot of information even when adding the mappings, I hoped that me posting all these error messages I'm getting demonstrated how useful it would be to get at the original error message. Besides: As our discussion above regarding ERROR_VIRUS_INFECTED demonstrates it's not all that easy to determine how to map errors, whether the mapping should happen based on what the error most closely represents (EIO) or what gives the user/developer the best indication of how to deal with it (EPERM). I will continue to post error codes I experienced in the wild if that is useful but for the record: No, I don't think that adding more mappings is an appropriate solution.
It depends on how you want to define platform agnostic, do you want to support only the set of functionality that all platforms support? Obviously libuv can't just be fully redesigned, it is what it is. I just want you to understand that there is a considerable cost to how low-level libuv is. Please, please just understand that Windows is not unix with CamelCase function names.
I think it would be safer to use a custom API/custom global to ensure the last error isn't overwritten by the next call to the Windows API. Especially when the user application uses libraries (STL, Boost, ...), it's practically impossible to know which call might, under the hood, trigger a call to the Windows API, and it would be an implementation detail that could change between versions. No library makes any promises on whether they affect the platform "last error" variable.
In the immortal words of The Dude: that's just, like, your opinion, man.
So? That's not an impossible to solve problem. It just requires compiling the list (which you have been doing already!) and seeing what we can map them to, as accurately as possible.
No, you haven't proved that. The operation would have still failed, and there would be nothing your application could do to fix it, would there? Yes, more descriptive error messages, that much we know.
I suggested we add
libuv supports what all platforms support, as best as possible.
Really uh? Tell me if the libuv API doesn't look more like IOCP than epoll, I dare you :-)
Or you could write a proposal and a PR.
This is just not true. Show some damned respect for the people who worked hard to make Windows a first class citizen, which it is. libuv was designed with Windows in mind from the get go: https://tinyclouds.org/iocp-links.html I'll stop here because I feel trying to help you has been a colossal waste of my time.
Nor did I claim it was. You said "I don't think anyone disagrees with the first one" - acting like you were stating a fact. I was just pointing out that I do in fact disagree.
Have fun doing that, and when you're done there are only ~16000 more error codes left unmapped.
I didn't say I proved that, I said "I still feel like".
The fact that I can tell the user "this is a problem in your AV, check the configuration" and then stop the application from auto-reporting it as a bug makes a world of difference and will, over time, save my company a lot of money in support.
No you didn't, you suggested mapping it to EIO. I suggested introducing a new code and you initially argued against it. Scroll up a bit mate.
No, it supports all platforms as best as it can. That's a difference. A lot more would be possible with a different design, but that's not really the point here.
What I posted was an example of how the low-level nature of libuv causes problems, not an application to become a libuv contributor. I don't intend to, considering how this issue has been handled less than ever.
My beef was with the external api which replicates a low-level posix api, not with the internal implementation. My problem is not with how async is implemented under the hood but with the fact that the exposed api is a pita on Windows, performance issues are one example but lack of descriptive error message is just another side of the same coin. You post a document from when libuv was introduced but libuv was spawned from node.js which was not designed with windows in mind. I did not intend to be disrespectful to the people who put work into this project and I don't think I was. You have a very curious view of how open-source development is supposed to work. Issues are "invalid" unless they come with a PR to fix them and criticism isn't acceptable because someone created the work being criticized (which I guess means criticism isn't ever acceptable)?
You are mistaken. Initially Node.js used libev on Unix and it gained Windows support later. It was seen that that wasn't very future-proof and a step back was taken to design a platform layer from the ground up (dumping libev in the process too, even though that happened a tad later) and libuv is the result of that. Abstracting things means we'll always lose something on each platform in the process, I'm not sure if that's fixable because then you'd be back to platform specific code.
I think you were disrespectful.
Not ever once have I said this is invalid. I've been trying to help you find a way forward.
So then, can you quote a few examples what functionalities a linux system "loses" (like ACLs on windows)? Performance penalties incurred on a linux system that don't occur on windows (like the aforementioned readdir/stat problem)? Feedback linux provides that gets dropped by libuv (like this issue is about)?
Again, this is mostly because of the low-level nature of libuv. In the same way: A high-level "make sure user x can read and write file y"-function could be implemented cross-platform cross-fs and do the right thing on each to fulfill the request. Yes, some limitations are probably necessary for platform independence but not to the degree libuv does it.
If I came across as disrespectful I want to apologize to anyone who thinks so. My intent was simply to expose limitations of the libuv api and explain how they are not "necessary". Again, the problem this issue is about: If libuv introduces a "get native error code" function, that doesn't make the api less "platform agnostic"; the way the api behaves differently between platforms already means a dev is going to have platform specific code anyway, this new api would just make that code far less awkward. EDIT: I already use native code for fast directory iteration and ACLs on windows because my application needs both. Now I need another native library just to identify error cases and at least this one I'd like to scrap at some point. Give devs a way to break out of the platform abstraction, there is no shame in that. If they don't need it: great - but in practice libuv is not going to be able to cover everything.
What Lines 1546 to 1584 in 1bd7cc5
See? It just uses the native API and returns the name and type.
The posix name for this is
Eh, just don't make any more API calls in the meantime. This way has worked for native Windows and Posix developers for decades. It's a bit of a pain, for sure, but it's been feasible.
Did you overlook or not understand this part of my post: "if you want to recursively read a directory you call readdir, then stat on each item, if it's a dir you recurse into that directory." ?
the libuv implementation too doesn't support acls any more than stat so what's the point?
a) I'm not making suggestions, - like at all - in regards to the api, I'm merely pointing out limitations.
No and I didn't say it was unique to Windows. I said that Windows is affected by this problem, how does that imply exclusivity?
Again, I don't see how that's relevant. Yes, Windows has a posix-alike api, that doesn't prove - in any way - that that api isn't limited or limiting.
sigh. I didn't say it's not possible, I said I would prefer a libuv-specific "errno" because then you can trust that nothing but a call to libuv will overwrite it.
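A library-specific "errno" of the kind asked for above might look like the following minimal sketch. The names `remember_system_error` and `get_last_system_error` are hypothetical and do not exist in libuv; the sketch assumes a C11 compiler with `_Thread_local` support.

```c
#include <stdint.h>

/* Hypothetical library-private "last system error" slot. _Thread_local
 * (C11) gives each thread its own copy, so only code that writes this
 * variable on the same thread can overwrite it; calls into the platform
 * API by other libraries leave it untouched. */
static _Thread_local uint32_t last_system_error = 0;

/* Would be called by the library where it translates a system error. */
static void remember_system_error(uint32_t sys_err) {
  last_system_error = sys_err;
}

/* Would be called by the application right after a failed library call. */
static uint32_t get_last_system_error(void) {
  return last_system_error;
}
```

A per-thread slot only helps when the failing call and the lookup happen on the same thread. For requests that complete asynchronously, the code would probably have to be stored alongside the request instead, which is the matching problem described earlier in this thread.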
ERROR_TIMEOUT (1460) in file write operation.
"The cloud operation cannot be performed on a file with incompatible hardlinks." (396)
"A device which does not exist was specified." (433) in file read operation.
ERROR_FILE_OFFLINE (4350) may happen when accessing a file on a network-share. Guess ENOENT?
ERROR_CANT_ACCESS_FILE (1920) - The file cannot be accessed by the system. Seems to be pretty broad, google results seem to indicate this might either be a filesystem corruption (EIO) but in the context of starting services at least it seems to also possibly indicate permission problems (EACCES)
ERROR_UNEXP_NET_ERR (59) - An unexpected network error occurred.
ERROR_REPARSE_TAG_MISMATCH (4394) - There is a mismatch between the tag specified in the request and the tag present in the reparse point. Happened trying to remove a file - potentially a hard-link from a OneDrive directory.
I hope you are not gonna post every M$ error code. 🙃 What about removing error code mapping and just return the original value? Mapping every error code for Linux/Windows/MacOS/Zircon(Fuchsia) will take forever. Have in mind that this library may work on other systems in the future. Otherwise, |
Removing the mapping would break backward compatibility. For most of the fs stuff, the original error is stored in |
As long as they don't collide with our own mappings, we could also return a negated error code, as on Unix, maybe?
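As a rough sketch of the negated-code idea above: known system codes map to a negated errno value, and anything unmapped is returned as the negated system code instead of a generic "unknown" value. `translate_or_negate` is a hypothetical function, not libuv's `uv_translate_sys_error`, the two table entries are only the examples discussed in this issue, and the collision check mentioned above is deliberately left out.

```c
#include <errno.h>

/* Hypothetical translation step. Numeric constants are the Windows system
 * error codes named in this issue; the EBUSY mapping is the one suggested
 * by the reporter, not necessarily what libuv does. */
static int translate_or_negate(unsigned long sys_err) {
  switch (sys_err) {
    case 0:    return 0;        /* no error */
    case 1224: return -EBUSY;   /* ERROR_USER_MAPPED_FILE */
    case 1117: return -EIO;     /* ERROR_IO_DEVICE */
    default:   return -(int) sys_err;  /* unmapped: keep the original code */
  }
}
```

Without a collision check, a small unmapped system code can land on the same number as a mapped errno value, so a real implementation would need a reserved range or an explicit test.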
SGTM
ERROR_VOLUME_DIRTY (6851) - The operation could not be completed because the volume is dirty. Please run chkdsk and try again. Happened writing to a file, probably overwriting an existing file.
ERROR_DISK_OPERATION_FAILED (1127) - While accessing the hard disk, a disk operation failed even after retries. Happened trying to rename a file.
Sorry I didn't reply before, didn't see this message. What I'm currently doing, using my own native lib to dig up the original code, is a massive improvement in how well I can support user problems and deal with these issues, but the way I had to implement it, it's a terrible hack that I'd reeeeeally like to replace.
Not quite. You're welcome to open pull requests mapping error codes that you need / run into.
Yes, sorry, I just took that to mean that that is the preferred way to deal with the problem.
Exposes the original system error of the filesystem syscalls. Adds a new uv_fs_get_system_error which returns original errno on Linux or GetLastError on Windows. Ref: libuv#2348
Exposes the original system error of the filesystem syscalls. Adds a new uv_fs_get_system_error which returns original errno on Linux or GetLastError on Windows. Ref: #2348 PR-URL: #2810 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
I'm using libuv through node.js so sorry for not knowing the exact version.
One of my pet peeves with node.js/libuv is how many errors on windows are mapped to "UNKNOWN" with all details about the error being lost completely, which makes debugging user reports a nightmare.
Concrete example:
require('child_process').spawn(<something that isn't actually an executable, like a text file>)
calls uv_spawn under the hood (not sure about the parameters, sry) which in turn calls CreateProcessW.
The error produced by windows is code 193 - "%1 is not a valid Win32 application" - which is a fairly useful error message.
uv_spawn seems to return UNKNOWN - "unknown error" which isn't.
Or: accessing any file on disk can trigger 1224 - "The requested operation cannot be performed on a file with a user-mapped section open." due to virus scanners locking files. It should probably map to EBUSY so user code can retry the call but instead it maps, again, to UNKNOWN.
Mapping the above cases to something more reasonable is probably easy enough, but they are only examples that we found more by accident than anything else - we currently have dozens of open error reports with code "UNKNOWN" where the users have no idea on how to reproduce and we have no idea what the trigger may be.
It would help tremendously if there was some way to preserve the original error code, even if it was just added into the error message. With something like code: UNKNOWN, message: "unknown error (code 1224)" the error message would be infinitely more useful.
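The message format suggested above is cheap to produce. A minimal sketch, assuming a hypothetical helper `format_unknown_error` (not a libuv or Node.js function) that is handed the original system code:

```c
#include <stdio.h>

/* Hypothetical helper: writes "unknown error (code N)" into buf so the
 * original system code survives in the message. Returns what snprintf
 * returns, i.e. the length the full message needs (excluding the NUL). */
static int format_unknown_error(char* buf, size_t len, unsigned long sys_err) {
  return snprintf(buf, len, "unknown error (code %lu)", sys_err);
}
```

The generic code stays UNKNOWN for programmatic handling, so nothing that switches on the code changes. Only the human-readable text gains the detail.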
Considering windows has an absurd number of different error codes, I don't have a lot of hope that all of them will eventually be mapped to sensible "generic" codes, I think a goal should be to make the unknown error case more useful.