fs.Stats.ino always returns 0 on Windows #2670
Comments
Just wanted to add that this totally breaks 'findit', and as a consequence, every library that depends on findit is badly broken. That includes various test frameworks (e.g. jasmine-node), archivers, and build systems. This is a very big deal. Essentially, node.js on Windows is hosed due to this. It is worth noting that this breakage is not very obvious. For example, jasmine-node would only execute a single spec file, ignoring the rest. |
As a workaround for findit being broken, walker solves this by skipping the inode check if the reported inode is 0. More generally, this, plus the fact that fs.watchFile throws an exception on Windows, means that many significant libraries are completely broken on Windows, and node is unusable for my use case without hacking libraries. |
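A minimal sketch of the kind of guard described above, for readers hitting the same problem. This is not walker's actual code; `createVisitedTracker` and `alreadyVisited` are hypothetical names, and the sketch assumes that an `ino` of 0 means "the platform gave us nothing usable":

```javascript
// Hypothetical helper: remembers directories by (dev, ino) so a recursive
// walk can detect cycles, but skips the check when ino is 0.
function createVisitedTracker() {
  const seen = new Set();
  return function alreadyVisited(stats) {
    // An ino of 0 carries no identity information (the Windows case in this
    // issue), so never report such an entry as a duplicate.
    if (!stats.ino) return false;
    const key = `${stats.dev}:${stats.ino}`;
    if (seen.has(key)) return true;
    seen.add(key);
    return false;
  };
}

// Usage with plain objects standing in for fs.Stats:
const alreadyVisited = createVisitedTracker();
console.log(alreadyVisited({ dev: 1, ino: 0 }));  // false
console.log(alreadyVisited({ dev: 1, ino: 0 }));  // false (check skipped)
console.log(alreadyVisited({ dev: 1, ino: 42 })); // false (first visit)
console.log(alreadyVisited({ dev: 1, ino: 42 })); // true  (seen before)
```

The trade-off is that cycle detection is silently disabled wherever `ino` is 0, so symlink loops would have to be caught some other way.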
Is there any news on this? I think this is a big problem for node on Windows. edit: I tested with both v0.6.15 and v0.7.8, the issue remains. |
Node pretty much just blindly takes what libuv gives it here. Moved to joyent/libuv#613 |
@tjfontaine suggested that we could change the type of |
Any news on that? |
@piscisaureus ... what's the status on this one? |
The The issue that remains is that the
|
Ok. Changing
|
Yes, indeed.
That may be better indeed. However, as we're at it, let's consider that the only thing that makes |
Mostly 64-bit. However, it's 128-bit for Windows Server 2012 ReFS. For the 64-bit part of the problem (the majority of Windows file systems), a 32-bit number type is still useful, since Windows returns the file index as a 32-bit high and a 32-bit low identifier. When the OS is Windows, 'ino' could contain the lower index, and a separate ino_high or something like that could hold the higher end. ReFS is another story and requires additional attention, i.e. the device number is also 64-bit. |
@piscisaureus The @obastemur my key concern with splitting the high and low identifiers like that is basic usability. How would you expect it to be used from within node that would not be covered by the concatenated |
@obastemur Do you have any background on that? I use the https://github.com/libuv/libuv/blob/a6fa3ca99a379912167c45b78a8d03ad23fa6d33/src/win/fs.c#L1087 |
@jasnell The original problem here is that node's fs.stat ino currently returns 0 on Windows. @piscisaureus offered uid because a) the index number is 64-bit on Windows, and b) node eventually concatenates dev and ino to use it internally. However, imagine an application or native module that simply uses the Stats object from fs, assuming it will find the relevant information there. Right now ino is 0; when we add uid, ino will still be 0. Besides, we would be introducing a new string parameter into Stats, which is far more expensive than a DWORD / uint32_t etc. On Windows we could combine dev + highIndex + lowIndex to get whatever we need, only when it's needed. lowerIndex alone would still serve the purpose that ino is no longer 0 (as long as the file system is not ReFS). I would also like to add that on Windows, lowerIndex serves as the file reference index, while the higher index is a sequence number that changes only when, e.g., a new file is created. On the NTFS file system this information is only reliable during a process lifetime (although I didn't see the values changing, it's quite possible that the file system reuses an index, for example on a networked file system), so we also shouldn't encourage people to store a string value we created that may no longer be relevant when the process is restarted later. |
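To make the "combine dev + highIndex + lowIndex only when it's needed" idea concrete, here is a rough sketch. The `highIndex` and `lowIndex` inputs are hypothetical (no such fields exist on `fs.Stats`), the function names are made up for illustration, and `BigInt` is a language feature that did not exist when this thread was written:

```javascript
// Hypothetical: join the two 32-bit halves of a Windows file index into one
// exact 64-bit value. `>>> 0` forces each half to an unsigned 32-bit integer.
function fileIndexToBigInt(highIndex, lowIndex) {
  return (BigInt(highIndex >>> 0) << 32n) | BigInt(lowIndex >>> 0);
}

// Hypothetical: build a string identity key on demand, instead of storing a
// string on every Stats object.
function fileIdentityKey(dev, highIndex, lowIndex) {
  return `${dev}:${fileIndexToBigInt(highIndex, lowIndex).toString(16)}`;
}

console.log(fileIndexToBigInt(1, 2));  // 4294967298n
console.log(fileIdentityKey(3, 1, 2)); // "3:100000002"
```

Keeping the halves as plain numbers and deriving the key lazily avoids the per-stat string cost raised above. As noted in the comment, such a key should not be assumed stable across process restarts.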
Thanks for the link; I hadn't noticed that msft had mutilated the API in Windows 2012 (changing an API whose sole purpose is to return a unique identifier, and then making it return a non-unique identifier instead, is just plain stupid).
Most of what you're saying seems reasonable, but I don't really understand what you're trying to suggest we should do. The problem we have is 1) we need a solution that allows users to detect file equality on windows. 2) we don't want to break the existing api. My suggestion is to add another API, using a string, to allow for 1). The numeric |
I don't think we are the only people trying to solve the inode puzzle for NTFS. My concerns are:
Instead of turning an operating system value into something only we understand, we could store it as is, so that developers can find something that makes sense (e.g. look up on MSDN what these numbers actually mean). |
It's not 0 right now; it gets truncated (albeit in a different way than you suggest; you're suggesting chopping off the topmost 32 bits, whereas currently the ino is cast to a double).
Is that not a concern on unix? I don't think linux has a way to provide a temporally constant ino on network filesystem either.
That's a reasonable concern, although in reality we don't know how big the performance impact would be. I think it would make less than 10% difference. A much bigger impact is to be expected from reading the 128-bit index number, since that requires an additional syscall. We already fill out the For performance reasons we might want to support a "lite" version of fs.stat which only reads the most commonly used fields.
No! This is very much the opposite of how I have approached Windows support in node. I've always tried to find "common denominator" APIs so people could (as much as possible) assume that node would behave the same on all platforms. I really don't want to make people look up what the semantic differences between 'ino' and 'indexNumber' are on Windows, only to find out that there really aren't any but we couldn't fiddle the second value into the first field. Instead, try to be a bit more imaginative.
|
BTW, I may sound a little aggressive, but I'm really happy that you're questioning the way libuv abstracts these things. In the past 4 years nobody has really taken an interest in it, and I don't want to do this on my own forever. But let's align ourselves a little bit on goals, so here's my perspective: Node (and by proxy, libuv) should first and foremost try to "plaster over" API differences between platforms, especially if the underlying feature is the same. E.g. TCP is really the same on Windows and unix, so there's absolutely no excuse for the APIs to be platform-specific. Sometimes there are conceptual differences between APIs. In that case I try to set up the APIs such that, when used for the intended use case, the behavior will be similar. So for example:
etc. Sometimes the conceptual differences are too big. I haven't been able to meaningfully support "user groups" and "user ids" on windows. This affects e.g. The way There is no way to read/modify file attributes on windows (e.g. hidden, readonly). (*) Currently not supported |
Last time I checked, it was 0 for node 0.10.x. For node 0.12.x (assuming it takes the higher index), ino represents a sequence number, which is mostly useless alone. However, lowerIndex represents the file reference index, which is most likely unique per app instance. My suggestion is that they could both hold lowerIndex etc.
No, it's not. BTW, which format are you referring to? The Windows file index number is a dynamic value that you shouldn't rely on. However, on ext2, ext3 etc. it's a static identifier.
For a regular server, I wouldn't expect much of a performance issue, since a system call would cost much more. However, smaller devices, especially the ones we are working on with JXcore, would suffer from it.
If this statement refers to supporting ReFS, I think we need much more than that and yet supporting it would require a small api break anyways.
Why bother, instead of fixing what ino gives and sharing the higher index through another property? I doubt there is any guarantee that a unix distro would always return a 32-bit value for the inode. It all depends on the kernel and file system setup.
I'm definitely not an NTFS expert, but I wouldn't expect a dynamic index to serve this purpose. I don't see many similarities between the unix inode and the Windows fileIndex apart from the unique id aspect. In the details they behave differently. I agree with you that uid is the simplest option (though it may not be the reliable one), yet I have other concerns. We are trying to keep node compatibility with JXcore while trying to figure out how to reduce the footprint. I would really appreciate it if we used as few strings as possible in the critical parts of the library. I just wanted to share an option that doesn't need a string. Besides, given my experience with Windows 10 ARM (IoT), I have a hard time believing that the dynamic file index is something reliable. Maybe it's just a preview edition problem, but who knows.
No hard feelings :) We are just discussing to do the best. |
It takes both the higher and lower index as a single 64-bit value. This gets cast to a double, which is where information is lost. |
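The information loss from the double cast can be reproduced in plain JavaScript. This is only an illustration of the arithmetic, not of libuv's actual code path; the `high`/`low` values are made-up inputs and `castToDouble` is a hypothetical helper. It relies on `BigInt`, which postdates this thread:

```javascript
// Build the exact 64-bit value from two 32-bit halves, then convert it to a
// double the way a JavaScript number would hold it.
function castToDouble(high, low) {
  const exact = (BigInt(high >>> 0) << 32n) | BigInt(low >>> 0);
  return { exact, asDouble: Number(exact) };
}

// Two different 64-bit indexes: 2^53 + 1 and 2^53.
const a = castToDouble(0x00200000, 0x00000001);
const b = castToDouble(0x00200000, 0x00000000);

console.log(a.exact === b.exact);       // false: distinct as 64-bit integers
console.log(a.asDouble === b.asDouble); // true: indistinguishable as doubles
```

A double has a 53-bit significand, so any index with significant bits set above bit 53 can collide with a neighbour once it is stored as a number.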
I've run some tests on a 2+ year old Windows 8 installation (real HDD), in other words one with a large enough number of files and their historical duplicates. I also tested on a similarly old VMware Fusion virtual machine (Server 2012). Here is the test app
This test is of no use on current node 0.10.x, since there is no ino or dev support on Windows. It also fails very quickly with the latest node.js 0.12.3. However, jxcore/jxcore@967ff3f fixes the problem on the jxcore side. The fix is based on the fact that the st_ino, st_uid, st_gid, and st_rdev memory blocks are not used by node at all (on Windows). So the solution uses the (3 x short + 1 x uint) empty space for 2 x DWORDs. What has changed:
Some results from the tests;
Wasn't really surprised;
Problems;
The proposed solution doesn't break the current API or existing solutions (no memory alignment problems etc.; it uses the same memory structure, with no change in the actual types). This was a solution for node 0.10.x. Indeed, node 0.12.x uses a different libuv, WinAPI etc. If this approach (sequenceId or something like that) is good to go, I can help with the rest. |
Is this fixed now? I can see a value in Is this value genuinely unique within a filesystem? |
I don't think so. I can find two files that look like this:
Windows 7 |
fs.Stats.ino always returns 0 on Windows. Libraries that depend on it work incorrectly on Windows, for example "findit".
"findit" has an open issue related to this: https://github.com/substack/node-findit/issues/5. @svelez mentions that the fileID from FILE_ID_BOTH_DIR_INFO could be used for fs.Stats.ino.
More about FILE_ID_BOTH_DIR_INFO: http://msdn.microsoft.com/en-us/library/windows/desktop/aa364226%28v=vs.85%29.aspx