Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: os: add ReparsePointTag #70735

Open
qmuntal opened this issue Dec 9, 2024 · 15 comments
Open

proposal: os: add ReparsePointTag #70735

qmuntal opened this issue Dec 9, 2024 · 15 comments

Comments

@qmuntal
Copy link
Member

qmuntal commented Dec 9, 2024

Proposal Details

Windows applications might need to know the reparse point tag of a given os.FileInfo. Although the Windows implementation of os.FileInfo knows the reparse point tag, there is no way to get that information using the public os API. This means that applications are forced to partially reimplement os.Lstat by opening the file and calling GetFileInformationByHandleEx.

I propose to add the following helper function to the os package:

// ReparsePointTag returns the the reparse point tag associated with fi.
//
// ReparsePointTag only applies to results returned by this package's [LStat].
// fi must be associated with a reparse point.
// Otherwise, ReparsePointTag returns false.
func ReparsePointTag(fi FileInfo) (uint32, bool)

The ReparsePointTag implementation could be made more os-agnostic by also defining an interface that external os.FileInfo implementations could implement to get the reparse point tag. This might be an overkill for now, as reparse points are not a broadly-addopted concept outside NTFS file systems.

ReparsePointTag would only be useful on Windows for now, although it could potentially be implemented on other OSes supporting NTFS drivers.

@golang/windows

@earthboundkid
Copy link
Contributor

This is a Windows specific concept and does not belong in the os package. It could go in a syscall package however.

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Dec 9, 2024
@ianlancetaylor
Copy link
Member

It seems hard to put this proposed function in any package other than the os package, because it would have to interrogate the unexported Windows-specific fileStat.

That said I also wonder whether there is some way to handle this general idea outside the os package.

@mateusz834
Copy link
Member

mateusz834 commented Dec 9, 2024

It seems hard to put this proposed function in any package other than the os package, because it would have to interrogate the unexported Windows-specific fileStat.

The unexported struct can be moved to an internal/ package. (and it can be exported then)

@qmuntal
Copy link
Member Author

qmuntal commented Dec 9, 2024

The unexported struct can be moved to an internal/ package. (and it can be exported then)

Which would be de input of the syscall function? syscall can't depend on os, so it can't accept a os.FileInfo.

@mateusz834
Copy link
Member

mateusz834 commented Dec 9, 2024

@qmuntal I have not checked it, but what about fs.FileInfo? os.FileInfo is an alias now.

EDIT: io/fs imports time which imports syscall, so not possible.

@earthboundkid
Copy link
Contributor

earthboundkid commented Dec 9, 2024

The unexported struct can be moved to an internal/ package. (and it can be exported then)

Which would be de input of the syscall function? syscall can't depend on os, so it can't accept a os.FileInfo.

I don't understand how these work or what they actually do, but surely it comes back to some FD or Windows equivalent which you can pull out of FileInfo.Sys()?

@apparentlymart
Copy link

apparentlymart commented Dec 9, 2024

I think the point here is that the requested information has already been retrieved from the OS by the time the caller is holding the os.FileInfo object:

func newFileStatFromGetFileInformationByHandle(path string, h syscall.Handle) (fs *fileStat, err error) {
var d syscall.ByHandleFileInformation
err = syscall.GetFileInformationByHandle(h, &d)
if err != nil {
return nil, &PathError{Op: "GetFileInformationByHandle", Path: path, Err: err}
}
var reparseTag uint32
if d.FileAttributes&syscall.FILE_ATTRIBUTE_REPARSE_POINT != 0 {
var ti windows.FILE_ATTRIBUTE_TAG_INFO
err = windows.GetFileInformationByHandleEx(h, windows.FileAttributeTagInfo, (*byte)(unsafe.Pointer(&ti)), uint32(unsafe.Sizeof(ti)))
if err != nil {
return nil, &PathError{Op: "GetFileInformationByHandleEx", Path: path, Err: err}
}
reparseTag = ti.ReparseTag
}
return &fileStat{
name: filepathlite.Base(path),
FileAttributes: d.FileAttributes,
CreationTime: d.CreationTime,
LastAccessTime: d.LastAccessTime,
LastWriteTime: d.LastWriteTime,
FileSizeHigh: d.FileSizeHigh,
FileSizeLow: d.FileSizeLow,
vol: d.VolumeSerialNumber,
idxhi: d.FileIndexHigh,
idxlo: d.FileIndexLow,
ReparseTag: reparseTag,
// fileStat.path is used by os.SameFile to decide if it needs
// to fetch vol, idxhi and idxlo. But these are already set,
// so set fileStat.path to "" to prevent os.SameFile doing it again.
}, nil
}

Note that ReparseTag is stored as a field of this object, but it's returned as an os.FileInfo interface value and since the type is unexported there is no way to type-assert from outside the package to get at it.

Edit: Whoops... submitted to early. 🙄

It seems like this could therefore be exported as a new field of the syscall.Win32FileAttributeData returned by Sys as @earthboundkid suggested:

go/src/os/types_windows.go

Lines 276 to 285 in 6705ac6

func (fs *fileStat) Sys() any {
return &syscall.Win32FileAttributeData{
FileAttributes: fs.FileAttributes,
CreationTime: fs.CreationTime,
LastAccessTime: fs.LastAccessTime,
LastWriteTime: fs.LastWriteTime,
FileSizeHigh: fs.FileSizeHigh,
FileSizeLow: fs.FileSizeLow,
}
}

@qmuntal
Copy link
Member Author

qmuntal commented Dec 9, 2024

It seems like this could therefore be exported as a new field of the syscall.Win32FileAttributeData returned by Sys as @earthboundkid suggested:

That was also my initial thought, but I worry that we will confuse users that legitimately uses syscall.Win32FileAttributeData as an argument to syscall.GetFileAttributesEx (which was the original usage of that type). They will see that there are some properties passed to the Win32 call which are never populated. There are a few projects using it in this way: https://github.com/search?q=%22var+fa+syscall.Win32FileAttributeData%22+language%3AGo+NOT+is%3Afork&type=code.

But maybe it's fine given that these projects should be using golang.org/x/sys.Win32FileAttributeData instead.

@ianlancetaylor
Copy link
Member

Is syscall.Win32FileAttributeData a standard Windows type, or is it something we invented for Go?

@qmuntal
Copy link
Member Author

qmuntal commented Dec 9, 2024

@ianlancetaylor
Copy link
Member

Thanks, unfortunately in that case it seems weird to add a field.

That said, perhaps for a reparse point only we could have the Sys method return a different type. Would that be likely to break existing code?

@apparentlymart
Copy link

While I acknowledge that is isn't really clear how to get there without breaking backward compatibility now, it seems like this Sys method would've been better specified as something which always returns an unexported type that implements zero or more exported interfaces, so that it could be used as a general extension point for various kinds of platform-specific data that can expand over time as new needs emerge.

Then one of those interfaces could've been an extra indirection over getting this Win32FileAttributeData object, and another could separately return reparse point information.

I'm struggling to think of a way to get there from here without breaking code that relies on the current concrete type. Adding new unexported members to it and then having it implement extra interfaces in terms of those unexported members could potentially work, but it'd be weird to do that to a type that is intended to correspond directly to a Win32 API type. 😖

@TBBle
Copy link

TBBle commented Dec 12, 2024

Just spitballing, but what about a new API that returns platform-specific data about any os.ModeIrregular file?

On Windows, that would be the reparsepoint data, e.g. something that could be passed to DecodeReparsePoint or DecodeReparsePointData or one's own custom decoder as needed.

A quick look suggests that the only other in-tree user is Solaris (#52259) and I have no idea what useful data could be provided in those three cases. os.ModeIrregular itself was introduced for the benefit of the tar package (#22903) but that work was never actually completed, so it's not used at all on POSIX AFAICT.

@qmuntal
Copy link
Member Author

qmuntal commented Jan 10, 2025

That said, perhaps for a reparse point only we could have the Sys method return a different type. Would that be likely to break existing code?

It will be a breaking change, unfortunately.

On Windows, that would be the reparsepoint data, e.g. something that could be passed to DecodeReparsePoint or DecodeReparsePointData or one's own custom decoder as needed.

If we did so, os.Lstat would have to retrieve more information from the file handle than what it does today, and that would incur in a performance penalty. Reparse point encoding is a niche task, I don't think it has it's place in the public API of the standard library.

A quick look suggests that the only other in-tree user is Solaris (#52259) and I have no idea what useful data could be provided in those three cases. os.ModeIrregular itself was introduced for the benefit of the tar package (#22903) but that work was never actually completed, so it's not used at all on POSIX AFAICT.

Thanks for the context!

@qmuntal
Copy link
Member Author

qmuntal commented Jan 10, 2025

Thanks for the discussion. There are two path forwards right now that doesn't involve breaking changes:

  1. Implement func ReparsePointTag(fi FileInfo) (uint32, bool).
    • Pros: Doesn't need to modify the meaning of syscall.Win32FileAttributeData.
    • Cons: Pollute the os API with a Windows-only function (would be no-op on other platforms).
  2. Declare that syscall.Win32FileAttributeData is no longer tied to the Win32 struct WIN32_FILE_ATTRIBUTE_DATA, and extend it as with the ReparseTag property.
    • Pros: Doesn't pollute the os API and opens the path to add more Windows-specific attributes to syscall.Win32FileAttributeData.
    • Cons: Users might expect syscall.GetFileAttributesEx to fill the new syscall.Win32FileAttributeData, but it won't. This drawback is mitigated by the fact that users should be using golang.org/x/sys/windows.Win32FileAttributeData instead, and we could document the possible confusion in the syscall package.

I'm leaning towards option 2. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Incoming
Development

No branches or pull requests

7 participants