Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy-file misuse may lead to information disclosure vulnerabilities #4511

Open
winny- opened this issue Dec 3, 2022 · 11 comments
Open

copy-file misuse may lead to information disclosure vulnerabilities #4511

winny- opened this issue Dec 3, 2022 · 11 comments

Comments

@winny-
Copy link
Contributor

winny- commented Dec 3, 2022

What version of Racket are you using?
8.5 cs

What program did you run?

winston@icarus ~ $ mkdir /tmp/perm-experiment
winston@icarus ~ $ cd /tmp/perm-experiment/
winston@icarus /tmp/perm-experiment $ touch orig
winston@icarus /tmp/perm-experiment $ umask 077
winston@icarus /tmp/perm-experiment $ cp orig copied_with_cp
winston@icarus /tmp/perm-experiment $ cp -p orig copied_with_cp-p
winston@icarus /tmp/perm-experiment $ racket -e '(copy-file "orig" "copied_with_racket_copy-file")'
winston@icarus /tmp/perm-experiment $ ls -l
total 0
-rw------- 1 winston users 0 Dec 15 10:30 copied_with_cp
-rw-r--r-- 1 winston users 0 Dec 15 10:30 copied_with_cp-p
-rw-r--r-- 1 winston users 0 Dec 15 10:30 copied_with_racket_copy-file
-rw-r--r-- 1 winston users 0 Dec 15 10:30 orig

Another way to reproduce - set up POSIX ACL for default permissions on a directory. copy-file defeats this mechanism.

winston@icarus ~ $ mkdir /tmp/b
winston@icarus ~ $ setfacl -d -m u::rwX,g::rwX,o::- /tmp/b
winston@icarus ~ $ cd /tmp/b
winston@icarus /tmp/b $ ls -la
total 0
drwxr-xr-x+ 1 winston users     0 Dec  3 15:52 .
drwxrwxrwt  1 root    root  10006 Dec  3 15:52 ..
winston@icarus /tmp/b $ touch orig
winston@icarus /tmp/b $ ls -la
total 0
drwxr-xr-x+ 1 winston users     8 Dec  3 15:54 .
drwxrwxrwt  1 root    root  10006 Dec  3 15:52 ..
-rw-rw----  1 winston users     0 Dec  3 15:54 orig
winston@icarus /tmp/b $ chmod o+rw orig
winston@icarus /tmp/b $ ls -l
total 0
-rw-rw-rw- 1 winston users 0 Dec  3 15:54 orig
winston@icarus /tmp/b $ cp orig cp
winston@icarus /tmp/b $ racket -e '(copy-file "orig" "racket")'
winston@icarus /tmp/b $ ls -l
total 0
-rw-rw---- 1 winston users 0 Dec  3 15:54 cp
-rw-rw-rw- 1 winston users 0 Dec  3 15:54 orig
-rw-rw-rw- 1 winston users 0 Dec  3 15:54 racket

What should have happened?

  • respect umask semantics and POSIX ACLs

If you got an error message, please include it here.

Please include any other relevant details
e.g., the operating system used or how you are running the code.

NixOS 22.05. Probably reproducible on any unix like.

https://discord.com/channels/571040468092321801/618895179343986688/1048579606430617722

@mflatt
Copy link
Member

mflatt commented Dec 3, 2022

The copy-file function is intended to be like cp -p. More generally, it's meant to provide access to file-copying behavior when more directly supported by the OS, like CopyFileW on Windows (and similar resource-fork-preserving functionality on Mac OS back in the days before OS X).

I can imagine adding a flag to copy-file, but maybe a new function in racket/file would be better, since it can be implemented relatively easily with open-input-file and open-output-file. Having a separate function that is cross-referenced from copy-file might help make the current behavior more clear.

@winny-
Copy link
Contributor Author

winny- commented Dec 4, 2022

My goal is to protect users from surprises due to not understanding this unique diversion from standard unix permission semantics. I'd encourage updating the copy-file api itself to do the "safe thing".

We could take inspiration from Python and include a copy-mode procedure that could be used after copy-file to restore this old functionality where truly desired. (I am at a loss for where it might cause breakage).

@mfelleisen
Copy link
Collaborator

What about adding a function copy-file/unix?

@LiberalArtist
Copy link
Contributor

LiberalArtist commented Dec 5, 2022

I haven't paged it all back in yet, but there was some discussion about permissions with copy-file in #4126, https://racket.discourse.group/t/umask-set-default-file-permissions/613/6, #3870 (comment), and #3870 (comment).

The copy-file function is intended to be like cp -p. More generally, it's meant to provide access to file-copying behavior when more directly supported by the OS, like CopyFileW on Windows (and similar resource-fork-preserving functionality on Mac OS back in the days before OS X).

This part reminds me that I've sometimes been interested in adding support in rktio for more recent OS copying interfaces like clonefile or fcopyfile on Mac OS (and FreeBSD?) and copy_file_range or sendfile on Linux/Glibc (and maybe some BSDs?). These APIs avoid moving file data from kernel space to user space and back and, depending on the file system, can use copy-on-write/"reflink" support. I think @Bogdanp was interested in this for the web server at one point, too. (At least copy_file_range is more general and might be useful in copy-port, too.)

@mflatt
Copy link
Member

mflatt commented Dec 6, 2022

After looking a various file APIs and thinking about the trade-offs of new functions versus new arguments, here's a proposal to update the copy-file API:

 (copy-file src dest 
            [exists-ok?/legacy #f] ; but error if `#:exists-ok?` is also provided
            #:exists-ok? [exist-ok? exists-ok?/legacy]
            #:permissions [permissions 'same] ; or an integer
            #:umask? [umask? #f]) ; Ignored on Windows

I think it might be better if umask? defaulted to #t, but I'm really not sure. In any case, a change is bound to break some existing script (e.g., a typical #o022 umask is in place when a file is copied that needs to stay group-writable, for whatever reason), so our policy would be to leave behavior as-is.

mflatt added a commit to mflatt/racket that referenced this issue Dec 10, 2022
Support a `#:permissions` optional argument as well as one that
control the interaction with a umask when the destination file is
created.

The optional by-position `exists-ok?` argument is still supported, but
a new `#:exists-ok?` keyword argument is also available; a contract
exception is raised if both the by-position and keyword variants are
supplied.

Related to racket#4511
mflatt added a commit that referenced this issue Dec 10, 2022
Support a `#:permissions` optional argument as well as one that
control the interaction with a umask when the destination file is
created.

The optional by-position `exists-ok?` argument is still supported, but
a new `#:exists-ok?` keyword argument is also available; a contract
exception is raised if both the by-position and keyword variants are
supplied.

Related to #4511
@LiberalArtist
Copy link
Contributor

After looking a various file APIs and thinking about the trade-offs of new functions versus new arguments, here's a proposal to update the copy-file API:

 (copy-file src dest 
            [exists-ok?/legacy #f] ; but error if `#:exists-ok?` is also provided
            #:exists-ok? [exist-ok? exists-ok?/legacy]
            #:permissions [permissions 'same] ; or an integer
            #:umask? [umask? #f]) ; Ignored on Windows

I think it might be better if umask? defaulted to #t, but I'm really not sure. In any case, a change is bound to break some existing script (e.g., a typical #o022 umask is in place when a file is copied that needs to stay group-writable, for whatever reason), so our policy would be to leave behavior as-is.

Do I recall correctly that all of the other file-creation functions ultimately go through open-output-file? It specifies that:

When the file specified by path is created, permissions specifies the permissions of the created file, where an integer representation of permissions is treated the same as for file-or-directory-permissions. On Unix and Mac OS, these permissions bits are combined with the process’s umask.

Does "combined" here mean bitwise-ior?

Should we have an option for "use exactly these permissions, ignoring the umask"? (If so, maybe it's almost worth considering packing #:permissions, #:exists-ok?, and #:umask? into a value that could be passed around more conveniently.)

We could take inspiration from Python and include a copy-mode procedure that could be used after copy-file to restore this old functionality where truly desired. (I am at a loss for where it might cause breakage).

AFAICT, Python's shutil.copy in fact does copy the permissions of the source file to the destination file, just as Racket's copy-file does, and shutil.copy2 seems to go even further and copy all of the source file's metadata.

Overall, I guess I don't have very strong expectations about how umask will be treated outside of a few narrow situations. (Insert requisite grumbling about pervasive, implicit, mutable state here.)

@mflatt
Copy link
Member

mflatt commented Dec 13, 2022

The interface I ended up pushing so far is

 (copy-file src
            dest
            [exists-ok?/pos #f]
            [#:exists-ok? exists-ok? exists-ok?/pos] ; but error if both supplied
            [#:permissions permissions #f] ; #f or integer
            [#:replace-permissions? replace-permissions? #t])

It's #:replace-permissions? instead of #:umask? because there's also a question of what to do when the file already exists, so #:replace-permissions? forces permissions whether they would ptherwise be affected by umask or be the existing file's permissions.

Do I recall correctly that all of the other file-creation functions ultimately go through open-output-file?

That sounds right. I don't remember another way to create a file via Racket primitives.

Does "combined" here mean bitwise-ior?

The actual combination is bitwise-and of bitwise-not of a umask. The Racket documentation doesn't try to explain a umask, treating it as something defined by the OS, just like the concept of "file". It's never clear where to draw that line, though.

Should we have an option for "use exactly these permissions, ignoring the umask"?

That ends up being the meaning of supplying #:permisssions. Separating #:permissions and #:replace-permissions? provides more flexibility and, I think, better reflects the underlying operations.

If so, maybe it's almost worth considering packing #:permissions, #:exists-ok?, and #:umask? into a value that could be passed around more conveniently.

This question could apply to any Racket function that accepts multiple keyword arguments instead of a single option-specifying object. There are trade-offs either way, but our convention is generally multiple keyword arguments, and I'd prefer to stick with that here.

Zuo doesn't have keyword arguments and generally uses a hash-table argument for options. Along the same lines as the Racket copy-file changes, the cp function in Zuo now accepts an options table. That convention works ok, and it occasionally works a little better, but it's usually a toss-up. Also, I end up putting pre-packaged option sets like :no-replace-mode in the standard library to get back a similar level of convenience to keyword arguments.

@LiberalArtist
Copy link
Contributor

LiberalArtist commented Dec 14, 2022

The interface I ended up pushing so far is

 (copy-file src
            dest
            [exists-ok?/pos #f]
            [#:exists-ok? exists-ok? exists-ok?/pos] ; but error if both supplied
            [#:permissions permissions #f] ; #f or integer
            [#:replace-permissions? replace-permissions? #t])

It's #:replace-permissions? instead of #:umask? because there's also a question of what to do when the file already exists, so #:replace-permissions? forces permissions whether they would ptherwise be affected by umask or be the existing file's permissions.

I'd missed that change in the commit: I like #:replace-permissions? much better!

Does "combined" here mean bitwise-ior?

The actual combination is bitwise-and of bitwise-not of a umask. The Racket documentation doesn't try to explain a umask, treating it as something defined by the OS, just like the concept of "file". It's never clear where to draw that line, though.

Should we have an option for "use exactly these permissions, ignoring the umask"?

That ends up being the meaning of supplying #:permisssions. Separating #:permissions and #:replace-permissions? provides more flexibility and, I think, better reflects the underlying operations.

I think your answer here was about copy-file, but I was trying to ask whether open-output-file and derived functions ought to also get an argument like #:replace-permissions?.

If so, maybe it's almost worth considering packing #:permissions, #:exists-ok?, and #:umask? into a value that could be passed around more conveniently.

This question could apply to any Racket function that accepts multiple keyword arguments instead of a single option-specifying object. There are trade-offs either way, but our convention is generally multiple keyword arguments, and I'd prefer to stick with that here.

Zuo doesn't have keyword arguments and generally uses a hash-table argument for options. Along the same lines as the Racket copy-file changes, the cp function in Zuo now accepts an options table. That convention works ok, and it occasionally works a little better, but it's usually a toss-up. Also, I end up putting pre-packaged option sets like :no-replace-mode in the standard library to get back a similar level of convenience to keyword arguments.

I agree that the multiple keywords are the more idiomatic solution here, and I think we can avoid options proliferating to an extent that would make the trade-off uncomfortable.

The example I had in mind of an options-specifying object was web-server/safety-limits. I think it worked out fairly well in its context, but there were more options involved, especially many places they needed to be propagated (some of them obscure), and complexity around default values and planning for compatibility.

@mflatt
Copy link
Member

mflatt commented Dec 15, 2022

I think your answer here was about copy-file, but I was trying to ask whether open-output-file and derived functions ought to also get an argument like #:replace-permissions?.

Oh, ok. I'll add that.

@winny-
Copy link
Contributor Author

winny- commented Dec 15, 2022

AFAICT, Python's shutil.copy in fact does copy the permissions of the source file to the destination file, just as Racket's copy-file does, and shutil.copy2 seems to go even further and copy all of the source file's metadata.

Kind of. shutil.copyfile does not copy metadata (i.e. perms). shutil.copy2 is like the former, however tries to preserve metadata (i.e. perms).

winston@icarus ~ $ mkdir /tmp/perm-experiment
winston@icarus ~ $ cd /tmp/perm-experiment/
winston@icarus /tmp/perm-experiment $ touch orig
winston@icarus /tmp/perm-experiment $ umask 077
winston@icarus /tmp/perm-experiment $ cp orig copied_with_cp
winston@icarus /tmp/perm-experiment $ cp -p orig copied_with_cp-p
winston@icarus /tmp/perm-experiment $ racket -e '(copy-file "orig" "copied_with_racket_copy-file")'
winston@icarus /tmp/perm-experiment $ python -c 'import shutil; shutil.copyfile("orig", "copied_with_python_shutil_copyfile")'
winston@icarus /tmp/perm-experiment $ python -c 'import shutil; shutil.copy2("orig", "copied_with_python_shutil_copy2")'
winston@icarus /tmp/perm-experiment $ ls -l
total 0
-rw------- 1 winston users 0 Dec 15 10:30 copied_with_cp
-rw-r--r-- 1 winston users 0 Dec 15 10:30 copied_with_cp-p
-rw-r--r-- 1 winston users 0 Dec 15 10:30 copied_with_python_shutil_copy2
-rw------- 1 winston users 0 Dec 15 10:32 copied_with_python_shutil_copyfile
-rw-r--r-- 1 winston users 0 Dec 15 10:30 copied_with_racket_copy-file
-rw-r--r-- 1 winston users 0 Dec 15 10:30 orig

@LiberalArtist
Copy link
Contributor

AFAICT, Python's shutil.copy in fact does copy the permissions of the source file to the destination file, just as Racket's copy-file does, and shutil.copy2 seems to go even further and copy all of the source file's metadata.

Kind of. shutil.copyfile does not copy metadata (i.e. perms). shutil.copy2 is like the former, however tries to preserve metadata (i.e. perms).

I was confused for a while before I read this more closely and realized there are three functions (among others):

  • shutil.copyfile: Content only, no metadata
  • shutil.copy: Content and permissions
  • shutil.copy2: Content and most metadata, but:

    On POSIX platforms, … file owner and group are lost as well as ACLs. On Mac OS, the resource fork and other metadata are not used. This means that resources will be lost and file type and creator codes will not be correct. On Windows, file owners, ACLs and alternate data streams are not copied.

I for one don't have a particular expectation about which semantics an unfamiliar file copying function might have. (If anything, I'd be most surprised by loosing resource forks!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants