Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zstd willing to (try to) compress non-files #232

Closed
chipturner opened this issue Jul 1, 2016 · 13 comments · Fixed by #519
Closed

zstd willing to (try to) compress non-files #232

chipturner opened this issue Jul 1, 2016 · 13 comments · Fixed by #519

Comments

@chipturner
Copy link
Contributor

gzip refuses to compress files that aren't "normal" such as devices, fifos, sockets, symlinks, etc. zstd tries to compress them anyway, which can lead to odd behavior such as hanging when it's a fifo, reading forever when it's /dev/zero, etc etc.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Jul 1, 2016

Indeed, good point.

I guess this falls outside of standard C,
it will probably require some Posix C specific tests.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Jul 1, 2016

Some more detail :
gzip refuses to compress a direct call to devices, such as :

gzip /dev/zero
gzip: /dev/zero is not a directory or a regular file - ignored

but it does not detect such issue behind a pipe :
gzip < /dev/zero > out will proceed compressing the endless stream.

So it makes the issue easier to deal with.

For information, current version of zstd just exits when provided /dev/zero as input.
No message, it's silent, but there is a non-zero exit code.
It seems to already detect something. Let's make the behavior more complete.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Jul 1, 2016

The reason zstd exits while attempting to compress /dev/zero is that it cannot open the default destination file /dev/zero.zst.

This is now made clearer in latest "dev" branch update, since it will print a more explicit message.
But it doesn't answer this request per se, which is to detect that /dev/zero is not a file and decline compression attempt.

Therefore, zstd /dev/zero -c > out will proceed with compressing /dev/zero, since destination file creation is no longer a problem.

Note however that gzip /dev/zero -c > out and xz /dev/zero -c > out do also proceed with compressing /dev/zero !

This is a surprise. Not sure if this should be considered a feature to imitate or a bug...

@Cyan4973
Copy link
Contributor

Cyan4973 commented Jul 4, 2016

With v0.7.2 out,
zstd will now properly refuse to compress /dev/* and provide a clear error message,
but not the same one as gzip.

Instead of :
gzip: /dev/zero is not a directory or a regular file - ignored
it will tell :
zstd: /dev/zero.zst: Permission denied

that is, compression will not happen because it's not possible to create destination file, and not because /dev/zero is a device.

That being said, I've been surprised to note that both gzip and xz will also happily compress /dev/zero if their output is redirected using -c.
As a consequence, I'm really wondering if they are refusing because source file is a device (but then why would they accept in the other case ?) or for another reason.

Anyway, we are now in a situation where zstd behave externally the same as gzip and xz, but for a difference in the error message.

Is it considered good enough ?

@chipturner
Copy link
Contributor Author

gzip and xz are stat'ing the file to determine its type, and refusing if it isn't a file. When processing as a stream, they can't tell the source of bytes is a device, so they simply have to compress it. For devices, FIFOs, unix domain sockets, and other file-like-but-not-quite-file objects, they can refuse.

Basically they are refusing when they can tell and otherwise doing what they must -- consume input bytes from an already opened file descriptor.

@Cyan4973
Copy link
Contributor

Cyan4973 commented Jul 4, 2016

When processing as a stream, they can't tell the source of bytes is a device, so they simply have to compress it.

Yes, when /dev/zero is provided as a stream, it's logical.

However, in the following example, it seems to be provided as a regular File System object :

gzip /dev/zero -c > /dev/null

Only the output is redirected, not the input.

@chipturner
Copy link
Contributor Author

That may be intentional but the behavior is probably not practically useful. The logic is perhaps "don't compress if input isn't a file and output is" under the premise the output could get terminated by whatever is consuming the pipe. It seems pretty sketchy, but it makes sense for fifos, sockets, etc, even when the output directory is writable.

xz basically checks the output type. Looks like behavior emulated to be consistent (mostly) with gzip and bzip2.

https://github.com/nobled/xz/blob/96f94bc925d579a700147fa5d7793b64d69cfc18/src/xz/file_io.c#L304-L307

inikep added a commit to inikep/zstd that referenced this issue Jan 25, 2017
inikep added a commit to inikep/zstd that referenced this issue Jan 25, 2017
@inikep inikep mentioned this issue Jan 25, 2017
@Cyan4973
Copy link
Contributor

Latest dev branch update by @inikep fixes this issue

terrelln pushed a commit to terrelln/zstd that referenced this issue Jan 27, 2017
terrelln pushed a commit to terrelln/zstd that referenced this issue Jan 27, 2017
terrelln added a commit to terrelln/zstd that referenced this issue Jan 27, 2017
* dev:
  updated NEWS
  fixed MSAN warnings in legacy decoders
  Fix cmake build
  updated NEWS
  Edits as per comments, and change wildcard 'X' to '?'
  Fix Visual Studios project
  Fix pool.c threading.h import
  Fix zstdmt_compress.h include
  Fixed commented issues
  Updated format specification to be easier to understand
  improved facebook#232 fix
  Fixed facebook#232
  .travis.yml: different tests for "master" branch
  .travis.yml: optimized order of short tests
  .travis.yml: test jobs 12-15
  JOB_NUMBER -eq 9
  improved ZSTD_compressBlock_opt_extDict_generic
@kovaxis
Copy link

kovaxis commented May 30, 2021

Is there any good reason to disable compressing on non-files? There's no point in artifically preventing this, and in some cases it can be useful (eg. compressing an entire device image). There's not even a flag to re-enable this functionality, the only current workaround is to pass the input through a pipe, consuming extra CPU.

Could be nice functionality to implement for some edge use-cases.

@terrelln
Copy link
Contributor

terrelln commented Jun 2, 2021

@negamartin you should be able to use -f to force zstd to compress non-files.

@kovaxis
Copy link

kovaxis commented Jun 3, 2021

Seems like it only works with links: running zstd -f /dev/sda -o /dev/null still complains about /dev/sda not being a regular file, at least on version v1.4.4 as reported by --version.

EDIT: Even when running with root privileges.

@terrelln
Copy link
Contributor

terrelln commented Jun 3, 2021

This was recently changed in v1.5.0, see #2613.

@kovaxis
Copy link

kovaxis commented Jun 5, 2021

Thank you! I'm sorry, I didn't find that issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants