Differences between gzip and zstd #854

Closed
ghost opened this issue Sep 18, 2017 · 4 comments

ghost commented Sep 18, 2017

zstd on Linux: two observations

  1. If I start gzip to compress a large file and if I kill gzip using ^C, gzip will delete the output file before terminating. zstd currently doesn't do that. I think gzip's behavior is better, because it's better to have no output file than to have an incomplete or corrupted output file.

  2. When compressing or decompressing a file, zstd uses the system call below to create the output file (from strace):
    open("file.zst", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
    gzip and xz use similar system calls, but they use mode 0600. They switch the mode of the output file to the desired mode after the file was closed. Is there a reason why zstd uses 0666? I think 0600 would be better: don't allow read access to other users before the file is ready.
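
The pattern described in point 2 (create the file private, make it visible only once it is complete) can be sketched roughly as follows. This is an illustration for POSIX systems, not code taken from gzip, xz, or zstd; the helper name `write_private_then_publish` and its parameters are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes from `buf` to `path`.
 * The file is created with mode 0600 so other users cannot read it while
 * it is incomplete; `final_mode` is applied only after a successful close.
 * Note: if `path` already exists, O_CREAT does not change its current mode.
 * Returns 0 on success, -1 on error (a partial file is removed). */
static int write_private_then_publish(const char *path, const void *buf,
                                      size_t len, mode_t final_mode)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            close(fd);
            unlink(path);          /* do not leave a truncated file behind */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    if (close(fd) != 0) {
        unlink(path);
        return -1;
    }
    /* The file is complete: switch to the desired permissions. */
    return chmod(path, final_mode);
}
```

A caller would pass the mode it ultimately wants, for example `write_private_then_publish("file.zst", data, size, 0644)`.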

@Cyan4973
Contributor

  1. Trapping the ^C event requires a signal handler. I haven't looked into it yet. I'm also worried about portability.

  2. Your trace is more precise than our code.
    The call that opens the destination file is here:
    https://github.com/facebook/zstd/blob/dev/programs/fileio.c#L324
    It's just a standard fopen("name", "wb"); we don't control any permission bits.
    It happens to translate into this trace on your system, but we don't have control over it.
    In general, we try to remain as close as possible to standard C, for portability reasons (and to reduce dependency headaches).
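
For context on where the 0666 in the trace comes from: on POSIX systems, `fopen()` creates new files with mode 0666, which the process umask then filters. The sketch below only observes that behaviour; the helper name `mode_after_fopen` is hypothetical, and `stat()` is a POSIX call rather than standard C.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create `path` with plain fopen() and report the
 * permission bits the new file ended up with.
 * Returns (mode_t)-1 on error. */
static mode_t mode_after_fopen(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "wb");   /* no way to pass a mode here */
    if (f == NULL)
        return (mode_t)-1;
    fclose(f);

    struct stat st;
    if (stat(path, &st) != 0)
        return (mode_t)-1;
    return st.st_mode & 0777;      /* keep only the permission bits */
}
```

With a typical umask of 022 a newly created file ends up as 0644; with a umask of 077 it ends up as 0600. An already existing file keeps whatever mode it had.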

@ghost
Author

ghost commented Sep 19, 2017

First of all: thanks for writing zstd. I'm amazed by the speed and compression ratio! Very nice!

Please feel free to close this ticket. My two observations are only minor nitpicks and it's perfectly fine if you decide not to do anything.

  1. I agree that trapping ^C requires a signal handler.
  2. Interesting. I looked into the source code of xz and they open the output file with the code below:
      int flags = O_WRONLY | O_BINARY | O_NOCTTY | O_CREAT | O_EXCL;
      #ifndef TUKLIB_DOSLIKE
          flags |= O_NONBLOCK;
      #endif
      const mode_t mode = S_IRUSR | S_IWUSR;
      dest_fd = open( dest_name, flags, mode );

They use open() instead of fopen() - and thus can control the permission bits of the output file.
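
A program that writes through stdio could get the same control by opening the descriptor itself and wrapping it with `fdopen()`. The following is a minimal sketch under the assumption of a POSIX target; it is neither xz's nor zstd's code, and the name `fopen_private` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: behaves like fopen(path, "wb"), except that the
 * file is created with owner-only permissions (0600) and the call fails
 * if `path` already exists (O_EXCL).
 * Returns NULL on error. */
static FILE *fopen_private(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return NULL;

    FILE *f = fdopen(fd, "wb");    /* hand the descriptor over to stdio */
    if (f == NULL) {
        close(fd);
        unlink(path);              /* we created it, so clean it up */
    }
    return f;
}
```

The trade-off is the one discussed in this thread: `open()` and `fdopen()` come from POSIX, so a non-POSIX target would need a separate code path or a wrapper.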

@Cyan4973
Contributor

Cyan4973 commented Sep 19, 2017

Right, I believe xz targets POSIX-compliant systems,
hence it can freely use POSIX libraries, at the cost of being incompatible with non-POSIX systems or requiring custom wrappers (Windows comes to mind).

zstd tries to stick to standard C as much as possible, to improve portability.

Technically, we also occasionally use some OS-dependent interfaces, when there is no standard C equivalent. In such cases, we try to write several variants to support multiple targets, and for unsupported ones, the relevant functionality is disabled cleanly (it tends to be non-essential).

Cyan4973 added a commit that referenced this issue Oct 1, 2017
Now, pressing Ctrl-C during compression or decompression
will erase operation artefact (unfinished destination file)
before leaving execution.
@Cyan4973
Contributor

Cyan4973 commented Oct 1, 2017

After learning a bit more about it, it appears that the signal() function is actually part of standard C, which makes it quite portable.

In the latest dev branch update, I added Ctrl-C trapping to the zstd CLI.
On pressing Ctrl-C, it now erases the operation artefact (the unfinished destination file) before exiting.
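
The general idea of removing an unfinished file on Ctrl-C with standard C `signal()` can be sketched as below. This is an illustration only, not the code from the referenced commit; every name in it (`g_unfinished_output`, `begin_output`, `end_output`, and so on) is hypothetical. One caveat: the C standard guarantees very little about what a handler may call, and `remove()` is not on POSIX's async-signal-safe list (`unlink()` is), so stricter code would use `unlink()` on POSIX targets.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

/* Path of the output file currently being written, or NULL when there is
 * nothing to clean up. Hypothetical name, for illustration only. */
static const char *volatile g_unfinished_output = NULL;

/* Delete the unfinished output file, if one is registered. */
static void remove_unfinished_output(void)
{
    if (g_unfinished_output != NULL)
        remove(g_unfinished_output);
}

/* SIGINT handler: ignore further interrupts, clean up, then exit. */
static void on_interrupt(int sig)
{
    signal(sig, SIG_IGN);
    remove_unfinished_output();
    _Exit(EXIT_FAILURE);
}

/* Call just before creating the output file at `path`. */
static void begin_output(const char *path)
{
    assert(path != NULL);
    g_unfinished_output = path;
    signal(SIGINT, on_interrupt);
}

/* Call once the output file has been completely written and closed. */
static void end_output(void)
{
    signal(SIGINT, SIG_DFL);       /* restore default Ctrl-C behaviour */
    g_unfinished_output = NULL;
}
```

The intended usage is to bracket the write: `begin_output(name)`, then open and write the file, then `end_output()` once it has been closed successfully.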

@Cyan4973 Cyan4973 closed this as completed Oct 1, 2017