
Non-deterministic corruption of resulting tarball on Mac OSX #23

Closed
xoqem opened this issue Feb 22, 2013 · 22 comments

Comments

xoqem commented Feb 22, 2013

Basically, the compress step produces a tarball that is only 10 bytes long roughly 20% of the time (or something in that range). We've also occasionally seen it produce a file that is somewhat longer but still truncated; 10 bytes seems to be the length in the typical error case. We've only seen it on Mac boxes so far, though we haven't really tested much on other machines.

We witnessed this on several larger projects, so we broke it down to a very simple case (setup below). I test by running the following command repeatedly (previously we were hitting it using a more typical grunt workflow; this is just a simple command put together to run the test many times quickly):

rm -rf dist; grunt compress; ls -l dist; tar tzf dist/test.tar.gz

When it fails, the tar tzf command typically outputs "tar: Error opening archive: (Empty error message)".

Project setup:


Root folder has the following contents:

/src/test.html (just an empty file for testing)
/Gruntfile.js
/package.json


Gruntfile.js contents:

/*global module:false, node:true*/
module.exports = function(grunt) {
  // Project configuration.
  grunt.initConfig({
    pkg: grunt.file.readJSON('package.json'),
    meta: {
    },
    compress: {
      test: {
        options: {
          archive: 'dist/test.tar.gz'
        },
        files: [
          {expand: true, cwd: 'src', src: ['*']}
        ]
      }
    }
  });

  grunt.loadNpmTasks('grunt-contrib-compress');
};


package.json contents:

{
  "name": "GruntCompress",
  "version": "0.0.0",
  "title": "",
  "dependencies": {},
  "devDependencies": {
    "grunt": "~0.4",
    "grunt-contrib-compress": "~0.4.0"
  }
}

xoqem (Author) commented Feb 22, 2013

Note that if you add a sleep task that executes after compress (we tried 200 ms, for example), the issue seems to go away. For example:

grunt.registerTask('sleep', function() {
  setTimeout(this.async(), 200);
});
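
For completeness, a rough example of how such a task might be chained so it runs after compress (the 'default' task name here is just illustrative, not our exact setup):

// illustrative only: run compress first, then the sleep task above
grunt.registerTask('default', ['compress:test', 'sleep']);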

This may help in tracking down the nature of the problem.

ctalkington (Member) commented:

compress v0.4 actually uses node-archiver, my project; tar was initially only supported in archiver v0.3, which is the version compress uses.

Since the issue you are seeing seems related to timing, it's most likely that the use of nextTick in v0.3 is causing problems. That was replaced with a more efficient event-based approach in v0.4, which was recently released.

Could you try installing that version in your project and see if you get the same results? Also, what version of Node are you running?

cd node_modules/grunt-contrib-compress
npm install archiver@0.4.0

ctalkington (Member) commented:

Also, could you try it without gzip? That's a whole other factor, and removing it will help pinpoint the exact cause.
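
For example, a minimal variant of the repro target that skips gzip might look like the below (the mode option here is an assumption based on the grunt-contrib-compress docs and may not even be needed if the format is inferred from the archive extension):

compress: {
  test: {
    options: {
      archive: 'dist/test.tar',
      mode: 'tar'
    },
    files: [
      {expand: true, cwd: 'src', src: ['*']}
    ]
  }
}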

xoqem (Author) commented Feb 22, 2013

Hey, thanks for the reply. The Node version (via process.version) is v0.8.12. I tried npm install archiver@0.4.0 and I still get the error, though I'm not convinced it was actually using that version of archiver: when I ran npm list, it showed archiver@0.4.0 at the top, but further down the list it showed:

└─┬ grunt-contrib-compress@0.4.0
├─┬ archiver@0.3.0

Is it smart enough to use the newer version, or does it always use the nested version? I'm a bit of a Node newbie, and I couldn't find the syntax for swapping out the version of a dependency of a dependency.

xoqem (Author) commented Feb 22, 2013

Oh, sure, I'll try without the gzip. Good question.

ctalkington (Member) commented:

You would need to cd into the module dir and do it there:

cd node_modules/grunt-contrib-compress
npm install archiver@0.4.0

xoqem (Author) commented Feb 22, 2013

Ah, thanks for the tip, I'll try the archiver@0.4.0 next and update.

To answer your second question, producing just a .tar does not seem to suffer from the error (after trying it numerous times; I also tried with a bunch of files in the src directory to make sure it wasn't just the decreased run time).

jaredr commented Feb 24, 2013

When I try this with a single file, I sometimes get corruption if I'm building a .tar.gz, but no problems if I make it either a .tar or a .gz.

Mac OS 10.6.8, node 0.8.20, grunt 0.4, grunt-contrib-compress 0.4.0, archiver 0.3.0.

If I manually switch to archiver 0.4.0, I see the same problem. (Also, npm ls mentions the version as "invalid", but I presume that's just because it doesn't match the version that compress has in its package.json file)

Also note that even when the .tar.gz is not corrupted, the file can vary in size by a few bytes, so the gzip process seems to be somewhat non-deterministic even when it does manage to include everything. Presumably the results differ depending on how much data the OS has read at any particular time?

xoqem (Author) commented Feb 26, 2013

I too still see the problem after manually overriding the version of archiver to 0.4.0 (and also see it marked as invalid in the npm list).

ctalkington (Member) commented:

@jaredr can you confirm whether the issue goes away with more than one file or with a large file? I'm thinking it must be related to CPU cycles or something; generally you wouldn't tar a single small file.

jaredr commented Feb 26, 2013

Nope, tried an archive with two small files and I have the same problem.

ctalkington (Member) commented:

I'm not quite sure. Have you tried downloading the repo and running its tests to see if any of them fail? It would seem like it's exiting early, but it must be OS/config specific, as the tests aren't throwing on Travis (Linux) and I've never had issues on Windows.

@jaredr
Copy link

jaredr commented Feb 26, 2013

What does archiver (or whatever) do to flush the file when it is finished writing?

I'm looking at how grunt finishes, and it looks like it's an explicit call to exit:
https://github.com/gruntjs/grunt/blob/v0.4.0/lib/grunt.js#L141

but a call to exit does not flush buffers before termination:
nodejs/node-v0.x-archive#3737

(grunt's exit is a special one that checks for pending writes to stdout & stderr, but that's it, as far as I can tell: https://github.com/gruntjs/grunt/blob/v0.4.0/lib/util/exit.js )
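
As a sanity check, here's a minimal standalone sketch (not grunt code, just an illustration of the exit-before-flush behavior under my assumptions; note that a bare gzip header is 10 bytes, which would line up with the truncated files we're seeing):

// illustrative: process.exit() does not wait for piped streams to flush
var fs = require('fs');
var zlib = require('zlib');

var gzip = zlib.createGzip();
gzip.pipe(fs.createWriteStream('out.gz'));
gzip.write(new Buffer(1024 * 1024)); // ~1 MB still buffered in the pipeline
gzip.end();
process.exit(0); // out.gz will typically be empty or truncated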

ctalkington (Member) commented:

Archiver just uses a stream that pipes to an fs WriteStream (or any Stream variant); however, in the case of tar.gz it's a double pipe. The way compress works is by letting this happen async. Now that I think of it, I bet the issue is that done is called based on Archiver finishing, which could sometimes happen before gzip is done, since it is piping twice.

ctalkington (Member) commented:

Most likely someone making a PR would need to hook onto the end event of the archive stream and add a special case for tar.gz.
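
Roughly, the shape of that fix might be something like the sketch below (not the actual grunt-contrib-compress code; the createTar/append names are assumptions based on the archiver v0.4 docs):

// sketch only: report completion when the *file* stream closes, rather than
// only when the archive stream ends, since gzip may still be flushing to disk
var fs = require('fs');
var zlib = require('zlib');
var archiver = require('archiver');

var done = function() { console.log('archive fully written'); }; // stand-in for grunt's this.async()

var archive = archiver.createTar();
var gzip = zlib.createGzip();
var dest = fs.createWriteStream('dist/test.tar.gz');

archive.pipe(gzip).pipe(dest); // the "double pipe" for tar.gz

archive.append(fs.createReadStream('src/test.html'), { name: 'test.html' });
archive.finalize();

dest.on('close', done); // the safe point to call the task's done callback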

shama (Member) commented Mar 13, 2013

Could you try this with the latest v0.4.2? Thanks!

xoqem (Author) commented Mar 13, 2013

I updated to the latest versions below, and I'm still getting the error with a similar frequency:

"grunt": "0.4.1"
"grunt-contrib-compress": "0.4.2"

shama (Member) commented Mar 13, 2013

Thanks @sinfree, I'll look to see if I can figure out why.

ctalkington (Member) commented:

@shama it's because, when gzipping, listening to just Archiver's finalize and calling done won't cut it: Archiver will have finished, but you're then still piping gzip to the file.

Also, Archiver v0.4.1 is out. Massive improvements in performance; it uses readable-stream and is more event-based versus the old nextTick flow.

EDIT: The API should remain the same, but addFile (aliased for now) was replaced with append, due to the ability to update archives in a future release.

EDIT2: Oh, and you don't need to wrap append anymore; Archiver has its own queue now, so you can do a normal forEach and call finalize right after.
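
So, in rough sketch form (file names are illustrative; createTar/append per the v0.4.1 docs as I understand them):

var fs = require('fs');
var archiver = require('archiver');

var archive = archiver.createTar();
archive.pipe(fs.createWriteStream('dist/test.tar'));

// Archiver v0.4.1 queues appends internally, so a plain forEach works
// and finalize() can be called right after, with no per-file callbacks
['src/test.html', 'src/other.html'].forEach(function(file) {
  archive.append(fs.createReadStream(file), { name: file });
});
archive.finalize();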

derekcicerone-zz (Contributor) commented:

Just updated to 0.4.4, and the new 1:1 gzip option also seems to be affected by this issue. The gzipped versions of files are not decompressing back to the originals.

shama (Member) commented Mar 19, 2013

@ctalkington Odd, none of your comments on this thread were appearing for me earlier. Thanks for the tips on Archiver! I'll see if I can implement them soon.

ctalkington (Member) commented:

@shama Odd, that may have had to do with my account being marked as spam for a bit. It was a false positive and I had to contact GH about it, but I couldn't use gists for a while because of it.

shama closed this as completed in 2e6b0f7 on Mar 20, 2013