putFile corrupting files on node 0.4.3 #10

Closed
ralphholzmann opened this Issue Mar 25, 2011 · 7 comments

Projects

None yet

3 participants

@ralphholzmann

Hello,

I have the following file on my server

http://dl.dropbox.com/u/47777/star-green16.gif

and I am trying to transfer it to my Dropbox using the following code sample

dropbox.putFile('/var/www/sites/ralphholzmann/star-green16.gif', '', function (err, data) {

  if (err) return console.error(err)

});

Upon running the script, the file transfers and there's no error. The file appears in my Dropbox, however its corrupted:

http://dl.dropbox.com/u/47777/star-green16-corrupt.gif

What puzzles me is that data is definitely getting transferred. The original gif is 945 bytes, and the corrupted file that appears after running the script is 970 bytes.

I've messed with all variations of encodings for file reads and writes and have not been able to get the file to successfully transfer uncorrupted.

I'm beginning to wonder if 0.4.3 somehow messed up how they decode files to UTF-8, or if I'm missing something completely.

Thanks for your great lib btw, I plan on forking and contributing.

Ralph

@ralphholzmann

So, after further investigation and testing, it seems to be an issue with either the dropbox package or the oauth (0.9) package. Here's an example that fails every time:

var filename = "star-green16.gif";

fs.readFile( filename, function( err, originalBuffer ) {

    dropbox.putFile(filename, '', function ( err, data ) {
      if ( err ) return console.log( err );

      setTimeout(function(){

        dropbox.getFile(filename, function( err, data ) {
          if ( err ) return console.log( err );

          if ( originalBuffer.toString( 'utf8' ) == data ) {
            console.log( 'files are the same!' );
          } else {
            console.log( 'files are not the same!' );
            console.log( '\nOriginal: ' )
            console.log( originalBuffer.toString( 'utf8' ) );
            console.log( '\n\nFrom Dropbox: ' )
            console.log( data );
          }

        });

      }, 200);
    });


});

Any insight into this issue would be greatly appreciated. Do you have an officially supported range of node versions?

Thanks,

Ralph

@ralphholzmann

More developments:

Looks like this may be an issue with the oauth library. Also, the issue does not exist on text / ascii files, only binary files, images, pdfs, etc.

@evnm
Owner
evnm commented Mar 29, 2011

The issue seems to boil down to the inability of the Dropbox API to reliably interpret the encodings provided by Node Buffers. I made some minor changes to putFile, as evidenced in this gist. On line 17, the Buffer data isn't being handled properly and substituting it for data.toString() with any of the valid encodings doesn't change anything.

Furthermore, if you compare the actual binary data of the original and uploaded files (i.e. using hexl-mode in Emacs), the relevant bytes seems to be the same, but the uploaded file contains a bunch of extra padding. See example.

@ralphholzmann

Hey Evan,

Thanks for diving into this issue. After reading your findings and doing some more digging, I've found a fix for this issue.

The issue seems to boil down to the inability of the Dropbox API to reliably interpret the encodings provided by Node Buffers.

This statement sent me on a hunt to see if the Dropbox API supported any alternative Content-Transfer-Encodings for file uploads, (base64, utf-8, etc). After searching the forums, I found this comment by a Dropbox employee:

As per RFC2616, Content-Transfer-Encoding isn't supported by HTTP (http://www.rfc2616.com/#19.4.5). Dropbox's servers assume all requests send the literal bytes to be interpreted, and we have no plans to change that in the near future.

via http://forums.dropbox.com/topic.php?id=29988#post-284597

So this made it pretty clear that the file being uploaded needs to be transferred using binary encoding:

https://gist.github.com/892564

However, simply changing the buffer output to binary didn't seem to solve the problem. The transferred file would appear in dropbox with the correct size, but a few of the bytes would be off (after looking at a hexadecimal diff of the two files).

What ended up fixing the issue was in the node-oauth library. Not only does the file need to be output in binary in the post body, but the writing of the request stream also needs to be in binary.

So I changed the follows lines from node-oauth

https://github.com/ciaranj/node-oauth/blob/master/lib/oauth.js#L321-329

from using request.write(post_body);
to use request.write(post_body, 'binary');

After making this change, the image files I've been testing with finally transferred to my Dropbox uncorrupted! \o/

However, this presents two problems. The first being, I dont know what the implications are of making this change to node-oauth with concern to other projects that depend on it. This fixes a Dropbox issue, but will it create issues for other services using OAuth? The second problem is that Buffer.toString('binary') is deprecated, and will be removed in future versions of node (per http://nodejs.org/docs/v0.4.4/api/buffers.html), so forward compatibility will require a work around.

Also, in case you were wondering, after changing the oauth library to write its streams in binary, I did try going back and undoing the data.toString('binary') in dropbox-node, to see if it would transfer correctly without being forced to binary output. It wont :(.

I'll post an issue on the node-oauth repo and see what their thoughts are on the changes to the request writing, but for now, I can finally move forward in my application development.

Thanks,

Ralph

@evnm
Owner
evnm commented Mar 29, 2011

Thanks for following through with this! Your findings lead me to discover the cause of the getFile/getThumbnail binary data bugs as well. I commented on your node-oauth issue, so hopefully we can get a response from ciaranj soon.

The OAuth lib should provide some way to deal with things in binary. Dropbox's is obviously not the only API to deal with images, for example (picplz, instagram, et all).

In the meantime, I'm glad to hear that you can make progress on your app!

@rubenstolk

Guys, as we're all waiting for claranj to have time to fix this, does anyone have a solution that could work for dropbox right now? When I use above approach (request.write(..., 'binary')), my script hangs and I get an exception "socket hang up"...

@evnm
Owner
evnm commented Mar 31, 2014

Closing this issue, as this library has long been deprecated.

@evnm evnm closed this Mar 31, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment