Current code only supports files up to 2GB in size #23

jwkblades · 2019-01-10T16:44:00Z

As it currently stands, the code only supports files up to 2GB in size, because it relies on `read` returning the entire file in a single call. It also assumes that the underlying device is ready and able to deliver the whole file in one shot. Neither assumption is guaranteed to hold.

Here is a quick patch that allows reading files larger than 2GB (currently only on the diff side, i.e. `bsdiff.c`):

index 628f1c1..481f00f 100644
--- a/bsdiff.c
+++ b/bsdiff.c
@@ -373,6 +373,19 @@ static int bz2_write(struct bsdiff_stream* stream, const void* buffer, int size)
        return 0;
 }
 
+static off_t readFileTo(int fd, off_t size, uint8_t* buf)
+{
+       off_t bytesRead = 0;
+       ssize_t inc = 0;
+       while (bytesRead < size)
+       {
+               inc = read(fd, buf + bytesRead, size - bytesRead);
+               if (inc > 0) bytesRead += inc;
+               else break;
+       }
+       return bytesRead;
+}
+
 int main(int argc,char *argv[])
 {
        int fd;
@@ -397,7 +410,7 @@ int main(int argc,char *argv[])
                ((oldsize=lseek(fd,0,SEEK_END))==-1) ||
                ((old=malloc(oldsize+1))==NULL) ||
                (lseek(fd,0,SEEK_SET)!=0) ||
-               (read(fd,old,oldsize)!=oldsize) ||
+               (readFileTo(fd,oldsize,old)!=oldsize) ||
                (close(fd)==-1)) err(1,"%s",argv[1]);
 
 
@@ -407,7 +420,7 @@ int main(int argc,char *argv[])
                ((newsize=lseek(fd,0,SEEK_END))==-1) ||
                ((new=malloc(newsize+1))==NULL) ||
                (lseek(fd,0,SEEK_SET)!=0) ||
-               (read(fd,new,newsize)!=newsize) ||
+               (readFileTo(fd,newsize,new)!=newsize) ||
                (close(fd)==-1)) err(1,"%s",argv[2]);
 
        /* Create the patch file */
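The same loop can be written as a standalone helper. The following is a minimal sketch, not the patch itself: `read_full` is a hypothetical name, the result of `read` is kept in `ssize_t` (its declared return type), and the retry on `EINTR` is an addition the patch above does not make.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of bsdiff): read up to `size` bytes from
 * `fd` into `buf`, looping over short reads. Returns the number of bytes
 * actually read, which is less than `size` on EOF or on an error. */
static off_t read_full(int fd, uint8_t *buf, off_t size)
{
    off_t total = 0;

    while (total < size) {
        /* read() returns ssize_t, so keep the result in that type. */
        ssize_t n = read(fd, buf + total, (size_t)(size - total));

        if (n > 0) {
            total += n;
        } else if (n == -1 && errno == EINTR) {
            continue;   /* interrupted before any data arrived: retry */
        } else {
            break;      /* n == 0 is EOF; any other -1 is a real error */
        }
    }
    return total;
}
```

As in the patch, the caller compares the return value against the expected size, so a short result still ends up in the existing `err(1, ...)` path.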

`read` and `write` return a signed byte count (`ssize_t`), and a single call cannot be relied on to transfer more than about 2GB; on Linux, for example, one call transfers at most 0x7ffff000 bytes. bsdiff and bspatch expected the entire file to be read (or written, respectively) in a single call to `read` or `write`, which is only possible for files smaller than that limit. A single call can also fall short for other reasons, for instance when a device is busy.

The fix is to place each call in a loop that continues as long as at least 1 byte is transferred (in or out). If the transfer returns an error or 0, the loop breaks and returns the total number of bytes transferred up to that point.

Also updated the .gitignore file to ignore vim swap files, the autoconf-generated files, and the executables.
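For the write side, the loop described above might look roughly like the following. This is a sketch under assumptions, not the code from the commit: `write_full` is a hypothetical name, and retrying on `EINTR` goes slightly beyond the "break on error or 0" rule described above.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from bspatch): write `size` bytes from
 * `buf` to `fd`, looping over short writes. Returns the number of bytes
 * actually written, which is less than `size` if an error occurred. */
static off_t write_full(int fd, const uint8_t *buf, off_t size)
{
    off_t total = 0;

    while (total < size) {
        /* write() may accept fewer bytes than requested. */
        ssize_t n = write(fd, buf + total, (size_t)(size - total));

        if (n > 0) {
            total += n;
        } else if (n == -1 && errno == EINTR) {
            continue;   /* interrupted before any data was written: retry */
        } else {
            break;      /* error or zero-length result: stop and report */
        }
    }
    return total;
}
```

A caller would treat `write_full(fd, new, newsize) != newsize` as a failure, mirroring how the read side is checked in the patch.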
