Current code only supports files up to 2GB in size #23

Open
jwkblades opened this issue Jan 10, 2019 · 0 comments

As it currently stands, the code only supports files up to 2GB in size, because it relies on a single `read` call returning the entire file. That also assumes the underlying device is ready and able to deliver the whole file in one shot. Neither is guaranteed: `read` may legitimately return fewer bytes than requested.

Here is a quick patch that allows reading files larger than 2GB (only on the diff side of things for now):

index 628f1c1..481f00f 100644
--- a/bsdiff.c
+++ b/bsdiff.c
@@ -373,6 +373,19 @@ static int bz2_write(struct bsdiff_stream* stream, const void* buffer, int size)
        return 0;
 }
 
+static off_t readFileTo(int fd, off_t size, uint8_t* buf)
+{
+       off_t bytesRead = 0;
+       ssize_t inc = 0;
+       while (bytesRead < size)
+       {
+               inc = read(fd, buf + bytesRead, size - bytesRead);
+               if (inc > 0) bytesRead += inc;
+               else break;
+       }
+       return bytesRead;
+}
+
 int main(int argc,char *argv[])
 {
        int fd;
@@ -397,7 +410,7 @@ int main(int argc,char *argv[])
                ((oldsize=lseek(fd,0,SEEK_END))==-1) ||
                ((old=malloc(oldsize+1))==NULL) ||
                (lseek(fd,0,SEEK_SET)!=0) ||
-               (read(fd,old,oldsize)!=oldsize) ||
+               (readFileTo(fd,oldsize,old)!=oldsize) ||
                (close(fd)==-1)) err(1,"%s",argv[1]);
 
 
@@ -407,7 +420,7 @@ int main(int argc,char *argv[])
                ((newsize=lseek(fd,0,SEEK_END))==-1) ||
                ((new=malloc(newsize+1))==NULL) ||
                (lseek(fd,0,SEEK_SET)!=0) ||
-               (read(fd,new,newsize)!=newsize) ||
+               (readFileTo(fd,newsize,new)!=newsize) ||
                (close(fd)==-1)) err(1,"%s",argv[2]);
 
        /* Create the patch file */
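
For reference, the loop in the patch above can also be written as a standalone helper. This is only a sketch, not code from bsdiff: the name `read_fully` is hypothetical, and retrying on `EINTR` is an assumption added here (the patch itself stops on any non-positive return value).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of bsdiff: read up to `size` bytes from
 * `fd` into `buf`, looping over short reads. Returns the number of bytes
 * actually read, which is less than `size` on EOF or error. */
static off_t read_fully(int fd, uint8_t *buf, off_t size)
{
    off_t total = 0;

    assert(buf != NULL || size == 0);
    while (total < size) {
        /* read() returns ssize_t, so keep the result in that type. */
        ssize_t n = read(fd, buf + total, (size_t)(size - total));
        if (n > 0) {
            total += n;
        } else if (n == -1 && errno == EINTR) {
            continue;   /* assumption: retry when interrupted by a signal */
        } else {
            break;      /* EOF (0) or a real error (-1) */
        }
    }
    return total;
}
```

As in the patch, the caller compares the return value against the expected size and treats any shortfall as a failure.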
jwkblades added a commit to jwkblades/bsdiff that referenced this issue Jan 14, 2019
`read` and `write` return a signed `ssize_t`, and a single call may
transfer fewer bytes than requested; on Linux it transfers at most
0x7ffff000 bytes (just under 2GB). bsdiff and bspatch expected the
entire file to be read (or written, respectively) in a single call to
`read` or `write`, which only works for files smaller than 2GB. A
single call can also fall short for other reasons, for instance when
a device is busy. The fix is to call the functions in a loop and
continue as long as at least 1 byte was transferred (in or out). If an
error or a 0 return value comes back from the transfer, break out of
the loop and return the total number of bytes transferred up to that
point.

Updated the .gitignore file to ignore vim swap files, as well as the
autoconf (generated) files and the executables.
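
The write side described in the commit message is not shown in this issue, so the following is only a rough sketch of what such a loop might look like. The name `write_fully` is hypothetical and the `EINTR` retry is an assumption; the actual commit may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the commit: write `size` bytes from
 * `buf` to `fd`, looping over short writes. Returns the number of bytes
 * actually written, which is less than `size` if the transfer stopped. */
static off_t write_fully(int fd, const uint8_t *buf, off_t size)
{
    off_t total = 0;

    assert(buf != NULL || size == 0);
    while (total < size) {
        ssize_t n = write(fd, buf + total, (size_t)(size - total));
        if (n > 0) {
            total += n;     /* at least 1 byte went out: keep going */
        } else if (n == -1 && errno == EINTR) {
            continue;       /* assumption: retry when interrupted */
        } else {
            break;          /* 0 or a real error: stop and report */
        }
    }
    return total;
}
```

A caller would check `write_fully(fd, buf, size) != size` in the same place the original code checked the result of a single `write`.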