Recursive directory diff support #21

Open
GoogleCodeExporter opened this Issue Mar 24, 2015 · 22 comments

GoogleCodeExporter commented Mar 24, 2015

This is a frequently-requested feature. In particular, it's an RTPatch
feature. This will probably require some Unix/Win-specific code.

Original issue reported on code.google.com by josh.mac...@gmail.com on 5 Feb 2007 at 11:47

GoogleCodeExporter commented Mar 24, 2015

gzip 1.3.5 seems to handle this well on both POSIX and Windows platforms, and
is of course GPL2 licensed. I'm not sure how much gnuwin32 replaces the
baseline GNU code, though, or how modularized it really is. My C skills are
caked with 10 years of rust.

Just thought I'd throw it out as an example, anyway.

Original comment by malay...@gmail.com on 6 Feb 2007 at 4:45

GoogleCodeExporter commented Mar 24, 2015

It's not exactly the same, but SVN 125 (after release 3.0o) has a new
environment variable for use with tar:

export XDELTA
XDELTA="-s source-1.tar"
TAR="tar --use-compress-program=xdelta3"
$TAR -cvf source-1-source2.tar.vcdiff source-2/
...
$TAR -xvf source-1-source2.tar.vcdiff

Original comment by dotdotis...@gmail.com on 8 Feb 2007 at 3:04

GoogleCodeExporter commented Mar 24, 2015

Attached is the perl script I use to achieve this functionality.  If anyone is
interested I also have a version that works with xdelta 1.1.x.

Original comment by eje...@gmail.com on 18 Jun 2007 at 6:11

Attachments:

GoogleCodeExporter commented Mar 24, 2015

Josh, this is pretty cool software.  I really could use the directory diff
though, as I am adding directories and files in them from version to version.
That, of course, causes my tar/zip files to be pretty different, and I usually
only save about fifteen megs in a 260 MB file.  Anyway, thanks for the program,
and I'd be happy to beta test on the Windows side if you choose to get this
directory diff thing going.

Original comment by rwsto...@gmail.com on 5 Aug 2007 at 1:41

GoogleCodeExporter commented Mar 24, 2015

Original comment by josh.mac...@gmail.com on 14 Dec 2007 at 9:30

  • Added labels: Type-Enhancement, Milestone-Release3.0
  • Removed labels: Type-Defect

GoogleCodeExporter commented Mar 24, 2015

Original comment by josh.mac...@gmail.com on 14 Dec 2007 at 9:38

  • Removed labels: Milestone-Release3.0

GoogleCodeExporter commented Mar 24, 2015

I'm not sure I understand the tar method fully. Can someone give a full
example, let's say, comparing D:\Dir1 against D:\Dir2, and then also how to
apply that diff as well? Both xdelta and tar need to be in the temp dir. I
would like to make this part of an update installation.

Thanks

Original comment by j.scru...@gmail.com on 4 Jul 2008 at 2:01

GoogleCodeExporter commented Mar 24, 2015

Attached is a new version of xdelta-dir.  It fixes an issue with recursive
directory handling.

Original comment by eje...@gmail.com on 24 Feb 2009 at 6:57

Attachments:

GoogleCodeExporter commented Mar 24, 2015

These are my modifications to the script. Since it's GPL, I'm posting them here
for anyone to see and use, if they find them useful.
I've fixed some hang-ups that occurred when a dir was present on one side but
not on the other, and added a file named DELETED in patchdir, from which the
removed files and dirs are read. I also added some checks, and the output of
xdelta-dir now goes to a log (named patcher.log) via TeeOutPut.pm (search
Google for that).

Original comment by fwiff...@gmail.com on 16 Mar 2009 at 6:35

Attachments:

GoogleCodeExporter commented Mar 24, 2015

I have made a handy Java translation of the Perl script above that anyone can
use in their Java programs.

It is standalone and only requires the user to have xdelta3.exe in the working
directory or to set the system property xdelta3.

In a DOS batch do this to create a recursive delta of an entire directory:

SET xdelta3=somefilelocation (maybe c:\delta.exe)
java XDeltaHelper [-v] delta olddir newdir patchdir

If the user wants a manifest of all the folders and files in newdir that were
not found in olddir, use -m or --manifest followed by a filename. This must be
the second parameter.

SET xdelta3=c:\xdelta3.exe
java XDeltaHelper [-v] [-m manifest.txt] delta olddir newdir patchdir

Applying a patch is equally easy.

SET xdelta3=c:\xdelta3.exe
java XDeltaHelper [-v] patch patchdir olddir newdir

It is also possible to use this as a convenient wrapper for just a plain old
xdelta of a file

SET xdelta3=c:\xdelta3.exe
java XDeltaHelper [-v] delta oldfile newfile patchfile

or to make a new file from a patch and an old file

SET xdelta3=c:\xdelta3.exe
java XDeltaHelper [-v] patch patchfile oldfile newfile


*** TO APPLY A PATCH TO AN EXISTING OLD FILE ***
This will create the newfile and then replace the old with the new.

SET xdelta3=c:\xdelta3.exe
java XDeltaHelper --applypatch [-v] patchfile oldfile

Happy Coding

Original comment by canw...@gmail.com on 11 Apr 2009 at 4:46

Attachments:

GoogleCodeExporter commented Mar 24, 2015

Hi, I created a utility that creates a patch from two directories. I'm using
JojoDiff, but I could easily use yours too. Feel free to contact me if you're
interested, at xmarot at gmail dot com

Original comment by xma...@gmail.com on 18 Apr 2009 at 5:15

Attachments:

GoogleCodeExporter commented Mar 24, 2015

This issue is almost 5 years old. Isn't there any news on this matter yet?

Original comment by schmid...@gmail.com on 2 Sep 2011 at 7:15

GoogleCodeExporter commented Mar 24, 2015

It looks like you'll have to use a script/frontend to get this behavior. While 
it would be nice to have it built in, it really makes sense to keep it separate 
because then you can easily swap xdelta for some other binary diff executable 
that operates on one file at a time. In other words, binary diffing whole 
directories is a behavior that's agnostic of binary diffing algorithms.

I wrote one of my own, similar to the ones above, but never got it polished 
enough for release. I will post a link here if I ever finish it. My approach 
was to just mirror the directory structure in a tar.bz2 archive full of xdelta 
patch files and placeholders to signify creation or deletion of elements in the 
hierarchy.

In short, I don't think this feature is going to be added to the codebase here.

Original comment by eli...@gmail.com on 2 Sep 2011 at 11:38
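The frontend approach described in the comment above can be sketched roughly as follows. This is a minimal illustration in Python, not any of the attached scripts: the function names `list_files` and `plan_directory_delta` are hypothetical, and the sketch only decides which files were added, deleted, changed, or left alone. Producing the per-file patches for the "changed" entries would be left to xdelta or any other single-file binary diff tool.

```python
import filecmp
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def plan_directory_delta(old_dir, new_dir):
    """Classify every file as added, deleted, changed, or unchanged."""
    old_files = list_files(old_dir)
    new_files = list_files(new_dir)
    plan = {
        "added": sorted(new_files - old_files),    # ship these whole
        "deleted": sorted(old_files - new_files),  # record as placeholders
        "changed": [],                             # diff these one by one
        "unchanged": [],                           # nothing to ship
    }
    for rel in sorted(old_files & new_files):
        same = filecmp.cmp(
            os.path.join(old_dir, rel),
            os.path.join(new_dir, rel),
            shallow=False,  # compare contents, not just size and mtime
        )
        plan["unchanged" if same else "changed"].append(rel)
    return plan
```

Because the plan is independent of the diff tool, the same classification could feed xdelta, JojoDiff, or anything else that operates on one file at a time.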

GoogleCodeExporter commented Mar 24, 2015

Unfortunately, scripts/frontends make things too clumsy for distributing
patches to end users. It's even worse when you have users on various different
platforms.

I also tried the tar approach, but it proved impractical for distributing
patches to end users as well: tar stores the username in the tar file, so
xdelta always complains that the target file will not receive the patch because
the checksums don't match. Disabling the checksum means you must require the
user to run a checksum (e.g. SHA-1) to check that the resulting file is really
correct. IOW, it's clumsy and error-prone again.

Original comment by schmid...@gmail.com on 3 Sep 2011 at 11:03
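For reference, the manual verification step mentioned above (checking the patched file against a known SHA-1) could be automated along these lines. This is only a sketch using Python's standard `hashlib`; the helper names `file_sha1` and `verify_patched_file` are made up for illustration, and the expected digest would have to be published alongside the patch.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_patched_file(path, expected_sha1):
    """True if the file's SHA-1 matches the digest shipped with the patch."""
    return file_sha1(path) == expected_sha1.strip().lower()
```

This does not remove the objection raised in the comment: the end user still needs something on their machine to run the check.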

GoogleCodeExporter commented Mar 24, 2015

Issue 134 has been merged into this issue.

Original comment by josh.mac...@gmail.com on 18 Jun 2012 at 12:33

GoogleCodeExporter commented Mar 24, 2015

> It's still something I'd like to handle, but I have to focus on the core
> issues / bugs first.
> It will be easier to do something like this as a separate program, maybe in
> Python.


The problem with Python is the same as any other external solution: it's too 
clumsy for end users on multiple platforms. Most systems don't have a native 
Python installation, and requiring them to download/install it just to 
distribute a small patch completely ruins any advantages of distributing the 
small patch.


Original comment by schmid...@gmail.com on 25 Sep 2012 at 5:18

GoogleCodeExporter commented Mar 24, 2015

The reason this hasn't been implemented in 5 long years is probably because of 
git (the source code management system). Git was introduced in 2005, and 
evolved into a strong suite of version control tools (it is really just a very
competent filesystem at heart, but that's another story).

The fact that Git is a source code management system should let you know that 
it handles any manner of nested folder structures. That is something that 
Xdelta can't do, and something that is so often needed (we handle collections, 
work doesn't just involve single files).

Git lets you create and apply binary patches. That is, you can make changes to 
a graphics file (say jpg), send the binary diff over to someone, and have him 
apply the patch and get the new image. The binary diff is obviously many times 
smaller in size than the whole graphics files.

Git stores your changes in deltas. Thus, a 40KB file that is changed 10 times 
may be stored in much much less than 400KB of space. And here's the pleasant 
surprise: Git potentially (mostly, really) stores deltas more efficiently than 
CVS, SVN, and even your (schmid...) envisioned system of tying deltas to "older 
versions of the same file".

Let's explain. A 40KB file is first stored in 40KB of space. A tiny change will 
be stored in a tiny delta, say 1KB. But what if you did a massive change? The 
delta could be 30KB! Or even 40KB if you plonked in a totally different file.

The same huge delta happens when you drastically change the size of said file, 
say from 40KB to 1KB or to 400KB.

Git will hunt your collection for a better "delta base", a base from which to 
build a delta. Say you changed your file from 40KB to 1KB. Git may hunt for 
another file that is 2KB in size, and manage to calculate a small 1KB delta. 
Thus, Git avoids calculating a huge (30-40KB) delta based on the original 40KB 
file.

Hope the above is clear. Git is actually a really competent filesystem 
(database system). A great version control mechanism just happens to be built 
on top of it. Anything can be built on top of a great database system.

Original comment by jhannw...@gmail.com on 1 Feb 2013 at 4:31
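The "delta base" idea in the comment above can be illustrated with a toy sketch. To be clear about the assumptions: this is not Git's actual packing heuristic and not xdelta's encoder. It uses zlib's preset-dictionary feature as a rough stand-in for "how small is the delta of the target against this base", and the names `delta_size` and `pick_delta_base` are hypothetical. Because deflate uses a 32 KiB window, only about the last 32 KiB of a base helps here, so the sketch is meaningful only for small files.

```python
import zlib
from typing import Mapping


def delta_size(base: bytes, target: bytes) -> int:
    """Approximate the delta size as target compressed with base as dictionary."""
    comp = zlib.compressobj(level=9, zdict=base)
    return len(comp.compress(target) + comp.flush())


def pick_delta_base(candidates: Mapping[str, bytes], target: bytes) -> str:
    """Return the name of the candidate that yields the smallest approximate delta."""
    return min(candidates, key=lambda name: delta_size(candidates[name], target))
```

The point of the sketch is only that the best base for a new version is not necessarily the previous version of the same file; any object with enough shared content can serve.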

GoogleCodeExporter commented Mar 24, 2015

Comparing git to xdelta seems inappropriate to me. I can't speak for everyone, 
but to me, the appeal of having xdelta act recursively is that it would fulfill 
the need for a FOSS binary patching tool with a small footprint. I don't know 
if xdelta's binary diffing is better, worse, equivalent, or just different than 
git's, but trying to distribute patches using git is never going to be as clean 
and straightforward as e.g. "xdelta the.patch /path/to/game/or/app", since 
that's nowhere near the use-case git was designed for. The binary diffing 
algorithms used don't really matter, as they can always be interchanged with 
superior ones as they come along. Fundamentally, it's the user interfaces that 
differ, and the workflows that go with them.

It's always bugged me how much bandwidth games tend to waste in distributing 
their updates. A few devs do it right; using the bittorrent protocol is 
probably the easiest way (Blizzard did/does), but I think the ever-increasing 
availability of broadband has rendered the issue moot for both users and devs.

Original comment by eli...@gmail.com on 2 Feb 2013 at 1:09

GoogleCodeExporter commented Mar 24, 2015

Hi, everybody!
The Scarab library uses xdelta (as an option) and builds diffs for directories.
Check it out here: 
https://github.com/loyso/Scarab
Any comments are welcome!

Original comment by loys...@gmail.com on 9 Mar 2013 at 12:07

sgnn7 commented Apr 29, 2015

If there's a way to have some of the code from https://github.com/endlessm/xdelta3-dir-patcher integrated into this tool, you might have an easy win.

tramseyer commented Oct 23, 2017

The way I do it with tools which are currently available is as follows:

  • Create both uncompressed tars (GNU tar 1.28 or newer) with --sort=name and optionally with --mtime=1970-01-01.
  • Create an xdelta3 diff from one tar to the other.

If the directory structure in the tar changes much, there might be more sophisticated methods.
But if some files and/or folders are just added or removed it works quite well.
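Where GNU tar 1.28 is not available, a similar effect can be approximated with Python's standard `tarfile` module. The following is a sketch under stated assumptions, not a drop-in replacement for the commands above: the function name `write_deterministic_tar` is hypothetical, the member order is Python's sorted order (which may differ from what `--sort=name` produces), empty directories are skipped, and the owner fields are blanked in addition to the timestamp so that the username issue raised earlier in the thread does not change the archive bytes.

```python
import os
import tarfile


def _normalize(info):
    """Strip metadata that varies between machines and runs."""
    info.mtime = 0                  # like --mtime=1970-01-01
    info.uid = info.gid = 0
    info.uname = info.gname = ""    # do not record the user or group name
    return info


def write_deterministic_tar(src_dir, tar_path):
    """Write an uncompressed tar of the files under src_dir in a stable order."""
    with tarfile.open(tar_path, "w", format=tarfile.GNU_FORMAT) as tar:
        for dirpath, dirnames, filenames in os.walk(src_dir):
            dirnames.sort()  # sorting in place fixes the traversal order
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                arcname = os.path.relpath(full, src_dir)
                tar.add(full, arcname=arcname, recursive=False,
                        filter=_normalize)
```

Two such archives, one per version, can then be handed to xdelta3 as the source and target files.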

i30817 commented Jan 17, 2018

If this feature is worked on, I'd like the option of sorting the underlying tar (or whatever container format ends up being used to multiplex multiple files into just one patch) by file size rather than name, for the case where the source and destination have differently named files that have several (but often not all) bytes in common.

This often happens with multiple formats of the same thing, and I expect byte size is a better indicator of relative order than name in at least some of these cases. (It should be an option because the name is often the better key when it's not truly random, even for sources and destinations with different numbers of files, so name should be the default.) I actually don't know if this is relevant for xdelta compression, but I'd expect so, considering how all the other approaches tend to treat equal filenames as the supreme indicator of which bytes should be patched.
