Commits on Sep 6, 2010
  1. handle uid/gid == -1 on cygwin

    On cygwin, the uid or gid might be -1 for some reason.  struct.pack()
    emits a DeprecationWarning when packing a negative number into an
    unsigned int, so fix it up first.
    Signed-off-by: Avery Pennarun <>
    committed Sep 6, 2010
  2. cmd/memtest: use getrusage() instead of /proc/self/stat.

    Only Linux has /proc/self/stat, so 'bup memtest' didn't work on anything
    except Linux.  Unfortunately, getrusage() on *Linux* doesn't have a valid
    RSS field (sigh), so we have to use /proc/self/stat as a fallback if it's
    Now memtest works on MacOS as well, which means 'make test' passes again.
    (It stopped passing because 'bup memtest' recently got added to one of the
    Signed-off-by: Avery Pennarun <>
    committed Sep 5, 2010
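    The fix-up described in "handle uid/gid == -1 on cygwin" can be sketched
    roughly as follows.  This is a minimal illustration, not bup's actual code:
    the helper name pack_owner, the '!II' format and the choice of 0 as the
    substitute value are all assumptions.

```python
import struct

# Hypothetical record layout: two unsigned 32-bit ints, network byte order.
OWNER_FORMAT = '!II'


def pack_owner(uid, gid, substitute=0):
    """Pack uid and gid as unsigned ints, replacing negative values first."""
    # Assumption: any negative id (such as -1 on cygwin) maps to `substitute`.
    if uid < 0:
        uid = substitute
    if gid < 0:
        gid = substitute
    return struct.pack(OWNER_FORMAT, uid, gid)


print(struct.unpack(OWNER_FORMAT, pack_owner(1000, -1)))
```

    On current Python 3 versions, packing a negative number into an unsigned
    field raises struct.error instead of warning, so a fix-up like this is
    still needed before the call.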
Commits on Sep 5, 2010
  1. @davidcroda

    cmd/index: catch exception for paths that don't exist.

    Rather than aborting completely if a path specified on the command line
    doesn't exist, report it as a non-fatal error instead.
    (Heavily modified by apenwarr from David Roda's original patch.)
    Signed-off-by: David Roda <>
    Signed-off-by: Avery Pennarun <>
    davidcroda committed with Aug 31, 2010
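    The behaviour described in "cmd/index: catch exception for paths that don't
    exist" might look something like the sketch below.  The function name
    stat_paths and the shape of its return value are hypothetical; the only
    point is that a missing path is recorded and skipped rather than aborting
    the whole run.

```python
import os


def stat_paths(paths):
    """Return (found, errors) for a list of paths.

    found holds (path, stat_result) pairs for paths that exist; errors holds
    one message per path that could not be examined.
    """
    found = []
    errors = []
    for path in paths:
        try:
            found.append((path, os.lstat(path)))
        except OSError as e:
            # Non-fatal: remember the problem and carry on with the rest.
            errors.append('%s: %s' % (path, e.strerror))
    return found, errors
```

    A caller could print the collected errors at the end and still exit with a
    non-zero status if the list is not empty.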
Commits on Sep 4, 2010
  1. Don't use $(wildcard) during 'make install'.

    It seems the $(wildcard) is evaluated once at make's startup, so any changes
    made *during* build don't get noticed.
    That means 'make install' would fail if you ran it without first running
    'make all', because $(wildcard cmd/bup-*) wouldn't match anything at startup
    time; the files we were copying only got created during the build.
    Problem reported by David Roda.
    Signed-off-by: Avery Pennarun <>
    committed Sep 4, 2010
  2. Don't forget to install _helpers.dll on cygwin.

    We were installing *.so, but not *$(SOEXT) like we should have.  Now we do,
    which should fix some cygwin install problems reported by David Roda.
    Also, when installing *.so and *.dll files, make them 0755 instead of 0644,
    also to prevent permissions problems on cygwin, also reported by David Roda.
    Signed-off-by: Avery Pennarun <>
    committed Sep 4, 2010
Commits on Sep 2, 2010
  1. recover more elegantly if a MIDX file has the wrong version.

    Previously we'd throw an assertion for any too-new-format MIDX file, which
    isn't so good.  Let's recover more politely (and just ignore the file in
    question) if that happens.
    Noticed by Zoran Zaric who was testing my midx3 branch.
    Signed-off-by: Avery Pennarun <>
    committed Sep 2, 2010
  2. cmd/midx: add a new --max-files parameter.

    Zoran reported that 'bup midx -f' on his system tried to open 3000 files at
    a time and wouldn't work.  That's no good, so let's limit the maximum files
    to open; the default is 500 for now, since that ought to be usable for
    normal people.  Arguably we could use getrlimit() or something to find out
    the actual maximum, or just keep opening stuff until we get an error, but
    maybe there's no point.
    Unfortunately this patch isn't really perfect, because it limits the
    usefulness of midx files.  If you could merge midx files into other midx
    files, then you could at least group them all together after multiple runs,
    but that's not currently supported.
    Signed-off-by: Avery Pennarun <>
    committed Sep 2, 2010
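    One way to honor a limit like --max-files is to process the input in
    batches, so that no more than the limit is ever open at once.  The sketch
    below only shows the batching step; the function name group_files is
    hypothetical and this is not necessarily how 'bup midx' does it.

```python
def group_files(filenames, max_files=500):
    """Yield successive lists containing at most max_files names each."""
    if max_files < 1:
        raise ValueError('max_files must be at least 1')
    for start in range(0, len(filenames), max_files):
        yield filenames[start:start + max_files]


# 3000 inputs with the default limit of 500 give six batches.
batches = list(group_files(['pack-%d.idx' % i for i in range(3000)]))
print(len(batches))
```

    Each batch would produce its own midx file, which is exactly the
    limitation the commit message points out: without midx-into-midx merging,
    the results of separate batches stay separate.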
Commits on Aug 27, 2010
  1. keep statistics on how much sha1 searching we had to do.

    And cmd/memtest prints out the results.  Unfortunately this slows down
    memtest runs by 0.126/2.526 = 5% or so.  Yuck.  Well, we can take it out
    Signed-off-by: Avery Pennarun <>
    committed Aug 26, 2010
  2. cmd/memtest: add a --existing option to test with existing objects.

    This is useful for testing behaviour when we're looking for objects
    that *do* exist.  Of course, it just goes through the objects in order, so
    it's not actually that realistic.
    Signed-off-by: Avery Pennarun <>
    committed Aug 26, 2010
Commits on Aug 26, 2010
  1. cmd/midx: fix SHA_PER_PAGE calculation.

    For some reason we were dividing by 200 instead of by 20, which was way off.
    Switch to 20 instead.  Suspiciously, this makes memory usage slightly worse
    in my current (smallish) set of test data, so we might need to revert it
    later...?  But if we're going to have an adjustment, we should at least make
    it clear what for, rather than hiding it in something that looks
    suspiciously like a typo.
    Signed-off-by: Avery Pennarun <>
    committed Aug 25, 2010
  2. cmd/margin: add a new --predict option.

    When --predict is given, it tries to guess the offset in the indexfile of
    each hash, based on the assumption that the hashes are distributed evenly
    throughout the file.  Then it prints the maximum amount by which this guess
    deviates from reality.
    I was hoping the results would show that the maximum deviation in a typical
    midx was less than a page's worth of hashes; that would mean the toplevel
    lookup table could be redundant, which means fewer pages hit in the
    common case.  No such luck, unfortunately; with 1.6 million objects, my
    maximum deviation was 913 hashes (about 18 kbytes, or 5 pages).
    By comparison, midx files should hit about 2 pages in the common case (1
    lookup table + 1 data page).  Or 3 pages if we're unlucky and the search
    spans two data pages.
    Signed-off-by: Avery Pennarun <>
    committed Aug 25, 2010
  3. cmd/memtest: print per-cycle and total times.

    This makes it easier to compare output from other people or between
    machines, and also gives a clue as to swappiness.
    Signed-off-by: Avery Pennarun <>
    committed Aug 25, 2010
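    The measurement described in "cmd/margin: add a new --predict option" can
    be sketched as follows.  This is an illustration under stated assumptions,
    not the code from cmd/margin: hashes are modelled as plain integers of a
    fixed bit width, and the name max_prediction_error is hypothetical.

```python
def max_prediction_error(hashes, bits=32):
    """Return the largest |guessed index - actual index| over sorted hashes.

    hashes must be a sorted list of ints in the range [0, 2**bits).  The guess
    assumes the values are spread evenly across that range.
    """
    count = len(hashes)
    worst = 0
    for actual, h in enumerate(hashes):
        # Scale the hash value from [0, 2**bits) down to [0, count).
        guess = (h * count) >> bits
        worst = max(worst, abs(guess - actual))
    return worst


# Perfectly even spacing is predicted exactly.
even = [i << 28 for i in range(16)]
print(max_prediction_error(even))
```

    For the figures quoted in the commit, 913 hashes at 20 bytes per SHA-1 is
    18260 bytes; with 4096-byte pages (an assumption about the page size) that
    is a little under 4.5 pages, which rounds up to the 5 pages mentioned.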
Commits on Aug 23, 2010
  1. Rename to

    Okay, wasn't a good choice of names.  Partly because not
    everything in there is just to make stuff faster, and partly because some
    *proposed* changes to it don't just make stuff faster.  So let's rename it
    one more time.  Hopefully the last time for a while!
    Signed-off-by: Avery Pennarun <>
    committed Aug 22, 2010
Commits on Aug 22, 2010
  1. @lelutin

    lib/bup/ssh: Add docstrings

    Document the code with docstrings.
    Also add an "import sys" line since it is used by sys.argv[0] on line 6.
    Signed-off-by: Gabriel Filion <>
    lelutin committed with Aug 15, 2010
  2. @lelutin

    lib/bup/options: Add docstrings

    Document the code with docstrings.
    Use one line per imported module as recommended by PEP 8 to make it
    easier to spot unused modules.
    Signed-off-by: Gabriel Filion <>
    lelutin committed with Aug 15, 2010
  3. @lelutin

    import cleanup

    Remove unused imported modules.
    I started using the pyflakes.vim plugin and it automagically shows a
    bunch of problems/uncleanliness in the code. It helped me pull this out
    in 15mins.
    This change shouldn't have any impact on performance or functionality
    but it makes the code cleaner.
    Signed-off-by: Gabriel Filion <>
    lelutin committed with Aug 15, 2010
  4. cmd/ftp: don't die if we can't import the ctypes module.

    It's only needed on some rare broken versions of readline anyway.  If we
    can't find the module, chances are the system doesn't have that broken
    version of readline.
    Based on suggestions by Gabriel Filion and Aaron Ucko.
    Signed-off-by: Avery Pennarun <>
    committed Aug 21, 2010
  5. @lelutin

    lib/bup/vfs: bring back Python 2.4 support

    There is currently one test failure when running tests against Python
    2.4: a try..except..finally block that's interpreted as a syntax error.
    The commit introducing this incompatibility with 2.4 is f77a082
    This is a well known python 2.4 limitation and the workaround, although
    ugly, is easy.
    With this test passing, Python 2.4 support is back.
    Signed-off-by: Gabriel Filion <>
    lelutin committed with Aug 20, 2010
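    The workaround mentioned in "lib/bup/vfs: bring back Python 2.4 support" is
    usually written by nesting a try..except inside a try..finally, since
    Python 2.4 does not accept all three clauses on one try statement.  The
    function below is a made-up example of that shape, not the code from
    lib/bup/vfs.

```python
def safe_divide(a, b, log):
    """Divide a by b, logging errors, and always log completion."""
    try:
        try:
            return a / b
        except ZeroDivisionError:
            # Inner block: handles the exception.
            log.append('error')
            return None
    finally:
        # Outer block: runs whether or not the inner block raised.
        log.append('done')


log = []
safe_divide(1, 0, log)
print(log)
```

    From Python 2.5 onwards the two levels can be collapsed into a single
    try..except..finally statement with the same behaviour.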
Commits on Aug 11, 2010
  1. @lelutin

    lib/bup/vfs: Add docstrings

    Since the vfs module uses the function git._treeparse, it should not be
    named as if it was a private function. Rename git._treeparse to
    git.treeparse and document it (add a docstring to it).
    Also, transform _ChunkReader, _FileReader and Node into new-style
    Finally, remove trailing spaces from lib/bup/ .
    lelutin committed with Aug 2, 2010
Commits on Aug 2, 2010
  1. DESIGN: update mentions of stupidsum to reflect new rollsum algorithm.

    Pointed out by Gabriel Filion.
    Signed-off-by: Avery Pennarun <>
    committed Aug 1, 2010
  2. README: typo.

    Noticed by Zoran Zaric.
    Signed-off-by: Avery Pennarun <>
    committed Aug 1, 2010
Commits on Jul 31, 2010
  1. cmd/save: update the progress meter less often.

    If you ran 'bup save' in an ssh session, you could end up sending huge
    amounts of data back over ssh *just* to update the progress meter after
    every single block!  Oops.  Limit the updates to only about 5 per second,
    which is much better.
    committed Jul 31, 2010
  2. Rename to, and move bupsplit into its own source file.

    A lot of stuff in _hashsplit.c wasn't actually about hashsplitting; it was
    just a catch-all for all our C accelerator functions.  Now the module name
    reflects that.
    Also move the bupsplit functions into their own non-python-dependent C
    source file so they can be used as part of other projects.
    Signed-off-by: Avery Pennarun <>
    committed Jul 30, 2010
  3. check the return code of 'bup random'

    Signed-off-by: Avery Pennarun <>
    committed Jul 30, 2010
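    The rate limit described in "cmd/save: update the progress meter less
    often" could be implemented along these lines.  The class name
    ProgressThrottle and its injectable clock are hypothetical conveniences
    for illustration; only the figure of about 5 updates per second comes from
    the commit message.

```python
import time


class ProgressThrottle:
    """Allow an action at most max_per_second times per second."""

    def __init__(self, max_per_second=5, clock=time.time):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock  # injectable so the behaviour can be tested
        self.last = None    # time of the last permitted update

    def ready(self):
        """Return True if enough time has passed since the last update."""
        now = self.clock()
        if self.last is None or now - self.last >= self.min_interval:
            self.last = now
            return True
        return False
```

    A save loop would call ready() after each block and redraw the progress
    line only when it returns True, so the amount of terminal output no longer
    grows with the number of blocks.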
Commits on Jul 28, 2010
  1. support for putting default values in [square brackets].

    This looks good in the usage message, and is a better place to hardcode such
    things than in the code itself.
    Signed-off-by: Avery Pennarun <>
    committed Jul 16, 2010
  2. _hashsplit.c: get rid of some warnings indicated by a C++ compiler.

    Not hugely important, but might as well fix them.
    Signed-off-by: Avery Pennarun <>
    committed Jul 27, 2010
  3. _hashsplit.c: replace the stupidsum algorithm with rsync's adler32-based one.

    I've been meaning to do this for a while, but a particular test dataset that
    really caused problems with stupidsum() (ie. it split things into way more
    chunks than it should have) finally screwed me over.  Let's change over to a
    "real" checksum algorithm.
    Non-annoying datasets shouldn't be noticeably affected, but bad ones (such
    as my test case from EQL Data) can be 10x more sensible.  Typical backup
    sets now have about 20% fewer chunks, although this has little effect on the
    overall repository size.
    WARNING: After this patch, all your chunk boundaries will be different from
    before!  That means your incremental backups won't be terribly incremental
    and your backup repositories will jump in size.  This should only happen
    Signed-off-by: Avery Pennarun <>
    committed Jul 27, 2010
  4. _hashsplit.c: switch rollsum_roll() to a macro instead of an inline f…

    gcc 4.3's optimizer manages to fail at optimizing the inline, but works okay
    with the macro.
    Mysteriously, if find_ofs() is *not* static (and therefore presumably
    *harder* to optimize), the optimizer works either way.  But removing the
    static is just wrong, so use the macro instead.
    The difference in speed is about 53 megs/sec vs 80 megs/sec on my machine
    for this command:
    	bup random 100M 2>/dev/null | bup split -N --bench
    Signed-off-by: Avery Pennarun <>
    committed Jul 27, 2010
  5. _hashsplit.c: refactor a bit, and add a self-test.

    In preparation for replacing the stupidsum algorithm with the rsync
    adler32-based one.
    Signed-off-by: Avery Pennarun <>
    committed Jul 27, 2010
  6. make clean: remove some leftover files.

    Stuff has moved around a bit recently, and we weren't cleaning up everything
    like we should.
    committed Jul 28, 2010
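    For readers unfamiliar with the rsync-style checksum mentioned in
    "_hashsplit.c: replace the stupidsum algorithm", here is a minimal sketch
    of a rolling sum of that general kind.  It is not a port of _hashsplit.c:
    the 64-byte window, the 16-bit masking and the class name RollingSum are
    assumptions made for the example.

```python
class RollingSum:
    """Rolling checksum over the last `window` bytes (rsync/Adler-32 style)."""

    def __init__(self, window=64):
        self.window = window
        self.buf = bytearray(window)  # circular buffer, initially all zeros
        self.pos = 0
        self.s1 = 0  # plain sum of the bytes in the window
        self.s2 = 0  # position-weighted sum of the bytes in the window

    def roll(self, byte):
        """Slide the window forward by one byte in constant time."""
        drop = self.buf[self.pos]
        self.s1 = (self.s1 + byte - drop) & 0xffff
        self.s2 = (self.s2 + self.s1 - self.window * drop) & 0xffff
        self.buf[self.pos] = byte
        self.pos = (self.pos + 1) % self.window

    def digest(self):
        return (self.s2 << 16) | self.s1


def direct_sum(data, window=64):
    """Compute the same value from scratch, for checking roll()."""
    tail = (bytes(window) + bytes(data))[-window:]
    s1 = sum(tail) & 0xffff
    s2 = sum((window - i) * b for i, b in enumerate(tail)) & 0xffff
    return (s2 << 16) | s1
```

    The useful property is that roll() costs the same no matter how large the
    window is, while direct_sum() has to revisit every byte in it.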
Commits on Jul 27, 2010
  1. @lelutin

    cmd/web: hide .dotfiles by default

    Make all files beginning with a dot be hidden by default. The hidden
    files can be shown by giving the argument "hidden" with a value of 1 in
    the URL.
    Also, in _compute_dir_contents, remove the line "contents = []" since it
    is never used.
    Finally add a "Show/Hide hidden files" link on the pages where content
    is hidden.
    Signed-off-by: Gabriel Filion <>
    lelutin committed with Jul 27, 2010
  2. @lelutin

    cmd/ftp: exit cleanly on Ctrl-C

    bup ftp currently does not handle KeyboardInterrupt exceptions.
    Simply call handle_ctrl_c() at the beginning of the file to make the
    command exit without a stacktrace.
    lelutin committed with Jul 27, 2010
  3. @lelutin

    cmd/ftp: Hide .dotfiles by default (-a shows them)

    Normally in FTP sites, files beginning with a dot are hidden from a list
    (ls) command by default. Also, using the argument '-a' makes the list
    show hidden files.
    The current 'bup ftp' implementation does not behave that way. Make it hide
    hidden files by default, as expected, and show hidden files when '-a' or
    '--all' is specified to the 'ls' command.
    All unknown switches will make bup ftp show the ls command usage.
    Users can also give 'ls --help' to obtain the usage string.
    Signed-off-by: Gabriel Filion <>
    lelutin committed with Jul 26, 2010
  4. @lelutin

    lib/options: Add an onabort argument to Options()

    Sometimes, we may want to parse a list of arguments and not have the
    call to Options.parse() exit the program when it finds an unknown
    Add an argument to the class' __init__ method that can be either a
    function or a class (must be an exception class). If calling the
    function or class constructor returns an object, this object will be
    raised on abort.
    Also add a convenience exception class named Fatal that can be
    passed to Options() to exclusively catch situations in which
    Options.parse() would have caused the program to exit.
    Finally, set the default value of the onabort argument to call
    sys.exit(97) as was previously the case.
    Signed-off-by: Gabriel Filion <>
    lelutin committed with Jul 26, 2010
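    The onabort mechanism described in "lib/options: Add an onabort argument to
    Options()" can be illustrated with a toy parser.  MiniOptions below is a
    hypothetical stand-in, not bup's Options class, and its constructor
    signature is an assumption; only the Fatal name, the raise-what-onabort-
    returns rule and the sys.exit(97) default are taken from the commit
    message.

```python
import sys


class Fatal(Exception):
    """Raised instead of exiting when passed as onabort."""


def _default_onabort(msg):
    # Default behaviour: exit the program with status 97.
    sys.exit(97)


class MiniOptions:
    """Toy option parser with an onabort hook (illustration only)."""

    def __init__(self, known, onabort=_default_onabort):
        self.known = set(known)
        self.onabort = onabort

    def parse(self, args):
        for arg in args:
            if arg.startswith('-') and arg not in self.known:
                # onabort may be a function or an exception class; whatever
                # object it returns is raised.
                err = self.onabort('unknown option: %s' % arg)
                if err is not None:
                    raise err
        return list(args)
```

    An interactive command can then wrap parse() in "except Fatal" to show a
    usage message and keep running, while command-line tools keep the default
    and exit as before.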