Commits on Jan 23, 2010
  1. Fix misspelling of DT_UNKNOWN.

    How foolish of me to advertise the fact that I pushed a commit without
    compiling it first...
    cworth-gh committed Jan 23, 2010
  2. README: Tighten up the text a bit.

    As Keith pointed out, (with a humorous citation from Mark Twain),
    the two uses of "very" added nothing to the description. Also,
    "large collection of email" was repeated uselessly.
    cworth-gh committed Jan 23, 2010
  3. Add some comments to document the recently-fixed handling of d_type.

    The fix was subtle, (requiring less code than originally expected), so
    it behooves us to document it well.
    cworth-gh committed Jan 23, 2010
  4. notmuch new: Fix to work on filesystems returning DT_UNKNOWN

    Such as reiserfs or xfs. This has been broken since the merge of
    support for rename and deletion of files from the mail store.
    
    Here's the original justification for the patch:
    
    A review of notmuch-new.c shows three uses of ->d_type:
    
    Near line 153, in _entries_resemble_maildir() we can simply allow for
    DT_UNKNOWN. This would misfire if people have MH-style folders which
    contain three folders called "new", "cur" and "tmp", in which case the
    "tmp" folder would simply not be scanned, but that seems unlikely.
    
    Near line 273 in add_files_recursive() we have another check. If
    DT_UNKNOWN, we fall through, then add_files_recursive() does a stat
    almost immediately, returning with success if the path isn't a
    directory.
    
    Thus, the fallback is already written.
    
    Finally, near line 343, in add_files_recursive() (a long function) we
    have another check. Here we can simply treat DT_UNKNOWN as DT_LNK, since
    the logic for the stat() results is the same.
    Geo Carncross committed with cworth-gh Jan 21, 2010
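The fallback described in the commit above can be sketched as follows. This is a minimal illustration, not the code from notmuch-new.c: the helper name classify_entry() and the entry_kind enum are hypothetical. It trusts d_type when the filesystem reports a definite type and falls back to stat() for DT_UNKNOWN and DT_LNK.

```c
#define _DEFAULT_SOURCE /* expose the DT_* constants on glibc */
#include <dirent.h>
#include <sys/stat.h>

enum entry_kind { ENTRY_OTHER, ENTRY_FILE, ENTRY_DIR };

/* Hypothetical helper: classify a directory entry, given its full path
 * and the d_type value reported by readdir() or scandir(). */
enum entry_kind
classify_entry (const char *path, unsigned char d_type)
{
    struct stat st;

    /* Definite answers from the filesystem need no extra system call. */
    if (d_type == DT_REG)
        return ENTRY_FILE;
    if (d_type == DT_DIR)
        return ENTRY_DIR;

    /* Anything other than "unknown" or "symlink" is not mail. */
    if (d_type != DT_UNKNOWN && d_type != DT_LNK)
        return ENTRY_OTHER;

    /* DT_UNKNOWN (e.g. reiserfs, xfs) and DT_LNK share the same
     * fallback: stat() follows symlinks and reports the real type. */
    if (stat (path, &st) != 0)
        return ENTRY_OTHER;
    if (S_ISREG (st.st_mode))
        return ENTRY_FILE;
    if (S_ISDIR (st.st_mode))
        return ENTRY_DIR;
    return ENTRY_OTHER;
}
```

Treating DT_UNKNOWN like DT_LNK keeps the fast path free of stat() calls on filesystems that do fill in d_type.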
Commits on Jan 14, 2010
  1. Install zsh completion file

    According to the Debian zsh maintainer Clint Adams, this is the first
    time that a package installs its own completer into zsh. Part of the
    reason this is not usually done is because zsh does not provide a stable
    API.
    
    We agreed to try it, given that notmuch is expected to change quite
    a bit initially. If there are problems or the completer goes stable,
    we'll move it into the upstream zsh repository.
    
    Signed-off-by: martin f. krafft <madduck@debian.org>
    madduck committed Jan 8, 2010
Commits on Jan 10, 2010
  1. notmuch new: Print upgrade progress report as a percentage.

    Previously we were printing a number of messages upgraded so far. The
    original motivation for this was to accurately reflect the fact that
    there are two passes, (so each message is processed twice and it's not
    accurate to represent with a single count). But as it turns out, the
    second pass takes zero time (relatively speaking) so we're still not
    accounting for it.
    
    If nothing else, the percentage-based reporting makes for a cleaner
    API for the progress_notify function.
    cworth-gh committed Jan 10, 2010
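To illustrate why a fraction makes for a simpler callback than a message count, here is a minimal sketch. The names progress_notify_t, process_items() and record_progress() are assumptions made for this example, not notmuch's actual declarations.

```c
#include <stddef.h>

/* Hypothetical callback type: progress is a fraction in [0, 1], so the
 * callback needs no knowledge of passes or message counts. */
typedef void (*progress_notify_t) (void *closure, double progress);

/* Process `total` items, reporting the completed fraction after each. */
void
process_items (size_t total, progress_notify_t notify, void *closure)
{
    size_t done;

    for (done = 0; done < total; done++) {
        /* ... one unit of work would go here ... */
        if (notify)
            notify (closure, (double) (done + 1) / (double) total);
    }
}

/* Example callback: remember the most recent fraction in a double. */
void
record_progress (void *closure, double progress)
{
    *(double *) closure = progress;
}
```

A caller that wants a percentage multiplies the fraction by 100 when printing.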
Commits on Jan 9, 2010
  1. lib: Add non-content terms with a WDF value of 0.

    The WDF is the "within-document frequency" value for a particular
    term. It's intended to provide an indication of how frequent a term is
    within a document, (for use in computing relevance). Xapian's term
    generator already computes WDF values when we use that, (which we do
    for indexing all mail content).
    
    We don't use the term generator when adding single terms for things
    that don't actually appear in the mail document, (such as tags, the
    filename, etc.). In this case, the WDF value for these terms doesn't
    matter much.
    
    But Xapian's flint backend can be more efficient with changes to terms
    that don't affect the document "length". So there's a performance
    advantage for manipulating tags (with the flint backend) if the WDF of
    these terms is 0.
    cworth-gh committed Jan 9, 2010
  2. lib: Explicitly set BoolWeight when searching.

    All notmuch searches currently sort by value (either date or message
    ID) so it's just wasted effort for Xapian to compute relevance values
    for each result. We now explicitly tell Xapian that we're uninterested
    in the relevance values.
    cworth-gh committed Jan 9, 2010
  3. lib: Split the database upgrade into two phases for safer operation.

    The first phase copies data from the old format to the new format
    without deleting anything. This allows an old notmuch to still use the
    database if the upgrade process gets interrupted. The second phase
    performs the deletion (after updating the database version number). If
    the second phase is interrupted, there will be some unused data in the
    database, but it shouldn't cause any actual harm.
    cworth-gh committed Jan 9, 2010
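The ordering described in the commit above can be modelled with a toy in-memory structure. Everything here (struct toy_db, the phase functions, the version constants) is hypothetical and only shows why an interruption between the phases is harmless.

```c
#include <stdbool.h>

enum { OLD_VERSION = 0, NEW_VERSION = 1 };

/* Toy stand-in for a database holding data in two formats. */
struct toy_db {
    int version;
    bool has_old_data;
    bool has_new_data;
};

/* Phase 1: copy old-format data to the new format, deleting nothing.
 * If interrupted here, an old reader still finds everything it needs. */
void
upgrade_phase_copy (struct toy_db *db)
{
    db->has_new_data = true;
}

/* Phase 2: bump the version first, then delete the old-format data.
 * If interrupted after the bump, only unused data is left behind. */
void
upgrade_phase_delete (struct toy_db *db)
{
    db->version = NEW_VERSION;
    db->has_old_data = false;
}

/* An old reader only needs the old-format data to be present. */
bool
old_reader_can_use (const struct toy_db *db)
{
    return db->has_old_data;
}
```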
Commits on Jan 8, 2010
  1. lib: Delete stale timestamp documents during database upgrade.

    Once we move the timestamp to the new directory document, we don't
    need the old one anymore.
    cworth-gh committed Jan 8, 2010
  2. notmuch new: Don't prevent database upgrade from being interrupted.

    Our signal handler is designed to quickly flush out changes and then
    exit. But if a database upgrade is in progress when the user
    interrupts, then we just want to immediately abort. We could do
    something fancy like add a return value to our progress_notify
    function to allow it to tell the upgrade process to abort. But it's
    actually much cleaner and more robust to delay the installation of our
    signal handler so that the default abort happens on SIGINT.
    cworth-gh committed Jan 8, 2010
  3. notmuch new: Fix progress notification on database upgrade.

    This was firing continuously rather than just once per second as
    intended.
    cworth-gh committed Jan 8, 2010
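The commit above does not say how the fix was made. One way to limit reports to once per second is to compare timestamps, sketched below; struct throttle and throttle_should_report() are hypothetical names, and this is not necessarily the mechanism notmuch uses.

```c
#include <stdbool.h>
#include <time.h>

/* Remembers when the last progress report was emitted. */
struct throttle {
    time_t last_report;
};

/* Return true at most once per second of wall-clock time. */
bool
throttle_should_report (struct throttle *t, time_t now)
{
    if (difftime (now, t->last_report) < 1.0)
        return false;

    /* Record the report time so later calls in the same second are
     * suppressed; forgetting this step makes every call report. */
    t->last_report = now;
    return true;
}
```

A caller would pass time (NULL) as `now` from inside its progress callback.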
  4. notmuch new: Automatically upgrade the database if necessary.

    This takes advantage of the recently added library support to detect
    if the database needs to be upgraded and then automatically performs
    that upgrade, (with a nice progress report).
    cworth-gh committed Jan 8, 2010
  5. lib: Implement versioning in the database and provide upgrade function.

    The recent support for renames in the database is our first time
    (since notmuch has had more than a single user) that we have a
    database format change. To support smooth upgrades we now encode a
    database format version number in the Xapian metadata.
    
    Going forward notmuch will emit a warning if used to read from a
    database with a newer version than it natively supports, and will
    refuse to write to a database with a newer version.
    
    The library also provides functions to query the database format
    version:
    
    	notmuch_database_get_version
    
    to ask if notmuch wants a newer version than that:
    
    	notmuch_database_needs_upgrade
    
    and a function to actually perform that upgrade:
    
    	notmuch_database_upgrade
    cworth-gh committed Jan 8, 2010
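The version policy in the commit above (warn when reading a newer database, refuse to write to one, upgrade an older one) can be written as pure logic. This is a sketch under assumed names: the enums and both functions are hypothetical and are not part of the notmuch library.

```c
#include <stdbool.h>

enum open_mode { MODE_READ_ONLY, MODE_READ_WRITE };
enum version_check { VERSION_OK, VERSION_WARN, VERSION_REFUSE };

/* Decide how to treat a database whose format version is db_version
 * when this program natively supports supported_version. */
enum version_check
check_database_version (unsigned int db_version,
                        unsigned int supported_version,
                        enum open_mode mode)
{
    if (db_version <= supported_version)
        return VERSION_OK;

    /* Newer than we understand: reading gets a warning,
     * writing is refused outright. */
    return mode == MODE_READ_WRITE ? VERSION_REFUSE : VERSION_WARN;
}

/* An upgrade is wanted only when the database is older than supported. */
bool
database_needs_upgrade (unsigned int db_version,
                        unsigned int supported_version)
{
    return db_version < supported_version;
}
```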
  6. notmuch new: Fix deletion support to recurse on removed directories.

    Previously, when notmuch detected that a directory had been deleted it
    was only removing files immediately in that directory. We now
    correctly recurse to also remove any directories (and files, etc.)
    within sub-directories, etc.
    cworth-gh committed Jan 8, 2010
  7. TODO: Add a couple of ideas that came up during recent coding.

    The notmuch_query_count_messages function duplicates a lot of code
    undesirably.
    cworth-gh committed Jan 8, 2010
Commits on Jan 7, 2010
  1. Prefer READ_ONLY consistently over READONLY.

    Previously we had NOTMUCH_DATABASE_MODE_READ_ONLY but
    NOTMUCH_STATUS_READONLY_DATABASE which was ugly and confusing. Rename
    the latter to NOTMUCH_STATUS_READ_ONLY_DATABASE for consistency.
    cworth-gh committed Jan 7, 2010
  2. lib: Consolidate checks for read-only database.

    Previously, many checks were deep in the library just before a cast
    operation. These have now been replaced with internal errors and new
    checks have instead been added at the beginning of all top-level entry
    points requiring a read-write database.
    
    The new checks now also use a single function for checking and
    printing the error message. This will give us a convenient location to
    extend the check, (such as based on database version as well).
    cworth-gh committed Jan 7, 2010
  3. lib: Clarify internal documentation of _notmuch_database_filename_to_direntry
    
    The original wording made it sound like this function was just doing
    some string manipulation. But this function actually creates new
    directory documents as a side effect. So make that explicit in its
    documentation.
    cworth-gh committed Jan 7, 2010
  4. notmuch_message_get_filename: Support old-style filename storage.

    When a notmuch database is upgraded to the new database format, (to
    support file rename and deletion), any message documents corresponding
    to deleted files will not currently be upgraded. This means that a
    search matching these documents will find no filenames in the expected
    place.
    
    Go ahead and return the filename as originally stored, (rather than
    aborting with an internal error), in this case.
    cworth-gh committed Jan 7, 2010
Commits on Jan 6, 2010
  1. notmuch new: Never ask the database for any names from a new directory.

    When we know that we are adding a new directory to the database, (and
    we therefore are using inode rather than strcmp-based sorting of the
    filenames), then we *never* want to see any names from the
    database. If we get any names, that could only make us inadvertently
    remove files that we just added.
    
    It's not obvious from the Xapian documentation whether new terms
    being added as part of new documents will appear in the in-progress
    all-terms iteration we are using, (and this might differ based on
    Xapian backend and also might differ based on how many new directories
    are added and whether a flush threshold is reached).
    
    For all of these reasons, we play it safe and use NULL rather than a
    real notmuch_filenames_t iterator in this case to avoid any problem.
    cworth-gh committed Jan 6, 2010
  2. lib: Treat NULL as a valid (and empty) notmuch_filenames_t iterator.

    This will be convenient to avoid some special-casing in higher-level
    code.
    cworth-gh committed Jan 6, 2010
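A NULL-tolerant iterator, as described in the commit above, might look like the sketch below. The array-backed struct names_iter and its functions are hypothetical; the real notmuch_filenames_t is an opaque type with its own accessors.

```c
#include <stddef.h>

/* Hypothetical array-backed iterator over filenames. */
struct names_iter {
    const char **names;
    size_t count;
    size_t pos;
};

/* A NULL iterator is valid and simply has no entries. */
int
names_iter_has_more (const struct names_iter *it)
{
    return it != NULL && it->pos < it->count;
}

/* Return the current name, or NULL if the iterator is NULL or exhausted. */
const char *
names_iter_get (const struct names_iter *it)
{
    return names_iter_has_more (it) ? it->names[it->pos] : NULL;
}

/* Advancing a NULL or exhausted iterator is a harmless no-op. */
void
names_iter_advance (struct names_iter *it)
{
    if (names_iter_has_more (it))
        it->pos++;
}
```

With this convention, a caller can pass NULL for "no names from the database" and run the same loop without a special case.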
  3. notmuch new: Fix bug resulting in file removal on initial build of database.
    
    The bug here was that we would see that the database did not know
    anything about a directory so would get results from the filesystem in
    inode rather than strcmp order.
    
    However, we wouldn't actually ask for the list of files from the
    database until after recursing into the sub-directories. So by the
    time we traverse the filenames looking for deletions, the database
    *does* have entries and we end up detecting erroneous deletions
    because our filename list from the filesystem isn't in strcmp order.
    
    So ask for the list of names from the database before doing any
    additions to avoid this problem.
    cworth-gh committed Jan 6, 2010
  4. notmuch new: Fix to detect deletions of names at the end of the list.

    Previously we only scanned the list of filenames in the filesystem and
    detected a deletion whenever that scan skipped a name that existed in
    the database. That much was fine, but we *also* need to continue
    walking the list of names from the database when the filesystem list
    is exhausted.
    
    Without this, removing the last file or directory within any
    particular directory would go undetected.
    cworth-gh committed Jan 6, 2010
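The scan described in the commit above is a merge of two lists sorted in strcmp order. Here is a minimal sketch using plain string arrays; find_removed() is a hypothetical name, and notmuch itself walks directory entries and database iterators rather than arrays. The final loop is the part the fix adds.

```c
#include <stddef.h>
#include <string.h>

/* Both fs[] and db[] must be sorted in strcmp order. Names present in
 * db[] but missing from fs[] are written to removed[], which must have
 * room for n_db entries. Returns the number of removed names. */
size_t
find_removed (const char **fs, size_t n_fs,
              const char **db, size_t n_db,
              const char **removed)
{
    size_t i = 0, j = 0, n_removed = 0;

    while (i < n_fs && j < n_db) {
        int cmp = strcmp (fs[i], db[j]);

        if (cmp == 0) {          /* known file, still present */
            i++;
            j++;
        } else if (cmp < 0) {    /* on disk only: a new file */
            i++;
        } else {                 /* in database only: deleted */
            removed[n_removed++] = db[j++];
        }
    }

    /* Filesystem list exhausted: every remaining database name is a
     * deletion. Without this loop, removing the last name in a
     * directory would go undetected. */
    while (j < n_db)
        removed[n_removed++] = db[j++];

    return n_removed;
}
```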
  5. notmuch new: Fix regression preventing addition of symlinked mail files.

    As described in the previous commit message, we introduced multiple
    symlink-based regressions in commit
    3df737bc4addfce71c647792ee668725e5221a98
    
    Here, we fix the case of symlinks to regular files by doing an extra
    stat of any DT_LNK files to determine if they do, in fact, link to
    regular files.
    cworth-gh committed Jan 6, 2010
  6. notmuch new: Fix regression preventing recursion through symlinks.

    In commit 3df737bc4addfce71c647792ee668725e5221a98 we switched from
    using stat() to using the d_type field in the result of scandir() to
    determine whether a filename is a regular file or a directory. This
    change introduced a regression in that the recursion would no longer
    traverse through a symlink to a directory. (Since stat() would resolve
    the symlink but with scandir() we see a distinct DT_LNK value in
    d_type).
    
    We fix this for directories by allowing both DT_DIR and DT_LNK values
    to recurse, and then downgrading the existing not-a-directory check
    within the recursion to not be an error. We also add a new
    not-a-directory check outside the recursion that is an error.
    cworth-gh committed Jan 6, 2010
  7. Fix typo in comment.

    The difference between "now" and "not" ends up being fairly dramatic.
    cworth-gh committed Jan 6, 2010
  8. notmuch new: Print counts of deleted and renamed messages.

    It's nice to be able to see a report indicating that the recently
    added support for detecting file rename and deletion is working.
    cworth-gh committed Jan 6, 2010
  9. lib: Indicate whether notmuch_database_remove_message removed anything.

    Similar to the return value of notmuch_database_add_message, we now
    enhance the return value of notmuch_database_remove_message to
    indicate whether the message document was entirely removed (SUCCESS)
    or whether only this filename was removed and the document exists
    under other filenames (DUPLICATE_MESSAGE_ID).
    cworth-gh committed Jan 6, 2010
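The two outcomes described in the commit above can be shown with a toy message that holds a small list of filenames. The struct, the function and the REMOVE_NOT_FOUND value are all hypothetical; only the SUCCESS and DUPLICATE_MESSAGE_ID meanings come from the commit message.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FILENAMES 8

enum remove_status {
    REMOVE_SUCCESS,              /* last filename gone: message removed */
    REMOVE_DUPLICATE_MESSAGE_ID, /* message survives under other names */
    REMOVE_NOT_FOUND             /* hypothetical: filename was not listed */
};

/* Toy message document: one message ID, several filenames. */
struct toy_message {
    const char *filenames[MAX_FILENAMES];
    size_t n_filenames;
};

enum remove_status
toy_remove_filename (struct toy_message *msg, const char *filename)
{
    size_t i;

    for (i = 0; i < msg->n_filenames; i++) {
        if (strcmp (msg->filenames[i], filename) != 0)
            continue;

        /* Fill the hole with the last entry and shrink the list. */
        msg->n_filenames--;
        msg->filenames[i] = msg->filenames[msg->n_filenames];

        return msg->n_filenames == 0 ? REMOVE_SUCCESS
                                     : REMOVE_DUPLICATE_MESSAGE_ID;
    }
    return REMOVE_NOT_FOUND;
}
```

A caller counting deleted versus renamed messages can use the status to tell the two cases apart.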
  10. lib: Update documentation of notmuch_database_add_message.

    Previously, adding a filename with the same message ID as an existing
    message would do nothing. But we recently fixed this to instead add
    the new filename to the existing message document. So update the
    documentation to match now.
    cworth-gh committed Jan 6, 2010
  11. Index content from citations and signatures.

    In the presentation we often omit citations and signatures, but this
    is not content that should be omitted from the index, (especially
    when the citation detection is wrong---see cases where a line
    beginning with "From" is corrupted to ">From" by mail processing
    tools).
    cworth-gh committed Jan 6, 2010
  12. notmuch new: Proper support for renamed and deleted files.

    The "notmuch new" command will now efficiently notice if any files or
    directories have been removed from the mail store and will
    appropriately update its database.
    
    Any given mail message (as determined by the message ID) may have
    multiple corresponding filenames, and notmuch will return one of
    them. When a file is deleted, the corresponding filename will be
    removed from the message in the database. When the last filename is
    removed from a message, that message will be entirely removed from the
    database.
    
    All file additions are handled before any file removals so that rename
    is supported properly.
    cworth-gh committed Jan 6, 2010
  13. notmuch new: Store detected removed filenames for later processing.

    It is essential to defer the actual removal of any filenames from the
    database until we are entirely done adding any new files. This is to
    avoid any information loss from the database in the case of a renamed
    file or directory.
    
    Note that we're *still* not actually doing any removal---still just
    printing messages indicating the filenames that were detected as
    removed. But we're at least now printing those messages at a time when
    we actually *can* do the actual removal.
    cworth-gh committed Jan 6, 2010
  14. notmuch new: Detect deleted (renamed) files and directories.

    This takes advantage of the notmuch_directory_t interfaces added
    recently (with corresponding storage of directory documents in the
    database) to detect when files or entire directories are deleted or
    renamed within the mail store.
    
    This also fixes the recent regression where *all* files would be
    processed by every run of "notmuch new", (now only new files are
    processed once again).
    
    The deleted files and directories are only detected so far. They
    aren't properly removed from the database.
    cworth-gh committed Jan 6, 2010
  15. add_files_recursive: Make the maildir detection more efficient.

    Previously, we were re-scanning the entire list of entries for every
    directory entry. Instead, we can simply check if the entries look like
    a maildir once, up-front.
    cworth-gh committed Jan 6, 2010
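A single up-front pass of the kind described in the commit above could look like this sketch. It is not the code from notmuch-new.c: the function name and the decision to accept DT_UNKNOWN entries are assumptions made for this example.

```c
#define _DEFAULT_SOURCE /* expose the DT_* constants on glibc */
#include <dirent.h>
#include <stdbool.h>
#include <string.h>

/* Scan the entries once and report whether "new", "cur" and "tmp" are
 * all present as directories (or as entries of unknown type). */
bool
entries_resemble_maildir (struct dirent **entries, int count)
{
    int i, found = 0;

    for (i = 0; i < count; i++) {
        unsigned char type = entries[i]->d_type;
        const char *name = entries[i]->d_name;

        if (type != DT_DIR && type != DT_UNKNOWN)
            continue;

        /* Names are unique within a directory, so three matches
         * mean all three maildir subdirectories exist. */
        if (strcmp (name, "new") == 0 ||
            strcmp (name, "cur") == 0 ||
            strcmp (name, "tmp") == 0) {
            if (++found == 3)
                return true;
        }
    }
    return false;
}
```

Calling this once per directory, before the main loop over the entries, avoids re-scanning the list for every entry.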