Handling Directories

Paul Ruane edited this page Jan 9, 2017 · 2 revisions

There are two ways to handle directories: by tagging just the directory or tagging the directory and all of its contents.

Tagging Just a Directory

TMSU, by default, does not tag a directory's contents.

$ tmsu tag /tmp temporary-files

This adds only the /tmp directory to the database. TMSU does not look inside the directory, which keeps the tagging operation fast and the database small. However, querying the database will show only the directory entry and not the directory's contents:

$ tmsu files temporary-files
/tmp

To list the contents of the matching directories, combine the query with a call to 'find':

$ tmsu files temporary-files | xargs find
/tmp
/tmp/banana
/tmp/cucumber
/tmp/cucumber/kirby

Recursively listing files in this manner can be slow, especially if the directories contain many files or if the filesystem is inherently slow or accessed across a network.
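
Note that a bare 'xargs find' splits its input on whitespace, so a tagged directory whose path contains a space would be passed to 'find' as several separate arguments. A minimal sketch of a more defensive variant follows. The name list_contents is a hypothetical helper, not part of TMSU, and the sketch assumes that paths arrive one per line and contain no newline characters.

```shell
# list_contents: hypothetical helper, not part of TMSU.
# Reads one path per line from standard input and lists each path
# recursively with find, keeping paths that contain spaces intact.
list_contents() {
    while IFS= read -r path; do
        find "$path"
    done
}

# Intended usage with the query from the example above:
#   tmsu files temporary-files | list_contents
```

This runs one 'find' per tagged directory, so it is no faster than the original pipeline. It only changes how the paths are split.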

If you tag just the directory entry, TMSU does not fingerprint the files in the directory, so it cannot report duplicate files. Although the file entries are not shown directly in the virtual filesystem, it is still possible to navigate to the files via the directory's symbolic link.

  • Quicker to tag
  • Smaller database
  • No duplicate file detection
  • Slower to list tagged files
  • Files not shown in the virtual filesystem (but are still accessible)

Tagging a Directory and its Contents

$ tmsu tag --recursive /tmp temporary-files

Tagging the directory recursively adds every file in that directory to the database, so it can be slow and results in a considerably larger database than tagging the directory entry alone. However, that cost is borne only once: subsequent queries do not need to consult the filesystem:

$ tmsu files temporary-files
/tmp
/tmp/banana
/tmp/cucumber
/tmp/cucumber/kirby
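
Because a recursive tag adds an entry for everything beneath the directory, it can be useful to gauge the size of a tree before committing to it. The sketch below uses only 'find' and 'wc', not TMSU itself. The name count_entries is a hypothetical helper, and the result is only an estimate of how many entries would be added (names containing newlines would be over-counted).

```shell
# count_entries: hypothetical helper, not part of TMSU.
# Prints the number of filesystem entries (the directory itself plus
# everything beneath it) that find reports for the given directory.
count_entries() {
    # $(( ... )) strips the padding some wc implementations add.
    echo $(( $(find "$1" | wc -l) ))
}

# Example: count_entries /tmp
```

For the /tmp tree shown above, this would report the same four entries that the query lists.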

In addition, because the files are in the database, they are shown directly in the virtual filesystem, though this may clutter the tag directories. TMSU is also able to identify duplicate files, as a fingerprint of each file is taken when it is added.

  • Slower to tag
  • Larger database
  • Duplicate files detected
  • Quicker to list tagged files
  • Files shown in the virtual filesystem