CoHydra (Collection Hydra)

CoHydra is a Python library for managing multi-headed collections of files. It was created to manage a music collection with audio files, images (e.g., front and back covers of albums), videos (or other extras that come with some albums), CD TOC files, and other files associated with music. Given a collection of files like that, CoHydra can create multiple profiles of the collection. One profile could have only music files and front cover images. Another profile could be derived from that profile, but with the music files recoded to a lower bitrate, for use on a device with too little storage space to hold the entire original collection. Yet another profile could have only video files.

Currently, CoHydra itself provides only the framework to manage a collection. If you want to use it, you'll need to write some code to define how each profile differs from its parent profile. See the example code below.

Dependencies

  • Python, version 3

  • six

  • Possibly a POSIX environment. This has not been tested on other environments.

Terminology

  • collection: An abstract bunch of files.

  • profile: A single head of a multi-headed collection, represented as all the files and sub-directories within a single top-level directory.

  • root profile: A profile with original source files, not generated by CoHydra. It has no parent profile.

  • derived profile: A profile that is generated by CoHydra, using source files from its parent profile.

  • parent profile and child profile: When one profile is derived from another, the former is the parent profile and the latter is the child profile. As with human parents and children, it is possible for a profile to be the child of one profile and the parent of another.

  • ancestor profile and descendant profile: The transitive closures of parent profile and child profile, respectively.

  • source: A file or directory (or symlink) used when generating a profile. The source of a derived profile is the destination of its parent profile. It is very important that a profile never writes to its source; see the warning below.

  • destination: A file or directory (or symlink) generated in a profile by CoHydra. Note that the destination of one profile is the source of all of its child profiles. Where possible, profiles use symlinks from their destinations to their sources to save on disk space.
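
To make the relationship between source and destination concrete, here is a minimal sketch of a destination tree whose files are symlinks back into the source tree. It is not CoHydra's implementation: link_tree is a hypothetical helper written only for this illustration, and it assumes a POSIX environment where os.symlink is available.

```python
import os
import tempfile


def link_tree(src_dir, dst_dir):
  """Mirror src_dir into dst_dir, symlinking files instead of copying.

  Hypothetical helper for illustration; not part of CoHydra's API.
  """
  os.makedirs(dst_dir, exist_ok=True)
  for entry in os.scandir(src_dir):
    dst_path = os.path.join(dst_dir, entry.name)
    if entry.is_dir(follow_symlinks=False):
      # Directories are real in the destination, so that a profile can
      # hold a different set of files than its source.
      link_tree(entry.path, dst_path)
    else:
      # Files (and symlinks) become links back to the source, so they
      # use almost no additional disk space.
      os.symlink(os.path.abspath(entry.path), dst_path)


if __name__ == '__main__':
  with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'parent')
    os.makedirs(os.path.join(src, 'album'))
    with open(os.path.join(src, 'album', 'track.flac'), 'w') as f:
      f.write('placeholder, not real audio')
    dst = os.path.join(tmp, 'child')
    link_tree(src, dst)
    print(os.path.islink(os.path.join(dst, 'album', 'track.flac')))
```

Because every file in the destination points into the source, writing through such a link would modify the source file, which is why a profile must treat its source as read-only.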

WARNING

CoHydra has not received a lot of testing yet. It may mess up source files that you don't want it to write to. It shouldn't, but it might. If you care about the collection of files you manage with CoHydra, please be careful. I personally use read-only bind mounting to prevent accidental writes to anything in my root profile. My fstab has the line below, but note that this solution depends on the version of the kernel and the version of mount.

/home/dseomn/Music/master.rw /home/dseomn/Music/master none bind,ro 0 0
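
As an extra safeguard, a script could check that the root profile's directory really is on a read-only mount before generating anything. The sketch below is one possible way to do that with the standard library; is_read_only_mount and require_read_only are hypothetical names, not part of CoHydra, and the check assumes a Unix-like system where os.statvfs reports the ST_RDONLY flag for the mount in question.

```python
import os


def is_read_only_mount(path):
  """Return True if the filesystem containing path is mounted read-only."""
  return bool(os.statvfs(path).f_flag & os.ST_RDONLY)


def require_read_only(path):
  """Raise RuntimeError unless path is on a read-only mount.

  Hypothetical guard for illustration; not part of CoHydra's API.
  """
  if not is_read_only_mount(path):
    raise RuntimeError('%r is not mounted read-only' % (path,))


# Possible use, before generating any profiles:
#   require_read_only('/home/dseomn/Music/master')
```

This only detects a missing read-only mount; it does not replace the mount itself or any other protection.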

Example Use

The following Python script uses CoHydra to generate profiles of a collection of music files and accompanying images and videos. It is intended to serve as an example from which you can make your own profiles. However, there are important notices in the documentation of the various classes in cohydra.profile, so you probably should not rely solely on this example when writing your own code.

#!/usr/bin/python3

import logging
import mimetypes
import os
import subprocess

import cohydra.profile


# This root profile is a directory of music and related files,
# including images and videos.
music_master = cohydra.profile.RootProfile(
  top_dir='/home/dseomn/Music/master',
  )


def music_default_select_cb(profile, src_relpath, dst_relpath, contents):
  """Select which files go into the default profile.

  The default profile has all of the audio files, plus a single image
  per directory (if there are any images at all in the source
  directory). This profile is used for playing music (with simple
  cover art).
  """

  # Files to keep.
  keep = []

  # Image files.
  images = []

  for entry in contents:
    if entry.is_dir():
      keep.append(entry)
      continue

    # Guess the mime type of the file.
    mime, encoding = mimetypes.guess_type(entry.path, strict=False)
    if mime is None:
      profile.log(
        logging.ERROR,
        'Unknown mimetype for %r',
        entry.path,
        )
      keep.append(entry)
      continue

    # Get the major part of the mime type.
    mime_major, __discard, __discard = mime.partition('/')

    # Decide whether to definitely keep the file (put it in keep),
    # possibly keep the file as an image (put it in images), or ignore
    # the file.
    if mime_major in (
        'audio',
        ):
      keep.append(entry)
    elif mime_major in (
        'application',
        'text',
        'video',
        ):
      # Skip it.
      continue
    elif mime_major in (
        'image',
        ):
      images.append(entry)
    else:
      profile.log(
        logging.ERROR,
        'Unsure what to do with %r, %r',
        mime,
        entry.path,
        )
      keep.append(entry)

  # If possible, select a single image to keep, and ignore all others.
  for prefix in (
      'front',
      'cover',
      'cd',
      'back',
      None,
      ):
    for suffix in (
        'png',
        'tif',
        'jpg',
        'gif',
        None,
        ):
      for image in images:
        if prefix is not None and suffix is not None:
          if image.name == prefix + '.' + suffix:
            return keep + [image]
        elif prefix is not None:
          if image.name.startswith(prefix + '.'):
            profile.log(
              logging.INFO,
              'Using sub-optimal image %r',
              image.path,
              )
            return keep + [image]
        elif suffix is not None:
          if image.name.endswith('.' + suffix):
            profile.log(
              logging.INFO,
              'Using sub-optimal image %r',
              image.path,
              )
            return keep + [image]
        else:
          profile.log(
            logging.WARNING,
            'Using arbitrary image %r',
            image.path,
            )
          return keep + [image]

  # This happens only if there are no images.
  return keep

# The default profile.
music_default = cohydra.profile.FilterProfile(
  top_dir='/home/dseomn/Music/profiles/default',
  parent=music_master,
  select_cb=music_default_select_cb,
  )


def music_large_select_cb(profile, src_relpath):
  """Select which files in the large profile get converted.

  The large profile recodes only lossless audio.
  """

  if src_relpath.endswith('.flac'):
    # Convert to Ogg Vorbis.
    return src_relpath + '.ogg'
  else:
    # Do not convert.
    return None

def music_large_convert_cb(profile, src, dst):
  """Convert a single file for the large profile.
  """

  subprocess.run(
    [
      'oggenc',
      '-Q',
      '-q', '8',
      '-o', dst,
      '--', src,
      ],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    check=True,
    )

# The large profile. This is derived from the default profile, but
# FLAC files are converted to Ogg Vorbis, to save some space.
music_large = cohydra.profile.ConvertProfile(
  top_dir='/home/dseomn/Music/profiles/large',
  parent=music_default,
  select_cb=music_large_select_cb,
  convert_cb=music_large_convert_cb,
  )


# If I wanted, I could add other profiles that use different
# compression formats or parameters, to save even more space than
# music_large saves.


def music_videos_select_cb(profile, src_relpath, dst_relpath, contents):
  """Select which files to keep in the video profile.

  The video profile has only videos, and ISO images (which may contain
  videos).
  """

  keep = []

  for entry in contents:
    if entry.is_dir():
      keep.append(entry)
      continue

    mime, encoding = mimetypes.guess_type(entry.path, strict=False)
    if mime is None:
      continue

    if mime.startswith('video/') or mime in (
        'application/x-iso9660-image',
        ):
      keep.append(entry)

  return keep

# The video profile.
music_videos = cohydra.profile.FilterProfile(
  top_dir='/home/dseomn/Videos/[music]',
  parent=music_master,
  select_cb=music_videos_select_cb,
  )


if __name__ == '__main__':
  # Generate all descendant profiles of music_master.
  music_master.generate_all()
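
The image selection in music_default_select_cb can be hard to follow inside the callback, so here is the same priority order as a standalone sketch that works on plain file names. pick_image is a hypothetical function for illustration only; it omits the logging and the directory entry objects used in the script above.

```python
# Preferred name stems and extensions, most preferred first. None means
# "anything", and is tried only after every specific value has failed.
PREFIXES = ('front', 'cover', 'cd', 'back', None)
SUFFIXES = ('png', 'tif', 'jpg', 'gif', None)


def pick_image(names):
  """Return the preferred image file name from names, or None if empty."""
  for prefix in PREFIXES:
    for suffix in SUFFIXES:
      for name in names:
        if prefix is not None and suffix is not None:
          # Exact match, e.g. 'front.png'.
          matched = name == prefix + '.' + suffix
        elif prefix is not None:
          # Known stem with some other extension, e.g. 'front.bmp'.
          matched = name.startswith(prefix + '.')
        elif suffix is not None:
          # Known extension with some other stem, e.g. 'scan.png'.
          matched = name.endswith('.' + suffix)
        else:
          # Last resort: take whatever image comes first.
          matched = True
        if matched:
          return name
  return None


print(pick_image(['back.jpg', 'front.gif', 'cover.png']))  # front.gif
```

Note that the prefix loop is the outer one, so the name stem outranks the extension: front.gif is chosen over cover.png even though png is listed before gif.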