Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Comparing changes

Choose two branches to see what's changed or to start a new pull request. If you need to, you can also compare across forks.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also compare across forks.
base fork: meitham/albumiser
base: c89e34f23d
...
head fork: meitham/albumiser
compare: master
Checking mergeability… Don't worry, you can still create the pull request.
  • 14 commits
  • 3 files changed
  • 0 commit comments
  • 2 contributors
Commits on Apr 15, 2012
meitham fixed issue with --help cd549db
meitham Merge branch 'master' of github.com:meitham/albumiser 2c1ab26
Commits on May 10, 2012
Meitham Jamaa added fixes to images curropted with f-spot 63340e5
Commits on May 25, 2012
Meitham Jamaa fixed README 00fe216
Meitham Jamaa updated README 39538d2
Commits on Dec 26, 2012
meitham added exivfind d7dca60
meitham updated exiv to match more tags 6631630
meitham more fun 91ecc03
Commits on Dec 28, 2012
meitham added shebang 4040f5e
meitham Merge branch 'master' of github.com:meitham/albumiser 1d455bd
meitham pep8 af91a24
Commits on Dec 29, 2012
meitham more pep8 updates 7b4b2e2
meitham albumiser priortise date tags correctly
exivfind can do print-all-tags as a new command
0681cef
Commits on Jan 27, 2013
meitham removed exivfind 5e4613d
Showing with 319 additions and 674 deletions.
  1. +29 −31 README
  2. +290 −316 albumiser.py
  3. +0 −327 sqlalbumiser.py
View
60 README
@@ -1,38 +1,36 @@
Organise your photos.
-WORK IN __PROGRESS__
-
-NAME
- albumiser.py - flexable images organiser based on EXIF details.
-SYNOPSIS
- [python] albumiser.py [OPTION]... [SOURCE] [DESTINATION]
+WORK IN __PROGRESS__
-EXAMPLE
- [python] albumiser.py -r -v
+albumiser [-h] [--version] [-v | -q] [-r] [--rotate] [--dry-link] [-s]
+ [-m] [--delete-duplicates] [-d DEPTH] [-g LOG]
+ [--ignore-no-exif]
+ source target
-DESCRIPTION
- A flexiable python script that can operate on a tree of images and copy or
- move the files into directories based on the image EXIF details. It can detect
- duplicates using either MD5 or SHA of either the entire image file just the
- thumbnail. It has the ability to build or fix EXIF details from images that
- has wrong EXIF details, such as it can correct picture taken date on images.
+Organise your photos the way they should be
- The script can build a symlink of date based hierarchical directories. So you
- can have your images sorted in date based folders, a photo file that was taken
- at YYYY/MM/DD will end up in a folder of YYYY/MM/DD (posix) YYYY\MM\DD (win)
+positional arguments:
+ source
+ target
-Options:
- --version show program's version number and exit
+optional arguments:
-h, --help show this help message and exit
- -v, --verbose make lots of noise [default]
- -q, --quiet unless errors don't output anything
- -r, --recursive operate recursively [default]
- -s, --symlink follow symbolic linked directories
- -m, --move delete original file from SOURCE
- -d DEPTH, --depth=DEPTH
- unlimited [default: 0]
- -g LOG, --log=LOG log all actions [default: sys.stderr]
- -i, --ignore ignore photos with missing EXIF header [default]
- -p NOEXIFPATH, --process-no-exif=NOEXIFPATH
- copy/moves images with no EXIF data to [default:
- undated]
+ --version show program's version number and exit
+ -v, --verbose make loads of noise, useful for debugging!
+ -q, --quiet unless errors don't output anything.
+ -r, --recursive operate recursively.
+ --rotate rotate images according to their EXIF rotation tag.
+ --dry-link creates a tree of symbolic links under destination
+ with date time hierarchy preserving the original
+ images states.
+ -s, --follow_links follow symbolic linked directories.
+ -m, --move delete original file from SOURCE, by default it makes
+ a copy of the file.
+ --delete-duplicates delete duplicate files from SOURCE, by default it
+ ignores them and keep them intact.
+ -d DEPTH, --depth DEPTH
+ default is unlimited.
+ -g LOG, --log LOG log all actions, default is console.
+ --ignore-no-exif ignore photos with missing EXIF date, otherwise use
+ UNIX epoch.
+
View
606 albumiser.py
@@ -1,348 +1,322 @@
#!/usr/bin/env python
-#Meitham 01 July 2011
-__doc__ = """Organise your photos.
-
-NAME
- albumiser.py - flexable images organiser based on EXIF details.
-
-SYNOPSIS
- [python] albumiser.py [OPTION]... [SOURCE] [DESTINATION]
-
-EXAMPLE
- [python] albumiser.py -r -v
-
-DESCRIPTION
- A flexiable python script that can operate on a tree of images and copy or
- move the files into directories based on the image EXIF details. It can detect
- duplicates using either MD5 or SHA, it can based the duplication on the entire
- image or just the thumbnails. It has the ability to build or fix EXIF details
- from images that has wrong EXIF details, such as it can correct picture taken
- date on images.
-
- The script can build a symlink of date based hierarchical directories. So you
- can have your images sorted in date based folders, a photo file that was taken
- at YYYY/MM/DD will end up in a folder of YYYY/MM/DD (posix) YYYY\MM\DD (win)
-
-Options:
- --version show program's version number and exit
- -h, --help show this help message and exit
- -v, --verbose make lots of noise [default]
- -q, --quiet unless errors don't output anything
- -r, --recursive operate recursively [default]
- -s, --symlink follow symbolic linked directories
- -m, --move delete original file from SOURCE
- -d DEPTH, --depth=DEPTH
- unlimited [default: 0]
- -g LOG, --log=LOG log all actions [default: sys.stderr]
- -i, --ignore ignore photos with missing EXIF header [default]
- -p NOEXIFPATH, --process-no-exif=NOEXIFPATH
- copy/moves images with no EXIF data to [default:
- undated]
-"""
-from optparse import OptionParser
-from datetime import datetime, timedelta
-import Image
-import sys
+import argparse
+import functools
+import hashlib
+import logging
import os
import shutil
-import string
-import hashlib
-import logging, logging.handlers
+import sqlite3
+import sys
+import tempfile
+import traceback
try:
- import pyexiv2
+ import ipdb as pdb
except ImportError:
- print "Missing dependency: please install pyexiv2 in order to use this script"
+ import pdb
-__version__ = "0.101"
+from datetime import datetime
-IMAGE_EXT = ['.jpg','jpeg','.png']
-
-class ImageException(Exception):
- pass
-
-class InvalidImageFile(ImageException):
- pass
+try:
+ import pyexiv2
+except ImportError:
+ print("Missing dependency: pyexiv2 is required.")
+ sys.exit(1)
-class InvalidDateTag(ImageException):
- pass
+__version__ = '0.05'
+IMAGE_EXT = ('.jpg', 'jpeg', '.png')
-class MissingDateTag(ImageException):
- pass
-class MissingImageFile(ImageException):
- pass
-
-class DuplicateImage(ImageException):
- pass
+def get_options():
+ """CLI interface
+ """
+ parser = argparse.ArgumentParser(prog='albumiser',
+ version=__version__,
+ description="Organise your photos the way they should be",
+ conflict_handler='resolve')
+ # either be too verbose or quite, you cannot be both
+ verbose = parser.add_mutually_exclusive_group()
+ verbose.add_argument('-v', '--verbose',
+ action='count', dest='verbose',
+ default=2, # 1 being debug, 2 info, 3 warning, etc ...
+ help="make loads of noise, useful for debugging!")
+ verbose.add_argument('-q', '--quiet',
+ action='store_false', dest='verbose',
+ help="unless errors don't output anything.")
+ parser.add_argument('-r', '--recursive',
+ action='store_true', dest='recursive',
+ help="operate recursively.")
+ parser.add_argument('--rotate',
+ action='store_true', dest='rotate',
+ help="rotate images according to their EXIF rotation tag.")
+ parser.add_argument('--dry-link',
+ action='store_true', dest='link',
+ help="creates a tree of symbolic links under destination with "
+ "date time hierarchy preserving the original images states.")
+ parser.add_argument('-s', '--follow_links',
+ action='store_true', dest='follow_links',
+ help="follow symbolic linked directories.")
+ parser.add_argument('-m', '--move',
+ action='store_true', dest='move',
+ help="delete original file from SOURCE, by default it makes a copy"
+ " of the file.")
+ parser.add_argument('--delete-duplicates',
+ action='store_true', dest='delete_duplicates',
+ help="delete duplicate files from SOURCE, by default it ignores "
+ "them and keep them intact.")
+ parser.add_argument('-d', '--depth',
+ dest='depth',
+ help="default is unlimited.")
+ parser.add_argument('-g', '--log',
+ default=None, dest='log',
+ help="log all actions, default is console.")
+ parser.add_argument('--ignore-thumbnails',
+ action='store_true', dest='ignore_thumbnails',
+ default=False,
+ help="Don't use thumbnails for image comparison, slower.")
+ parser.add_argument('--ignore-no-exif',
+ action='store_true', dest='ignore_no_exif',
+ default=False,
+ help="ignore photos with missing EXIF date, otherwise use epoch.")
+ parser.add_argument('source', action="store")
+ parser.add_argument('target', action="store")
+ ns = parser.parse_args()
+ # handle options
+ logger = logging.getLogger('')
+ if ns.log:
+ hdlr = logging.FileHandler(ns.log)
+ formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
+ hdlr.setFormatter(formatter)
+ logger.addHandler(hdlr)
+ else:
+ logger.addHandler(logging.StreamHandler())
+ if ns.verbose:
+ # I wish Python used 0, 1, 2 rather than 0, 10, 20 for level numerics
+ level = 60 - ns.verbose * 10
+ logger.setLevel(level)
+ logger.debug("verbose mode on")
+ else:
+ level = 50 # 50 being error/fatal
+ logger.setLevel(level)
+ if ns.source == ns.target:
+ ns.target = tempfile.mkdtemp()
+ ns.target_is_temp = True
+ logger.info("using temp dir at %(target)s" % {'target': ns.target})
+ if ns.follow_links:
+ logger.debug("following symlinks")
+ else:
+ logger.debug("ignoring symlinks")
+ if ns.link:
+ logger.debug("creating a tree of links to photos")
+ ns.move = False
+ # check for non implemented options
+ if ns.rotate:
+ raise NotImplementedError("rotating images is not supported yet")
+ if ns.depth is not None:
+ ns.depth = int(ns.depth)
+ logger.debug("depth is: %(depth)r" % {'depth': ns.depth})
+ return ns
-class NoThumbnailFound(ImageException):
- pass
-def getOptions():
- ''' creates the options and return a parser object
- '''
- parser = OptionParser(usage="%prog [options] src dest", version="%prog 0.1")
- parser.add_option("-v", "--verbose",
- action="store_true", dest="verbose",
- default=True,
- help="make lots of noise, mainly used for debugging")
- parser.add_option("-q", "--quiet",
- action="store_false", dest="verbose",
- help="unless errors don't output anything.")
- parser.add_option("-r", "--recursive",
- action="store_true", dest="recursive",
- help="operate recursively.")
- parser.add_option("-R", "--rotate",
- action="store_true", dest="rotate",
- help="rotate images according to their EXIF rotation tag.")
- parser.add_option("-k", "--dry-link",
- action="store_true", dest="link",
- help="creates a tree of symbolic links under destination with date time\
- hierarchy preserving the original images states.")
- parser.add_option("-s", "--symlink",
- action="store_true", dest="symlink",
- help="follow symbolic linked directories.")
- parser.add_option("-m", "--move",
- action="store_true", dest="move",
- help="delete original file from SOURCE, by default it makes a copy of the file.")
- parser.add_option("-d", "--depth",
- default="0", type='int', dest="depth",
- help="unlimited [default: %default].")
- parser.add_option("-g", "--log",
- default="sys.stderr", dest="log",
- help="log all actions [default: %default].")
- # parser.add_option("-i", "--ignore",
- # action="store_true", dest="ignore", default=True,
- # help="ignore photos with missing EXIF header.")
- parser.add_option("-p", "--process-no-exif",
- default="undated", dest="noExifPath",
- help="copy/moves images with no EXIF data to [default: %default].")
- return parser
+class TreeWalker(object):
+ """provides a functionality similar to os.walk but can do
+ pre defined depth when needed.
+ """
+ def __init__(self, top='/', max_depth=None, *args, **kwargs):
+ self._top = top
+ self._max_depth = max_depth
+ self.logger = kwargs.get('logger', logging.getLogger(''))
+ self._follow_links = kwargs.get("follow_links", False)
+ self._depth = 0
+ self._recursive = self._max_depth is None or self._max_depth > 0
-def makeLogger(log=None):
- ''' returns a logger class, call first when used in shell, otherwise
- all objects complain of missing logger
- '''
- logger = logging.getLogger('')
- if log == None:
- # "sys.stderr"
- console = logging.StreamHandler()
- logger.addHandler(console)
- else:
- fileHandler = logging.FileHandler(log)
- logger.addHandler(fileHandler)
- return logger
+ def __repr__(self):
+ return ("TreeWalker(top='%'top)s', max_depth='%(depth)r')" %
+ {'top': self._top, 'depth': self._max_depth})
-def treewalk(top, recursive=False, followlinks=False, depth=0):
- ''' generator similar to os.walk(), but with limited subdirectory depth
- '''
- global logger
- if not maxdepth is None:
- if depth > maxdepth:
+ def walk(self, top=None, depth=0):
+ if not top:
+ top = self._top
+ if self._max_depth is not None and depth > self._max_depth:
return
- for f in os.listdir(top):
- file_path = os.path.join(top, f)
- if os.path.isdir(file_path):
- # its a dir recurse into it
- if recursive:
- islink = os.path.islink(file_path)
- if (islink and followlinks) or not islink:
- for dirpath, filename in treewalk(file_path, recursive, followlinks, depth+1):
- yield dirpath, filename
- elif os.path.isfile(file_path):
- yield top, f
- else:
- # Unknown file type, print a message
- logger.info('Skipping %s' % pathname)
+ for f in os.listdir(top):
+ file_path = os.path.join(top, f)
+ if os.path.isdir(file_path) and self._recursive:
+ # its a dir recurse into it
+ is_link = os.path.islink(file_path)
+ if is_link and not self._follow_links:
+ continue # we won't follow links
+ for dir_path, file_name in self.walk(file_path, depth + 1):
+ yield dir_path, file_name
+ elif os.path.isfile(file_path):
+ yield top, f
+ else:
+ # Unknown file type, print a message
+ self.logger.warning("Skipping %(path)s" % {'path': file_path})
-def isPhoto(path):
- '''return true if image has a known image extension file
- '''
+
+def is_image_file(path):
+ """Return true if image has a known image extension file
+ """
try:
_, ext = os.path.splitext(path)
- return string.lower(ext) in IMAGE_EXT
+ return ext.lower() in IMAGE_EXT
except:
return False
-def isDuplicate(file1, file2, func=None):
- ''' checks if two files have exactly same content
- '''
- if func == None:
- func = thumbDigest
- try:
- # lets try thumbnail
- return func(file1) == func(file2)
- except NoThumbnailFound:
- # resort to full file digest
- func = sha256HexDigest
- return func(file1) == func(file2)
-def sha256HexDigest(filename):
- '''returns an sha256 hex digest of an input
- '''
- buf = open(filename).read()
+def sha_digest(indata):
+ """returns an sha256 hex digest of an input
+ """
h = hashlib.sha256()
- h.update(buf)
+ h.update(indata)
return h.hexdigest()
-def thumbDigest(filename):
- '''returns a digest of the thumb image
- '''
- metadata = pyexiv2.ImageMetadata(filename)
- metadata.read()
- if not metadata.previews:
- raise NoThumbnailFound(filename)
- h = hashlib.sha256()
- h.update(metadata.previews[-1].data)
- return h.hexdigest()
-
-class ImageFile:
- ''' an image file object handler
- '''
- def __init__(self, fullfilepath):
- ''' constructor
- '''
- global logger # use global logger / should really argument this
- self.logger = logger
-
- # file existance
- if not os.path.exists(fullfilepath): # file missing
- raise MissingImageFile('file not found %s' %fullfilepath)
-
- # file extension
- self.basename, self.ext = os.path.splitext(fullfilepath)
- self.ext = self.ext.lower() # i hate uppercased extensions
- if not self.ext in IMAGE_EXT:
- raise InvalidImageFile('invalid image file %s' %fullfilepath)
-
- self.metadata = pyexiv2.ImageMetadata(fullfilepath)
- try:
- self.metadata.read()
- except IOError:
- raise InvalidImageFile('invalid image file %s' %fullfilepath)
-
- if not self.metadata:
- raise MissingImageFile('missing exif info %s' %fullfilepath)
- self.srcfilepath = fullfilepath
+def main():
+ ns = get_options()
+ conn = sqlite3.connect('/tmp/albumise.sqlite')
+ conn.text_factory = str
+ c = conn.cursor()
+ c.execute("drop table if exists images")
+ #TODO add a new column for the year/month/day/hour/minute/second
+ c.execute("""
+ create table images (
+ hash text unique,
+ path text unique,
+ created text,
+ destination text,
+ status text,
+ error text)
+ """)
+ #TODO add a new index for the new column
+ c.execute("create index idx_hash on images(hash)")
+ c.execute("create index idx_path on images(path)")
+ c.execute("create index idx_status on images(status)")
+ c.execute("create index idx_created on images(created)")
+ conn.commit()
+ undated_counter = 0
+ logger = logging.getLogger('')
+ logger.debug("about to walk files")
+ tree_walker = TreeWalker(ns.source, ns.depth, logger=logger,
+ follow_symlinks=ns.follow_links)
+ for dir_path, file_name in tree_walker.walk():
+ undated = False
+ imd = None
try:
- self.imagedate = self.metadata['Exif.Image.DateTime'].value
- except KeyError:
- self.logger.debug("Exif.Image.DateTime key is missing from exif")
+ path = os.path.join(dir_path, file_name)
+ _,ext = os.path.splitext(path)
+ ext = ext.lower()
+ logger.debug("processing %(path)s" % locals())
+ if not is_image_file(path):
+ # ignore non image extensions
+ logger.warning("ignoring non image file %(path)s" % locals())
+ c.execute("insert or ignore into images values(?,?,?,?,?,?)",
+ (path, path, None, None, "IGNORE", "non image file"))
+ continue
try:
- # to make my htc android happy
- self.imagedate = self.metadata['Exif.Photo.DateTimeOriginal'].value
- except KeyError:
- # here we handle images that has no date, this will come later
- raise InvalidDateTag('Exif.Photo.DateTimeOriginal key is missing from exif')
- self.datedFileName = self.imagedate.strftime('%Y_%m_%d-%H_%M_%S')
-
- def getUniquePath(self, base):
- ''' gets a proper name and path with no duplications
- '''
- if self.imagedate is None:
- # i should handle images with no date here
- self.logger.info("no image date for file %s " %self.srcfilepath)
- # i could either return the file creation date
- # or just a string.
- return None
- seq = 0
- imd = self.imagedate
- self.dstdir = os.path.join(base, str(imd.year), str(imd.month), str(imd.day))
- while True:
- fileName = self.datedFileName
- if seq:
- fileName+='_%0.0f'%seq
- fileName += self.ext
- self.dstfilename = os.path.join(self.dstdir, fileName)
- if os.path.exists(self.dstfilename):
- self.logger.info("image with similar date/time stamp already exists %s " %self.dstfilename)
- if isDuplicate(self.srcfilepath, self.dstfilename):
- self.logger.info(".. this appars to be a duplicate %s " %self.dstfilename)
- raise DuplicateImage('image %s already exists - duplicate'%self.dstfilename)
- else:
- self.logger.info(".. this is not a duplicate %s " %self.dstfilename)
- seq+=1
+ metadata = pyexiv2.ImageMetadata(path)
+ metadata.read()
+ except(IOError, UnicodeDecodeError) as e:
+ logger.error("unable to get EXIF from %(path)s" % locals())
+ # use file digest
+ digest = sha_digest(path)
+ c.execute("insert or ignore into images values(?,?,?,?,?,?)",
+ (digest, path, None, None, "FAILED", str(e)))
+ conn.commit()
+ continue
+ # handle digest
+ try:
+ # reading thumbnails
+ logger.debug("trying thumbnail digest")
+ if not ns.ignore_thumbnails and len(metadata.previews) > 0:
+ largest = metadata.previews[-1]
+ data = largest.data
+ else: # no thumbnails available
+ data = metadata.buffer
+ except Exception as e:
+ # when metadata is not available use the file content hash
+ logger.exception(e)
+ logger.debug("unable to use metadata - using file digest")
+ with open(path) as f:
+ data = f.read()
+ finally:
+ digest = sha_digest(data)
+ # check if we have had this file before
+ c.execute("""select * from images where hash=?""",(digest,))
+ if len(c.fetchall())>0:
+ logger.warning("file %(path)s is a duplicate" % locals())
+ if not ns.delete_duplicates:
continue
- else:
- return self.dstfilename
-
- def move(self, base, deleteOriginal=False, link=False):
- try:
- if self.getUniquePath(base) is None:
- self.logger.info("unknown destination for image %s " %self.srcfilepath)
- return
- except DuplicateImage, e:
- self.logger.info(e)
- if deleteOriginal:
- os.remove(self.srcfilepath)
- return
- dstdir = os.path.dirname(self.dstfilename)
- if not (os.path.exists(dstdir) and os.path.isdir(dstdir)):
- self.logger.info("creating dir %s" %dstdir)
- os.makedirs(dstdir)
- if link:
- self.logger.info("linking %s ==> %s " %(self.srcfilepath, self.dstfilename))
- if os.path.islink(self.srcfilepath):
- self.srcfilepath = os.readlink(self.srcfilepath)
- os.symlink(self.srcfilepath, self.dstfilename)
- return
- if deleteOriginal:
- self.logger.info("moving %s ==> %s " %(self.srcfilepath, self.dstfilename))
- shutil.move(self.srcfilepath, self.dstfilename)
+ # delete duplicate file
+ logger.debug('deleting duplicate file %(path)s' % locals())
+ c.execute("insert or ignore into images values(?,?,?,?,?,?)",
+ (digest, path, None, None, "DUPLICATE", None))
+ os.remove(path)
+ continue
+ imd = None
+ tag = metadata.get('Exif.Photo.DateTimeOriginal', None)
+ if tag is None:
+ tag = metadata.get('Exif.Photo.DateTimeDigitized', None)
+ if tag is None:
+ tag = metadata.get('Exif.Image.DateTime', None)
+ if tag is not None:
+ imd = tag.value
+ if imd is None or not isinstance(imd, datetime):
+ logger.warning("Exif Date Tags are missing - using UNIX epoch")
+ undated = True
+ undated_counter+=1
+ imd = datetime(1970,1,1)
+ yyyy = str(imd.year)
+ mm = str(imd.month)
+ dd = str(imd.day)
+ dstdir = os.path.join(ns.target,
+ yyyy,
+ '%(y)04d-%(m)02d' % {'y': imd.year, 'm': imd.month},
+ '%(y)04d-%(m)02d-%(d)02d' %
+ {'y': imd.year, 'm': imd.month, 'd': imd.day})
+ # ospj is short for os.path.join where dstdir is already in
+ ospj = functools.partial(os.path.join, dstdir)
+ dtmf = '%Y_%m_%d-%H_%M_%S'
+ if undated:
+ dstdir = ospj(''.join([imd.strftime(dtmf), '_',
+ str(undated_counter), ext]))
+ else: # dated
+ c.execute("select * from images where created=?", (str(imd),))
+ images_in_second = len(c.fetchall())
+ if images_in_second > 0:
+ # we have more than one image in this second
+ dstdir = ospj(''.join([imd.strftime(dtmf), '.',
+ str(images_in_second), ext]))
+ else:
+ dstdir = ospj(''.join([imd.strftime(dtmf), ext]))
+ # TODO
+ # check here if the dated image already exists in the database
+ # select * from images where path like dstdir
+ # use the count of the query result
+ c.execute("insert or ignore into images values(?,?,?,?,?,?)",
+ (digest, path, str(imd), dstdir, "READY", None))
+ conn.commit()
+ logger.info("%(path)s added to DB" % locals())
+ except Exception as e:
+ logger.exception(traceback.print_exc())
+ continue
+ conn.commit()
+ # database is now built - process files
+ c.execute("""select * from images where status='READY'""")
+ for row in c:
+ source = row[1]
+ target = row[3]
+ target_dir = os.path.dirname(target)
+ if not os.path.exists(target_dir):
+ os.makedirs(target_dir)
+ if ns.move:
+ shutil.move(source, target)
+ elif ns.link:
+ if os.path.islink(source):
+ source = os.readlink(source)
+ os.symlink(source, target)
else:
- self.logger.info("copying %s ==> %s " %(self.srcfilepath, self.dstfilename))
- shutil.copy2(self.srcfilepath, self.dstfilename)
+ shutil.copy2(source, target)
-
if __name__=='__main__':
- ''' main
- '''
- parser = getOptions()
- (options, args) = parser.parse_args()
- global logger
- logger = makeLogger(options.log)
- if len(args) == 1:
- src = dst = args[0]
- # this needs to build the tree under temp and move it to dest when done
- elif len(args) == 2:
- src = args[0]
- dst = args[1]
- else:
- logger.error("invalid number of arguments")
- parser.print_help()
- sys.exit()
- if options.verbose:
- logger.setLevel(logging.DEBUG)
- logger.debug('verbose mode on')
- else:
- logger.debug('verbose mode off')
- if options.symlink:
- logger.debug("following symlinks")
- else:
- logger.debug("ignoring symlinks")
- if options.link:
- logger.debug("creating a tree of links to photos")
- options.move = False
- # check for non implemented options
- if options.rotate:
- raise NotImplementedError('rotating images is not supported yet')
- maxdepth = options.depth = int(options.depth)
- logger.debug("depth is: "+ str(options.depth))
- if maxdepth == 0:
- maxdepth = None
- for dirpath, filename in treewalk(src, options.recursive, options.symlink, options.depth):
- fullfilepath = os.path.join(dirpath, filename)
- logger.debug("processing %s" %fullfilepath)
- if not isPhoto(fullfilepath): # ignore non image extensions
- logger.debug("ignoring non image file %s" %fullfilepath)
- continue
- try:
- imagefile = ImageFile(fullfilepath)
- except ImageException, e:
- logger.debug(e)
- continue
- except Exception, e:
- logger.debug(e)
- continue
- imagefile.move(dst, options.move, options.link)
+ main()
View
327 sqlalbumiser.py
@@ -1,327 +0,0 @@
-import argparse
-import hashlib
-import logging, logging.handlers
-import os
-import shutil
-import sqlite3
-import sys
-import tempfile
-
-from datetime import datetime, timedelta
-from optparse import OptionParser
-
-try:
- import pyexiv2
-except ImportError:
- print('Missing dependency: please install pyexiv2 in order to use this script')
- sys.exit(1)
-
-__version__ = '0.05'
-IMAGE_EXT = ['.jpg','jpeg','.png']
-
-def get_options():
- """ Creates the options and return a parser object
- """
- #parser = OptionParser(usage='sqlalbumiser.py [options] src dest', version='%s'%__version__)
- parser = argparse.ArgumentParser(prog='albumiser',
- version=__version__,
- description='Organise your photos as you wish.',
- conflict_handler='resolve')
-
- # either be too verbose or quite, you cannot be both
- verbose = parser.add_mutually_exclusive_group()
-
- verbose.add_argument('-v', '--verbose',
- action='store_true', dest='verbose', default=True,
- help='make loads of noise, useful for debugging!')
-
- verbose.add_argument('-q', '--quiet',
- action='store_false', dest='verbose',
- help="unless errors don't output anything.")
-
- parser.add_argument('-r', '--recursive',
- action='store_true', dest='recursive',
- help='operate recursively.')
-
- parser.add_argument('--rotate',
- action='store_true', dest='rotate',
- help='rotate images according to their EXIF rotation tag.')
-
- parser.add_argument('--dry-link',
- action='store_true', dest='link',
- help='creates a tree of symbolic links under destination with date time\
- hierarchy preserving the original images states.')
-
- parser.add_argument('-s', '--follow_links',
- action='store_true', dest='follow_links',
- help='follow symbolic linked directories.')
-
- parser.add_argument('-m', '--move',
- action='store_true', dest='move',
- help='delete original file from SOURCE, by default it makes a copy of the file.')
-
- parser.add_argument('--delete-duplicates',
- action='store_true', dest='delete_duplicates',
- help='delete duplicate files from SOURCE, by default it ignores them and keep them intact.')
-
- parser.add_argument('-d', '--depth',
- type=int, dest='depth',
- help='unlimited [default: %default].')
-
- parser.add_argument('-g', '--log',
- default=None, dest='log',
- help='log all actions, default is console.')
- parser.add_argument('source', action="store", nargs='?', default=os.getcwd())
- parser.add_argument('target', action="store", nargs='?', default=os.getcwd())
- # photos with no EXIF dates will either be copied into a special directory
- # or will assume a UNIX epoch 1970-01-01, latter is default
- no_exif = parser.add_mutually_exclusive_group()
-
- no_exif.add_argument('--process-no-exif',
- default='undated', dest='process_no_exif',
- help='copy/moves images with no EXIF data to [default: %default].')
-
- no_exif.add_argument('--ignore-no-exif',
- action='store_true', dest='ignore_no_exif', default=False,
- help='ignore photos with missing EXIF date, otherwise use UNIX epoch.')
- return parser
-
-def get_logger(log=None):
- """ returns a logger object
- """
- logger = logging.getLogger('albumiser')
- if log is None:
- # 'sys.stderr'
- logger.addHandler(logging.StreamHandler())
- else:
- logger.addHandler(logging.FileHandler(log))
- return logger
-
-def handle_options():
- parser = get_options()
- ns = parser.parse_args()
- #ipdb.set_trace()
- logger = get_logger(ns.log)
- if ns.source == ns.target:
- ns.target = tempfile.mkdtemp()
- ns.target_is_temp = True
- if ns.verbose:
- logger.setLevel(logging.DEBUG)
- logger.debug('verbose mode on')
- else:
- logger.debug('verbose mode off')
- if ns.follow_links:
- logger.debug('following symlinks')
- else:
- logger.debug('ignoring symlinks')
- if ns.link:
- logger.debug('creating a tree of links to photos')
- ns.move = False
- # check for non implemented options
- if ns.rotate:
- raise NotImplementedError('rotating images is not supported yet')
- if ns.depth is None:
- max_depth = ns.depth = None
- else:
- max_depth = ns.depth = int(ns.depth)
-
- logger.debug('depth is: %r'%ns.depth)
- if max_depth == 0:
- max_depth = None
-
- return ns
-
-
-class TreeWalker:
- """provides a functionality similar to os.walk but can do
- pre defined depth when needed.
- """
- def __init__(self, top='/', max_depth=None, *args, **kwargs):
- self._top = top
- self._max_depth = max_depth
- self.logger = kwargs.get('logger', None)
- self._depth = 0
-
- if self._max_depth is None or self._max_depth>0:
- self._recursive = True
- else:
- self._recursive = False
-
- self._follow_links = kwargs.get('follow_links', False)
-
- def __repr__(self):
- return '<TreeWalker top=%s, max_depth=%r>'%(self._top, self._max_depth)
-
- def walk(self, top=None, depth=0):
- if not top:
- top = self._top
- if self._max_depth is not None:
- if depth > self._max_depth:
- return
- for f in os.listdir(top):
- file_path = os.path.join(top, f)
- if os.path.isdir(file_path):
- # its a dir recurse into it
- if self._recursive:
- islink = os.path.islink(file_path)
- if (islink and self._follow_links) or not islink:
- for dirpath, filename in self.walk(file_path, depth+1):
- yield dirpath, filename
- elif os.path.isfile(file_path):
- yield top, f
- else:
- # Unknown file type, print a message
- if self.logger:
- self.logger.info('Skipping %s' % file_path)
-
-
-def isPhoto(path):
- """return true if image has a known image extension file
- """
- try:
- _, ext = os.path.splitext(path)
- return ext.lower() in IMAGE_EXT
- except:
- return False
-
-def sha256HexDigest(indata):
- """returns an sha256 hex digest of an input
- """
- h = hashlib.sha256()
- h.update(indata)
- return h.hexdigest()
-
-
-def main():
- ns = handle_options()
- conn = sqlite3.connect('/tmp/albumise.sqlite')
- conn.text_factory = str
- c = conn.cursor()
- c.execute("DROP TABLE IF EXISTS IMAGES")
- c.execute("""
- CREATE TABLE IMAGES (
- HASH TEXT UNIQUE,
- PATH TEXT UNIQUE,
- DESTINATION TEXT,
- STATUS TEXT,
- ERROR TEXT)
- """)
- c.execute("""create index idx1 on images(hash)""")
- c.execute("""create index idx2 on images(path)""")
- c.execute("""create index idx3 on images(status)""")
- conn.commit()
- undated_counter = 0
- logger = logging.getLogger('albumiser')
- logger.debug('about to walk files')
- tree_walker = TreeWalker(ns.source, ns.depth, logger=logger, follow_symlinks=ns.follow_links)
- #ipdb.set_trace()
- for dirpath, filename in tree_walker.walk():
- undated = False
- try:
- path = os.path.join(dirpath, filename)
- _,ext = os.path.splitext(path)
- ext = ext.lower()
- logger.debug('processing %s' %path)
- if not isPhoto(path):
- # ignore non image extensions
- logger.warning('ignoring non image file %s' %path)
- c.execute("""insert or ignore into images values(?,?,?,?,?)""",
- (path, path, None, 'IGNORE', 'non image file'))
- continue
-
- try:
- metadata = pyexiv2.ImageMetadata(path)
- metadata.read()
- except(IOError, UnicodeDecodeError) as e:
- logger.error('unable to read image metadata %s' %path)
- # use file digest
- digest = sha256HexDigest(path)
- c.execute("""insert or ignore into images values(?,?,?,?,?)""",
- (digest, path, None, 'FAILED', str(e)))
- conn.commit()
- continue
-
- # handle digest
- try:
- # reading thumbnails
- logger.debug('trying thumbnail digest')
- if len(metadata.previews) > 0:
- largest = metadata.previews[-1]
- data = largest.data
- else: # no thumbnails available
- data = metadata.buffer
- except Exception as e:
- # when metadata is not available use the entire file content hash
- logger.exception(e)
- logger.debug('unable to use metadata - using file content digest')
- with open(path) as f:
- data = f.read()
- finally:
- digest = sha256HexDigest(data)
- # check if we have had this file before
- c.execute("""select * from images where hash=?""",(digest,))
- if len(c.fetchall())>0:
- logger.warning('file %s already processed'%path)
- if ns.delete_duplicates:
- logger.debug('deleting duplicate file %s'%path)
- c.execute("""insert or ignore into images values(?,?,?,?,?)""",
- (digest, path, None, 'DUPLICATE', None))
- os.remove(path)
- continue # we have this object already
-
- try:
- imd = metadata['Exif.Image.DateTime'].value
- except KeyError:
- logger.warning('Exif.Image.DateTime key is missing from exif')
- try:
- # to make my htc android happy
- imd = metadata['Exif.Photo.DateTimeOriginal'].value
- except KeyError:
- logger.warning('Exif.Photo.DateTimeOriginal key is missing from exif using UNIX epoch')
- if imd is None or not isinstance(imd, datetime):
- undated = True
- undated_counter+=1
- imd = datetime(1970,1,1)
- yyyy = str(imd.year)
- mm = str(imd.month)
- dd = str(imd.day)
-
- dstdir = os.path.join(ns.target, yyyy, '%04d-%02d'%(imd.year,imd.month),
- '%04d-%02d-%02d'%(imd.year,imd.month,imd.day))
- if undated:
- dstdir = os.path.join(dstdir, imd.strftime('%Y_%m_%d-%H_%M_%S-')
- +str(undated_counter)+ext)
- else:
- dstdir = os.path.join(dstdir, imd.strftime('%Y_%m_%d-%H_%M_%S')+ext)
- c.execute("""insert or ignore into images values(?,?,?,?,?)""",
- (digest, path, dstdir, 'READY', None))
- conn.commit()
- logger.info('%s added to DB'%path)
- except Exception as e:
- logger.debug('ATTENTION: unexpected error')
- logger.exception(e)
- continue
- conn.commit()
- #c.close()
-
- # database is now built - process files
- c.execute("""select * from images where status='READY'""")
- for row in c:
- source = row[1]
- target = row[2]
- target_dir = os.path.dirname(target)
- if not os.path.exists(target_dir):
- os.makedirs(target_dir)
- if ns.move:
- shutil.move(source, target)
- elif ns.link:
- if os.path.islink(source):
- source = os.readlink(source)
- os.symlink(source, target)
- else:
- shutil.copy2(source, target)
-
-if __name__=='__main__':
- """ main
- """
- main()

No commit comments for this range

Something went wrong with that request. Please try again.