working dual js data scheme

commit 5378252d140640966cfaba4b6c3656573496b72d 1 parent 91d2432
@reedwade authored
Showing with 174 additions and 77 deletions.
  1. +112 −45 README
  2. +8 −8 photos.html
  3. +54 −24 picbackflick.py
157 README
@@ -1,71 +1,138 @@
PicBackFlick -- my favourite Flickr back up utility so far
+Reed Wade <reedwade@gmail.com>, 2011-04-16
+
+Features
-Features:
- - fetch recently changed (or all) images
- - collects some additional data about each photo
- - locally web browsable content store
- - suitable for running via cron long term to keep collection current
- - NEW! support for video
+ - fetch recently changed (or all) images
+ - collects some additional data about each photo
+ - locally web browsable content store
+ - suitable for running via cron long term to keep collection current
+ - support for video
-This is an early but highly functional Flickr backup utility. It works
-fine for getting recently changed images and a small slice of the related
-photo data. My intent is to expand this so that all related data is collected.
+ This is an early but highly functional Flickr backup utility. It works
+ fine for getting recently changed images and a small slice of the related
+ photo data. My intent is to expand this so that all related data is collected.
-Not yet but in the works:
- - better html templates
- - store all photo data Flickr can provide
- - sets, collections, galleries
- - improve photo data storage scheme
- - several items noted inside the script (look for TODO)
- - etc
+Not yet but in the works
+ - better html templates
+ - store all photo data Flickr can provide
+ - sets, collections, galleries
+ - etc
Quick Start Guide:
-1- You'll need the Python Flickr API (previously known as "Beej's Python Flickr API")
- If you have a recent Debian / Ubuntu setup you should be able to install it via--
- sudo apt-get install python-flickrapi
- Otherwise, see the installation instructions at http://stuvel.eu/flickrapi
+ 1- You'll need the Python Flickr API (previously known as "Beej's Python Flickr API")
+ If you have a recent Debian / Ubuntu setup you should be able to install it via--
+ sudo apt-get install python-flickrapi
+ Otherwise, see the installation instructions at http://stuvel.eu/flickrapi
+
+ 2- You need your own Flickr API key and secret, this only takes a moment from:
+ http://www.flickr.com/services/api/keys/apply/
+
+ 3- Set up your config file. picbackflick.py will create an example for you:
+
+ bash$ ./picbackflick.py -d 3
+ ~/.picbackflick.conf doesn't exist, a new one has been created for you which you must now edit
+
+ 4- Do the trick. This will grab any photos updated in the last 3 days.
+
+ bash$ ./picbackflick.py -d 3
+
+ The first time you run the application it will walk you through an authorisation step
+ which is described here: http://stuvel.eu/media/flickrapi-docs/documentation/#authentication
+
+ 5- Admire your efforts. Copy photos.html into ~/flickr_backups and view it in a web browser.
+ (Make sure you're looking at the one in that directory.)
+ You should see a thumbnail for each photo along with some other info.
+
+ Run the script with -h to see more options. Run the script with no options and it will
+ look for any new photos which have been added since the last time.
+
+
+Copying:
+
+ PicBackFlick is licensed under the AGPL.
+
+
+Configuration
-2- You need your own Flickr API key and secret, this only takes a moment from:
- http://www.flickr.com/services/api/keys/apply/
+ ~/.picbackflick.conf is the default application (INI formatted) config file.
+
+ PicBackFlick will create one if it doesn't already exist but you will need to edit it
+ in order to do anything useful.
+
+ Settings:
+
+ api_key
+ api_secret
+ You'll need to place your Flickr API key and secret there before you can download photos.
+ Go to http://www.flickr.com/services/api/keys/apply/ and fill in the form to get yours.
+
+ flickr_username
+ Set this to the username which appears in your Flickr photo page URL. This is used to
+ compose links to your photo pages.
+
+ photos_path = ~/flickr_backups
+ This is where all your photos and related data will be stored.
+
+
+What Gets Fetched?
+
+ On startup, PicBackFlick looks for ~/flickr_backups/last_updated to see how far back in time to
+ look for new or updated photos. On completion, this file is updated. If the file doesn't exist
+ it collects all your photos.
+
+ If the --days-ago option is given then last_updated is not consulted and any changes in the past
+ given days are fetched. Once done, last_updated is set to now.
+
+ If an image or video has already been fetched it will not be downloaded again.
-3- Set up your config file. picbackflick.py will create an example for you:
- bash$ ./picbackflick.py -d 3
- ~/.picbackflick.conf doesn't exist, a new one has been created for you which you must now edit
+Image Store
-4- Do the trick. This will grab any photos updated in the last 3 days.
+ Images are placed in img/ under the photos_path directory. Under img/ are directories for each image size (o,s,b for original, small and big).
- bash$ ./picbackflick.py -d 3
-
- The first time you run the application it will walk you through an authorisation step
- which is described here: http://stuvel.eu/media/flickrapi-docs/documentation/#authentication
+ Under the size directory is another named using the last 2 characters of the photo id. The image is then named id_[osb].jpg (originals might have a different extension).
-5- Admire your efforts. Copy photos.html into ~/flickr_backups and view it in a web browser.
- (Make sure you're looking at the one in that directory.)
- You should see a thumbnail for each photo along with some other info.
-
-Run the script with -h to see more options.
+ Examples:
+ ~/flickr_backups/img/s/89/123456789_s.jpg
+ ~/flickr_backups/img/o/95/552833895_o.png
+
+ See http://www.flickr.com/services/api/misc.urls.html for more about the image sizes available from Flickr
+
+ Video files are stored under ~/flickr_backups/video/, for example ~/flickr_backups/video/95/552833895.mov
+ Videos will also have poster images in the img/ directories including an 'original'.
-FILES:
+Info Store
- ~/.picbackflick.conf is the application config file
- ~/flickr_backups/img/ contain your photos
- ~/flickr_backups/info/ contains a pile of json style files, one per photo
- ~/flickr_backups/photo_db.js is all the json style files concatenated plus a little additional info
- ~/flickr_backups/last_updated is the timestamp of the last successful update
+ The detailed data for each photo is stored in an individual json file under ~/flickr_backups/info/ and
+ named per:
+ ~/flickr_backups/info/95/552833895_full.js
+
+ Each photo also gets an expurgated json file which contains: title, video file extension (if there is
+ one), and the extension of the original photo url. It is named per:
+ ~/flickr_backups/info/95/552833895_.js
+
+ Each photo record file is not strictly a json file: the photo value is assigned to an entry in
+ an object called picbackflick_images. The key is composed of the photo upload time and the photo
+ ID, which allows the list to be sorted reliably in the order photos appear in your photo stream,
+ since photo IDs are not assigned in a strict time-ordered sequence.
+
+ picbackflick_images['1300347013_5534286708'] = { "id": "5534286708", "title": "Dancing"... }
+
+ The expurgated json files are also collected in a single file, ~/flickr_backups/photo_db.js
+ As the json files are created they are appended to this file. Updated image data might be repeated
+ but when evaluated the last one is the value used.
+
+ You can rebuild photo_db.js by running PicBackFlick with the -w command line option.
-COPYING:
-PicBackFlick is licensed under the AGPL instead of GPL because it's plausibly useful in a web
-context.
-Reed Wade <reedwade@gmail.com>, 2011-04-16
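The Info Store section of the README describes record files that are JavaScript assignments rather than plain JSON, with later duplicates overriding earlier ones. A minimal sketch of reading such a file from Python follows. It is not part of this commit: the names load_photo_db and stream_order are hypothetical, it is written for Python 3, and it assumes every key has the form <upload time>_<photo id> with both parts numeric.

```python
import json
import re

# Matches the assignment prefix written before each JSON value, e.g.
#   picbackflick_images['1300347013_5534286708'] =
ASSIGNMENT = re.compile(r"picbackflick_images\['([^']+)'\]\s*=\s*")


def load_photo_db(text):
    """Parse photo_db.js-style text into a dict (hypothetical helper).

    A key that appears more than once keeps its last value, mirroring
    what a browser does when it evaluates the file top to bottom.
    """
    decoder = json.JSONDecoder()
    records = {}
    pos = 0
    while True:
        match = ASSIGNMENT.search(text, pos)
        if match is None:
            break
        # raw_decode parses one JSON value and reports where it ended.
        value, pos = decoder.raw_decode(text, match.end())
        records[match.group(1)] = value
    return records


def stream_order(records):
    """Return keys sorted by (upload time, photo id), compared as integers.

    Assumes both parts of every key are numeric.
    """
    return sorted(records, key=lambda k: tuple(int(p) for p in k.split("_")))
```

Sorting on integers rather than on the raw key strings avoids misordering if upload timestamps ever differ in digit count.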
16 photos.html
@@ -66,23 +66,23 @@
$('#photo_spots').empty()
for (i=0; i < page_size; i++) {
p = image_list[i +offset]
if (picbackflick_images[p]) {
+ idir = '/'+picbackflick_images[p].id.slice(-2)+'/'+picbackflick_images[p].id
v = ''
- if (picbackflick_images[p].video_orig_path) {
- v = '<a href="'+picbackflick_images[p].video_orig_path+'" target="_blank">[MOVIE]</a> '
+ if (picbackflick_images[p].v_ext) {
+ v = '<a href="video'+idir+'.'+picbackflick_images[p].v_ext+'" target="_blank">[MOVIE]</a> '
}
$('#photo_spots').append(
'<div class="photo_entry">'+
- '<img src="'+picbackflick_images[p].image_s+'" width="75" height="75" />' +
+ '<img src="img/s'+idir+'_s.jpg" width="75" height="75" />' +
'<span class="photo_title">'+picbackflick_images[p].title+'</span>'+
'<br/>'+
- '<span class="photo_desc">'+picbackflick_images[p].description+'</span>'+
- '<br/>'+
'<span class="photo_options">'+
- '<a href="'+picbackflick_images[p].image_+'" target="_blank">[M]</a> '+
- '<a href="'+picbackflick_images[p].image_o+'" target="_blank">[O]</a> '+
- '<a href="http://flickr.com/photos/'+flickr_username+'/'+picbackflick_images[p].id+'" target="_blank">[P]</a> '+
+ '<a href="img/b'+idir+'_b.jpg" target="_blank">[Big]</a> '+
+ '<a href="img/o'+idir+'_o.'+picbackflick_images[p].originalformat+'" target="_blank">[Orig]</a> '+
+ '<a href="http://flickr.com/photos/'+flickr_username+'/'+picbackflick_images[p].id+'" target="_blank">[Page]</a> '+
v+
+ '<a href="info'+idir+'_full.js" target="_blank">[JSON]</a> '+
'</span>'+
'</div>')
}
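With this change the page no longer reads stored paths from each record. It derives every link from the photo id, originalformat and v_ext. The same derivation can be sketched in Python, for example to check that the files a record implies exist on disk. The function name local_links and the dictionary keys it returns are hypothetical. The info link is written with a single slash to match the info/95/... layout in the README.

```python
def local_links(record):
    """Map a simplified photo record to the relative URLs photos.html builds.

    `record` is assumed to carry the fields written to the *_.js files:
    id, originalformat and v_ext (False when the photo has no video).
    """
    photo_id = record["id"]
    # Same shape as `idir` in photos.html: '/<last two id chars>/<id>'
    idir = "/" + photo_id[-2:] + "/" + photo_id
    links = {
        "thumb": "img/s" + idir + "_s.jpg",
        "big": "img/b" + idir + "_b.jpg",
        "orig": "img/o" + idir + "_o." + record["originalformat"],
        "info": "info" + idir + "_full.js",
    }
    if record.get("v_ext"):
        links["video"] = "video" + idir + "." + record["v_ext"]
    return links
```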
78 picbackflick.py
@@ -114,11 +114,9 @@ def get_info_from_id(self):
## TODO: extract much more information from photo_info which we want to archive
- def get_image_url(self, size='o', prefix='http://'):
+ def get_image_url(self, size='o'):
"""
Returns the url for this image at the given size.
-
- We use this function to generate local file names. So, the prefix is settable with the default being correct for the Flickr copy of the image.
"""
valid_sizes = ['o','s','t','m','','z','b']
@@ -126,12 +124,21 @@ def get_image_url(self, size='o', prefix='http://'):
raise RuntimeError("bad option for size: %s must be one of %s" % (size, str(valid_sizes)) )
if size == 'o':
- return "%sfarm%s.static.flickr.com/%s/%s_%s_o.%s" \
- % (prefix, self.vals['farm'], self.vals['server'], self.id, self.vals['originalsecret'], self.vals['originalformat'])
+ return "http://farm%s.static.flickr.com/%s/%s_%s_o.%s" \
+ % (self.vals['farm'], self.vals['server'], self.id, self.vals['originalsecret'], self.vals['originalformat'])
else:
- return "%sfarm%s.static.flickr.com/%s/%s_%s%s%s.jpg" \
- % (prefix, self.vals['farm'], self.vals['server'], self.id, self.vals['secret'], '' if size=='' else '_', size)
+ return "http://farm%s.static.flickr.com/%s/%s_%s%s%s.jpg" \
+ % (self.vals['farm'], self.vals['server'], self.id, self.vals['secret'], '' if size=='' else '_', size)
+ def get_image_path(self, size):
+ """
+ Returns the relative path for this image at the given size.
+ """
+ ext = 'jpg'
+ if size == 'o':
+ ext = self.vals['originalformat']
+ return "img/%s/%s/%s_%s.%s" % (size, self.vals['id'][-2:], self.vals['id'], size, ext)
+
@network_retry
def save(self):
"""
@@ -140,7 +147,7 @@ def save(self):
## store images
for size in self.pbf.options.store_image_sizes:
- image_filename = self.get_image_url(size=size, prefix='img/')
+ image_filename = self.get_image_path(size=size)
self.vals['image_'+size] = image_filename
f = os.path.join(self.pbf.options.photos_path,image_filename)
@@ -161,7 +168,8 @@ def save(self):
out.write(buf)
out.close()
img.close()
-
+
+ self.vals['v_ext'] = False
if self.vals['media'] == 'video':
# example:
# http://www.flickr.com/photos/reedwade/5597186999/play/orig/e45022b02e/
@@ -172,9 +180,8 @@ def save(self):
#
url = "http://www.flickr.com/photos/%s/%s/play/orig/%s/" % (self.pbf.options.flickr_username, self.id, self.vals['originalsecret'])
- self.vals['video_orig_path'] = os.path.join('video',self.id[-2:], self.id) # 'video/89/123456789'
+ f = os.path.join(self.pbf.options.photos_path,'video',self.id[-2:], self.id) # 'video/89/123456789'
- f = os.path.join(self.pbf.options.photos_path,self.vals['video_orig_path'])
# ok, now we run into a problem. We don't know the extension for the video file. It could be one of several things.
# We have to fetch the file and check the content-disposition header to learn it.
@@ -188,6 +195,7 @@ def save(self):
found = glob.glob(f+'.*')
if len(found):
self.pbf.info("skipping "+found[0])
+ self.vals['v_ext'] = found[0].split('.')[-1]
else:
if not os.path.exists(os.path.dirname(f)):
@@ -204,7 +212,7 @@ def save(self):
self.pbf.info("failed to determine video file extension, using 'video' instead")
ext = 'video'
- self.vals['video_orig_path'] += '.'+ext
+ self.vals['v_ext'] = ext
f += '.'+ext
self.pbf.info("writing "+f)
@@ -220,14 +228,35 @@ def save(self):
img.close()
## meta data
- f = os.path.join(self.pbf.options.photos_path,'info',self.id[-2:],self.id+".js")
- self.pbf.info("writing "+f)
+
+ # we use dateuploaded as the key along with ID because we want to sort on this later
+ # it turns out Flickr photo IDs aren't strictly sequential by time
+ json_full = "picbackflick_images['"+self.vals['dateuploaded']+"_"+self.id+"'] =\n "+json.dumps(self.vals)+"\n"
+
+ simple = {}
+ for k in ['title','id','v_ext','originalformat']:
+ simple[k] = self.vals[k]
+
+ json_simple = "picbackflick_images['"+self.vals['dateuploaded']+"_"+self.id+"'] =\n "+json.dumps(simple)+"\n"
+
+ f = os.path.join(self.pbf.options.photos_path,'info',self.id[-2:],self.id)
if not os.path.exists(os.path.dirname(f)):
os.makedirs(os.path.dirname(f))
- out = open(f,'wb')
- # we use dateuploaded as the key along with ID because we want to sort on this later
- # it turns out Flickr photo IDs aren't strictly sequential
- out.write("picbackflick_images['"+self.vals['dateuploaded']+"_"+self.id+"'] = "+json.dumps(self.vals)+"\n")
+
+ self.pbf.info("writing "+f+'_full.js')
+ out = open(f+'_full.js','wb')
+ out.write(json_full)
+ out.close()
+ self.pbf.info("writing "+f+'_.js')
+ out = open(f+'_.js','wb')
+ out.write(json_simple)
+ out.close()
+
+ # add to photo_db.js
+ f = os.path.join(self.pbf.options.photos_path,"photo_db.js")
+ self.pbf.info("appending "+f)
+ out = open(f,'ab')
+ out.write(json_simple)
out.close()
class PicBackFlick:
@@ -261,9 +290,10 @@ def update_web_pages(self):
for root, dirs, files in os.walk(os.path.join(self.options.photos_path,"info")):
for f in files:
- f = open(os.path.join(root,f))
- out.write(f.read())
- f.close()
+ if f[-4:] == '_.js':
+ f = open(os.path.join(root,f))
+ out.write(f.read())
+ f.close()
out.close()
@@ -334,7 +364,7 @@ def load_config_file(self):
self.options.last_updated_filename = os.path.join(self.options.photos_path,"last_updated")
- self.options.store_image_sizes = ['s','','o']
+ self.options.store_image_sizes = ['s','b','o']
@@ -453,8 +483,8 @@ def _get_recent_photos(self):
photo.save()
- if photo_count > 0:
- self.update_web_pages()
+ #if photo_count > 0:
+ # self.update_web_pages()
self.set_last_updated_timestamp()
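The comments in save() note that the video file extension is only known after fetching the file and inspecting its Content-Disposition header. The diff does not show how that header is parsed, so the following is only an illustrative Python 3 sketch using the standard library, not the script's own code. The function name is hypothetical, and the 'video' fallback mirrors the one logged in the hunk above.

```python
import os
from email.message import Message


def extension_from_content_disposition(header_value, default="video"):
    """Return the filename extension named in a Content-Disposition value.

    Falls back to `default` when no filename, or no extension, is present.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    filename = msg.get_filename()  # None when there is no filename parameter
    if not filename:
        return default
    ext = os.path.splitext(filename)[1].lstrip(".")
    return ext or default
```

Using email.message.Message keeps the quoting rules for header parameters in the standard library instead of in a hand-written regular expression.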