Downloading SmugMug Captions with Python and Jupyter
=========================================

![](jupysmug.png)

## Prerequisites

This notebook assumes you have set up your environment to use `SmugPyter.py`. 
Refer to this notebook for details on how to do this.

[Getting Ready to use the SmugMug API with Python and Jupyter](https://github.com/bakerjd99/smugpyter/tree/master/notebooks)

## Why am I doing this?
My [photo captions](https://conceptcontrol.smugmug.com) have
evolved into a form of *milliblogging*. *Milliposts* (milliblog posts) are terse and
tiny; many are single sentences or paragraphs. Taken one at a time,
milliposts seldom impress, but when gathered in hundreds or
thousands, accidental epics emerge. So, to prevent "epic loss," I want
a simple way of downloading and archiving my captions off-line.

## If you don't control it you cannot trust it!
When I started [blogging](https://analyzethedatanotthedrivel.org) I knew
that you could not depend on blogging websites to archive and
preserve your documents. We had already seen cases of websites mangling content, shutting
down without warning, and even worse, *censoring* bloggers. It was a classic case
of, "If you don't control it you cannot trust it." I resolved to maintain complete off-line *version
controlled* copies of my blog posts.

Maintaining off-line copies was made easier by
[WordPress.com](https://wordpress.com/)'s excellent [blog export
utility](https://en.blog.wordpress.com/2006/06/12/xml-import-export/). A
simple button push downloads a large `XML` file that contains all your
blog posts with embedded references to images and other inclusions. `XML`
is not my preferred archive format. I am a huge fan of `LaTeX` and
`Markdown`: two text formats that are directly supported in Jupyter
Notebooks. I wrote a little system that parses the WordPress `XML` file and [generates
LaTeX and Markdown](https://analyzethedatanotthedrivel.org/2012/02/11/wordpress-to-latex-with-pandoc-and-j-prerequisites-part-1/) files. Yet, despite milliblogging long before blogging, I don't have a similar system for
downloading and archiving SmugMug *metadata*. This Jupyter notebook addresses this omission 
and shows how you can use Python and the SmugMug API to extract gallery 
and image metadata and store it in version controlled local directories as `CSV` files.

## *SmugPyter.py* runs in the Python 3.6/Jupyter/Win64 environment.

A lot of the code in this notebook was derived from:

1. [https://github.com/marekrei/smuploader/blob/master/smuploader/smugmug.py](https://github.com/marekrei/smuploader/blob/master/smuploader/smugmug.py)

2. [https://github.com/kevinlester/smugmug_download](https://github.com/kevinlester/smugmug_download)

3. [https://github.com/AndrewsOR/MugMatch](https://github.com/AndrewsOR/MugMatch) 

The originals did not run in the Python 3.6/Jupyter/Win64 environment and lacked some of 
the facilities I wanted so I adjusted, tweaked and modified the scripts. The result
is incompatible with the originals so I renamed the main class *SmugPyter* to avoid confusion. 
Finally, being new to Python and Jupyter, I used the [2to3](http://pythonconverter.com/) tool to help make the changes.

In [1]:
import os
import smugpyter

help(smugpyter)

Help on module smugpyter:

NAME
    smugpyter

DESCRIPTION
    # code from:
    # https://github.com/speedenator/smuploader/blob/master/bin/smregister
    # https://github.com/kevinlester/smugmug_download/blob/master/downloader.py
    # https://github.com/AndrewsOR/MugMatch/blob/master/mugMatch.py
    # modified for python 3.6/jupyter environment - modifications assisted by 2to3 tool

CLASSES
    builtins.object
        SmugPyter
    
    class SmugPyter(builtins.object)
     |  Methods defined here:
     |  
     |  __init__(self, verbose=False)
     |      Constructor. 
     |      Loads the config file and initialises the smugmug service
     |  
     |  case_mask_decode(self, smug_key, case_mask)
     |      Restore letter case to (smug_key).
     |  
     |  case_mask_encode(self, smug_key)
     |      Encode the case mask as an integer.
     |  
     |  create_album(self, album_name, password=None, folder_id=None, template_id=None)
     |      Create a new album.
     |  
     | 

## Create the *SmugPyter* configuration file.

The SmugPyter class constructor reads a config file. If this file is missing 
you cannot create instances of the SmugPyter class or connect to your SmugMug account. 


In [2]:
# the SmugPyter class constructor reads a config file in this location.
os.path.join(os.path.expanduser("~"), '.smugpyter.cfg')

'C:\\Users\\john\\.smugpyter.cfg'

The following prompts for your SmugMug API keys.  You can apply for SmugMug keys on your SmugMug account by browsing to the API KEYS section of your account settings.

In [None]:
# code from https://github.com/speedenator/smuploader/blob/master/bin/smregister
# modified for python 3.6/jupyter environment - modifications assisted by 2to3 tool 

from rauth.service import OAuth1Service
import requests
import http.client
import httplib2
import hashlib
import urllib.request, urllib.parse, urllib.error
import time
import sys
import os
import json
import configparser
import re
import shutil

# depends on previously run cells 
#from smuploader import SmugMug

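def write_config(configfile, params):
    # write (key, value) pairs to the [SMUGMUG] section of configfile
    config = configparser.ConfigParser()
    config.add_section('SMUGMUG')
    for key, value in params:
        config.set('SMUGMUG', key, value)
    # use the configfile argument rather than a hard-coded path
    with open(configfile, 'w') as f:
        config.write(f)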

if __name__ == '__main__':
    print("\n\n\n#######################################################")
    print("## Welcome! ")
    print("## We are going to go through some steps to set up this SmugMug photo manager and make it connect to the API.")
    print("## Step 0: What is your SmugMug username?")
    username = input("Username: ")
    print('## Step 1: Enter your local directory, e.g. c:/SmugMirror/')
    localdir = input("Directory: ")

    print("## Step 2: Go to https://api.smugmug.com/api/developer/apply and apply for an API key.")
    print("## This gives you unique identifiers for connecting to SmugMug.")
    print("## When done, you can find the API keys in your SmugMug profile.")
    print("## Account Settings -> Me -> API Keys")
    print(("## Enter them here and they will be saved to the config file (" + SmugMug.smugmug_config + ") for later use."))
    consumer_key = input("Key: ")
    consumer_secret = input("Secret: ")

    write_config(SmugMug.smugmug_config, [("username", username), ("consumer_key", consumer_key), 
                                          ("consumer_secret", consumer_secret), ("access_token", ''), 
                                          ("access_token_secret", '')])

    smugmug = SmugMug()
    authorize_url = smugmug.get_authorize_url()
    print(("## Step 3: Visit this address in your browser to authenticate your new keys to access your SmugMug account: \n## " + authorize_url))
    print("## After that, enter the 6-digit key that SmugMug provided")
    verifier = input("6-digit key: ")

    access_token, access_token_secret = smugmug.get_access_token(verifier)

    write_config(SmugMug.smugmug_config, [("username", username), ("consumer_key", consumer_key), 
                                          ("consumer_secret", consumer_secret), ("access_token", access_token),
                                          ("access_token_secret", access_token_secret)])

    print("## Great! All done!")
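
Once the registration cell has run, it may be worth confirming that the config file holds everything the constructor needs. The following is a minimal sketch, not part of `smugpyter`: the helper name `missing_config_keys` is hypothetical, and the key list is simply the set of keys the registration cell above writes to the `SMUGMUG` section.

```python
import configparser
import os

# Keys the registration cell writes to the [SMUGMUG] section.
EXPECTED_KEYS = ("username", "consumer_key", "consumer_secret",
                 "access_token", "access_token_secret")

def missing_config_keys(config_path):
    """Return the expected keys that are absent or empty in config_path.

    Hypothetical helper for illustration; not part of smugpyter.
    """
    # interpolation=None so a '%' inside a secret is read literally.
    config = configparser.ConfigParser(interpolation=None)
    # read() silently skips files that do not exist, so a missing
    # file reports every key as missing.
    config.read(config_path)
    if not config.has_section("SMUGMUG"):
        return list(EXPECTED_KEYS)
    return [key for key in EXPECTED_KEYS
            if not config.get("SMUGMUG", key, fallback="").strip()]

config_path = os.path.join(os.path.expanduser("~"), ".smugpyter.cfg")
print(missing_config_keys(config_path))
```

An empty list suggests the file is complete. If `access_token` and `access_token_secret` are reported, the authorization step probably did not finish.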

## Try out the *SmugPyter* class with credentials saved in the previous cell.

In [3]:
smugmug = smugpyter.SmugPyter()
len(smugmug.get_album_names())

64

In [4]:
help(smugmug)

Help on SmugPyter in module smugpyter object:

class SmugPyter(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, verbose=False)
 |      Constructor. 
 |      Loads the config file and initialises the smugmug service
 |  
 |  case_mask_decode(self, smug_key, case_mask)
 |      Restore letter case to (smug_key).
 |  
 |  case_mask_encode(self, smug_key)
 |      Encode the case mask as an integer.
 |  
 |  create_album(self, album_name, password=None, folder_id=None, template_id=None)
 |      Create a new album.
 |  
 |  create_nice_name(self, name)
 |  
 |  download_image(self, image_info, image_path, retries=5)
 |      Download an image from a url.
 |  
 |  download_smugmug_mirror(self, func_album=None, func_folder=None)
 |      Walk SmugMug folders and albums and apply functions (func_album) and (func_folder).
 |      
 |          smugmug = SmugPyter()
 |          smugmug.download_smugmug_mirror(func_album=smugmug.write_album_manifest)
 |  
 |  get_access_token(self, v

In [6]:
smugmug.get_albums()

[{'AlbumKey': 'ZdrcmM',
  'Title': 'Cover Images',
  'Uri': '/api/v2/album/ZdrcmM'},
 {'AlbumKey': 'rXGggF',
  'Title': 'Profile Images',
  'Uri': '/api/v2/album/rXGggF'},
 {'AlbumKey': 'XghWcL',
  'Title': 'Great and Greater Forebearers',
  'Uri': '/api/v2/album/XghWcL'},
 {'AlbumKey': 'cMf2ft',
  'Title': 'Minnie Raver',
  'Uri': '/api/v2/album/cMf2ft'},
 {'AlbumKey': '5bbtrT',
  'Title': 'Grandparents',
  'Uri': '/api/v2/album/5bbtrT'},
 {'AlbumKey': 'sThkPh', 'Title': 'My Kids', 'Uri': '/api/v2/album/sThkPh'},
 {'AlbumKey': 'X8X9pK',
  'Title': 'The Way We Were',
  'Uri': '/api/v2/album/X8X9pK'},
 {'AlbumKey': 'FZK4j4',
  'Title': "From Hazel's Albums",
  'Uri': '/api/v2/album/FZK4j4'},
 {'AlbumKey': 'ctPGZq',
  'Title': 'Inlaws Outlaws and Friends',
  'Uri': '/api/v2/album/ctPGZq'},
 {'AlbumKey': 'Gq5ZQB',
  'Title': "My Wife's Family",
  'Uri': '/api/v2/album/Gq5ZQB'},
 {'AlbumKey': 'KVcLKD',
  'Title': 'Helen Hamilton',
  'Uri': '/api/v2/album/KVcLKD'},
 {'AlbumKey': '8vcPbH', '

In [7]:
smugmug.get_folders()

[{'Name': 'People', 'NodeID': 'qHr93', 'UrlName': 'People'},
 {'Name': 'Places', 'NodeID': 'J5wzG', 'UrlName': 'Places'},
 {'Name': 'Trips', 'NodeID': 'PgwWL', 'UrlName': 'Trips'},
 {'Name': 'Themes', 'NodeID': 'j7VKf', 'UrlName': 'Themes'},
 {'Name': 'Other', 'NodeID': 'L7kKx', 'UrlName': 'Other'}]

In [8]:
caught_my_eye = smugmug.get_album_id('Caught My Eye')
forebearers = smugmug.get_album_id('Great and Greater Forebearers')
idaho_instants = smugmug.get_album_id("Idaho Instants")
cell_phoning = smugmug.get_album_id("Cell Phoning It In")
[caught_my_eye, forebearers, idaho_instants, cell_phoning]

['9cxcL6', 'XghWcL', 'gLd4hT', 'PfCsJz']
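
The album list returned by `get_albums()` can also be searched directly. This is a rough sketch of a title lookup over records shaped like the output shown earlier. The function `album_key_by_title` is a hypothetical name, and this is not necessarily how `get_album_id` works internally.

```python
def album_key_by_title(albums, title):
    """Return the AlbumKey of the first album whose Title matches, else None."""
    for album in albums:
        if album.get("Title") == title:
            return album.get("AlbumKey")
    return None

# Sample records copied from the get_albums() output shown earlier.
albums = [
    {"AlbumKey": "XghWcL", "Title": "Great and Greater Forebearers",
     "Uri": "/api/v2/album/XghWcL"},
    {"AlbumKey": "sThkPh", "Title": "My Kids",
     "Uri": "/api/v2/album/sThkPh"},
]

print(album_key_by_title(albums, "My Kids"))        # sThkPh
print(album_key_by_title(albums, "No Such Album"))  # None
```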

In [9]:
smugmug.get_album_info(caught_my_eye)

{'AlbumKey': '9cxcL6',
 'AllowDownloads': False,
 'Backprinting': '',
 'BoutiquePackaging': 'Inherit from User',
 'CanRank': False,
 'CanShare': True,
 'Clean': False,
 'Comments': True,
 'Date': '2009-02-26T23:04:55+00:00',
 'Description': '',
 'EXIF': True,
 'External': True,
 'FamilyEdit': False,
 'Filenames': False,
 'FriendEdit': False,
 'Geography': True,
 'HasDownloadPassword': False,
 'Header': 'Custom',
 'HideOwner': False,
 'HighlightAlbumImageUri': '/api/v2/album/9cxcL6/image/KkkHHKH-0',
 'ImageCount': 105,
 'ImagesLastUpdated': '2017-08-27T01:15:22+00:00',
 'InterceptShipping': 'Inherit from User',
 'Keywords': '',
 'LargestSize': '5K',
 'LastUpdated': '2017-08-27T01:15:10+00:00',
 'Name': 'Caught My Eye',
 'NiceName': 'Caught-My-Eye-1',
 'NodeID': 'pCmh2',
 'OriginalSizes': 279723936,
 'PackagingBranding': True,
 'Password': '',
 'PasswordHint': '',
 'Printable': False,
 'Privacy': 'Public',
 'ProofDays': 0,
 'Protected': True,
 'ResponseLevel': 'Full',
 'SecurityType': 'N

In [10]:
album_images = smugmug.get_album_images(forebearers)
len(album_images)

13

In [11]:
album_captions = smugmug.get_album_image_captions(album_images)
album_latitude_longitude = smugmug.get_latitude_longitude_altitude(album_images)
album_latitude_longitude

[{'ImageKey': 'qjQxPCS',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'HwwwgDs',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'fjcPpCk',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'jq54JDC',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'hVw9ng9',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'xT2ptsn',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': '4sPSDfZ',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'P2QzsRD',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'QNQGMVg',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'KSMVDvH',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'k6pnSJ4',
  'LatLongAlt': ('0.00000000000000', '0.00000000000000', 0)},
 {'ImageKey': 'LX8HmDV',
  'LatLongAlt': ('
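
Every image in this album reports zero latitude and longitude. Assuming zeros mean "no location recorded" (an assumption, since a real point at 0,0 would look identical), a small filter like the following sketch can pick out the images that do carry a geotag. The helper `has_geotag` and the second sample record are hypothetical.

```python
def has_geotag(record):
    """True when latitude or longitude is non-zero.

    Assumes zeros mean 'no location recorded'.
    """
    latitude, longitude, _altitude = record["LatLongAlt"]
    return float(latitude) != 0.0 or float(longitude) != 0.0

records = [
    # Copied from the output above.
    {"ImageKey": "qjQxPCS",
     "LatLongAlt": ("0.00000000000000", "0.00000000000000", 0)},
    # Hypothetical record with made-up coordinates, for illustration only.
    {"ImageKey": "EXAMPLE",
     "LatLongAlt": ("43.61210000000000", "-116.39150000000000", 800)},
]

geotagged = [r["ImageKey"] for r in records if has_geotag(r)]
print(geotagged)  # ['EXAMPLE']
```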

In [12]:
album_real_dates = smugmug.get_album_image_real_dates(album_images)
album_real_dates

[{'AlbumKey': 'XghWcL',
  'FileName': 'albert raver aunt may cousin marci daughter.jpg',
  'ImageKey': 'qjQxPCS',
  'RealDate': '1910-03-03T01:01:01'},
 {'AlbumKey': 'XghWcL',
  'FileName': 'augustus a burdick 1864.jpg',
  'ImageKey': 'HwwwgDs',
  'RealDate': '1864-02-25T00:00:00'},
 {'AlbumKey': 'XghWcL',
  'FileName': 'bert raver 1900 [189878796].jpg',
  'ImageKey': 'fjcPpCk',
  'RealDate': '1900-01-01T00:00:00'},
 {'AlbumKey': 'XghWcL',
  'FileName': 'eliza jane gilbert b1840 d1915.jpg',
  'ImageKey': 'jq54JDC',
  'RealDate': '1880-07-04T00:00:00'},
 {'AlbumKey': 'XghWcL',
  'FileName': 'gert parents wedding gilbert paula 1905.jpg',
  'ImageKey': 'hVw9ng9',
  'RealDate': '1906-02-02T01:01:01'},
 {'AlbumKey': 'XghWcL',
  'FileName': 'gilbert tilly eliza leone 1914.jpg',
  'ImageKey': 'xT2ptsn',
  'RealDate': '1914-03-03T00:00:00'},
 {'AlbumKey': 'XghWcL',
  'FileName': 'julia ann weber rock 1894.jpg',
  'ImageKey': '4sPSDfZ',
  'RealDate': '1894-06-01T00:00:00'},
 {'AlbumKey': 'XghWc
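
The `RealDate` values are plain ISO-style strings, so they are easy to parse and sort. Here is a minimal sketch using a few records copied from the output above. The function `sort_by_real_date` is a hypothetical helper, and `strptime` is used because `datetime.fromisoformat` is not available in Python 3.6.

```python
from datetime import datetime

REAL_DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"

def sort_by_real_date(records):
    """Return records ordered from oldest to newest RealDate."""
    return sorted(
        records,
        key=lambda r: datetime.strptime(r["RealDate"], REAL_DATE_FORMAT))

# Records copied from the get_album_image_real_dates() output above.
real_dates = [
    {"AlbumKey": "XghWcL", "FileName": "augustus a burdick 1864.jpg",
     "ImageKey": "HwwwgDs", "RealDate": "1864-02-25T00:00:00"},
    {"AlbumKey": "XghWcL", "FileName": "bert raver 1900 [189878796].jpg",
     "ImageKey": "fjcPpCk", "RealDate": "1900-01-01T00:00:00"},
    {"AlbumKey": "XghWcL", "FileName": "julia ann weber rock 1894.jpg",
     "ImageKey": "4sPSDfZ", "RealDate": "1894-06-01T00:00:00"},
]

oldest_first = sort_by_real_date(real_dates)
print([r["ImageKey"] for r in oldest_first])
# ['HwwwgDs', '4sPSDfZ', 'fjcPpCk']
```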

Try out other "unsupported" version 2.0 API calls. Documentation for SmugMug Version 2.0 API calls is best obtained by hacking with SmugMug's live API browser tool at:

[https://api.smugmug.com/api/v2](https://api.smugmug.com/api/v2)

The live API tool is far more useful if you log into your SmugMug account and point at your own images.

## Walk SmugMug folders and albums and download coveted metadata.

The next cell calls the main function that walks SmugMug folders and writes metadata to local directories. Metadata is saved in `TAB` delimited `CSV` manifest files. `TAB` delimited files are also called `TSV` files. The function writes one file per album. If local directories do not exist they are created. If manifest files already exist they are overwritten. The entire SmugMug tree is walked, so you might want to adjust where the walk starts if you have hundreds or thousands of albums.

Manifest files follow this naming convention.

    manifest-<deblanked-album-name>-<smugmug-album-key>.txt
    
    Here are some examples:
    
    manifest-ZambiaEclipseTrip-k65QRs.txt
    manifest-FromHazelsAlbums-FZK4j4.txt
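
To make the naming convention concrete, here is a rough sketch of how such a file could be named and written with the standard `csv` module. It is not the code in `write_album_manifest`. The helpers `manifest_name` and `write_manifest` are hypothetical, "deblanked" is assumed to mean dropping everything except ASCII letters and digits, and the column names are borrowed from the outputs shown earlier rather than from the real manifest layout.

```python
import csv
import os
import re
import tempfile

def manifest_name(album_name, album_key):
    """Build manifest-<deblanked-album-name>-<smugmug-album-key>.txt."""
    # Assumption: deblanking drops everything except ASCII letters and digits.
    deblanked = re.sub(r"[^A-Za-z0-9]", "", album_name)
    return "manifest-{}-{}.txt".format(deblanked, album_key)

def write_manifest(directory, album_name, album_key, rows, fieldnames):
    """Write rows as a TAB delimited file, creating the directory if needed."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, manifest_name(album_name, album_key))
    # Opening with "w" overwrites any existing manifest.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    return path

# Illustrative columns only; the real manifest layout may differ.
rows = [{"ImageKey": "HwwwgDs",
         "FileName": "augustus a burdick 1864.jpg",
         "RealDate": "1864-02-25T00:00:00"}]
path = write_manifest(tempfile.mkdtemp(), "From Hazel's Albums", "FZK4j4",
                      rows, ["ImageKey", "FileName", "RealDate"])
print(os.path.basename(path))  # manifest-FromHazelsAlbums-FZK4j4.txt
```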

In [13]:
smugmug = smugpyter.SmugPyter()

smugmug.download_smugmug_mirror(func_album=smugmug.write_album_manifest)

visiting album GreatandGreaterForebearers
visiting album MinnieRaver
visiting album Grandparents
visiting album MyKids
visiting album TheWayWeWere
visiting album FromHazelsAlbums
visiting album InlawsOutlawsandFriends
visiting album MyWifesFamily
visiting album HelenHamilton
visiting album Video
visiting album IdahoInstants
visiting album InandAroundOttawa
visiting album MontanaNowandThen
visiting album IndianaImages
visiting album Minnesota
visiting album NewMexicoMontage
visiting album MissouriMoments
visiting album KingstonOntario
visiting album CaliforniaCaptures
visiting album Iran1960s
visiting album Ghana1970s
visiting album BeirutLebanon1960s
visiting album DivingatBellairsBarbadosBW
visiting album Weekenders
visiting album WesternRoadTrip2015
visiting album NorthbyNorthwest
visiting album TetonsYellowstone2013
visiting album ArizonaToodling
visiting album AlongtheYukonRiver
visiting album VirginiaFall2010
visiting album Chicago2007
visiting album NewYork2005
visiting album Ban
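
The walk applies a caller-supplied function to each album and folder it visits. The sketch below shows that callback pattern on a small in-memory tree. Everything in it is an assumption made for illustration: the `walk_tree` function, the `Type` and `Children` fields, the callback signature, and the placement of these albums under the `People` folder. The real `download_smugmug_mirror` fetches nodes from the SmugMug API and may call its functions differently.

```python
def walk_tree(node, func_album=None, func_folder=None, path=()):
    """Depth-first walk over a nested folder/album tree, applying callbacks."""
    here = path + (node["Name"],)
    if node["Type"] == "Album":
        if func_album is not None:
            func_album(node, here)
        return
    if func_folder is not None:
        func_folder(node, here)
    for child in node.get("Children", []):
        walk_tree(child, func_album, func_folder, here)

# Hypothetical tree layout; only the names and keys come from earlier output.
tree = {"Type": "Folder", "Name": "People", "Children": [
    {"Type": "Album", "Name": "My Kids", "AlbumKey": "sThkPh"},
    {"Type": "Album", "Name": "Grandparents", "AlbumKey": "5bbtrT"},
]}

visited = []
walk_tree(tree, func_album=lambda album, path: visited.append("/".join(path)))
print(visited)  # ['People/My Kids', 'People/Grandparents']
```

Passing the per-album work in as a function keeps the traversal separate from whatever is done at each stop, which is why the same walk can write manifests today and do something else tomorrow.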

## Next on the Agenda!

Remember how "no good deed goes unpunished." Well, running code will be "enhanced" whether it's necessary or not. 
Now that I have local directories that contain relevant SmugMug metadata in an easily consumed `CSV` form, other notebooks will use these directories to generate what I call "long duration documents."  My preferred long duration sources are `Markdown` and `LaTeX`. Both of these text formats will be readable for centuries if printed on high quality acid free paper and stored in numerous "secure and undisclosed locations."

Remember, always [Analyze the Data not the Drivel](https://analyzethedatanotthedrivel.org/).

John Baker, Meridian Idaho