# The IMAGE_META module Elevator Pitch

**_I am a photographer who is tired of maintaining image metadata (EXIF, IPTC, ...) in various tools, especially regular tags, geo coordinates, and reverse-geocoded tags such as location and city._**

**_This Python module provides a command line level solution to this problem by using exiftool to write data into jpg images._**

# Showcasing IMAGE_META package
The GPS_WRITER_SHOWCASE notebook demonstrates numerous features for manipulating JPG image metadata. It leverages the great [EXIF Tool](https://exiftool.org/) and uses the OpenStreetMap geo API [Nominatim](https://nominatim.org/release-docs/develop/api/Overview/) to obtain geo metadata.

To get more information on IPTC-IIM metadata (International Press Telecommunications Council-Information Interchange Model) check out the following documentation sources: https://www.iptc.org/std/photometadata/documentation/

The package contains the following modules:

* **geo.py** coordinate calculations, access to the Nominatim API for reverse geocoding (coordinates to plain-text site information), gpx file handling
* **persistence.py** reading + writing plain + json files
* **exif.py** exiftool interface + image metadata handling / transformation 
* **util** datetime calculations, binary search in list, ...
* **controller** packaging functions into helper methods for convenient use

For an operational example refer to the notebook [IMAGE_META_WORKFLOW](./IMAGE_META_WORKFLOW.ipynb)

Caveat: Mind the Nominatim usage terms at https://operations.osmfoundation.org/policies/nominatim/ ! Reverse search is only acceptable for a small number of requests.
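
To stay within the usage terms, requests can be spaced out on the client side. The following is a minimal sketch, not part of the package: `reverse_url` and `Throttle` are hypothetical helper names, and the one-second interval reflects the policy's stated maximum of one request per second.

```python
import time
from urllib.parse import urlencode

NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"
MIN_INTERVAL_S = 1.0  # usage policy: at most one request per second


def reverse_url(lat, lon):
    """Build a reverse geocoding URL for one coordinate pair."""
    params = {"format": "jsonv2", "lat": lat, "lon": lon, "addressdetails": 1}
    return f"{NOMINATIM_REVERSE}?{urlencode(params)}"


class Throttle:
    """Blocks so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(self, min_interval=MIN_INTERVAL_S):
        self.min_interval = min_interval
        self._last = None  # monotonic time of the previous call

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


throttle = Throttle()
for lat, lon in [(48.7835, 9.1850), (48.52027, 9.05361)]:
    throttle.wait()  # call this right before each real HTTP request
    print(reverse_url(lat, lon))
```

The policy also asks for an identifying User-Agent or Referer header on real requests, so whatever HTTP client you use should set one.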

## Exiftool Command Lines
You need to install exiftool and set your path variables so that it can be executed in the target directory. Some examples follow here; for more information, check out the following sources:

* **[EXIFTOOL FAQ](https://exiftool.org/faq.html 'EXIFTOOL FAQ')**
* **[EXIFTOOL EXAMPLES](https://exiftool.org/examples.html 'EXIFTOOL EXAMPLES')**
* **[EXIFTOOL DOCUMENTATION](https://exiftool.org/exiftool_pod.html 'EXIFTOOL DOCUMENTATION')**
* **[EXIFTOOL GEOTAGGING](https://exiftool.org/geotag.html 'EXIFTOOL GEOTAGGING')**

In [None]:
# here's import of all packages required to execute below examples
import os
from importlib import reload
from datetime import datetime
from datetime import timedelta
import pytz

import image_meta
import image_meta.persistence
import image_meta.util
import image_meta.geo
reload(image_meta)
reload(image_meta.persistence)
reload(image_meta.util)
reload(image_meta.geo)

# Import classes
from image_meta.persistence import Persistence as P
from image_meta.util import Util as U
from image_meta.geo import Geo as G
from image_meta.exif import ExifTool as E

## Preparation
Showcasing Geo URLs and reading data from a json config file

In [None]:
# Sample Data
coords = {"Stuttgart":{"lat":48.7835,"lon":9.1850},
          "Tübingen":{"lat":48.52027,"lon":9.05361}}
lat,lon = list(coords["Tübingen"].values())
# OSM Link can be constructed like
print(f"Tübingen OSM Link -> https://www.openstreetmap.org/#map=15/{lat}/{lon}")
print("Reverse Search link:")
# Reverse Search url for this link is (click to see the data)
print(f"""https://nominatim.openstreetmap.org/reverse?format=jsonv2&lat={lat}\
&lon={lon}&addressdetails=16&namedetails=1&extratags=1""")
# timezone
tz_local = pytz.timezone("Europe/Berlin")
tz_utc = pytz.timezone("UTC")




In [None]:
# copy sample files to work directory (so you can always use the original samples)
work_dir = os.getcwd()
sample_dir = os.path.join(work_dir,"Sample")
# files customized to your own requirements go into 00_OWN_STUFF
work_dir_ownstuff = os.path.join(sample_dir,"00_OWN_STUFF")
# files with GPS data
work_dir_gps = os.path.join(sample_dir,"10_RAW_GPS")
# files without GPS data
work_dir_nogps = os.path.join(sample_dir,"10_RAW_NOGPS")
# files with exif metadata
work_dir_meta = os.path.join(sample_dir,"10_RAW_WITH_META")
# original metadata
meta_dir = os.path.join(sample_dir,"00_META")
# work directory
target_dir = os.path.join(sample_dir,"img_test")
print(f"Original Jupyter path: {work_dir}, \n           Target path {target_dir}\n")
# path to sample images / check the other Jupyter file
img_path = target_dir
print(f"\n- Image path {img_path}, is valid: {os.path.isdir(img_path)}")   

Copying all sample files into working subdirectory `img_test`

In [None]:
# copy all test files to the target work dir
# check the documentation
regex_filter=None
regex_subst=None
s_subst=""
debug=True
save=True
# copy images and meta files
dirs = [work_dir_ownstuff,work_dir_gps,work_dir_nogps,meta_dir,work_dir_meta]
for d in dirs:
    P.copy_rename(d,target_dir,
                  regex_filter=regex_filter,regex_subst=regex_subst,
                  s_subst=s_subst,debug=debug,save=save)  

# path to sample images / check the other Jupyter file
img_path = target_dir
print(f"\n- Image path {img_path}, is valid: {os.path.isdir(img_path)}")    

**Reading work_settings.json**

=> It should contain at least a reference to your exiftool file:
Copy `work_settings_template.json` in the working folder `\Sample\img_test` (if it exists already, see below) to the file name `work_settings.json` and adjust the value of `exiftool` so that it points to your exiftool.exe file. Additionally, you can place this file in `\Sample\00_OWN_STUFF` so that it is copied every time you copy over the working sample files, as done in the cell above.
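
As a rough illustration of what reading such a settings file involves, here is a sketch using only the standard library. `load_settings` is a hypothetical helper, not the package's `Persistence.read_json`, and the only key assumed is the `exiftool` entry described above.

```python
import json


def load_settings(path):
    """Read the settings file and check that it names an exiftool executable."""
    with open(path, encoding="utf-8") as f:
        settings = json.load(f)
    if "exiftool" not in settings:
        raise KeyError(f"{path} has no 'exiftool' entry")
    return settings
```

A settings file for this sketch could be as small as `{"exiftool": "C:\\path\\to\\exiftool.exe"}`, where the path is a placeholder for your own installation. Note that backslashes must be doubled inside JSON strings.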

In [None]:
# Reading configuration data from the json file (adapt work_settings.json to your own environment)
curr_path = os.path.abspath(os.getcwd())
config_path = os.path.join(img_path,"work_settings.json")
config = P.read_json(config_path)
exiftool_path = config["exiftool"]
# image path for sample
print(f"\n- Exiftool path {exiftool_path}, is valid: {os.path.isfile(exiftool_path)}")
work_dir = os.path.join(curr_path,"Sample")

## Exiftool Command Line Examples
This code shows how to use Exiftool on the command line to get metadata (in this case, all fields containing date information) and write it to a json file.

In [None]:
import os

curr_path = os.path.abspath(os.getcwd())
print(f"Current Path {curr_path}\n")

#get test.jpg samples in samples subdirectory
sample_file = os.path.join(img_path, "IMG_20200615_115045984_GPS.jpg")
sample_json = os.path.join(img_path, "IMG_20200615_115045984_GPS.json")
if os.path.isfile(sample_file):
    # exiftool needs to be installed and available at command line in work dir
    print("--- Output of all EXIF subsegment metadata Containing date information ---")
    !exiftool -G -s -exif:*date* {sample_file}
    print("--- Same Data as json ---") 
    !exiftool -G -s -j -exif:*date* {sample_file}
    print("--- Same Data as json / Short Version without groups ---") 
    !exiftool -s -j -exif:*date* {sample_file}        
    print("--- Output into json file (in samples folder ) ---")
    !exiftool -G -s -j -exif:*date* {sample_file} > {sample_json}
    #now we can read the json into dict
    print("reading from json into dict:")
    metadata_dict = P.read_json(sample_json)
    U.print_dict_info(d=metadata_dict[0],s="Metadata with date information")
else:
    print(f"{sample_file} is not a file")
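
Outside a notebook, where the `!` shell escape is not available, the same call can be made with `subprocess`. This is a sketch under the assumption that `exiftool` is on the PATH; `build_command` and `read_metadata` are hypothetical names, while the flags `-G`, `-s` and `-j` are the ones used in the cell above.

```python
import json
import shutil
import subprocess


def build_command(image_path, tag_filter="-exif:*date*", executable="exiftool"):
    """Argument list for exiftool: group names, short tag names, JSON output."""
    return [executable, "-G", "-s", "-j", tag_filter, image_path]


def read_metadata(image_path, **kwargs):
    """Run exiftool once and return the metadata dict of the given image."""
    completed = subprocess.run(build_command(image_path, **kwargs),
                               capture_output=True, text=True, check=True)
    # exiftool -j prints a JSON list with one dict per processed file
    return json.loads(completed.stdout)[0]


if shutil.which("exiftool") is None:
    print("exiftool not found on PATH, skipping the call")
else:
    print(build_command("IMG_20200615_115045984_GPS.jpg"))
```

Passing the arguments as a list avoids shell expansion of the `*date*` wildcard, which is interpreted by exiftool itself.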

# Persistence Module
Operations for saving / loading / copying data in various formats (txt, json, gpx xml format)

### Copy and Rename

Method to copy and rename files recursively from a folder hierarchy, filtering and renaming files using regex. See the use of the `Persistence.copy_rename` method above; for further explanation, call up the help:

In [None]:
help(P.copy_rename)

**Example**: Recursively copy all files with extension `.xyz` from path `\20_FileCopy\src` to path `\20_FileCopy\trg`, renaming them with the prefix `renamed_`

In [None]:
file_src_path = os.path.join(work_dir,"20_FileCopy","src")
file_trg_path = os.path.join(work_dir,"20_FileCopy","trg")
regex_filter = r"xyz$" # filter by xyz extension
regex_subst = r"^(.{1})" # capture the first character
s_subst = r"renamed_\1" # replace the 1st character by renamed_ followed by itself
debug = True # show debug information
save = False # set to True to really save the results
P.copy_rename(fp=file_src_path,trg_path_root=file_trg_path, 
            regex_filter=regex_filter, regex_subst=regex_subst, 
            s_subst=s_subst, debug=debug, save=save)
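
To see what the three regex parameters do in isolation, the filter and substitution can be tried on plain file names. This is only a sketch of the regex logic with a hypothetical helper `plan_renames`; it does not touch the file system and is not the implementation of `copy_rename`.

```python
import re


def plan_renames(file_names, regex_filter=None, regex_subst=None, s_subst=""):
    """Return {old_name: new_name} for all names that pass the filter."""
    plan = {}
    for name in file_names:
        if regex_filter and not re.search(regex_filter, name):
            continue  # name does not match the filter, skip it
        new_name = re.sub(regex_subst, s_subst, name) if regex_subst else name
        plan[name] = new_name
    return plan


names = ["a.xyz", "b.txt", "notes.xyz"]
plan = plan_renames(names, regex_filter=r"xyz$",
                    regex_subst=r"^(.{1})", s_subst=r"renamed_\1")
print(plan)
```

Because the pattern is anchored with `^`, only the first character is matched, and `\1` puts it back after the new prefix.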


### Read GPX files
Reads gpx xml files; heart rate and cadence data from fitness watches are also supported. The dictionary key is the UTC timestamp.

In [None]:
# reading gpx data from work directory
work_path = img_path
gpx_path = os.path.join(work_path,"track.gpx")
gpx = P.read_gpx(gpx_path)
# print the first 3 gps points
gpx_keys = list(gpx.keys())[:3] # timestamps used as keys
[(k,datetime.utcfromtimestamp(k).strftime("%m/%d/%Y, %H:%M:%S")
  ,gpx[k]) for k in gpx_keys]
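
For orientation, a track in GPX 1.1 format can be turned into such a timestamp-keyed dictionary with the standard library alone. The sketch below is an assumption about the general approach, not the code behind `P.read_gpx`. `parse_gpx` is a hypothetical name, heart rate and cadence extensions are left out, and timestamps are expected in the form `YYYY-MM-DDTHH:MM:SSZ`.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

SAMPLE_GPX = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <trk><trkseg>
    <trkpt lat="48.52027" lon="9.05361"><ele>341.0</ele><time>2020-06-15T09:50:45Z</time></trkpt>
    <trkpt lat="48.52030" lon="9.05370"><ele>342.0</ele><time>2020-06-15T09:50:50Z</time></trkpt>
  </trkseg></trk>
</gpx>"""


def parse_gpx(xml_text):
    """Return {utc_timestamp: {"lat", "lon", "ele"}} for all track points."""
    root = ET.fromstring(xml_text)
    points = {}
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):
        time_el = trkpt.find("gpx:time", GPX_NS)
        if time_el is None or not time_el.text:
            continue  # points without a time cannot be used as keys
        dt = datetime.strptime(time_el.text.strip(), "%Y-%m-%dT%H:%M:%SZ")
        dt = dt.replace(tzinfo=timezone.utc)
        ele_el = trkpt.find("gpx:ele", GPX_NS)
        points[int(dt.timestamp())] = {
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            "ele": float(ele_el.text) if ele_el is not None else None,
        }
    return points


track = parse_gpx(SAMPLE_GPX)
print(sorted(track))
```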

### Get File Paths
The `Persistence.get_file_list(path,file_type_filter=None)` method accepts a single file reference, a list of files, a path, a list of paths, or a combination of these, and returns the full file paths as a list. The file type filter restricts the result to files with the specified extensions.

In [None]:
work_path = img_path
print("--- all files in work path ---") 
fl = P.get_file_list(work_path)
for f in fl:
    print(f) 
print("\n--- only gpx ---") 
print(P.get_file_list(work_path,file_type_filter="gpx"))
print("\n--- only certain files ---")
f1 = os.path.join(work_path,"KeywordHierarchy.txt")
f2 = os.path.join(work_path,"default.geo")
f = [f1,f2]
print(P.get_file_list(f))
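
The accepted input shapes can be illustrated with a small stand-alone function. This is a sketch with the hypothetical name `collect_files`; it assumes directories are listed non-recursively and extensions are compared case-insensitively, which may differ from what `get_file_list` actually does.

```python
import os


def collect_files(paths, file_type_filter=None):
    """Accept a file, a directory, or a list of both; return absolute file paths."""
    if isinstance(paths, str):
        paths = [paths]
    if isinstance(file_type_filter, str):
        file_type_filter = [file_type_filter]
    allowed = None if file_type_filter is None else [e.lower() for e in file_type_filter]
    found = []
    for p in paths:
        if os.path.isfile(p):
            candidates = [p]
        elif os.path.isdir(p):
            entries = [os.path.join(p, n) for n in sorted(os.listdir(p))]
            candidates = [e for e in entries if os.path.isfile(e)]
        else:
            candidates = []  # silently ignore anything that does not exist
        for c in candidates:
            ext = os.path.splitext(c)[1].lstrip(".").lower()
            if allowed is None or ext in allowed:
                found.append(os.path.abspath(c))
    return found
```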

# Geo Module

In [None]:
# convert lat lon from preparation step above into cartesian (X,Y,Z) coordinates
c1 = list(coords["Tübingen"].values())
G.latlon2cartesian(c1)
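
The underlying conversion can be sketched as follows, assuming a spherical Earth with a mean radius of 6371 km. Both the function name `latlon_to_cartesian` and the radius are assumptions for illustration; the package may use a different Earth model.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def latlon_to_cartesian(lat_deg, lon_deg, radius=EARTH_RADIUS_KM):
    """Convert latitude/longitude in degrees to (x, y, z) on a sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return (x, y, z)


print(latlon_to_cartesian(48.52027, 9.05361))
```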

In [None]:
# calculate the distance in km of two coordinates (to initialize data, run the first cell above)
c1 = list(coords["Tübingen"].values())
c2 = list(coords["Stuttgart"].values())
G.get_distance(c1,c2,debug=True)
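
One common way to compute such a distance is the haversine formula, sketched below for a spherical Earth of radius 6371 km. Whether `G.get_distance` uses this formula is not confirmed by this notebook; `haversine_km` is a hypothetical name.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(c1, c2, radius=EARTH_RADIUS_KM):
    """Great-circle distance in km between two [lat, lon] pairs in degrees."""
    lat1, lon1 = (math.radians(v) for v in c1)
    lat2, lon2 = (math.radians(v) for v in c2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


tuebingen = [48.52027, 9.05361]
stuttgart = [48.7835, 9.1850]
print(f"{haversine_km(tuebingen, stuttgart):.1f} km")
```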

# Utils Module

### Datetime Conversion
`get_timestamp` returns a timestamp. It assumes that the time string is given in UTC (other times need to be converted into UTC beforehand).

In [None]:
from pytz import timezone
tz = timezone('Europe/Berlin')
utc = timezone('UTC')
# get UTC Timestamp from Date String conforming to format ####-##-##T##:##:##Z / (+/-)##:##  
now = datetime.now().astimezone(utc)
print("Now:",now)
#now = datetime(2020, 1, 17,20,10,12)
now_s = now.strftime("%Y-%m-%dT%H:%M:%SZ")
now_ts = U.get_timestamp(now_s)
print(f"Now DateTime {now} -> Now String: {now_s} -> UTC Timestamp {now_ts}")
#convert back from timestamp
utc_dt = tz_utc.localize(datetime.utcfromtimestamp(now_ts))
cet_dt = utc_dt.astimezone(tz_local)
print("Timestamp -> Datetime UTC",utc_dt," -> Datetime Local",cet_dt)
print("UTC Offset",cet_dt.utcoffset()," Timezone",cet_dt.tzinfo,
      " Daylight Saving Time OFFSET",cet_dt.tzinfo.dst(cet_dt))

In [None]:
# More examples > all same dates but differently formatted / default time Europe / Berlin 
dates = ["2020-05:12 13:23:12",
         "2020-05-12T11:23:12Z",
         "2020-05-12T11:23:12.000Z",
         "2020-05-12T13:23:12+02:00"]
for date_s in dates:
    print(U.get_timestamp(date_s,debug=True))
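
How differently formatted strings can end up as the same UTC timestamp is sketched below with the standard library. `to_utc_timestamp` is a hypothetical stand-in for `U.get_timestamp`, and the fixed +02:00 offset for strings without a zone is a simplifying assumption (a real time zone such as Europe/Berlin would also handle daylight saving time).

```python
from datetime import datetime, timedelta, timezone

ASSUMED_LOCAL_TZ = timezone(timedelta(hours=2))  # fixed offset, ignores DST rules


def to_utc_timestamp(date_s, default_tz=ASSUMED_LOCAL_TZ):
    """Parse an EXIF-style or ISO 8601 string and return the UTC timestamp."""
    s = date_s.strip()
    try:
        # EXIF style without zone information, e.g. 2020:05:12 13:23:12
        dt = datetime.strptime(s, "%Y:%m:%d %H:%M:%S")
    except ValueError:
        if s.endswith("Z"):
            s = s[:-1] + "+00:00"  # older fromisoformat versions reject "Z"
        dt = datetime.fromisoformat(s)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=default_tz)  # treat naive times as local time
    return int(dt.timestamp())


samples = ["2020:05:12 13:23:12",
           "2020-05-12T11:23:12Z",
           "2020-05-12T11:23:12.000Z",
           "2020-05-12T13:23:12+02:00"]
for sample in samples:
    print(sample, "->", to_utc_timestamp(sample))
```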

### Timestamp Offset
Calculate the offset when GPS time differs from camera time

In [None]:
# Different string time formats are allowed, see above
s_gps = "2020-05-12T13:23:20+02:00" 
s_cam = "2020-05:12 13:23:12"

offset = U.get_time_offset(time_camera=s_cam,time_gps=s_gps,debug=True) // timedelta(seconds=1)
print(f"Offset Camera - GPS is {offset} seconds")

### Binary Approximate Search
Find the "floor" element in a sorted list of numbers that comes closest to the passed value 

In [None]:
sorted_list = sorted([5,2.2,3.5,2,6,9,12])
print(sorted_list)
value1 = 5
idx1 = U.get_nearby_index(value1,sorted_list)
print("value",value1,"index ",idx1," list value ->",sorted_list[idx1])
value1 = 4
idx1 = U.get_nearby_index(value1,sorted_list)
print("value",value1,"index ",idx1," list value ->",sorted_list[idx1])
value1 = 0
idx1 = U.get_nearby_index(value1,sorted_list)
print("value",value1,"index ",idx1," list value ->",sorted_list[idx1])
value1 = 13
idx1 = U.get_nearby_index(value1,sorted_list)
print("value",value1,"index ",idx1," list value ->",sorted_list[idx1])

In [None]:
# here you can see how it chunks the sorted list into halves
value1 = 11.5
idx1 = U.get_nearby_index(value1,sorted_list,debug=True)
print("value",value1,"index ",idx1," list value ->",sorted_list[idx1])
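
The same kind of lookup can be sketched with the standard `bisect` module. `floor_index` is a hypothetical name, and clamping values below the smallest element to index 0 is an assumption; `U.get_nearby_index` may treat that edge case differently.

```python
from bisect import bisect_right


def floor_index(value, sorted_list):
    """Index of the largest element <= value; 0 if value is below all elements."""
    if not sorted_list:
        raise ValueError("sorted_list must not be empty")
    idx = bisect_right(sorted_list, value) - 1
    return max(idx, 0)


numbers = sorted([5, 2.2, 3.5, 2, 6, 9, 12])
for v in (5, 4, 0, 13, 11.5):
    i = floor_index(v, numbers)
    print("value", v, "index", i, "list value ->", numbers[i])
```

`bisect_right` halves the search interval on every step, so the lookup takes logarithmic time in the length of the list.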

# Exif Module

### Metadata Hierarchy
In photo management programs you can often maintain tags as hierarchies and export them as a text file. In this file, each hierarchy level is represented by a tab character. From this, you can construct hierarchical meta tags (stored as XMP:HierarchicalSubject in image metadata). The following method reads a hierarchy metadata file and puts the tags into a dict with the "leaf" tag as dict key. This way, you automatically get the hierarchical meta tag just by maintaining the hierarchy in a text file.

In [None]:

#get test hierarchy samples in samples subdirectory
sample_hier = os.path.join(img_path, "KeywordHierarchy.txt")
if not os.path.isfile(sample_hier):
    raise Exception(f"{sample_hier} NOT FOUND")
    
lines = P.read_file(sample_hier)
print("-----------HIERARCHY-------------")
for line in lines:
    print(line.strip('\n'))
print("-----------OUTPUT-------------")
h_tag_dict = E.create_metahierarchy_from_str(lines,debug=False)
U.print_dict_info(h_tag_dict)
tag = "Nature"
print("-----------Example-------------")
print(f"Tag <{tag}> has hierarchical attribute <{h_tag_dict[tag]}>")
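
The idea of turning a tab-indented keyword list into hierarchical tags can be sketched as follows. `build_hierarchy` is a hypothetical helper, not the implementation of `E.create_metahierarchy_from_str`; the `|` separator is an assumption based on the common convention for hierarchical subject tags, and the sample keywords are made up.

```python
def build_hierarchy(lines, sep="|"):
    """Map every tag to its full path, using leading tabs as hierarchy level."""
    path = []    # tags on the way from the root to the current line
    result = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore empty lines
        level = len(line) - len(line.lstrip("\t"))  # number of leading tabs
        tag = line.strip()
        path = path[:level] + [tag]  # cut back to the parent, then append
        result[tag] = sep.join(path)
    return result


sample_lines = ["Nature\n", "\tAnimals\n", "\t\tBirds\n", "\tPlants\n", "People\n"]
hierarchy = build_hierarchy(sample_lines)
print(hierarchy["Birds"])
```

Since the tag itself is the dict key, this sketch assumes tag names are unique across the hierarchy; a repeated name would overwrite the earlier entry.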

### Process Images with Exiftool
In the class `ExifTool`, the executable is triggered by the `execute` method, which receives control parameters and a file list. The image folder and the path to the Exiftool executable need to be supplied in the constructor. 
Convenience wrapper methods for handling metadata are supplied and described here. 

In [None]:
curr_path = img_path
sample_jpg = os.path.join(curr_path, "IMG_20200615_115045984_GPS.jpg")
exif_tool_loc = exiftool_path # location defined in work_settings.json
print("Exiftool: ",exif_tool_loc)

if not os.path.isfile(exif_tool_loc):
    raise Exception(f"EXIFTOOL NOT FOUND at location {exif_tool_loc}")

if not os.path.isfile(sample_jpg):
    raise Exception(f"file {sample_jpg} NOT FOUND")

# important: needs to be handled via "with" command (-> executing "__enter__" method)
meta_dict = {}  # stays empty if reading fails, so the loop below is skipped
with E(exif_tool_loc) as exiftool:
    # collects data of several files in one dictionary
    try:
        meta_dict = exiftool.get_metadict_from_img(sample_jpg)
    except Exception as e:
        print(f"error reading file {sample_jpg} check if it is there ({e})")

file_list = meta_dict.keys()

for jpg_file in file_list:
    print(f"--- File {jpg_file} ---")    
    meta_list = meta_dict[jpg_file]
    
    for meta in meta_list:
        print(f"[{meta}] ->  {meta_list[meta]}")            
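
Why a `with` block is useful here can be illustrated with exiftool's `-stay_open` mode, which keeps one process alive for many commands. The sketch below shows that general pattern under the assumption that `exiftool` is installed; `ExifToolSession` is a hypothetical class and not the package's `ExifTool` implementation.

```python
import subprocess


class ExifToolSession:
    """Keeps one exiftool process open and feeds it commands via stdin."""

    SENTINEL = "{ready}"  # exiftool prints this line after each -execute

    def __init__(self, executable="exiftool"):
        self.executable = executable
        self.process = None

    def command(self):
        """Argument list that starts exiftool in stay-open mode, reading from stdin."""
        return [self.executable, "-stay_open", "True", "-@", "-"]

    def __enter__(self):
        self.process = subprocess.Popen(self.command(), stdin=subprocess.PIPE,
                                        stdout=subprocess.PIPE,
                                        stderr=subprocess.DEVNULL, text=True)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # tell exiftool to leave stay-open mode, then wait for it to exit
        self.process.stdin.write("-stay_open\nFalse\n")
        self.process.stdin.flush()
        self.process.wait(timeout=10)

    def execute(self, *args):
        """Send one argument per line, run them, and return the text output."""
        self.process.stdin.write("\n".join(args) + "\n-execute\n")
        self.process.stdin.flush()
        lines = []
        while True:
            line = self.process.stdout.readline()
            if not line or line.strip() == self.SENTINEL:
                break
            lines.append(line)
        return "".join(lines)
```

With exiftool installed, `with ExifToolSession() as et: print(et.execute("-j", "-G", "image.jpg"))` would print the metadata of a file named `image.jpg` as JSON. The context manager ensures the background process is shut down even if an error occurs inside the block.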

In [None]:
# exif example: calculate time offset with Utility Module
s_gps = "2020:06:15 11:55:00" # time read from image
# reading metadata from the example above
s_cam = meta_list['CreateDate']
offset = U.get_time_offset(time_camera=s_cam,time_gps=s_gps,debug=False) // timedelta(minutes=1)
print(f"GPS time:{s_gps} Cam Time:{s_cam}, Offset Camera - GPS is {offset} minutes")

# Controller Module
The Controller module plugs together all functionality in order to assemble all kinds of metadata (GPS, metadata default values, ...) and write them to image files. The main method is `process_images` ...
For an operational example refer to the notebook [IMAGE_META_WORKFLOW](./IMAGE_META_WORKFLOW.ipynb)