# Automating the georeferencing process
The notebook below processes a single image: 1936 Olney Springs, via figshare. This code will eventually be split into .py files and wrapped in loops, but it gives an idea of how the workflow currently stands. For this workflow, you will need to hard-code:

CELL 2:
- `img_format`: the image format (e.g. `.jpg` vs. `.tif`)
- `specified_df`: the name of the .csv file with image and centroid data
- `pixel_size_estim`: the pixel size as a float (for this workflow, pixel size was calculated by hand using QGIS)
- `neg_pixel_size_estim`: the negative of the above float

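CELL 3:
- `img_input_path`: path to the images (the code searches the `images` directory for files ending in `img_format`)
- `output_dir`: path to write the processed image
- `centroid_df_path`: path to the .csv file with image and centroid data, which is read into `centroid_df`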

CELL 4:
- This notebook assumes images are in .tif format. If images are in .jpg format, comment out cell 4. If images are in a different format, replace '.tif' with the string for that format.

TO DOs:
- Need to make reproducible
- Need to clean comments
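
As one possible step toward the reproducibility item above, the values hard-coded in cells 2 and 3 could be gathered in a single object. This is only a minimal sketch and not part of the notebook: `GeorefSettings`, `img_dir`, and `img_glob_pattern` are hypothetical names introduced for illustration, while the other field names and values are taken from cells 2 and 3.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GeorefSettings:
    """Hypothetical container for the values hard-coded in cells 2 and 3."""
    img_format: str            # e.g. '.tif' or '.jpg'
    specified_df: str          # name of the .csv with image and centroid data
    pixel_size_estim: float    # pixel size, measured by hand in QGIS
    img_dir: str = 'images'    # assumed input folder, as used in cell 3
    output_dir: str = 'outputs'

    @property
    def neg_pixel_size_estim(self) -> float:
        # Derived from pixel_size_estim so the two values cannot drift apart
        return -self.pixel_size_estim

    @property
    def img_glob_pattern(self) -> str:
        # Same pattern that cell 3 passes to glob()
        return os.path.join(self.img_dir, '*' + self.img_format)


settings = GeorefSettings(
    img_format='.tif',
    specified_df='Weld_20200430132018_DD.csv',
    pixel_size_estim=0.950938,
)
```

Deriving the negative pixel size removes one of the values that currently has to be typed twice.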

In [1]:
# Import packages
import os
from glob import glob
from osgeo import gdal, osr

import pandas as pd
import pyproj
from pyproj import Proj

import earthpy as et

In [2]:
# Answer these questions to run through the remainder of the notebook:

# What format are the images in: .jpg vs. .tif?
img_format = '.tif'

# What is the .csv name with images and centroid data?
# FOR TESTING: 'ElPaso_Batch1_YL_20180124_geometa.csv'
specified_df = 'Weld_20200430132018_DD.csv'

# What is the pixel size (pixel width and height)?
# Remember to use the negative value for the second variable
pixel_size_estim = 0.950938
neg_pixel_size_estim = -0.950938

In [3]:
# Set paths and import all images with specified format
img_input_path = glob(os.path.join('images', '*' + img_format))

# Create/Set path for georeferenced images
output_dir = os.path.join('outputs')
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# Import dataframe with images and centroid data
centroid_df_path = os.path.join(specified_df)
centroid_df = pd.read_csv(centroid_df_path)

In [4]:
# COMMENT OUT IF WORKING WITH .JPG IMAGES
# Format name in dataframe from .jpg to .tif
# (the dot is escaped and the pattern anchored so only a trailing '.jpg' is replaced)
centroid_df.lnexp_MEDIAFILENAME = centroid_df.lnexp_MEDIAFILENAME.replace({r'\.jpg$': '.tif'}, regex=True)
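
Cell 5 hard-codes UTM zone 13N (EPSG:32613) for every image. If the workflow is later extended to centroids outside that zone, the EPSG code could be derived from each centroid instead. The sketch below is an assumption about how that might look, not something the notebook does: `utm_epsg_from_lonlat` is a hypothetical helper, it uses the standard 6-degree zone rule, and it ignores the special-case zones around Norway and Svalbard.

```python
def utm_epsg_from_lonlat(lon: float, lat: float) -> int:
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat)."""
    if not (-180.0 <= lon <= 180.0) or not (-80.0 <= lat <= 84.0):
        raise ValueError("coordinates outside the UTM domain")
    # Zones are 6 degrees wide, numbered 1 to 60 eastward from 180 W;
    # lon == 180 is clamped into zone 60
    zone = min(int((lon + 180.0) // 6) + 1, 60)
    # EPSG 326xx are the northern-hemisphere zones, 327xx the southern ones
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Approximate location of Olney Springs, Colorado (illustrative values only)
epsg = utm_epsg_from_lonlat(-103.9, 38.2)
```

For centroids in eastern Colorado this returns 32613, which matches the value cell 5 assigns by hand.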

In [5]:
# Loop through image names for centroid and georeferencing
for images in img_input_path:
    src_ds = gdal.Open(images)
        
    # Selecting only the name of the source image (removes the directory part of the path)
    img_src_name = os.path.basename(os.path.normpath(images))
    
    # Filtering to images with corresponding centroid data
    if img_src_name in centroid_df.lnexp_MEDIAFILENAME.values:  
        print('Successfully working through image:', img_src_name)
        
        # Describe source image size (RasterXSize = columns, RasterYSize = rows)
        img_width = src_ds.RasterXSize
        img_height = src_ds.RasterYSize
        #print(img_width, img_height)

        # Grabbing image centroid coordinates from dataframe
        lon = centroid_df.loc[centroid_df['lnexp_MEDIAFILENAME']==img_src_name, 'DDX'].iloc[0]
        lat = centroid_df.loc[centroid_df['lnexp_MEDIAFILENAME']==img_src_name, 'DDY'].iloc[0]
        #print(lon, lat)
        
        # Reprojection from lon/lat to UTM zone 13N (northern hemisphere is the default)
        myProj = Proj("+proj=utm +zone=13 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
        easting,northing = myProj(lon, lat) 
        #print('Centroid UTM coordinates: ', easting, northing)

        # Counting pixels from the center to the left edge and to the top edge
        half_width_px = img_width/2
        half_height_px = img_height/2

        # Calculating coordinates of the top left corner from the centroid
        x_topleft = easting - (pixel_size_estim/2) - (pixel_size_estim * half_width_px)
        y_topleft = northing + (pixel_size_estim/2) + (pixel_size_estim * half_height_px)
        #print('Top left UTM coordinates: ', x_topleft, y_topleft)

        # Reformatting the image to geotiff
        # (named driver_name to avoid shadowing the built-in format())
        driver_name = 'GTiff'
        driver = gdal.GetDriverByName(driver_name)

        # Look up the year the image was taken and its county
        year_details = centroid_df.loc[centroid_df['lnexp_MEDIAFILENAME']==img_src_name, 'Date'].iloc[0][5:]
        county_details = centroid_df.loc[centroid_df['lnexp_MEDIAFILENAME']==img_src_name, 'County#2'].iloc[0]

        # Create copy with new name: <image name>-<year>-<county><img_format>
        img_stem = os.path.splitext(img_src_name)[0]
        dst_name = img_stem + "-" + year_details + "-" + county_details + img_format
        dst_ds = driver.CreateCopy(os.path.join(output_dir, dst_name), src_ds, 0)

        # Set top left corner coordinates in UTM with pixel size and rotation
        gt = [x_topleft, pixel_size_estim, 0, y_topleft, 0, neg_pixel_size_estim]
        dst_ds.SetGeoTransform(gt)

        # Assign CRS
        epsg = 32613 #utm zone 13n
        srs = osr.SpatialReference()
        srs.ImportFromEPSG(epsg)
        dst_wkt = srs.ExportToWkt()
        dst_ds.SetProjection(dst_wkt)

        # Close and finalize dst_ds 
        dst_ds = None 
        src_ds = None
    
    else:
        print('Corresponding image and coordinate data missing from:',img_src_name)

Corresponding image and coordinate data missing from: ag274055.tif
Corresponding image and coordinate data missing from: BOW001021.tif
Corresponding image and coordinate data missing from: BOW001021_UnProj.tif
Corresponding image and coordinate data missing from: olneySprings_modified.tif
Corresponding image and coordinate data missing from: olneySprings_UTMZ13.tif
Successfully working through image: yb002045.tif
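
The top-left corner arithmetic in cell 5 can be isolated in a small pure-Python function, which makes it easier to check by hand without opening an image. This is a sketch rather than the notebook's own code: `geotransform_from_centroid` and its parameter names are hypothetical. It reproduces the notebook's extra half-pixel shift, which appears to treat the centroid as the center of the middle pixel; that interpretation is an assumption worth verifying against a known control point.

```python
def geotransform_from_centroid(easting, northing, n_cols, n_rows, pixel_size):
    """Build a north-up GDAL geotransform from an image centroid in map units.

    Mirrors the arithmetic in cell 5, including the half-pixel shift.
    """
    x_topleft = easting - pixel_size / 2 - pixel_size * (n_cols / 2)
    y_topleft = northing + pixel_size / 2 + pixel_size * (n_rows / 2)
    # GDAL order: x origin, pixel width, row rotation,
    #             y origin, column rotation, pixel height (negative for north-up)
    return (x_topleft, pixel_size, 0.0, y_topleft, 0.0, -pixel_size)


# Illustrative values only: a 1000 x 800 pixel image with 1 m pixels
gt = geotransform_from_centroid(500000.0, 4200000.0, 1000, 800, 1.0)
```

The returned tuple has the same layout as the list passed to `SetGeoTransform` in cell 5, so it could be handed to that call directly.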
