# Looping over a series of images for batch processing

An essential technique is the ability to batch process images using a 'for loop'. 

We have already covered 'for loops' earlier in this class. 

You should remember that they allow you to repeat a specific block of code a known number of times.

For example, we might want to carry out the same set of processing steps once per multispectral image.

Here, we will use a small example from the mid-section of Shenandoah, captured in spring 2023.

First, we need to import our required packages and define our scale function.

In [1]:
# Example
import os
import rasterio
import numpy as np 
from matplotlib import pyplot

def scale(band): # scale values for display purposes
    return band / 20000.0
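
To see what `scale` does before applying it to real imagery, here is a minimal sketch on a tiny array. The pixel values are made up for illustration and are not taken from the Shenandoah files.

```python
import numpy as np

def scale(band):  # same helper as above: scale values for display purposes
    return band / 20000.0

# Hypothetical raw pixel values, invented purely for this illustration
raw = np.array([[0, 5000], [10000, 20000]])

# Every pixel is divided by 20000, so a raw value of 20000 becomes 1.0
scaled = scale(raw)
print(scaled)
```

Because NumPy applies the division to every element at once, there is no need to loop over individual pixels.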

Next, we will want to specify the names of the files we want to load in (e.g., `my_time_series_set` below). 

We will do this using a list of lists. This allows us to loop over each list one at a time. 

Within each list, we will have the file paths of the bands we want to use, followed, if necessary, by any other relevant information, such as a filename for the output.


In [2]:
# Example
my_time_series_set = [
    [
        'time_series/2023-03-04-00_00_2023-03-04-23_59_Sentinel-2_L2A_B04_(Raw).tiff',
        'time_series/2023-03-04-00_00_2023-03-04-23_59_Sentinel-2_L2A_B08_(Raw).tiff',
        '2023-03-04-00_00_2023'
    ],
    [
        'time_series/2023-04-13-00_00_2023-04-13-23_59_Sentinel-2_L2A_B04_(Raw).tiff',
        'time_series/2023-04-13-00_00_2023-04-13-23_59_Sentinel-2_L2A_B08_(Raw).tiff',
        '2023-04-13-00_00_2023'
    ],
    [
        'time_series/2023-04-18-00_00_2023-04-18-23_59_Sentinel-2_L2A_B04_(Raw).tiff',
        'time_series/2023-04-18-00_00_2023-04-18-23_59_Sentinel-2_L2A_B08_(Raw).tiff',
        '2023-04-18-00_00_2023'
    ],
]
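
Typing every path by hand becomes tedious for longer time series. As a rough sketch, the same list of lists could be generated from a list of dates, assuming every file follows the naming pattern shown above. The helper name `build_time_series_set` and the `dates` and `bands` lists are hypothetical and only for illustration.

```python
# Assumed inputs: the acquisition dates and band names used in this example
dates = ['2023-03-04', '2023-04-13', '2023-04-18']
bands = ['B04', 'B08']

def build_time_series_set(dates, bands, folder='time_series'):
    """Hypothetical helper: build one [band paths..., label] list per date."""
    time_series_set = []
    for date in dates:
        # Shared part of each filename, e.g. '2023-03-04-00_00_2023-03-04-23_59_Sentinel-2_L2A'
        prefix = '{}-00_00_{}-23_59_Sentinel-2_L2A'.format(date, date)
        entry = []
        for band in bands:
            entry.append('{}/{}_{}_(Raw).tiff'.format(folder, prefix, band))
        # Final element: the label later used as the output filename
        entry.append('{}-00_00_{}'.format(date, date[:4]))
        time_series_set.append(entry)
    return time_series_set

generated_set = build_time_series_set(dates, bands)
print(generated_set[0])
```

If your filenames follow a different pattern, the two `format()` templates would need adjusting to match.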

Using this structure, we can access each list as follows:


In [3]:
# Example
for my_time_series in my_time_series_set:
    print(my_time_series)

['time_series/2023-03-04-00_00_2023-03-04-23_59_Sentinel-2_L2A_B04_(Raw).tiff', 'time_series/2023-03-04-00_00_2023-03-04-23_59_Sentinel-2_L2A_B08_(Raw).tiff', '2023-03-04-00_00_2023']
['time_series/2023-04-13-00_00_2023-04-13-23_59_Sentinel-2_L2A_B04_(Raw).tiff', 'time_series/2023-04-13-00_00_2023-04-13-23_59_Sentinel-2_L2A_B08_(Raw).tiff', '2023-04-13-00_00_2023']
['time_series/2023-04-18-00_00_2023-04-18-23_59_Sentinel-2_L2A_B04_(Raw).tiff', 'time_series/2023-04-18-00_00_2023-04-18-23_59_Sentinel-2_L2A_B08_(Raw).tiff', '2023-04-18-00_00_2023']


We can also access each individual element of a list using a square-bracket index, e.g., `[0]` for the first element.

In [4]:
# Example
for my_time_series in my_time_series_set:
    print(my_time_series[0])

time_series/2023-03-04-00_00_2023-03-04-23_59_Sentinel-2_L2A_B04_(Raw).tiff
time_series/2023-04-13-00_00_2023-04-13-23_59_Sentinel-2_L2A_B04_(Raw).tiff
time_series/2023-04-18-00_00_2023-04-18-23_59_Sentinel-2_L2A_B04_(Raw).tiff
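
As an alternative to indexing with `[0]`, `[1]` and `[2]`, Python can unpack each inner list directly into named variables in the `for` statement. The sketch below uses a shortened, hypothetical list (`example_set`) with placeholder filenames so that it runs on its own.

```python
# Hypothetical, shortened version of the list of lists (placeholder filenames)
example_set = [
    ['march_B04.tiff', 'march_B08.tiff', 'march'],
    ['april_B04.tiff', 'april_B08.tiff', 'april'],
]

labels = []
# Each inner list has exactly three elements, so it unpacks into three names
for red_path, nir_path, label in example_set:
    print(label, red_path, nir_path)
    labels.append(label)
```

This only works when every inner list has the same number of elements, but it can make the loop body easier to read than numeric indexes.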


Given this, we can now loop over our images, read each band, scale its pixel values, and assign the resulting array to a variable named after the band, e.g., `red` or `nir`.

In [5]:
# Example
for my_time_series in my_time_series_set:

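    # Read band 1 of the red (B04) file; the file is closed when the block ends
    with rasterio.open(my_time_series[0]) as my_raster_image:
        red = scale(my_raster_image.read(1))

    # Read band 1 of the near-infrared (B08) file in the same way
    with rasterio.open(my_time_series[1]) as my_raster_image:
        nir = scale(my_raster_image.read(1))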
    
    print(red, nir)

[[0.18675 0.19265 0.22675 ... 0.25495 0.24575 0.2536 ]
 [0.1953  0.18415 0.19595 ... 0.2382  0.2287  0.2382 ]
 [0.1999  0.2369  0.09665 ... 0.2176  0.24445 0.233  ]
 ...
 [0.12155 0.12845 0.1232  ... 0.17235 0.16515 0.19595]
 [0.11205 0.116   0.10355 ... 0.19105 0.18675 0.1953 ]
 [0.1396  0.12515 0.10255 ... 0.1812  0.18385 0.2238 ]] [[0.5043  0.5492  0.61605 ... 0.5551  0.5472  0.56755]
 [0.53805 0.58785 0.50725 ... 0.5459  0.54985 0.55445]
 [0.5043  0.6344  0.30965 ... 0.49545 0.56885 0.5603 ]
 ...
 [0.31425 0.3552  0.29885 ... 0.42565 0.44105 0.5269 ]
 [0.31    0.29915 0.26445 ... 0.46955 0.47315 0.48625]
 [0.37485 0.3431  0.2595  ... 0.443   0.45745 0.5302 ]]
[[0.22805 0.22805 0.291   ... 0.27525 0.2795  0.2936 ]
 [0.213   0.2215  0.2818  ... 0.2631  0.26675 0.269  ]
 [0.2631  0.2936  0.1809  ... 0.2582  0.2877  0.2723 ]
 ...
 [0.1966  0.2202  0.18385 ... 0.21955 0.2117  0.23755]
 [0.1835  0.1953  0.1586  ... 0.2287  0.24215 0.24215]
 [0.19925 0.18515 0.16515 ... 0.23135 0.23855 0.

The world is now your oyster. You can carry out essentially any bespoke processing you want on your set of images.

For example, you could begin by calculating the NDVI (Normalized Difference Vegetation Index) for a series of images and then thresholding it into a binary mask, as follows:

In [8]:
# Example
for my_time_series in my_time_series_set:

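    # Read band 1 of the red (B04) file; the file is closed when the block ends
    with rasterio.open(my_time_series[0]) as my_raster_image:
        red = scale(my_raster_image.read(1))

    # Read band 1 of the near-infrared (B08) file in the same way
    with rasterio.open(my_time_series[1]) as my_raster_image:
        nir = scale(my_raster_image.read(1))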

    ndvi = (nir - red) / (nir + red)
    
    ndvi_index = np.zeros(nir.shape)
    threshold = 0.4
    ndvi_index[(ndvi > threshold)] = 1

    print(ndvi_index)

[[1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 1. 0.]
 [1. 1. 1. ... 0. 0. 1.]
 ...
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]]
[[1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 0. 0.]
 [0. 1. 1. ... 0. 0. 0.]
 ...
 [1. 1. 1. ... 0. 1. 1.]
 [1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 0. 0.]]
[[1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 0. 0.]
 [1. 1. 1. ... 0. 1. 0.]
 ...
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 1. 1. 1.]
 [1. 1. 1. ... 0. 1. 0.]]
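
One thing to watch for in the NDVI calculation is a pixel where `nir + red` equals zero (for example, a no-data pixel), since the division then produces a warning and a `nan` value. The sketch below shows one possible way to guard against this using small made-up arrays. The helper name `safe_ndvi` is hypothetical, and setting such pixels to 0 is an assumption you may want to change.

```python
import numpy as np

def safe_ndvi(nir, red):
    """Hypothetical helper: NDVI with zero-denominator pixels set to 0."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denominator = nir + red
    ndvi = np.zeros(denominator.shape)
    # Only divide where the denominator is non-zero; other pixels stay at 0
    np.divide(nir - red, denominator, out=ndvi, where=denominator != 0)
    return ndvi

# Made-up band values; the top-right pixel has nir + red == 0
nir = np.array([[0.5, 0.0], [0.3, 0.6]])
red = np.array([[0.1, 0.0], [0.3, 0.2]])

ndvi = safe_ndvi(nir, red)

# Same thresholding step as in the loop above
threshold = 0.4
ndvi_index = np.zeros(nir.shape)
ndvi_index[ndvi > threshold] = 1
print(ndvi_index)
```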


Obviously, you need some way to save this information. 

You could extract these values to a .csv, which is a good way to capture processed data for later visualization. 
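
As a rough sketch of the .csv option, the example below writes one summary row per date, recording the fraction of pixels above the threshold. The arrays in `results`, the column names, and the output filename `ndvi_summary.csv` are all made up for illustration.

```python
import csv
import numpy as np

# Hypothetical results: one thresholded array per date (made-up values)
results = {
    '2023-03-04': np.array([[1, 1], [0, 1]]),
    '2023-04-13': np.array([[1, 0], [0, 0]]),
}

rows = []
for label, ndvi_index in results.items():
    # The mean of a 0/1 array is the fraction of pixels equal to 1
    fraction = float(ndvi_index.mean())
    rows.append([label, fraction])

# Write a header row followed by one row per date
with open('ndvi_summary.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['date', 'fraction_above_threshold'])
    writer.writerows(rows)
```

If you would rather keep every pixel value, `np.savetxt` with `delimiter=','` can write a whole 2D array to a text file, at the cost of much larger files.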

Here, let us export the results in graphical form. The trick is exporting each iteration to a new file, carried out here using the filename added in each list (`my_time_series[2]`).

We can build the full filename, including the `.png` extension, using our favorite `format()` function:

In [7]:
# Example
for my_time_series in my_time_series_set:

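    # Read band 1 of the red (B04) file; the file is closed when the block ends
    with rasterio.open(my_time_series[0]) as my_raster_image:
        red = scale(my_raster_image.read(1))

    # Read band 1 of the near-infrared (B08) file in the same way
    with rasterio.open(my_time_series[1]) as my_raster_image:
        nir = scale(my_raster_image.read(1))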

    ndvi = (nir - red) / (nir + red)
    
    ndvi_index = np.zeros(nir.shape)
    ndvi_index[(ndvi > 0.4)] = 1

    pyplot.imshow(ndvi_index)
    pyplot.colorbar(shrink=0.7)

    filename = "{}.png".format(my_time_series[2])
    pyplot.savefig(filename, bbox_inches='tight')
    pyplot.close()
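
With many images, the exported files can quickly clutter your working directory. One option, sketched here, is to write them into a dedicated folder using the `os` package we imported at the start. The folder name `ndvi_plots` and the `labels` list are assumptions for illustration.

```python
import os

# Hypothetical output folder; created only if it does not already exist
output_folder = 'ndvi_plots'
os.makedirs(output_folder, exist_ok=True)

# Labels taken from the third element of each inner list above
labels = ['2023-03-04-00_00_2023', '2023-04-13-00_00_2023']

filenames = []
for label in labels:
    # Join the folder and the formatted filename into a single path
    filename = os.path.join(output_folder, '{}.png'.format(label))
    filenames.append(filename)
    print(filename)
```

Each `filename` built this way could then be passed to `pyplot.savefig()` in place of the one used in the loop above.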


You will need to evaluate the data/graphics you export in order to understand the implications of your code. As part of this, you can complete the following exercise.

## Exercise

Please undertake the following to reinforce the key learning steps in this tutorial:
    
- Extract a set of multiband images, e.g., from SentinelHub.
- Choose a small area, but try to find somewhere with a key temporal change sequence. 
- Loop over those images and implement a spectral index of interest, e.g., to detect changes in snow cover, burnt area, algal blooms, etc. We have covered plenty in the class. You are not allowed to use NDVI.
- Write adequate notes for your processing code. 
- Export the final graphics. 
- Critically evaluate your results.