# SaxaVord EO Challenge

Welcome to your challenge. Focussing on an Area of Interest (AOI) over the Faroe Islands, you can choose one of the following:

* Quantify the extent of algae bloom changes and their impact on nearby salmon farms
* Quantify the extent of recent coastal erosion

## Check Environment Active

In [None]:
# Desktop UI:
# The active environment should also be shown at the top right-hand side (RHS) of your screen, below the LogOut button

In [None]:
# Conda
#  -- The active environment is marked with a '*' next to its name
!conda env list

In [None]:
# Pip
# TODO: Add this for pip !pip XYZ

## Import Packages

In [2]:
# Packages
import os
import glob

import rasterio  # referenced as rasterio.open(...) in the cells below
import numpy as np
import matplotlib.pyplot as plt

from zipfile import ZipFile


In [None]:
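# Visualisation Params: use ONE of the two backends below
# (comments are kept on their own lines because magic commands do not accept trailing comments)

# inline: plots images in the Jupyter notebook cells
%matplotlib inline

# qt5: opens a separate window to view images (uncomment to use instead of inline)
# %matplotlib qt5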

## Sentinel 2

### The Data Folder Structure
Satellite imagery products are folders that store imagery data in a standardised layout, with standard naming conventions for the folders and files. Please see the SaxaVord PDF for an explanation of the Sentinel 2 naming convention.

A single image product of a region at DateTime YYYYMMDD-HHMMSS is packaged in a folder with a hierarchical file structure. In our case this is:

    |- S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212 *folder for a single day-time image product*
        |-S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212.SAFE
            |-GRANULE
                |-L2A_T30VUQ_A036242_20220531T120405
                    |-IMG_DATA
                        |- R10m  *resolution folders*
                            |-T30VUQ_20220531T120411_B02_10m.jp2 *individual band file*
                            |-T30VUQ_20220531T120411_B03_10m.jp2
                            |-...
                            |-T30VUQ_20220531T120411_TCI_10m.jp2 *TCI = true colour image, multi-band file*

                        |- R20m
                            |-T30VUQ_20220531T120411_B02_20m.jp2
                            |-T30VUQ_20220531T120411_B03_20m.jp2
                            |-...
                        |- R60m
                            |-T30VUQ_20220531T120411_B02_60m.jp2
                            |-T30VUQ_20220531T120411_B03_60m.jp2
                            |-...
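
As a rough illustration of the naming convention, the sketch below splits a product folder name into its underscore-separated fields. The function name `parse_product_name` and the dictionary keys are hypothetical, chosen here for illustration only; please treat the SaxaVord PDF as the reference for what each field means.

```python
def parse_product_name(product_name):
    """Split a Sentinel-2 product name into its underscore-separated fields."""
    # Drop a trailing '.SAFE' or '.zip' extension if present
    for ext in ('.SAFE', '.zip'):
        if product_name.endswith(ext):
            product_name = product_name[:-len(ext)]

    # Full (un-shortened) product names have exactly seven fields
    mission, level, sensing, baseline, orbit, tile, discriminator = product_name.split('_')
    sensing_date, sensing_time = sensing.split('T')

    return {
        'mission': mission,                      # e.g. 'S2A'
        'product_level': level,                  # e.g. 'MSIL2A'
        'sensing_date': sensing_date,            # YYYYMMDD
        'sensing_time': sensing_time,            # HHMMSS
        'processing_baseline': baseline,         # e.g. 'N0400'
        'relative_orbit': orbit,                 # e.g. 'R066'
        'tile': tile,                            # e.g. 'T30VUQ'
        'product_discriminator': discriminator,  # e.g. '20220531T183212'
    }


info = parse_product_name('S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212.SAFE')
print(info['sensing_date'], info['sensing_time'], info['tile'])  # prints: 20220531 120411 T30VUQ
```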


### Image Bands
* True Colour Images (RGB) are composed of 3 bands, Red, Green and Blue: the visible light section of the Electromagnetic Spectrum.

* Satellite sensors cover a far broader section of this spectrum, and therefore capture a larger number of bands.

* Different phenomena on Earth and within the atmosphere react differently to wavelengths across the spectrum (i.e. to different bands), allowing us to make inferences about what is occurring below.

* Certain bands are therefore useful for different environmental analyses.
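
As one hedged illustration of combining bands, the sketch below computes the Normalised Difference Water Index, NDWI = (Green - NIR) / (Green + NIR), which tends to be positive over water and negative over land. NDWI is not part of the original notebook; it is used here only as an example, and small made-up arrays stand in for the Green (B03) and Near-Infrared (B08) bands.

```python
import numpy as np


def ndwi(green, nir):
    """Normalised Difference Water Index: (green - nir) / (green + nir)."""
    # Convert to float first: subtracting unsigned integers would wrap around
    green = green.astype('float64')
    nir = nir.astype('float64')
    total = green + nir

    # Leave NaN where both bands are 0, to avoid dividing by zero
    out = np.full(total.shape, np.nan)
    np.divide(green - nir, total, out=out, where=total != 0)
    return out


# Made-up uint16 values standing in for real band data
green = np.array([[3000, 500], [0, 1200]], dtype='uint16')
nir = np.array([[1000, 1500], [0, 1200]], dtype='uint16')
print(ndwi(green, nir))
```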


### Working with This Data
For our data we have:
* Images across dates
* Each image has multiple bands in multiple resolutions
* The filepath for each band is going to be very verbose and difficult to track due to the hierarchical structure

To make all these paths easier to work with, we shall give each image its own dictionary and store all of its filepaths there. This will enable easy referencing.

```
img_band_dict = {
                'img_folder': 'S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212',

                 'img_date': '20220531',
                 'img_time': '120411',
                 
                 'R10m':{'B02': 'T30VUQ_20220531T120411_B02_10m.jp2',
                         'B03': 'T30VUQ_20220531T120411_B03_10m.jp2',
                         ...
                         'TCI': 'T30VUQ_20220531T120411_TCI_10m.jp2'
                        },
                        
                  'R20m':{'B02': 'T30VUQ_20220531T120411_B02_20m.jp2',
                          'B03': 'T30VUQ_20220531T120411_B03_20m.jp2',
                           ...
                          },
                          
                  'R60m':{'B02': 'T30VUQ_20220531T120411_B02_60m.jp2',
                          'B03': 'T30VUQ_20220531T120411_B03_60m.jp2',
                           ...
                          }
                 }
                 
```
                 

**For Example:** To create a True Colour Image (RGB) at 10m resolution, we use the Red (B04), Green (B03) and Blue (B02) bands. The individual bands can then be accessed as follows:

```
## 10m resolution:
r_10m_filepath = img_band_dict['R10m']['B04']
r_10m_band_img = get_band(r_10m_filepath)

g_10m_filepath = img_band_dict['R10m']['B03']
g_10m_band_img = get_band(g_10m_filepath)

b_10m_filepath = img_band_dict['R10m']['B02']
b_10m_band_img = get_band(b_10m_filepath)
```

## Zipped Image Folders: Rename & Extract All Data

In [3]:
# Function to extract source folder.zip to target folder
def extract_from_zip(source, target):
    with ZipFile(source, 'r') as zip_ref:
        zip_ref.extractall(target)       


# Path from notebook to folder containing zipped folder for each day's data
s2_imgs_folder = 'Faroe Islands Satellite Data'


# Windows: zip folder names may be too long upon extraction, so shorten them
# by removing everything from '_N0400' onwards
for file_name in os.listdir(s2_imgs_folder):
    # os.path.join picks the correct path separator for Windows, Mac and Linux
    file = os.path.join(s2_imgs_folder, file_name)

    # Only operate on zipped folders
    if file.endswith('.zip'):
        t = '_N0400'
        ind = file.find(t)

        # Renaming required
        if ind >= 0:
            file_new = file[:ind] + '.zip'
            os.rename(file, file_new)

        # No renaming required
        else:
            file_new = file

        # Extract all files from .zip into a folder of the same name
        file_new_unzip = file_new[:-4]
        extract_from_zip(file_new, file_new_unzip)


# Check renamed correctly
os.listdir(s2_imgs_folder)

['S2A_MSIL2A_20220531T120411',
 'S2A_MSIL2A_20220531T120411.zip',
 'S2A_MSIL2A_20220607T115411',
 'S2A_MSIL2A_20220607T115411.zip',
 'S2B_MSIL2A_20220327T120359',
 'S2B_MSIL2A_20220327T120359.zip',
 'S2B_MSIL2A_20220419T121349',
 'S2B_MSIL2A_20220419T121349.zip',
 'S2B_MSIL2A_20220705T120359',
 'S2B_MSIL2A_20220705T120359.zip',
 'S2B_MSIL2A_20220831T115359',
 'S2B_MSIL2A_20220831T115359.zip']
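
Re-running the extraction cell above unzips every archive again. A minimal sketch of one way to skip archives that have already been unzipped is given below; `extract_if_needed` is a hypothetical helper (not part of the original notebook), and it assumes that an existing, non-empty target folder means the earlier extraction completed.

```python
import os
from zipfile import ZipFile


def extract_if_needed(source, target):
    """Extract source (.zip) into target unless target already has content.

    Returns True if the archive was extracted, False if it was skipped.
    """
    # Assumption: a non-empty target folder means extraction already happened
    if os.path.isdir(target) and os.listdir(target):
        return False

    with ZipFile(source, 'r') as zip_ref:
        zip_ref.extractall(target)
    return True
```

In the loop above, `extract_if_needed(file_new, file_new_unzip)` could stand in for the call to `extract_from_zip`.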

## Create Dictionaries of Band File Paths

In [5]:
# Get list of unzipped folders

lst = os.listdir(s2_imgs_folder)

unzipped_img_folders = [x for x in lst if not x.endswith('.zip')]

unzipped_img_folders


['S2A_MSIL2A_20220531T120411',
 'S2A_MSIL2A_20220607T115411',
 'S2B_MSIL2A_20220327T120359',
 'S2B_MSIL2A_20220419T121349',
 'S2B_MSIL2A_20220705T120359',
 'S2B_MSIL2A_20220831T115359']



In [12]:
all_img_dicts = []

# Sub-folders containing band data
res_set = ['R10m', 'R20m', 'R60m']

# Names of the band files found in the products (for reference)
bands = ['B01','B02','B03','B04','B05','B06', 'B07','B08', 'B8A', 'B09' ,'B11','B12', 'AOT','TCI','WVP', 'SCL']

for img_folder in unzipped_img_folders:

    # Folder names look like 'S2B_MSIL2A_20220419T121349'
    date = img_folder[11:19]   # sensing date, YYYYMMDD
    time = img_folder[20:26]   # sensing time, HHMMSS
    f = os.path.join(s2_imgs_folder, img_folder)

    # List all .jp2 band files that sit inside a resolution folder
    # (os.path.join picks the correct path separator for the operating system)
    all_band_files = [x for x in glob.glob(os.path.join(f, '**', '*.jp2'), recursive=True)
                      if any(res in x for res in res_set)]

    img_band_dict = {'img_folder': img_folder,
                     'img_date': date,
                     'img_time': time}

    # Create a key for each resolution
    for res in res_set:
        img_band_dict[res] = {}

    # Assign all band filepaths to dict
    for band_file in all_band_files:

        # For each resolution, get all band filepaths
        for res in res_set:

            # Band file is in the resolution folder (contains res = 'R10m')
            if res in band_file:

                # Get the band type (B08, B8A, TCI etc) from '..._B08_10m.jp2'
                band_type = band_file[-11:-8]

                # Store e.g. img_band_dict['R10m']['B03'] = 'filepath/to/R10m/img_B03.jp2'
                img_band_dict[res][band_type] = band_file

    all_img_dicts.append(img_band_dict)
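
The slice used above to get the band type relies on every filename ending in exactly three band characters, an underscore, and a resolution such as `10m.jp2`. As an alternative sketch, a regular expression can pull out the band and resolution more explicitly. `parse_band_filename` is a hypothetical helper, and the pattern is an assumption based only on the filenames shown in this notebook.

```python
import re

# Matches e.g. 'T30VUQ_20220531T120411_B8A_20m.jp2' -> band 'B8A', resolution '20'
BAND_PATTERN = re.compile(r'_(?P<band>[A-Z0-9]{3})_(?P<res>\d{2})m\.jp2$')


def parse_band_filename(filepath):
    """Return (band, resolution_folder) for a band file, or None if no match."""
    match = BAND_PATTERN.search(filepath)
    if match is None:
        return None
    return match.group('band'), 'R' + match.group('res') + 'm'


print(parse_band_filename('T30VUQ_20220531T120411_B8A_20m.jp2'))  # prints: ('B8A', 'R20m')
```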
            

In [23]:
all_img_dicts

[{'img_folder': 'S2A_MSIL2A_20220531T120411',
  'img_date': '20220531',
  'img_time': '120411',
  'R10m': {'AOT': 'Faroe Islands Satellite Data\\S2A_MSIL2A_20220531T120411\\S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212.SAFE\\GRANULE\\L2A_T30VUQ_A036242_20220531T120405\\IMG_DATA\\R10m\\T30VUQ_20220531T120411_AOT_10m.jp2',
   'B02': 'Faroe Islands Satellite Data\\S2A_MSIL2A_20220531T120411\\S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212.SAFE\\GRANULE\\L2A_T30VUQ_A036242_20220531T120405\\IMG_DATA\\R10m\\T30VUQ_20220531T120411_B02_10m.jp2',
   'B03': 'Faroe Islands Satellite Data\\S2A_MSIL2A_20220531T120411\\S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212.SAFE\\GRANULE\\L2A_T30VUQ_A036242_20220531T120405\\IMG_DATA\\R10m\\T30VUQ_20220531T120411_B03_10m.jp2',
   'B04': 'Faroe Islands Satellite Data\\S2A_MSIL2A_20220531T120411\\S2A_MSIL2A_20220531T120411_N0400_R066_T30VUQ_20220531T183212.SAFE\\GRANULE\\L2A_T30VUQ_A036242_20220531T120405\\IMG_DATA\\R10

In [24]:
img_band_dict = all_img_dicts[0]

## Import Band Data

We use Rasterio to import each band as a 2D array.

Note that Rasterio places the origin at the top left-hand side (LHS) of the image, so row 0, column 0 of the array is the top-left pixel.

TODO: Find picture of this to explain

In [None]:
## function that loads in a raster 


    
## stack bands for calculation/manipulation
# np.dstack

## normalisation??

## Perform calcs

In [121]:
## function to extract a single band file
## - Note: reads band 1 only, so it won't extract all bands of a multi-band file (e.g. TCI)
def get_band(filepath): 
    with rasterio.open(filepath) as f:
        
        print('Extracting band file')
        band_img = f.read(1, masked=True)
        
        print('Extracting band meta')
        band_img_meta = f.profile
        
        print('--> # bands:', band_img_meta['count'])
        print('--> datatype:', band_img_meta['dtype'])
        print('--> nodata:', band_img_meta['nodata'])
        print('--> height:', band_img_meta['height'])
        print('--> width:', band_img_meta['width'])
        
        return band_img     
    
        
band_img = get_band(img_band_dict['R10m']['B02'])



Extracting band file
Extracting band meta
--> # bands: 1
--> datatype: uint16
--> nodata: None
--> height: 10980
--> width: 10980


In [95]:
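# Understand Metadata: Explore metadata of file
# (get_band only returns the array, so re-open the file to read its profile)
with rasterio.open(img_band_dict['R10m']['B02']) as f:
    band_img_meta = f.profile

for key, value in band_img_meta.items():
    print(key, ':', value)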

driver : JP2OpenJPEG
dtype : uint16
nodata : None
width : 10980
height : 10980
count : 1
crs : EPSG:32630
transform : | 10.00, 0.00, 300000.00|
| 0.00,-10.00, 7000020.00|
| 0.00, 0.00, 1.00|
blockxsize : 1024
blockysize : 1024
tiled : True
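
The `transform` above maps pixel positions to map coordinates in the image's CRS (EPSG:32630): each pixel is 10 units wide and the top-left corner of the image sits at x = 300000, y = 7000020, with y decreasing as the row number increases. The sketch below reproduces that mapping in plain Python using the values printed above. The names `pixel_to_xy` and `xy_to_pixel` are hypothetical helpers written for this illustration, not Rasterio functions.

```python
# Values copied from the transform printed above
PIXEL_SIZE = 10.0
X_ORIGIN = 300000.0    # x of the top-left corner of the image
Y_ORIGIN = 7000020.0   # y of the top-left corner of the image


def pixel_to_xy(row, col):
    """Map coordinates of the top-left corner of pixel (row, col)."""
    x = X_ORIGIN + col * PIXEL_SIZE
    y = Y_ORIGIN - row * PIXEL_SIZE   # y decreases as we move down the rows
    return x, y


def xy_to_pixel(x, y):
    """(row, col) of the pixel that contains map coordinate (x, y)."""
    col = int((x - X_ORIGIN) // PIXEL_SIZE)
    row = int((Y_ORIGIN - y) // PIXEL_SIZE)
    return row, col


print(pixel_to_xy(0, 0))                  # prints: (300000.0, 7000020.0)
print(pixel_to_xy(10980, 10980))          # prints: (409800.0, 6890220.0)
print(xy_to_pixel(350005.0, 6950015.0))   # prints: (5000, 5000)
```

A row and column range found this way could be used to slice the band array down to an AOI.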


Bands:

* https://gisgeography.com/sentinel-2-bands-combinations/
* https://gdal.org/drivers/raster/sentinel2.html

Other files:

* AOT: Aerosol Optical Thickness
* WVP: Scene average water vapour
* TCI: True Colour Image (3 Band)

In [116]:
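## Extract individual bands

## 10m picture: load each band as a 2D array
b10 = get_band(img_band_dict['R10m']['B02'])   # Blue
g10 = get_band(img_band_dict['R10m']['B03'])   # Green
r10 = get_band(img_band_dict['R10m']['B04'])   # Red
n10 = get_band(img_band_dict['R10m']['B08'])   # Near-infrared


# 20m picture: list the bands available
img_band_dict['R20m'].keys()



# 60m picture: list the bands available
img_band_dict['R60m'].keys()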

dict_keys(['AOT', 'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B09', 'B11', 'B12', 'B8A', 'SCL', 'TCI', 'WVP'])

In [105]:
## Example of extracting TCI (RGB) image

with rasterio.open(img_band_dict['R10m']['TCI']) as f:
    print('Number of bands in file:', f.profile['count'])

    # Read all three bands (1 = red, 2 = green, 3 = blue) into a single array
    tci_img = f.read([1, 2, 3], masked=True)
    

Number of bands in file: 3


In [None]:
## Create a 3-band raster (under the hood, this is a numpy ndarray)

print('Creating Raster')
rgb_10 = np.dstack([r10,g10,b10])

print('Check datatype')
# Bit depth
print(rgb_10.dtype, rgb_10.shape)

## Visualisation

In [None]:
## Plot raster

## Bit conversion if out of range
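
The band values are `uint16`, so they generally need rescaling before `plt.imshow` will display them sensibly as an RGB image (it expects floats in the range 0 to 1, or integers in the range 0 to 255). One common approach, sketched below, is a percentile stretch. The function name `stretch_to_unit_range` and the 2nd/98th percentile limits are assumptions made for illustration, and a small random array stands in for `rgb_10`.

```python
import numpy as np


def stretch_to_unit_range(img, lower=2, upper=98):
    """Clip img to its lower/upper percentiles and rescale to floats in [0, 1]."""
    img = img.astype('float64')
    lo, hi = np.percentile(img, [lower, upper])

    # A constant image has nothing to stretch
    if hi == lo:
        return np.zeros_like(img)

    # Note: one pair of limits is applied to all bands together
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


# Made-up data standing in for the stacked 3-band raster
rng = np.random.default_rng(0)
fake_rgb = rng.integers(0, 10000, size=(50, 50, 3)).astype('uint16')

rgb_scaled = stretch_to_unit_range(fake_rgb)
print(rgb_scaled.dtype, rgb_scaled.shape)

# To display in the notebook (matplotlib is imported as plt above):
# plt.imshow(rgb_scaled)
```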