This Jupyter Notebook loads the Sentinel images provided by Greenspin and extracts the sugarbeet fields as 'patches' of size patch_size (currently 64 x 64), as set in the config file. The patches are then saved to disk. This process is run only twice in the project: once for the train data and once for the evaluation data. The pipeline function performs the following steps:

1. Load Sentinel Images and Corresponding Masks
2. Mask images - blacken the pixels that don't belong to sugarbeet fields
3. Extract patches (sugarbeet fields) from the masked temporal images
4. Refine the temporal stack: 7 cloud-free images per patch, with at least a 5-day gap between the acquisition dates of consecutive images
5. Define the base directory to save patches
6. Save the patches to disk in their respective temporal folders

### Imports

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import os, sys
from pathlib import Path

os.environ['PYTORCH_ENABLE_MPS_FALLBACK'] = '1'
sys.path.append('/home/k64835/Master-Thesis-SITS')
sys.path.append('/Users/bhumikasadbhave007/Documents/THWS/Semester-4/MASTER-THESIS/GITHUB/Master-Thesis-SITS')

scripts_path = Path("../Data-Preprocessing/").resolve()
sys.path.append(str(scripts_path))

scripts_path = Path("../Evaluation/").resolve()
sys.path.append(str(scripts_path))

scripts_path = Path("../Modeling/").resolve()
sys.path.append(str(scripts_path))

import pickle
from scripts.data_visualiser import *
from scripts.data_loader import *
from scripts.data_preprocessor import *
from scripts.temporal_data_preprocessor import *
from scripts.temporal_data_loader import *
from scripts.temporal_visualiser import *
from scripts.temporal_chanel_refinement import *
from Pipeline.temporal_preprocessing_pipeline import *
import numpy as np
import config as config

### Single function to extract and save patches

In [3]:
# pipeline = PreProcessingPipelineTemporal()

# pipeline.run_temporal_patch_save_pipeline(type='train')
# pipeline.run_temporal_patch_save_pipeline(type='eval')

### Step-by-step extract and save patches

The cells below run the same pipeline step by step. My system could not load all of the images at once, so I loaded 15 images at a time, saved their patches, and repeated the process for the next batch.
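
The batching described above can be sketched as follows. This is only an illustration: `iter_batches` and the scene names are hypothetical and are not part of the project's scripts. Each batch would be passed through the load, mask, extract, refine and save cells below before moving on to the next one.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 15) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical scene names; in practice these would be the scenes on disk.
scene_names = [f"scene_{i:02d}" for i in range(40)]
batch_sizes = [len(batch) for batch in iter_batches(scene_names, batch_size=15)]
print(batch_sizes)  # [15, 15, 10]
```

Because only one batch is held in memory at a time, peak memory depends on the batch size rather than on the total number of scenes.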

Loading Sentinel-2 Images

In [4]:
images = load_sentinel_images_temporal('/Users/bhumikasadbhave007/Documents/THWS/Semester-4/MASTER-THESIS/2024_data/buffer')

(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)
(1000, 1000, 13)


Masking Images to blacken pixels that don't belong to sugarbeet fields

In [5]:
masked_images = mask_images_temporal(images)
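
As a rough illustration of what masking means here, the sketch below zeroes every band of the pixels that fall outside a binary field mask. It is not the implementation of `mask_images_temporal`; how the project stores its masks is not shown in this notebook, so `mask_image` and the separate `field_mask` array are assumptions.

```python
import numpy as np


def mask_image(image: np.ndarray, field_mask: np.ndarray) -> np.ndarray:
    """Set all bands to 0 wherever field_mask is 0 (assumed: 1 = sugarbeet field)."""
    if image.shape[:2] != field_mask.shape:
        raise ValueError("mask must match the image's height and width")
    # Add a trailing axis so the (H, W) mask broadcasts over the band axis.
    keep = (field_mask > 0)[..., np.newaxis]
    return np.where(keep, image, 0)


# Tiny 2 x 2 image with 3 bands, all values non-zero.
image = np.arange(2 * 2 * 3, dtype=np.float32).reshape(2, 2, 3) + 1.0
field_mask = np.array([[1, 0],
                       [0, 1]])
masked = mask_image(image, field_mask)
```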

Extracting fields (as patches) for all temporal instances

In [6]:
fields = extract_fields_temporal(masked_images, config.patch_size)

--- Processed 88 regions for scene 0
--- Processed 1 regions for scene 1
--- Processed 5 regions for scene 2
--- Processed 22 regions for scene 3
--- Processed 22 regions for scene 4
--- Processed 3 regions for scene 5
--- Processed 2 regions for scene 6
--- Processed 1 regions for scene 7
--- Processed 1 regions for scene 8
--- Processed 16 regions for scene 9
--- Processed 5 regions for scene 10
--- Processed 4 regions for scene 11
--- Processed 3 regions for scene 12


In [7]:
len(fields)

173
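
One plausible way to turn a single field region into a fixed-size patch is to cut a `patch_size` window centred on the region's bounding box and zero-pad where the window leaves the image. The sketch below shows only that idea; `extract_patch` is a hypothetical helper, and the actual logic inside `extract_fields_temporal` (for example, how regions are detected) is not shown in this notebook.

```python
import numpy as np


def extract_patch(image: np.ndarray, region_mask: np.ndarray, patch_size: int) -> np.ndarray:
    """Return a (patch_size, patch_size, bands) window centred on the region."""
    rows, cols = np.nonzero(region_mask)
    if rows.size == 0:
        raise ValueError("region_mask contains no pixels")
    centre_r = (rows.min() + rows.max()) // 2
    centre_c = (cols.min() + cols.max()) // 2
    half = patch_size // 2
    # Zero-pad the spatial axes so windows near the border stay in bounds.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="constant")
    # Pixel (r, c) moved to (r + half, c + half), so the window that starts at
    # (centre - half) in the original starts at (centre) in the padded array.
    return padded[centre_r:centre_r + patch_size, centre_c:centre_c + patch_size, :]


image = np.ones((10, 10, 2), dtype=np.float32)
region_mask = np.zeros((10, 10), dtype=bool)
region_mask[0:2, 0:2] = True  # small field in the top-left corner
patch = extract_patch(image, region_mask, patch_size=4)
```

In this sketch, a region larger than `patch_size` would be cropped to the window.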

In [8]:
# np.unique(fields[1][0][:,:,-3])

In [9]:
# visualise_all_bands(fields[1][0])

Refining Temporal Stack -- Selecting 7 images according to the temporal ranges in config (with at least a 5-day gap between consecutive images)

In [10]:
refined_fields = refine_temporal_stack_interval_2024(fields, config.temporal_stack_size_2024, config.temporal_points_2024)

cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
Patch discarded: Missing images for some date ranges.
Flag array was:  [1. 0. 1. 1.]
cloud-free
cloud-free
cloud-free
Patch discarded: Missing images for some date ranges.
Flag array was:  [1. 0. 1. 1.]
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
Patch discarded: Missing images for some date ranges.
Flag array was:  [0. 1. 1. 1.]
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
Patch discarded: Missing images for some date ranges.
Flag array was:  [1. 0. 1. 1.]
cloud-free
cloud-free
cloud-free
Patch discarded: Missing images for some date ranges.
Flag array was:  [1. 0. 1. 1.]
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
cloud-free
Patch discarded: Missing images for some date ranges.
Flag array was:  [0. 1. 1. 1.]
cloud-free
cloud-free
cloud-

In [11]:
len(refined_fields)

55
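
The log above suggests that a patch is dropped whenever one of the configured date ranges has no usable image. A minimal sketch of that selection rule follows, under stated assumptions: the `(date, is_cloud_free)` tuples, the date windows, and `select_dates` are hypothetical, and the real format of `config.temporal_points_2024` is not shown in this notebook.

```python
from datetime import date
from typing import List, Optional, Sequence, Tuple

Acquisition = Tuple[date, bool]  # (acquisition date, is_cloud_free)
Window = Tuple[date, date]       # inclusive (start, end) of a date range


def select_dates(
    acquisitions: Sequence[Acquisition],
    windows: Sequence[Window],
    min_gap_days: int = 5,
) -> Tuple[Optional[List[date]], List[int]]:
    """Pick one cloud-free date per window, keeping a minimum gap between picks.

    Returns (selected dates, flags); the dates are None if any window is empty.
    """
    selected: List[date] = []
    flags: List[int] = []
    for start, end in windows:
        choice = None
        for acq_date, cloud_free in sorted(acquisitions):
            if not cloud_free or not (start <= acq_date <= end):
                continue
            if selected and (acq_date - selected[-1]).days < min_gap_days:
                continue  # too close to the previously selected image
            choice = acq_date
            break
        flags.append(1 if choice is not None else 0)
        if choice is not None:
            selected.append(choice)
    return (selected if all(flags) else None), flags


windows = [(date(2024, 6, 1), date(2024, 6, 10)), (date(2024, 6, 11), date(2024, 6, 20))]
acquisitions = [(date(2024, 6, 9), True), (date(2024, 6, 12), True), (date(2024, 6, 18), True)]
kept, kept_flags = select_dates(acquisitions, windows)

cloudy = [(date(2024, 6, 3), True), (date(2024, 6, 12), False)]
dropped, dropped_flags = select_dates(cloudy, windows)
```

In the first call, 12 June is skipped because it lies only 3 days after 9 June, so 18 June is selected for the second window. In the second call the only image in the second window is cloudy, so the patch is discarded with flags `[1, 0]`.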

Saving Images

In [12]:
success = save_field_images_temporal(config.save_directory_temporal_train, refined_fields)
success

True

Number of Patches Saved

In [13]:
temporal_images_train = load_field_images_temporal(config.save_directory_temporal_train)
len(temporal_images_train)

1491

Number of Fields (one patch can contain more than one field)

In [14]:
import os

folder_path = config.save_directory_temporal_train
unique_ids = set()  # Use a set to ensure uniqueness

for folder_name in os.listdir(folder_path):
    ids = folder_name.split("_")
    unique_ids.update(ids)  # Add all IDs to the set

count = len(unique_ids)
print(count)


2026
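
The count above includes every entry returned by `os.listdir`, so stray entries such as `.DS_Store` on macOS would add spurious IDs. A slightly more defensive variant is sketched below, assuming each patch is stored as a directory whose name joins field IDs with underscores, as the cell above implies. `count_unique_field_ids` is a hypothetical helper and the folder names are made up for the demonstration.

```python
import os
import tempfile


def count_unique_field_ids(folder_path: str) -> int:
    """Count distinct underscore-separated IDs across patch directory names."""
    unique_ids = set()
    for entry in os.scandir(folder_path):
        # Skip plain files and hidden entries such as .DS_Store.
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        unique_ids.update(part for part in entry.name.split("_") if part)
    return len(unique_ids)


# Demonstration on a temporary directory with made-up folder names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("101_102", "102_103", "104"):
        os.mkdir(os.path.join(tmp, name))
    open(os.path.join(tmp, ".DS_Store"), "w").close()
    n_ids = count_unique_field_ids(tmp)

print(n_ids)  # 4
```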


In [15]:
# # Load your CSV with field numbers
# csv_path = config.sugarbeet_content_csv_path
# df = pd.read_csv(csv_path)
# field_numbers_set = set(df['FIELDUSNO'].astype(str))  # convert to strings and to set

# # Your folder path
# folder_path = config.save_directory_temporal_train

# unique_ids = set()

# for folder_name in os.listdir(folder_path):
#     ids = folder_name.split("_")
#     filtered_ids = [id_ for id_ in ids if id_ in field_numbers_set]
#     unique_ids.update(filtered_ids)

# count = len(unique_ids)
# print(count)


Visualising the Patch and its temporal instances

In [16]:
# visualize_temporal_stack_rgb(temporal_images_train[1410])