# SAR2SAR: a self-supervised despeckling algorithm for SAR images
## Emanuele Dalsasso, Loïc Denis, Florence Tupin

Please note that the training set is composed only of **GRD** SAR images, so this testing code is specific to GRD data.

## Resources
- [Paper (ArXiv)](https://arxiv.org/abs/2006.15037)

To cite the article:

    @article{dalsasso2020sar2sar,
        title={{SAR2SAR}: a self-supervised despeckling algorithm for {SAR} images},
        author={Emanuele Dalsasso and Loïc Denis and Florence Tupin},
        journal={arXiv preprint arXiv:2006.15037},
        year={2020}
    }

## 1. Install required libraries

In [None]:
!pip install -r requirements.txt

## 2. Set up the data environment
The file model.py contains the run_model function. Its docstring is reproduced here:
    
    Runs the despeckling algorithm

    Arguments:
        input_dir: Path to a directory containing the files to be despeckled. Files need to be in .npy
                   format
        save_dir: Path to a directory where the files will be saved
        checkpoint_dir: Path to a directory containing the tensorflow checkpoints, if left as None, the
                        despeckling algorithm will use the grd_checkpoint directory
        stride: U-Net is scanned over the image with a default stride of 64 pixels when the image dimension
                exceeds 256. This parameter modifies the default stride in pixels. Lower pixel count = higher quality
                results, at the cost of higher runtime
        store_noisy: Whether to store the "noisy" (input) image in the save_dir. Default is False
        generate_png: Whether to generate PNG of the outputs in the save_dir. Default is True
        debug: Whether to generate print statements at runtime that communicate what is going on

    Returns:
        None
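
The stride trade-off described in the docstring can be made concrete by counting window positions. The following is a minimal sketch, not the code in model.py: `count_windows` is a hypothetical helper, the patch size of 256 is inferred from the docstring, and the edge handling (one extra window aligned to the far border) is an assumption.

```python
import math


def count_windows(height: int, width: int, patch: int = 256, stride: int = 64) -> int:
    """Estimate how many patches a sliding window visits on a height x width image.

    Assumptions (not taken from model.py): square patches of `patch` pixels and
    one final window aligned to the far edge of each axis.
    """

    def positions(size: int) -> int:
        # An axis no longer than the patch needs a single window position
        if size <= patch:
            return 1
        # Otherwise step by `stride` and add the window that reaches the far edge
        return math.ceil((size - patch) / stride) + 1

    return positions(height) * positions(width)


# Compare a few strides on a 1024 x 1024 image
for stride in (128, 64, 32):
    print(f"stride={stride}: {count_windows(1024, 1024, stride=stride)} windows")
```

Under these assumptions, halving the stride roughly quadruples the number of network evaluations, which is where the extra runtime comes from.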

**You will need to set the input_dir and save_dir filepaths.** Here is an example of what that could look like.

In [None]:
import os
from pathlib import Path

# current folder this file is in
current_dir = Path(os.getcwd())

# set the path of the input and save directories
example_input_dir = str(current_dir / "src" / "test_data" / "example_test_data")
example_save_dir = str(current_dir / "example_output")

# set the path of your own input and save directory
input_dir = str(current_dir / "my_data" / "input")
save_dir = str(current_dir / "my_data" / "results")

print(f"Input directory set to: {input_dir}")
print(f"Save directory set to: {save_dir}")
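
Because run_model expects .npy files, it can help to check the input directory before launching a long run. The sketch below is illustrative only: `check_npy_inputs` is a hypothetical helper, and the requirement that each file holds a 2-D (single-band) array is an assumption based on the single-band conversion described in this notebook, not something stated in the docstring.

```python
from pathlib import Path

import numpy as np


def check_npy_inputs(input_dir: str) -> list:
    """Return the .npy files in input_dir, raising if any is not a 2-D array."""
    files = sorted(p for p in Path(input_dir).iterdir() if p.suffix == ".npy")
    for path in files:
        array = np.load(path)
        # Assumption: the model expects one single-band image per file
        if array.ndim != 2:
            raise ValueError(f"{path.name}: expected a 2-D array, got shape {array.shape}")
    return files


# Example usage (uncomment once input_dir exists and contains .npy files):
# print(check_npy_inputs(input_dir))
```

An empty list means nothing was found to despeckle, which usually points to a wrong path or to .tif files that have not been converted yet.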

### Converting .tif to .npy
If your data is in .tif format, you will need to convert it to .npy before running the algorithm, and convert the results back to .tif afterwards (how involved that is depends on your input and on how you want your results stored). An easy way to do the first conversion is with the rasterio Python library and the function in the next cell, which converts every single-band .tif and .TIF file in the input_dir directory to a .npy file. Multi-band rasters are more complicated and should be split into single-band rasters on your own terms.

In [None]:
import rasterio
import numpy as np

# file extensions
tif_extensions = [".tif", ".TIF"]
npy_extensions = [".npy"]

def tifToNpy(input_dir: str, extensions: tuple = (".tif", ".TIF")) -> None:
    """
    Converts the files in the input_dir directory with the given extensions to .npy files

    Arguments:
        input_dir: Path to the input directory
        extensions: list of valid extensions for files
    Returns:
        None
    """
    input_dir = Path(input_dir)
    # get each .tif/.TIF file in the input_dir directory
    for file in input_dir.iterdir():
        if not file.is_dir() and file.suffix in extensions:
            # open the file and read the data
            with rasterio.open(str(file)) as src:
                data = src.read()
                # save as a .npy
                if data.shape[0] == 1:
                    # if there is only one band
                    path_to_output = file.with_suffix(".npy")
                    # data[0] drops only the band axis, even if height or width is 1
                    np.save(path_to_output, data[0])
                else:
                    # If there are multiple bands, this introduces many complications when
                    # trying to re-combine them later. Everyone's setup and needs are different,
                    # so you will have to write your own code to handle these cases.
                    # Here is an example of what it could look like:
                    """
                    for i in range(data.shape[0]):
                        filename = str(Path(file.name).with_suffix(""))
                        path_to_output = file.with_name(filename + f"_B{i}").with_suffix(
                            ".npy"
                        )
                        np.save(path_to_output, np.squeeze(data[i]))
                    """
                    raise ValueError("Multiple bands, please split up into single band rasters for easier processing")

# convert tif files in the input directory to .npy
tifToNpy(input_dir, tif_extensions)

## 3. Run the example model
Run the example model to make sure that everything has been installed correctly and is ready to run. **This code was originally written for TensorFlow 1.x, so the tensorflow library will emit a lot of warnings.**
When the model is done, you should see a folder named example_output containing the results.

***The model will print this line when it has finished:***

[!!!] Done
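
If the TensorFlow warnings make the output hard to read, they can optionally be reduced. This is a sketch using only the standard library; it relies on the `TF_CPP_MIN_LOG_LEVEL` environment variable and on TensorFlow's Python logger being named "tensorflow". It is an assumption that this covers all messages printed by this code, and in some interactive sessions TensorFlow may reset its logger level on import, in which case the `setLevel` call needs to be repeated afterwards.

```python
import logging
import os
import warnings

# Must be set before TensorFlow is first imported:
# 0 = show everything, 1 = hide INFO, 2 = also hide WARNING, 3 = also hide ERROR
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# TensorFlow's Python-side messages go through the standard "tensorflow" logger
logging.getLogger("tensorflow").setLevel(logging.ERROR)

# Hide Python-level deprecation notices raised while importing dependencies
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
```

The model's own print statements, including the final "Done" line, are ordinary output and are not affected by these settings.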

In [None]:
import tensorflow as tf
tf.compat.v1.reset_default_graph()
from src.model import run_model

run_model(example_input_dir, example_save_dir)

## 4. Run the model on your data

In [None]:
tf.compat.v1.reset_default_graph()
run_model(input_dir, save_dir)
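
After the run, a quick sanity check is to confirm that each result has the same shape as its input. The following is a minimal sketch: `compare_shapes` is a hypothetical helper, and it assumes that results are written to save_dir as .npy files with the same file stem as the inputs (the same assumption the conversion step in section 5 makes).

```python
from pathlib import Path

import numpy as np


def compare_shapes(input_dir: str, save_dir: str) -> dict:
    """Map each file stem found in both directories to (input_shape, output_shape)."""
    inputs = {p.stem: p for p in Path(input_dir).glob("*.npy")}
    outputs = {p.stem: p for p in Path(save_dir).glob("*.npy")}
    report = {}
    # Only stems present in both directories can be compared
    for stem in sorted(inputs.keys() & outputs.keys()):
        report[stem] = (np.load(inputs[stem]).shape, np.load(outputs[stem]).shape)
    return report


# Example usage (uncomment after run_model has finished):
# for stem, (shape_in, shape_out) in compare_shapes(input_dir, save_dir).items():
#     print(stem, shape_in, shape_out, "OK" if shape_in == shape_out else "MISMATCH")
```

An empty report suggests the output files are named differently from the inputs, in which case the matching logic here would need to be adapted.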

## 5. Convert the data back to .tif
If your data was in .tif format, you probably want your results in .tif format too. Converting back can be more complicated because of the metadata attached to a .tif. Here is a simple approach that uses the original .tif as a "mirror": its metadata is copied onto the despeckled result.

In [None]:
def npyToTif(processed_npy_file: str, metadata_mirror: str) -> None:
    """
    Converts the processed .npy file back to .tif using the metadata from the metadata mirror tif,
    the original .tif before processing

    Arguments:
        processed_npy_file: Path to the processed .npy file
        metadata_mirror: Path to the original .tif file this .npy file was generated from. The metadata mirror
                         allows rasterio to write the metadata of the original .tif onto the despeckled result.
    Returns:
        None
    """
    # open the original tif to get its metadata
    with rasterio.open(metadata_mirror) as src:
        # rename the file
        filename = str(Path(processed_npy_file).with_suffix("").name)
        path_to_output = Path(processed_npy_file).with_name("denoised_" + filename).with_suffix(".tif")
        # open the new denoised_ .tif and write the .npy w/ the mirror metadata
        with rasterio.open(path_to_output, "w", **src.meta) as dst:
            dst.write(np.stack([np.load(processed_npy_file)]))

# get the original files and the despeckled results and search for matches on file name
original_files = [
    file for file in Path(input_dir).iterdir()
    if not file.is_dir() and file.suffix in tif_extensions
]
despeckled_files = [
    file for file in Path(save_dir).iterdir()
    if not file.is_dir() and file.suffix in npy_extensions
]
for despeckled in despeckled_files:
    for original in original_files:
        # if the file names (without extension) are the same, convert .npy to .tif
        if despeckled.stem == original.stem:
            print(f"Converting {despeckled.name} to tif with metadata mirror at {original.name} to {'denoised_' + original.name}")
            npyToTif(str(despeckled), str(original))
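
One detail to watch when mirroring metadata is the data type: the mirrored metadata carries the original file's dtype, while the despeckled array may have a different one. The sketch below shows one way to make the cast explicit before writing. `to_band_stack` is a hypothetical helper, and the scenario it handles (floating-point results written into an integer raster) is an assumption about your data, not something stated in this notebook.

```python
import numpy as np


def to_band_stack(array: np.ndarray, dtype: str) -> np.ndarray:
    """Return `array` as a (1, height, width) stack cast to `dtype`."""
    target = np.dtype(dtype)
    if np.issubdtype(target, np.integer):
        # Round and clip so values outside the integer range do not wrap around
        info = np.iinfo(target)
        array = np.clip(np.rint(array), info.min, info.max)
    # Add the leading band axis expected when writing a single-band raster
    return array.astype(target)[np.newaxis, ...]


# Small demonstration with made-up values
demo = to_band_stack(np.array([[1.6, -3.0], [70000.0, 2.2]]), "uint16")
print(demo.shape, demo.dtype)
```

Inside npyToTif, this could replace the `np.stack` call, for example `dst.write(to_band_stack(np.load(processed_npy_file), src.meta["dtype"]))`. If you would rather keep floating-point precision, the alternative is to change the dtype entry of the metadata before opening the output file.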
                