# Raster Tile Mosaicing Workflow
This notebook demonstrates the complete workflow for merging multiple georeferenced satellite raster tiles (GeoTIFFs) into a single, seamless, cloudless mosaic. The steps follow the requirements outlined in the task README.

## 1. Import Required Libraries
We use rasterio for raster I/O and merging, numpy for array handling, matplotlib for visualization, and the standard-library os and glob modules for file handling.

In [1]:
!pip install rasterio matplotlib numpy --quiet

In [2]:
# Install required libraries if running in Colab/Kaggle (uncomment if needed)
# !pip install rasterio matplotlib numpy

import os
import glob
import rasterio
from rasterio.merge import merge
from rasterio.plot import show
import numpy as np
import matplotlib.pyplot as plt

## 2. Verify Data Folder Contents
Mount Google Drive and set the path to the 'data' folder so that the tiles are accessible from the notebook.

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
DATA_DIR = '/content/drive/My Drive/data/data'
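
The cell above only sets the path. A minimal sketch of an explicit existence and access check, using only the standard library, could look like this. The helper name `check_data_dir` is hypothetical and not part of the original notebook.

```python
import os

def check_data_dir(path):
    """Return True if `path` is an existing directory that can be listed."""
    # isdir() is False for missing paths and for regular files
    if not os.path.isdir(path):
        return False
    # R_OK allows listing names, X_OK allows entering the directory
    return os.access(path, os.R_OK | os.X_OK)
```

Calling `check_data_dir(DATA_DIR)` before listing files gives an early, readable failure if the Drive mount or the folder path is wrong.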

## 3. List Extracted Files
List all files and directories inside the 'data' folder to confirm extraction and see available tiles.

In [5]:
files = os.listdir(DATA_DIR)
print(f"Files in '{DATA_DIR}':")
for f in files:
    print(f)

Files in '/content/drive/My Drive/data/data':
17_20241129_054359_147_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
18_20241129_054358_499_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
19_20241129_054357_865_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
32_20240716_043003_536_SN32_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
33_20240716_043002_901_SN32_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
34_20240716_043002_264_SN32_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
4_20241124_054616_030_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
5_20241124_054615_396_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
.DS_Store
6_20241124_054614_762_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
7_20241124_054614_128_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif
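
The listing above includes a `.DS_Store` file alongside the tiles. A small sketch of a filter that keeps only GeoTIFF files is shown below; the helper name `list_tiles` and the accepted extensions are assumptions for illustration.

```python
import os

def list_tiles(data_dir, extensions=(".tif", ".tiff")):
    """Return sorted paths of raster tiles in `data_dir`, skipping other files."""
    tiles = []
    for name in os.listdir(data_dir):
        # Skip hidden files such as .DS_Store
        if name.startswith("."):
            continue
        # Keep only files with a GeoTIFF extension (case-insensitive)
        if name.lower().endswith(extensions):
            tiles.append(os.path.join(data_dir, name))
    return sorted(tiles)
```

Note that the sort is lexicographic, so a name starting with `17_` sorts before one starting with `4_`.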


## 4. Preview a Sample Raster Tile
Collect the paths of all GeoTIFF files in the data folder so that a sample tile can be opened and its basic information displayed.

In [6]:
sample_files = glob.glob(os.path.join(DATA_DIR, '*.tif'))
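
The cell above only collects the tile paths. A possible sketch for printing basic information about one tile follows. The helper names `format_tile_info` and `preview_tile` are hypothetical; rasterio is imported inside `preview_tile` so that the formatting helper can be used on its own.

```python
import os

def format_tile_info(name, crs, res, shape, count, dtype):
    """Build a one-line, human-readable summary of a tile's metadata."""
    return (
        f"{name}: CRS={crs}, Resolution=({res[0]:.4f}, {res[1]:.4f}), "
        f"Shape={shape}, Bands={count}, Dtype={dtype}"
    )

def preview_tile(path):
    """Open a GeoTIFF with rasterio and return a summary of its metadata."""
    import rasterio  # third-party; imported lazily

    # The context manager closes the dataset as soon as the metadata is read
    with rasterio.open(path) as src:
        return format_tile_info(
            os.path.basename(path), src.crs, src.res,
            src.shape, src.count, src.dtypes[0],
        )
```

For example, `print(preview_tile(sample_files[0]))` would summarize the first tile found by the glob.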

## 5. Load and Validate All Tiles
Read all GeoTIFF tiles, check their CRS, resolution, and spatial extent to ensure they can be mosaiced together.

In [7]:
src_files_to_mosaic = []
crs_set = set()
res_set = set()
for fp in sample_files:
    src = rasterio.open(fp)
    src_files_to_mosaic.append(src)
    crs_set.add(str(src.crs))
    res_set.add(src.res)
    print(f"{os.path.basename(fp)}: CRS={src.crs}, Resolution={src.res}, Shape={src.shape}")

print(f"Unique CRS: {crs_set}")
print(f"Unique Resolutions: {res_set}")

if len(crs_set) > 1 or len(res_set) > 1:
    print("Not all tiles have the same CRS or resolution. Resampling is needed.")

17_20241129_054359_147_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif: CRS=EPSG:3857, Resolution=(0.9356054896919204, 0.9356054896918405), Shape=(11977, 10927)
18_20241129_054358_499_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif: CRS=EPSG:3857, Resolution=(0.9492307103382531, 0.9492307103382034), Shape=(11655, 10862)
19_20241129_054357_865_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif: CRS=EPSG:3857, Resolution=(0.9413985629369338, 0.9413985629369082), Shape=(11629, 10877)
32_20240716_043003_536_SN32_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif: CRS=EPSG:3857, Resolution=(1.2099118177057069, 1.2099118177057238), Shape=(6978, 7206)
33_20240716_043002_901_SN32_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif: CRS=EPSG:3857, Resolution=(1.2048776377584478, 1.204877637758516), Shape=(7027, 7205)
34_20240716_043002_264_SN32_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif: CRS=EPSG:3857, Resolution=(1.1683168794145475, 1.168316879414494), Shape=(7284, 7187)
4_20241124_054616_030_SN50_L1C_MS_ortho_8bit_ncc_rendered_7_4.tif:
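
Comparing resolutions through a set treats even tiny floating-point differences as distinct values. As a sketch, the check can instead use a relative tolerance; the function names and the default tolerance below are assumptions, not part of the original notebook.

```python
def resolution_spread(resolutions):
    """Return (min, max, relative spread) of the x-resolutions."""
    xs = [r[0] for r in resolutions]
    lo, hi = min(xs), max(xs)
    return lo, hi, (hi - lo) / lo

def needs_resampling(crs_list, resolutions, rel_tol=1e-6):
    """True if the tiles differ in CRS or differ in resolution beyond `rel_tol`."""
    same_crs = len({str(c) for c in crs_list}) == 1
    _, _, spread = resolution_spread(resolutions)
    return (not same_crs) or spread > rel_tol
```

With the values printed above (roughly 0.94 to 1.21 units per pixel), the spread is far above any reasonable tolerance, so resampling is required.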

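## 5. Resample and Store in a Temporary Directory
The resampled tiles are written to a temporary directory on disk rather than held in memory, to avoid excessive RAM usage.
All tiles are resampled to the average resolution of the inputs, as a compromise between preserving detail and keeping processing fast.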

In [8]:
import shutil

# Create a temp directory for resampled files
TEMP_DIR = '/content/temp_resampled' if 'google.colab' in str(get_ipython()) else 'temp_resampled'
os.makedirs(TEMP_DIR, exist_ok=True)

# Calculate average resolution
all_res = [src.res for src in src_files_to_mosaic]
avg_xres = sum([r[0] for r in all_res]) / len(all_res)
avg_yres = sum([r[1] for r in all_res]) / len(all_res)
avg_res = (avg_xres, avg_yres)
print(f"Average resolution: {avg_res}")

from rasterio.enums import Resampling
from rasterio.warp import calculate_default_transform, reproject

resampled_paths = []
for idx, src in enumerate(src_files_to_mosaic):
    dst_transform, width, height = calculate_default_transform(
        src.crs, src.crs, src.width, src.height, *src.bounds, resolution=avg_res
    )
    dst_kwargs = src.meta.copy()
    dst_kwargs.update({
        'height': height,
        'width': width,
        'transform': dst_transform
    })
    out_path = os.path.join(TEMP_DIR, f'resampled_{idx}.tif')
    with rasterio.open(out_path, 'w', **dst_kwargs) as dst:
        for i in range(1, src.count + 1):
            # Allocate the output band using the dtype of the band being resampled
            data = np.empty((height, width), dtype=src.dtypes[i - 1])
            reproject(
                source=rasterio.band(src, i),
                destination=data,
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=dst_transform,
                dst_crs=src.crs,
                resampling=Resampling.bilinear
            )
            dst.write(data, i)
    resampled_paths.append(out_path)

print(f"Resampled files written to: {TEMP_DIR}")

Average resolution: (1.026769275254066, 1.0267692752540374)
Resampled files written to: /content/temp_resampled
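
To see what the common resolution means for each tile, the output size can be estimated from the tile's extent. This is a rough sketch: the helper name `resampled_size` is hypothetical, and rasterio's `calculate_default_transform` may round slightly differently.

```python
import math

def resampled_size(width, height, src_res, dst_res):
    """Estimate (width, height) in pixels after resampling over the same extent."""
    # Ground extent covered by the tile, in CRS units
    extent_x = width * src_res[0]
    extent_y = height * src_res[1]
    # Number of destination pixels needed to cover that extent
    return math.ceil(extent_x / dst_res[0]), math.ceil(extent_y / dst_res[1])

# Tile 17 from the listing above: Shape=(11977, 10927) is (height, width)
w, h = resampled_size(
    10927, 11977,
    (0.9356054896919204, 0.9356054896918405),
    (1.026769275254066, 1.0267692752540374),
)
print(f"Estimated resampled size: {w} x {h} pixels")
```

Tiles finer than the average resolution shrink, while the coarser tiles (about 1.2 units per pixel) are upsampled and grow.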


## 6. Mosaic Creation
Merge all tiles into a single seamless raster, handling overlaps and NoData values. The merge uses GDAL's command-line tools, so these are installed first.

In [10]:
!apt-get update
!apt-get install -y gdal-bin python3-gdal


## 7. Export the Mosaic as GeoTIFF
Save the resulting mosaic to 'combined_mosaic.tif' with proper georeferencing and metadata.

In [None]:
import subprocess

# Check that gdal_merge.py is available; install GDAL if it is not
try:
    subprocess.run(['gdal_merge.py', '--help'], stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
except Exception:
    print('Installing GDAL...')
    !apt-get install -y gdal-bin

output_path = "combined_mosaic.tif"
# Pass the arguments as a list (no shell) so that paths with spaces are handled safely
cmd = ['gdal_merge.py', '-o', output_path, '-of', 'GTiff', *resampled_paths]
print("Running:", " ".join(cmd))
subprocess.run(cmd, check=True)
print(f"Mosaic saved as {output_path}")
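
As an alternative to the command-line merge, the `merge` function imported from `rasterio.merge` in the first section could be used. The sketch below is not the method used in this notebook. It assumes a NoData value of 0, which is not confirmed for these tiles, and it holds the whole mosaic in memory, which may be too much for tiles of this size. The helper names `union_bounds` and `merge_with_rasterio` are hypothetical.

```python
from contextlib import ExitStack

def union_bounds(bounds_list):
    """Bounding box (left, bottom, right, top) that covers all input boxes."""
    lefts, bottoms, rights, tops = zip(*bounds_list)
    return min(lefts), min(bottoms), max(rights), max(tops)

def merge_with_rasterio(paths, out_path, nodata=0):
    """Mosaic `paths` into `out_path`; nodata=0 is an assumption."""
    import rasterio  # third-party; imported lazily
    from rasterio.merge import merge

    with ExitStack() as stack:
        # Open every tile and close them all when the block exits
        sources = [stack.enter_context(rasterio.open(p)) for p in paths]
        # method="first": in overlaps, the first tile with valid data wins
        mosaic, transform = merge(sources, nodata=nodata, method="first")
        profile = sources[0].profile.copy()

    # mosaic has shape (bands, height, width)
    profile.update(
        height=mosaic.shape[1], width=mosaic.shape[2],
        transform=transform, nodata=nodata,
    )
    with rasterio.open(out_path, "w", **profile) as dst:
        dst.write(mosaic)
```

`union_bounds` shows the extent the merged raster covers: passing `[src.bounds for src in sources]` gives the bounding box of the final mosaic.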

## 8. Visualize the Final Mosaic
Copy the output mosaic to Google Drive so that it persists beyond the Colab session and can be opened for visual inspection.

In [None]:
import shutil
drive_dest = '/content/drive/My Drive/data/combined_mosaic.tif'
shutil.copy('combined_mosaic.tif', drive_dest)
print(f'Mosaic copied to {drive_dest}')
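
To display the mosaic inside the notebook without loading it at full resolution, a decimated read can be used. This is a minimal sketch: the helper names and the 2048-pixel limit are assumptions, and treating the first three bands as RGB is also an assumption about these tiles.

```python
def preview_shape(height, width, max_dim=2048):
    """Scale (height, width) so that the longer side is at most `max_dim`."""
    scale = min(1.0, max_dim / max(height, width))
    return max(1, round(height * scale)), max(1, round(width * scale))

def show_mosaic(path, max_dim=2048):
    """Plot a downsampled preview of the raster at `path`."""
    import matplotlib.pyplot as plt  # third-party; imported lazily
    import rasterio

    with rasterio.open(path) as src:
        h, w = preview_shape(src.height, src.width, max_dim)
        # Passing out_shape makes rasterio return a reduced-resolution array
        data = src.read(out_shape=(src.count, h, w))

    if data.shape[0] >= 3:
        # Assumed RGB order; reorder to (rows, cols, bands) for imshow
        image = data[:3].transpose(1, 2, 0)
        plt.imshow(image)
    else:
        plt.imshow(data[0], cmap="gray")
    plt.title("Mosaic preview (downsampled)")
    plt.axis("off")
    plt.show()
```

For example, `show_mosaic('combined_mosaic.tif')` would display the merged output at reduced resolution.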