## Uploading as HF datasets
This notebook prepares and publishes patched satellite images and image–mask pairs to Hugging Face Datasets. It assumes you are already logged in through `huggingface-cli login`. It includes:

- A script to publish SENTINEL2RGB image patches to HF (nikolkoo/Sentinel2RGBNorway).
- A script to build and push paired satellite images and masks (nikolkoo/SatelliteSegmentation).
- A utility to correct images accidentally saved in BGR to RGB.
- Notes on expected folder layout and allowed image extensions.

#### SENTINEL2RGB
These images are selected patches from Norway, used for pretraining. The code below publishes them to my repository on Hugging Face; it requires being logged in with valid credentials.

In [None]:
from datasets import Dataset, Image
import os

image_paths = []
for root, _, files in os.walk("patched_images"):
    for fname in files:
        if fname.lower().endswith((".png", ".jpg", ".jpeg")):  # case-insensitive match
            image_paths.append({"image": os.path.join(root, fname)})

ds = Dataset.from_list(image_paths)
ds = ds.cast_column("image", Image())
ds.push_to_hub("nikolkoo/Sentinel2RGBNorway", private=False)
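
Before pushing, it can help to check what was actually collected. The following is a minimal sketch, not part of the original upload script: the helper names `collect_image_rows` and `count_by_extension` are hypothetical. It gathers the same rows in sorted order, matches extensions case-insensitively, and counts the files per extension.

```python
import os
from collections import Counter

ALLOWED_EXT = (".png", ".jpg", ".jpeg")


def collect_image_rows(root):
    """Walk `root` and return one {"image": path} row per image file, sorted by path."""
    rows = []
    for dirpath, _, files in os.walk(root):
        for fname in files:
            # lower() so that e.g. "tile.PNG" is also picked up
            if fname.lower().endswith(ALLOWED_EXT):
                rows.append({"image": os.path.join(dirpath, fname)})
    return sorted(rows, key=lambda r: r["image"])


def count_by_extension(rows):
    """Count rows per lowercased file extension."""
    return Counter(os.path.splitext(r["image"])[1].lower() for r in rows)


# "patched_images" is the folder used in the cell above
rows = collect_image_rows("patched_images")
print(len(rows), dict(count_by_extension(rows)))
```

If the count is zero or lower than expected, the folder path or the extension list is the first thing to check before calling `push_to_hub`.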

#### SatelliteSegmentation
The code below uploads the satellite image–mask pairs to Hugging Face. It requires being logged in to an account through `huggingface-cli login`.

In [None]:
from datasets import Dataset, Image
from pathlib import Path

imgs_dir = Path("images/img")
masks_dir = Path("images/mask")

allowed_ext = {".png", ".jpg", ".jpeg"}
rows = []

for img_path in sorted(imgs_dir.iterdir()):
    if img_path.suffix.lower() not in allowed_ext:
        continue
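    stem = img_path.stem
    # masks are expected to be named "<image stem>_mask.<ext>"
    mask_candidates = sorted(masks_dir.glob(f"{stem}_mask.*"))
    if not mask_candidates:
        # no matching mask: report it and skip this image
        print(f"No mask found for {img_path.name}, skipping")
        continue
    mask_path = mask_candidates[0]  # take the first match in sorted order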
    rows.append({"image": str(img_path), "mask": str(mask_path)})

# build and push
ds = Dataset.from_list(rows)
ds = ds.cast_column("image", Image())
ds = ds.cast_column("mask", Image())
# make sure you're logged in (huggingface-cli login) before pushing
ds.push_to_hub("nikolkoo/SatelliteSegmentation", private=False)
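
The pairing loop above drops images that have no mask and picks one mask when several match. A minimal sketch of a pre-upload check, assuming the same `images/img` and `images/mask` layout and the `<stem>_mask.<ext>` naming; `find_unpaired` is a hypothetical helper, not part of the original script:

```python
from pathlib import Path

ALLOWED_EXT = {".png", ".jpg", ".jpeg"}


def find_unpaired(imgs_dir, masks_dir):
    """Return (images_without_mask, images_with_several_masks) as lists of file names."""
    imgs_dir, masks_dir = Path(imgs_dir), Path(masks_dir)
    missing, ambiguous = [], []
    for img_path in sorted(imgs_dir.iterdir()):
        if img_path.suffix.lower() not in ALLOWED_EXT:
            continue
        candidates = sorted(masks_dir.glob(f"{img_path.stem}_mask.*"))
        if not candidates:
            missing.append(img_path.name)
        elif len(candidates) > 1:
            ambiguous.append(img_path.name)
    return missing, ambiguous


# Only run the check when the folders from the cell above exist
if Path("images/img").is_dir() and Path("images/mask").is_dir():
    missing, ambiguous = find_unpaired("images/img", "images/mask")
    print(f"{len(missing)} images without a mask, {len(ambiguous)} with several candidates")
```

Both lists being empty means every image has exactly one mask, so the pushed dataset should have as many rows as there are images.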

### Converting from BGR -> RGB
I initially made a mistake and loaded the images as BGR, so they were saved with swapped channels. The code below converts the images from BGR (blue, green, red) to RGB (red, green, blue) and overwrites the files in place.

In [None]:
import cv2
import os
from glob import glob

# Folder containing your incorrect BGR images
folder = "/images2/"

# Glob patterns to process (the trailing comma makes this a tuple, not a string)
extensions = ("*.png",)

for ext in extensions:
    for img_path in glob(os.path.join(folder, ext)):
        img = cv2.imread(img_path)  # loads as BGR
        if img is None:
            print(f"Skipping unreadable file: {img_path}")
            continue

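        # Swap the red and blue channels (BGR → RGB)
        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # Overwrite the original file; cv2.imwrite assumes BGR order,
        # so writing the swapped array swaps the channels on disk
        cv2.imwrite(img_path, img_rgb)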

        print(f"Fixed: {img_path}")
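
One thing to keep in mind: swapping BGR to RGB only exchanges the first and last channel, so the operation is its own inverse. Running the cell above twice on the same folder puts the files back in the wrong order. A small pure-Python illustration on a toy image (the helper `swap_red_blue` is hypothetical and only for demonstration; on a NumPy array the same swap can be written as `img[..., ::-1]`):

```python
def swap_red_blue(pixels):
    """Reverse the channel order of every 3-channel pixel in a list of rows."""
    return [[(c2, c1, c0) for (c0, c1, c2) in row] for row in pixels]


# One row with two pixels, stored in BGR order; the first pixel is pure blue
bgr = [[(255, 0, 0), (0, 128, 64)]]

rgb = swap_red_blue(bgr)
print(rgb[0][0])                   # → (0, 0, 255), pure blue in RGB order
print(swap_red_blue(rgb) == bgr)   # → True, a second swap undoes the first
```

Since the files are overwritten in place, keeping a backup of the folder or running the fix exactly once avoids undoing it by accident.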
