<a href="https://colab.research.google.com/github/enzococca/geoslam/blob/main/GeoSAM-Image-Encoder/examples/geosam-image-encoder.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GeoSAM-Image-Encoder (Python package)

[![PyPI Version](https://img.shields.io/pypi/v/GeoSAM-Image-Encoder)](https://pypi.org/project/GeoSAM-Image-Encoder)
[![Downloads](https://static.pepy.tech/badge/GeoSAM-Image-Encoder)](https://pepy.tech/project/GeoSAM-Image-Encoder)


This package is part of the [Geo-SAM](https://github.com/coolzhao/Geo-SAM) project. It is a standalone Python package that does not depend on QGIS, and it lets you **encode remote sensing images into features that Geo-SAM can recognize, using a remote server** such as ``Colab``, ``AWS``, ``Azure``, or your own ``HPC``.

## Installation

Installing `GeoSAM-Image-Encoder` may pull in the CPU version of `PyTorch` as a dependency. It is therefore recommended to install the appropriate version of `PyTorch` on your machine before installing `GeoSAM-Image-Encoder`. Pick the build that matches your hardware by following the official PyTorch website:
<https://pytorch.org/get-started/locally/>

After installing PyTorch, you can install `GeoSAM-Image-Encoder` via pip.
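
Before running `pip install`, it can help to confirm whether PyTorch is already present in the environment. The following is a minimal sketch that uses only the standard library; the helper name `is_installed` is hypothetical and is not part of GeoSAM-Image-Encoder.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("torch"):
    print("PyTorch found: GeoSAM-Image-Encoder can be installed directly.")
else:
    print("PyTorch not found: install it first from pytorch.org.")
```

The check does not import PyTorch, so it is fast and works even when no GPU driver is available.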


In Colab, PyTorch is preinstalled, so you can install `GeoSAM-Image-Encoder` directly.

In [1]:
!pip install GeoSAM-Image-Encoder
# or
# !pip install git+https://github.com/Fanchengyan/GeoSAM-Image-Encoder.git

Collecting GeoSAM-Image-Encoder
  Downloading GeoSAM_Image_Encoder-1.0.4-py3-none-any.whl.metadata (3.9 kB)
Collecting torchgeo (from GeoSAM-Image-Encoder)
  Downloading torchgeo-0.6.2-py3-none-any.whl.metadata (19 kB)
Collecting segment-anything (from GeoSAM-Image-Encoder)
  Downloading segment_anything-1.0-py3-none-any.whl.metadata (487 bytes)
Collecting fiona>=1.8.21 (from torchgeo->GeoSAM-Image-Encoder)
  Downloading fiona-1.10.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (56 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.6/56.6 kB 2.0 MB/s eta 0:00:00
Collecting kornia>=0.7.3 (from torchgeo->GeoSAM-Image-Encoder)
  Downloading kornia-0.8.0-py2.py3-none-any.whl.metadata (17 kB)
Collecting lightly!=1.4.26,>=1.4.5 (from torchgeo->GeoSAM-Image-Encoder)
  Downloading lightly-1.5.18-py3-none-any.whl.metadata (36 kB)
Collecting lightning!=2.3.*,>=2 (from lightning[pytorch-extra]!=2.3.*,>=2->torchgeo->GeoSAM-Imag

Download the example dataset, the SAM `vit_l` checkpoint, and an example `setting.json` file:

In [1]:
!wget https://raw.githubusercontent.com/coolzhao/Geo-SAM/main/rasters/beiluhe_google_img_201211_clip.tif
!wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth
!wget https://raw.githubusercontent.com/coolzhao/Geo-SAM/main/GeoSAM-Image-Encoder/examples/data/setting.json

--2025-02-05 17:34:42--  https://raw.githubusercontent.com/coolzhao/Geo-SAM/main/rasters/beiluhe_google_img_201211_clip.tif
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 17152742 (16M) [application/octet-stream]
Saving to: ‘beiluhe_google_img_201211_clip.tif’


2025-02-05 17:34:44 (346 MB/s) - ‘beiluhe_google_img_201211_clip.tif’ saved [17152742/17152742]

--2025-02-05 17:34:45--  https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 108.157.254.124, 108.157.254.121, 108.157.254.102, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|108.157.254.124|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1249524607 (1.2G) [binary/octet-stream]
Savin
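
A truncated download of the 1.2 GB checkpoint is easy to miss. As a quick sanity check, the sketch below compares file sizes on disk with the `Length` values reported by `wget` above. The helper `size_matches` is hypothetical, and the paths assume the files were saved to the current working directory.

```python
from pathlib import Path

# Expected sizes in bytes, copied from the `Length:` lines of the wget log.
EXPECTED_SIZES = {
    "beiluhe_google_img_201211_clip.tif": 17152742,
    "sam_vit_l_0b3195.pth": 1249524607,
}


def size_matches(path: Path, expected_bytes: int) -> bool:
    """Return True if `path` is a file with exactly `expected_bytes` bytes."""
    return path.is_file() and path.stat().st_size == expected_bytes


for name, expected in EXPECTED_SIZES.items():
    status = "ok" if size_matches(Path(name), expected) else "missing or incomplete"
    print(f"{name}: {status}")
```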

## Usage

There are **two ways** to use GeoSAM-Image-Encoder: from Python or from the terminal. We recommend the Python interface, which offers greater flexibility.

### Using Python

After installing GeoSAM-Image-Encoder, you can import it as `geosam`:

In [2]:
import geosam
from geosam import ImageEncoder

In [3]:
# check whether a GPU is available
geosam.gpu_available()

True

#### Run by specifying parameters directly

If you want to specify the parameters directly, you can run it like this:

In [None]:
checkpoint_path = '/content/sam_vit_l_0b3195.pth'
image_path = '/content/beiluhe_google_img_201211_clip.tif'
feature_dir = './'

## initialize the ImageEncoder with the SAM checkpoint
img_encoder = ImageEncoder(checkpoint_path)
## encode the image and write the features to feature_dir
img_encoder.encode_image(image_path, feature_dir)

Initializing SAM model...


----------------------------------------------
     Start encoding image to SAM features
----------------------------------------------

Input Parameters:
----------------------------------------------
 Input data value range to be rescaled: [0, 255] (automatically set based on min-max value of input image inside the processing extent.)
 Image path: /content/beiluhe_google_img_201211_clip.tif
 Bands selected: ['1', '2', '3']
 Target resolution: 0.9999395530145561
 Processing extent: [471407.9709, 3882162.2353, 473331.8546, 3884389.1008]
 Processing image size: (width 1924, height 2227)
----------------------------------------------


RasterDataset info 
----------------------------------------------
 filename_glob: beiluhe_google_img_201211_clip.tif, 
 all bands: ['1', '2', '3', '4'], 
 input bands: ['1', '2', '3'], 
 resolution: 0.9999395530145561, 
 bounds: [471407.9709, 473331.8546571067, 3882162.2353493366, 3884389.1008, 0.0, 9.223372036854776e+18], 
 nu

Encoding image: 100%|██████████| 12/12 [00:20<00:00,  1.70s/batch]

"Output feature path": .

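The processing size and batch count in the log above can be reproduced by hand. The sketch below divides the extent by the target resolution to get the pixel size, then counts sliding windows. The 1024-pixel window is an assumption (it is SAM's input size, but the log does not state it), and the counting formula is a common sliding-window convention rather than something confirmed from the package source.

```python
import math

# Values copied from the "Input Parameters" log of the run above.
min_x, min_y, max_x, max_y = 471407.9709, 3882162.2353, 473331.8546, 3884389.1008
resolution = 0.9999395530145561

width = round((max_x - min_x) / resolution)
height = round((max_y - min_y) / resolution)
print(width, height)  # 1924 2227, matching the log

# Assumption: 1024-pixel windows. The stride of 512 is the documented default.
window, stride = 1024, 512


def n_windows(length: int, window: int, stride: int) -> int:
    """Number of windows needed to cover `length` pixels along one axis."""
    return max(math.ceil((length - window) / stride), 0) + 1


total = n_windows(width, window, stride) * n_windows(height, window, stride)
print(total)  # 12
```

Under these assumptions the image needs 3 windows across and 4 down, which is consistent with the 12 batches shown in the progress bar if each batch holds one window.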




#### Run with parameters from a setting.json file

If you want to take the parameters from a `setting.json` file exported from the Geo-SAM plugin, you can run it like this:

In [4]:
setting_file = "/content/setting.json"
feature_dir = './'

### parse settings from the setting.json file
settings = geosam.parse_settings_file(setting_file)

### the setting file does not contain feature_dir, so add it here
settings.update({"feature_dir":feature_dir})

### split settings into init_settings, encode_settings
init_settings, encode_settings = geosam.split_settings(settings)

print(f"settings: {settings}")
print(f"init_settings: {init_settings}")
print(f"encode_settings: {encode_settings}")

settings: {'image_path': '/content/beiluhe_google_img_201211_clip.tif', 'bands': [1, 1, 1], 'value_range': '0.0,255.0', 'extent': '471407.9709, 473331.8546, 3882162.2353, 3884389.1008 [EPSG:32646]', 'resolution': 0.9999395530145561, 'stride': 512, 'checkpoint_path': '/content/sam_vit_l_0b3195.pth', 'model_type': 1, 'batch_size': 1, 'gpu_id': 0, 'feature_dir': './'}
init_settings: {'checkpoint_path': '/content/sam_vit_l_0b3195.pth', 'model_type': 1, 'batch_size': 1, 'gpu_id': 0}
encode_settings: {'image_path': '/content/beiluhe_google_img_201211_clip.tif', 'bands': [1, 1, 1], 'value_range': '0.0,255.0', 'extent': '471407.9709, 473331.8546, 3882162.2353, 3884389.1008 [EPSG:32646]', 'resolution': 0.9999395530145561, 'stride': 512, 'feature_dir': './'}
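
Judging from the printed output, `split_settings` separates the keys used to construct `ImageEncoder` from those passed to `encode_image`. The pure-Python sketch below reproduces that split for illustration only. `split_by_keys` and `INIT_KEYS` are hypothetical names, and the key list is inferred from the output above rather than taken from the package source.

```python
# Keys that appear in `init_settings` in the output above (inferred, not from the source).
INIT_KEYS = ("checkpoint_path", "model_type", "batch_size", "gpu_id")


def split_by_keys(settings: dict, init_keys=INIT_KEYS) -> tuple:
    """Split `settings` into (init, encode) dicts without modifying the input."""
    init = {k: v for k, v in settings.items() if k in init_keys}
    encode = {k: v for k, v in settings.items() if k not in init_keys}
    return init, encode


example = {
    "image_path": "/content/beiluhe_google_img_201211_clip.tif",
    "stride": 512,
    "checkpoint_path": "/content/sam_vit_l_0b3195.pth",
    "model_type": 1,
    "feature_dir": "./",
}
init, encode = split_by_keys(example)
print(init)
print(encode)
```

Keeping the two groups separate is what allows each dictionary to be unpacked with `**` into the matching call.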


In [5]:
## then run the image encoding with the parameters from the setting.json file
img_encoder = ImageEncoder(**init_settings)
img_encoder.encode_image(**encode_settings)

Initializing SAM model...



  state_dict = torch.load(f)



----------------------------------------------
     Start encoding image to SAM features
----------------------------------------------

Input Parameters:
----------------------------------------------
 Input data value range to be rescaled: (0.0, 255.0) (set by user)
 Image path: /content/beiluhe_google_img_201211_clip.tif
 Bands selected: ['1', '1', '1']
 Target resolution: 0.9999395530145561
 Processing extent: (471407.9709, 473331.8546, 3882162.2353, 3884389.1008)
 Processing image size: (width 3410960, height 3411263)
----------------------------------------------


RasterDataset info 
----------------------------------------------
 filename_glob: beiluhe_google_img_201211_clip.tif, 
 all bands: ['1', '2', '3', '4'], 
 input bands: ['1', '1', '1'], 
 resolution: 0.9999395530145561, 
 bounds: [471407.9709, 473331.8546, 3882162.2353, 3884389.1008, 0.0, 9.223372036854776e+18], 
 image number: 1
----------------------------------------------

-----------------------------------------

Encoding image: 100%|██████████| 12/12 [00:13<00:00,  1.08s/batch]

"Output feature path": .

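The `extent` value in the settings appears to list min x, max x, min y, max y followed by the CRS in square brackets, which is the order QGIS uses for extent strings, while the log of the first run prints the extent as min x, min y, max x, max y. The sketch below converts between the two; `parse_extent` is a hypothetical helper and not part of the `geosam` API.

```python
from typing import Optional, Tuple

Bounds = Tuple[float, float, float, float]


def parse_extent(text: str) -> Tuple[Bounds, Optional[str]]:
    """Parse 'min_x, max_x, min_y, max_y [CRS]' into ((min_x, min_y, max_x, max_y), crs)."""
    coords, _, crs_part = text.partition("[")
    crs = crs_part.rstrip("] ").strip() or None
    numbers = [float(part) for part in coords.split(",")]
    if len(numbers) != 4:
        raise ValueError(f"expected four comma-separated numbers, got {len(numbers)}")
    min_x, max_x, min_y, max_y = numbers
    return (min_x, min_y, max_x, max_y), crs


bounds, crs = parse_extent(
    "471407.9709, 473331.8546, 3882162.2353, 3884389.1008 [EPSG:32646]"
)
print(bounds)  # (471407.9709, 3882162.2353, 473331.8546, 3884389.1008)
print(crs)     # EPSG:32646
```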




### Using Terminal

Because this is a Colab example, the terminal commands below are run from notebook cells.

In [6]:
import os

## change cwd to geosam folder
os.chdir(geosam.folder)
print(os.getcwd())

/usr/local/lib/python3.11/dist-packages/geosam


In [7]:
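## get the command for terminal
## (image_path and checkpoint_path are set here so the cell runs on its own)
image_path = '/content/beiluhe_google_img_201211_clip.tif'
checkpoint_path = '/content/sam_vit_l_0b3195.pth'
cmd = f"image_encoder.py -i {image_path} -c {checkpoint_path} -f {feature_dir}"
print(cmd)

image_encoder.py -i /content/beiluhe_google_img_201211_clip.tif -c /content/sam_vit_l_0b3195.pth -f ./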

In [None]:
## run in terminal
!python image_encoder.py -i /content/beiluhe_google_img_201211_clip.tif -c /content/sam_vit_l_0b3195.pth -f ./

## You can override settings from the file by specifying parameter values. For example:
# !python image_encoder.py -s /content/setting.json  -f ./ --stride 256 --value_range "10,255"


settings:
 {'feature_dir': PosixPath('/usr/local/lib/python3.10/dist-packages/geosam'), 'image_path': PosixPath('/content/beiluhe_google_img_201211_clip.tif'), 'checkpoint_path': PosixPath('/content/sam_vit_l_0b3195.pth'), 'stride': 512, 'batch_size': 1, 'gpu_id': 0}

Initializing SAM model...


----------------------------------------------
     Start encoding image to SAM features
----------------------------------------------

Input Parameters:
----------------------------------------------
 Input data value range to be rescaled: [0, 255] (automatically set based on min-max value of input image inside the processing extent.)
 Image path: /content/beiluhe_google_img_201211_clip.tif
 Bands selected: ['1', '2', '3']
 Target resolution: 0.9999395530145561
 Processing extent: [471407.9709, 3882162.2353, 473331.8546, 3884389.1008]
 Processing image size: (width 1924, height 2227)
----------------------------------------------


RasterDataset info 
----------------------------------------
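
When the command line is assembled in Python, paths that contain spaces need quoting. A minimal sketch using the standard library's `shlex.join` is shown below. It only uses flags that appear in this notebook (`-s`, `-f`, `--stride`, `--value_range`), and the helper `build_command` is hypothetical.

```python
import shlex


def build_command(settings_file: str, feature_dir: str, **overrides) -> str:
    """Build an `image_encoder.py` command line with optional overrides."""
    args = ["python", "image_encoder.py", "-s", settings_file, "-f", feature_dir]
    for name, value in overrides.items():
        args += [f"--{name}", str(value)]
    # shlex.join quotes any argument that the shell would otherwise split.
    return shlex.join(args)


print(build_command("/content/setting.json", "./", stride=256, value_range="10,255"))
# python image_encoder.py -s /content/setting.json -f ./ --stride 256 --value_range 10,255
```

The resulting string can be pasted after `!` in a notebook cell or run in a regular shell.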

In [None]:
## check all available parameters:
!python image_encoder.py -h


This script is for encoding image to SAM features.

=====
Usage
=====
using settings.json:

    image_encoder.py -s <settings.json> -f <feature_dir>
 
 
or directly using parameters:
 
    image_encoder.py -i <image_path> -c <checkpoint_path> -f <feature_dir>
    
All Parameters:
-------------------
-s, --settings:         Path to the settings json file.
-i, --image_path:       Path to the input image.
-c, --checkpoint_path:  Path to the SAM checkpoint.
-f, --feature_dir:      Path to the output feature directory.
--model_type: one of ["vit_h", "vit_l", "vit_b"] or [0, 1, 2] or None, optional
    The type of the SAM model. If None, the model type will be 
    inferred from the checkpoint path. Default: None. 
--bands: list of int, optional .
    The bands to be used for encoding. Should not be more than three bands.
    If None, the first three bands (if available) will be used. Default: None.
--stride: int, optional
    The stride of the sliding window. Default: 512.
--extent: str, o