LAM is an open-vocabulary detection and segmentation stack for still imagery. You describe what to find in natural language; the model returns masks, boxes, and scores, exposed as GeoJSON-friendly features for GIS workflows and downstream services.
At its core, LAM uses the same text-prompted grounding idea as SAM 3: a single RGB view goes through the image backbone and decoder, and a text prompt selects which objects to segment. That per-image behavior is what you get when you call the processor on one PIL image or when the HTTP service decodes one request body.
LAM extends that core with a native, in-repo geospatial path for large GeoTIFFs. You do not have to pre-tile rasters in a separate GIS application or pre-processing logic. The lam CLI (and the underlying run_geotiff_inference API) opens the mosaic with rasterio, walks the raster in configurable windows (tile size and overlap), and runs the same SAM 3–class inference on each window. Detections are lifted from tile-local coordinates into full-mosaic pixel space: bounding boxes are offset by each window’s origin, and masks are embedded into a full-height, full-width raster canvas so every instance is expressed in one common pixel grid. Because neighboring tiles overlap, the same object can appear more than once; LAM then applies class-agnostic non-maximum suppression (NMS) on those full-raster boxes (by default using IoU on axis-aligned boxes) so the merged result is a single set of instances for the whole extent.
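The tiling walk and box lifting described above can be sketched in a few lines. This is a hypothetical standalone helper for illustration, not LAM's internal API: windows of `tile` pixels are generated with `overlap` pixels shared between neighbors, edge windows are clipped to the raster bounds, and a tile-local box is lifted into mosaic pixel space by adding the window origin.

```python
def tile_windows(width, height, tile=1008, overlap=128):
    """Yield (x_off, y_off, w, h) windows covering the full raster.

    Neighboring windows share `overlap` px; edge windows are clipped.
    """
    stride = tile - overlap
    for y in range(0, max(height - overlap, 1), stride):
        for x in range(0, max(width - overlap, 1), stride):
            yield (x, y, min(tile, width - x), min(tile, height - y))


def to_mosaic_box(tile_box, x_off, y_off):
    """Lift a tile-local (x0, y0, x1, y1) box into full-mosaic pixel space."""
    x0, y0, x1, y1 = tile_box
    return (x0 + x_off, y0 + y_off, x1 + x_off, y1 + y_off)
```

With the default 1008 px tiles and 128 px overlap, the stride is 880 px, so a detection near a tile edge is always fully contained in at least one neighboring window; the duplicates this creates are what the cross-tile NMS step removes.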
Georeferencing is handled inside that same pipeline, not as a post-processing hack. Each GeoTIFF carries an affine geotransform and a coordinate reference system (CRS). After inference and optional cross-tile merging, mask outlines and boxes are transformed from pixel space into real-world coordinates using that transform and CRS. You can reproject outputs to another CRS (for example EPSG:4326 for WGS 84 longitude/latitude) before writing GeoJSON, so features are ready for web maps, GIS databases, or orchestration stacks—while the HTTP /invocations path still returns image-coordinate GeoJSON for single-image payloads where no geotransform is in play.
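The pixel-to-world step is a plain affine mapping. A minimal sketch, assuming the six-coefficient layout used by rasterio's `Affine(a, b, c, d, e, f)` (so `x = a*col + b*row + c` and `y = d*col + e*row + f`); the sample geotransform below is invented for illustration:

```python
def pixel_to_world(transform, col, row):
    """Map a pixel (col, row) to world coordinates via an affine geotransform."""
    a, b, c, d, e, f = transform
    return (a * col + b * row + c, d * col + e * row + f)


# A north-up raster with 0.5 m pixels and origin (500000, 4100000) in a UTM CRS:
gt = (0.5, 0.0, 500000.0, 0.0, -0.5, 4100000.0)
```

Every mask outline vertex and box corner goes through this mapping; reprojection to another CRS (such as EPSG:4326) is then a separate coordinate transform applied to the resulting world coordinates.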
The codebase is a standalone packaging of the SAM 3–class image architecture and compatible weights (for example facebook/sam3 on Hugging Face). It does not depend on Meta’s upstream sam3 Python package—everything ships under this repository’s lam package.
| Mode | Best for | Output |
|---|---|---|
| HTTP server (`app.py`) | One image per request (tiles, chips, small rasters in memory) | GeoJSON in pixel / image coordinates |
| CLI (`lam`) | Large GeoTIFF mosaics | Georeferenced GeoJSON (reprojectable; cross-tile NMS to drop duplicates) |
| Python | Custom pipelines | `build_lam_image_model` + `Sam3Processor` for `set_image` / `set_text_prompt` |
- Python 3.10+
- PyTorch and torchvision (CUDA build on GPU hosts; see pytorch.org)
- Geospatial CLI: optional extras pull in rasterio, shapely, pyproj, etc.
- Visualization (upstream-style overlays): optional `[viz]` extra for `lam.visualization_utils` (`plot_mask`, `COLORS`, video/COCO helpers); see Python API below.
- HTTP server: GDAL-backed decode via `osgeo.gdal` (provided by the conda environment below or your system GDAL bindings)
Using conda-forge for PyTorch, NumPy, TorchVision, and GDAL avoids common OpenMP / libomp conflicts when mixing wheels.
```bash
# From the repository root (directory containing pyproject.toml)
conda env create -f conda/lam-sagemaker.yml
conda activate lam_sagemaker
pip install -e . --no-deps
```

If this environment previously had PyPI torch, torchvision, or numpy, remove them so only conda-forge builds load:

```bash
pip uninstall -y torch torchvision numpy
conda env update -f conda/lam-sagemaker.yml
```

Linux + NVIDIA GPU (e.g. SageMaker): conda-forge's `pytorch` is often CPU-only. Use `scripts/setup_sagemaker_notebook.sh` with `LAM_PYTORCH_CUDA=12.4` (maps to pip index `cu124`) so the script removes conda pytorch/torchvision and installs CUDA wheels from download.pytorch.org. Override the index with `LAM_PYTORCH_WHL=cu124` if needed. On macOS, do not set these (keep conda torch for OpenMP sanity).
```bash
pip install -e .
pip install -e ".[geospatial]"  # large GeoTIFF CLI
pip install -e ".[serve]"       # Flask + Waitress + server deps
pip install -e ".[viz]"         # matplotlib/OpenCV/pandas/sklearn/skimage for lam.visualization_utils
```

Install a matching PyTorch build for your platform. On macOS, if you see duplicate libomp errors, prefer the conda route above or, as a last resort, set `KMP_DUPLICATE_LIB_OK=TRUE`.
Install [geospatial] so the lam and lam-viz entry points are available (pip install -e ".[geospatial]").
Run inference (writes a georeferenced FeatureCollection):
```bash
lam --prompt "aircraft" -o out.geojson assets/images/small.tif
```

Visualize the same GeoJSON on top of the source GeoTIFF (matplotlib). Save a PNG:

```bash
lam-viz assets/images/small.tif out.geojson -o out_overlay.png
```

Or open an interactive plot (needs a display; omit `-o`):

```bash
lam-viz assets/images/small.tif out.geojson
```

Use `--min-score` (for example 0.35) to hide low-confidence polygons. `--geojson-crs` must match the CRS you used when writing the GeoJSON: the default is EPSG:4326, same as `lam --dst-crs`; if you ran `lam --dst-crs ""` to keep the raster CRS, pass that CRS to `lam-viz` (for example `--geojson-crs EPSG:32633`).
- Tiling defaults: 1008 px tiles, 128 px overlap (override with `--tile-size` / `--overlap`).
- Cross-tile deduplication uses class-agnostic NMS on boxes in raster space (default IoU 0.45). Disable with `--merge-iou -1`.
- Output CRS: `--dst-crs EPSG:4326` by default; set `--dst-crs` empty to keep the raster's CRS.
- Weights: the CLI always loads a local file: `$CHECKPOINT_PATH` if set, else `<repo>/assets/weights/sam3.pt`. The file must exist (populate with `hf sync hf://buckets/drduhe/lam-weights ./assets/weights/`). Pass `--checkpoint` to override. The CLI does not download from Hugging Face.
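The cross-tile deduplication above is a standard greedy NMS. An illustrative version (not LAM's actual code) over full-raster `(x0, y0, x1, y1)` boxes, using the default 0.45 IoU threshold:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.45):
    """Return indices of boxes surviving greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep
```

Because the NMS is class-agnostic, two overlapping detections of the same object from different prompts or tiles collapse to the single highest-scoring instance.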
```bash
python app.py
```

By default, weights load from Hugging Face when no local checkpoint is configured (`LOAD_FROM_HF` defaults to `true` in `app.py`). Set `LOAD_FROM_HF=false` or `CHECKPOINT_PATH` when you want offline / image-bundled weights only.

- Listens on `0.0.0.0:8080` by default (override with `SAGEMAKER_BIND_TO_PORT`).
- `GET /ping`: health check (HTTP 200, plain-text body `healthy`).
- `POST /invocations`: same tiling pipeline as `python -m lam`: overlapping windows (defaults 1008 px, overlap 128), cross-tile NMS, georeferenced GeoJSON in `LAM_TILE_DST_CRS` (default `EPSG:4326`). Send raw raster bytes in the body (for example GeoTIFF); `Content-Type` should match the format (`image/tiff`, `image/jpeg`, …). Alternatively pass `s3_uri` in `X-Amzn-SageMaker-Custom-Attributes` or a JSON body (see below).
Per-request parameters can be set in X-Amzn-SageMaker-Custom-Attributes (comma-separated key=value, values URL-encoded as needed) and/or in a JSON body when Content-Type: application/json. If both are present, JSON overrides the header for the same key.
| Parameter | Purpose | Accepted keys (case-insensitive; - and _ equivalent) |
|---|---|---|
| Text prompt | Open-vocabulary query | text_prompt, prompt, lam_text_prompt |
| S3 input | Read raster from S3 instead of body | s3_uri, lam_s3_uri |
| Tile size | Window size (px) | tile_size, lam_tile_size |
| Overlap | Tile overlap (px) | overlap, lam_tile_overlap |
| Output CRS | GeoJSON geometry CRS | dst_crs, lam_tile_dst_crs (empty value keeps raster CRS) |
| Cross-tile NMS | Box merge IoU | merge_iou, lam_tile_merge_iou (-1 disables) |
Environment variables `LAM_TILE_*` still set defaults when a key is omitted.
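The precedence rule can be sketched as follows. This is a hypothetical helper, not the server's actual parsing code: header pairs are split on commas, keys are lower-cased with `-` mapped to `_`, values are URL-decoded, and JSON body values win on conflict.

```python
from urllib.parse import unquote


def merge_request_params(custom_attributes, json_body):
    """Merge header key=value pairs with a JSON body; JSON wins on conflict."""
    params = {}
    for pair in filter(None, (custom_attributes or "").split(",")):
        key, _, value = pair.partition("=")
        # keys are case-insensitive with '-' and '_' equivalent; values may be URL-encoded
        params[key.strip().lower().replace("-", "_")] = unquote(value.strip())
    for key, value in (json_body or {}).items():
        params[key.strip().lower().replace("-", "_")] = value  # JSON overrides header
    return params
```

For example, a header of `Text-Prompt=small%20boat,tile_size=512` combined with a JSON body of `{"tile_size": 1024}` yields `text_prompt="small boat"` and `tile_size=1024`.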
```bash
curl -s -X POST http://127.0.0.1:8080/invocations \
  -H "Content-Type: image/tiff" \
  -H "X-Amzn-SageMaker-Custom-Attributes: text_prompt=building" \
  --data-binary @assets/images/small.tif
```

The response body is a GeoJSON FeatureCollection with geometries in the requested output CRS (the raster CRS if `dst_crs` is empty).
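The same call can be made from Python. A sketch using only the standard library and the JSON-body form; the bucket and object name are placeholders, and the server must already be running locally:

```python
import json
import urllib.request

payload = {
    "s3_uri": "s3://my-bucket/scene.tif",  # placeholder S3 object
    "text_prompt": "building",
    "dst_crs": "EPSG:4326",
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/invocations",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# Uncomment to send the request against a running server:
# with urllib.request.urlopen(req) as resp:
#     feature_collection = json.load(resp)
```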
| Variable | Purpose |
|---|---|
| `LOAD_FROM_HF` | Default `true`: allow Hugging Face when no local checkpoint; set `false` to require local weights |
| `CHECKPOINT_PATH` | Path to local weights (convention: `assets/weights/sam3.pt`); when set, Hugging Face load is skipped |
| `DEFAULT_TEXT_PROMPT` | Prompt if the request does not set `text_prompt` (default `objects`) |
| `CONFIDENCE_THRESHOLD` | Detection confidence cutoff (default 0.3) |
| `FUSE_DETECTIONS_IOU_THRESHOLD` | Optional IoU threshold for fusing overlapping detections within each tile |
| `LAM_TILE_SIZE` | HTTP `/invocations` tile width/height in pixels (default 1008, same as CLI) |
| `LAM_TILE_OVERLAP` | HTTP tile overlap in pixels (default 128) |
| `LAM_TILE_DST_CRS` | HTTP output CRS, e.g. `EPSG:4326` (empty string keeps raster CRS) |
| `LAM_TILE_MERGE_IOU` | Cross-tile box NMS IoU threshold (default 0.45; negative disables) |
| `ENABLE_TORCH_COMPILE` | `torch.compile` on CUDA (default on when CUDA is available) |
| `TORCH_COMPILE_MODE` | Passed to `torch.compile` (default `reduce-overhead`) |
| `MIXED_PRECISION` | `bf16`, `fp16`, or `fp32` (CUDA; BF16 falls back if unsupported) |
| `PREWARM_GPU` | Run a dummy forward on startup when CUDA is available |
| `SAGEMAKER_BIND_TO_PORT` | Listen port (default 8080) |
Production-style images and local run examples (including GPU and large GeoTIFF entrypoint overrides) are documented in docker/README.md.
To provision a VPC and SageMaker endpoint in AWS with the same container, use the CDK app under `cdk/` (create `deployment.json` from `deployment.json.example`, then `cdk deploy --all`).
Weights for the image must be present under assets/weights/ (for example sam3.pt). Sync them from the project Hugging Face bucket, then build from the repository root (the directory that contains pyproject.toml and app.py):
```bash
hf sync hf://buckets/drduhe/lam-weights ./assets/weights/
docker build -f docker/Dockerfile.lam-sagemaker -t lam-sagemaker:latest .
```

End-to-end wall-clock timing for the GeoTIFF CLI (run as a subprocess, matching SageMaker notebook `%%time` usage) is documented in `benchmarks/README.md`. Use `scripts/benchmark_geotiff_cli.py` to reproduce runs and emit `--json` for regression notes.
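The subprocess timing approach can be sketched in a few lines. This is a minimal hypothetical harness in the spirit of `scripts/benchmark_geotiff_cli.py`, not that script itself:

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once as a subprocess and return (elapsed_seconds, returncode)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, capture_output=True)
    return time.perf_counter() - start, proc.returncode


# Example (assumed paths): time_command(["lam", "--prompt", "aircraft",
#                                        "-o", "out.geojson", "scene.tif"])
```

Measuring the whole process this way includes interpreter startup and model load, which is what an orchestration layer actually pays per invocation.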
| Path | Contents |
|---|---|
| `assets/weights/` | Local `sam3.pt` (gitignored); populate with `hf sync hf://buckets/drduhe/lam-weights ./assets/weights/` for Docker builds and offline use; or set `CHECKPOINT_PATH` / `--checkpoint` |
| `lam/` | Model, training, evaluation, inference helpers, `build_lam_image_model`, tokenizer asset under `lam/assets/` |
| `lam/geospatial/` | GeoTIFF tiling pipeline and georeferenced GeoJSON output |
| `lam/visualization_utils.py` | Upstream-aligned plotting helpers (`plot_mask`, `COLORS`, video/COCO tools); needs `[viz]` (the conda env includes these via pip) |
| `app.py` | Flask app: GDAL decode → LAM → GeoJSON |
| `lam/hosting/` | Server helpers, SageMaker-oriented parsing, GeoJSON serialization |
| `docker/` | Multi-stage SageMaker-oriented Dockerfile |
| `cdk/` | AWS CDK: VPC, SageMaker role, real-time endpoint (see `cdk/README.md`) |
| `conda/lam-sagemaker.yml` | Reference conda environment for GDAL + aligned scientific stack |
| `benchmarks/` | Performance benchmarking notes for the GeoTIFF CLI |
| `scripts/benchmark_geotiff_cli.py` | Wall-time driver for `python -m lam` (see `benchmarks/README.md`) |
```python
import torch
from PIL import Image

from lam import build_lam_image_model
from lam.model.sam3_image_processor import Sam3Processor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = build_lam_image_model(device=device, load_from_HF=True).to(device).eval()
processor = Sam3Processor(model, device=device, confidence_threshold=0.3)

# PIL Image or similar → state → prompt
pil_image = Image.open("assets/images/small.tif")
state = processor.set_image(pil_image)
out = processor.set_text_prompt("your prompt here", state)
# out contains masks, boxes, scores
```

Install `[viz]` for `lam.visualization_utils` (same role as upstream `sam3.visualization_utils`: `COLORS`, `plot_mask`, `plot_bbox`, `plot_results`, video/COCO helpers).

```python
from lam.visualization_utils import COLORS, plot_mask

# After processor.set_text_prompt(...): overlay each mask on the PIL image in matplotlib
```

For branch workflow, reviews, and commit conventions, see `CONTRIBUTING.md`. Release notes live in `CHANGELOG.md`.
Install dev tools and enable pre-commit hooks (Ruff lint + format, YAML/TOML checks, whitespace):
pip install '.[dev]'
pre-commit installRun on all tracked files once: pre-commit run --all-files. Hooks use the Ruff settings in pyproject.toml.
Maintenance — LAM is supported on a best-effort basis. There is no service-level agreement and response times will vary.
Where to ask — Use GitHub Issues on this repository for bug reports and feature requests (issue templates help triage). Use GitHub Discussions for questions, how-tos, and broader design or integration topics when Discussions are enabled here; if they are not enabled yet, opening an Issue for a question is fine.
Help wanted — The maintainer is actively looking for community members who want to share ownership: reviews, documentation, issue triage, and steering the roadmap. If you rely on LAM and want to co-maintain, say so in a Discussion, Issue, or PR thread.
Why this repo — LAM is an independent packaging aimed at geospatial tiling, HTTP serving, and day-to-day integration on top of the SAM 3–class image model. One reason it exists is limited ongoing development in Meta’s upstream facebookresearch/sam3 repository for those downstream concerns; LAM is meant to move them forward with the community. Point [project.urls] in pyproject.toml at this repo when you publish so PyPI and GitHub show the right home.
Model and code are subject to the SAM License in LICENSE (Meta). Review the agreement before use or redistribution.
LAM builds on research and open releases from the Segment Anything / SAM family. Weights and architecture align with publicly released SAM 3 image checkpoints (for example on Hugging Face); this repository is an independent packaging for geospatial and serving workflows.