37 changes: 16 additions & 21 deletions .github/workflows/docs.yml
@@ -5,6 +5,8 @@ on:
    branches: [ main ]
  pull_request:
    branches: [ main ]
    paths:
      - "docs/**"

jobs:
  build-docs:
@@ -17,27 +19,20 @@ jobs:
        with:
          python-version: '3.11'

      - name: Install uv
        uses: astral-sh/setup-uv@v6
        with:
          version: "0.8.4"
          python-version: "3.13"
          enable-cache: false

      - name: Install dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y proj-bin gdal-bin libgdal-dev
          python -m pip install --upgrade pip
          pip install -e ".[docs]" --only-binary=:all:
        run: uv sync --group docs

      - name: Build documentation
        run: |
          cd docs
          make html
      - name: Build docs
        if: github.event_name == 'pull_request'
        run: uv run -- mkdocs build

      - name: Upload documentation
        uses: actions/upload-artifact@v4
        with:
          name: documentation
          path: docs/_build/html/

      - name: Deploy to GitHub Pages
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs/_build/html
      - name: Deploy docs
        if: github.event_name == 'push'
        run: uv run -- mkdocs gh-deploy --force
2 changes: 1 addition & 1 deletion .vscode/launch.json
@@ -166,4 +166,4 @@

}
]
}
}
107 changes: 107 additions & 0 deletions docs/converter.md
@@ -0,0 +1,107 @@
# Using the GeoZarr Converter

The GeoZarr converter transforms EOPF datasets into a format compliant with GeoZarr spec 0.4. This guide covers its command-line interface and its Python API.

## Command Line Interface

The converter is available as the `eopf-geozarr` command-line tool. The most common use cases are shown below.

### Basic Conversion

Convert an EOPF dataset to GeoZarr format:

```bash
eopf-geozarr convert input.zarr output.zarr
```

### S3 Output

Convert and save the output directly to an S3 bucket:

```bash
eopf-geozarr convert input.zarr s3://my-bucket/output.zarr
```

### Parallel Processing

Enable parallel processing for large datasets using a Dask cluster:

```bash
eopf-geozarr convert input.zarr output.zarr --dask-cluster
```

### Validation

Validate the GeoZarr compliance of a dataset:

```bash
eopf-geozarr validate output.zarr
```

## Python API

The converter also provides a Python API for programmatic use.

### Example: Basic Conversion

```python
import xarray as xr
from eopf_geozarr import create_geozarr_dataset

# Load your EOPF DataTree
dt = xr.open_datatree("path/to/eopf/dataset.zarr", engine="zarr")

# Convert to GeoZarr format
dt_geozarr = create_geozarr_dataset(
    dt_input=dt,
    groups=["/measurements/r10m", "/measurements/r20m", "/measurements/r60m"],
    output_path="path/to/output/geozarr.zarr",
    spatial_chunk=4096,
    min_dimension=256,
    tile_width=256,
    max_retries=3,
)
```

### Example: S3 Output

```python
import os

import xarray as xr
from eopf_geozarr import create_geozarr_dataset

# Configure S3 credentials (placeholder values)
os.environ['AWS_ACCESS_KEY_ID'] = 'your_access_key'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'your_secret_key'
os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'

# Load your EOPF DataTree
dt = xr.open_datatree("path/to/eopf/dataset.zarr", engine="zarr")

# Convert and save to S3
dt_geozarr = create_geozarr_dataset(
    dt_input=dt,
    groups=["/measurements/r10m", "/measurements/r20m", "/measurements/r60m"],
    output_path="s3://my-bucket/output.zarr",
    spatial_chunk=4096,
    min_dimension=256,
    tile_width=256,
    max_retries=3,
)
```

## Advanced Features

### Chunk Alignment

The converter ensures proper chunk alignment to optimize storage and prevent data corruption. It uses the `calculate_aligned_chunk_size` function to determine optimal chunk sizes.
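
To illustrate what chunk alignment means, here is a minimal sketch. It is not the library's `calculate_aligned_chunk_size` implementation, whose signature and behaviour are not shown in this guide. The helper name `aligned_chunk_size` and its parameters are hypothetical. It assumes that "aligned" means the chunk size divides the dimension evenly, so no partial chunk is left at the edge.

```python
def aligned_chunk_size(dimension_size: int, target_chunk_size: int) -> int:
    """Return the largest divisor of dimension_size that is <= target_chunk_size.

    Hypothetical helper for illustration only; the converter's own
    calculate_aligned_chunk_size may use a different rule.
    """
    if dimension_size <= 0 or target_chunk_size <= 0:
        raise ValueError("sizes must be positive")
    # A single chunk covering the whole dimension is trivially aligned.
    if target_chunk_size >= dimension_size:
        return dimension_size
    # Walk down from the target until a size divides the dimension evenly.
    for candidate in range(target_chunk_size, 0, -1):
        if dimension_size % candidate == 0:
            return candidate
    return 1  # not reached: 1 divides every positive integer


print(aligned_chunk_size(10980, 4096))  # 3660, i.e. exactly 3 chunks per axis
```

With a 10980-pixel dimension and a requested chunk of 4096, an unaligned layout would produce two full chunks and one partial chunk of 2788 pixels. The aligned size of 3660 yields three equal chunks instead.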

### Multiscale Support

The converter supports multiscale datasets: it creates overview levels in which each level downsamples the previous one by a factor of 2. The levels are stored as sibling groups (e.g., `/0`, `/1`, `/2`).
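
The sketch below shows how a /2 overview pyramid could be enumerated. It is an illustration under stated assumptions, not the converter's code: the function name `overview_shapes` is hypothetical, and the stopping rule (stop before the smaller side would drop below `min_dimension`) is an assumption about how a minimum dimension might be applied.

```python
def overview_shapes(height: int, width: int, min_dimension: int = 256) -> list[tuple[int, int]]:
    """List (height, width) for each pyramid level, starting with level 0.

    Hypothetical helper: each level halves the previous one (integer division),
    and halving stops before the smaller side would fall below min_dimension.
    """
    shapes = [(height, width)]  # level 0 is the native resolution
    while min(height, width) // 2 >= min_dimension:
        height, width = height // 2, width // 2
        shapes.append((height, width))
    return shapes


# Each entry would correspond to one sibling group: /0, /1, /2, ...
for level, shape in enumerate(overview_shapes(10980, 10980)):
    print(f"/{level}: {shape}")
```

Under these assumptions a 10980 x 10980 array gives six levels, `/0` through `/5`, the last being 343 x 343.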

### Native CRS Preservation

The converter maintains the native coordinate reference system (CRS) of the dataset, avoiding reprojection to Web Mercator.

## Error Handling

The converter handles errors and retries failed network operations, which makes writes to remote storage such as S3 more reliable. The number of retries is controlled by the `max_retries` parameter shown in the Python API examples above.
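
As a rough picture of what retry logic for network operations looks like, here is a generic sketch. It does not reproduce the converter's internals: the helper `with_retries`, the exponential backoff, and the choice to retry only on `OSError` are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying up to max_retries times on OSError.

    Hypothetical helper: waits base_delay, then 2x, 4x, ... between attempts,
    and re-raises the last error once the retries are exhausted.
    """
    failures = 0
    while True:
        try:
            return operation()
        except OSError:  # includes ConnectionError and TimeoutError
            failures += 1
            if failures > max_retries:
                raise
            sleep(base_delay * 2 ** (failures - 1))
```

A write step could then be wrapped as `with_retries(lambda: write_band(...), max_retries=3)`, where `write_band` stands in for whatever network operation is being protected.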

For more details, refer to the [API Reference](api.md).
33 changes: 24 additions & 9 deletions docs/index.md
@@ -22,6 +22,7 @@ The EOPF GeoZarr library enables conversion of EOPF datasets to the GeoZarr spec
## Key Features

### GeoZarr Specification Compliance

- Full compliance with GeoZarr spec 0.4
- `_ARRAY_DIMENSIONS` attributes on all arrays
- CF standard names for all variables
@@ -30,19 +31,33 @@ The EOPF GeoZarr library enables conversion of EOPF datasets to the GeoZarr spec
- Proper multiscales metadata structure

### Native CRS Preservation
- No reprojection to TMS required
- Maintains original coordinate reference systems
- Native CRS tile matrix sets

### Multiscale Support
- COG-style /2 downsampling logic
- Overview levels as children groups
- Configurable minimum dimensions and tile widths
- Maintains native CRS (e.g., UTM zones) throughout all overview levels
- Avoids reprojection to Web Mercator, preserving scientific accuracy
- Custom tile matrix sets using native CRS

### Band Organization

- Spectral bands stored as separate DataArray variables
- Enables band-specific metadata and selective access
- Supports different processing chains per spectral band

### Chunking Strategy

- Aligned chunking to optimize storage efficiency and I/O performance
- Prevents partial chunks that waste storage space
- Reduces memory fragmentation

### Hierarchical Structure

- All resolution levels stored as siblings (`/0`, `/1`, `/2`, etc.)
- Multiscales metadata in parent group attributes
- Complies with xarray DataTree alignment requirements

### Robust Processing

- Band-by-band writing with validation
- Retry logic for network operations
- Comprehensive error handling

## Architecture

@@ -58,4 +73,4 @@ See the [Quick Start](quickstart.md) guide to begin using the library, or check

## Support

For questions, issues, or contributions, please visit the [GitHub repository](https://github.com/developmentseed/eopf-geozarr).
For questions, issues, or contributions, please visit the [GitHub repository](https://github.com/eopf-explorer/data-model).
82 changes: 82 additions & 0 deletions docs/stylesheets/extra.css
@@ -0,0 +1,82 @@
[data-md-color-scheme="default"] {
--md-primary-fg-color: #160A42;
--font-family: Inter, sans-serif;
--md-link-fg-color: #0078D7;
/* Bright blue for visibility */
}

/* Apply the link color only in the main document content */
[data-md-color-scheme="default"] main article a {
color: var(--md-link-fg-color);
text-decoration: none;
/* Remove underline */
}

[data-md-color-scheme="slate"] {
--md-default-fg-color: hsla(var(--md-hue), 15%, 90%, 0.9);
--md-default-fg-color--light: hsla(var(--md-hue), 15%, 90%, 1);
--md-primary-fg-color: #160A42;
--font-family: Inter, sans-serif;
}

/* Sets up a figure counter */

/* Initialise the counter */
body {
counter-reset: figureCounter;
}

/* Increment the counter for every instance of a figure even if it doesn't have a caption */
figure {
counter-increment: figureCounter;
}

/* Prepend the counter to the figcaption content */
figure figcaption:before {
content: "Figure " counter(figureCounter) ": ";
}

/* reduce font size of code and xarray output cells in rendered jupyter notebooks */
.jupyter-wrapper .jp-OutputArea-output pre,
.xr-wrap {
font-size: 0.8em;
}

/* Code to better render xarray html representation with mknotebook */
.md-typeset pre.xr-text-repr-fallback {
display: none;
}

.md-typeset ul.xr-sections,
.jupyter-wrapper .jp-OutputArea-output dl.xr-attrs {
display: grid;
}

.md-typeset li.xr-var-item,
.md-typeset ul.xr-var-list {
display: contents;
}

.md-typeset .xr-section-details {
display: none;
}

.md-typeset ul.xr-dim-list li {
margin-bottom: 0;
margin-left: 0;
}

.md-typeset ul.xr-dim-list {
margin-bottom: 0;
margin-top: 0;
}

.jupyter-wrapper .jp-OutputArea-output .xr-attrs dt {
padding: 0;
margin: 0;
float: left;
padding-right: 10px;
width: auto;
font-weight: normal;
grid-column: 1;
}
54 changes: 54 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,54 @@
site_name: ESA Sentinel Zarr Explorer - Data Model
site_description: Documentation for the ESA Sentinel Zarr Explorer Data Model
site_author: Development Seed
repo_url: https://github.com/eopf-explorer/data-model
edit_uri: edit/main/docs/

theme:
  name: material
  logo: assets/logo-eopf-sentinel-explorer-dark-icon.svg
  features:
    #- navigation.tabs
    #- navigation.tabs.sticky
    - navigation.sections
    - content.code.copy
    - navigation.instant
    - navigation.tracking
    - navigation.expand
    - navigation.indexes
    - navigation.top
    - content.code.annotate
    - content.tabs.link
    #- toc.integrate # Table of contents on the left

  palette:
    # Palette toggle for light mode
    - scheme: default
      toggle:
        icon: material/brightness-7
        name: Switch to dark mode
    - scheme: slate
      accent: orange
      toggle:
        icon: material/brightness-4
        name: Switch to light mode

extra_css:
  - stylesheets/extra.css


markdown_extensions:
  - admonition
  - attr_list
  - codehilite
  - pymdownx.highlight
  - pymdownx.superfences
  - pymdownx.tabbed
  - pymdownx.tasklist:
      custom_checkbox: true
  - toc:
      permalink: true

nav:
  - Home: index.md
  - Using the Converter: converter.md
8 changes: 4 additions & 4 deletions pyproject.toml
@@ -61,10 +61,10 @@ test = [
"pytest-xdist>=3.0.0",
]
docs = [
"sphinx>=6.0.0",
"sphinx-rtd-theme>=1.2.0",
# "myst-parser>=1.0.0",
"sphinx-autodoc-typehints>=1.20.0",
"mkdocs>=1.4.0",
"mkdocs-material>=9.1.0",
"pymdown-extensions>=10.0",
"mike>=2.1.3",
]

[project.urls]