# The proj: Convention

The **proj:** convention encodes Coordinate Reference System (CRS) information for geospatial data stored in Zarr format. It answers the question *"what coordinate system is this data in?"* using one of three standard encodings.

This notebook covers:

1. The three CRS encoding methods: EPSG code, WKT2, and PROJJSON
2. Convention registration via `zarr_conventions`
3. Validation
4. Inheritance from groups to arrays
5. Composition with the `spatial:` and `multiscales` conventions
6. End-to-end conversion of a Cloud-Optimized GeoTIFF to Zarr V3

Each section explains the metadata concept and then demonstrates it with a real-world example.

## Example Dataset

Throughout this notebook we use a [Sentinel-2 L2A scene](https://registry.opendata.aws/sentinel-2-l2a-cogs/) from the `sentinel-cogs` bucket on AWS as our running example (following the [async-geotiff demo](https://github.com/developmentseed/async-geotiff#example)).

The scene is tile **12/S/UF** acquired on 2022-06-09. Its key geospatial properties are:

| Property | Value |
|---|---|
| CRS | EPSG:32612 (WGS 84 / UTM zone 12N) |
| Pixel size | 10 m (TCI band) |
| Origin | (300000.0, 4100040.0) |
| Dimensions | 10980 rows x 10980 columns |
| Bounding box | 300000.0, 3990240.0, 409800.0, 4100040.0 |

Sentinel-2 is a good example because it has bands at three native resolutions (10 m, 20 m, 60 m) that all share the same CRS — a natural fit for group-level inheritance and multiscale composition.

In [1]:
import json

from pyproj import CRS

# The CRS for our Sentinel-2 scene
crs = CRS.from_epsg(32612)
print(crs)

EPSG:32612


## Overview

The proj: convention defines three properties, all using the `proj:` namespace prefix:

| Property | Type | Description |
|---|---|---|
| `proj:code` | string | Authority:code identifier (e.g., `EPSG:4326`) |
| `proj:wkt2` | string | WKT2 (ISO 19162) CRS representation |
| `proj:projjson` | object | PROJJSON CRS representation |

**At least one** of these must be provided. The convention can be applied to both Zarr groups and arrays.

## Method 1: EPSG Code

The simplest way to specify a CRS is with an authority:code identifier. The `proj:code` string follows the pattern `AUTHORITY:CODE` and must match `^[A-Z]+:[0-9]+$`.

Known projection authorities include:

| Authority | Description |
|---|---|
| EPSG | European Petroleum Survey Group |
| IAU | International Astronomical Union (e.g., `IAU_2015:30100`) |
| OGC | Open Geospatial Consortium |
| ESRI | Esri spatial references |

This is the preferred method when a well-known code exists for the CRS, because it's compact and unambiguous.
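
The pattern quoted above can be checked with the standard library alone. The following is a minimal sketch, not the convention's reference validator; the helper name `matches_proj_code` is hypothetical.

```python
import re

# Pattern quoted in the text above for `proj:code`.
PROJ_CODE_PATTERN = re.compile(r"^[A-Z]+:[0-9]+$")


def matches_proj_code(value: str) -> bool:
    """Return True if `value` matches the AUTHORITY:CODE pattern (hypothetical helper)."""
    # fullmatch is used so that a trailing newline is not silently accepted by `$`.
    return PROJ_CODE_PATTERN.fullmatch(value) is not None


for candidate in ["EPSG:32612", "epsg:32612", "EPSG:4326", "IAU_2015:30100"]:
    print(f"{candidate!r}: {matches_proj_code(candidate)}")
```

Applied literally, the pattern rejects lowercase authorities and also the `IAU_2015:30100` example from the table, because `[A-Z]+` allows neither underscores nor digits in the authority part. A reader that needs to accept such identifiers would have to relax the check.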

In [2]:
from geozarr_examples import create_proj_attrs

# Our Sentinel-2 scene uses UTM zone 12N
attrs = create_proj_attrs(code="EPSG:32612")
print(json.dumps(attrs, indent=2))

{
  "proj:code": "EPSG:32612"
}


## Method 2: WKT2

WKT2 ([ISO 19162](http://docs.opengeospatial.org/is/12-063r5/12-063r5.html)) provides a full textual CRS representation. It is useful when:

- No valid authority code exists for the CRS
- You need the full CRS definition to be self-contained in the metadata
- The CRS uses custom parameters not captured by a registered code

Here we use pyproj to obtain the WKT2 string for the same Sentinel-2 CRS.

In [3]:
# The same UTM zone 12N CRS, expressed as WKT2
wkt2_string = crs.to_wkt()

attrs = create_proj_attrs(wkt2=wkt2_string)
print(json.dumps(attrs, indent=2))

{
  "proj:wkt2": "PROJCRS[\"WGS 84 / UTM zone 12N\",BASEGEOGCRS[\"WGS 84\",ENSEMBLE[\"World Geodetic System 1984 ensemble\",MEMBER[\"World Geodetic System 1984 (Transit)\"],MEMBER[\"World Geodetic System 1984 (G730)\"],MEMBER[\"World Geodetic System 1984 (G873)\"],MEMBER[\"World Geodetic System 1984 (G1150)\"],MEMBER[\"World Geodetic System 1984 (G1674)\"],MEMBER[\"World Geodetic System 1984 (G1762)\"],MEMBER[\"World Geodetic System 1984 (G2139)\"],MEMBER[\"World Geodetic System 1984 (G2296)\"],ELLIPSOID[\"WGS 84\",6378137,298.257223563,LENGTHUNIT[\"metre\",1]],ENSEMBLEACCURACY[2.0]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],ID[\"EPSG\",4326]],CONVERSION[\"UTM zone 12N\",METHOD[\"Transverse Mercator\",ID[\"EPSG\",9807]],PARAMETER[\"Latitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8801]],PARAMETER[\"Longitude of natural origin\",-111,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"Scale factor at natural o

## Method 3: PROJJSON

[PROJJSON](https://proj.org/specifications/projjson.html) is a JSON encoding of CRS definitions following the PROJ specification. Since it's a native JSON object, it integrates naturally with Zarr's JSON-based metadata and can be validated against the [PROJJSON schema](https://proj.org/schemas/v0.7/projjson.schema.json).

In [4]:
# The same UTM zone 12N CRS, expressed as PROJJSON
projjson_obj = crs.to_json_dict()

attrs = create_proj_attrs(projjson=projjson_obj)
print(json.dumps(attrs, indent=2))

{
  "proj:projjson": {
    "$schema": "https://proj.org/schemas/v0.7/projjson.schema.json",
    "type": "ProjectedCRS",
    "name": "WGS 84 / UTM zone 12N",
    "base_crs": {
      "name": "WGS 84",
      "datum_ensemble": {
        "name": "World Geodetic System 1984 ensemble",
        "members": [
          {
            "name": "World Geodetic System 1984 (Transit)",
            "id": {
              "authority": "EPSG",
              "code": 1166
            }
          },
          {
            "name": "World Geodetic System 1984 (G730)",
            "id": {
              "authority": "EPSG",
              "code": 1152
            }
          },
          {
            "name": "World Geodetic System 1984 (G873)",
            "id": {
              "authority": "EPSG",
              "code": 1153
            }
          },
          {
            "name": "World Geodetic System 1984 (G1150)",
            "id": {
              "authority": "EPSG",
              "code": 1154
          

All three methods describe the same CRS — the choice depends on your use case:

| Method | When to use |
|---|---|
| `proj:code` | A well-known authority code exists (most common) |
| `proj:wkt2` | Self-contained text representation needed, or no authority code exists |
| `proj:projjson` | JSON-native representation preferred, or detailed CRS structure needed |
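
Because PROJJSON is plain JSON, it can be inspected with standard tooling. As a sketch, a reader could try to recover the compact `AUTHORITY:CODE` form from a PROJJSON object's top-level `id` member, when one is present. The helper name `code_from_projjson` and the hand-written fragment are illustrative assumptions; real PROJJSON exported by pyproj carries many more keys.

```python
import json
from typing import Optional

# Hand-written PROJJSON fragment for illustration only.
projjson_text = """
{
  "type": "ProjectedCRS",
  "name": "WGS 84 / UTM zone 12N",
  "id": {"authority": "EPSG", "code": 32612}
}
"""


def code_from_projjson(projjson: dict) -> Optional[str]:
    """Return 'AUTHORITY:CODE' if the object has a usable top-level id, else None."""
    identifier = projjson.get("id")
    if not isinstance(identifier, dict):
        return None
    authority = identifier.get("authority")
    code = identifier.get("code")
    if authority is None or code is None:
        return None
    return f"{authority}:{code}"


projjson = json.loads(projjson_text)
print(code_from_projjson(projjson))
print(code_from_projjson({"type": "ProjectedCRS", "name": "custom"}))
```

A custom CRS without an `id` yields `None`, which is exactly the case where `proj:wkt2` or `proj:projjson` is needed instead of `proj:code`.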

## Convention Registration

Every Zarr convention must be registered in the `zarr_conventions` array in the node's attributes. This array identifies which conventions are in use and provides links to their schemas and specifications.

A convention entry must include at least one of `uuid`, `schema_url`, or `spec_url` to be identifiable.
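
The identifiability rule can be expressed in a few lines. This is a sketch under one assumption that the text does not state: empty strings are treated the same as missing keys. The helper name `is_identifiable` is hypothetical.

```python
IDENTIFYING_KEYS = ("uuid", "schema_url", "spec_url")


def is_identifiable(entry: dict) -> bool:
    """True if the entry has a non-empty value for at least one identifying key."""
    # Assumption: an empty string does not count as an identifier.
    return any(entry.get(key) for key in IDENTIFYING_KEYS)


entries = [
    {"uuid": "f17cb550-5864-4468-aeb7-f3180cfb622f", "name": "proj:"},
    {"name": "proj:", "description": "no identifier at all"},
]
for entry in entries:
    print(entry.get("name"), is_identifiable(entry))
```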

In [5]:
from geozarr_examples import ProjConventionMetadata, create_zarr_conventions

conventions = create_zarr_conventions(ProjConventionMetadata())
print(json.dumps(conventions, indent=2))

[
  {
    "uuid": "f17cb550-5864-4468-aeb7-f3180cfb622f",
    "schema_url": "https://raw.githubusercontent.com/zarr-experimental/geo-proj/refs/tags/v1/schema.json",
    "spec_url": "https://github.com/zarr-experimental/geo-proj/blob/v1/README.md",
    "name": "proj:",
    "description": "Coordinate reference system information for geospatial data"
  }
]


The convention entry contains:

- **uuid** (`f17cb550-...`): Permanent identifier for the proj: convention
- **schema_url**: Link to the JSON Schema used for machine validation
- **spec_url**: Link to the human-readable specification
- **name**: The namespace prefix (`proj:`)
- **description**: Brief summary of the convention's purpose

## Putting It Together

Here's what the complete Zarr V3 metadata looks like for a Sentinel-2 group using the proj: convention. This is the structure that would appear in the group's `zarr.json` file.

In [6]:
# Complete zarr.json metadata for the Sentinel-2 TCI group
full_attrs = create_proj_attrs(code="EPSG:32612")
full_attrs["zarr_conventions"] = create_zarr_conventions(ProjConventionMetadata())

zarr_metadata = {
    "zarr_format": 3,
    "node_type": "group",
    "attributes": full_attrs,
}

print(json.dumps(zarr_metadata, indent=2))

{
  "zarr_format": 3,
  "node_type": "group",
  "attributes": {
    "proj:code": "EPSG:32612",
    "zarr_conventions": [
      {
        "uuid": "f17cb550-5864-4468-aeb7-f3180cfb622f",
        "schema_url": "https://raw.githubusercontent.com/zarr-experimental/geo-proj/refs/tags/v1/schema.json",
        "spec_url": "https://github.com/zarr-experimental/geo-proj/blob/v1/README.md",
        "name": "proj:",
        "description": "Coordinate reference system information for geospatial data"
      }
    ]
  }
}


## Validation

The `validate_proj` helper checks that attributes conform to the convention. It returns a `(is_valid, errors)` tuple. The key rule is that **at least one** of `proj:code`, `proj:wkt2`, or `proj:projjson` must be present.

In [7]:
from geozarr_examples import validate_proj

# Valid: our Sentinel-2 scene's CRS
is_valid, errors = validate_proj({"proj:code": "EPSG:32612"})
print(f"Valid: {is_valid}, Errors: {errors}")

Valid: True, Errors: []


In [8]:
# Invalid: no CRS encoding provided
is_valid, errors = validate_proj({})
print(f"Valid: {is_valid}")
for error in errors:
    print(f"  {error}")

Valid: False
  {'type': 'value_error', 'loc': (), 'msg': 'Value error, At least one of proj:code, proj:wkt2, or proj:projjson must be provided', 'input': {}, 'ctx': {'error': ValueError('At least one of proj:code, proj:wkt2, or proj:projjson must be provided')}, 'url': 'https://errors.pydantic.dev/2.12/v/value_error'}
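
The rule reported in the error message above ("At least one of ...") can be sketched without any dependencies. This is not how `validate_proj` is implemented; the helper names `present_crs_keys` and `has_crs` are hypothetical.

```python
CRS_KEYS = ("proj:code", "proj:wkt2", "proj:projjson")


def present_crs_keys(attrs: dict) -> list:
    """List the proj: CRS encodings present in an attributes dict."""
    return [key for key in CRS_KEYS if key in attrs]


def has_crs(attrs: dict) -> bool:
    """Apply the rule from the error message: at least one encoding is present."""
    return len(present_crs_keys(attrs)) >= 1


print(has_crs({"proj:code": "EPSG:32612"}))
print(has_crs({}))
```

Returning the list of present keys, rather than only a boolean, lets a caller report which encoding it found or apply a stricter policy of its own.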


## Inheritance

The proj: convention supports **group-to-array inheritance**. When defined at the group level, the CRS applies to all direct child arrays. Any child array can override the inherited CRS with its own definition.

Inheritance is limited to **direct children only** — it does not cascade to grandchildren. This keeps the scope predictable and the implementation simple.

Sentinel-2 scenes are a natural fit for this pattern: all bands share the same UTM CRS, so defining it once at the group level avoids repeating identical metadata across every band.
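
A reader has to resolve the effective CRS for each array. The sketch below shows one way to apply the rules described above (own definition wins, otherwise look at the direct parent only). It works on plain attribute dicts; the helper names `own_crs` and `effective_crs` are hypothetical and not part of any library.

```python
from typing import Optional

CRS_KEYS = ("proj:code", "proj:wkt2", "proj:projjson")


def own_crs(attrs: dict) -> dict:
    """Extract only the proj: CRS keys from an attributes dict."""
    return {key: attrs[key] for key in CRS_KEYS if key in attrs}


def effective_crs(array_attrs: dict, parent_attrs: Optional[dict]) -> dict:
    """Resolve the CRS for an array from its own attrs or its direct parent group."""
    own = own_crs(array_attrs)
    if own:
        return own  # the array overrides the group-level CRS
    if parent_attrs is not None:
        return own_crs(parent_attrs)  # direct parent only, no further ancestors
    return {}


scene_group = {"proj:code": "EPSG:32612"}
print(effective_crs({}, scene_group))                          # inherits
print(effective_crs({"proj:code": "EPSG:4326"}, scene_group))  # overrides
print(effective_crs({}, {}))                                   # parent defines no CRS
```

The last call models an array inside a subgroup that defines no CRS: because inheritance does not cascade, the array ends up with no CRS even if a grandparent group defines one.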

In [9]:
# Group-level CRS applies to all direct child arrays
group_attrs = create_proj_attrs(code="EPSG:32612")
group_attrs["zarr_conventions"] = create_zarr_conventions(ProjConventionMetadata())

print("Group attributes (shared by all child arrays):")
print(json.dumps(group_attrs, indent=2))

# Visualize the inheritance hierarchy
print()
print("Sentinel-2 scene group/        <- proj:code = EPSG:32612")
print("  ├── TCI   (10m)              <- inherits EPSG:32612")
print("  ├── B02   (10m)              <- inherits EPSG:32612")
print("  ├── B05   (20m)              <- inherits EPSG:32612")
print("  └── B01   (60m)              <- inherits EPSG:32612")

Group attributes (shared by all child arrays):
{
  "proj:code": "EPSG:32612",
  "zarr_conventions": [
    {
      "uuid": "f17cb550-5864-4468-aeb7-f3180cfb622f",
      "schema_url": "https://raw.githubusercontent.com/zarr-experimental/geo-proj/refs/tags/v1/schema.json",
      "spec_url": "https://github.com/zarr-experimental/geo-proj/blob/v1/README.md",
      "name": "proj:",
      "description": "Coordinate reference system information for geospatial data"
    }
  ]
}

Sentinel-2 scene group/        <- proj:code = EPSG:32612
  ├── TCI   (10m)              <- inherits EPSG:32612
  ├── B02   (10m)              <- inherits EPSG:32612
  ├── B05   (20m)              <- inherits EPSG:32612
  └── B01   (60m)              <- inherits EPSG:32612


## Composition with the spatial: Convention

The proj: convention focuses solely on CRS definitions. For complete georeferencing, it is typically composed with the **spatial:** convention, which defines *how to transform* between pixel coordinates and CRS coordinates.

| Convention | Responsibility |
|---|---|
| `proj:` | What coordinate system (CRS definition) |
| `spatial:` | How to transform (affine matrix, bounding box, dimensions) |

For our Sentinel-2 scene, the TCI band has 10 m pixels with an affine transform of `Affine(10.0, 0.0, 300000.0, 0.0, -10.0, 4100040.0)` — matching the output shown in the [async-geotiff example](https://github.com/developmentseed/async-geotiff#example). The `spatial:transform` uses the same Rasterio/Affine coefficient ordering `[a, b, c, d, e, f]`:

- `a` = 10.0: pixel width (10 m east per column)
- `b` = 0.0: no row rotation
- `c` = 300000.0: easting of the upper-left corner
- `d` = 0.0: no column rotation
- `e` = -10.0: pixel height (10 m south per row)
- `f` = 4100040.0: northing of the upper-left corner
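
To make the coefficient roles concrete, the sketch below applies the six coefficients to pixel indices and recovers the scene's bounding box from its corners. The function name `pixel_to_crs` is hypothetical; the arithmetic is the usual affine mapping `x = a*col + b*row + c`, `y = d*col + e*row + f`.

```python
def pixel_to_crs(transform, col, row):
    """Map a (col, row) pixel corner to CRS coordinates using [a, b, c, d, e, f]."""
    a, b, c, d, e, f = transform
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


transform = [10.0, 0.0, 300000.0, 0.0, -10.0, 4100040.0]
n_rows, n_cols = 10980, 10980

upper_left = pixel_to_crs(transform, 0, 0)
lower_right = pixel_to_crs(transform, n_cols, n_rows)

# Bounding box as [xmin, ymin, xmax, ymax]
bbox = [upper_left[0], lower_right[1], lower_right[0], upper_left[1]]
print(upper_left)
print(lower_right)
print(bbox)
```

The resulting box, `[300000.0, 3990240.0, 409800.0, 4100040.0]`, matches the bounding box listed for the example dataset.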

In [10]:
from geozarr_examples import SpatialConventionMetadata, create_spatial_attrs

# Sentinel-2 TCI band: 10m resolution, 10980x10980 pixels
attrs = create_proj_attrs(code="EPSG:32612")
attrs.update(
    create_spatial_attrs(
        dimensions=["Y", "X"],
        transform=[10.0, 0.0, 300000.0, 0.0, -10.0, 4100040.0],
        shape=[10980, 10980],
        bbox=[300000.0, 3990240.0, 409800.0, 4100040.0],
    )
)
attrs["zarr_conventions"] = create_zarr_conventions(
    ProjConventionMetadata(),
    SpatialConventionMetadata(),
)

print(json.dumps(attrs, indent=2))

{
  "proj:code": "EPSG:32612",
  "spatial:dimensions": [
    "Y",
    "X"
  ],
  "spatial:bbox": [
    300000.0,
    3990240.0,
    409800.0,
    4100040.0
  ],
  "spatial:transform_type": "affine",
  "spatial:transform": [
    10.0,
    0.0,
    300000.0,
    0.0,
    -10.0,
    4100040.0
  ],
  "spatial:shape": [
    10980,
    10980
  ],
  "spatial:registration": "pixel",
  "zarr_conventions": [
    {
      "uuid": "f17cb550-5864-4468-aeb7-f3180cfb622f",
      "schema_url": "https://raw.githubusercontent.com/zarr-experimental/geo-proj/refs/tags/v1/schema.json",
      "spec_url": "https://github.com/zarr-experimental/geo-proj/blob/v1/README.md",
      "name": "proj:",
      "description": "Coordinate reference system information for geospatial data"
    },
    {
      "uuid": "689b58e2-cf7b-45e0-9fff-9cfc0883d6b4",
      "schema_url": "https://raw.githubusercontent.com/zarr-conventions/spatial/refs/tags/v1/schema.json",
      "spec_url": "https://github.com/zarr-conventions/spatial/b

## Composition with Multiscales

Sentinel-2 bands are acquired at three native resolutions (10 m, 20 m, and 60 m), which makes a scene a natural multi-resolution dataset. When the proj:, spatial:, and multiscales conventions are composed together:

- **`proj:code`** is defined once at the group level and applies to all resolution levels
- **`spatial:dimensions`** and **`spatial:bbox`** are shared across all levels (same geographic extent)
- Each resolution level has its own **`spatial:shape`** and **`spatial:transform`** reflecting its pixel size
- The multiscales `transform.scale` describes the **resampling relationship** between levels, not the geospatial coordinate transformation

In [11]:
from geozarr_examples import MultiscalesConventionMetadata, create_multiscales_layout

# Sentinel-2 multi-resolution group: 10m, 20m, and 60m bands
# CRS and bounding box are shared; shape and transform vary per resolution.
attrs = create_proj_attrs(code="EPSG:32612")
attrs.update(
    create_spatial_attrs(
        dimensions=["Y", "X"],
        bbox=[300000.0, 3990240.0, 409800.0, 4100040.0],
    )
)
attrs.update(
    create_multiscales_layout(
        [
            {
                "asset": "r10m",
                "transform": {"scale": [1.0, 1.0], "translation": [0.0, 0.0]},
            },
            {
                "asset": "r20m",
                "derived_from": "r10m",
                "transform": {"scale": [2.0, 2.0], "translation": [0.0, 0.0]},
            },
            {
                "asset": "r60m",
                "derived_from": "r10m",
                "transform": {"scale": [6.0, 6.0], "translation": [0.0, 0.0]},
            },
        ]
    )
)
attrs["zarr_conventions"] = create_zarr_conventions(
    MultiscalesConventionMetadata(),
    ProjConventionMetadata(),
    SpatialConventionMetadata(),
)

print(json.dumps(attrs, indent=2))

{
  "proj:code": "EPSG:32612",
  "spatial:dimensions": [
    "Y",
    "X"
  ],
  "spatial:bbox": [
    300000.0,
    3990240.0,
    409800.0,
    4100040.0
  ],
  "spatial:transform_type": "affine",
  "spatial:registration": "pixel",
  "multiscales": {
    "layout": [
      {
        "asset": "r10m",
        "transform": {
          "scale": [
            1.0,
            1.0
          ],
          "translation": [
            0.0,
            0.0
          ]
        }
      },
      {
        "asset": "r20m",
        "derived_from": "r10m",
        "transform": {
          "scale": [
            2.0,
            2.0
          ],
          "translation": [
            0.0,
            0.0
          ]
        }
      },
      {
        "asset": "r60m",
        "derived_from": "r10m",
        "transform": {
          "scale": [
            6.0,
            6.0
          ],
          "translation": [
            0.0,
            0.0
          ]
        }
      }
    ]
  },
  "zarr_convent

In this example, the 10 m level is the base (`r10m`). The 20 m level has `scale: [2.0, 2.0]`, meaning each pixel spans twice the base pixel size along each axis (four times the area), and the 60 m level has `scale: [6.0, 6.0]`. The actual geospatial coordinates are determined by each level's `spatial:transform`, not by the multiscales scale factors.
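
The relationship between the two kinds of transform can be sketched as follows. Under three stated assumptions (zero translation, scale given in the same order as `spatial:dimensions`, i.e. `[Y, X]`, and shapes rounded up), a level's `spatial:transform` and `spatial:shape` follow from the base level's values. The function name `level_geometry` is hypothetical.

```python
import math


def level_geometry(base_transform, base_shape, scale):
    """Derive a level's affine transform and shape from the base level.

    Assumes zero translation, so the upper-left corner (c, f) is unchanged,
    and that `scale` is ordered [Y, X] like spatial:dimensions.
    """
    a, b, c, d, e, f = base_transform
    sy, sx = scale
    # Column terms (a, d) scale with X; row terms (b, e) scale with Y.
    transform = [a * sx, b * sy, c, d * sx, e * sy, f]
    shape = [math.ceil(base_shape[0] / sy), math.ceil(base_shape[1] / sx)]
    return transform, shape


base_transform = [10.0, 0.0, 300000.0, 0.0, -10.0, 4100040.0]
base_shape = [10980, 10980]

for asset, scale in [("r10m", [1.0, 1.0]), ("r20m", [2.0, 2.0]), ("r60m", [6.0, 6.0])]:
    transform, shape = level_geometry(base_transform, base_shape, scale)
    print(asset, transform, shape)
```

Every level keeps the same origin and bounding box; only the pixel size and the array shape change.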

## Converting Between CRS Formats with pyproj

In practice, [pyproj](https://pyproj4.github.io/pyproj/) makes it easy to start from any CRS representation and produce whichever format the proj: convention requires.

In [12]:
# All three representations of the Sentinel-2 scene's CRS
print("proj:code")
print(f"  EPSG:{crs.to_epsg()}")
print()
print("proj:wkt2 (truncated)")
print(f"  {crs.to_wkt()[:80]}...")
print()
print("proj:projjson (summary)")
pj = crs.to_json_dict()
print(f"  type: {pj['type']}")
print(f"  name: {pj['name']}")
print(f"  keys: {list(pj.keys())}")

proj:code
  EPSG:32612

proj:wkt2 (truncated)
  PROJCRS["WGS 84 / UTM zone 12N",BASEGEOGCRS["WGS 84",ENSEMBLE["World Geodetic Sy...

proj:projjson (summary)
  type: ProjectedCRS
  name: WGS 84 / UTM zone 12N
  keys: ['$schema', 'type', 'name', 'base_crs', 'conversion', 'coordinate_system', 'scope', 'area', 'bbox', 'id']


## Converting a GeoTIFF to Zarr

This section ties everything together with a real end-to-end workflow: opening the Sentinel-2 COG from the [async-geotiff example](https://github.com/developmentseed/async-geotiff#example), extracting convention metadata, writing to Zarr, and validating the result.

### Step 1: Open the remote COG

We use [async-geotiff](https://github.com/developmentseed/async-geotiff) to open the Cloud-Optimized GeoTIFF directly from S3. The GeoTIFF object exposes the geospatial properties we need — `crs`, `transform`, `bounds`, and `shape` — without reading any pixel data.

In [13]:
from async_geotiff import GeoTIFF
from obstore.store import S3Store

store = S3Store("sentinel-cogs", region="us-west-2", skip_signature=True)
path = "sentinel-s2-l2a-cogs/12/S/UF/2022/6/S2B_12SUF_20220609_0_L2A/TCI.tif"

geotiff = await GeoTIFF.open(path, store=store)

print(f"CRS:       {geotiff.crs}")
print(f"Transform: {geotiff.transform}")
print(f"Shape:     {geotiff.shape}")
print(f"Bounds:    {geotiff.bounds}")
print(f"Bands:     {geotiff.count}")
print(f"Dtype:     {geotiff.dtype}")

CRS:       EPSG:32612
Transform: | 10.00, 0.00, 300000.00|
| 0.00,-10.00, 4100040.00|
| 0.00, 0.00, 1.00|
Shape:     (10980, 10980)
Bounds:    (300000.0, 3990240.0, 409800.0, 4100040.0)
Bands:     3
Dtype:     uint8


### Step 2: Build convention metadata from the COG

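The GeoTIFF's properties map directly to convention attributes, as listed below. The COG also contains internal **overviews** (reduced-resolution copies), which map naturally to the **multiscales** convention: each overview becomes a scale level. Note that the next cell writes only the level-independent attributes (`proj:code`, `spatial:dimensions`, `spatial:bbox`) to the group; `spatial:transform` and `spatial:shape` vary per level and are not set at the group level there.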

- `geotiff.crs.to_epsg()` → `proj:code`
- `geotiff.transform` (Affine coefficients) → `spatial:transform`
- `geotiff.shape` → `spatial:shape`
- `geotiff.bounds` → `spatial:bbox`
- `geotiff.overviews` → `multiscales` layout

In [14]:
# Build proj: and spatial: attributes from the GeoTIFF's properties
t = geotiff.transform

geozarr_attrs = create_proj_attrs(code=f"EPSG:{geotiff.crs.to_epsg()}")
geozarr_attrs.update(
    create_spatial_attrs(
        dimensions=["Y", "X"],
        bbox=list(geotiff.bounds),
    )
)

# Build multiscales layout from the COG's overviews
# The base (full-resolution) image is level 0; each overview is a coarser level.
base_res = t.a  # pixel width of the base level
levels = [
    {"asset": "0", "transform": {"scale": [1.0, 1.0], "translation": [0.0, 0.0]}},
]
for i, overview in enumerate(geotiff.overviews):
    ov_res = overview.transform.a
    scale_factor = ov_res / base_res
    levels.append(
        {
            "asset": str(i + 1),
            "derived_from": "0",
            "transform": {
                "scale": [scale_factor, scale_factor],
                "translation": [0.0, 0.0],
            },
        }
    )

geozarr_attrs.update(create_multiscales_layout(levels))
geozarr_attrs["zarr_conventions"] = create_zarr_conventions(
    MultiscalesConventionMetadata(),
    ProjConventionMetadata(),
    SpatialConventionMetadata(),
)

print(f"Base resolution: {base_res} m")
print(f"Overview levels: {len(geotiff.overviews)}")
for i, overview in enumerate(geotiff.overviews):
    print(
        f"  Overview {i+1}: {overview.width}x{overview.height} px, {overview.transform.a:.1f} m/px"
    )
print()
print(json.dumps(geozarr_attrs, indent=2))

Base resolution: 10.0 m
Overview levels: 4
  Overview 1: 5490x5490 px, 20.0 m/px
  Overview 2: 2745x2745 px, 40.0 m/px
  Overview 3: 1373x1373 px, 80.0 m/px
  Overview 4: 687x687 px, 159.8 m/px

{
  "proj:code": "EPSG:32612",
  "spatial:dimensions": [
    "Y",
    "X"
  ],
  "spatial:bbox": [
    300000.0,
    3990240.0,
    409800.0,
    4100040.0
  ],
  "spatial:transform_type": "affine",
  "spatial:registration": "pixel",
  "multiscales": {
    "layout": [
      {
        "asset": "0",
        "transform": {
          "scale": [
            1.0,
            1.0
          ],
          "translation": [
            0.0,
            0.0
          ]
        }
      },
      {
        "asset": "1",
        "derived_from": "0",
        "transform": {
          "scale": [
            2.0,
            2.0
          ],
          "translation": [
            0.0,
            0.0
          ]
        }
      },
      {
        "asset": "2",
        "derived_from": "0",
        "transform": {
   

### Step 3: Read and write to Zarr V3 with multiscales

We read the full-resolution image and each overview, writing them as separate child arrays in a remote Zarr V3 store on S3. zarr-python's `ObjectStore` wraps an obstore `S3Store`, so the same obstore backend used to *read* the COG is also used to *write* the Zarr store.

In [16]:
import zarr
from zarr.storage import ObjectStore

bucket = "us-west-2.opendata.source.coop"
prefix = "pangeo/geozarr-examples/TCI.zarr"

output_store = S3Store(bucket, prefix=prefix, region="us-west-2")
zarr_store = ObjectStore(output_store)
root: zarr.Group = zarr.open_group(zarr_store, mode="w", zarr_format=3)

# Set convention attributes on the group
root.attrs.update(geozarr_attrs)

# Write the full-resolution image as level "0"
base_array = await geotiff.read()
root.create_array("0", data=base_array.data, chunks=(3, 512, 512))
print(f"Level 0 (base): shape={base_array.data.shape}, dtype={base_array.data.dtype}")

# Write each overview as a separate level
for i, overview in enumerate(geotiff.overviews):
    ov_array = await overview.read()
    root.create_array(str(i + 1), data=ov_array.data, chunks=(3, 512, 512))
    print(f"Level {i+1} (overview): shape={ov_array.data.shape}")

print(f"\nWrote Zarr V3 store to s3://{bucket}/{prefix}")

Level 0 (base): shape=(3, 10980, 10980), dtype=uint8
Level 1 (overview): shape=(3, 5490, 5490)
Level 2 (overview): shape=(3, 2745, 2745)
Level 3 (overview): shape=(3, 1373, 1373)
Level 4 (overview): shape=(3, 687, 687)

Wrote Zarr V3 store to s3://us-west-2.opendata.source.coop/pangeo/geozarr-examples/TCI.zarr


### Step 4: Validate the Zarr store

We reopen the store and use `validate_group` to confirm the conventions are correctly applied.

In [17]:
from geozarr_examples import detect_conventions, validate_group

# Reopen from S3 and validate
read_store = S3Store(bucket, prefix=prefix, region="us-west-2", skip_signature=True)
root = zarr.open_group(ObjectStore(read_store), mode="r")

detected = detect_conventions(dict(root.attrs))
print(f"Detected conventions: {detected}")

results = validate_group(root)
for conv, errors in results.items():
    status = "PASS" if not errors else "FAIL"
    print(f"  [{status}] {conv}")
    for err in errors:
        print(f"         {err}")

print("\nStore tree:")
root.tree()

Detected conventions: ['spatial', 'proj', 'multiscales']
  [PASS] spatial
  [PASS] proj
  [PASS] multiscales
  [PASS] zarr_conventions

Store tree:


## Summary

The proj: convention provides a focused, modular approach to encoding CRS information in Zarr:

- **Three encoding methods** (EPSG code, WKT2, PROJJSON) support a range of use cases from simple well-known projections to custom CRS definitions
- **Convention registration** via `zarr_conventions` makes CRS metadata self-describing and discoverable
- **Group-to-array inheritance** reduces redundancy — as shown with Sentinel-2 bands sharing the same UTM CRS
- **Composability** with the spatial: and multiscales conventions enables complete georeferencing while keeping each convention's scope well-defined

For the full specification, see the [proj: convention README](https://github.com/zarr-experimental/geo-proj).