diff --git a/CHANGELOG.md b/CHANGELOG.md index f0f028bc9..6323533ec 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0. ### Added +- Spatial search support for collections via `bbox` parameter on `/collections` endpoint. Collections are now indexed with a `bbox_shape` field (GeoJSON polygon) derived from their spatial extent for efficient geospatial queries when created or updated. [#481](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/481) +- Introduced SFEOS Tools (`sfeos_tools/`) - An installable Click-based CLI package for managing SFEOS deployments. Initial command `add-bbox-shape` adds the `bbox_shape` field to existing collections for spatial search compatibility. Install with `pip install sfeos-tools[elasticsearch]` or `pip install sfeos-tools[opensearch]`. [#481](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/481) - CloudFerro logo to sponsors and supporters list [#485](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/485) - Latest news section to README [#485](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/485) @@ -16,7 +18,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0. ### Fixed -[v6.5.1] - 2025-09-30 +## [v6.5.1] - 2025-09-30 ### Fixed @@ -24,7 +26,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0. 
- Issue where datetime param was not being passed from POST collections search logic to Elasticsearch [#483](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/483) - Collections search tests to ensure both GET /collections and GET/POST /collections-search endpoints are tested [#483](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/483) -[v6.5.0] - 2025-09-29 +## [v6.5.0] - 2025-09-29 ### Added diff --git a/README.md b/README.md index b87bd21be..37dbdba7d 100644 --- a/README.md +++ b/README.md @@ -32,7 +32,8 @@ The following organizations have contributed time and/or funding to support the
-- 10/04/2025: The [CloudFerro](https://cloudferro.com/) logo has been added to the sponsors and supporters list above. Their sponsorship of the ongoing collections search extension work has been invaluable. This is in addition to the many other important changes and updates their developers have added to the project. +- **10/12/2025:** Collections search **bbox** functionality added! The collections search extension now supports bbox queries. Existing collections will need to be updated via the API or with the new **[SFEOS Tools](#sfeos-tools-cli)** CLI package to become geospatially discoverable. Thanks again to **CloudFerro** for their sponsorship of this work! +- **10/04/2025:** The **[CloudFerro](https://cloudferro.com/)** logo has been added to the sponsors and supporters list above. Their sponsorship of the ongoing collections search extension work has been invaluable. This is in addition to the many other important changes and updates their developers have added to the project.
@@ -105,6 +106,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI - [Interacting with the API](#interacting-with-the-api) - [Configure the API](#configure-the-api) - [Collection Pagination](#collection-pagination) + - [SFEOS Tools CLI](#sfeos-tools-cli) - [Ingesting Sample Data CLI Tool](#ingesting-sample-data-cli-tool) - [Elasticsearch Mappings](#elasticsearch-mappings) - [Managing Elasticsearch Indices](#managing-elasticsearch-indices) @@ -160,9 +162,21 @@ These endpoints support advanced collection discovery features including: - Collections are matched if their temporal extent overlaps with the provided datetime parameter - This allows for efficient discovery of collections based on time periods +- **Spatial Filtering**: Filter collections by their spatial extent using the `bbox` parameter + - Example: `/collections?bbox=-10,35,40,70` (finds collections whose spatial extent intersects with this bounding box) + - Example: `/collections?bbox=-180,-90,180,90` (finds all collections with global coverage) + - Supports both 2D bounding boxes `[minx, miny, maxx, maxy]` and 3D bounding boxes `[minx, miny, minz, maxx, maxy, maxz]` (altitude values are ignored for spatial queries) + - Collections are matched if their spatial extent (stored in the `extent.spatial.bbox` field) intersects with the provided bbox parameter + - **Implementation Note**: When collections are created or updated, a `bbox_shape` field is automatically generated from the collection's spatial extent and indexed as a GeoJSON polygon for efficient geospatial queries + - **Migrating Legacy Collections**: Collections created before this feature was added will not be discoverable via bbox search until they have the `bbox_shape` field added. 
You can either: + - Update each collection via the API (PUT `/collections/{collection_id}` with the existing collection data) + - Run the migration tool (see [SFEOS Tools CLI](#sfeos-tools-cli) for installation and connection options): + - `sfeos-tools add-bbox-shape --backend elasticsearch --no-ssl` + - `sfeos-tools add-bbox-shape --backend opensearch --host db.example.com --no-ssl` + These extensions make it easier to build user interfaces that display and navigate through collections efficiently. -> **Configuration**: Collection search extensions (sorting, field selection, free text search, structured filtering, and datetime filtering) for the `/collections` endpoint can be disabled by setting the `ENABLE_COLLECTIONS_SEARCH` environment variable to `false`. By default, these extensions are enabled. +> **Configuration**: Collection search extensions (sorting, field selection, free text search, structured filtering, datetime filtering, and spatial filtering) for the `/collections` endpoint can be disabled by setting the `ENABLE_COLLECTIONS_SEARCH` environment variable to `false`. By default, these extensions are enabled. > > **Configuration**: The custom `/collections-search` endpoint can be enabled by setting the `ENABLE_COLLECTIONS_SEARCH_ROUTE` environment variable to `true`. By default, this endpoint is **disabled**. @@ -470,6 +484,64 @@ The system uses a precise naming convention: curl -X "GET" "http://localhost:8080/collections?limit=1&token=example_token" ``` +## SFEOS Tools CLI + +- **Overview**: SFEOS Tools is an installable CLI package for managing and maintaining SFEOS deployments. 
+ +- **Installation**: + ```shell + # For Elasticsearch (from PyPI) + pip install sfeos-tools[elasticsearch] + + # For OpenSearch (from PyPI) + pip install sfeos-tools[opensearch] + + # For local development + pip install -e sfeos_tools[elasticsearch] + # or + pip install -e sfeos_tools[opensearch] + ``` + +- **Available Commands**: + - `add-bbox-shape`: Add bbox_shape field to existing collections for spatial search support + +- **Basic Usage**: + ```shell + sfeos-tools add-bbox-shape --backend elasticsearch + sfeos-tools add-bbox-shape --backend opensearch + ``` + +- **Connection Options**: Configure database connection via CLI flags or environment variables: + - `--host`: Database host (default: `localhost` or `ES_HOST` env var) + - `--port`: Database port (default: `9200` or `ES_PORT` env var) + - `--use-ssl` / `--no-ssl`: Use SSL connection (default: `true` or `ES_USE_SSL` env var) + - `--user`: Database username (default: `ES_USER` env var) + - `--password`: Database password (default: `ES_PASS` env var) + +- **Examples**: + ```shell + # Local Docker Compose (no SSL) + sfeos-tools add-bbox-shape --backend elasticsearch --no-ssl + + # Remote server with SSL + sfeos-tools add-bbox-shape \ + --backend elasticsearch \ + --host db.example.com \ + --port 9200 \ + --user admin \ + --password secret + + # Cloud deployment with environment variables + ES_HOST=my-es-cluster.cloud.com ES_PORT=9243 ES_USER=elastic ES_PASS=changeme \ + sfeos-tools add-bbox-shape --backend elasticsearch + + # Using --help for more information + sfeos-tools --help + sfeos-tools add-bbox-shape --help + ``` + +For more details, see the [SFEOS Tools README](./sfeos_tools/README.md). + ## Ingesting Sample Data CLI Tool - **Overview**: The `data_loader.py` script provides a convenient way to load STAC items into the database. 
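The README above says a `bbox_shape` field is derived from each collection's spatial extent and indexed as a GeoJSON polygon. As an illustrative sketch only (the shipped helper is `add_bbox_shape_to_collection` in `stac_fastapi.sfeos_helpers.database`, whose body is not shown in this diff), the bbox-to-polygon mapping for 2D and 3D bboxes can be written as:

```python
from typing import Any, Dict, List


def bbox_to_geojson_polygon(bbox: List[float]) -> Dict[str, Any]:
    """Map a STAC bbox to a closed, counter-clockwise GeoJSON Polygon ring.

    Handles both 2D [minx, miny, maxx, maxy] and 3D
    [minx, miny, minz, maxx, maxy, maxz] bboxes; altitude values are
    dropped, matching the spatial search behaviour described above.
    Illustrative reimplementation, not the shipped helper.
    """
    if len(bbox) == 6:
        minx, miny, _, maxx, maxy, _ = bbox
    elif len(bbox) == 4:
        minx, miny, maxx, maxy = bbox
    else:
        raise ValueError("bbox must have 4 or 6 values")
    return {
        "type": "Polygon",
        "coordinates": [
            [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
        ],
    }


# Example: the bbox from the README's first example, /collections?bbox=-10,35,40,70
polygon = bbox_to_geojson_polygon([-10, 35, 40, 70])
# ring: [[-10, 35], [40, 35], [40, 70], [-10, 70], [-10, 35]]
```

A polygon in this form is what the collections index can query with a `geo_shape` intersects filter.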
diff --git a/sfeos_tools/LICENSE b/sfeos_tools/LICENSE new file mode 100644 index 000000000..1c6074b87 --- /dev/null +++ b/sfeos_tools/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2025 Jonathan Healy and CloudFerro S.A. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/sfeos_tools/README.md b/sfeos_tools/README.md new file mode 100644 index 000000000..5347c28b3 --- /dev/null +++ b/sfeos_tools/README.md @@ -0,0 +1,113 @@ +# SFEOS Tools + +CLI tools for managing [stac-fastapi-elasticsearch-opensearch](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch) deployments. 
+ +## Installation + +### For Elasticsearch + +```bash +pip install sfeos-tools[elasticsearch] +``` + +Or for local development: +```bash +pip install -e sfeos_tools[elasticsearch] +``` + +### For OpenSearch + +```bash +pip install sfeos-tools[opensearch] +``` + +Or for local development: +```bash +pip install -e sfeos_tools[opensearch] +``` + +### For Development (both backends) + +```bash +pip install sfeos-tools[dev] +``` + +Or for local development: +```bash +pip install -e sfeos_tools[dev] +``` + +## Usage + +After installation, the `sfeos-tools` command will be available: + +```bash +# View available commands +sfeos-tools --help + +# View version +sfeos-tools --version + +# Get help for a specific command +sfeos-tools add-bbox-shape --help +``` + +## Commands + +### add-bbox-shape + +Add `bbox_shape` field to existing collections for spatial search support. + +**Basic usage:** + +```bash +# Elasticsearch +sfeos-tools add-bbox-shape --backend elasticsearch + +# OpenSearch +sfeos-tools add-bbox-shape --backend opensearch +``` + +**Connection options:** + +```bash +# Local Docker Compose (no SSL) +sfeos-tools add-bbox-shape --backend elasticsearch --no-ssl + +# Remote server with SSL +sfeos-tools add-bbox-shape \ + --backend elasticsearch \ + --host db.example.com \ + --port 9200 \ + --user admin \ + --password secret + +# Using environment variables +ES_HOST=my-cluster.cloud.com ES_PORT=9243 ES_USER=elastic ES_PASS=changeme \ + sfeos-tools add-bbox-shape --backend elasticsearch +``` + +**Available options:** + +- `--backend`: Database backend (elasticsearch or opensearch) - **required** +- `--host`: Database host (default: localhost or ES_HOST env var) +- `--port`: Database port (default: 9200 or ES_PORT env var) +- `--use-ssl / --no-ssl`: Use SSL connection (default: true or ES_USE_SSL env var) +- `--user`: Database username (default: ES_USER env var) +- `--password`: Database password (default: ES_PASS env var) + +## Development + +To develop sfeos-tools 
locally: + +```bash +# Install in editable mode with dev dependencies +pip install -e ./sfeos_tools[dev] + +# Run the CLI +sfeos-tools --help +``` + +## License + +MIT License - see the main repository for details. diff --git a/sfeos_tools/setup.py b/sfeos_tools/setup.py new file mode 100644 index 000000000..d5cf44162 --- /dev/null +++ b/sfeos_tools/setup.py @@ -0,0 +1,56 @@ +"""Setup for SFEOS Tools.""" + +from setuptools import find_packages, setup + +with open("README.md", "r", encoding="utf-8") as f: + long_description = f.read() + +setup( + name="sfeos-tools", + version="0.1.0", + description="CLI tools for managing stac-fastapi-elasticsearch-opensearch deployments", + long_description=long_description, + long_description_content_type="text/markdown", + author="Jonathan Healy, CloudFerro S.A.", + license="MIT", + url="https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch", + packages=find_packages(), + python_requires=">=3.8", + install_requires=[ + "click>=8.0.0", + ], + extras_require={ + "elasticsearch": [ + "stac_fastapi_core", + "sfeos_helpers", + "stac_fastapi_elasticsearch", + ], + "opensearch": [ + "stac_fastapi_core", + "sfeos_helpers", + "stac_fastapi_opensearch", + ], + "dev": [ + "stac_fastapi_core", + "sfeos_helpers", + "stac_fastapi_elasticsearch", + "stac_fastapi_opensearch", + ], + }, + entry_points={ + "console_scripts": [ + "sfeos-tools=sfeos_tools.cli:cli", + ], + }, + classifiers=[ + "Development Status :: 4 - Beta", + "Intended Audience :: Developers", + "License :: OSI Approved :: MIT License", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.8", + "Programming Language :: Python :: 3.9", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + ], +) diff --git a/sfeos_tools/sfeos_tools/__init__.py b/sfeos_tools/sfeos_tools/__init__.py new file mode 100644 index 000000000..9106f13e8 --- /dev/null +++ 
b/sfeos_tools/sfeos_tools/__init__.py @@ -0,0 +1,3 @@ +"""SFEOS Tools - Utilities for managing stac-fastapi-elasticsearch-opensearch deployments.""" + +__version__ = "0.1.0" diff --git a/sfeos_tools/sfeos_tools/cli.py b/sfeos_tools/sfeos_tools/cli.py new file mode 100644 index 000000000..827e81169 --- /dev/null +++ b/sfeos_tools/sfeos_tools/cli.py @@ -0,0 +1,229 @@ +"""SFEOS CLI Tools - Utilities for managing stac-fastapi-elasticsearch-opensearch deployments. + +This tool provides various utilities for managing and maintaining SFEOS deployments, +including database migrations, maintenance tasks, and more. + +Usage: + sfeos-tools add-bbox-shape --backend elasticsearch + sfeos-tools add-bbox-shape --backend opensearch +""" + +import asyncio +import logging +import sys + +import click + +from stac_fastapi.sfeos_helpers.database import add_bbox_shape_to_collection +from stac_fastapi.sfeos_helpers.mappings import COLLECTIONS_INDEX + +logging.basicConfig(level=logging.INFO) +logger = logging.getLogger(__name__) + + +async def process_collection_bbox_shape(client, collection_doc, backend): + """Process a single collection document to add bbox_shape field. 
+ + Args: + client: Elasticsearch/OpenSearch client + collection_doc: Collection document from database + backend: Backend type ('elasticsearch' or 'opensearch') + + Returns: + bool: True if collection was updated, False if no update was needed + """ + collection = collection_doc["_source"] + collection_id = collection.get("id", collection_doc["_id"]) + + # Use the shared function to add bbox_shape + was_added = add_bbox_shape_to_collection(collection) + + if not was_added: + return False + + # Update the collection in the database + if backend == "elasticsearch": + await client.index( + index=COLLECTIONS_INDEX, + id=collection_id, + document=collection, + refresh=True, + ) + else: # opensearch + await client.index( + index=COLLECTIONS_INDEX, + id=collection_id, + body=collection, + refresh=True, + ) + + logger.info(f"Collection '{collection_id}': Added bbox_shape field") + return True + + +async def run_add_bbox_shape(backend): + """Add bbox_shape field to all existing collections. + + Args: + backend: Backend type ('elasticsearch' or 'opensearch') + """ + import os + + logger.info( + f"Starting migration: Adding bbox_shape to existing collections ({backend})" + ) + + # Log connection info (showing what will be used by the client) + es_host = os.getenv("ES_HOST", "localhost") + es_port = os.getenv( + "ES_PORT", "9200" + ) # Both backends default to 9200 in their config + es_use_ssl = os.getenv("ES_USE_SSL", "true") + logger.info(f"Connecting to {backend} at {es_host}:{es_port} (SSL: {es_use_ssl})") + + # Create client based on backend + if backend == "elasticsearch": + from stac_fastapi.elasticsearch.config import AsyncElasticsearchSettings + + settings = AsyncElasticsearchSettings() + else: # opensearch + from stac_fastapi.opensearch.config import AsyncOpensearchSettings + + settings = AsyncOpensearchSettings() + + client = settings.create_client + + try: + # Get all collections + response = await client.search( + index=COLLECTIONS_INDEX, + body={ + "query": 
{"match_all": {}}, + "size": 10000, + }, # Adjust size if you have more collections + ) + + total_collections = response["hits"]["total"]["value"] + logger.info(f"Found {total_collections} collections to process") + + updated_count = 0 + skipped_count = 0 + + for hit in response["hits"]["hits"]: + was_updated = await process_collection_bbox_shape(client, hit, backend) + if was_updated: + updated_count += 1 + else: + skipped_count += 1 + + logger.info( + f"Migration complete: {updated_count} collections updated, {skipped_count} skipped" + ) + + except Exception as e: + logger.error(f"Migration failed with error: {e}") + raise + finally: + await client.close() + + +@click.group() +@click.version_option(version="0.1.0", prog_name="sfeos-tools") +def cli(): + """SFEOS Tools - Utilities for managing stac-fastapi-elasticsearch-opensearch deployments.""" + pass + + +@cli.command("add-bbox-shape") +@click.option( + "--backend", + type=click.Choice(["elasticsearch", "opensearch"], case_sensitive=False), + required=True, + help="Database backend to use", +) +@click.option( + "--host", + type=str, + default=None, + help="Database host (default: localhost or ES_HOST env var)", +) +@click.option( + "--port", + type=int, + default=None, + help="Database port (default: 9200 for ES, 9202 for OS, or ES_PORT env var)", +) +@click.option( + "--use-ssl/--no-ssl", + default=None, + help="Use SSL connection (default: true or ES_USE_SSL env var)", +) +@click.option( + "--user", + type=str, + default=None, + help="Database username (default: ES_USER env var)", +) +@click.option( + "--password", + type=str, + default=None, + help="Database password (default: ES_PASS env var)", +) +def add_bbox_shape(backend, host, port, use_ssl, user, password): + """Add bbox_shape field to existing collections for spatial search support. + + This migration is required for collections created before spatial search + was added. 
Collections created or updated after this feature will + automatically have the bbox_shape field. + + Examples: + sfeos-tools add-bbox-shape --backend elasticsearch + sfeos-tools add-bbox-shape --backend opensearch --host db.example.com --port 9200 + sfeos-tools add-bbox-shape --backend elasticsearch --no-ssl --host localhost + """ + import os + + # Set environment variables from CLI options if provided + if host: + os.environ["ES_HOST"] = host + if port: + os.environ["ES_PORT"] = str(port) + if use_ssl is not None: + os.environ["ES_USE_SSL"] = "true" if use_ssl else "false" + if user: + os.environ["ES_USER"] = user + if password: + os.environ["ES_PASS"] = password + + try: + asyncio.run(run_add_bbox_shape(backend.lower())) + click.echo(click.style("āœ“ Migration completed successfully", fg="green")) + except KeyboardInterrupt: + click.echo(click.style("\nāœ— Migration interrupted by user", fg="yellow")) + sys.exit(1) + except Exception as e: + error_msg = str(e) + click.echo(click.style(f"āœ— Migration failed: {error_msg}", fg="red")) + + # Provide helpful hints for common errors + if "TLS" in error_msg or "SSL" in error_msg: + click.echo( + click.style( + "\nšŸ’” Hint: If you're connecting to a local Docker Compose instance, " + "try adding --no-ssl flag", + fg="yellow", + ) + ) + elif "Connection refused" in error_msg: + click.echo( + click.style( + "\nšŸ’” Hint: Make sure your database is running and accessible at the specified host:port", + fg="yellow", + ) + ) + sys.exit(1) + + +if __name__ == "__main__": + cli() diff --git a/stac_fastapi/core/stac_fastapi/core/base_database_logic.py b/stac_fastapi/core/stac_fastapi/core/base_database_logic.py index c592b6d26..105fdf925 100644 --- a/stac_fastapi/core/stac_fastapi/core/base_database_logic.py +++ b/stac_fastapi/core/stac_fastapi/core/base_database_logic.py @@ -3,6 +3,8 @@ import abc from typing import Any, Dict, Iterable, List, Optional, Tuple +from stac_pydantic.shared import BBox + class 
BaseDatabaseLogic(abc.ABC): """ @@ -19,7 +21,12 @@ async def get_all_collections( limit: int, request: Any = None, sort: Optional[List[Dict[str, Any]]] = None, - ) -> Tuple[List[Dict[str, Any]], Optional[str]]: + bbox: Optional[BBox] = None, + q: Optional[List[str]] = None, + filter: Optional[Dict[str, Any]] = None, + query: Optional[Dict[str, Dict[str, Any]]] = None, + datetime: Optional[str] = None, + ) -> Tuple[List[Dict[str, Any]], Optional[str], Optional[int]]: """Retrieve a list of collections from the database, supporting pagination. Args: @@ -27,9 +34,14 @@ async def get_all_collections( limit (int): The number of results to return. request (Any, optional): The FastAPI request object. Defaults to None. sort (Optional[List[Dict[str, Any]]], optional): Optional sort parameter. Defaults to None. + bbox (Optional[BBox], optional): Bounding box to filter collections by spatial extent. Defaults to None. + q (Optional[List[str]], optional): Free text search terms. Defaults to None. + filter (Optional[Dict[str, Any]], optional): Structured query in CQL2 format. Defaults to None. + query (Optional[Dict[str, Dict[str, Any]]], optional): Query extension parameters. Defaults to None. + datetime (Optional[str], optional): Temporal filter. Defaults to None. Returns: - A tuple of (collections, next pagination token if any). + A tuple of (collections, next pagination token if any, optional count). 
""" pass diff --git a/stac_fastapi/core/stac_fastapi/core/core.py b/stac_fastapi/core/stac_fastapi/core/core.py index 143b4d5ac..cc175b6ce 100644 --- a/stac_fastapi/core/stac_fastapi/core/core.py +++ b/stac_fastapi/core/stac_fastapi/core/core.py @@ -241,6 +241,7 @@ async def landing_page(self, **kwargs) -> stac_types.LandingPage: async def all_collections( self, limit: Optional[int] = None, + bbox: Optional[BBox] = None, datetime: Optional[str] = None, fields: Optional[List[str]] = None, sortby: Optional[Union[str, List[str]]] = None, @@ -255,14 +256,17 @@ async def all_collections( """Read all collections from the database. Args: - datetime (Optional[str]): Filter collections by datetime range. limit (Optional[int]): Maximum number of collections to return. + bbox (Optional[BBox]): Bounding box to filter collections by spatial extent. + datetime (Optional[str]): Filter collections by datetime range. fields (Optional[List[str]]): Fields to include or exclude from the results. - sortby (Optional[str]): Sorting options for the results. + sortby (Optional[Union[str, List[str]]]): Sorting options for the results. filter_expr (Optional[str]): Structured filter expression in CQL2 JSON or CQL2-text format. - query (Optional[str]): Legacy query parameter (deprecated). filter_lang (Optional[str]): Must be 'cql2-json' or 'cql2-text' if specified, other values will result in an error. q (Optional[Union[str, List[str]]]): Free text search terms. + query (Optional[str]): Legacy query parameter (deprecated). + request (Request): FastAPI Request object. + token (Optional[str]): Pagination token for retrieving the next page of results. **kwargs: Keyword arguments from the request. 
Returns: @@ -401,6 +405,7 @@ async def all_collections( limit=limit, request=request, sort=sort, + bbox=bbox, q=q_list, filter=parsed_filter, query=parsed_query, @@ -502,6 +507,7 @@ async def post_all_collections( # Pass all parameters from search_request to all_collections return await self.all_collections( limit=search_request.limit if hasattr(search_request, "limit") else None, + bbox=search_request.bbox if hasattr(search_request, "bbox") else None, datetime=search_request.datetime if hasattr(search_request, "datetime") else None, diff --git a/stac_fastapi/core/stac_fastapi/core/serializers.py b/stac_fastapi/core/stac_fastapi/core/serializers.py index 1700ac598..973de18d8 100644 --- a/stac_fastapi/core/stac_fastapi/core/serializers.py +++ b/stac_fastapi/core/stac_fastapi/core/serializers.py @@ -1,6 +1,7 @@ """Serializers.""" import abc +import logging from copy import deepcopy from typing import Any, List, Optional @@ -13,6 +14,8 @@ from stac_fastapi.types import stac as stac_types from stac_fastapi.types.links import ItemLinks, resolve_links +logger = logging.getLogger(__name__) + @attr.s class Serializer(abc.ABC): @@ -168,6 +171,9 @@ def db_to_stac( # Avoid modifying the input dict in-place ... 
doing so breaks some tests collection = deepcopy(collection) + # Remove internal bbox_shape field (not part of STAC spec) + collection.pop("bbox_shape", None) + # Set defaults collection_id = collection.get("id") collection.setdefault("type", "Collection") diff --git a/stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py b/stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py index 9c136411a..c3f6f8530 100644 --- a/stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py +++ b/stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py @@ -29,13 +29,15 @@ ) from stac_fastapi.sfeos_helpers import filter as filter_module from stac_fastapi.sfeos_helpers.database import ( + add_bbox_shape_to_collection, + apply_collections_bbox_filter_shared, + apply_collections_datetime_filter_shared, apply_free_text_filter_shared, apply_intersects_filter_shared, create_index_templates_shared, delete_item_index_shared, get_queryables_mapping_shared, index_alias_by_collection_id, - index_by_collection_id, mk_actions, mk_item_id, populate_sort_shared, @@ -99,26 +101,6 @@ async def create_collection_index() -> None: await client.close() -async def create_item_index(collection_id: str): - """ - Create the index for Items. The settings of the index template will be used implicitly. - - Args: - collection_id (str): Collection identifier. - - Returns: - None - - """ - client = AsyncElasticsearchSettings().create_client - - await client.options(ignore_status=400).indices.create( - index=f"{index_by_collection_id(collection_id)}-000001", - body={"aliases": {index_alias_by_collection_id(collection_id): {}}}, - ) - await client.close() - - async def delete_item_index(collection_id: str): """Delete the index for items in a collection. 
@@ -175,6 +157,7 @@ async def get_all_collections( limit: int, request: Request, sort: Optional[List[Dict[str, Any]]] = None, + bbox: Optional[List[float]] = None, q: Optional[List[str]] = None, filter: Optional[Dict[str, Any]] = None, query: Optional[Dict[str, Dict[str, Any]]] = None, @@ -187,6 +170,7 @@ async def get_all_collections( limit (int): The number of results to return. request (Request): The FastAPI request object. sort (Optional[List[Dict[str, Any]]]): Optional sort parameter from the request. + bbox (Optional[List[float]]): Bounding box to filter collections by spatial extent. q (Optional[List[str]]): Free text search terms. query (Optional[Dict[str, Dict[str, Any]]]): Query extension parameters. filter (Optional[Dict[str, Any]]): Structured query in CQL2 format. @@ -314,12 +298,15 @@ async def get_all_collections( query_parts.append({"bool": {"must_not": {"match_all": {}}}}) raise - # Combine all query parts with AND logic if there are multiple - datetime_filter = None - if datetime: - datetime_filter = self._apply_collection_datetime_filter(datetime) - if datetime_filter: - query_parts.append(datetime_filter) + # Apply bbox filter if provided + bbox_filter = apply_collections_bbox_filter_shared(bbox) + if bbox_filter: + query_parts.append(bbox_filter) + + # Apply datetime filter if provided + datetime_filter = apply_collections_datetime_filter_shared(datetime) + if datetime_filter: + query_parts.append(datetime_filter) # Combine all query parts with AND logic if query_parts: @@ -329,12 +316,6 @@ async def get_all_collections( else {"bool": {"must": query_parts}} ) - # Create a copy of the body for count query (without pagination and sorting) - count_body = body.copy() - if "search_after" in count_body: - del count_body["search_after"] - count_body["size"] = 0 - # Create async tasks for both search and count search_task = asyncio.create_task( self.client.search( @@ -384,41 +365,6 @@ async def get_all_collections( return collections, next_token, 
matched - @staticmethod - def _apply_collection_datetime_filter( - datetime_str: Optional[str], - ) -> Optional[Dict[str, Any]]: - """Create a temporal filter for collections based on their extent.""" - if not datetime_str: - return None - - # Parse the datetime string into start and end - if "/" in datetime_str: - start, end = datetime_str.split("/") - # Replace open-ended ranges with concrete dates - if start == "..": - # For open-ended start, use a very early date - start = "1800-01-01T00:00:00Z" - if end == "..": - # For open-ended end, use a far future date - end = "2999-12-31T23:59:59Z" - else: - # If it's just a single date, use it for both start and end - start = end = datetime_str - - return { - "bool": { - "must": [ - # Check if any date in the array is less than or equal to the query end date - # This will match if the collection's start date is before or equal to the query end date - {"range": {"extent.temporal.interval": {"lte": end}}}, - # Check if any date in the array is greater than or equal to the query start date - # This will match if the collection's end date is after or equal to the query start date - {"range": {"extent.temporal.interval": {"gte": start}}}, - ] - } - } - async def get_one_item(self, collection_id: str, item_id: str) -> Dict: """Retrieve a single item from the database. @@ -1386,7 +1332,7 @@ async def create_collection(self, collection: Collection, **kwargs: Any): None Notes: - A new index is created for the items in the Collection using the `create_item_index` function. + A new index is created for the items in the Collection if the index insertion strategy requires it. 
""" collection_id = collection["id"] @@ -1404,6 +1350,12 @@ async def create_collection(self, collection: Collection, **kwargs: Any): if await self.client.exists(index=COLLECTIONS_INDEX, id=collection_id): raise ConflictError(f"Collection {collection_id} already exists") + if get_bool_env("ENABLE_COLLECTIONS_SEARCH") or get_bool_env( + "ENABLE_COLLECTIONS_SEARCH_ROUTE" + ): + # Convert bbox to bbox_shape for geospatial queries (ES/OS specific) + add_bbox_shape_to_collection(collection) + # Index the collection in the database await self.client.index( index=COLLECTIONS_INDEX, @@ -1507,6 +1459,12 @@ async def update_collection( await self.delete_collection(collection_id) else: + if get_bool_env("ENABLE_COLLECTIONS_SEARCH") or get_bool_env( + "ENABLE_COLLECTIONS_SEARCH_ROUTE" + ): + # Convert bbox to bbox_shape for geospatial queries (ES/OS specific) + add_bbox_shape_to_collection(collection) + # Update the existing collection await self.client.index( index=COLLECTIONS_INDEX, diff --git a/stac_fastapi/opensearch/stac_fastapi/opensearch/database_logic.py b/stac_fastapi/opensearch/stac_fastapi/opensearch/database_logic.py index d16e8215a..9d814ba92 100644 --- a/stac_fastapi/opensearch/stac_fastapi/opensearch/database_logic.py +++ b/stac_fastapi/opensearch/stac_fastapi/opensearch/database_logic.py @@ -29,6 +29,9 @@ from stac_fastapi.opensearch.config import OpensearchSettings as SyncSearchSettings from stac_fastapi.sfeos_helpers import filter as filter_module from stac_fastapi.sfeos_helpers.database import ( + add_bbox_shape_to_collection, + apply_collections_bbox_filter_shared, + apply_collections_datetime_filter_shared, apply_free_text_filter_shared, apply_intersects_filter_shared, create_index_templates_shared, @@ -159,6 +162,7 @@ async def get_all_collections( limit: int, request: Request, sort: Optional[List[Dict[str, Any]]] = None, + bbox: Optional[List[float]] = None, q: Optional[List[str]] = None, filter: Optional[Dict[str, Any]] = None, query: Optional[Dict[str, 
Dict[str, Any]]] = None, @@ -171,6 +175,7 @@ async def get_all_collections( limit (int): The number of results to return. request (Request): The FastAPI request object. sort (Optional[List[Dict[str, Any]]]): Optional sort parameter from the request. + bbox (Optional[List[float]]): Bounding box to filter collections by spatial extent. q (Optional[List[str]]): Free text search terms. query (Optional[Dict[str, Dict[str, Any]]]): Query extension parameters. filter (Optional[Dict[str, Any]]): Structured query in CQL2 format. @@ -298,12 +303,15 @@ async def get_all_collections( query_parts.append({"bool": {"must_not": {"match_all": {}}}}) raise - # Combine all query parts with AND logic if there are multiple - datetime_filter = None - if datetime: - datetime_filter = self._apply_collection_datetime_filter(datetime) - if datetime_filter: - query_parts.append(datetime_filter) + # Apply bbox filter if provided + bbox_filter = apply_collections_bbox_filter_shared(bbox) + if bbox_filter: + query_parts.append(bbox_filter) + + # Apply datetime filter if provided + datetime_filter = apply_collections_datetime_filter_shared(datetime) + if datetime_filter: + query_parts.append(datetime_filter) # Combine all query parts with AND logic if query_parts: @@ -313,12 +321,6 @@ async def get_all_collections( else {"bool": {"must": query_parts}} ) - # Create a copy of the body for count query (without pagination and sorting) - count_body = body.copy() - if "search_after" in count_body: - del count_body["search_after"] - count_body["size"] = 0 - # Create async tasks for both search and count search_task = asyncio.create_task( self.client.search( @@ -454,41 +456,6 @@ def apply_free_text_filter(search: Search, free_text_queries: Optional[List[str] search=search, free_text_queries=free_text_queries ) - @staticmethod - def _apply_collection_datetime_filter( - datetime_str: Optional[str], - ) -> Optional[Dict[str, Any]]: - """Create a temporal filter for collections based on their extent.""" - 
if not datetime_str: - return None - - # Parse the datetime string into start and end - if "/" in datetime_str: - start, end = datetime_str.split("/") - # Replace open-ended ranges with concrete dates - if start == "..": - # For open-ended start, use a very early date - start = "1800-01-01T00:00:00Z" - if end == "..": - # For open-ended end, use a far future date - end = "2999-12-31T23:59:59Z" - else: - # If it's just a single date, use it for both start and end - start = end = datetime_str - - return { - "bool": { - "must": [ - # Check if any date in the array is less than or equal to the query end date - # This will match if the collection's start date is before or equal to the query end date - {"range": {"extent.temporal.interval": {"lte": end}}}, - # Check if any date in the array is greater than or equal to the query start date - # This will match if the collection's end date is after or equal to the query start date - {"range": {"extent.temporal.interval": {"gte": start}}}, - ] - } - } - @staticmethod def apply_datetime_filter( search: Search, datetime: Optional[str] @@ -1356,7 +1323,7 @@ async def create_collection(self, collection: Collection, **kwargs: Any): ConflictError: If a Collection with the same id already exists in the database. Notes: - A new index is created for the items in the Collection using the `create_item_index` function. + A new index is created for the items in the Collection if the index insertion strategy requires it. 
""" collection_id = collection["id"] @@ -1373,6 +1340,12 @@ async def create_collection(self, collection: Collection, **kwargs: Any): if await self.client.exists(index=COLLECTIONS_INDEX, id=collection_id): raise ConflictError(f"Collection {collection_id} already exists") + if get_bool_env("ENABLE_COLLECTIONS_SEARCH") or get_bool_env( + "ENABLE_COLLECTIONS_SEARCH_ROUTE" + ): + # Convert bbox to bbox_shape for geospatial queries (ES/OS specific) + add_bbox_shape_to_collection(collection) + await self.client.index( index=COLLECTIONS_INDEX, id=collection_id, @@ -1464,6 +1437,12 @@ async def update_collection( await self.delete_collection(collection_id=collection_id, **kwargs) else: + if get_bool_env("ENABLE_COLLECTIONS_SEARCH") or get_bool_env( + "ENABLE_COLLECTIONS_SEARCH_ROUTE" + ): + # Convert bbox to bbox_shape for geospatial queries (ES/OS specific) + add_bbox_shape_to_collection(collection) + await self.client.index( index=COLLECTIONS_INDEX, id=collection_id, diff --git a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/__init__.py b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/__init__.py index bacf1ac31..01dae07b8 100644 --- a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/__init__.py +++ b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/__init__.py @@ -42,11 +42,13 @@ ) from .mapping import get_queryables_mapping_shared from .query import ( + apply_collections_bbox_filter_shared, + apply_collections_datetime_filter_shared, apply_free_text_filter_shared, apply_intersects_filter_shared, populate_sort_shared, ) -from .utils import get_bool_env, validate_refresh +from .utils import add_bbox_shape_to_collection, get_bool_env, validate_refresh __all__ = [ # Index operations @@ -59,6 +61,8 @@ # Query operations "apply_free_text_filter_shared", "apply_intersects_filter_shared", + "apply_collections_bbox_filter_shared", + "apply_collections_datetime_filter_shared", "populate_sort_shared", # Mapping 
operations "get_queryables_mapping_shared", @@ -68,6 +72,7 @@ # Utility functions "validate_refresh", "get_bool_env", + "add_bbox_shape_to_collection", # Datetime utilities "return_date", "extract_date", diff --git a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/query.py b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/query.py index 80d071287..72285a56f 100644 --- a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/query.py +++ b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/query.py @@ -3,8 +3,10 @@ This module provides functions for building and manipulating Elasticsearch/OpenSearch queries. """ -from typing import Any, Dict, List, Optional +import logging +from typing import Any, Dict, List, Optional, Union +from stac_fastapi.core.utilities import bbox2polygon from stac_fastapi.sfeos_helpers.mappings import Geometry ES_MAX_URL_LENGTH = 4096 @@ -66,6 +68,139 @@ def apply_intersects_filter_shared( } +def apply_collections_datetime_filter_shared( + datetime_str: Optional[str], +) -> Optional[Dict[str, Any]]: + """Create a temporal filter for collections based on their extent. + + Args: + datetime_str: The datetime parameter. Can be: + - A single datetime string (e.g., "2020-01-01T00:00:00Z") + - A datetime range with "/" separator (e.g., "2020-01-01T00:00:00Z/2021-01-01T00:00:00Z") + - Open-ended ranges using ".." (e.g., "../2021-01-01T00:00:00Z" or "2020-01-01T00:00:00Z/..") + - None if no datetime filter is provided + + Returns: + Optional[Dict[str, Any]]: A dictionary containing the temporal filter configuration + that can be used with Elasticsearch/OpenSearch queries, or None if datetime_str is None. 
+ Example return value: + { + "bool": { + "must": [ + {"range": {"extent.temporal.interval": {"lte": "2021-01-01T00:00:00Z"}}}, + {"range": {"extent.temporal.interval": {"gte": "2020-01-01T00:00:00Z"}}} + ] + } + } + + Notes: + - This function is specifically for filtering collections by their temporal extent + - It queries the extent.temporal.interval field + - Open-ended ranges (..) are replaced with concrete dates (1800-01-01 for start, 2999-12-31 for end) + """ + if not datetime_str: + return None + + # Parse the datetime string into start and end + if "/" in datetime_str: + start, end = datetime_str.split("/") + # Replace open-ended ranges with concrete dates + if start == "..": + # For open-ended start, use a very early date + start = "1800-01-01T00:00:00Z" + if end == "..": + # For open-ended end, use a far future date + end = "2999-12-31T23:59:59Z" + else: + # If it's just a single date, use it for both start and end + start = end = datetime_str + + return { + "bool": { + "must": [ + # Check if any date in the array is less than or equal to the query end date + # This will match if the collection's start date is before or equal to the query end date + {"range": {"extent.temporal.interval": {"lte": end}}}, + # Check if any date in the array is greater than or equal to the query start date + # This will match if the collection's end date is after or equal to the query start date + {"range": {"extent.temporal.interval": {"gte": start}}}, + ] + } + } + + +def apply_collections_bbox_filter_shared( + bbox: Union[str, List[float], None] +) -> Optional[Dict[str, Dict]]: + """Create a geo_shape filter for collections bbox search. + + This function handles bbox parsing from both GET requests (string format) and POST requests + (list format), and constructs a geo_shape query for filtering collections by their bbox_shape field. + + Args: + bbox: The bounding box parameter. 
Can be: + - A string of comma-separated coordinates (from GET requests) + - A list of floats [minx, miny, maxx, maxy] for 2D bbox + - None if no bbox filter is provided + + Returns: + Optional[Dict[str, Dict]]: A dictionary containing the geo_shape filter configuration + that can be used with Elasticsearch/OpenSearch queries, or None if bbox is invalid. + Example return value: + { + "geo_shape": { + "bbox_shape": { + "shape": { + "type": "Polygon", + "coordinates": [[[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]] + }, + "relation": "intersects" + } + } + } + + Notes: + - This function is specifically for filtering collections by their spatial extent + - It queries the bbox_shape field (not the geometry field used for items) + - The bbox is expected to be 2D (4 values) after any 3D to 2D conversion in the API layer + """ + logger = logging.getLogger(__name__) + + if not bbox: + return None + + # Parse bbox if it's a string (from GET requests) + if isinstance(bbox, str): + try: + bbox = [float(x.strip()) for x in bbox.split(",")] + except (ValueError, AttributeError) as e: + logger.error(f"Invalid bbox format: {bbox}, error: {e}") + return None + + if not bbox or len(bbox) != 4: + if bbox: + logger.warning( + f"bbox has incorrect number of coordinates (length={len(bbox)}), expected 4 (2D bbox)" + ) + return None + + # Convert bbox to a polygon for geo_shape query + bbox_polygon = { + "type": "Polygon", + "coordinates": bbox2polygon(bbox[0], bbox[1], bbox[2], bbox[3]), + } + + # Return geo_shape query for bbox_shape field + return { + "geo_shape": { + "bbox_shape": { + "shape": bbox_polygon, + "relation": "intersects", + } + } + } + + def populate_sort_shared(sortby: List) -> Optional[Dict[str, Dict[str, str]]]: """Create a sort configuration for Elasticsearch/OpenSearch queries. 
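
For reviewers skimming the hunk above, the behavior of the new `apply_collections_bbox_filter_shared` helper can be sketched in isolation. This is a minimal, self-contained reimplementation of the parsing and query-building logic, not the library code itself: the real function lives in `stac_fastapi.sfeos_helpers.database.query` and calls `bbox2polygon` from `stac_fastapi.core.utilities`; the local `bbox2polygon` below is an assumed stand-in that emits a closed exterior ring starting at `(minx, miny)`.

```python
from typing import Dict, List, Optional, Union


def bbox2polygon(minx: float, miny: float, maxx: float, maxy: float) -> List[List[List[float]]]:
    # Assumed stand-in for stac_fastapi.core.utilities.bbox2polygon:
    # a single closed exterior ring, first point repeated as the last.
    return [[[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]]


def bbox_filter(bbox: Union[str, List[float], None]) -> Optional[Dict]:
    """Sketch of apply_collections_bbox_filter_shared: accepts a GET-style
    comma-separated string or a POST-style list, returns a geo_shape query
    against the collection's bbox_shape field (or None if unusable)."""
    if not bbox:
        return None
    if isinstance(bbox, str):
        try:
            bbox = [float(x.strip()) for x in bbox.split(",")]
        except ValueError:
            return None  # unparseable string, e.g. "a,b,c,d"
    if len(bbox) != 4:
        return None  # only 2D bboxes are expected at this layer
    return {
        "geo_shape": {
            "bbox_shape": {
                "shape": {"type": "Polygon", "coordinates": bbox2polygon(*bbox)},
                "relation": "intersects",
            }
        }
    }


# GET-style string input and POST-style list input yield the same query
q = bbox_filter("0, 40, 20, 60")
assert q == bbox_filter([0.0, 40.0, 20.0, 60.0])
assert q["geo_shape"]["bbox_shape"]["relation"] == "intersects"
```

This mirrors why the PR wires a single shared helper into both backends: the GET `/collections` route and the POST `/collections-search` route normalize to the same filter clause, which is then AND-ed into `query_parts` alongside the datetime filter.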
diff --git a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/utils.py b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/utils.py index 12085c378..eaa596fad 100644 --- a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/utils.py +++ b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/utils.py @@ -5,9 +5,9 @@ """ import logging -from typing import Dict, List, Union +from typing import Any, Dict, List, Union -from stac_fastapi.core.utilities import get_bool_env +from stac_fastapi.core.utilities import bbox2polygon, get_bool_env from stac_fastapi.extensions.core.transaction.request import ( PatchAddReplaceTest, PatchOperation, @@ -15,6 +15,84 @@ ) from stac_fastapi.sfeos_helpers.models.patch import ElasticPath, ESCommandSet +logger = logging.getLogger(__name__) + + +def add_bbox_shape_to_collection(collection: Dict[str, Any]) -> bool: + """Add bbox_shape field to a collection document for spatial queries. + + This function extracts the bounding box from a collection's spatial extent + and converts it to a GeoJSON polygon shape that can be used for geospatial + queries in Elasticsearch/OpenSearch. + + Args: + collection: Collection document dictionary to modify in-place. + + Returns: + bool: True if bbox_shape was added, False if it was skipped (already exists, + no spatial extent, or invalid bbox). 
+ + Notes: + - Modifies the collection dictionary in-place by adding a 'bbox_shape' field + - Handles both 2D [minx, miny, maxx, maxy] and 3D [minx, miny, minz, maxx, maxy, maxz] bboxes + - Uses the first bbox if multiple are present in the collection + - Logs warnings for collections with invalid or missing bbox data + """ + collection_id = collection.get("id", "unknown") + + # Check if bbox_shape already exists + if "bbox_shape" in collection: + logger.debug( + f"Collection '{collection_id}' already has bbox_shape field, skipping" + ) + return False + + # Check if collection has spatial extent + if "extent" not in collection or "spatial" not in collection["extent"]: + logger.warning(f"Collection '{collection_id}' has no spatial extent, skipping") + return False + + spatial_extent = collection["extent"]["spatial"] + if "bbox" not in spatial_extent or not spatial_extent["bbox"]: + logger.warning( + f"Collection '{collection_id}' has no bbox in spatial extent, skipping" + ) + return False + + # Get the first bbox (collections can have multiple bboxes, but we use the first one) + bbox = ( + spatial_extent["bbox"][0] + if isinstance(spatial_extent["bbox"][0], list) + else spatial_extent["bbox"] + ) + + if len(bbox) < 4: + logger.warning( + f"Collection '{collection_id}': bbox has insufficient coordinates (length={len(bbox)}), expected at least 4" + ) + return False + + # Extract 2D coordinates (bbox can be 2D [minx, miny, maxx, maxy] or 3D [minx, miny, minz, maxx, maxy, maxz]) + # For 2D polygon, we only need the x,y coordinates and discard altitude (z) values + minx, miny = bbox[0], bbox[1] + if len(bbox) == 4: + # 2D bbox: [minx, miny, maxx, maxy] + maxx, maxy = bbox[2], bbox[3] + else: + # 3D bbox: [minx, miny, minz, maxx, maxy, maxz] + # Extract indices 3,4 for maxx,maxy - discarding altitude at indices 2 (minz) and 5 (maxz) + maxx, maxy = bbox[3], bbox[4] + + # Convert bbox to GeoJSON polygon + bbox_polygon_coords = bbox2polygon(minx, miny, maxx, maxy) + 
collection["bbox_shape"] = { + "type": "Polygon", + "coordinates": bbox_polygon_coords, + } + + logger.debug(f"Collection '{collection_id}': Added bbox_shape field") + return True + def validate_refresh(value: Union[str, bool]) -> str: """ diff --git a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/mappings.py b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/mappings.py index b2d7264d6..cb0c8f2d5 100644 --- a/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/mappings.py +++ b/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/mappings.py @@ -160,7 +160,7 @@ class Geometry(Protocol): # noqa "dynamic_templates": ES_MAPPINGS_DYNAMIC_TEMPLATES, "properties": { "id": {"type": "keyword"}, - "extent.spatial.bbox": {"type": "long"}, + "bbox_shape": {"type": "geo_shape"}, "extent.temporal.interval": { "type": "date", "format": "strict_date_optional_time||epoch_millis", diff --git a/stac_fastapi/tests/api/test_api_search_collections.py b/stac_fastapi/tests/api/test_api_search_collections.py index 029292ed0..19c9c6071 100644 --- a/stac_fastapi/tests/api/test_api_search_collections.py +++ b/stac_fastapi/tests/api/test_api_search_collections.py @@ -1020,3 +1020,232 @@ async def test_collections_pagination_all_endpoints(app_client, txn_client, ctx) for i, expected_id in enumerate(expected_ids): assert test_found[i]["id"] == expected_id + + +@pytest.mark.asyncio +async def test_collections_bbox_all_endpoints(app_client, txn_client, ctx): + """Verify GET /collections, GET /collections-search, and POST /collections-search honor the bbox parameter.""" + # Create multiple collections with different spatial extents + base_collection = ctx.collection + + # Use unique prefixes to avoid conflicts between tests + test_prefix = f"bbox-{uuid.uuid4().hex[:8]}" + + # Create collections with different bboxes + # Collection 1: Europe bbox + collection_europe = base_collection.copy() + collection_europe["id"] = f"{test_prefix}-europe" + collection_europe["title"] = "Europe 
Collection" + collection_europe["extent"] = { + "spatial": {"bbox": [[-10.0, 35.0, 40.0, 70.0]]}, + "temporal": {"interval": [[None, None]]}, + } + await create_collection(txn_client, collection_europe) + + # Collection 2: North America bbox + collection_na = base_collection.copy() + collection_na["id"] = f"{test_prefix}-north-america" + collection_na["title"] = "North America Collection" + collection_na["extent"] = { + "spatial": {"bbox": [[-170.0, 15.0, -50.0, 75.0]]}, + "temporal": {"interval": [[None, None]]}, + } + await create_collection(txn_client, collection_na) + + # Collection 3: Asia bbox + collection_asia = base_collection.copy() + collection_asia["id"] = f"{test_prefix}-asia" + collection_asia["title"] = "Asia Collection" + collection_asia["extent"] = { + "spatial": {"bbox": [[60.0, -10.0, 150.0, 55.0]]}, + "temporal": {"interval": [[None, None]]}, + } + await create_collection(txn_client, collection_asia) + + # Collection 4: Global bbox (should match any query) + collection_global = base_collection.copy() + collection_global["id"] = f"{test_prefix}-global" + collection_global["title"] = "Global Collection" + collection_global["extent"] = { + "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]}, + "temporal": {"interval": [[None, None]]}, + } + await create_collection(txn_client, collection_global) + + # Collection 5: 3D bbox (with altitude) - should still work for 2D queries + collection_3d = base_collection.copy() + collection_3d["id"] = f"{test_prefix}-3d-europe" + collection_3d["title"] = "3D Europe Collection" + collection_3d["extent"] = { + "spatial": {"bbox": [[-10.0, 35.0, 0.0, 40.0, 70.0, 5000.0]]}, # 3D bbox + "temporal": {"interval": [[None, None]]}, + } + await create_collection(txn_client, collection_3d) + + await refresh_indices(txn_client) + + # Test 1: Query for Europe region - should match Europe, Global, and 3D Europe collections + europe_bbox = [0.0, 40.0, 20.0, 60.0] # Central Europe + + endpoints = [ + { + "method": "GET", + "path": 
"/collections", + "params": [("bbox", ",".join(map(str, europe_bbox)))], + }, + { + "method": "GET", + "path": "/collections-search", + "params": [("bbox", ",".join(map(str, europe_bbox)))], + }, + { + "method": "POST", + "path": "/collections-search", + "body": {"bbox": europe_bbox}, + }, + ] + + for endpoint in endpoints: + if endpoint["method"] == "GET": + resp = await app_client.get(endpoint["path"], params=endpoint["params"]) + else: # POST + resp = await app_client.post(endpoint["path"], json=endpoint["body"]) + + assert ( + resp.status_code == 200 + ), f"Failed for {endpoint['method']} {endpoint['path']}: {resp.text}" + resp_json = resp.json() + + collections_list = resp_json["collections"] + + # Filter collections to only include the ones we created for this test + test_collections = [ + c for c in collections_list if c["id"].startswith(test_prefix) + ] + + # Should find Europe, Global, and 3D Europe collections + found_ids = {c["id"] for c in test_collections} + assert ( + f"{test_prefix}-europe" in found_ids + ), f"Europe collection not found {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-global" in found_ids + ), f"Global collection not found {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-3d-europe" in found_ids + ), f"3D Europe collection not found {endpoint['method']} {endpoint['path']}" + # Should NOT find North America or Asia + assert ( + f"{test_prefix}-north-america" not in found_ids + ), f"North America should not match Europe bbox in {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-asia" not in found_ids + ), f"Asia should not match Europe bbox in {endpoint['method']} {endpoint['path']}" + + # Test 2: Query for North America region - should match North America and Global collections + na_bbox = [-120.0, 30.0, -80.0, 50.0] # Central North America + + endpoints = [ + { + "method": "GET", + "path": "/collections", + "params": [("bbox", ",".join(map(str, na_bbox)))], + }, + { + 
"method": "GET", + "path": "/collections-search", + "params": [("bbox", ",".join(map(str, na_bbox)))], + }, + {"method": "POST", "path": "/collections-search", "body": {"bbox": na_bbox}}, + ] + + for endpoint in endpoints: + if endpoint["method"] == "GET": + resp = await app_client.get(endpoint["path"], params=endpoint["params"]) + else: # POST + resp = await app_client.post(endpoint["path"], json=endpoint["body"]) + + assert ( + resp.status_code == 200 + ), f"Failed for {endpoint['method']} {endpoint['path']}: {resp.text}" + resp_json = resp.json() + + collections_list = resp_json["collections"] + + # Filter collections to only include the ones we created for this test + test_collections = [ + c for c in collections_list if c["id"].startswith(test_prefix) + ] + + # Should find North America and Global collections + found_ids = {c["id"] for c in test_collections} + assert ( + f"{test_prefix}-north-america" in found_ids + ), f"North America collection not found {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-global" in found_ids + ), f"Global collection not found {endpoint['method']} {endpoint['path']}" + # Should NOT find Europe, Asia, or 3D Europe + assert ( + f"{test_prefix}-europe" not in found_ids + ), f"Europe should not match North America bbox in {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-asia" not in found_ids + ), f"Asia should not match North America bbox in {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-3d-europe" not in found_ids + ), f"3D Europe should not match North America bbox in {endpoint['method']} {endpoint['path']}" + + # Test 3: Query for Asia region - should match Asia and Global collections + asia_bbox = [100.0, 20.0, 130.0, 45.0] # East Asia + + endpoints = [ + { + "method": "GET", + "path": "/collections", + "params": [("bbox", ",".join(map(str, asia_bbox)))], + }, + { + "method": "GET", + "path": "/collections-search", + "params": [("bbox", ",".join(map(str, 
asia_bbox)))], + }, + {"method": "POST", "path": "/collections-search", "body": {"bbox": asia_bbox}}, + ] + + for endpoint in endpoints: + if endpoint["method"] == "GET": + resp = await app_client.get(endpoint["path"], params=endpoint["params"]) + else: # POST + resp = await app_client.post(endpoint["path"], json=endpoint["body"]) + + assert ( + resp.status_code == 200 + ), f"Failed for {endpoint['method']} {endpoint['path']}: {resp.text}" + resp_json = resp.json() + + collections_list = resp_json["collections"] + + # Filter collections to only include the ones we created for this test + test_collections = [ + c for c in collections_list if c["id"].startswith(test_prefix) + ] + + # Should find Asia and Global collections + found_ids = {c["id"] for c in test_collections} + assert ( + f"{test_prefix}-asia" in found_ids + ), f"Asia collection not found {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-global" in found_ids + ), f"Global collection not found {endpoint['method']} {endpoint['path']}" + # Should NOT find Europe, North America, or 3D Europe + assert ( + f"{test_prefix}-europe" not in found_ids + ), f"Europe should not match Asia bbox in {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-north-america" not in found_ids + ), f"North America should not match Asia bbox in {endpoint['method']} {endpoint['path']}" + assert ( + f"{test_prefix}-3d-europe" not in found_ids + ), f"3D Europe should not match Asia bbox in {endpoint['method']} {endpoint['path']}"
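
The 3D-bbox case exercised by the `-3d-europe` collection above depends on the index-shifting logic that `add_bbox_shape_to_collection` applies in `utils.py`. A minimal sketch of just that coordinate extraction (the function name here is illustrative, not the library's API): a 2D bbox reads its max corner from indices 2 and 3, while a 3D bbox skips the altitude values at indices 2 (minz) and 5 (maxz) and reads indices 3 and 4 instead, so both collapse to the same 2D footprint.

```python
from typing import List, Optional, Tuple


def extract_2d_corners(bbox: List[float]) -> Optional[Tuple[float, float, float, float]]:
    """Sketch of the 2D/3D bbox handling in add_bbox_shape_to_collection:
    return (minx, miny, maxx, maxy), dropping any altitude values."""
    if len(bbox) < 4:
        return None  # too few coordinates to form a rectangle
    minx, miny = bbox[0], bbox[1]
    if len(bbox) == 4:
        maxx, maxy = bbox[2], bbox[3]  # 2D: [minx, miny, maxx, maxy]
    else:
        maxx, maxy = bbox[3], bbox[4]  # 3D: [minx, miny, minz, maxx, maxy, maxz]
    return minx, miny, maxx, maxy


# The 3D Europe bbox from the test collapses to the same footprint as the 2D one,
# which is why it matches the same 2D geo_shape queries.
assert extract_2d_corners([-10.0, 35.0, 0.0, 40.0, 70.0, 5000.0]) == (-10.0, 35.0, 40.0, 70.0)
assert extract_2d_corners([-10.0, 35.0, 40.0, 70.0]) == (-10.0, 35.0, 40.0, 70.0)
```
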