veda-backend

This project deploys a complete backend for a SpatioTemporal Asset Catalog, including a PostgreSQL database, a metadata (STAC) API, and a raster tiling API. Veda-backend is a non-forked version of the eoAPI demo project: it is decoupled from the demo project so that stable new functionality can be selectively incorporated from the fast-moving development in eoAPI while providing a continuous baseline for veda-backend users and supporting project-specific business and deployment logic.

The primary tools employed in the eoAPI demo and this project are PgSTAC, stac-fastapi, and TiTiler.

VEDA backend context

architecture diagram

Edit this diagram in VS Code using the Draw.io Integration Extension and export a new SVG

Veda-backend is the central index of the VEDA ecosystem. This project provides the infrastructure for a PgSTAC database, STAC API, and TiTiler. This infrastructure is used to discover, access, and visualize the Analysis Ready Cloud Optimized (ARCO) assets of the VEDA Data Store.

Deployment

This project uses an AWS CDK CloudFormation stack to deploy a full AWS virtual private cloud environment with a database and supporting lambda function APIs. The deployment constructs, database, and API services are highly configurable. This section provides basic deployment instructions as well as support for customization.

Tooling & supporting documentation

Environment variables

A .example.env template is supplied for local deployments. If updating an existing deployment, it is essential to check the most current values for these variables by fetching them from AWS Secrets Manager. The environment secrets are named <app-name>-<stage>-env, for example veda-backend-dev-env.

Warning The environment variables stored as AWS secrets are manually maintained and should be reviewed before deploying updates to existing stacks.

Fetch environment variables using AWS CLI

To retrieve the variables for a stage that has been previously deployed, AWS Secrets Manager can be used to quickly populate a .env file with scripts/sync-env-local.sh.

./scripts/sync-env-local.sh <app-secret-name>
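
For example, to populate .env from the example secret named above:

./scripts/sync-env-local.sh veda-backend-dev-env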

Basic environment variables

Name Explanation
APP_NAME Optional app name used to name the stack and resources; defaults to veda-backend
STAGE REQUIRED Deployment stage used to name the stack and resources, e.g. dev, staging, prod
VEDA_DB_PGSTAC_VERSION REQUIRED Version of the PgSTAC database, e.g. 0.7.6
VEDA_DB_SCHEMA_VERSION REQUIRED Version of the custom veda-backend schema, e.g. 0.1.1
VEDA_DB_SNAPSHOT_ID Optional RDS snapshot identifier used to initialize the database from a snapshot; once used, it is REQUIRED for all subsequent deployments
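
A minimal .env sketch using the variables above (values are illustrative):

APP_NAME=veda-backend
STAGE=dev
VEDA_DB_PGSTAC_VERSION=0.7.6
VEDA_DB_SCHEMA_VERSION=0.1.1
# Only set once a snapshot has been used to initialize the database
# VEDA_DB_SNAPSHOT_ID=<snapshot-identifier>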

Advanced configuration

The constructs and applications in this project are configured using pydantic. The settings are defined in config.py files stored alongside the associated construct or application; for example, the settings for the RDS PostgreSQL construct are defined in database/infrastructure/config.py. For custom configuration, use environment variables to override the pydantic defaults (a sketch follows the table below).

Construct Env Prefix Configuration
Database VEDA_DB database/infrastructure/config.py
Domain VEDA_DOMAIN domain/infrastructure/config.py
Network N/A network/infrastructure/config.py
Raster API (TiTiler) VEDA_RASTER raster_api/infrastructure/config.py
STAC API VEDA stac_api/infrastructure/config.py
Routes VEDA routes/infrastructure/config.py
S3 Website VEDA s3_website/infrastructure/config.py
App (global settings) N/A config.py
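
As a sketch of the override pattern, settings defined in a construct's config.py are read from environment variables that carry that construct's prefix. VEDA_DB_PGSTAC_VERSION and VEDA_DB_SCHEMA_VERSION (documented above) follow this pattern; other field names shown here are placeholders, so check the relevant config.py for the fields your deployment actually supports.

# Override defaults for the database construct (fields defined in database/infrastructure/config.py)
export VEDA_DB_PGSTAC_VERSION=0.7.6
export VEDA_DB_SCHEMA_VERSION=0.1.1
# Same pattern for other constructs, e.g. the raster API (field name is a placeholder):
# export VEDA_RASTER_<FIELD_NAME>=<value>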

Deploying to the cloud

Install deployment pre-requisites

These can be installed with Homebrew on macOS:

brew install node
brew install nvm
brew install jq

Virtual environment example

python3 -m venv .venv
source .venv/bin/activate

Install requirements

nvm use --lts
npm install --location=global aws-cdk
python3 -m pip install --upgrade pip
python3 -m pip install -e ".[dev,deploy,test]"
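
If the target AWS account and region have not been used with the CDK before, bootstrap them once before the first deployment (a standard CDK step, not specific to this project):

# one-time per account/region, using the active AWS credentials/profile
cdk bootstrap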

Run the deployment

# Review what infrastructure changes your deployment will cause
cdk diff
# Execute the deployment and stand by; security changes will require approval before deployment
cdk deploy
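
Putting the steps together, a sketch of a full deployment run, assuming the stage's variables were synced into a plain KEY=VALUE .env file as described above:

# export the stage's environment variables for the CDK app
set -a
source .env
set +a
# review the changes, then deploy
cdk diff
cdk deploy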

Deleting the CloudFormation stack

If this is a development stack that is safe to delete, you can delete the stack in the CloudFormation console or via cdk destroy; however, the following additional manual steps are required to completely delete the stack's resources (a CLI sketch follows this list):

  1. You will need to disable deletion protection of the RDS database and delete the database.
  2. Identify and delete the RDS subnet group associated with the RDS database you just deleted (it will not be automatically removed because of the RDS deletion protection in place when the group was created).
  3. If this stack created a new VPC, detach the Internet Gateway (IGW) from the VPC and delete it.
  4. If this stack created a new VPC, delete the VPC (this should delete a subnet and security group too).
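
A hedged sketch of these manual steps with the AWS CLI, assuming a single RDS instance (all identifiers are placeholders; use the corresponding db-cluster commands if the database is an Aurora cluster):

# 1. disable deletion protection, then delete the database
aws rds modify-db-instance --db-instance-identifier <db-id> --no-deletion-protection --apply-immediately
aws rds delete-db-instance --db-instance-identifier <db-id> --skip-final-snapshot
# 2. delete the orphaned subnet group
aws rds delete-db-subnet-group --db-subnet-group-name <subnet-group-name>
# 3. detach and delete the internet gateway (only if this stack created the VPC)
aws ec2 detach-internet-gateway --internet-gateway-id <igw-id> --vpc-id <vpc-id>
aws ec2 delete-internet-gateway --internet-gateway-id <igw-id>
# 4. delete the VPC (only if this stack created it)
aws ec2 delete-vpc --vpc-id <vpc-id>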

Custom deployments

The default settings for this project generate a complete AWS environment including a VPC and gateways for the stack. See this guidance for adjusting the veda-backend stack for existing managed and/or shared AWS environments.

Local Docker deployment

Start up a local stack

docker compose up
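
Once the containers are up, a quick smoke test against the local services (ports and paths are assumptions; check docker-compose.yml for the values used by your checkout):

# STAC API landing page (port is an example)
curl http://localhost:8081/
# raster API health check (port and path are examples)
curl http://localhost:8082/healthz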

Clean up after running locally

docker compose down

Running tests locally

To run the tests that are executed in CI, a script is included that requires as little setup as possible:

./scripts/run-local-tests.sh

In case of failure, all container logs will be written out to container_logs.log.

Operations

Adding new data to veda-backend

Warning PgSTAC records should be loaded in the database using pypgstac for proper indexing and partitioning.
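
A minimal sketch of loading records with the pypgstac CLI (file names and connection string are placeholders; check the pypgstac documentation for the options supported by your deployed VEDA_DB_PGSTAC_VERSION):

# load a file of STAC collection records
pypgstac load collections collections.json --dsn postgresql://<user>:<password>@<host>:5432/postgres --method insert_ignore
# load a newline-delimited file of STAC item records
pypgstac load items items.ndjson --dsn postgresql://<user>:<password>@<host>:5432/postgres --method insert_ignore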

The VEDA ecosystem includes tools specifically created for loading PgSTAC records and optimizing data assets. The veda-data-airflow project provides examples of cloud pipelines that transform data to cloud-optimized formats, generate STAC metadata, and submit records for publication to the veda-backend database via veda-backend's ingest API. Veda-backend's integrated ingest system includes an API lambda for enqueuing collection and item records in a DynamoDB table and an ingestor lambda that batch-loads the enqueued DynamoDB records into the PgSTAC database. Currently, the client id and domain of an existing Cognito user pool programmatic client must be supplied in configuration as VEDA_CLIENT_ID and VEDA_COGNITO_DOMAIN (the veda-auth project can be used to deploy a Cognito user pool and client). To dispense auth tokens via the ingest API swagger docs and /token endpoints, an administrator must add the ingest API lambda URL to the allowed callbacks of the Cognito client.
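
A hedged sketch of that token flow, assuming the ingest API's /token endpoint returns a JSON access token and that items are submitted to an /ingestions route (the route, payload shape, and token field name are assumptions; consult the ingest API's swagger docs for the actual interface):

# request an auth token from the deployed ingest API (URL and credentials are placeholders)
TOKEN=$(curl -s -X POST "https://<ingest-api-url>/token" \
  -d "username=<user>&password=<password>" | jq -r '.AccessToken')
# submit a STAC item record for ingestion (endpoint name is an assumption)
curl -s -X POST "https://<ingest-api-url>/ingestions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d @item.json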

Support scripts

Support scripts are provided for manual system operations.

VEDA ecosystem

Projects

Name Explanation
veda-backend Central index (database) and APIs for recording, discovering, viewing, and using VEDA assets
veda-config Configuration for viewing VEDA assets in dashboard UI
veda-ui Dashboard UI for viewing and analysing VEDA assets
veda-stac-ingestor Entry-point for users/services to add new records to database
veda-data Collection and asset discovery configuration
veda-data-airflow Cloud optimize data assets and submit records for publication to veda-stac-ingestor
veda-docs Documentation repository for end users of VEDA ecosystem data and tools

VEDA usage examples

STAC community resources

STAC browser

Radiant Earth's stac-browser is a browser for STAC catalogs. The demo version at radiantearth.github.io/stac-browser can be used to browse the contents of the veda-backend STAC catalog: paste the veda-backend stac-api URL deployed by this project into the demo and click load. Read more about the recent developments and usage of stac-browser here.
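
The catalog can also be queried directly with standard STAC API endpoints; for example (the base URL is a placeholder for the stac-api URL deployed by this project):

# list the collections in the catalog
curl "https://<veda-backend-stac-api-url>/collections"
# search for a few items over a bounding box
curl "https://<veda-backend-stac-api-url>/search?bbox=-40,-10,40,10&limit=5"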

License

This project is licensed under Apache 2; see the LICENSE file for more details.