v9.0.1

@laurenfrederick laurenfrederick released this 07 May 21:41
b70d75b

Cumulus v9.0.1 Migration steps

  • This release of Cumulus enables integration with a PostgreSQL database for archiving Cumulus data. There are several upgrade steps involved, some of which need to be done before redeploying Cumulus. See the documentation on upgrading to the RDS release.

BREAKING CHANGES

  • CUMULUS-2185 - RDS Migration Epic
    • CUMULUS-2191
      • Removed the following from the @cumulus/api/models.asyncOperation class in
        favor of the added @cumulus/async-operations module:
        • start
        • startAsyncOperations
    • CUMULUS-2187
      • The async-operations endpoint will now omit output instead of
        returning none when the operation did not return output.
    • CUMULUS-2309
      • Removed @cumulus/api/models/granule.unpublishAndDeleteGranule in favor
        of @cumulus/api/lib/granule-remove-from-cmr.unpublishGranule and
        @cumulus/api/lib/granule-delete.deleteGranuleAndFiles.
    • CUMULUS-2385
      • Updated sf-event-sqs-to-db-records to write a granule's files to
        PostgreSQL only after the workflow has exited the Running status.
        Please note that any workflow that uses sf_sqs_report_task for
        mid-workflow updates will be impacted.
      • Changed PostgreSQL file schema and TypeScript type definition to require
        bucket and key fields.
      • Updated granule/file write logic to mark a granule's status as "failed"
    • CUMULUS-2455
      • API move granule endpoint now moves granule files on a per-file basis
      • API move granule endpoint on granule file move failure will retain the
        file at its original location, but continue to move any other granule
        files.
      • Removed the move method from the @cumulus/api/models.granule class.
        Move logic is now handled in @cumulus/api/endpoints/granules and is
        accessible via the Core API.
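
As a rough illustration of the per-file behavior described under CUMULUS-2455, the sketch below moves files one at a time, leaves any file whose move fails at its original location, and continues with the rest. The GranuleFile and MoveResult shapes, moveFilesIndividually, and the moveOne callback are hypothetical names for this example and are not part of @cumulus/api. Real S3 operations would be asynchronous; the callback is synchronous here only to keep the control flow easy to read.

```typescript
// Hypothetical shapes for illustration; these are not the @cumulus/api types.
interface GranuleFile {
  bucket: string;
  key: string;
}

interface MoveResult {
  moved: GranuleFile[];    // files now recorded at the destination bucket
  retained: GranuleFile[]; // files left at their original location
  errors: Error[];         // one entry per failed move
}

// `moveOne` stands in for the real copy-and-delete of a single file.
function moveFilesIndividually(
  files: GranuleFile[],
  destinationBucket: string,
  moveOne: (file: GranuleFile, destinationBucket: string) => void
): MoveResult {
  const result: MoveResult = { moved: [], retained: [], errors: [] };
  for (const file of files) {
    try {
      moveOne(file, destinationBucket);
      result.moved.push({ ...file, bucket: destinationBucket });
    } catch (error) {
      // A failed move keeps the file where it was and does not stop the loop.
      result.retained.push(file);
      result.errors.push(error instanceof Error ? error : new Error(String(error)));
    }
  }
  return result;
}
```

A caller can inspect `retained` and `errors` to report which files still need attention after the request completes.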

Added

  • CUMULUS-2185 - RDS Migration Epic
    • CUMULUS-2130
      • Added postgres-migration-count-tool lambda/ECS task to allow for
        evaluation of database state
      • Added /migrationCounts api endpoint that allows running of the
        postgres-migration-count-tool as an asyncOperation
    • CUMULUS-2394
      • Updated PDR and Granule writes to check the step function
        workflow_start_time against the createdAt field for each record to ensure
        old records do not overwrite newer ones for legacy Dynamo and PostgreSQL
        writes
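
The timestamp check in CUMULUS-2394 can be pictured with a small guard like the one below. This is only a sketch: shouldWriteRecord is a hypothetical name, both values are assumed to be epoch milliseconds, and treating equal timestamps as writable is an assumption rather than documented behavior.

```typescript
// Returns true when a write triggered by a workflow should be applied.
// `existingCreatedAt` is the createdAt of the stored record, if there is one.
// `workflowStartTime` is the workflow_start_time of the incoming message.
function shouldWriteRecord(
  existingCreatedAt: number | undefined,
  workflowStartTime: number
): boolean {
  // No stored record yet, so there is nothing newer to protect.
  if (existingCreatedAt === undefined) return true;
  // Reject writes from workflows that started before the stored record was created.
  return workflowStartTime >= existingCreatedAt;
}
```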
    • CUMULUS-2188
      • Added data-migration2 Lambda to be run after data-migration1
      • Added logic to data-migration2 Lambda for migrating execution records
        from DynamoDB to PostgreSQL
    • CUMULUS-2191
      • Added @cumulus/async-operations to core packages, exposing
        startAsyncOperation which will handle starting an async operation and
        adding an entry to both PostgreSQL and DynamoDB
    • CUMULUS-2127
      • Add schema migration for collections table
    • CUMULUS-2129
      • Added logic to data-migration1 Lambda for migrating collection records
        from Dynamo to PostgreSQL
    • CUMULUS-2157
      • Add schema migration for providers table
      • Added logic to data-migration1 Lambda for migrating provider records
        from Dynamo to PostgreSQL
    • CUMULUS-2187
      • Added logic to data-migration1 Lambda for migrating async operation
        records from Dynamo to PostgreSQL
    • CUMULUS-2198
      • Added logic to data-migration1 Lambda for migrating rule records from
        DynamoDB to PostgreSQL
    • CUMULUS-2182
      • Add schema migration for PDRs table
    • CUMULUS-2230
      • Add schema migration for rules table
    • CUMULUS-2183
      • Add schema migration for asyncOperations table
    • CUMULUS-2184
      • Add schema migration for executions table
    • CUMULUS-2257
      • Updated PostgreSQL table and column names to snake_case
      • Added translateApiAsyncOperationToPostgresAsyncOperation function to @cumulus/db
    • CUMULUS-2186
      • Added logic to data-migration2 Lambda for migrating PDR records from
        DynamoDB to PostgreSQL
    • CUMULUS-2235
      • Added initial ingest load spec test/utility
    • CUMULUS-2167
      • Added logic to data-migration2 Lambda for migrating Granule records from
        DynamoDB to PostgreSQL and parsing Granule records to store File records
        in RDS.
    • CUMULUS-2367
      • Added granules_executions table to PostgreSQL schema to allow for a
        many-to-many relationship between granules and executions
        • The table refers to granule and execution records using foreign keys
          defined with ON DELETE CASCADE, which means that any time a granule or
          execution record is deleted, all of the records in the
          granules_executions table referring to that record will also be
          deleted.
      • Added upsertGranuleWithExecutionJoinRecord helper to @cumulus/db to
        allow for upserting a granule record and its corresponding
        granules_executions record
    • CUMULUS-2128
      • Added helper functions:
        • @cumulus/db/translate/file/translateApiFiletoPostgresFile
        • @cumulus/db/translate/file/translateApiGranuletoPostgresGranule
        • @cumulus/message/Providers/getMessageProvider
    • CUMULUS-2190
      • Added helper functions:
        • @cumulus/message/Executions/getMessageExecutionOriginalPayload
        • @cumulus/message/Executions/getMessageExecutionFinalPayload
        • @cumulus/message/workflows/getMessageWorkflowTasks
        • @cumulus/message/workflows/getMessageWorkflowStartTime
        • @cumulus/message/workflows/getMessageWorkflowStopTime
        • @cumulus/message/workflows/getMessageWorkflowName
    • CUMULUS-2192
      • Added helper functions:
        • @cumulus/message/PDRs/getMessagePdrRunningExecutions
        • @cumulus/message/PDRs/getMessagePdrCompletedExecutions
        • @cumulus/message/PDRs/getMessagePdrFailedExecutions
        • @cumulus/message/PDRs/getMessagePdrStats
        • @cumulus/message/PDRs/getPdrPercentCompletion
        • @cumulus/message/workflows/getWorkflowDuration
    • CUMULUS-2199
      • Added translateApiRuleToPostgresRule to @cumulus/db to translate API
        Rule to conform to Postgres Rule definition.
    • CUMULUS-2128
      • Added "upsert" logic to the sfEventSqsToDbRecords Lambda for granule and
        file writes to the core PostgreSQL database
    • CUMULUS-2199
      • Updated Rules endpoint to write rules to core PostgreSQL database in
        addition to DynamoDB and to delete rules from the PostgreSQL database in
        addition to DynamoDB.
      • Updated create in Rules Model to take in optional createdAt parameter
        which sets the value of createdAt if not specified during function call.
    • CUMULUS-2189
      • Updated Provider endpoint logic to write providers in parallel to Core
        PostgreSQL database
      • Update integration tests to utilize API calls instead of direct
        api/model/Provider calls
    • CUMULUS-2191
      • Updated cumuluss/async-operation task to write async-operations to the
        PostgreSQL database.
    • CUMULUS-2228
      • Added logic to the sfEventSqsToDbRecords Lambda to write execution, PDR,
        and granule records to the core PostgreSQL database in parallel with
        writes to DynamoDB
    • CUMULUS-2190
      • Added "upsert" logic to the sfEventSqsToDbRecords Lambda for PDR writes
        to the core PostgreSQL database
    • CUMULUS-2192
      • Added "upsert" logic to the sfEventSqsToDbRecords Lambda for execution
        writes to the core PostgreSQL database
    • CUMULUS-2187
      • The async-operations endpoint will now omit output instead of
        returning none when the operation did not return output.
    • CUMULUS-2167
      • Change PostgreSQL schema definition for files to remove filename and
        name and only support file_name.
      • Change PostgreSQL schema definition for files to remove size to only
        support file_size.
      • Change PostgresFile to remove duplicate fields filename and name and
        rename size to file_size.
    • CUMULUS-2266
      • Change sf-event-sqs-to-db-records behavior to discard and not throw an
        error on an out-of-order/delayed message so as not to have it be sent to
        the DLQ.
    • CUMULUS-2305
      • Changed DELETE /pdrs/{pdrname} API behavior to also delete record from
        PostgreSQL database.
    • CUMULUS-2309
      • Changed DELETE /granules/{granuleName} API behavior to also delete
        record from PostgreSQL database.
      • Changed Bulk operation BULK_GRANULE_DELETE API behavior to also delete
        records from PostgreSQL database.
    • CUMULUS-2367
      • Updated granule_cumulus_id foreign key to granule in PostgreSQL files
        table to use a CASCADE delete, so records in the files table are
        automatically deleted by the database when the corresponding granule is
        deleted.
    • CUMULUS-2407
      • Updated data-migration1 and data-migration2 Lambdas to use UPSERT instead
        of UPDATE when migrating DynamoDB records to PostgreSQL.
      • Changed data-migration1 and data-migration2 logic to only update already
        migrated records if the incoming record update has a newer timestamp
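
The "upsert only if newer" rule from CUMULUS-2407 can be sketched against an in-memory table. The MigratedRecord shape and upsertIfNewer are hypothetical, a Map stands in for the PostgreSQL table, and the choice of updatedAt as the compared timestamp is an assumption made for this example.

```typescript
// Hypothetical record shape; a Map stands in for the PostgreSQL table.
interface MigratedRecord {
  id: string;
  updatedAt: number; // epoch milliseconds
  payload: string;
}

type UpsertOutcome = 'inserted' | 'updated' | 'skipped';

function upsertIfNewer(
  table: Map<string, MigratedRecord>,
  incoming: MigratedRecord
): UpsertOutcome {
  const existing = table.get(incoming.id);
  if (existing === undefined) {
    // First migration of this record: plain insert.
    table.set(incoming.id, incoming);
    return 'inserted';
  }
  if (incoming.updatedAt > existing.updatedAt) {
    // Re-running the migration with a newer source record replaces the row.
    table.set(incoming.id, incoming);
    return 'updated';
  }
  // The already-migrated row is at least as new, so leave it alone.
  return 'skipped';
}
```

Under this rule a migration can be re-run safely: repeated runs either refresh stale rows or leave current rows untouched.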
    • CUMULUS-2329
      • Add write-db-dlq-records-to-s3 lambda.
      • Add terraform config to automatically write db records DLQ messages to an
        s3 archive on the system bucket.
      • Add unit tests and a component spec test for the above.
    • CUMULUS-2380
      • Add process-dead-letter-archive lambda to pick up and process dead letters in the S3 system bucket dead letter archive.
      • Add /deadLetterArchive/recoverCumulusMessages endpoint to trigger an async operation to leverage this capability on demand.
      • Add unit tests and integration test for all of the above.
    • CUMULUS-2406
      • Updated parallel write logic to ensure that updatedAt/updated_at
        timestamps are the same in Dynamo/PG on record write for the following
        data types:
        • async operations
        • granules
        • executions
        • PDRs
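
One way to keep the two timestamps identical, as CUMULUS-2406 requires, is to capture the time once and derive both records from it. The sketch below assumes a numeric updatedAt on the DynamoDB side and a Date-valued updated_at on the PostgreSQL side; the record shapes and buildParallelWrite are hypothetical.

```typescript
// Hypothetical record shapes for the two stores.
interface DynamoRecord {
  id: string;
  updatedAt: number; // epoch milliseconds
}

interface PostgresRecord {
  id: string;
  updated_at: Date;
}

function buildParallelWrite(
  id: string,
  now: number = Date.now()
): { dynamo: DynamoRecord; postgres: PostgresRecord } {
  // The timestamp is read once; reading the clock separately for each store
  // could produce values that differ by a few milliseconds.
  return {
    dynamo: { id, updatedAt: now },
    postgres: { id, updated_at: new Date(now) },
  };
}
```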
    • CUMULUS-2446
      • Remove schema validation check against DynamoDB table for collections when
        migrating records from DynamoDB to core PostgreSQL database.
    • CUMULUS-2447
      • Changed translateApiAsyncOperationToPostgresAsyncOperation to call
        JSON.stringify and then JSON.parse on output.
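
These notes do not say why the round trip in CUMULUS-2447 was added. As general JavaScript behavior, passing a value through JSON.stringify and then JSON.parse yields plain JSON data: properties whose value is undefined are dropped and Date objects become ISO 8601 strings. The helper name roundTripJson below is hypothetical and only demonstrates that effect.

```typescript
// Normalizes a value to plain JSON data by serializing and re-parsing it.
function roundTripJson(value: object): unknown {
  return JSON.parse(JSON.stringify(value));
}

// Example input with an undefined property and a Date value.
const roundTripped = roundTripJson({
  status: 'SUCCEEDED',
  detail: undefined,
  finishedAt: new Date(0),
}) as Record<string, unknown>;
// `detail` is absent from `roundTripped`, and `finishedAt` is now a string.
```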
    • CUMULUS-2313
      • Added postgres-migration-async-operation lambda to start an ECS task to
        run the data-migration2 lambda.
      • Updated async_operations table to include Data Migration 2 as a new
        operation_type.
      • Updated cumulus-tf/variables.tf to include optional_dynamo_tables that
        will be merged with dynamo_tables.
    • CUMULUS-2451
      • Added summary type file packages/db/src/types/summary.ts with
        MigrationSummary and DataMigration1 and DataMigration2 types.
      • Updated data-migration1 and data-migration2 lambdas to return
        MigrationSummary objects.
      • Added logging for every batch of 100 records processed for executions,
        granules and files, and PDRs.
      • Removed RecordAlreadyMigrated logs in data-migration1 and
        data-migration2
    • CUMULUS-2452
      • Added support for only migrating certain granules by specifying the
        granuleSearchParams.granuleId or granuleSearchParams.collectionId
        properties in the payload for the
        <prefix>-postgres-migration-async-operation Lambda
      • Added support for only running certain migrations for data-migration2 by
        specifying the migrationsList property in the payload for the
        <prefix>-postgres-migration-async-operation Lambda
    • CUMULUS-2453
      • Created storeErrors function which stores errors in system bucket.
      • Updated executions and granulesAndFiles data migrations to call storeErrors to store migration errors.
      • Added system_bucket variable to data-migration2.
    • CUMULUS-2455
      • The move granules API endpoint now records move updates for the granule
        files that were successfully moved, even if writing any of the granule
        files fails.
    • CUMULUS-2468
      • Added support for doing DynamoDB parallel scanning for executions and granules migrations to improve performance. The behavior of the parallel scanning and writes can be controlled via the following properties on the event input to the <prefix>-postgres-migration-async-operation Lambda:
        • granuleMigrationParams.parallelScanSegments: How many segments to divide your granules DynamoDB table into for parallel scanning
        • granuleMigrationParams.parallelScanLimit: The maximum number of granule records to evaluate for each parallel scanning segment of the DynamoDB table
        • granuleMigrationParams.writeConcurrency: The maximum number of concurrent granule/file writes to perform to the PostgreSQL database across all DynamoDB segments
        • executionMigrationParams.parallelScanSegments: How many segments to divide your executions DynamoDB table into for parallel scanning
        • executionMigrationParams.parallelScanLimit: The maximum number of execution records to evaluate for each parallel scanning segment of the DynamoDB table
        • executionMigrationParams.writeConcurrency: The maximum number of concurrent execution writes to perform to the PostgreSQL database across all DynamoDB segments
      • Added @cumulus/aws-client/DynamoDb.parallelScan helper to perform parallel scanning on DynamoDB tables
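
To show how the parallelScanSegments and parallelScanLimit settings relate to a DynamoDB parallel scan, the sketch below builds one set of Scan request parameters per segment. Segment, TotalSegments, and Limit are standard DynamoDB Scan parameters; buildSegmentScanParams and the table names are hypothetical, and mapping parallelScanLimit onto Limit is an assumption. How @cumulus/aws-client/DynamoDb.parallelScan actually builds its requests is not described in these notes.

```typescript
// Subset of the DynamoDB Scan request fields relevant to parallel scanning.
interface SegmentScanParams {
  TableName: string;
  Segment: number;       // zero-based index of this worker's segment
  TotalSegments: number; // how many segments the table is divided into
  Limit?: number;        // maximum items evaluated per request
}

function buildSegmentScanParams(
  tableName: string,
  parallelScanSegments: number,
  parallelScanLimit?: number
): SegmentScanParams[] {
  if (!Number.isInteger(parallelScanSegments) || parallelScanSegments < 1) {
    throw new RangeError('parallelScanSegments must be a positive integer');
  }
  const requests: SegmentScanParams[] = [];
  for (let segment = 0; segment < parallelScanSegments; segment += 1) {
    const params: SegmentScanParams = {
      TableName: tableName,
      Segment: segment,
      TotalSegments: parallelScanSegments,
    };
    // Only set Limit when the caller supplied one.
    if (parallelScanLimit !== undefined) params.Limit = parallelScanLimit;
    requests.push(params);
  }
  return requests;
}
```

Each element would be issued as its own Scan request, so the segments can be read concurrently while writeConcurrency separately bounds the PostgreSQL writes.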
    • CUMULUS-2507
      • Updated granule record write logic to set granule status to failed in both Postgres and DynamoDB if any/all of its files fail to write to the database.
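
The status rule in CUMULUS-2507 amounts to overriding the granule status whenever at least one file write fails. The sketch below is illustrative only: the status values other than "failed", the FileWriteResult shape, and resolveGranuleStatus are assumptions made for this example.

```typescript
// Assumed set of statuses; only "failed" is named by the change itself.
type GranuleStatus = 'running' | 'completed' | 'failed';

// Hypothetical per-file outcome of a database write.
interface FileWriteResult {
  key: string;
  ok: boolean;
}

function resolveGranuleStatus(
  workflowStatus: GranuleStatus,
  fileWrites: FileWriteResult[]
): GranuleStatus {
  // Any single failed file write marks the whole granule as failed.
  const anyFileFailed = fileWrites.some((write) => !write.ok);
  return anyFileFailed ? 'failed' : workflowStatus;
}
```

The resulting status would then be written to both PostgreSQL and DynamoDB so the two stores agree.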

Deprecated

  • CUMULUS-2185 - RDS Migration Epic
    • CUMULUS-2455
      • @cumulus/ingest/moveGranuleFiles