v9.0.1
laurenfrederick released this 07 May 21:41
Cumulus v9.0.1
Migration steps
- This release of Cumulus enables integration with a PostgreSQL database for archiving Cumulus data. There are several upgrade steps involved, some of which need to be done before redeploying Cumulus. See the documentation on upgrading to the RDS release.
BREAKING CHANGES
- CUMULUS-2185 - RDS Migration Epic
  - CUMULUS-2191
    - Removed the following from the `@cumulus/api/models.asyncOperation` class in favor of the added `@cumulus/async-operations` module:
      - `start`
      - `startAsyncOperations`
  - CUMULUS-2187
    - The `async-operations` endpoint will now omit `output` instead of returning `none` when the operation did not return output.
  - CUMULUS-2309
    - Removed `@cumulus/api/models/granule.unpublishAndDeleteGranule` in favor of `@cumulus/api/lib/granule-remove-from-cmr.unpublishGranule` and `@cumulus/api/lib/granule-delete.deleteGranuleAndFiles`.
  - CUMULUS-2385
    - Updated `sf-event-sqs-to-db-records` to write a granule's files to PostgreSQL only after the workflow has exited the `Running` status. Please note that any workflow that uses `sf_sqs_report_task` for mid-workflow updates will be impacted.
    - Changed PostgreSQL `file` schema and TypeScript type definition to require `bucket` and `key` fields.
    - Updated granule/file write logic to mark a granule's status as "failed"
  - CUMULUS-2455
    - API `move granule` endpoint now moves granule files on a per-file basis.
    - API `move granule` endpoint on granule file move failure will retain the file at its original location, but continue to move any other granule files.
    - Removed the `move` method from the `@cumulus/api/models.granule` class. This logic is now handled in `@cumulus/api/endpoints/granules` and is accessible via the Core API.
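
The per-file move behavior described under CUMULUS-2455 can be pictured with a small sketch. This is not the Cumulus implementation: `GranuleFile`, `MoveOutcome`, `moveFilesIndividually` and the `moveOne` callback are hypothetical names introduced for illustration, and the sketch is synchronous to stay short, whereas a real move would be asynchronous. It only assumes what the notes state: each file is moved on its own, a failed move leaves that file at its original location, and the remaining files are still attempted.

```typescript
// Hypothetical shapes for illustration only; these are not Cumulus types.
interface GranuleFile {
  bucket: string;
  key: string;
}

interface MoveOutcome {
  moved: GranuleFile[];    // files recorded at their new location
  retained: GranuleFile[]; // files left at their original location
  errors: string[];        // one message per failed move
}

// Move each file on its own. A failure for one file does not stop the others.
function moveFilesIndividually(
  files: GranuleFile[],
  moveOne: (file: GranuleFile) => GranuleFile
): MoveOutcome {
  const outcome: MoveOutcome = { moved: [], retained: [], errors: [] };
  files.forEach((file) => {
    try {
      outcome.moved.push(moveOne(file));
    } catch (error) {
      // Keep the original bucket/key for this file and carry on.
      outcome.retained.push(file);
      outcome.errors.push(error instanceof Error ? error.message : String(error));
    }
  });
  return outcome;
}
```

A caller would inspect `retained` and `errors` to see which files stayed put, rather than treating the whole move as all-or-nothing.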
Added
- CUMULUS-2185 - RDS Migration Epic
  - CUMULUS-2130
    - Added postgres-migration-count-tool lambda/ECS task to allow for evaluation of database state
    - Added /migrationCounts api endpoint that allows running of the postgres-migration-count-tool as an asyncOperation
  - CUMULUS-2394
    - Updated PDR and Granule writes to check the step function workflow_start_time against the createdAt field for each record to ensure old records do not overwrite newer ones for legacy Dynamo and PostgreSQL writes
  - CUMULUS-2188
    - Added `data-migration2` Lambda to be run after `data-migration1`
    - Added logic to `data-migration2` Lambda for migrating execution records from DynamoDB to PostgreSQL
  - CUMULUS-2191
    - Added `@cumulus/async-operations` to core packages, exposing `startAsyncOperation` which will handle starting an async operation and adding an entry to both PostgreSQL and DynamoDb
  - CUMULUS-2127
    - Add schema migration for `collections` table
  - CUMULUS-2129
    - Added logic to `data-migration1` Lambda for migrating collection records from Dynamo to PostgreSQL
  - CUMULUS-2157
    - Add schema migration for `providers` table
    - Added logic to `data-migration1` Lambda for migrating provider records from Dynamo to PostgreSQL
  - CUMULUS-2187
    - Added logic to `data-migration1` Lambda for migrating async operation records from Dynamo to PostgreSQL
  - CUMULUS-2198
    - Added logic to `data-migration1` Lambda for migrating rule records from DynamoDB to PostgreSQL
  - CUMULUS-2182
    - Add schema migration for PDRs table
  - CUMULUS-2230
    - Add schema migration for `rules` table
  - CUMULUS-2183
    - Add schema migration for `asyncOperations` table
  - CUMULUS-2184
    - Add schema migration for `executions` table
  - CUMULUS-2257
    - Updated PostgreSQL table and column names to snake_case
    - Added `translateApiAsyncOperationToPostgresAsyncOperation` function to `@cumulus/db`
  - CUMULUS-2186
    - Added logic to `data-migration2` Lambda for migrating PDR records from DynamoDB to PostgreSQL
  - CUMULUS-2235
    - Added initial ingest load spec test/utility
  - CUMULUS-2167
    - Added logic to `data-migration2` Lambda for migrating Granule records from DynamoDB to PostgreSQL and parse Granule records to store File records in RDS.
  - CUMULUS-2367
    - Added `granules_executions` table to PostgreSQL schema to allow for a many-to-many relationship between granules and executions
      - The table refers to granule and execution records using foreign keys defined with ON DELETE CASCADE, which means that any time a granule or execution record is deleted, all of the records in the `granules_executions` table referring to that record will also be deleted.
    - Added `upsertGranuleWithExecutionJoinRecord` helper to `@cumulus/db` to allow for upserting a granule record and its corresponding `granules_executions` record
  - CUMULUS-2128
    - Added helper functions:
      - `@cumulus/db/translate/file/translateApiFiletoPostgresFile`
      - `@cumulus/db/translate/file/translateApiGranuletoPostgresGranule`
      - `@cumulus/message/Providers/getMessageProvider`
  - CUMULUS-2190
    - Added helper functions:
      - `@cumulus/message/Executions/getMessageExecutionOriginalPayload`
      - `@cumulus/message/Executions/getMessageExecutionFinalPayload`
      - `@cumulus/message/workflows/getMessageWorkflowTasks`
      - `@cumulus/message/workflows/getMessageWorkflowStartTime`
      - `@cumulus/message/workflows/getMessageWorkflowStopTime`
      - `@cumulus/message/workflows/getMessageWorkflowName`
  - CUMULUS-2192
    - Added helper functions:
      - `@cumulus/message/PDRs/getMessagePdrRunningExecutions`
      - `@cumulus/message/PDRs/getMessagePdrCompletedExecutions`
      - `@cumulus/message/PDRs/getMessagePdrFailedExecutions`
      - `@cumulus/message/PDRs/getMessagePdrStats`
      - `@cumulus/message/PDRs/getPdrPercentCompletion`
      - `@cumulus/message/workflows/getWorkflowDuration`
  - CUMULUS-2199
    - Added `translateApiRuleToPostgresRule` to `@cumulus/db` to translate API Rule to conform to Postgres Rule definition.
  - CUMULUS-2128
    - Added "upsert" logic to the `sfEventSqsToDbRecords` Lambda for granule and file writes to the core PostgreSQL database
  - CUMULUS-2199
    - Updated Rules endpoint to write rules to core PostgreSQL database in addition to DynamoDB and to delete rules from the PostgreSQL database in addition to DynamoDB.
    - Updated `create` in Rules Model to take in optional `createdAt` parameter which sets the value of createdAt if not specified during function call.
  - CUMULUS-2189
    - Updated Provider endpoint logic to write providers in parallel to Core PostgreSQL database
    - Update integration tests to utilize API calls instead of direct api/model/Provider calls
  - CUMULUS-2191
    - Updated cumuluss/async-operation task to write async-operations to the PostgreSQL database.
  - CUMULUS-2228
    - Added logic to the `sfEventSqsToDbRecords` Lambda to write execution, PDR, and granule records to the core PostgreSQL database in parallel with writes to DynamoDB
  - CUMULUS-2190
    - Added "upsert" logic to the `sfEventSqsToDbRecords` Lambda for PDR writes to the core PostgreSQL database
  - CUMULUS-2192
    - Added "upsert" logic to the `sfEventSqsToDbRecords` Lambda for execution writes to the core PostgreSQL database
  - CUMULUS-2187
    - The `async-operations` endpoint will now omit `output` instead of returning `none` when the operation did not return output.
  - CUMULUS-2167
    - Change PostgreSQL schema definition for `files` to remove `filename` and `name` and only support `file_name`.
    - Change PostgreSQL schema definition for `files` to remove `size` to only support `file_size`.
    - Change `PostgresFile` to remove duplicate fields `filename` and `name` and rename `size` to `file_size`.
  - CUMULUS-2266
    - Change `sf-event-sqs-to-db-records` behavior to discard and not throw an error on an out-of-order/delayed message so as not to have it be sent to the DLQ.
  - CUMULUS-2305
    - Changed `DELETE /pdrs/{pdrname}` API behavior to also delete record from PostgreSQL database.
  - CUMULUS-2309
    - Changed `DELETE /granules/{granuleName}` API behavior to also delete record from PostgreSQL database.
    - Changed `Bulk operation BULK_GRANULE_DELETE` API behavior to also delete records from PostgreSQL database.
  - CUMULUS-2367
    - Updated `granule_cumulus_id` foreign key to granule in PostgreSQL `files` table to use a CASCADE delete, so records in the files table are automatically deleted by the database when the corresponding granule is deleted.
  - CUMULUS-2407
    - Updated data-migration1 and data-migration2 Lambdas to use UPSERT instead of UPDATE when migrating DynamoDB records to PostgreSQL.
    - Changed data-migration1 and data-migration2 logic to only update already migrated records if the incoming record update has a newer timestamp
  - CUMULUS-2329
    - Add `write-db-dlq-records-to-s3` lambda.
    - Add terraform config to automatically write db records DLQ messages to an s3 archive on the system bucket.
    - Add unit tests and a component spec test for the above.
  - CUMULUS-2380
    - Add `process-dead-letter-archive` lambda to pick up and process dead letters in the S3 system bucket dead letter archive.
    - Add `/deadLetterArchive/recoverCumulusMessages` endpoint to trigger an async operation to leverage this capability on demand.
    - Add unit tests and integration test for all of the above.
  - CUMULUS-2406
    - Updated parallel write logic to ensure that updatedAt/updated_at timestamps are the same in Dynamo/PG on record write for the following data types:
      - async operations
      - granules
      - executions
      - PDRs
  - CUMULUS-2446
    - Remove schema validation check against DynamoDB table for collections when migrating records from DynamoDB to core PostgreSQL database.
  - CUMULUS-2447
    - Changed `translateApiAsyncOperationToPostgresAsyncOperation` to call `JSON.stringify` and then `JSON.parse` on output.
  - CUMULUS-2313
    - Added `postgres-migration-async-operation` lambda to start an ECS task to run the `data-migration2` lambda.
    - Updated `async_operations` table to include `Data Migration 2` as a new `operation_type`.
    - Updated `cumulus-tf/variables.tf` to include `optional_dynamo_tables` that will be merged with `dynamo_tables`.
  - CUMULUS-2451
    - Added summary type file `packages/db/src/types/summary.ts` with `MigrationSummary` and `DataMigration1` and `DataMigration2` types.
    - Updated `data-migration1` and `data-migration2` lambdas to return `MigrationSummary` objects.
    - Added logging for every batch of 100 records processed for executions, granules and files, and PDRs.
    - Removed `RecordAlreadyMigrated` logs in `data-migration1` and `data-migration2`
  - CUMULUS-2452
    - Added support for only migrating certain granules by specifying the `granuleSearchParams.granuleId` or `granuleSearchParams.collectionId` properties in the payload for the `<prefix>-postgres-migration-async-operation` Lambda
    - Added support for only running certain migrations for data-migration2 by specifying the `migrationsList` property in the payload for the `<prefix>-postgres-migration-async-operation` Lambda
  - CUMULUS-2453
    - Created `storeErrors` function which stores errors in system bucket.
    - Updated `executions` and `granulesAndFiles` data migrations to call `storeErrors` to store migration errors.
    - Added `system_bucket` variable to `data-migration2`.
  - CUMULUS-2455
    - Move granules API endpoint records move updates for migrated granule files if writing any of the granule files fails.
  - CUMULUS-2468
    - Added support for doing DynamoDB parallel scanning for `executions` and `granules` migrations to improve performance. The behavior of the parallel scanning and writes can be controlled via the following properties on the event input to the `<prefix>-postgres-migration-async-operation` Lambda:
      - `granuleMigrationParams.parallelScanSegments`: How many segments to divide your granules DynamoDB table into for parallel scanning
      - `granuleMigrationParams.parallelScanLimit`: The maximum number of granule records to evaluate for each parallel scanning segment of the DynamoDB table
      - `granuleMigrationParams.writeConcurrency`: The maximum number of concurrent granule/file writes to perform to the PostgreSQL database across all DynamoDB segments
      - `executionMigrationParams.parallelScanSegments`: How many segments to divide your executions DynamoDB table into for parallel scanning
      - `executionMigrationParams.parallelScanLimit`: The maximum number of execution records to evaluate for each parallel scanning segment of the DynamoDB table
      - `executionMigrationParams.writeConcurrency`: The maximum number of concurrent execution writes to perform to the PostgreSQL database across all DynamoDB segments
  - CUMULUS-2468 - Added `@cumulus/aws-client/DynamoDb.parallelScan` helper to perform parallel scanning on DynamoDb tables
  - CUMULUS-2507
    - Updated granule record write logic to set granule status to `failed` in both Postgres and DynamoDB if any/all of its files fail to write to the database.
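
The payload properties listed under CUMULUS-2452 and CUMULUS-2468 above can be combined into a single event for the `<prefix>-postgres-migration-async-operation` Lambda. The sketch below is one way to assemble and sanity-check such a payload before invoking the Lambda. The property names come from these notes; everything else is an assumption made for illustration: the `MigrationPayload` and `ParallelMigrationParams` types, the `buildMigrationPayload` helper, the positive-integer check, the `'granules'` value for `migrationsList`, and the example collection ID are not taken from Cumulus.

```typescript
// Hypothetical types for illustration. Property names come from the release
// notes, but these are not the types exported by Cumulus.
interface ParallelMigrationParams {
  parallelScanSegments?: number;
  parallelScanLimit?: number;
  writeConcurrency?: number;
}

interface MigrationPayload {
  migrationsList?: string[];
  granuleSearchParams?: { granuleId?: string; collectionId?: string };
  granuleMigrationParams?: ParallelMigrationParams;
  executionMigrationParams?: ParallelMigrationParams;
}

// Reject values that cannot be a count (zero, negative, fractional, NaN, infinite).
function assertPositiveInteger(name: string, value: number | undefined): void {
  if (value === undefined) {
    return;
  }
  if (!isFinite(value) || value <= 0 || Math.floor(value) !== value) {
    throw new Error(name + ' must be a positive integer, got ' + value);
  }
}

function checkParams(prefix: string, params: ParallelMigrationParams | undefined): void {
  if (params === undefined) {
    return;
  }
  assertPositiveInteger(prefix + '.parallelScanSegments', params.parallelScanSegments);
  assertPositiveInteger(prefix + '.parallelScanLimit', params.parallelScanLimit);
  assertPositiveInteger(prefix + '.writeConcurrency', params.writeConcurrency);
}

// Validate the tuning parameters and serialize the event as JSON.
function buildMigrationPayload(payload: MigrationPayload): string {
  checkParams('granuleMigrationParams', payload.granuleMigrationParams);
  checkParams('executionMigrationParams', payload.executionMigrationParams);
  return JSON.stringify(payload);
}

const examplePayload = buildMigrationPayload({
  migrationsList: ['granules'], // assumed value, not confirmed by these notes
  granuleSearchParams: { collectionId: 'MOD09GQ___006' }, // illustrative ID only
  granuleMigrationParams: { parallelScanSegments: 10, writeConcurrency: 5 },
});
```

The resulting JSON string could then be passed as the event when invoking the Lambda, for example from the AWS console or CLI. Check the Cumulus upgrade documentation for the accepted `migrationsList` values before relying on the one shown here.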
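
The rule introduced under CUMULUS-2407 above, where an already migrated record is only updated when the incoming record has a newer timestamp, can be sketched with an in-memory store. This is a simplified model and not the migration Lambdas' code: `MigratedRecord`, `RecordStore` and `upsertIfNewer` are hypothetical names, and comparing an `updatedAt` value in epoch milliseconds is an assumption, since these notes do not say which field the Lambdas compare.

```typescript
// Hypothetical record shape; `updatedAt` is assumed to be epoch milliseconds.
interface MigratedRecord {
  id: string;
  updatedAt: number;
  status: string;
}

// Stand-in for a database table, keyed by record id.
type RecordStore = { [id: string]: MigratedRecord };

// Insert the record if it is new; overwrite an existing one only when the
// incoming copy is strictly newer. Returns true when the store was changed.
function upsertIfNewer(store: RecordStore, incoming: MigratedRecord): boolean {
  const existing: MigratedRecord | undefined = store[incoming.id];
  if (existing !== undefined && existing.updatedAt >= incoming.updatedAt) {
    return false; // stale or duplicate update: keep what is already stored
  }
  store[incoming.id] = incoming;
  return true;
}
```

With this guard, re-running a migration over the same source records leaves newer data in place instead of overwriting it with older copies.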
Deprecated
- CUMULUS-2185 - RDS Migration Epic
  - CUMULUS-2455
    - `@cumulus/ingest/moveGranuleFiles`