Is your feature request related to a problem? Please describe.
I would like to use Data Prepper to perform a full load (export) of AWS DocumentDB data and to ingest change data capture events from it.
Describe the solution you'd like
Add a DocumentDB source that performs a full scan of an AWS DocumentDB collection and exports the entire collection to the OpenSearch sink. The source will also read the DocumentDB stream and ingest any change data capture events into the OpenSearch sink. For the full load, the source will implement a partition supplier that splits the collection into multiple query partitions and scans them in parallel.
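To make the partitioned full load concrete, here is a minimal sketch of the idea, not Data Prepper's actual implementation. It assumes partitions are inclusive `_id` ranges and uses an in-memory list as a stand-in for the collection; the names `make_partitions`, `scan_partition`, and `full_load` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_partitions(sorted_ids, partition_size):
    """Split a sorted list of _id values into inclusive (lower, upper) ranges."""
    if partition_size <= 0:
        raise ValueError("partition_size must be positive")
    partitions = []
    for start in range(0, len(sorted_ids), partition_size):
        chunk = sorted_ids[start:start + partition_size]
        partitions.append((chunk[0], chunk[-1]))
    return partitions


def scan_partition(collection, bounds):
    """Stand-in for a range query such as {_id: {$gte: lower, $lte: upper}}."""
    lower, upper = bounds
    return [doc for doc in collection if lower <= doc["_id"] <= upper]


def full_load(collection, partition_size, workers=4):
    """Scan every partition in parallel and return all documents."""
    sorted_ids = sorted(doc["_id"] for doc in collection)
    partitions = make_partitions(sorted_ids, partition_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in partition order, so output stays sorted.
        results = pool.map(lambda b: scan_partition(collection, b), partitions)
        return [doc for part in results for doc in part]


if __name__ == "__main__":
    docs = [{"_id": i, "value": i * i} for i in range(10)]
    print(make_partitions(list(range(10)), 4))
    print(len(full_load(docs, partition_size=4)))
```

In a real source the partition boundaries would come from the database rather than from a list held in memory, and each worker would write to the sink instead of returning documents.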
Describe alternatives you've considered (Optional)
Support Kafka Connect with the Debezium MongoDB connector plugin.
Additional context
Sample DocumentDB source configuration:
documentdb-pipeline:
  source:
    documentdb:
      acknowledgments: true
      host: "<<docdb-2024-01-03-20-31-17.cluster-abcdef.us-east-1.docdb.amazonaws.com>>"
      port: 27017
      authentication:
        username: ${{aws_secrets:secret:username}}
        password: ${{aws_secrets:secret:password}}
      aws:
        sts_role_arn: "<<arn:aws:iam::123456789012:role/Example-Role>>"
      # If id_key is specified, new key with docdb_id that matches the data from _id will be created
      # id_key: "docdb_id"
      s3_bucket: "<<bucket-name>>"
      s3_region: "<<bucket-region>>"
      # optional s3_prefix for Opensearch ingestion to write the temporary data
      s3_prefix: "<<path_prefix>>"
      collections:
        # collection format: <databaseName>.<collectionName>
        - collection: "<<dbname.collection1>>"
          export: true
          stream: true
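Two details in the sample configuration can be illustrated with a short sketch: the `<databaseName>.<collectionName>` format of `collection`, and the effect of the optional `id_key`. This is an illustration under assumptions, not the source's actual code. The helper names `split_collection` and `apply_id_key` are hypothetical, and splitting on the first dot assumes that a database name contains no dot while a collection name may.

```python
def split_collection(name):
    """Split '<databaseName>.<collectionName>' at the first dot."""
    database, sep, collection = name.partition(".")
    if not sep or not database or not collection:
        raise ValueError(
            "expected '<databaseName>.<collectionName>', got %r" % name
        )
    return database, collection


def apply_id_key(document, id_key=None):
    """Return a copy of the document; if id_key is set, copy _id into it."""
    enriched = dict(document)
    if id_key is not None:
        enriched[id_key] = document["_id"]
    return enriched


if __name__ == "__main__":
    print(split_collection("dbname.collection1"))
    print(apply_id_key({"_id": "abc123", "title": "example"}, "docdb_id"))
```

With `id_key` left commented out, as in the sample, the document passes through unchanged.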