DMS Endpoint Re-design #23506

Open
YakDriver opened this issue Mar 4, 2022 · 1 comment · May be fixed by #23507
Labels
enhancement Requests to existing resources that expand the functionality or scope. service/dms Issues and PRs that pertain to the dms service.

YakDriver (Member) commented Mar 4, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Driving Force: AWS has hinted (in the Console) that it will deprecate extra_connection_attributes. In addition, having two ways of providing exactly the same information is problematic (see below): 1) a single long string carrying many attributes (e.g., extra_connection_attributes = "bucketFolder=value;bucketName=value;...") and 2) the many Terraform arguments that the string maps to (e.g., s3_settings.0.bucket_folder = "value" and s3_settings.0.bucket_name = "value").
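To make the duplication concrete, here is a minimal sketch of the same S3 target declared both ways on the existing aws_dms_endpoint resource. The resource labels, endpoint IDs, and bucket values are placeholders chosen for illustration, and aws_iam_role.iam_role is assumed to be defined elsewhere; this has not been applied against a real account.

```hcl
# Style 1: attributes packed into one semicolon-delimited string.
resource "aws_dms_endpoint" "via_string" {
  endpoint_id   = "example-via-string" # placeholder
  endpoint_type = "target"
  engine_name   = "s3"

  extra_connection_attributes = "bucketFolder=folder;bucketName=example-bucket"

  s3_settings {
    # Assumes an IAM role named "iam_role" exists elsewhere in the configuration.
    service_access_role_arn = aws_iam_role.iam_role.arn
  }
}

# Style 2: the same information as individual Terraform arguments.
resource "aws_dms_endpoint" "via_arguments" {
  endpoint_id   = "example-via-arguments" # placeholder
  endpoint_type = "target"
  engine_name   = "s3"

  s3_settings {
    bucket_folder           = "folder"
    bucket_name             = "example-bucket"
    service_access_role_arn = aws_iam_role.iam_role.arn
  }
}
```

Both blocks are meant to describe the same endpoint, which is why the provider currently has to map between the string and the arguments when reading state back from AWS.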

Problems:

  • It causes many diffs because the values AWS expects vary slightly depending on which method you use (e.g., extra_connection_attributes wants dataFormat=PARQUET_2_0 while S3Settings.DataFormat wants parquet-2-0), yet the provider tries to reconcile the two. In addition, for some values AWS accepts only all caps but returns all lowercase.
  • Handling the interplay between extra_connection_attributes and individual arguments is error-prone, as is handling so many different endpoint types and arguments in one resource.
  • This approach is confusing because it differs significantly from the normal practitioner experience in Terraform.

Previous Design: Previously, we could fit many endpoint types into the same resource because we did not have individual Terraform arguments corresponding to each of the extra_connection_attributes. Without extra_connection_attributes, there are about 259 endpoint-specific attributes (159 unique attributes).

New Design Plan:

  1. We add new endpoint-specific resources (see below), each of which will have a reasonable number of arguments.
  2. The new resources will not have extra_connection_attributes. This means each attribute will need a Terraform argument. This allows us to avoid the mapping back and forth and makes it easier to handle the AWS API inconsistencies (e.g., caps to lowercase).
  3. The current endpoint (aws_dms_endpoint) will continue as before but we will deprecate these arguments: elasticsearch_settings, extra_connection_attributes, kafka_settings, kinesis_settings, mongodb_settings, and s3_settings.
  4. The current endpoint resource will still be used for some endpoint types because they either have no corresponding extra_connection_attributes or use only the set of common arguments: aurora, azuredb, db2, dynamodb, mariadb, and sybase.
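As an illustration of item 4, an endpoint type that needs only the common arguments could keep using the current resource unchanged. The sketch below assumes an Aurora source; the host name, database name, user name, and variable name are hypothetical values, not taken from this issue.

```hcl
# Hypothetical variable so the password is not hard-coded.
variable "dms_source_password" {
  type      = string
  sensitive = true
}

# An Aurora source that uses only the common aws_dms_endpoint arguments:
# no extra_connection_attributes and no *_settings block.
resource "aws_dms_endpoint" "aurora_source" {
  endpoint_id   = "example-aurora-source" # placeholder
  endpoint_type = "source"
  engine_name   = "aurora"

  server_name   = "aurora.example.internal" # placeholder host
  port          = 3306
  database_name = "example"
  username      = "dms_user"
  password      = var.dms_source_password
  ssl_mode      = "none"
}
```

Because nothing here relies on the arguments slated for deprecation in item 3, a configuration like this should not need to change under the plan.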

New or Affected Resource(s)

  • aws_dms_dms_transfer_endpoint
  • aws_dms_db2_endpoint
  • aws_dms_docdb_endpoint
  • aws_dms_kafka_endpoint
  • aws_dms_kinesis_endpoint
  • aws_dms_mongodb_endpoint
  • aws_dms_mysql_endpoint
  • aws_dms_neptune_endpoint
  • aws_dms_opensearch_endpoint
  • aws_dms_oracle_endpoint
  • aws_dms_postgres_endpoint
  • aws_dms_redis_endpoint
  • aws_dms_redshift_endpoint
  • aws_dms_s3_endpoint
  • aws_dms_sqlserver_endpoint

Potential Terraform Configuration

resource "aws_dms_s3_endpoint" "test" {
  endpoint_id   = "example"
  endpoint_type = "target"
  ssl_mode      = "none"

  tags = {
    Name   = "example"
    Update = "to-update"
    Remove = "to-remove"
  }

  add_column_name                             = true
  bucket_folder                               = "folder"
  bucket_name                                 = "updated_name"
  canned_acl_for_objects                      = "private"
  cdc_inserts_and_updates                     = true
  cdc_max_batch_interval                      = 100
  cdc_min_file_size                           = 16
  cdc_path                                    = "cdc/path"
  compression_type                            = "GZIP"
  csv_delimiter                               = ";"
  csv_no_sup_value                            = "x"
  csv_null_value                              = "?"
  csv_row_delimiter                           = "\\r\\n"
  data_format                                 = "parquet"
  data_page_size                              = 1100000
  date_partition_delimiter                    = "SLASH"
  date_partition_enabled                      = true
  date_partition_sequence                     = "yyyymmddhh"
  date_partition_timezone                     = "America/New_York"
  dict_page_size_limit                        = 1000000
  enable_statistics                           = false
  encoding_type                               = "plain"
  encryption_mode                             = "SSE_S3"
  external_table_definition                   = "etd"
  ignore_header_rows                          = 1
  include_op_for_full_load                    = true
  max_file_size                               = 1000000
  parquet_timestamp_in_millisecond            = true
  parquet_version                             = "parquet-2-0"
  preserve_transactions                       = false
  rfc_4180                                    = true
  row_group_length                            = 11000
  service_access_role_arn                     = aws_iam_role.iam_role.arn
  timestamp_column_name                       = "tx_commit_time"
  use_csv_no_sup_value                        = true
  use_task_start_time_for_full_load_timestamp = true

  depends_on = [aws_iam_role_policy.dms_s3_access]
}

@YakDriver added the enhancement label Mar 4, 2022
@YakDriver linked a pull request Mar 4, 2022 that will close this issue
@ewbankkit added the breaking-change label (Introduces a breaking change in current functionality; usually deferred to the next major release.) May 31, 2022
@ewbankkit added this to the v5.0.0 milestone May 31, 2022
@ewbankkit added the service/dms label Jun 1, 2022
@breathingdust removed the breaking-change label Feb 17, 2023
@johnsonaj modified the milestones: v5.0.0, v6.0.0 May 23, 2023
samoclay commented Jun 6, 2023

It would be great to hear any news on when this could be available.
