Amazon OpenSearch Service: High Level Constructs For OpenSearch MultiAZWithStandBy Feature #26026

Closed
2 tasks
AmanRajAWS opened this issue Jun 17, 2023 · 8 comments · Fixed by #26082
Labels
@aws-cdk/aws-opensearch Related to the @aws-cdk/aws-opensearchservice package effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. good first issue Related to contributions. See CONTRIBUTING.md p2

Comments

@AmanRajAWS

Describe the feature

The OpenSearch team recently launched the Multi-AZ with Standby feature for OpenSearch domains (see the feature documentation). According to the CDK docs, there is currently no high-level CDK construct for this feature (see the CDK docs for the MultiAZWithStandby feature).

Use Case

The general recommendation is to use high-level constructs. Because no high-level construct exists for this feature, a CDK template must be migrated to CFN constructs whenever an OpenSearch domain needs Multi-AZ with Standby. This hinders adoption of the feature for Amazon OpenSearch Service domains.

Proposed Solution

Add a MultiAZWithStandby attribute to the CapacityConfig object (see the CDK docs for CapacityConfig).
An OpenSearch domain with Multi-AZ with Standby, created using CDK high-level constructs, should look like:

const domain = new Domain(this, 'Domain', {
  version: EngineVersion.OPENSEARCH_1_0,
  capacity: {
    masterNodes: 3,
    multiAZWithStandbyEnabled: true, // proposed boolean prop: true or false
  },
});

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.84.0

Environment details (OS name and version, etc.)

macOS Ventura 13.4

@AmanRajAWS AmanRajAWS added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jun 17, 2023
@github-actions github-actions bot added the @aws-cdk/aws-opensearch Related to the @aws-cdk/aws-opensearchservice package label Jun 17, 2023
@pahud
Contributor

pahud commented Jun 19, 2023

Is this feature supported by CloudFormation? If not, it sounds like we need to create an L3 (or at least L2.5) construct for it.

@pahud pahud added p2 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Jun 19, 2023
@AmanRajAWS
Author

Yes, this feature is supported by CloudFormation (docs link).

@peterwoodworth peterwoodworth added the good first issue Related to contributions. See CONTRIBUTING.md label Jun 19, 2023
@peterwoodworth
Contributor

You should be able to easily add this now to your template with an escape hatch.

Labeled as good first issue; it appears all we need to do is add a boolean prop.
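For readers who need the flag before an L2 prop exists, the escape hatch mentioned above could look roughly like the following. This is a minimal sketch, not a maintainer-provided snippet: the stack name, engine version, and instance types are illustrative assumptions, and the service-side requirements for standby domains still apply at deploy time.

```typescript
import * as cdk from 'aws-cdk-lib';
import { CfnDomain, Domain, EngineVersion } from 'aws-cdk-lib/aws-opensearchservice';

// Illustrative app and stack; the names are assumptions for this sketch.
const app = new cdk.App();
const stack = new cdk.Stack(app, 'EscapeHatchSketch');

// Build the domain with the L2 construct as usual.
const domain = new Domain(stack, 'Domain', {
  version: EngineVersion.OPENSEARCH_1_3,
  capacity: {
    dataNodes: 3,
    dataNodeInstanceType: 'r5.large.search',
    masterNodes: 3,
  },
  zoneAwareness: { enabled: true, availabilityZoneCount: 3 },
});

// Escape hatch: reach the underlying L1 resource and set the
// CloudFormation property that the L2 construct does not expose.
const cfnDomain = domain.node.defaultChild as CfnDomain;
cfnDomain.addPropertyOverride('ClusterConfig.MultiAZWithStandbyEnabled', true);
```

The override is applied to the synthesized template only, so CDK performs no validation of the value or of instance-type compatibility.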

lpizzinidev added a commit to lpizzinidev/aws-cdk that referenced this issue Jun 29, 2023
lpizzinidev added a commit to lpizzinidev/aws-cdk that referenced this issue Jun 29, 2023
lpizzinidev added a commit to lpizzinidev/aws-cdk that referenced this issue Jun 30, 2023
lpizzinidev added a commit to lpizzinidev/aws-cdk that referenced this issue Jul 7, 2023
lpizzinidev added a commit to lpizzinidev/aws-cdk that referenced this issue Jul 7, 2023
lpizzinidev added a commit to lpizzinidev/aws-cdk that referenced this issue Jul 8, 2023
mergify bot added a commit to lpizzinidev/aws-cdk that referenced this issue Jul 17, 2023
lpizzinidev added a commit to lpizzinidev/aws-cdk that referenced this issue Jul 17, 2023
@mergify mergify bot closed this as completed in #26082 Jul 17, 2023
mergify bot pushed a commit that referenced this issue Jul 17, 2023
…e flag) (#26082)

This fix adds support for the [`MultiAZWithStandbyEnabled`](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-opensearchservice-domain-clusterconfig.html#:~:text=%3A%20No%20interruption-,MultiAZWithStandbyEnabled,Update%20requires%3A%20No%20interruption,-WarmCount) flag to the [`CapacityConfig`](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_opensearchservice.CapacityConfig.html) interface.

If enabled, the `ENABLE_OPENSEARCH_MULTIAZ_WITH_STANDBY` feature flag sets the default value of `MultiAZWithStandbyEnabled` to `true`.

Closes #26026.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
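
The default described in the commit message above can be pictured as a simple fallback: an explicitly set prop wins, otherwise the feature flag decides. The sketch below assumes that precedence; `resolveMultiAzWithStandby` is a hypothetical helper written for illustration, not the code from #26082.

```typescript
// Hypothetical helper illustrating the assumed precedence:
// an explicit prop value overrides the feature-flag default.
function resolveMultiAzWithStandby(
  explicitValue: boolean | undefined,
  featureFlagEnabled: boolean,
): boolean {
  // `??` falls back only when the prop was left undefined,
  // so an explicit `false` is respected even if the flag is on.
  return explicitValue ?? featureFlagEnabled;
}

// Prop omitted, flag on: standby is enabled by default.
const byDefault = resolveMultiAzWithStandby(undefined, true);

// Prop explicitly false, flag on: the explicit value wins.
const optedOut = resolveMultiAzWithStandby(false, true);
```

Using `??` rather than `||` matters here, because `||` would silently turn an explicit `false` back into the flag's value.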

bmoffatt pushed a commit to bmoffatt/aws-cdk that referenced this issue Jul 29, 2023
@tmokmss
Contributor

tmokmss commented Mar 25, 2024

@AmanRajAWS
Hi, I always get the following error with multiAZWithStandbyEnabled: true, but I think the Auto-Tune feature cannot be controlled from CloudFormation for now. Should multiAZWithStandbyEnabled be true by default then? I'm not sure how this option can be used.

Resource handler returned message: "Invalid request provided: You must turn on Auto-Tune for domains with standby. 
(Service: OpenSearch, Status Code: 400, Request ID: eac0100d-5014-43b3-8af0-84e2d0e10197)" (RequestToken: 51cb5e17-12ae-508e-0eb8-de5a14a75639, HandlerErrorCode: InvalidRequest)

@Aman199825

@tmokmss AutoTuneOptions should be enabled by default on the cluster. Can you please share the CDK construct or CFN template where you are facing this issue?

@tmokmss
Contributor

tmokmss commented Mar 25, 2024

@Aman199825 Thanks. I also found the doc explaining that:

OpenSearch Service enables Auto-Tune by default on new domains.
https://docs.aws.amazon.com/opensearch-service/latest/developerguide/auto-tune.html#auto-tune-enable

So I checked my configuration, and it turned out that I'm using t3.medium.search as the instance type, which does not support the Multi-AZ with standby feature:

Multi-AZ with Standby only works with the m5, c5, r5, r6g, c6g, m6g, r6gd and i3 instance types.
https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-multiaz.html

I think this is the reason why I'm seeing the error. I also think CDK should validate this limitation, so I'll submit a PR or issue for this.

code to reproduce:

import * as cdk from 'aws-cdk-lib';
import { Domain, EngineVersion } from 'aws-cdk-lib/aws-opensearchservice';
import { EbsDeviceVolumeType, Vpc } from 'aws-cdk-lib/aws-ec2';

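// Added for completeness: the original snippet used `app` without defining it.
const app = new cdk.App();
const stack = new cdk.Stack(app, 'OpenSearchAutoTuneReproduce', {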
  env: {
    region: 'ap-northeast-1',
  },
});

const vpc = new Vpc(stack, 'Vpc', {natGateways: 1});
const targetSubnets = vpc.privateSubnets;

new Domain(stack, 'Domain', {
  version: EngineVersion.OPENSEARCH_2_11,
  capacity: {
    dataNodeInstanceType: 't3.medium.search',
    dataNodes: targetSubnets.length,
    multiAzWithStandbyEnabled: true, // not usable with t3
  },
  zoneAwareness: {
    enabled: true,
    availabilityZoneCount: targetSubnets.length,
  },
  ebs: {
    volumeSize: 30,
    volumeType: EbsDeviceVolumeType.GP3,
    throughput: 125,
    iops: 3000,
  },
  enforceHttps: true,
  fineGrainedAccessControl: {
    masterUserName: 'admin',
  },
  nodeToNodeEncryption: true,
  encryptionAtRest: {
    enabled: true,
  },
  vpc,
  vpcSubnets: [{ subnets: targetSubnets }],
  logging: {
    auditLogEnabled: true,
    slowSearchLogEnabled: true,
    appLogEnabled: true,
    slowIndexLogEnabled: true,
  },
  removalPolicy: cdk.RemovalPolicy.DESTROY,
});

@tmokmss
Contributor

tmokmss commented Mar 25, 2024

@Aman199825 Do you know if the OR1 and Im4gn instance types support the Multi-AZ with standby feature?

https://docs.aws.amazon.com/opensearch-service/latest/developerguide/supported-instance-types.html

They're not listed in this doc, but I suspect the doc is wrong because these instance types look new and there's no reason not to support the feature.

mergify bot pushed a commit that referenced this issue Mar 28, 2024
…with standby feature (#29607)

### Issue # (if applicable)
Related with #26026

### Reason for this change

#26082 enabled Multi-AZ with Standby by default, but deployment fails if we use a t3 instance type, because it does not support the feature. To fail fast, this PR adds validation at synth time.

> Multi-AZ with Standby only works with the m5, c5, r5, r6g, c6g, m6g, r6gd and i3 instance types.
> https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-multiaz.html

> You can use T3 instance types only if your domain is provisioned without standby.
> https://docs.aws.amazon.com/opensearch-service/latest/developerguide/supported-instance-types.html#latest-gen

### Description of changes

Throws an error if the instance type of the data nodes or master nodes is t3.

I also considered automatically setting `multiAzWithStandbyEnabled: false` when any t3 instance type is detected, but that would introduce unwanted behavior, e.g. in the case below:

```ts
// Initial state
// multiAzWithStandbyEnabled: true as there's no t3 instance type
new Domain(stack, 'Domain', {
  version: engineVersion,
  capacity: {
    dataNodeInstanceType: 'r5.large.search',
  },
});

// Update domain to add master nodes with t3 instance type
new Domain(stack, 'Domain', {
  version: engineVersion,
  capacity: {
    dataNodeInstanceType: 'r5.large.search',
    masterNodeInstanceType: 't3.medium.search',
    masterNodes: 3,
  },
});

// multiAzWithStandbyEnabled suddenly becomes false!
```

so we just throw an error.
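
A synth-time check of the kind described above might look like the sketch below. It is an illustration only: `CapacitySketch`, `isT3`, and `validateStandbyCapacity` are hypothetical names, and the error text is an assumption rather than the message the PR actually emits.

```typescript
// Hypothetical subset of the capacity props relevant to this check.
interface CapacitySketch {
  dataNodeInstanceType?: string;
  masterNodeInstanceType?: string;
  multiAzWithStandbyEnabled?: boolean;
}

// True when the instance type belongs to the t3 family, e.g. 't3.medium.search'.
function isT3(instanceType: string | undefined): boolean {
  return instanceType !== undefined && instanceType.toLowerCase().startsWith('t3.');
}

// Fail fast instead of silently flipping the flag: throw when standby
// is enabled and either node role uses a t3 instance type.
function validateStandbyCapacity(capacity: CapacitySketch): void {
  if (!capacity.multiAzWithStandbyEnabled) {
    return; // nothing to validate when standby is off or unset
  }
  const offending = [capacity.dataNodeInstanceType, capacity.masterNodeInstanceType].filter(isT3);
  if (offending.length > 0) {
    throw new Error(
      `T3 instance types do not support Multi-AZ with standby (found: ${offending.join(', ')}). ` +
        'Use a supported instance type or set multiAzWithStandbyEnabled to false.',
    );
  }
}
```

Throwing keeps the resulting configuration explicit: the user decides whether to change the instance type or disable standby, rather than having the flag change as a side effect of an unrelated edit.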

### Description of how you validated changes

Added some unit tests.

I also confirmed that deployment fails when we deploy with a t3 instance type and `multiAzWithStandbyEnabled: true`, for both data nodes and master nodes.

### Checklist
- [X] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*