SPLAT-1220: Assign Carrier IP to launch EC2 in AWS Wavelength Zones #78

Merged
merged 4 commits into from Dec 2, 2023

@mtulio mtulio commented Jul 25, 2023

This PR introduces support for Carrier IP address assignment when the PublicIP flag is set to true in the Machine configuration.

When PublicIP is set on a Machine that launches in a subnet whose zone type is wavelength-zone, the RunInstances call must set AssociateCarrierIpAddress in the network interface configuration.

This change is part of the full support for AWS Wavelength on OCP as part of 'edge nodes'. It is required only when launching instances in public subnets in Wavelength Zones.

Research: https://issues.redhat.com/browse/SPLAT-1045
Tracking card: https://issues.redhat.com/browse/SPLAT-1220
Installer full IPI support for Wavelength Zones: openshift/installer#7369
Enhancement: openshift/enhancements#1510
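
The rule in the description can be sketched as a small shell function. This is an illustration only, not the provider's Go implementation: the function name `nic_ip_attribute`, its string arguments, and the `none` return value are assumptions made for the example. The zone type values are the ones EC2 reports in the `ZoneType` field of `describe-availability-zones`.

```shell
#!/bin/sh
# Hypothetical helper (not the provider's code): prints which EC2
# network-interface attribute would be set to true for a Machine.
#   $1: value of the Machine's publicIp field ("true" or anything else)
#   $2: ZoneType of the zone that hosts the subnet
nic_ip_attribute() {
  # No public address requested: neither attribute is set.
  if [ "$1" != "true" ]; then
    echo "none"
    return 0
  fi

  case "$2" in
    # Wavelength Zone subnets route through a Carrier Gateway.
    wavelength-zone) echo "AssociateCarrierIpAddress" ;;
    # Availability Zone and Local Zone subnets use the internet gateway.
    *) echo "AssociatePublicIpAddress" ;;
  esac
}

nic_ip_attribute true wavelength-zone     # prints AssociateCarrierIpAddress
nic_ip_attribute true availability-zone   # prints AssociatePublicIpAddress
nic_ip_attribute false wavelength-zone    # prints none
```

Keeping the zone type as an explicit input makes the Wavelength case a single branch, so the regular public IP flow stays untouched for every other zone type.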

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jul 25, 2023

openshift-ci bot commented Jul 25, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


mtulio commented Jul 25, 2023

Opening a draft to get CI signals.
/test all


mtulio commented Jul 26, 2023

A Machine was deployed successfully in a Wavelength Zone, using the edge compute pool in a public subnet, with publicIp set to true in the MachineSet. The network resources (VPC+ and Carrier Gateway) were created by the installer (openshift/installer#7369):

  • oc get machineset lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1:
...
      providerSpec:
        value:
          ami:
            id: ami-0aca3b90a763e77e8
          apiVersion: machine.openshift.io/v1beta1
          blockDevices:
          - ebs:
              encrypted: true
              iops: 0
              kmsKey:
                arn: ""
              volumeSize: 120
              volumeType: gp2
          credentialsSecret:
            name: aws-cloud-credentials
          deviceIndex: 0
          iamInstanceProfile:
            id: lz-p2-319-vgzrq-worker-profile
          instanceType: r5.2xlarge
          kind: AWSMachineProviderConfig
          metadata:
            creationTimestamp: null
          metadataServiceOptions: {}
          placement:
            availabilityZone: us-east-1-wl1-dfw-wlz-1
            region: us-east-1
          publicIp: true
          securityGroups:
          - filters:
            - name: tag:Name
              values:
              - lz-p2-319-vgzrq-worker-sg
          subnet:
            filters:
            - name: tag:Name
              values:
              - lz-p2-319-vgzrq-public-us-east-1-wl1-dfw-wlz-1
          tags:
          - name: kubernetes.io/cluster/lz-p2-319-vgzrq
            value: owned
          userDataSecret:
            name: worker-user-data
      taints:
      - effect: NoSchedule
        key: node-role.kubernetes.io/edge
  • watching machine creation
$ oc get machines -w
NAME                                                  PHASE          TYPE         REGION      ZONE                      AGE
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Provisioning   r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   4s
[...]
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Provisioning   r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   33s
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Provisioning   r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   33s
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Provisioned    r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   33s
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Provisioned    r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   34s

lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Provisioned   r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   18m
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Running       r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   18m
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Running       r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   19m

$ oc get machines -o wide
NAME                                                  PHASE     TYPE         REGION      ZONE                      AGE    NODE                          PROVIDERID                                           STATE
lz-p2-319-public-edge-us-east-1-wl1-dfw-wlz-1-bmpg7   Running   r5.2xlarge   us-east-1   us-east-1-wl1-dfw-wlz-1   27m    ip-10-0-30-131.ec2.internal   aws:///us-east-1-wl1-dfw-wlz-1/i-073f1c305c3093f13   running

$ oc get node ip-10-0-30-131.ec2.internal
NAME                          STATUS   ROLES         AGE     VERSION
ip-10-0-30-131.ec2.internal   Ready    edge,worker   9m25s   v1.27.3+e8b13aa

Note: the installer creates the nodes automatically in private subnets, which is tested by the installer PR. This test covers Day-2 scenarios in which users choose to use public subnets.

@mtulio mtulio changed the title WIP|Spike: Supporting AWS Wavelength Zones with Carrier IP assignment SPLAT-1045: spike: Supporting AWS Wavelength Zones with Carrier IP assignment Jul 26, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 26, 2023

openshift-ci-robot commented Jul 26, 2023

@mtulio: This pull request references SPLAT-1045 which is a valid jira issue.

In response to this:

https://issues.redhat.com/browse/SPLAT-1045
openshift/installer#7369

/hold

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 25, 2023
@mtulio mtulio changed the title SPLAT-1045: spike: Supporting AWS Wavelength Zones with Carrier IP assignment SPLAT-1218: spike: Supporting AWS Wavelength Zones with Carrier IP assignment Oct 27, 2023

openshift-ci-robot commented Oct 27, 2023

@mtulio: This pull request references SPLAT-1218 which is a valid jira issue.


@mtulio mtulio changed the title SPLAT-1218: spike: Supporting AWS Wavelength Zones with Carrier IP assignment SPLAT-1220: spike: Supporting AWS Wavelength Zones with Carrier IP assignment Oct 27, 2023

openshift-ci-robot commented Oct 27, 2023

@mtulio: This pull request references SPLAT-1220 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.


@mtulio mtulio force-pushed the aws-wavelength-zones branch 2 times, most recently from c5512c5 to 9525ca6 Compare November 8, 2023 02:02

mtulio commented Nov 8, 2023

/test unit

2 similar comments

mtulio commented Nov 8, 2023

/test unit


mtulio commented Nov 8, 2023

/test unit


mtulio commented Nov 8, 2023

/payload-job periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-localzones


openshift-ci bot commented Nov 8, 2023

@mtulio: trigger 1 job(s) for the /payload-(job|aggregate) command

  • periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-localzones

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6534be10-7dfc-11ee-82c3-fe49e6b8aa0d-0

@mtulio mtulio changed the title SPLAT-1220: spike: Supporting AWS Wavelength Zones with Carrier IP assignment SPLAT-1220: Supporting AWS Wavelength Zones with Carrier IP assignment Nov 8, 2023
@mtulio mtulio changed the title SPLAT-1220: Supporting AWS Wavelength Zones with Carrier IP assignment SPLAT-1220: Assign Carrier IP when launching nodes in AWS Wavelength Zones Nov 8, 2023
@mtulio mtulio changed the title SPLAT-1220: Assign Carrier IP when launching nodes in AWS Wavelength Zones SPLAT-1220: Assign Carrier IP to launch machine in AWS Wavelength Zones Nov 8, 2023

mtulio commented Nov 8, 2023

/test unit

@mtulio mtulio force-pushed the aws-wavelength-zones branch 2 times, most recently from 39bab59 to 22dba3d Compare November 9, 2023 02:51

mtulio commented Nov 9, 2023

/test unit


mtulio commented Nov 9, 2023

/test ?


openshift-ci-robot commented Nov 9, 2023

@mtulio: This pull request references SPLAT-1220 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.



mtulio commented Nov 9, 2023

Hey @JoelSpeed @elmiko - I am working on AWS Wavelength Zones support. This change is required so that machines launched in public subnets in those zones are assigned carrier public IP addresses.

Would you mind taking a look?

Back in July I described how to test this. I will keep the /hold label until: 1) I provide the step-by-step instructions, and the results, for installing a cluster with support for Wavelength Zone subnets, replacing the MAPI image, and launching a new machine in the public subnet; 2) we reach consensus on the change in the enhancement.

Looking forward to hearing from you.

/hold


mtulio commented Nov 10, 2023

Steps to install a cluster with a custom release built from the installer and MAPI changes, patching the MachineSet manifest so that the instance is created in the public subnet (created by the installer).

  • create a release image:

cluster-bot:

build 4.15.0-ec.1,openshift/installer#7369,openshift/machine-api-provider-aws#78
  • extract the installer binary from the built release image (the image is reported by the job that cluster-bot provides):
oc adm release extract -a ~/.openshift/pull-secret-latest.json --tools registry.build05.ci.openshift.org/ci-ln-8jsw5l2/release:latest

tar xfz openshift-install-*.tar.gz

wget -O yq "https://github.com/mikefarah/yq/releases/download/v4.34.1/yq_linux_amd64"
chmod u+x yq
  • create install-config
CLUSTER_NAME=aws-a415wlzmapi01
cat <<EOF > ./install-config.yaml
apiVersion: v1
metadata:
  name: $CLUSTER_NAME
publish: External
pullSecret: '$(cat ~/.openshift/pull-secret-latest.json)'
sshKey: |
  $(cat ~/.ssh/id_rsa.pub)
baseDomain: devcluster.openshift.com
platform:
  aws:
    region: us-east-1
compute:
- name: edge
  platform:
    aws:
      zones:
      - us-east-1-wl1-nyc-wlz-1
EOF
  • create manifest
./openshift-install create manifests
  • patch manifest
MACHINE_SET_MANIFEST=./openshift/99_openshift-cluster-api_worker-machineset-0.yaml
SUBNET_NAME=$(./yq eval '.spec.template.spec.providerSpec.value.subnet.filters[0].values[0]' "${MACHINE_SET_MANIFEST}" | sed 's/private/public/')

cat <<EOF > ./machineset-patch.yaml
spec:
  template:
    spec:
      providerSpec:
        value:
          publicIp: true
          subnet:
            filters:
              - name: tag:Name
                values:
                  - $SUBNET_NAME
EOF

./yq eval-all '. as $item ireduce ({}; . * $item)' "${MACHINE_SET_MANIFEST}" ./machineset-patch.yaml > machineset-new.yaml

cp "${MACHINE_SET_MANIFEST}" machineset-current.yaml
cp  machineset-new.yaml "${MACHINE_SET_MANIFEST}"
  • create cluster
./openshift-install create cluster --log-level debug
  • Check the results
$ oc get machineset -n openshift-machine-api | grep nyc
aws-a415wlzmapi01-q92ft-edge-us-east-1-wl1-nyc-wlz-1   1         1         1       1           52m


$ oc get machines -n openshift-machine-api | grep nyc
aws-a415wlzmapi01-q92ft-edge-us-east-1-wl1-nyc-wlz-1-jg4wq   Running   r5.2xlarge   us-east-1   us-east-1-wl1-nyc-wlz-1   47m

MACHINE_NAME=$(oc get machines -n openshift-machine-api | grep nyc | awk '{print$1}')
INSTANCE_ID=$(oc get machines -n openshift-machine-api $MACHINE_NAME -o json | jq -r .status.providerStatus.instanceId)

$ oc get nodes -l node-role.kubernetes.io/edge -o json | jq '.items[0].status.addresses[] | select(.type=="ExternalDNS")'
{
  "address": "ec2-155-146-73-121.compute-1.amazonaws.com",
  "type": "ExternalDNS"
}

$ aws ec2 describe-instances --region us-east-1 --instance-ids $INSTANCE_ID  | jq '.Reservations[].Instances[].NetworkInterfaces[].Association'
{
  "CarrierIp": "155.146.73.121",
  "IpOwnerId": "amazon",
  "PublicDnsName": "ec2-155-146-73-121.compute-1.amazonaws.com"
}
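
The two outputs above can be cross-checked mechanically: the node's ExternalDNS name embeds the same address that EC2 reports as CarrierIp. The following is a minimal sketch of that check, assuming the public DNS name has the form `ec2-A-B-C-D.<suffix>` as in the output above. The function names `dns_to_ip` and `matches_carrier_ip` are hypothetical and are not part of any tool used in this PR.

```shell
#!/bin/sh
# Hypothetical helper: derive the IPv4 address embedded in an EC2 public
# DNS name of the form ec2-A-B-C-D.<suffix>.
dns_to_ip() {
  host="${1%%.*}"       # keep the first label, e.g. ec2-155-146-73-121
  host="${host#ec2-}"   # drop the "ec2-" prefix
  printf '%s\n' "$host" | tr '-' '.'   # dashes become dots
}

# Succeeds when the DNS name ($1) and the carrier IP ($2) refer to the
# same address.
matches_carrier_ip() {
  [ "$(dns_to_ip "$1")" = "$2" ]
}

if matches_carrier_ip "ec2-155-146-73-121.compute-1.amazonaws.com" "155.146.73.121"; then
  echo "ExternalDNS matches CarrierIp"
fi
```

In a live check, the first argument would come from the node's ExternalDNS address and the second from the CarrierIp field of the instance's network interface association.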


mtulio commented Nov 10, 2023

/test e2e-aws


mtulio commented Nov 11, 2023

/assign @JoelSpeed
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 11, 2023
@RadekManak

/approve


openshift-ci bot commented Nov 20, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RadekManak


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 20, 2023

elmiko commented Nov 20, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 20, 2023

mtulio commented Nov 22, 2023

Thanks @elmiko and @RadekManak .

I will hold it until we have signals from the installer PR and a QE review.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 22, 2023

mtulio commented Nov 22, 2023

Hi Yunfei - could you please take a look at this card as a follow-up of SPLAT-1237?
/assign @yunjiang29


mtulio commented Nov 22, 2023

Reinforcing the tests: I created a new release with the installer PR (openshift/installer#7369) and this one, then patched the MachineSet manifest to deploy a machine in a regular zone, to check the regular flow of assigning a public IP.

Result: the instance was provisioned and a public IP address was assigned through the regular flow (internet gateway / public subnet).

Steps (adapted from the original comment):

oc adm release extract -a ~/.openshift/pull-secret-latest.json --tools registry.build05.ci.openshift.org/ci-ln-7mgy9hb/release:latest

tar xfz openshift-install-*.tar.gz

wget -O yq "https://github.com/mikefarah/yq/releases/download/v4.34.1/yq_linux_amd64"
chmod u+x yq

CLUSTER_NAME=aws-a415wlzmapi02
cat <<EOF > ./install-config.yaml
apiVersion: v1
metadata:
  name: $CLUSTER_NAME
publish: External
pullSecret: '$(cat ~/.openshift/pull-secret-latest.json)'
sshKey: |
  $(cat ~/.ssh/id_rsa.pub)
baseDomain: devcluster.openshift.com
platform:
  aws:
    region: us-east-1
EOF

./openshift-install create manifests

MACHINE_SET_MANIFEST=./openshift/99_openshift-cluster-api_worker-machineset-0.yaml
SUBNET_NAME=$(./yq eval '.spec.template.spec.providerSpec.value.subnet.filters[0].values[0]' "${MACHINE_SET_MANIFEST}" | sed 's/private/public/')


cat <<EOF > ./machineset-patch.yaml
spec:
  replicas: 1
  template:
    spec:
      providerSpec:
        value:
          publicIp: true
          subnet:
            filters:
              - name: tag:Name
                values:
                  - $SUBNET_NAME
EOF

./yq eval-all '. as $item ireduce ({}; . * $item)' "${MACHINE_SET_MANIFEST}" ./machineset-patch.yaml > machineset-new.yaml

cp "${MACHINE_SET_MANIFEST}" machineset-current.yaml
cp  machineset-new.yaml "${MACHINE_SET_MANIFEST}"

./openshift-install create cluster --log-level debug
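
The `sed 's/private/public/'` rewrite in the steps above relies on the installer-generated subnet name containing `private`. If it does not, sed leaves the name unchanged and the machine would silently stay in the original subnet. A guarded variant could look like the sketch below. The function name `to_public_subnet_name` and the sample subnet name are assumptions for illustration; the sample is modeled on the public subnet name shown in the July test.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a private subnet name to its public
# counterpart, failing loudly when the name has no "private" segment.
to_public_subnet_name() {
  case "$1" in
    *private*)
      # Replace only the first occurrence, as the original sed call does.
      printf '%s\n' "$1" | sed 's/private/public/'
      ;;
    *)
      echo "subnet name has no 'private' segment: $1" >&2
      return 1
      ;;
  esac
}

to_public_subnet_name "lz-p2-319-vgzrq-private-us-east-1-wl1-dfw-wlz-1"
# prints lz-p2-319-vgzrq-public-us-east-1-wl1-dfw-wlz-1
```

The non-zero return lets a calling script stop before the manifest is patched with an unchanged subnet name.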

Screenshot from 2023-11-22 16-34-41

When the installer feature is accepted, we should have a regular flow that deploys machines in public subnets in both regular (availability) zones and Wavelength Zones, to check the public IP assignment for both gateway types.

cc @JoelSpeed @yunjiang29


mtulio commented Dec 1, 2023

/test all


mtulio commented Dec 1, 2023

/test ?


openshift-ci bot commented Dec 1, 2023

@mtulio: The following commands are available to trigger required jobs:

  • /test e2e-aws
  • /test e2e-aws-operator
  • /test e2e-aws-serial
  • /test e2e-aws-upgrade
  • /test goimports
  • /test golint
  • /test govet
  • /test images
  • /test unit

Use /test all to run all jobs.



openshift-ci bot commented Dec 1, 2023

@mtulio: all tests passed!



mtulio commented Dec 2, 2023

@JoelSpeed as per @yunjiang29's comment [1], I am confident setting the risk assessment field.
/hold cancel

[1] https://issues.redhat.com/plugins/servlet/mobile#issue/OCPSTRAT-736

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 2, 2023
@openshift-merge-bot openshift-merge-bot bot merged commit 8ec9d07 into openshift:main Dec 2, 2023
10 checks passed
@mtulio mtulio deleted the aws-wavelength-zones branch December 2, 2023 13:08
@openshift-bot

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-machine-api-provider-aws-container-v4.15.0-202312040732.p0.g8ec9d07.assembly.stream for distgit ose-machine-api-provider-aws.
All builds following this will include this PR.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
7 participants