feat: Add azure terraform #1454

Merged: 6 commits, May 20, 2019

@serbrech (Contributor) commented Mar 22, 2019

This PR adds an initial Terraform implementation to bootstrap OpenShift with the installer.
The VM image is set to CoreOS until the right one is available in Azure. This makes the Ignition script fail, but it was enough to validate the setup.

Included :

  • Bootstrap machine
    • Ignition setup
  • Network
    • flat topology
    • single subnet
    • network security group with basic rules
    • Internal load balancers for worker <-> master and bootstrap <-> master communication
    • external load balancer for kubernetes api
    • public ip endpoint for external LB
    • route table
  • DNS
    • private/public DNS zone to satisfy bootstrap assumptions on urls
  • Masters
    • dynamic master count
    • masters use availability sets

All deployed machines have boot diagnostics enabled, for serial console access during the boot sequence and Ignition.

Notes for reviewers:

A follow-up PR will enable the Azure platform in the installer's Go code.

@openshift-ci-robot
Contributor

Hi @serbrech. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 22, 2019
@enxebre
Member

enxebre commented Mar 25, 2019

/ok-to-test

@openshift-ci-robot openshift-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 25, 2019
@trown

trown commented May 16, 2019

/retest

@serbrech
Contributor Author

DEBUG Apply complete! Resources: 66 added, 0 changed, 0 destroyed. 
DEBUG                                              
DEBUG The state of your infrastructure has been saved to the path 
DEBUG below. This state is required to modify and destroy your 
DEBUG infrastructure, so keep it safe. To inspect the complete state 
DEBUG use the `terraform show` command.            
DEBUG                                              
DEBUG State path: /var/folders/1k/zrqhhs855knc2fjdkz79tk_00000gn/T/openshift-install-438299141/terraform.tfstate 
DEBUG OpenShift Installer unreleased-master-975-gd79d19ad52e64b0e13be9ded31f4d12554738873-dirty 
DEBUG Built from commit d79d19ad52e64b0e13be9ded31f4d12554738873 
INFO Waiting up to 30m0s for the Kubernetes API at https://api.os4r.***********:6443... 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: Get https://api.os4r.**********:6443/version?timeout=32s: dial tcp 52.224.200.36:6443: connect: connection refused 
INFO API v1.13.4+7e86a1a up                       
INFO Waiting up to 30m0s for bootstrapping to complete... 
...
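
The wait phase in the log above (a fixed deadline, with a "Still waiting" line after each failed probe) can be sketched roughly as follows. This is a minimal Python illustration of that retry pattern, not the installer's Go code: `wait_for`, `probe`, and every parameter name are hypothetical, and the clock, sleep, and log functions are injectable so the loop can be exercised without a real cluster.

```python
import time
from typing import Callable, Optional


def wait_for(
    probe: Callable[[], Optional[str]],
    timeout_s: float,
    interval_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    log: Callable[[str], None] = print,
) -> str:
    """Call probe() until it returns a version string or the deadline passes.

    probe() is a hypothetical callback: it returns the server version on
    success, returns None if the API is not ready yet, or raises an
    exception (for example on "connection refused").
    """
    deadline = clock() + timeout_s
    while True:
        try:
            version = probe()
            if version is not None:
                return version
            reason = "not ready"
        except Exception as err:  # every probe failure is treated as retryable
            reason = str(err)
        if clock() >= deadline:
            raise TimeoutError(f"gave up after {timeout_s}s: {reason}")
        log(f"Still waiting for the Kubernetes API: {reason}")
        sleep(interval_s)
```

Treating every probe error as retryable mirrors what the log shows, where both "could not find the requested resource" and "connection refused" are followed by another attempt rather than an abort.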

@abhinavdahiya
Contributor

/test tf-fmt

@abhinavdahiya
Contributor

#1739 is now merged.

The templates need to be moved to the 0.12 specification before these can be merged.

@abhinavdahiya
Contributor

#1739 is now merged.

The templates need to be moved to the 0.12 specification before these can be merged.
337621b

Moving to 0.12 involves more than fmt changes: there are also language specification changes, so the templates need to be upgraded with `terraform 0.12upgrade`.

@serbrech serbrech force-pushed the azure-terraform branch 3 times, most recently from 306e533 to 7b4f1b6 Compare May 17, 2019 18:18
add terraform to create a cluster on azure
resource names match expectations from capz[1]

[1] cluster-api-provider-azure : https://github.com/openshift/cluster-api-provider-azure/blob/master/pkg/cloud/azure/defaults.go
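
As a rough illustration of the naming scheme: the destroy logs later in this thread show a resource group, a bootstrap VM, and a bootstrap NIC that all share one cluster-specific prefix (`foo5-mfldd`). The Python sketch below reproduces only those three observed names. The function name and the `infra_id` parameter are hypothetical, and the authoritative list of names is the capz defaults file linked above, not this snippet.

```python
from typing import Dict


def bootstrap_resource_names(infra_id: str) -> Dict[str, str]:
    """Derive the three resource names visible in the destroy logs.

    Only these suffixes are confirmed by the logs in this thread; any
    other resource name would be a guess and is deliberately left out.
    """
    return {
        "resource_group": f"{infra_id}-rg",
        "bootstrap_vm": f"{infra_id}-bootstrap",
        "bootstrap_nic": f"{infra_id}-bootstrap-nic",
    }
```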
As for other providers, we rely on the MachineProviderSpec to
feed the terraform variables generation.
One exception is the basedomain resource group, which is passed
through via the Azure platform installConfig.
It is needed to reliably identify the public dns zone to attach the
cluster to.

this can either be reverted later
or disabled and put behind a feature/option flag

this uses the public load balancer ip behind the internal
api dns entries to work around the azure internal lb
limitation described here :
https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-overview#limitations
@serbrech
Contributor Author

Apply completion + openshift bootstrap :

DEBUG Apply complete! Resources: 66 added, 0 changed, 0 destroyed. 
DEBUG OpenShift Installer unreleased-master-1025-ge73a0cc7c12832d9d57789b8366a744e310b161c-dirty 
DEBUG Built from commit e73a0cc7c12832d9d57789b8366a744e310b161c 
INFO Waiting up to 30m0s for the Kubernetes API at https://api.foo5.sanecoder.com:6443... 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource 
DEBUG Still waiting for the Kubernetes API: Get https://api.foo5.sanecoder.com:6443/version?timeout=32s: dial tcp 52.191.220.114:6443: connect: operation timed out 
INFO API v1.13.4+cf89496 up                       
INFO Waiting up to 30m0s for bootstrapping to complete... 
DEBUG Bootstrap status: complete                   
INFO Destroying the bootstrap resources...

Bootstrap destroy logs :

DEBUG module.bootstrap.azurerm_network_interface_nat_rule_association.bootstrap_ssh: Still destroying... [id=/subscriptions/c1089427-83d3-4286-9f35-...public-lb/inboundNatRules/SSHBootstrap, 1m0s elapsed] 
DEBUG module.bootstrap.azurerm_network_interface_backend_address_pool_association.internal_lb_bootstrap: Destruction complete after 1m3s 
DEBUG module.bootstrap.azurerm_network_interface_backend_address_pool_association.public_lb_bootstrap: Destruction complete after 1m4s 
DEBUG module.bootstrap.azurerm_network_interface_nat_rule_association.bootstrap_ssh: Destruction complete after 1m5s 
DEBUG module.bootstrap.azurerm_virtual_machine.bootstrap: Still destroying... [id=/subscriptions/c1089427-83d3-4286-9f35-...e/virtualMachines/foo5-mfldd-bootstrap, 1m10s elapsed] 
DEBUG module.bootstrap.azurerm_virtual_machine.bootstrap: Still destroying... [id=/subscriptions/c1089427-83d3-4286-9f35-...e/virtualMachines/foo5-mfldd-bootstrap, 1m20s elapsed] 
DEBUG module.bootstrap.azurerm_virtual_machine.bootstrap: Still destroying... [id=/subscriptions/c1089427-83d3-4286-9f35-...e/virtualMachines/foo5-mfldd-bootstrap, 1m30s elapsed] 
DEBUG module.bootstrap.azurerm_virtual_machine.bootstrap: Destruction complete after 1m32s 
DEBUG module.bootstrap.azurerm_network_interface.bootstrap: Destroying... [id=/subscriptions/c1089427-83d3-4286-9f35-5af546a6eb67/resourceGroups/foo5-mfldd-rg/providers/Microsoft.Network/networkInterfaces/foo5-mfldd-bootstrap-nic] 
DEBUG module.bootstrap.azurerm_network_interface.bootstrap: Still destroying... [id=/subscriptions/c1089427-83d3-4286-9f35-...orkInterfaces/foo5-mfldd-bootstrap-nic, 10s elapsed] 
DEBUG module.bootstrap.azurerm_network_interface.bootstrap: Destruction complete after 11s 
DEBUG                                              
DEBUG Destroy complete! Resources: 9 destroyed.    
INFO Waiting up to 30m0s for the cluster at https://api.foo5.sanecoder.com:6443 to initialize... 

Some errors are still preventing success, I think due in part to the CredentialRequest missing for Azure:

DEBUG Still waiting for the cluster to initialize: Multiple errors are preventing progress:
* Cluster operator authentication is still updating: missing version information for oauth-openshift
* Cluster operator console has not yet reported success
* Cluster operator image-registry is still updating
* Cluster operator ingress has not yet reported success
* Cluster operator monitoring is still updating
* Cluster operator openshift-samples is still updating
* Could not update servicemonitor "openshift-apiserver-operator/openshift-apiserver-operator" (346 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-authentication-operator/authentication-operator" (321 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-controller-manager-operator/openshift-controller-manager-operator" (349 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-image-registry/image-registry" (327 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-kube-apiserver-operator/kube-apiserver-operator" (337 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-kube-controller-manager-operator/kube-controller-manager-operator" (340 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-kube-scheduler-operator/kube-scheduler-operator" (343 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-operator-lifecycle-manager/olm-operator" (267 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-service-catalog-apiserver-operator/openshift-service-catalog-apiserver-operator" (330 of 350): the server does not recognize this resource, check extension API servers
* Could not update servicemonitor "openshift-service-catalog-controller-manager-operator/openshift-service-catalog-controller-manager-operator" (333 of 350): the server does not recognize this resource, check extension API servers
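
A status message this long is easier to scan when the bullet lines are grouped by kind. Here is a minimal Python sketch, assuming the exact message format shown in the log above. The `summarize` helper and its regular expressions are illustrative only and are not part of the installer.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Patterns for the two kinds of bullet lines seen in the log above.
OPERATOR_RE = re.compile(r"^\* Cluster operator (\S+) ")
MONITOR_RE = re.compile(r'^\* Could not update servicemonitor "([^"]+)"')


def summarize(message: str) -> Dict[str, List[str]]:
    """Group pending cluster operators and failed servicemonitor updates."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in message.splitlines():
        line = line.strip()
        match = OPERATOR_RE.match(line)
        if match:
            groups["operators"].append(match.group(1))
            continue
        match = MONITOR_RE.match(line)
        if match:
            groups["servicemonitors"].append(match.group(1))
    return dict(groups)
```

Applied to the message above, this would separate the six pending cluster operators from the servicemonitor updates, which all fail with the same "server does not recognize this resource" error.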

@abhinavdahiya
Contributor

/retest

#1454 (comment)

/lgtm
we can iterate on this.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 20, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, serbrech

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 20, 2019
@openshift-merge-robot openshift-merge-robot merged commit 0c2b1a1 into openshift:master May 20, 2019
@openshift-ci-robot
Contributor

@serbrech: The following test failed, say /retest to rerun them all:

Test name | Commit | Details | Rerun command
ci/prow/e2e-aws-scaleup-rhel7 | bc1cb28 | link | /test e2e-aws-scaleup-rhel7

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


wking added a commit to wking/openshift-installer that referenced this pull request Oct 18, 2019
These were added without a consumer in 2c40ae8 (data/data/azure:
add azure terraform, 2019-04-19, openshift#1454).
alaypatel07 pushed a commit to alaypatel07/installer that referenced this pull request Nov 13, 2019
These were added without a consumer in 2c40ae8 (data/data/azure:
add azure terraform, 2019-04-19, openshift#1454).
jhixson74 pushed a commit to jhixson74/installer that referenced this pull request Dec 6, 2019
These were added without a consumer in 2c40ae8 (data/data/azure:
add azure terraform, 2019-04-19, openshift#1454).
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • lgtm: Indicates that a PR is ready to be merged.
  • ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.

9 participants