Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable async service instance provisioning #52

Conversation

samedguener
Copy link
Member

@samedguener samedguener commented Mar 15, 2018

@@ -35,7 +35,9 @@ The following arguments are supported:
* `space` - (Required, String) The ID of the [space](/docs/providers/cloudfoundry/r/space.html)
* `json_params` - (Optional, String) Json string of arbitrary parameters. Some services support providing additional configuration parameters within the provision request
* `tags` - (Optional, List) List of instance tags. Some services provide a list of tags that Cloud Foundry delivers in [VCAP_SERVICES Env variables](https://docs.cloudfoundry.org/devguide/deploy-apps/environment-variable.html#VCAP-SERVICES)

* `timeout` - (Optional, Integer) The timeout in seconds, which shall be waited until service instance is created. Default: 60 seconds
* `async` - (Optional, Bool) Set to `true` if the client allows asynchronous provisioning. Please refer to [this](https://github.com/openservicebrokerapi/servicebroker/blob/v2.12/spec.md#synchronous-and-asynchronous-operations) for more information. Default: false
Copy link
Collaborator

@gberche-orange gberche-orange Mar 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this flag really necessary ?
The provider should systematically set the accepts_incomplete flag in the CC API call and infer from the response status code (202 vs 200) and/or last_operation.state response field whether the service instance completion is sync or async

Copy link

@ghost ghost Mar 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change the PR and according our the UX recommendations as discussed in #51 beginning of next week.

Thanks for the review!

@gberche-orange
Copy link
Collaborator

@samedguener also we probably also need to add async service test coverage into https://github.com/mevansam/terraform-provider-cf/blob/master/cloudfoundry/resource_cf_service_instance_test.go. Potentially, we'd need to deploy a service broker which supports async provisionning, may be https://github.com/cloudfoundry/cf-acceptance-tests/blob/master/assets/service_broker/README.md can be useful there.

@ghost
Copy link

ghost commented Mar 19, 2018

@gberche-orange I have updated the PR according to our needs. Can you assist me on how we could enable async. provisioning in the service broker? I couldn't figure out, how and where the service broker is actually set-up during testing.

Thanks in advance for the clarification!

@gberche-orange
Copy link
Collaborator

gberche-orange commented Mar 19, 2018

@doktorgibson the documentation at https://github.com/cloudfoundry/cf-acceptance-tests/blob/master/assets/service_broker/README.md#the-configuration-file describes how the configuration time allows to specify responses to each endpoint (status code and body). E.g.

   "async-plan-guid": {
      "sleep_seconds": 0,
      "status": 202,
      "body": {
        "last_operation": {
          "state": "in progress"
        }
      }

you might also get some inspiration from the related cf-acceptance-test async test cases at https://github.com/cloudfoundry/cf-acceptance-tests/blob/9b7d6d849f149f46015e63425278c07f6f799e04/services/service_instance_lifecycle.go#L293-L461

@gberche-orange
Copy link
Collaborator

@doktorgibson Fixed the pointer to acceptance test, sorry for copy/paste error.

@gberche-orange
Copy link
Collaborator

@samedguener thanks for the test case contribution! Might be easier to maintain if we submodule the acceptance test repo instead of duplicating/vendoring the app.

…ling status

It is possible that the read API call on an service instance does not yet
reflect the updated operation state of the service instance immediately and
this causes a short circuit in which the status of the previous operation
is read and so the wait method does not correctly pause until the desired
operation is done.  The fix, hopefully, is to always pause for at least the
regular polling cycle wait time before doing the first status check to give
Cloud Foundry enough time to return the correct values from the read API.
Copy link
Collaborator

@gberche-orange gberche-orange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @samedguener and @janosbinder for this contribution. Please review and comments suggestions and questions.

var serviceInstance CCServiceInstance

for {
time.Sleep(appStatePingSleep)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the aws_db_instance resource can provide some inspiration to leverage standard timeout configuration support as well as built-in retries that might avoid blocking sleeps, I ran into this vault retry loop might may provide inspiration ?

In particular leveraging the StateChangeConf helper using the same polling loop pattern than what the aws_db_instance does, leveraging the StateChangeConf helper

See related conversation with samed at #51 (comment)

c <- nil
return
case "failed":
c <- fmt.Errorf("service instance %s crashed", serviceInstanceID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider "service instance with guid=%s async provisionning failed" alternative wording

@@ -505,6 +507,97 @@ func (sm *ServiceManager) ReadServiceInstance(serviceInstanceID string) (service
return serviceInstance, nil
}

// WaitServiceInstanceTo -
func (sm *ServiceManager) WaitServiceInstanceTo(operationType string, serviceInstanceID string) (err error) {
sm.log.UI.Say("Waiting for service instance %s to finish starting ..", terminal.EntityNameColor(serviceInstanceID))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative wording suggestion: "Waiting for service instance %s to complete async provisionning"

var serviceInstance CCServiceInstance

for {
time.Sleep(appStatePingSleep)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, please consider leveraging the StateChangeConf helper using the same polling loop pattern than what the aws_db_instance does

if serviceInstance, err = sm.ReadServiceInstance(serviceInstanceID); err != nil {
// if the service instance is gone the error message should contain 60004
// cf_service_instance.redis: Server error, status code: 404, error code: 60004, message: The service instance could not be found: babababa-d977-4e9c-9bd0-4903d146d822
if strings.Contains(err.Error(), "error code: 60004") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor alternative wording suggestion

// if the service instance is gone the error message should contain error code 60004 ("ServiceInstanceNotFound")

  •   		// e.g. CLI output: cf_service_instance.redis: Server error, status code: 404, error code: 60004, message: The service instance could not be found: babababa-d977-4e9c-9bd0-4903d146d822
    

tests/data.json Outdated
}
}
},
"fake-async-plan-2-guid": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be also cover the case of a service with async provisionning, and sync deprovisionning.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this test broke. The way I managed to fix it to only allowing asynchronous deprovisioning.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. This is a corner case we can probably live with for now. You may record the limitation into a distinct issue ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Limitation has been added #107

@@ -74,7 +74,9 @@ One of the following arguments must be declared to locate application source or

* `service_binding` - (Optional, Array) Service instances to bind to the application.

- `service` - (Required, String) The service instance GUID.
* `service_binding` - (Optional, Array) Service instances to bind to.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicated line ?

@@ -33,8 +33,8 @@ The following arguments are supported:
* `name` - (Required, String) The name of the Service Instance in Cloud Foundry
* `service_plan` - (Required, String) The ID of the [service plan](/docs/providers/cloudfoundry/d/service_plan.html)
* `space` - (Required, String) The ID of the [space](/docs/providers/cloudfoundry/r/space.html)
* `json_params` - (Optional, String) Json string of arbitrary parameters. Some services support providing additional configuration parameters within the provision request. By default, no params are provided.
* `tags` - (Optional, List) List of instance tags. Some services provide a list of tags that Cloud Foundry delivers in [VCAP_SERVICES Env variables](https://docs.cloudfoundry.org/devguide/deploy-apps/environment-variable.html#VCAP-SERVICES). By default, no tags are assigned.
* `json_params` - (Optional, String) Json string of arbitrary parameters. Some services support providing additional configuration parameters within the provision request
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a rebase of changes made into #73 that would otherwise be removed

* `json_params` - (Optional, String) Json string of arbitrary parameters. Some services support providing additional configuration parameters within the provision request. By default, no params are provided.
* `tags` - (Optional, List) List of instance tags. Some services provide a list of tags that Cloud Foundry delivers in [VCAP_SERVICES Env variables](https://docs.cloudfoundry.org/devguide/deploy-apps/environment-variable.html#VCAP-SERVICES). By default, no tags are assigned.
* `json_params` - (Optional, String) Json string of arbitrary parameters. Some services support providing additional configuration parameters within the provision request
* `tags` - (Optional, List) List of instance tags. Some services provide a list of tags that Cloud Foundry delivers in [VCAP_SERVICES Env variables](https://docs.cloudfoundry.org/devguide/deploy-apps/environment-variable.html#VCAP-SERVICES)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a rebase of changes made into #73 that would otherwise be removed

@@ -49,3 +49,9 @@ An existing Service Instance can be imported using its guid, e.g.
```
$ terraform import cf_service.redis a-guid
```

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this specific to the service_instance resource, or global timeouts that would apply on any resource ?

Copy link
Collaborator

@janosbinder janosbinder Jun 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be specific to the service_instance resource, and my recent commit enforces it: 72c9af8 (See the StateChangeConf implementations)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@janosbinder
Copy link
Collaborator

janosbinder commented Jun 15, 2018

@gberche-orange thank you for your valuable comments. I am finished implementing them, namely:

  • Implemented the pulling via StateChangeConf
  • Clarified the error mesages
  • Implemented a test by adding cf-acceptance-tests/service_broker to vendor to test all 3 fake plans, and pushing it against the acceptance environment
  • Implemented the ImportStatePassthrough pattern for imports

There are also a few points that I have not addressed:

  • duplicated lines - which could be resolved by rebasing - maybe we can discuss Jim's PR merge master back into dev #90 which would also solve this issue. - anyway I have merged the latest changes of the dev branch
  • on deletion distinguishing between synchronous and asynchronous deprovisioning via HTTP status code - please see my comments in the review - briefly reciting them - the HTTP status code is shallowed by CF CLI gateway login, and my initial solution broke fake-async-plan test (asynchronous provisioning and synchronous deprovisioning) - therefore I have kept asynchronous deprovisioning logic.

Let me know, whether you are fine with the current implementation.

@gberche-orange
Copy link
Collaborator

thanks @janosbinder for the contributed updates and detailed explanations.

on deletion distinguishing between synchronous and asynchronous deprovisioning via HTTP status code - please see my comments in the review - briefly reciting them - the HTTP status code is shallowed by CF CLI gateway login, and my initial solution broke fake-async-plan test (asynchronous provisioning and synchronous deprovisioning) - therefore I have kept asynchronous deprovisioning logic.

Thanks for the detailed analysis into https://github.com/mevansam/terraform-provider-cf/pull/52/files/eb4a68f701f66750ca7300816f4f3167b006e016#r194781592

Let's submit an issue "r.service_instance delete perform unnecessary sync delete polling" and record the CLI CLI swallowing of the response code as in a our list of limitations of "client friendly CF CLI api"

In order to have the acceptance tests trigger and report tests passing, please resubmit the PR by pushing your branch onto mevansam/terraform-provider-cf, and please consider squashing the 66 commits as to ease review by the team and avoid polutting the git history.

@janosbinder
Copy link
Collaborator

janosbinder commented Jun 18, 2018

Thanks for the feedback @gberche-orange . I have tailored further the opened issue with your comments.

In order to have the acceptance tests trigger and report tests passing, please resubmit the PR by pushing your branch onto mevansam/terraform-provider-cf, and please consider squashing the 66 commits as to ease review by the team and avoid polutting the git history.

I am afraid to start the acceptance test automatically, I would need access on this repository to trigger the Travis CI. I am happy to open another PR, once I have that, but I can assure you that I have run the acceptance tests yesterday with the credentials Mevan has provided. Currently Travis CI does not support executing acceptance tests on forks, as this would leak out the credentials.

I agree very much on the git history, and it would pollute the history. For these cases GitHub offers squashing the git history into a single commit upon merge. Then there will a single commit entry in dev commit history like "Merge pull request #52 from chrismathias/async-provisioning-service-instances" See the provider's commit history and for squash examples have a look on Arthur's PR #85 or Mevan's PR #80 . These PRs seem to be squashed and merged on dev via the built-in function of GitHub.

In summary it is a good idea to squash the commits via GitHub when you merge back my changes.

Copy link
Member

@ArthurHlt ArthurHlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just some reserved about timeout, normally cloud controller have a timeout for async. Maybe simply wait for cloud controller to say time is over is better.
What's your feelings @gberche-orange @psycofdj ?

Target: resourceServiceInstanceSucceesStates,
Refresh: resourceServiceInstanceStateFunc(id, "create", meta),
Timeout: d.Timeout(schema.TimeoutCreate),
PollInterval: 5 * time.Second,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do remember well Cloud Controller is polling every 30 seconds maybe poll every 5 seconds is a bit too much

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArthurHlt considering #107, wouldn't setting a 30s polling interval delay all deletes by 30s?
@janosbinder proposed setting a 5s min timeout, and a 30s refresh period.

@janosbinder
Copy link
Collaborator

Closing as the branch for this PR is now on the upstream repository (#111).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants