Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1536629 - send job state and credentials from job #610

Merged
merged 2 commits into from
Jan 19, 2018

Conversation

maleck13
Copy link
Contributor

@maleck13 maleck13 commented Jan 4, 2018

Note for Reviewer
There are two commits here

  1. making the changes required
  2. adding a set of test for the provision subscriber and job. If happy with the approach here I will expand the tests to cover the other subscribers and jobs. Making a number of changes like this, I felt a little disconcerted that I had no unit test to run so would love to add more to help protect against regressions in future changes.

These commits are independent of each other so if the new tests are not wanted I can remove.

I am currently trying this out in my local cluster but wanted to create the PR to let the integration tests run.

Describe what this PR does and why we need it:
The subscribers were deciding on the state of the Job, rather than the job informing the subscriber of the job state. This pr follows on from issue #608
Also the Job was Marshalling the extracted credentials and sending them as a string on the Msg field, then the subscriber would Unmarshal it and save it. Instead of doing this we just send the credentials through

Changes proposed in this pull request

  • simplify subscriber logic
  • send state from the individual job
  • send credentials directly via the Job rather than turning them into a JSON string and then back to credentials again in the subscriber (avoid unneeded marshalling and unmarshalling)
  • Make an attempt at adding test cases for the ProvisionJob as an example for how it might be done and to get feedback on the approach

Which issue this PR fixes (This will close that issue when PR gets merged)
fixes #608

@openshift-ci-robot openshift-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jan 4, 2018
@maleck13
Copy link
Contributor Author

maleck13 commented Jan 4, 2018

@eriknelson PR from conversation on IRC yesterday about subscribers setting state.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 98aa415 on maleck13:608-send-job-state-from-job into ** on openshift:master**.

@maleck13
Copy link
Contributor Author

maleck13 commented Jan 4, 2018

This pr is also part of the larger work being done for #475

}

// Provisioner defines a function that knows how to provision an apb
type Provisioner func(si *apb.ServiceInstance) (string, *apb.ExtractedCredentials, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you thoughts on this being defined in the apb package?

Copy link
Contributor Author

@maleck13 maleck13 Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah no issue with that. The flow of dependencies seems to go from apb to broker.

return &ProvisionJob{
serviceInstance: serviceInstance,
provision: provision,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like this approach

@shawn-hurley
Copy link
Contributor

This looks really good to me, want to discuss the provisioner type being defined in the apb package.

I like the way you have made it testable by making the provisioner function. That looks really correct to me, and I think that we should make a not to update the deprovision and update jobs with this style.

Copy link
Contributor

@rthallisey rthallisey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor change

type ProvisionWorkSubscriber struct {
dao *dao.Dao
msgBuffer <-chan JobMsg
dao SubcriberDAO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/SubcriberDAO/SubscriberDAO/

@rthallisey
Copy link
Contributor

@mhrivnak if you change your initial commit to say fixes #608 instead of 608 it will automatically close the git issue after this PR merges.

@jmrodri
Copy link
Contributor

jmrodri commented Jan 4, 2018

I'll review this and see what the impact is on my current outstanding async work. I'm currently knee deep in this area.

@mhrivnak
Copy link
Member

mhrivnak commented Jan 4, 2018

@mhrivnak if you change your initial commit to say fixes #608 instead of 608 it will automatically close the git issue after this PR merges.

I assume that comment is for @maleck13

@maleck13
Copy link
Contributor Author

maleck13 commented Jan 5, 2018

@shawn-hurley Will wait for @jmrodri review before changing the location of the provision type, in case there are further changes required.
I think if all is good after the reviews perhaps we should merge at that point and I can follow up with the tests for the remaining subscribers and jobs? Just trying to avoid to big of a rebase and merge for others. If not I can do it on this PR not a big deal either way

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling dd08f15 on maleck13:608-send-job-state-from-job into ** on openshift:master**.

@shawn-hurley
Copy link
Contributor

@maleck13 👍 for waiting for @jmrodri. If you want to follow up with the changes you can take it, but I wouldn't mind taking it off your plate if you want to work on your other PR? Completely up to you just offering my help if you want it.

@maleck13
Copy link
Contributor Author

maleck13 commented Jan 5, 2018

@shawn-hurley thanks for the offer. I think though I will naturally end up creating the additional tests as part of the PR work as I want to be able to run them when iterating on the feature pr and a good bit of the work for the last operation feature is around the Jobs and subscribers.

Funny enough almost all of my PRs are part of the feature :) its like building a feature in layers

@rthallisey rthallisey added the 3.9 | release-1.1 Kubernetes 1.9 | Openshift 3.9 | Broker release-1.1 label Jan 9, 2018
@rthallisey
Copy link
Contributor

rthallisey commented Jan 9, 2018

Partially-fixes #542

@jmrodri jmrodri self-requested a review January 12, 2018 04:21
Copy link
Contributor

@jmrodri jmrodri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like the approach. Some questions about why some things were changed. Also, the stuttering of State.State is slightly annoying so I'd prefer we address that.

@@ -43,42 +43,44 @@ func NewDeprovisionJob(serviceInstance *apb.ServiceInstance,
// Run - will run the deprovision job.
func (p *DeprovisionJob) Run(token string, msgBuffer chan<- JobMsg) {
metrics.DeprovisionJobStarted()
defer metrics.DeprovisionJobFinished()
var jobMsg = JobMsg{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason we do var jobMsg = ... vs jobMsg :=?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing specific, just a habit I have as it is accessed from several blocks within the function. Happy to change it

State: apb.JobState{
State: apb.StateInProgress,
Method: apb.JobMethodDeprovision,
Token: token,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something about this setup bothers me, but I don't see a cleaner way.


if p.skipApbExecution {
log.Debug("skipping deprovision and sending complete msg to channel")
msgBuffer <- JobMsg{InstanceUUID: p.serviceInstance.ID.String(), PodName: "",
JobToken: token, SpecID: p.serviceInstance.Spec.ID, Error: ""}
jobMsg.State.State = apb.StateSucceeded
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather see us add a SetState(JobState) to the JobMsg struct. I don't like the stuttering of State.State. So it would look more like jobMsg.SetState(apb.StateSucceeded).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on stuttering struggled with it also, but not a big fan of using setters either though. I wondered if perhaps we could remove JobMsg entirely and just have JobState as their is some duplication between them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a side note, I think we could simplify things by removing the subscribers and either have a set of observers registered for each job that act on JobState changes or simply give the Job access to the DAO to set the state itself.
Maybe better to discuss this outside of this PR though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth a discussion @maleck13. I don't remember the issue where you first mentioned this idea, but it's worth bringing it up again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will follow up with a clear separate issue to allow for that discussion

// an error type in a struct you want marshalled
// https://github.com/golang/go/issues/5161
jobMsg.State.State = apb.StateFailed
jobMsg.State.Error = errMsg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the SetState I'd add a SetError or SetErrorMsg to the JobMsg struct. That way it hides the implementation detail there there's a JobState hidden in there.

)
setFailedDeprovisionJob(d.dao, msg)
continue
for msg := range msgBuffer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any particular reason for change from the for { to the for msg := range msgBuffer {?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 reasons:

  1. When you close a channel and you use the for range method, the loop also closes, where as if you just use a for loop and receive from the channel, once the channel is closed you will continue to receive the default value for that channel. If you don't want that behaviour your code would need to check the second value on a channel receive:
msg,closed <- chan 
if closed{
break
}

https://dave.cheney.net/2014/03/19/channel-axioms

  1. I am ok with checking if the channel is closed, but in my opinion it also reads a little better with the range statement.

continue
for msg := range msgBuffer {
log.Debug("received deprovision message from buffer")
if msg.State.State == apb.StateSucceeded {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would flip the check to look for the error conditions first and have escape valves, then do the success case afterwards without being in an if block.

func (d *DeprovisionWorkSubscriber) Subscribe(msgBuffer <-chan JobMsg) {
    d.msgBuffer = msgBuffer
    go func() {
        log.Info("Listening for deprovision messages")
        for msg := range msgBuffer {
            log.Debug("received deprovision message from buffer")

            if msg.State.State != apb.StateSucceeded { 
                if err := d.dao.SetState(msg.InstanceUUID, msg.State); err != nil {
                    log.Errorf("failed to set state after deprovision %v", err)
                }
                continue
            }

            instance, err := d.dao.GetServiceInstance(msg.InstanceUUID)
            if err != nil {
                log.Errorf(
                    "Error occurred getting service instance [ %s ] after deprovision job:",
                    msg.InstanceUUID,
                )   
                setFailedDeprovisionJob(d.dao, msg)
                continue
            }
            if err := cleanupDeprovision(instance, d.dao); err != nil {
                log.Errorf("Failed cleaning up deprovision after job, error: %v", err)
                // Cleanup is reporting something has gone wrong. Deprovision overall
                // has not completed. Mark the job as failed.
                setFailedDeprovisionJob(d.dao, msg)
                continue
            }   
        }
    }()
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the if entirely as we always want the state to be set

}); err != nil {
log.Errorf("failed to set state after deprovision %#v", err)
// have to set the state here manually as the logic that triggers this is in the subscriber
dmsg.State.State = apb.StateFailed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stutter is definitely weird. If we don't want to add the method, then dmsg.JobState.State might work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah thought about doing that also, but see above comment on perhaps removing the JobMsg and just having JobState.State

@@ -102,12 +77,12 @@ func cleanupDeprovision(instance *apb.ServiceInstance, dao *dao.Dao) error {
id := instance.ID.String()

if err = dao.DeleteExtractedCredentials(id); err != nil {
log.Error("failed to delete extracted credentials - %#v", err)
log.Error("failed to delete extracted credentials - %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason you changed from %#v to %v? @shawn-hurley I think I've seen you use the %#v variant, thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seemed to be different approaches to this across the code base with %v seeming to be the most common. In my opinion, the %v is good for error messages as it will just print the error string, where as perhaps the %#v is good for debug statements as it prints the struct

&errors.errorString{s:"test error"}

see example https://play.golang.org/p/nkVaxztAG4_I

)

// ProvisionWorkSubscriber - Lissten for provision messages
// ProvisionWorkSubscriber - Listen for provision messages
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to fixing typos

@shawn-hurley shawn-hurley added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 12, 2018
@maleck13
Copy link
Contributor Author

Will rebase and bring the new Bind and Unbind jobs into line with these changes

@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 16, 2018
@maleck13
Copy link
Contributor Author

updated with refactored bind and unbind
added new tests for those jobs
changed the agreed on comments

@jmrodri Any further thoughts based on the comments?

I think the only outstanding things are the stutter, which I think could be resolved by removing the JobMsg and just sending through JobState (ie come up with a single type that suits the purposes). Maybe this could be a follow up issue and PR?
The other issues were the for loop ranging over the channel and the log format.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 5229147 on maleck13:608-send-job-state-from-job into ** on openshift:master**.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 658379f on maleck13:608-send-job-state-from-job into ** on openshift:master**.

@coveralls
Copy link

coveralls commented Jan 18, 2018

Coverage Status

Changes Unknown when pulling 434f7ed on maleck13:608-send-job-state-from-job into ** on openshift:master**.

@rthallisey
Copy link
Contributor

Looks like we may need to add github.com/apex/log to glide. Travis turned up with this error:

pkg/broker/binding_job.go:24:2: cannot find package "github.com/apex/log" in any of:
	/home/travis/gopath/src/github.com/openshift/ansible-service-broker/vendor/github.com/apex/log (vendor tree)
	/home/travis/.gimme/versions/go1.8.5.linux.amd64/src/github.com/apex/log (from $GOROOT)
	/home/travis/gopath/src/github.com/apex/log (from $GOPATH)

@maleck13
Copy link
Contributor Author

maleck13 commented Jan 18, 2018 via email

@rthallisey
Copy link
Contributor

Thanks Craig. I just restarted it. We'll see if it happens again.

@eriknelson
Copy link
Contributor

@maleck13 Sorry it's taken me a bit to review this. Trying to make my way through a review backlog. I'm guessing this still needs a rebase?

@maleck13
Copy link
Contributor Author

rebased

@coveralls
Copy link

coveralls commented Jan 19, 2018

Coverage Status

Changes Unknown when pulling fd5304a on maleck13:608-send-job-state-from-job into ** on openshift:master**.

@maleck13
Copy link
Contributor Author

The CI build is failing, but it is not clear to me how best to find the issue. The log is quite large https://api.travis-ci.org/v3/job/330711368/log.txt

@rthallisey
Copy link
Contributor

rthallisey commented Jan 19, 2018

It's failing on lint. The fix was merged an hour ago: #662 . Another rebase should fix it.

=================================
              Lint               
=================================
pkg/apb/types.go:318:2: struct field HttpProxy should be HTTPProxy
pkg/apb/types.go:319:2: struct field HttpsProxy should be HTTPSProxy
Found 2 lint suggestions; failing.
Makefile:27: recipe for target 'lint' failed
make: *** [lint] Error 1

@eriknelson
Copy link
Contributor

eriknelson commented Jan 19, 2018

Apologies. I broke the build :) Will take a closer look at this today.

@jmrodri
Copy link
Contributor

jmrodri commented Jan 19, 2018

I'll take another look at this one now that bind and unbind are in

…tests. Address pr feedback on decided comments

remove bad import. refactor job test to accept binding instance
@maleck13
Copy link
Contributor Author

@rthallisey Thanks for pointer! have rebased again now

@jmrodri
Copy link
Contributor

jmrodri commented Jan 19, 2018

My async bind scripts work fine with this deployed. I like the Binder interface func.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling fe0fd97 on maleck13:608-send-job-state-from-job into ** on openshift:master**.

@jmrodri jmrodri changed the title 608 send job state and credentials from job Bug 1536629 - send job state and credentials from job Jan 19, 2018
@jmrodri jmrodri merged commit 1526119 into openshift:master Jan 19, 2018
jianzhangbjz pushed a commit to jianzhangbjz/ansible-service-broker that referenced this pull request May 17, 2018
Bug 1536629 - send job state and credentials from job

fixes openshift#608
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 | release-1.1 Kubernetes 1.9 | Openshift 3.9 | Broker release-1.1 feature needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JobState should be sent as part of the JobMsg rather than be set by the subscribers
8 participants