-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check provisioner ready status when applying reconcile logic #574
Check provisioner ready status when applying reconcile logic #574
Conversation
Hi @andfasano. Thanks for your PR. I'm waiting for a metal3-io member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
/test-integration |
cac72af
to
958f654
Compare
1c47a5e
to
9a3e71f
Compare
/test-integration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still feels like a lot of extra code to be able to make 2 HTTP calls. I know you mentioned doing it in parallel for performance, but I'm not sure the baremetal-operator is ever going to be the bottleneck in the system. Ironic can take ages to actually provision a host, for example. Maybe we can talk through the approach on a call Wednesday?
} | ||
|
||
m.server = httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | ||
m.requests += r.Host + r.RequestURI + ";" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't feel like very idiomatic go to me - can we use slices instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was done by design in reality to improve test readability / handling: with this testing technique events are accumulated in a string (using a separator), so that it would be easier to setup the expected string (and to read in case of failure, also the sequence is naturally represented). It's an approach that works well especially when there a few events.
{ | ||
name: "IsReady", | ||
ironic: newMockServer(6385).addDrivers(), | ||
inspector: newMockServer(5050), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the ports hard coded so much that we have to do this? I'd feel better if we let go choose unused system ports which httptest can do.
If I'm running tests on my local dev box, I may very well have my own Ironic already running on these ports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's an impediment here:
baremetal-operator/pkg/provisioner/ironic/ironic.go
Lines 50 to 73 in 231ba34
func init() { | |
// NOTE(dhellmann): Use Fprintf() to report errors instead of | |
// logging, because logging is not configured yet in init(). | |
deployKernelURL = os.Getenv("DEPLOY_KERNEL_URL") | |
if deployKernelURL == "" { | |
fmt.Fprintf(os.Stderr, "Cannot start: No DEPLOY_KERNEL_URL variable set\n") | |
os.Exit(1) | |
} | |
deployRamdiskURL = os.Getenv("DEPLOY_RAMDISK_URL") | |
if deployRamdiskURL == "" { | |
fmt.Fprintf(os.Stderr, "Cannot start: No DEPLOY_RAMDISK_URL variable set\n") | |
os.Exit(1) | |
} | |
ironicEndpoint = os.Getenv("IRONIC_ENDPOINT") | |
if ironicEndpoint == "" { | |
fmt.Fprintf(os.Stderr, "Cannot start: No IRONIC_ENDPOINT variable set\n") | |
os.Exit(1) | |
} | |
inspectorEndpoint = os.Getenv("IRONIC_INSPECTOR_ENDPOINT") | |
if inspectorEndpoint == "" { | |
fmt.Fprintf(os.Stderr, "Cannot start: No IRONIC_INSPECTOR_ENDPOINT variable set") | |
os.Exit(1) | |
} | |
} |
init()
method gets triggered before the test, and thus the related (not accessible) package variables fetching the env vars are already set once the newProvisioner
call is made (and I think it's the reason why those env vars are set in the Makefile and also in hack/unit.sh). It could help to refactor the configuration initialization, maybe with a lazy setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really convinced that a unit test should be making network calls at all.
65d4ce8
to
45d89c2
Compare
/test-integration |
/test unit |
provStatus, err := prov.IsReady() | ||
if err != nil { | ||
return reconcile.Result{}, errors.Wrap(err, | ||
fmt.Sprintf("failed to check services availability")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be an inline string, since there are no additional variables in the Sprintf(). That can be fixed in a follow-up PR unless there are other changes needed for this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
pkg/provisioner/fixture/fixture.go
Outdated
@@ -17,7 +17,7 @@ var provisionRequeueDelay = time.Second * 10 | |||
|
|||
// Provisioner implements the provisioning.Provisioner interface | |||
// and uses Ironic to manage the host. | |||
type fixtureProvisioner struct { | |||
type Provisioner struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of renaming this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To allow setting the ready flag from the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type is private and only accessible via an interface that doesn't have the SetIsReady() function, so renaming it makes it public but still requires a type coercion in the test. What if we move it up into pkg/provisioner instead? That would let us keep it private, and the tests could instantiate one directly to avoid the type coercion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is not a problem to move it then I think it sounds fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed the struct, see comment below
pkg/provisioner/provisioner.go
Outdated
|
||
// IsReady checks if the provisioning backend is available to accept | ||
// all the incoming requests. | ||
IsReady() (result Result, err error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sceptical about returning a provisioner.Result here. It's even less obvious than usual what the 'Dirty' flag means. A function named something like IsReady
should return a bool.
I realise this is done so that we can return a RequeueAfter value, but I'm not sure that we really need that to be customisable by the provisioner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggested the provisioner.Result in part to be consistent with the other methods, but I see your point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted back the IsReady()
signature
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andfasano, dhellmann, zaneb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -28,22 +28,25 @@ type fixtureProvisioner struct { | |||
publisher provisioner.EventPublisher | |||
// state to manage the two-step adopt process | |||
adopted bool | |||
// status of the provisioner | |||
ready bool | |||
} | |||
|
|||
// New returns a new Ironic Provisioner | |||
func New(host *metal3v1alpha1.BareMetalHost, bmcCreds bmc.Credentials, publisher provisioner.EventPublisher) (provisioner.Provisioner, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is public and you can add whatever arguments you like, so maybe rather than making the type public you could allow the caller to pass something here (e.g. number of times to return false before returning true from IsReady(), or a user-supplied function to call from IsReady() to get the result).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately the New
function must match the provisioner.Factory
type so I don't think it's recommended to modify its signature. Alternatively I could add another factory method (ie NewExt
or some other better name), to be used only from the test, with the required additional params, and keep the type private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a new factory to allow injecting test parameters. With this approach, the fixture remains private and no need to move the file from its original location
1809bc7
to
09b32a5
Compare
/test-integration This version looks OK to me. I'll leave it open for @zaneb to give the final approval since he also had some comments on earlier drafts. |
09b32a5
to
3f71294
Compare
/test-integration |
I have some reservations about @stbenjam's comments, but we can always resolve that later; I think it's better not to hold up this work. |
This PR aims to improve the reconcile loop by checking the status of the Provisioner dependencies before trying to reconcile the state: if the provisioner is not ready, the current request is requeued.
In particular, it checks the availability of the Ironic and Ironic Inspector services (thanks to @stbenjam for specific check logic, extracted from https://github.com/openshift-metal3/terraform-provider-ironic/blob/master/ironic/provider.go).
This approach should help especially during the bootstrap of those deployment scenarios where the Ironic services are launched at the same time of the operator - where the reconcile loop could start to operate before Ironic services were effectively up and running.