Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make cloneref expose baseSHA if not specified in config #10359

Closed
krzyzacy opened this issue Dec 6, 2018 · 30 comments
Closed

Make cloneref expose baseSHA if not specified in config #10359

krzyzacy opened this issue Dec 6, 2018 · 30 comments
Assignees
Labels
area/prow Issues or PRs related to prow

Comments

@krzyzacy
Copy link
Member

krzyzacy commented Dec 6, 2018

Currently we can have baseRef: master with an empty baseSHA, for example, for periodic jobs specifies an extra_ref points to k/k master.

Cloneref could potentially expose the actual baseSHA and make sidecar aware of the actual SHA so we can use the SHA instead of ref when record revision in finished.json.

/area prow
cc @cjwagner @BenTheElder @stevekuznetsov

@k8s-ci-robot k8s-ci-robot added the area/prow Issues or PRs related to prow label Dec 6, 2018
@stevekuznetsov
Copy link
Contributor

Nooooo!!!! We want a new type of job -- a ref-driven periodic. We should stop having two types of ref jobs. It breaks all sorts of mental models when half of the refs are declarative from spec and half are resolved at run-time -.-

@krzyzacy
Copy link
Member Author

krzyzacy commented Dec 6, 2018

We probably want one type of job instead of more types of jobs 😂

@cjwagner
Copy link
Member

cjwagner commented Dec 6, 2018

Periodics that use ExtraRefs are ref-driven periodics. We could make Prow components populate the baseSHA for ExtraRefs like trigger does for presubmits' Refs. Currently we determine the baseSHA via the GitHub API so unless we switch to using the git client, this would require components like horologium to use a GitHub token 👎 .
Is there a git command to look up the SHA for a remote refname without cloning the repo?

@BenTheElder
Copy link
Member

BenTheElder commented Dec 6, 2018 via email

@krzyzacy
Copy link
Member Author

krzyzacy commented Dec 6, 2018

good to know!

we probably also need to handle private repos on top of that.

@krzyzacy
Copy link
Member Author

/assign
will poke tomorrow

@stevekuznetsov
Copy link
Contributor

@cjwagner

Periodics that use ExtraRefs are ref-driven periodics.

Not technically true. They're driven but not at the specific commit level. The ref is still resolved at run-time tomorrow.

@krzyzacy
Copy link
Member Author

hummm, after a closer look, it's too late to resolve the remote refs in sidecar if we use ls-remote... wonder if there's some data can be passed along from cloneref itself instead?

@stevekuznetsov
Copy link
Contributor

We should implement a new type of job. We should not hack this backwards.

@cjwagner
Copy link
Member

@stevekuznetsov ExtraRefs allows a specific commit to be specified. It is just a slice of Refs structs that each can provide a baseSHA:

BaseSHA string `json:"base_sha,omitempty"`

We just don't have logic in horologium to resolve the SHA before creating the job.
I'm hesitant to create a new whole new type of job for this because it seems like we could get the same behavior by just adding an option to have the field populated by Prow when the job is created.
It might make sense to have Prow always provide a SHA for every ExtraRef that doesn't specify a SHA (for all job types). Is there any benefit to resolving the baseSHA from the job itself?

@krzyzacy
Copy link
Member Author

It might make sense to have Prow always provide a SHA for every ExtraRef that doesn't specify a SHA

in pjutils? that's the only unified place I can think of..

@krzyzacy
Copy link
Member Author

so, just pre-check: do all of the controller images have git?

@cjwagner
Copy link
Member

No, all controllers that create jobs now need to use the git-base image like hook and tide do:

base = "@git-base//image",

@cjwagner
Copy link
Member

cjwagner commented Dec 20, 2018

Lets get @stevekuznetsov's thoughts before we go ahead with this though.

@krzyzacy
Copy link
Member Author

yeah, implementation itself sounds pretty straight forward.

@stevekuznetsov
Copy link
Contributor

I meant that we don't set SHA today, not that we can't. I agree that getting horologium to resolve it would be good, but I worry about downstream logic that needs to do:

var refs prow.Refs
switch job.Type:
case prow.Presubmit:
    refs = job.Refs
case prow.Periodic:
    refs = job.ExtraRefs[0]

Seems like we could do better and have a type of job that always has job.Refs set and runs periodically. We can keep periodic jobs periodic if they don't touch repos like commenter etc. Using the test-infra git client inside of horologium to resolve the SHA SGTM as long as you don't find that breaking to the API for periodics.

@stevekuznetsov
Copy link
Contributor

If it's breaking it might be safer to create a new job type

@cjwagner
Copy link
Member

I don't know of any jobs that this would break and I would think it is ok as we are just populating a previously empty field. I don't think we have anything checking that today.
I agree that we should minimize the kind of job logic you detailed above, but making a new job type wouldn't get rid of the switch statement you shared, it would just move the check to a separate case for the new job type?

I think the ideal path forward would be to change the ProwJobs Refs field to a slice of Refs and combine it with ExtraRefs so that jobs don't need to consider both. Distinguishing the implicit refs from the extra refs probably isn't useful for jobs and it just makes it more likely that a job will handle only Refs and accidentally ignore ExtraRefs. That would be a breaking change to both the downward API and the ProwJob CRD though so until we version or CRDs this isn't even an option.

@stevekuznetsov
Copy link
Contributor

No, wouldn't we just always honor Refs if they are present?

@stevekuznetsov
Copy link
Contributor

Always having a slice would also work

@cjwagner
Copy link
Member

Ah I see what you are saying. I still don't really like that pattern because we still have two fields that define refs. Some logic to handle both Refs and ExtraRefs will be needed as long as both fields exist so the switch statement might disappear, but we'll still need to consider both anyways.
Since it probably isn't feasible to switch the CRD Refs field to be a slice of all the refs right now, all jobs should really be starting with something like this if they need to know what refs were cloned:

var allRefs []kube.Refs
if spec.Refs != nil {
  allRefs = append(allRefs, *spec.Refs)
}
allRefs = append(allRefs, spec.ExtraRefs...)

In any case it sounds like we can go forward with having Prow controllers populate the ExtraRefs' baseSHAs if they are unspecified? I think we should probably be doing that either way and that will unblock Sen. We can watch to see if any periodics are affected in case some job is doing something strange.

@krzyzacy
Copy link
Member Author

cc @BenTheElder

@krzyzacy
Copy link
Member Author

krzyzacy commented Jan 12, 2019

We need to solve this as we are starting to migrate verify/integration job onto podutils - technically we don't care extra_refs for presubmits (the short term goal is to show the commit number to testgrid), we can just let horologium to solve it for now. add the key to clone-record.json and let testgrid reads that

@krzyzacy
Copy link
Member Author

So now we are moving towards backfill the data into started/finished.json now, @stevekuznetsov wonder if the downstream spec is mutable when we are cloning? Or otherwise I'm thinking of we can make initupload understand clone-records.json?

@cjwagner
Copy link
Member

wonder if the downstream spec is mutable when we are cloning?

I don't think we should mutate the downstream spec from the job, even during the initial cloning step. If the downstream spec needs additional data I think it should be populated before the pod is scheduled.

@BenTheElder
Copy link
Member

I don't think we should mutate the downstream spec from the job, even during the initial cloning step. If the downstream spec needs additional data I think it should be populated before the pod is scheduled.

Unless we want to resolve master -> some sha within prow, that's unlikely. It can be resolved during cloning though.

@cjwagner
Copy link
Member

Unless we want to resolve master -> some sha within prow, that's unlikely. It can be resolved during cloning though.

Yeah, I'm saying if we need the downstream spec to include the SHA then all Prow components that trigger jobs should resolve the SHA when creating the job. That sounds like a potentially annoying constraint to add to PJ creation though.
It is unclear to me why the downstream spec needs to be updated at all though. Having initupload determine the SHA itself or from the clone-records.json seems preferable if we don't need the downstream spec to include the SHA for some other reason.

@krzyzacy
Copy link
Member Author

yeah I can add some logic to make inituploader to understand fields from clone-records.json

@BenTheElder
Copy link
Member

Yeah, I'm saying if we need the downstream spec to include the SHA then all Prow components that trigger jobs should resolve the SHA when creating the job. That sounds like a potentially annoying constraint to add to PJ creation though.
It is unclear to me why the downstream spec needs to be updated at all though. Having initupload determine the SHA itself or from the clone-records.json seems preferable if we don't need the downstream spec to include the SHA for some other reason.

💯
which object it is in isn't important, putting it in the pj spec is probably not a good avenue though, and prow probably shouldn't be trying to resolve these.
resolving it at clone time and uploading it in clone-records.json SGTM

@krzyzacy
Copy link
Member Author

🤞
Our metadata should be good to go now - just need to have testgrid/spyglass/gubernator treat them properly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/prow Issues or PRs related to prow
Projects
None yet
Development

No branches or pull requests

5 participants