-
Notifications
You must be signed in to change notification settings - Fork 8
Compute PCRs #12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compute PCRs #12
Conversation
operator/src/reference_values.rs
Outdated
| use crate::macros::info_if_exists; | ||
|
|
||
| const BOOT_IMAGE: &str = "quay.io/fedora/fedora-coreos:42.20250705.3.0"; | ||
| const COMPUTE_IMAGE: &str = "quay.io/jnaucke/compute-pcrs:latest"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably build the image as part of this repo
|
@Jakob-Naucke instead of reading the job pod output what about if we include the compute-pcrs binary in a new container image, and you add a new binary which launches the calculation and then update the config maps of trustee? I think it will be a cleaner |
Hmm, I have two concerns about it being cleaner.
Extra Rust crate or
Is API access nearly as easy as what we have from the operator? |
Well, if we convert the compute-pcrs into a library, it removes all those frictions. IMO, a library what we should get from the compute-pcrs repository. It should also pretty straightforward to convert the binary into a library, and then the new binary can rely on this library.
Well, the operator would need to fetch the logs and patch the reference values config maps, so why not let the job doing it immediately |
oh yeah that's cleaner indeed. Didn't realize you meant that based on
What I meant was: The job needs all the k8s API interaction logic again. That's probably fine though. |
Yes, it requires the k8s client and the rbac for updating the config map. But it is a self-contained job with a clear scope, so I personally prefer it rather then parsing the pod output. |
The source code is already divided into cli and lib. Is that what you meant? Is there something else we should do on the compute-pcrs side? |
I plan to make a new binary crate that uses compute-pcrs-lib. I think that side is fine as is. |
8a82c23 to
e4efa66
Compare
|
untested, pushed to the wrong remote, bound to happen. converting to draft. sorry for the noise. |
e4efa66 to
d26af80
Compare
|
@alicefr now tested and ready for review, comment on permissions still standing though |
Makefile
Outdated
| REGISTRY ?= quay.io | ||
| OPERATOR_IMAGE=$(REGISTRY)/confidential-clusters/cocl-operator:latest | ||
| COMPUTE_PCRS_IMAGE=$(REGISTRY)/confidential-clusters/compute-pcrs:latest | ||
| PCRS_BOOT_IMAGE=quay.io/fedora/fedora-coreos:42.20250705.3.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't put this image as part of the manifests. For now, it is fine to have it hardcoded, but it is something we need to infer from the cluster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying
- cluster inference should be implemented before merge, or
- we can merge this but it will need to be changed, or
- cluster inference is not necessary now, but even then we should have something else (what?)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic will be part of the operator not the manifests. So, I would avoid to put this as part of the manifests generation. You can, for now, just hardcode the fedora version in the operator
Makefile
Outdated
| --trustee-namespace operators | ||
| --trustee-namespace operators \ | ||
| --pcrs-compute-image $(COMPUTE_PCRS_IMAGE) \ | ||
| --pcrs-boot-image $(PCRS_BOOT_IMAGE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As comment as above
| # Hack: Set compute-pcrs as sole member to avoid needing to copy other crates. | ||
| # In that case, a rebuild would be triggered upon any change in those crates. | ||
| RUN sed -i 's/members =.*/members = ["compute-pcrs"]/' Cargo.toml && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cannot you copy the Cargo.toml from compute-pcrs instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it has workspace dependencies and I prefer workspace dependencies over potentially deviating versions
compute-pcrs/Containerfile
Outdated
| "--efivars", "/reference-values/efivars/qemu-ovmf/fcos-42", \ | ||
| "--mokvars", "/reference-values/mok-variables/fcos-42"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: usually the args are specified at container creation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no this is intentional. These arguments are dependent on the path of reference-values, which is "known" in this Containerfile, not when the other arguments are fed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubernetes overwrites the entrypoint in any case. What I don't like is that there is fcos-42 hardcoded there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kubernetes overwrites the entrypoint in any case.
Not when you use args instead of command 🙂
What I don't like is that there is fcos-42 hardcoded there
I can see that. I'm considering adding a "reference values base directory" flag to the compute-pcrs binary though so this information isn't spread out. Or we clone with an init container. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has a further decision been taken on this topic? I see this is one of the only references to reference-values/mok-variables/fcos-42 which I would like to update to reference-values/mok-variables/fedora-42 for the sake of simplicity.
Are we safe if I move trusted-execution-clusters/reference-values#3 on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bgartzi go ahead.
compute-pcrs/Containerfile
Outdated
| git clone --depth 1 https://github.com/confidential-clusters/reference-values && \ | ||
| cargo build --release -p compute-pcrs | ||
|
|
||
| FROM docker.io/library/debian:trixie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use fedora as base image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I used Debian because docker.io/library/rust only has Debian (or Alpine) as base and a Debian base will be 100% ABI compatible for execution. I can check if Fedora works though (or base the build container on Fedora too, probably requires an explicit Rust installation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it can be built on the debian rust image, but you should copy it in a fedora base image. Or are you afraid that the dynamic library won't match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or are you afraid that the dynamic library won't match?
Yes I was, but maybe it's fine. I'll test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nobody asked, but, would a similar approach to the one proposed in trusted-execution-clusters/compute-pcrs/pull/19 work for this?
i.e. fedora as builder, install needed dependencies (a subset of the compute-pcrs https://github.com/confidential-clusters/compute-pcrs/blob/main/.github/Containerfile.buildroot), build binary, then copy the binary to a clean fedora image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or use compute-pcrs's image? 🙃
| let client = Client::try_default().await?; | ||
| let config_maps: Api<ConfigMap> = Api::namespaced(client, &args.namespace); | ||
| match config_maps | ||
| .create(&PostParams::default(), &config_map) | ||
| .await | ||
| { | ||
| Ok(_) => info!("Create ConfigMap {}", args.configmap), | ||
| Err(kube::Error::Api(ae)) if ae.code == 409 => { | ||
| info!("ConfigMap {} already exists", args.configmap) | ||
| } | ||
| Err(e) => return Err(e.into()), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if the config map already exists? You probably wants to retrieve its value check if it is different from the reference values, and if not then not update it. Right now, it make little sense but when we have more coreos versions to handle then the logic will become useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as per #13, the RVs will be computed statelessly, and the config map will be overwritten. the code that I wrote for this which I momentarily refuse to delete is at Jakob-Naucke:shelved-append-rvs. I'm in favor of merging this PR first and moving on from there.
manifest-gen/src/main.rs
Outdated
| )] | ||
| pcrs_compute_image: String, | ||
|
|
||
| #[arg(long, default_value = "quay.io/fedora/fedora-coreos:42.20250705.3.0")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this and hardcoded directly in the compute-pcrs. We will then add a logic to detect the coreos version to calculate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hardcoded directly in the compute-pcrs
This info is required when defining the container and its image volume, it's nothing that compute-pcrs lib/bin/image can influence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the job with the image volume is created by the operator, so you don't need this in the manifests
operator/src/trustee.rs
Outdated
| name: Some(name.to_string()), | ||
| namespace: Some(namespace.to_string()), | ||
| let pod_spec = PodSpec { | ||
| service_account_name: Some("cocl-operator".to_string()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would create a separate service account for the job only with a separate Role to only be able to create and modify the config maps in the trustee namespace
|
@Jakob-Naucke as far as it regards the permission, I think you are referring to job RBAC, right? I think you should split the permission for the job only as I mentioned already here |
ef749e3 to
e517c6c
Compare
scripts/clean-cluster-kind.sh
Outdated
| fi | ||
| done | ||
| kubectl delete deploy cocl-operator -n confidential-clusters || true | ||
| kubectl delete job compute-pcrs -n confidential-clusters || true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think should be handled by the operator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the job completes, the operator can then remove it
e517c6c to
2b06f5d
Compare
operator/src/trustee.rs
Outdated
| let create = jobs.create(&PostParams::default(), &job).await; | ||
| info_if_exists!(create, "Job", job_name); | ||
| let completed = await_condition(jobs.clone(), job_name, is_job_completed()); | ||
| let _ = timeout(Duration::from_secs(900), completed).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this blocking until it completes?
The usual way in kubernetes is watching objects with a controller and then trigger a reconciliation loop if there is a change in the state. Can we implement something like this here? See the rust documentation for the Controller
2b06f5d to
bd7749d
Compare
Signed-off-by: Jakob Naucke <jnaucke@redhat.com>
Signed-off-by: Jakob Naucke <jnaucke@redhat.com> Based-on-patch-by: Alice Frosi <alicefr@redhat.com>
Signed-off-by: Jakob Naucke <jnaucke@redhat.com> Based-on-patch-by: Alice Frosi <afrosi@redhat.com>
in place of ClusterRoleBindings Signed-off-by: Jakob Naucke <jnaucke@redhat.com> Based-on-patch-by: Alice Frosi <afrosi@redhat.com>
and some extra dependency cleanups Signed-off-by: Jakob Naucke <jnaucke@redhat.com>
Create new binary to use confidential-clusters/compute-pcrs-lib and write the configmap. Build image with it and extend cocl spec with the respective image fields. Run with a job. Supersedes the reference value input workflow. Signed-off-by: Jakob Naucke <jnaucke@redhat.com>
bd7749d to
a5ed684
Compare
| Action::requeue(Duration::from_secs(60)) | ||
| } | ||
|
|
||
| pub async fn launch_rv_job_controller(client: Client, namespace: &str) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice, yes I meant exactly this, thx!
Run job to run confidential-clusters/compute-pcrs. Supersedes the reference value input workflow. Hard-coded for FCOS 42. Supersedes #11.
@alicefr
cc @bgartzi