Remove cloning label from target pods to fix pod affinity. #280

Merged 1 commit into kubevirt:master on Jul 31, 2018

Conversation

@zvikorn (Contributor) commented Jul 29, 2018

Issue:
#279
If we remove the cloning label from the target pod then the target pod will be scheduled on the same node as the source pod.
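For context, a minimal sketch (in Go, assuming illustrative label keys and names rather than CDI's actual ones) of what the fix amounts to: the cloning label stays on the source pod only, and the target pod references it through its pod affinity instead of carrying it itself.

package clonesketch

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// cloneLabels is illustrative; the actual label key/value CDI uses may differ.
var cloneLabels = map[string]string{"cdi-cloner": "source-pod"}

// The source pod keeps the cloning label so the target pod's affinity can match it.
var sourcePodMeta = metav1.ObjectMeta{Name: "clone-source-pod", Labels: cloneLabels}

// The target pod must NOT carry the cloning label; if it does, the affinity term
// can be satisfied by the wrong pod (the target itself) rather than the source pod.
var targetPodMeta = metav1.ObjectMeta{Name: "clone-target-pod"}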

@@ -491,6 +490,10 @@ func MakeCloneTargetPodSpec(image, verbose, pullPolicy string, pvc *v1.Persisten
    Name:            CLONER_TARGET_PODNAME,
    Image:           image,
    ImagePullPolicy: v1.PullPolicy(pullPolicy),
    SecurityContext: &v1.SecurityContext{
        Privileged: &[]bool{true}[0],
Contributor commented:
Why does the target pod need to run privileged? Is there a more granular capability this pod can be granted?

BTW, good job deducing the problem!

@zvikorn (Contributor Author) commented Jul 30, 2018

Thanks!
It runs privileged just as the source pod does. Privileged was added to the source pod when I hit a permission issue while cloning on minishift. I used to get this error:
"error processing pvc "target-ns/target-pvc": source pod API create errored: pods "source-pod-" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000080000, 1000089999]]"
This error happens when the service account that launches the pod lacks the privileged SCC in OpenShift.

Contributor commented:

Ack. However, running a container privileged is the "sledgehammer" fix. If possible, it is preferable to grant the container just the capabilities it needs to read from the socket. Maybe that does require a fully privileged container, but I'm asking here in case not.
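For reference, a sketch of the more granular alternative being suggested: adding specific capabilities on the container's SecurityContext instead of running fully privileged. Whether any capability set is actually enough for the cloner, and which one, is exactly the open question here, so the capability below is only a placeholder.

package clonesketch

import v1 "k8s.io/api/core/v1"

// granularSecurityContext shows the shape of a capability-based SecurityContext.
// DAC_OVERRIDE is a placeholder; the cloner may need a different set of
// capabilities, or may turn out to require a fully privileged container after all.
func granularSecurityContext() *v1.SecurityContext {
	return &v1.SecurityContext{
		Capabilities: &v1.Capabilities{
			Add: []v1.Capability{"DAC_OVERRIDE"},
		},
	}
}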


Member commented:

Actually, I am not sure how this is related to removing a label on the pod?

@zvikorn (Contributor Author) commented:

It’s not. It’s something I forgot to do before. I can remove it from this PR and submit it in a separate PR.

@aglitke (Member) left a comment:

Please make sure you rebase this on master instead of merging in master. It makes the commit history more readable.

@aglitke (Member) commented Jul 30, 2018

Please add the description of the problem (from the issue) to the commit message. That way we can see the explanation along with the code right in the git history.


@zvikorn (Contributor Author) commented Jul 30, 2018 via email

The target pod looks for a pod with a specific label (specified in its pod affinity) that matches the source pod's label.
In my case the target pod carried this label as well, so the target pod found a matching pod, but it was the WRONG pod: itself!
The target was running without ever finding the source pod first.
If we remove this label from the target pod, it will find the source pod and then be scheduled on the same node.
If it does not find the source pod (because the scheduler tried to schedule it before the source pod), it stays in the 'Pending' state until the source pod is scheduled, and then runs on the same node.
kubevirt#279
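As a hedged illustration of the behaviour described above (label key/value and function name are made up, not CDI's actual code), the target pod's affinity could look roughly like this. The "required during scheduling" term is what keeps the target Pending until a pod carrying the cloning label (the source pod) exists, and the hostname topology key is what pins both pods to the same node.

package clonesketch

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// targetAffinity sketches the affinity placed on the clone target pod. The
// selector matches the cloning label, which after this PR exists only on the
// source pod, so the affinity can no longer be satisfied by the target itself.
func targetAffinity() *v1.Affinity {
	return &v1.Affinity{
		PodAffinity: &v1.PodAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: []v1.PodAffinityTerm{{
				LabelSelector: &metav1.LabelSelector{
					MatchLabels: map[string]string{"cdi-cloner": "source-pod"},
				},
				TopologyKey: "kubernetes.io/hostname",
			}},
		},
	}
}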
@jeffvance jeffvance merged commit 723e222 into kubevirt:master Jul 31, 2018
aglitke pushed a commit to aglitke/containerized-data-importer that referenced this pull request Jul 31, 2018
Having the cloning label in the target pod, make the pod affinity fails. (kubevirt#280)

aglitke pushed a commit to aglitke/containerized-data-importer that referenced this pull request Aug 1, 2018
Having the cloning label in the target pod, make the pod affinity fails. (kubevirt#280)
jeffvance pushed a commit that referenced this pull request Aug 1, 2018
* Invoke tar from same location in source and target mode (#297)

When cloning we were invoking tar at the container root in the source
mode and within the volume mount in target mode.  The result was the
disk.img file appearing in a tmp/clone/image subdirectory in the target
pvc instead of being copied exactly.

Signed-off-by: Adam Litke <alitke@redhat.com>

* Having the cloning label in the target pod, make the pod affinity fails. (#280)
