
CHE-220: Improving 'removeContainer' API. Refactoring #5766

Merged
merged 1 commit on Jul 24, 2017

Conversation

@ibuziuk (Member) commented Jul 20, 2017

Signed-off-by: Ilya Buziuk ibuziuk@redhat.com

What does this PR do?

Enhances OpenShift resource deletion for workspaces (the deployment is scaled down before its resources are deleted)
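
For readers skimming the diff, a minimal sketch of the intended flow, assuming the fabric8 OpenShift client that the connector already uses; the class name, method name, and the extensions() entry point are illustrative assumptions for that era of the client, not the PR's actual code:

import io.fabric8.openshift.client.DefaultOpenShiftClient;
import io.fabric8.openshift.client.OpenShiftClient;

public class WorkspaceResourceCleanupSketch {

    // Illustrative label key; the connector exposes it as OpenShiftConnector.OPENSHIFT_DEPLOYMENT_LABEL.
    private static final String DEPLOYMENT_LABEL = "deployment";

    public void cleanUpWorkspaceResources(String namespace, String deploymentName) {
        try (OpenShiftClient client = new DefaultOpenShiftClient()) {
            // 1. Scale the deployment down to zero replicas and wait for its pods to terminate.
            client.extensions().deployments()
                  .inNamespace(namespace)
                  .withName(deploymentName)
                  .scale(0, true);

            // 2. Only then delete the deployment and the labelled workspace resources.
            client.extensions().deployments()
                  .inNamespace(namespace)
                  .withName(deploymentName)
                  .delete();
            client.extensions().replicaSets()
                  .inNamespace(namespace)
                  .withLabel(DEPLOYMENT_LABEL, deploymentName)
                  .delete();
            client.services()
                  .inNamespace(namespace)
                  .withLabel(DEPLOYMENT_LABEL, deploymentName)
                  .delete();
        }
    }
}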

What issues does this PR fix or reference?

https://issues.jboss.org/browse/CHE-220

Changelog

Enhances OpenShift resource deletion for workspaces (the deployment is scaled down before its resources are deleted)

Release Notes

N/A

Docs PR

N/A

@ibuziuk ibuziuk requested a review from l0rd as a code owner July 20, 2017 17:57
@ibuziuk ibuziuk assigned l0rd and unassigned l0rd Jul 20, 2017
@ibuziuk ibuziuk requested review from sunix and amisevsk July 20, 2017 17:58
@codenvy-ci

Can one of the admins verify this patch?

@amisevsk (Contributor) left a comment

Couple of comments. +1 for simplifying OpenShiftConnector with KubernetesResourceUtil

.withName(deploymentName);

if (deployment != null) {
    deployment.scale(0, true);
Contributor

Does waiting for the scale down take significantly longer than the previous way, or does the scale down take the grace termination period into account? I ask because we had the issue on OpenShift where stopping and restarting a pod quickly could cause PV issues.

Member Author

There should not be PV issues, since this change was already applied in the rebased openshift-connector, and the grace termination period is taken into account.

Member Author

@amisevsk btw, in master I do not see that we use "terminationGracePeriodSeconds", so while testing on minishift the scale down does not happen fast and requires some time; this might be the main reason for the quota limit issue that was reported today.

@ibuziuk (Member Author) Jul 20, 2017

I think PR #5007 was not applied to master.
But this is a separate conversation; I will provide another PR for that.
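
For context on the grace-period point above, a hedged sketch of where terminationGracePeriodSeconds would be declared when the workspace pod spec is assembled with the fabric8 builders; the class name and the zero value are assumptions, and the real value belongs to the follow-up PR:

import io.fabric8.kubernetes.api.model.PodSpec;
import io.fabric8.kubernetes.api.model.PodSpecBuilder;

public class GracePeriodSketch {

    // Hypothetical value: 0 asks OpenShift to terminate the pod immediately;
    // the actual value is decided in the separate grace-period PR.
    private static final long TERMINATION_GRACE_PERIOD_SEC = 0L;

    // Containers, volumes, etc. are omitted; only the grace-period field is shown.
    public PodSpec buildWorkspacePodSpec() {
        return new PodSpecBuilder()
                .withTerminationGracePeriodSeconds(TERMINATION_GRACE_PERIOD_SEC)
                .build();
    }
}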

Contributor

Sounds good - thanks @ibuziuk !

List<ReplicaSet> replicaSets = KubernetesResourceUtil.getReplicaSetByLabel(OpenShiftConnector.OPENSHIFT_DEPLOYMENT_LABEL, deploymentName, namespace);

try (OpenShiftClient openShiftClient = new DefaultOpenShiftClient()) {
    if (replicaSets != null && replicaSets.size() > 0) {
Contributor

I worry about the order of removal here, since replica sets are controlled by deployments. It shouldn't be a problem since the deployment is scaled to zero first, but I think the deployment should be deleted before the replica set just in case. IIRC there are cases where a deployment will ensure a replica set exists, which could cause the replica set to be recreated before the deployment is deleted.

Member Author

ok, lemme change the order
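
A minimal sketch of the ordering agreed on above, under the same fabric8 client assumptions as before (names are illustrative): the deployment is removed first so it cannot recreate its replica set, then the replica sets selected by the workspace label are deleted.

import io.fabric8.openshift.client.DefaultOpenShiftClient;
import io.fabric8.openshift.client.OpenShiftClient;

public class RemovalOrderSketch {

    // Mirrors OpenShiftConnector.OPENSHIFT_DEPLOYMENT_LABEL.
    private static final String DEPLOYMENT_LABEL = "deployment";

    public void deleteDeploymentThenReplicaSets(String namespace, String deploymentName) {
        try (OpenShiftClient client = new DefaultOpenShiftClient()) {
            // Delete the owning deployment first ...
            client.extensions().deployments()
                  .inNamespace(namespace)
                  .withName(deploymentName)
                  .delete();
            // ... then the replica sets it controlled, selected by label.
            client.extensions().replicaSets()
                  .inNamespace(namespace)
                  .withLabel(DEPLOYMENT_LABEL, deploymentName)
                  .delete();
        }
    }
}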

@@ -165,6 +162,7 @@
public static final String OPENSHIFT_DEPLOYMENT_LABEL = "deployment";
public static final String CHE_MOUNTED_WORKSPACE_FOLDER = "/workspace-logs";
public static final String WORKSPACE_LOGS_FOLDER_SUFFIX = "-logs";
public static final int OPENSHIFT_WAIT_POD_TIMEOUT = 240;
Contributor

It may be worth it to leave this private and just have a similar field in OpenShiftDeploymentCleaner. There's no reason the timeout on scaling down deployments needs to be the same as the one for waiting for containers to start (in fact, we may want the timeout for scaling down to be less than 4 minutes, that seems like a long time). Ideally, the scale down happens in under 5 seconds.

Member Author

@amisevsk good point, but I'm not sure about 5 seconds - in che-starter [1] the stop timeout is 150 ms

[1] https://github.com/redhat-developer/che-starter/blob/master/src/main/resources/application.properties#L21

Contributor

well, 150 ms is under 5 seconds ;)

Member Author

typo, 150 seconds ;-)
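
To illustrate the earlier suggestion of keeping a dedicated scale-down timeout in OpenShiftDeploymentCleaner instead of reusing OPENSHIFT_WAIT_POD_TIMEOUT, a sketch under the same fabric8 client assumptions; the constant name, the 150 s value, and the polling approach are assumptions, not the merged code:

import java.util.concurrent.TimeUnit;

import io.fabric8.kubernetes.api.model.extensions.Deployment;
import io.fabric8.openshift.client.DefaultOpenShiftClient;
import io.fabric8.openshift.client.OpenShiftClient;

public class ScaleDownTimeoutSketch {

    // Hypothetical: decoupled from the 240 s OPENSHIFT_WAIT_POD_TIMEOUT used when
    // waiting for containers to start; 150 s mirrors the che-starter stop timeout.
    private static final int SCALE_DOWN_TIMEOUT_SEC = 150;

    // Polls the deployment status until no replicas are reported or the timeout elapses.
    public boolean waitForScaleDown(String namespace, String deploymentName) throws InterruptedException {
        long deadline = System.currentTimeMillis() + TimeUnit.SECONDS.toMillis(SCALE_DOWN_TIMEOUT_SEC);
        try (OpenShiftClient client = new DefaultOpenShiftClient()) {
            while (System.currentTimeMillis() < deadline) {
                Deployment deployment = client.extensions().deployments()
                                              .inNamespace(namespace)
                                              .withName(deploymentName)
                                              .get();
                Integer replicas = (deployment == null || deployment.getStatus() == null)
                        ? null
                        : deployment.getStatus().getReplicas();
                if (replicas == null || replicas == 0) {
                    return true;
                }
                TimeUnit.SECONDS.sleep(1);
            }
        }
        return false;
    }
}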

@ibuziuk (Member Author) commented Jul 20, 2017

@amisevsk btw, it is technically just a cherry-pick of #5157 ;-)

Signed-off-by: Ilya Buziuk <ibuziuk@redhat.com>
@ibuziuk (Member Author) commented Jul 20, 2017

@amisevsk PR has been updated

@l0rd (Contributor) commented Jul 21, 2017

@ibuziuk can you please test on OSIO whether this PR solves redhat-developer/rh-che#217?

@ibuziuk (Member Author) commented Jul 21, 2017

@l0rd I'm quite sure it will not, because we also need to apply #5007.

@ibuziuk (Member Author) commented Jul 21, 2017

The PR for the grace period has been sent: #5773

@l0rd (Contributor) left a comment

LGTM
