Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement workspace start cancellation #3701

Merged
merged 1 commit into from
Jan 16, 2017
Merged

Conversation

voievodin
Copy link
Contributor

@voievodin voievodin commented Jan 12, 2017

Allows to cancel workspace start on the API level.

Allows to do things described by #3249, fixes #3742.

Implementation is based on deterministic points of time when components can check whether starting process is interrupted and react appropriately. The components like WorkspaceRuntimes & CheEnvironmentEngine are refactored to respect cancellation of starting thread. The other components are refactored to propagate interruption correctly, e.g. DockerConnector didn't propagate interrupted flag. So when we start the workspace starting task is saved and we keep it until workspace is started. If the task is canceled within thread interruption e.g. future.cancel(true) start task tries to stop the execution and cleanup corresponding resources. The caller of start(e.g. WorkspaceManager) will receive EnvironmentStartInterruptedException and may do whatever it needs to, for now workspace manager logs the info message that workspace start was directly interrupted and considers it as a normal behaviour.

WorkspaceRuntimes#startAsync now returns CompletableFuture<WorkspaceRuntime>
instead of regular Future which allows to handle start result in the same thread and reduces amount of threads needed for start. AlsoWorkspaceRuntimes doesn't expose RuntimeDescriptor anymore, it provides methods for getting & filling workspace with runtime information instead:

  • getStatus(workspaceId) - gets status
  • getRuntime(workspaceId) - gets runtime
  • injectRuntime(workspace) - injects status + runtime into workspace

WorkspaceRuntimes#getWorkspaces is removed as it's not used and it deals
with stale workspace statuses which is not good.

Agents now launched after a single machine is launched that allows to fail
faster if there is an error that doesn't allow to start the machine. In current version
of che agents are launched after all the machines are launched.

Also there are new properties for workspace pool adjustment, allows to configure
workspace starting queue dynamically. The properties are:

  • che.workspace.pool.type possible values are fixed, cached
  • che.workspace.pool.exact_size optional. This property is ignored when pool type is different from fixed. Configures the exact size of the pool, if it's set multiplier property is ignored.
    If this property is not set(0, < 0, NULL) then pool sized to number of cores, it can be modified
    within multiplier property
  • che.workspace.pool.cores_multiplier optional. This property is ignored when pool type is different from fixed or exact pool size is set. If it's set the pool size will be N_CORES * multiplier

@codenvy-ci
Copy link

Build # 1611 - FAILED

Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1611/ to view the results.

* }
* <pre>{@code
* StripedLocks stripedLocks = new StripedLocks(16);
* try (CloseableLock lock = stripedLocks.writeLock(myKey)) {
Copy link
Member

@sleshchenko sleshchenko Jan 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change CloseableLock to UnLocker

@codenvy-ci
Copy link

Build # 1614 - FAILED

Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1614/ to view the results.

@codenvy-ci
Copy link

Build # 1619 - FAILED

Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1619/ to view the results.

# operations that require asynchronous execution e.g. starting/stopping/snapshotting

# possible values are 'fixed', 'cached'
che.workspace.pool.type=fixed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to overwrite these values from che.env? If not then why do we need these properties here?

Copy link
Contributor Author

@voievodin voievodin Jan 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not then why do we need these properties here?

I didn't do anything for overwriting them from che.env.
I have the opposite question, if i put the properties into this place i probably mean that this is a part of workspace server configuration so WHY do i have to put them somewhere else?
From you question it looks like either property is present in che.properties and somewhere else or it's missing from che.properties.

The properties are low-level configuration for workspace threads pool, they allow users to provide more dynamic configuration of starting/stopping/snapshotting q.
So where should i put these properties?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evoevodin has taken the right approach.

The way it works is that anything in che.properties can be overridden in the che.env with a naming convention. So in this case, an admin can enter CHE_WORKSPACE_POOL_TYPE=cached and it will override the property.

However, we are only adding entries in the default che.env which would be entries that we commonly expect an administrator to need to edit. For all other entries, which are more system operational internal entries, we just leave them in the che.properties and give admins a reference to it for investigation if they want.

I would include this property as an internal system property.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -86,7 +86,6 @@ public void shouldNotWriteIntoSubConsumersAfterClosingCompositeConsumer() throws
public Object[][] subConsumersExceptions() {
return new Throwable[][] {
{new ConsumerAlreadyClosedException("Error")},
{new ClosedByInterruptException()}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you changed this would you be able to add tests also?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@@ -28,8 +28,9 @@
* <p>Workspace becomes starting only if it was {@link #STOPPED}.
* The status map:
* <pre>
* STOPPED -> <b>STARTING</b> -> RUNNING (normal behaviour)
* STOPPED -> <b>STARTING</b> -> STOPPED (failed to start)
* STOPPED -> <b>STARTING</b> -> RUNNING (normal behaviour)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing something or this docs should include snapshotting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a definition for the STARTING status, there is now way you can go from STARTING to SNAPSHOTTING workspace, and only possible options are described in status doc, which are:

STOPPED -> STARTING -> RUNNING (regular flow)
STOPPED -> STARTING -> STOPPING (start cancellation)
STOPPED -> STARTING -> STOPPED (failed to start)

@Override
void close();
default void close() { unlock(); }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to have unlock method inside of close if it is the same thing?

Copy link
Contributor Author

@voievodin voievodin Jan 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it makes more sense to implement & use Unlocker.unlock instead of Unlocker.close.

@codenvy-ci
Copy link

@codenvy-ci
Copy link

@voievodin voievodin force-pushed the start_cancellation branch 2 times, most recently from f9ff29f to f87be46 Compare January 16, 2017 15:23
@voievodin voievodin merged commit 1ede484 into master Jan 16, 2017
@voievodin voievodin deleted the start_cancellation branch January 16, 2017 16:05
@codenvy-ci
Copy link

@slemeur slemeur added this to the 5.1.0 milestone Jan 17, 2017
@slemeur slemeur added the kind/enhancement A feature request - must adhere to the feature request template. label Jan 17, 2017
voievodin pushed a commit that referenced this pull request Jan 17, 2017
voievodin pushed a commit that referenced this pull request Jan 17, 2017
@JamesDrummond JamesDrummond mentioned this pull request Jan 31, 2017
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement A feature request - must adhere to the feature request template.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide ability to cancel workspace start
8 participants