Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds max running time termination criteria #474

Merged
merged 26 commits into from
Aug 9, 2017

Conversation

jeppe-style
Copy link
Collaborator

@jeppe-style jeppe-style commented Jul 2, 2017

as per title

  • Adds method to terminate experiment in task scheduler
  • Refactors running states handling to separate class
  • Adds morphia converter for Time

fixes #337 (except extensions, that are now in #517)

related to #281

depends on PR #469

TODO

@jeppe-style jeppe-style added this to the Priorities milestone Jul 2, 2017
@jeppe-style jeppe-style mentioned this pull request Jul 2, 2017
42 tasks
@VincenzoFerme VincenzoFerme changed the base branch from devel to feature-tm-dsl-integration July 6, 2017 12:37
@VincenzoFerme VincenzoFerme changed the base branch from feature-tm-dsl-integration to feature-tm-links-response-objects July 6, 2017 12:54
@VincenzoFerme VincenzoFerme force-pushed the feature-tm-links-response-objects branch from b765c0d to 5aff9fd Compare July 6, 2017 13:29
Adds method to terminate experiment in task scheduler. Refactors running states handling to separate class. Adds morphia converter for Time.
@@ -127,6 +128,8 @@ public void run(BenchFlowTestManagerConfiguration configuration, Environment env
// http://mongodb.github.io/mongo-java-driver/3.4/driver/getting-started/quick-start/
MongoClient mongoClient = configuration.getMongoDBFactory().build();
ExecutorService taskExecutor = configuration.getTaskExecutorFactory().build(environment);
ScheduledThreadPoolExecutor timeOutScheduledThreadPoolExecutor =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should build it as you do for the other executor service we have: https://github.com/benchflow/benchflow/pull/474/files#diff-3ebdba5e745234429a46f7536fa3feb8R130

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? I don't see any need to put this in the configuration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postponed in #517

break;

case VALIDATE_TERMINATION_CRITERIA:
validateTerminationCriteria(testID);
runningStatesHandler.validateTerminationCriteria(testID);
break;

case DERIVE_PREDICTION_FUNCTION:
testModelDAO.setTestRunningState(testID, VALIDATE_PREDICTION_FUNCTION);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why you do this here and not in the function called in the next line?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually happens also happens in the function in the next line. Although this wasn't changed in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #474 (commits)

@@ -135,6 +135,11 @@ private synchronized void handleStartState(String testID) {
// wait for task to complete
future.get();

if (isTerminated(testID)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you check this in the setTestState method, so that is done only in one place?

Anyway this should not be possible, because once you get a timeout, you need to get exclusive access to the data structure where you keep running information and cleanup everything before other code gets executed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The running tasks cannot be cancelled - the only way is to wait for them to terminate. Therefore I need to check if the test has been terminated in the meantime. It is also what we discussed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #474 (commits)

@@ -156,6 +161,18 @@ private synchronized void handleWaitingState(String testID) {
// set state as ready
testModelDAO.setTestState(testID, BenchFlowTestState.READY);

// update max running time timeout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need the following code? Is it to restart the execution, and take into account of the time the test has already been executed? If yes, I think this should happen in the RUNNING state when the test goes back to RUNNING.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to update the maxRunning time since that is read when the test goes back to RUNNING. In the WAITING state that is the only place I know how much time has elapsed since the TimeoutTask was scheduled.

@@ -172,44 +189,51 @@ private synchronized void handleTestRunningState(String testID) {
logger
.info("handleTestRunningState for " + testID + " with state " + testRunningState.name());

// set timeout if not already set
if (!timeoutTasks.containsKey(testID)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the test has no timeout?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the question fully

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #474 (commits)

try {

// wait for task to complete
future.get();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you make task cancellable, then probably here you are going to get an exception and you follow the exceptional flow, so that you do not need to check all the time if something is terminated. When you apply the previous comment, then the handling can be delegated to a method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see previous comment about cancellable tasks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worked on this in #474 (commits). It is fine to have.

// replace with new task
testTasks.put(testID, future);

// we don't wait for the task to complete since the experiment-manager
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably at this stage, the waiting for this state should be exactly to wait for the wanted data from the experiment manager, that should be the details for each of the trials that actually gets executed up to the point in which the experiment manager declares the experiment as completed in any of the possible states.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is what happens - the EM informs when the experiment has completed

Copy link
Member

@VincenzoFerme VincenzoFerme Aug 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worked on this in #474 (commits).


try {

// TODO - update: set next state as validate termination criteria
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You set it to REMOVE_NON_REACHABLE_EXPERIMENTS

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the comment is about. This is a TODO

String experimentID = testModelDAO.getRunningExperiment(testID);

if (experimentID != null) {
experimentManagerService.abortBenchFlowExperiment(experimentID);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to get and ACK about the execution of the operation, because otherwise it is really complex to control the behaviour of the distributed system.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is covered in issue #499

@@ -267,6 +270,60 @@ public void runLoadTest() throws Exception {

}

@Test
public void runBenchFlowTestTimeoutTest() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need also some tests where you submit multiple tests, and check the results of some of them, so that you can experience what happens with concurrency

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added that to the issue #499

@VincenzoFerme VincenzoFerme merged commit a19d162 into devel Aug 9, 2017
@VincenzoFerme VincenzoFerme deleted the feature-tm-test-termination-criteria branch August 9, 2017 07:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Test Termination Criteria - Max Time
2 participants