Job not getting stopped when number of available hosts < 'minHosts' property #579
Srinath, please test the fix on branch "issue579". Since it could be tricky for me to set up this exact scenario, I'm hoping you can try it in your setup and see if this is the right fix. Thanks to your thorough issue report here, I feel confident that I understand what caused the issue.
21:16:40.119 [pool-2-thread-1] INFO c.m.c.d.impl.WriteBatcherImpl - (withForestConfig) Using [rh7v-intel64-90-test-5, rh7v-intel64-90-test-5.marklogic.com, rh7v-intel64-90-test-4.marklogic.com] hosts with forests for "WriteHostBatcher"
21:15:24.303 [main] INFO c.m.c.d.impl.WriteBatcherImpl - (withForestConfig) Using [rh7v-intel64-90-test-5, rh7v-intel64-90-test-6.marklogic.com, rh7v-intel64-90-test-4.marklogic.com] hosts with forests for "WriteHostBatcher"
Keen observation, Srinath! I think it's worth its own issue. Could you log it separately and keep this issue focused on fixing the infinite retry scenario?
Sam, with the 'issue579' branch I still face the same issue. I've attached the logs with logging done the way you wanted (like the stress tests). One additional piece of information: the initial client object (used to obtain the instance of DataMovementManager) is connected to "rh7v-intel64-90-test-5" (which is one of the nodes shut down during the test).
Ok, looks like your keen observation may have hit the nail on the head. I was distracted by the idea "even after job is stopped", and that's what I fixed in the last commit. But now we're seeing a situation where the job is not getting stopped because, as you noticed, the hostname on the DatabaseClient doesn't match any of the preferred hostnames from ForestConfiguration, so we're not black-listing a host because we don't see it in the list of preferred hosts. Yet we're retrying the failed batch on the assumption that the black-listing has already occurred (sketched below).
I believe I have a fix. Please forgive me for asking you to test it again.
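For what it's worth, a minimal, self-contained sketch of that failure mode, assuming black-listing works by exact string match against the preferred-host set; the class and method names here are hypothetical, not the actual WriteBatcherImpl/HostAvailabilityListener code:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of black-listing by exact string match followed by
// an unconditional retry; not the library's code.
public class RetryAssumptionSketch {

    private final Set<String> preferredHosts = ConcurrentHashMap.newKeySet();

    RetryAssumptionSketch(String... hosts) {
        for (String h : hosts) preferredHosts.add(h);
    }

    // Invoked when a request against 'failedHost' throws a connection error.
    void onHostUnavailable(String failedHost, Runnable retryBatch) {
        // Intended behavior: black-list the failed host so the retry lands elsewhere.
        boolean removed = preferredHosts.remove(failedHost);

        // If 'failedHost' is the DatabaseClient's short hostname and the set only
        // holds the names from ForestConfiguration, nothing is removed -- yet the
        // batch is retried as if the black-listing had taken effect.
        if (!removed) {
            System.out.println("WARN: " + failedHost + " not found in " + preferredHosts
                    + "; retrying against an unchanged host list");
        }
        retryBatch.run();
    }

    public static void main(String[] args) {
        RetryAssumptionSketch sketch = new RetryAssumptionSketch(
                "rh7v-intel64-90-test-4.marklogic.com",
                "rh7v-intel64-90-test-5.marklogic.com",
                "rh7v-intel64-90-test-6.marklogic.com");
        // The DatabaseClient was created with the short hostname, so it never matches.
        sketch.onHostUnavailable("rh7v-intel64-90-test-5",
                () -> System.out.println("retrying batch..."));
    }
}
```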
Please test on the issue579 branch.
Sam, I am still seeing the issue (job not getting stopped). I have attached the logs.
A. Initially all 3 nodes are up, with a forest on each of them (WriteHostBatcher-1/2/3 on rh7v-intel64-90-test-4/5/6.marklogic.com). It can be seen that if the db client object used to query the 'forestinfo' endpoint uses the hostname "rh7v-intel64-90-test-5" (as opposed to the FQDN), the Host is set to "rh7v-intel64-90-test-5.marklogic.com" and the alternate host is set to "rh7v-intel64-90-test-5".
B. Now, node rh7v-intel64-90-test-6.marklogic.com (hosting WriteHostBatcher-3) is stopped. The node is blacklisted and hence removed from the hashset, as expected.
C. This is when forest failover starts to occur. The forest on rh7v-intel64-90-test-6.marklogic.com is configured to fail over to rh7v-intel64-90-test-5.marklogic.com. As can be seen, when the 'forestinfo' endpoint is queried now, for 'WriteHostBatcher-3' it returns 'Host' as rh7v-intel64-90-test-6.marklogic.com (where it was configured to run originally) and 'alternateHost' as rh7v-intel64-90-test-5.marklogic.com (where the forest failed over to and is currently running). So the preferred host for this 'forest' object is now set to 'rh7v-intel64-90-test-5.marklogic.com', and the hashset now contains [rh7v-intel64-90-test-5, rh7v-intel64-90-test-5.marklogic.com, rh7v-intel64-90-test-4.marklogic.com].
D. Now, when host "rh7v-intel64-90-test-5" is stopped, "rh7v-intel64-90-test-5" is blacklisted. The job should stop as 2 hosts are down, but the hashset still contains [rh7v-intel64-90-test-5.marklogic.com, rh7v-intel64-90-test-4.marklogic.com] and hence it continues to run. Based on this, it looks like the hashset 'hosts' should be populated with the object 'host' (forestNode.get("host").asText()).
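To make steps A-D concrete, here is a minimal, self-contained sketch of the host-set accounting described above; the ForestInfo class and preferredHost() method are hypothetical stand-ins for the library's internals, and the host names mirror the hashset contents quoted in steps C and D:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the host-set accounting in steps A-D; ForestInfo is a
// stand-in for one entry of the 'forestinfo' response, not the library's types.
public class PreferredHostSketch {

    static class ForestInfo {
        final String host;          // host the forest is configured on
        final String alternateHost; // host the forest is currently running on, if failed over
        ForestInfo(String host, String alternateHost) {
            this.host = host;
            this.alternateHost = alternateHost;
        }
        String preferredHost() {
            // Preferred host = alternateHost when present, otherwise host.
            return alternateHost != null ? alternateHost : host;
        }
    }

    public static void main(String[] args) {
        // State after step C: WriteHostBatcher-3's forest has failed over from test-6 to test-5.
        ForestInfo[] forests = {
            new ForestInfo("rh7v-intel64-90-test-4.marklogic.com", null),
            new ForestInfo("rh7v-intel64-90-test-5.marklogic.com", "rh7v-intel64-90-test-5"),
            new ForestInfo("rh7v-intel64-90-test-6.marklogic.com", "rh7v-intel64-90-test-5.marklogic.com")
        };

        Set<String> hosts = new HashSet<>();
        for (ForestInfo f : forests) {
            hosts.add(f.preferredHost());
        }
        // hosts now holds three names but only two distinct machines:
        // [rh7v-intel64-90-test-5, rh7v-intel64-90-test-5.marklogic.com,
        //  rh7v-intel64-90-test-4.marklogic.com]
        System.out.println("hosts: " + hosts);

        // Step D: "rh7v-intel64-90-test-5" is stopped and black-listed under that
        // name only; its FQDN alias survives, so the set still reports 2 hosts
        // even though only one machine is actually up.
        hosts.remove("rh7v-intel64-90-test-5");
        System.out.println("after black-listing: " + hosts + " (size=" + hosts.size() + ")");
    }
}
```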
This issue still occurs in the develop branch. All the information mentioned in the previous comment still applies. The latest client log is attached (WriteHostBatcher-1/2/3 on rh7v-intel64-90-test-19/20/21.marklogic.com).
…ull host names; #579 - Job not stopped
The job stops after the fix.
This issue was observed with a specific forest configuration described below:
A. This test was run on a 3-node cluster (rh7v-intel64-90-test-4/5/6.marklogic.com) with a forest (WriteBatcher-1/2/3) on each node, all associated with a db. 'WriteBatcher-1' is not configured for failover. 'WriteBatcher-3' is configured to fail over to host 'rh7v-intel64-90-test-5.marklogic.com'. 'WriteBatcher-2' is configured to fail over to host 'rh7v-intel64-90-test-4.marklogic.com'.
B. Now, while the 'ihb2' WB job is executing, node rh7v-intel64-90-test-6.marklogic.com is stopped first.
21:16:24.885 [main] ERROR c.m.c.d.HostAvailabilityListener - ERROR: host unavailable "rh7v-intel64-90-test-6.marklogic.com", black-listing it for PT15S
The forest fails over to 'rh7v-intel64-90-test-5.marklogic.com'. The writing of documents to the db resumes once the failover is complete.
C. Now 'rh7v-intel64-90-test-5.marklogic.com' is stopped. It gets blacklisted
21:17:02.508 [main] ERROR c.m.c.d.HostAvailabilityListener - ERROR: host unavailable "rh7v-intel64-90-test-5", black-listing it for PT15S
D. After that, the job is stopped as available hosts < minHosts
21:17:02.772 [pool-1-thread-1] ERROR c.m.c.d.HostAvailabilityListener - Encountered [com.sun.jersey.api.client.ClientHandlerException: org.apache.http.NoHttpResponseException: The target server failed to respond] on host "rh7v-intel64-90-test-5.marklogic.com" but black-listing it would drop job below minHosts (2), so stopping job "unnamed".
E. After that, retrying of failed batches keeps running infinitely (see the sketch at the end of this report).
21:17:02.550 [main] WARN c.m.c.d.HostAvailabilityListener - Retrying failed batch: 132, results so far: 2640, uris: [/local/ABC-2620, /local/ABC-2621, /local/ABC-2622, /local/ABC-2623, /local/ABC-2624, /local/ABC-2625, /local/ABC-2626, /local/ABC-2627, /local/ABC-2628, /local/ABC-2629, /local/ABC-2630, /local/ABC-2631, /local/ABC-2632, /local/ABC-2633, /local/ABC-2634, /local/ABC-2635, /local/ABC-2636, /local/ABC-2637, /local/ABC-2638, /local/ABC-2639]
F. The client process was killed after some time, and the client logs and stack trace have been attached.
Client log
Stack trace
Test:
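For reference, below is a minimal sketch of the minHosts check described in steps D and E; the names are hypothetical, and it is neither the attached test nor the actual HostAvailabilityListener implementation. The point it illustrates: the job should stop once black-listing a host would leave fewer than minHosts hosts, but if the host set still carries a stale alias for a machine that is already down, the count never crosses the threshold and the failed batch keeps being retried.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the minHosts decision from steps D and E; not the
// actual HostAvailabilityListener code.
public class MinHostsSketch {

    private final Set<String> availableHosts = new HashSet<>();
    private final int minHosts;
    private boolean jobStopped = false;

    MinHostsSketch(Set<String> hosts, int minHosts) {
        this.availableHosts.addAll(hosts);
        this.minHosts = minHosts;
    }

    // Invoked when a request against 'failedHost' throws a connection error.
    void onHostUnavailable(String failedHost, Runnable retryBatch) {
        if (availableHosts.contains(failedHost) && availableHosts.size() - 1 < minHosts) {
            // Expected behavior (step D): black-listing this host would drop the
            // job below minHosts, so stop the job and schedule no more retries.
            jobStopped = true;
            System.out.println("Stopping job: black-listing " + failedHost
                    + " would drop below minHosts (" + minHosts + ")");
            return;
        }
        // Otherwise black-list the host and retry the batch elsewhere. If the set
        // still holds a stale alias for a machine that is already down, this
        // branch keeps being taken and the retries never end -- the behavior in step E.
        availableHosts.remove(failedHost);
        if (!jobStopped) {
            retryBatch.run();
        }
    }

    public static void main(String[] args) {
        Set<String> hosts = new HashSet<>();
        hosts.add("rh7v-intel64-90-test-5.marklogic.com");
        hosts.add("rh7v-intel64-90-test-4.marklogic.com");
        MinHostsSketch sketch = new MinHostsSketch(hosts, 2);
        // The short hostname is not in the set, so the minHosts guard never fires
        // and the batch is retried.
        sketch.onHostUnavailable("rh7v-intel64-90-test-5",
                () -> System.out.println("Retrying failed batch..."));
    }
}
```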