New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Waiting for index range calculation before switching deflector alias. #2278
Changes from 12 commits
d571f4e
c74d5c6
8af9812
32846cf
d523b74
4211a3f
d30d23c
562ea41
4ee1bd9
6a1fa50
9fdf88a
ff0755c
7eb08c5
56dd94a
af3d619
74ea13c
449ab41
477a01d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,4 +31,5 @@ public interface IndexRangeService { | |
void save(IndexRange indexRange); | ||
|
||
IndexRange calculateRange(String index); | ||
IndexRange createForDeflector(String index); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think Maybe adding it as a default method (https://docs.oracle.com/javase/tutorial/java/IandI/defaultmethods.html) would make sense here as its implementation doesn't rely on any internals of the specific implementation of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good idea! 👍 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Turning out it's not as easy as it seems, as it currently relies on the choice of the value object in the service implementation. Using |
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -26,17 +26,39 @@ | |
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
import javax.annotation.Nullable; | ||
import javax.inject.Inject; | ||
import javax.validation.constraints.Null; | ||
import java.util.Map; | ||
import java.util.concurrent.ConcurrentHashMap; | ||
import java.util.concurrent.Executors; | ||
import java.util.concurrent.ScheduledExecutorService; | ||
import java.util.concurrent.ScheduledFuture; | ||
import java.util.concurrent.ThreadFactory; | ||
import java.util.concurrent.TimeUnit; | ||
|
||
import static com.codahale.metrics.MetricRegistry.name; | ||
|
||
public class SystemJobManager { | ||
|
||
public class ScheduleResult { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should be a static nested class or no nested class at all (as it's also used by other classes). It would also be suitable to be converted to an Auto Value class. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is it necessary to make it static? It does not access any members of the enclosing class. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ✅ You're right. |
||
private final String jobId; | ||
private final ScheduledFuture<?> future; | ||
|
||
ScheduleResult(final String jobId, final ScheduledFuture<?> future) { | ||
this.jobId = jobId; | ||
this.future = future; | ||
} | ||
|
||
public String getJobId() { | ||
return jobId; | ||
} | ||
|
||
public ScheduledFuture<?> getFuture() { | ||
return future; | ||
} | ||
} | ||
|
||
private static final Logger LOG = LoggerFactory.getLogger(SystemJobManager.class); | ||
private static final int THREAD_POOL_SIZE = 15; | ||
|
||
|
@@ -60,10 +82,18 @@ private ScheduledExecutorService executorService(final MetricRegistry metricRegi | |
} | ||
|
||
public String submit(final SystemJob job) throws SystemJobConcurrencyException { | ||
return submitWithDelay(job, 0, TimeUnit.SECONDS); | ||
return submitForResult(job).getJobId(); | ||
} | ||
|
||
public ScheduleResult submitForResult(final SystemJob job) throws SystemJobConcurrencyException { | ||
return submitWithDelayForResult(job, 0, TimeUnit.SECONDS); | ||
} | ||
|
||
public String submitWithDelay(final SystemJob job, final long delay, TimeUnit timeUnit) throws SystemJobConcurrencyException { | ||
return submitWithDelayForResult(job, delay, timeUnit).getJobId(); | ||
} | ||
|
||
public ScheduleResult submitWithDelayForResult(final SystemJob job, final long delay, TimeUnit timeUnit) throws SystemJobConcurrencyException { | ||
// for immediate jobs, check allowed concurrency right now | ||
if (delay == 0) { | ||
checkAllowedConcurrency(job); | ||
|
@@ -74,7 +104,7 @@ public String submitWithDelay(final SystemJob job, final long delay, TimeUnit ti | |
job.setId(new UUID().toString()); | ||
jobs.put(job.getId(), job); | ||
|
||
executor.schedule(new Runnable() { | ||
final ScheduledFuture<?> future = executor.schedule(new Runnable() { | ||
@Override | ||
public void run() { | ||
try { | ||
|
@@ -89,7 +119,7 @@ public void run() { | |
x.stop(); | ||
|
||
final String msg = "SystemJob <" + job.getId() + "> [" + jobClass + "] finished in " + x.elapsed( | ||
TimeUnit.MILLISECONDS) + "ms."; | ||
TimeUnit.MILLISECONDS) + "ms."; | ||
LOG.info(msg); | ||
activityWriter.write(new Activity(msg, SystemJobManager.class)); | ||
} catch (SystemJobConcurrencyException ignored) { | ||
|
@@ -102,7 +132,7 @@ public void run() { | |
}, delay, timeUnit); | ||
|
||
LOG.info("Submitted SystemJob <{}> [{}]", job.getId(), jobClass); | ||
return job.getId(); | ||
return new ScheduleResult(job.getId(), future); | ||
} | ||
|
||
protected void checkAllowedConcurrency(SystemJob job) throws SystemJobConcurrencyException { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the worst case this would block the thread for the duration of
deflectorIndexReadOnlyTimeout
(1 hour by default). What would happen if there is the need to do another cycle during that time?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I have seen in the code, this would mean that another cycle would be done, which would be waiting at the same point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't feel good adding this "waiting on system job" logic in here.
Would it be enough to trigger an index range calculation job in the "make-read-only" job after it's done?
addSingleIndexRanges(oldTarget)
in line 160SetIndexReadOnlyJob
to trigger aCreateNewSingleIndexRangeJob
for the index once the read-only status is setWith this we wouldn't need the timeout logic and all the changes to the system jobs to return a future. We can also be sure that the index rage is correct because the index cannot be written anymore. Until the index is read-only, it will be included in the searches because it has the "deflector" index range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coupling the job which sets indices read only to the index range calculation job sounds wrong to me, because they are not logically coupled and executing the first one should not need you to know (and be surprised about) that the second one is triggered implicitly too. The coupling should be done from the outside (as it is done here) via composition without any inherent coupling.
The logic you are describing is exactly what is happening in the PR, why it should be better to give up a separation of concerns to shift the waiting logic one level lower, while gaining nothing, is beyond me, tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the problem is at line 182.
Why should we wait for job finished?
Is it possible to schedule one job after another (sorry, don't knwo deeply Java)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only problem I see - you will wait here at least for 30 seconds (submit delay) and will block scheduler (single thread) for that time.
however, I'am not sure if it is real problem :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why I've asked if the thread is blocked and what happened if a new deflector cycle was necessary while waiting for the last one. 😉
In the end, this would in the worst case consume all (executor) threads in
SystemJobManager
and block all subsequent deflector cycles (until some threads are freed again).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but from what I've seen this is already the case before this change. I am not sure if you are implying that there is a regression in this PR or that we should also change the current behavior (which I would perceive as scope creep).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scheduleResult.getFuture().get(deflectorIndexReadOnlyTimeout.toMilliseconds(), TimeUnit.MILLISECONDS)
was introduced by this commit.In matster there is no any waiting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the coupling introduced by this wouldn't be nice. We could introduce a separate job class that takes care of both. (i.e.
SetIndexReadOnlyAndCalculateRange
) That would make it clear what the job is supposed to do.I also agree that it's nicer to be able to compose smaller jobs to avoid having lots of specialized jobs, but in this case I would vote for having one to avoid the introduction of timeout logic in this class and to avoid the async/sync issue you described later.
Another option would be to extend the system jobs subsystem to be able to run another job (or more) after one has finished. This would make it possible to compose a chain of system jobs without having to use timeouts or block the calling thread.
I would happily give up a separation of concern when I can avoid using a timeout. 😉 The problem I see with using timeouts is that it's really hard to find a good default for them and that the user probably needs to tune them to make them work with their setup which makes using the product harder.