New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added mechanism to rampup Containerized from POLL dispatch #2779
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2779 +/- ##
============================================
- Coverage 39.73% 39.72% -0.01%
- Complexity 4159 4171 +12
============================================
Files 569 569
Lines 37508 37544 +36
Branches 4365 4370 +5
============================================
+ Hits 14903 14914 +11
- Misses 21359 21382 +23
- Partials 1246 1248 +2
Continue to review full report at Codecov.
|
Pull Request Test Coverage Report for Build 5965
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for looking into this.
As part of containerized dispatch we mode modifications to both ExecutorApiGateway
and ExecutorApiClient
. Most of them related to the reverse proxy but fairly orthogonal - including, TLS client support, generating prefix and making calls agains the reverse-proxy (ambassador) uri.
For executing any API calls against both the bare-metal executors as well as the FlowContainers we will likely need to have two separate instances of the ExecutorApiGateway
with different configs.
Wanted to confirm our plan for addressing it.
@@ -123,7 +138,18 @@ public long getQueuedFlowSize() { | |||
*/ | |||
@Override | |||
public DispatchMethod getDispatchMethod() { | |||
return DispatchMethod.CONTAINERIZED; | |||
if (this.rampUp == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DispatchMethod is very useful and it controlling the behaviour at the moment by turning off and turning on feature. DispatchMethod is currently specified through configs. Hence, dispatch method is initialized during web server startup. The containerization feature is turned off if dispath method is POLL. If dispatch method is set to CONTAINERIZED then only all the guava juice binding for containerization are initialized. As we are dynamically toggling between POLL and CONTAINERIZED based on rampup we need to reload the bindings as well. In other words, we need to always intialized the bindings or look at different way of handling this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. I didn't see any changes related to injection over here. How will implementation for POLL method be injected in case of ramp up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a difference between DispatchMethod.CONTAINERIZED specified through the configs and that will always be DispatchMethod.CONTAINERIZED for the ramp-up feature.
Here it is about the return value of getDispatchMethod() which till now is not used anywhere in the code and hence doesn't affect anything. The return value of this is utilized to set the value of dispatch_method column in the execution_flows table while inserting the entry to the table. Hence at no place injections will be affected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. And specifying dispatch method as POLL from executor side will take care of polling executions. Thank you for clarifying this.
this.numVal = numVal; | ||
} | ||
|
||
public int getNumVal() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any need to add method which will return enum for given num value?
What is the significance of this num value? Can you please add it in comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Till now we don't have a use-case for num to enum.
We are using Int as the data-type of dispatch_method in the execution_flows table and hence getNumVal is needed.
@@ -84,7 +87,8 @@ public void uploadExecutableFlow(final ExecutableFlow flow) | |||
final SQLTransaction<Long> insertAndGetLastID = transOperator -> { | |||
transOperator.update(INSERT_EXECUTABLE_FLOW, flow.getProjectId(), | |||
flow.getFlowId(), flow.getVersion(), flow.getStatus().getNumVal(), | |||
submitTime, flow.getSubmitUser(), submitTime, executorId, flowPriority, executionSource); | |||
submitTime, flow.getSubmitUser(), submitTime, executorId, flowPriority, executionSource | |||
, dispatchMethod.getNumVal()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you are keeping num value for the optimization that int will provide from database side instead of enum name itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
@@ -123,7 +138,18 @@ public long getQueuedFlowSize() { | |||
*/ | |||
@Override | |||
public DispatchMethod getDispatchMethod() { | |||
return DispatchMethod.CONTAINERIZED; | |||
if (this.rampUp == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. I didn't see any changes related to injection over here. How will implementation for POLL method be injected in case of ramp up?
|
||
-- TODO: Add the alter table script in the specific release | ||
-- Adding dispatch_method column in execution_flows | ||
-- alter table execution_flows add column dispatch_method TINYINT default 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add index for this field? It is in the where clause in select query so it will be useful to have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -15,6 +15,7 @@ CREATE TABLE execution_flows ( | |||
use_executor INT DEFAULT NULL, | |||
flow_priority TINYINT NOT NULL DEFAULT 5, | |||
execution_source VARCHAR(32) DEFAULT NULL, | |||
dispatch_method TINYINT DEFAULT 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add index for this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upgrade sql was not added for this change, but it should be (under https://github.com/azkaban/azkaban/tree/master/azkaban-db/src/main/sql).
Same for the added CREATE INDEX below.
Edit: fix candidate here #2932
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking care of all the comments.
This ramp-up mechanism requires that containerization be enabled and ramp-up percentages are set appropriately. Setting it to 100 will make all the executions use Containerization and setting it to 0 implies they will be polled by the executor server.
This allows ContainerizedDispatchManager to submit/upload some flows to the execution_flows table based on the ramp percentage with an additional column value dispatch_method. Whenever executions are pulled from the execution_flows table, respective dispatch managers will query by filtering the dispatch_method. Any future changes to the ContainerizedDispatchManager will automatically be honored as there is no new DispatchManager.
If this mechanism needs to removed then the only change will be required in the class azkaban.executor.container.ContainerizedDispatchManager where getDispatchMethod will be changed to always return DispatchMethod.CONTAINERIZED
Following is the summary of the change: