-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented new REST API hpFlows to add high priority flows to Azkaban. #3014
base: master
Are you sure you want to change the base?
Conversation
…ll only use ACTIVE version of the docker image of all the jobtypes required. This ensures that these flows always run on a stable version rather than any version which is being ramped up and is potentially unstable. There API can be only called by either Azkaban admins or the designated owners. Currently the ability to add owners is a manual process. An ADMIN only API to add such owners will come in the future. API to remove a flow or list of flows from the high priority flows will also come in future. Added unit tests as and where needed.
*/ | ||
@Override | ||
public boolean isHPFlowOwner(final String userId) { | ||
final FetchHPFlowOwnership fetchHPFlowOwnership = new FetchHPFlowOwnership(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to create a new object every time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure, I believe I followed the pattern used in the rest of the code. Do you see any harm in doing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that in general dependencies should be injected. I don't see a specific issue here, but I think it might be better memory management to only create the object as necessary, especially since it seems like we're using this in a mostly static context (only to access the encapsulated query).
log.info("Added the HP flows by user " + userId); | ||
return count; | ||
} catch (final SQLException e) { | ||
log.error("Unable to add HP flows", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we interested in adding the user in this log message, if we're adding the user in the info message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good catch. Will add.
if (e.getErrorCode() == SQL_ERROR_CODE_DUPLICATE_ENTRY) { | ||
errorMessage = "Reason: One or more flows already exists."; | ||
} else { | ||
errorMessage = "Reason: "; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we say "Reason: Unknown" here, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
public class HPFlowDTO extends BaseDTO { | ||
// CSV of flow ids. | ||
@JsonProperty("flowIds") | ||
@NotBlank(message = "flowList cannot be empty.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be flowIds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. Should be consistent.
* @return list of flow IDs. | ||
*/ | ||
public List<String> getFlowIdList() { | ||
if (this.flowIds.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this possible without an error from the Not Blank annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a sanity check. If you think its useless, let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better not to validate twice, but that is just my personal preference.
} | ||
|
||
@Override | ||
protected void handleGet(final HttpServletRequest req, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OOC: Is this a todo or do we have no plans to implement the Get call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the time when this PR was created, expectation was to have basic functionality implemented hence this one was not implemented. However, things have now changed, it will be implemented in the upcoming commit for this PR.
* @param user | ||
* @return | ||
*/ | ||
private boolean hasHPFlowManagementPermission(final User user) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be incorrect as I am going from memory, but wouldn't a direct call to the permission manager provide the same functionality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API checks if the user is Azakaban admin or not, if yes, then PermissionManager is not called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct. I was miss-remembering. Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will users be added as owners for hp_flows? Will it be manual task and db entry directly? Can you please also add ticket for getHpFlows? It will be useful API. It will also be good to show this information on /status page. Can you please add task for that too?
* @return | ||
* @throws ImageMgmtException | ||
*/ | ||
public boolean hasPermission(final String userId) throws ImageMgmtException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is validating permission for hp flow. Can you please rename this method to convey the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
} | ||
|
||
// Throw exception with FORBIDDEN error. | ||
log.error(String.format("API access permission check failed. The user %s " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please create object for error message and use it for both logging as well as exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will do.
*/ | ||
public interface HPFlowDao { | ||
/** | ||
* isHPFlowOwner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please appropriate comment for each method? For example, why is hpFlowOwner only taking userId and no reference for flow? Shouldn't there be a mapping between user and flow in metadata?
Please add appropriate method doc for rest of the methods too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As stated in reply to main comment, the idea was to have only a handful of owners for a certain jobtype to use this functionality. The logic with mapping will come in next commit.
|
||
// Flow ID is a fully qualified flow name with format, | ||
// <project_name>.<flow_ame> | ||
private static String INSERT_HP_FLOW_OWNER = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any mapping between hp_flow_owners and hp_flows tables? Can anyone be owner for any flows and insert anything? Can you please add more detail about who can be the owner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied in earlier comment.
final FetchHPFlowOwnership fetchHPFlowOwnership = new FetchHPFlowOwnership(); | ||
try { | ||
final String returnedName = this.databaseOperator | ||
.query(FetchHPFlowOwnership.FETCH_HP_OWNER_BY_NAME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename FETCH_HP_OWNER_BY_NAME to FETCH_HP_FLOW_OWNER_BY_NAME?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense!
-- Definition for hp_flow_owners table. It contains list of owners who have | ||
-- access to add/remove high priority flows. | ||
CREATE TABLE IF NOT EXISTS hp_flow_owners ( | ||
owner VARCHAR(64) NOT NULL PRIMARY KEY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw insert and select query where this column name is "name". This sql script and code in PR will break things. Can you please test this part and change column name appropriately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I tested it myself but I am admin so it never went to user query, will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please test everything as user on solo-server and verify all the APIs. You will miss many bugs otherwise.
-- Definition for hp_flows. It contains the flow_id in format "project_name.flow_name". | ||
|
||
CREATE TABLE IF NOT EXISTS hp_flows ( | ||
flow_id VARCHAR(256) NOT NULL PRIMARY KEY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename this field based on my comments about flowId and flowName.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do.
|
||
@VisibleForTesting | ||
List<String> getFlowIdList(final HPFlowDTO hpFlowDTO) { | ||
return hpFlowDTO.getFlowIdList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the input to get FlowIdList? Why is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add appropriate scope for this method and documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is for testing only, will add comments.
|
||
private static final Logger log = Logger.getLogger(HPFlowServlet.class); | ||
private static final String BASE_HP_FLOW_URI = "/hpFlows"; | ||
private ConverterUtils converterUtils; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Util should not be an instance. All the methods in utils should be public static. Can you please remove this instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 with @jakhani
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not an instance, in the init() method, it is fetched from the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Utils should only have static methods.
-- Definition for hp_flows. It contains the flow_id in format "project_name.flow_name". | ||
|
||
CREATE TABLE IF NOT EXISTS hp_flows ( | ||
flow_id VARCHAR(256) NOT NULL PRIMARY KEY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add index for flow_id if there is chance of it growing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is primary key, it is indexed by default.
@@ -120,3 +120,20 @@ CREATE UNIQUE INDEX version_set_md5 | |||
-- alter table execution_flows add column dispatch_method TINYINT default 1; | |||
-- CREATE INDEX ex_flows_dispatch_method ON execution_flows (dispatch_method); | |||
|
|||
-- Definition for hp_flow_owners table. It contains list of owners who have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a design doc for the db tables, dao and service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see a method to update the hp_flow_owners table, such as insert or delete. Also it might be useful if we add a state (active 1 vs inactive 0) col for the owners table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As earlier stated, the original intent of this PR was to get basic functionality out and this was supposed to be followed by implementation for ownership mapping. However, this did not go in time due to other factors and the rest of the implementation will be done in this PR only.
* @return | ||
* @throws ImageMgmtException | ||
*/ | ||
public boolean hasPermission(final String userId) throws ImageMgmtException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method name should be hasHPFlowPermission
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept it same as there is already a method with same name.
* @return boolean | ||
*/ | ||
@Override | ||
public boolean hasPermission(final String userId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems all HP owners have permission to update the hp_flows table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this PR, yes. Eventually no.
-- Definition for hp_flows. It contains the flow_id in format "project_name.flow_name". | ||
|
||
CREATE TABLE IF NOT EXISTS hp_flows ( | ||
flow_id VARCHAR(256) NOT NULL PRIMARY KEY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about index col and project_id col?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flow_id is fully qualified with project name.
At this PR creation time, the idea was to keep handful of owners for a jobtype partner who will add the HP flows. However, now the priorities have changed and we can work on implementation to manage ownership mapping. |
*/ | ||
private boolean hasHPFlowManagementPermission(final User user) { | ||
/** | ||
* Azkaban ADMIN role must have full permission to access image management APIs. Hence, no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this copy-paste text perhaps?
These high priority flows when run in containeirzed dispatch mode, will only use ACTIVE version of the docker image of all the jobtypes required.
This ensures that these flows always run on a stable version rather than any version which is being ramped up and is potentially unstable.
There API can be only called by either Azkaban admins or the designated owners.
Currently the ability to add owners is a manual process. An ADMIN only API to add such owners will come in the future.
API to remove a flow or list of flows from the high priority flows will also come in future.
Added unit tests as and where needed.