Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a test to run a v2 flow file with the new DAG engine #1840

Merged

Conversation

HappyRay
Copy link
Contributor

@HappyRay HappyRay commented Jul 9, 2018

This is a step towards building an integration test for the new DAG engine.

The jobs runs are skipped in this test.

Refactored the DagBuilder API to make it easier to use.
Instead of using a DagBuilder class to link nodes, use the name of the
nodes directly.

This facility can also be used in the future to build the tools to run
flows locally for testing purposes.

This is a step towards building an integration test for the new DAG engine.

Refactored the DagBuilder API to make it easier to use.
Instead of using a DagBuilder class to link nodes, use the name of the
nodes directly.

This facility can also be used in the future to build the tools to run
flows locally for testing purposes.
@HappyRay HappyRay requested a review from jamiesjc July 9, 2018 18:02
public DagBuilder(final String name, final DagProcessor dagProcessor) {
requireNonNull(name, "The name of the DagBuilder can't be null");
this.name = name;
requireNonNull(name, "The dagProcessor of the DagBuilder can't be null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be requireNonNull(dagProcessor, ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks

Should check non null of the dagProcessor parameter.
@@ -98,16 +141,16 @@ private void checkCircularDependencies() {
class CircularDependencyChecker {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependency check has already been done when loading the flow files: azkaban.project.DirectoryYamlFlowLoader#buildFlowEdges
Is there any particular reason why this needs to be implemented again in building the DAG?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DAG component is designed with minimal assumptions including the validations done to its inputs prior to passing these inputs to the DAG engine.

This allows this component to be used in more scenarios. e.g. the test included in this PR doesn't have a need to convert the v2 flow into Flow object which is where the azkaban.project.DirectoryYamlFlowLoader#buildFlowEdges is called.

In the future, we can consider getting rid of one of the validations in the AZ loading scenario. For the time being, the cost of double validation seems small.

this.dagBuilder = new DagBuilder(flowName, new SimpleDagProcessor());
}

Dag create() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that logic of creating Dag is similar to what has been done in DirectoryYamlFlowLoader. Would it be easier to just reuse the existing part? Do you plan to eventually integrate them together or keep them as separate paths?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the DirectoryYamlFlowLoader code in detail.

Eventually, yes, we will need to handle running both v1 and v2 flows. Will look into reducing logic duplication then.

Copy link
Contributor

@jamiesjc jamiesjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@HappyRay HappyRay merged commit e49b22d into azkaban:master Jul 9, 2018
@HappyRay HappyRay deleted the run-flow-v2-in-tests-new-dag-engine branch July 9, 2018 22:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants