allow setting stage to false for transform steps that dont use input or output nodes #179

Merged
merged 2 commits into from
Nov 18, 2015

Contributor

@cliu587 cliu587 commented Nov 17, 2015

PTAL @sb2nov, @darinyu-coursera. Will land after the new load_reload_pk step is tested with this diff.

…or output nodes

Conflicts:
	dataduct/steps/load_reload_pk.py
Contributor Author

cliu587 commented Nov 18, 2015

@sb2nov do you know why we set output_node=base_output_node at https://github.com/coursera/dataduct/pull/179/files#diff-59074e91ee415f9f629abf53692c99b4L114?

It seems self._output is potentially different from base_output_node, as per the computation in L103.

self.get_output_s3_path(get_modified_s3_path(output_path)))
# Create output_node based on output_path
base_output_node = self.create_s3_data_node(
self.get_output_s3_path(get_modified_s3_path(output_path)))
Contributor

Why remove the if? Without it this will create a useless S3Node, though it doesn't make an actual difference.

Contributor

sb2nov commented Nov 18, 2015

If we used self._output, it would create multiple staging directories. What we want instead is a single staging directory that gets mapped to multiple nodes based on subdirectories, so that the command doesn't need to figure out which staging directory maps to which output. That is also easier to manage.
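
A rough sketch of that idea, for illustration only. This is not dataduct's code: the function name, bucket path, and output names below are hypothetical, and plain strings stand in for S3 nodes.

```python
import posixpath


def map_outputs_to_subdirs(base_output_path, output_names):
    """Map each named output to a subdirectory of one shared staging path.

    Hypothetical helper: dataduct's real step classes build S3 node
    objects; here we only compute the paths to show the layout.
    """
    return {
        name: posixpath.join(base_output_path, name)
        for name in output_names
    }


# Assumed example values, not taken from the PR.
base = "s3://example-bucket/pipeline/step/output"
nodes = map_outputs_to_subdirs(base, ["users", "courses"])

for name, path in sorted(nodes.items()):
    print(name, "->", path)
```

With this layout the command only has to know about one staging directory; each output is found by its subdirectory name rather than by a separate staging location.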

Contributor

sb2nov commented Nov 18, 2015

Let me know if you want more details on it.

Contributor

sb2nov commented Nov 18, 2015

LGTM though

cliu587 added a commit that referenced this pull request Nov 18, 2015
…output

allow setting stage to false for transform steps that dont use input or output nodes
@cliu587 cliu587 merged commit 5d0dfe6 into develop Nov 18, 2015
@sb2nov sb2nov deleted the cliu_use_stage_for_no_input_or_output branch November 19, 2015 16:51
2 participants