Skip to content

Comments

Implement pipeline writers for artifact collection (includes PR-737)#738

Merged
yoonspark merged 24 commits intomainfrom
implement-pipeline-writers-for-artifact-collection
Aug 17, 2022
Merged

Implement pipeline writers for artifact collection (includes PR-737)#738
yoonspark merged 24 commits intomainfrom
implement-pipeline-writers-for-artifact-collection

Conversation

@yoonspark
Copy link
Contributor

@yoonspark yoonspark commented Aug 8, 2022

Description

Implement pipeline writers for SCRIPT (LIN-477) and AIRFLOW (LIN-484) frameworks. These writers use new graph refactoring implemented by ArtifactCollection for pipeline file generation; and they are meant to eventually replace existing plugins.

NOTE: This PR "absorbs" #737.

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

New unit tests were added to validate if pipeline files are properly generated.

@mingjerli mingjerli changed the title Implement pipeline writers for artifact collection Implement pipeline writers for artifact collection (include PR-737) Aug 8, 2022
@yoonspark yoonspark force-pushed the implement-pipeline-writers-for-artifact-collection branch from d52f60d to 2cba35d Compare August 8, 2022 14:30
@andycui97
Copy link
Contributor

General comment, how were the "expected" files generated, and how did we validate that they are correct?

@yoonspark
Copy link
Contributor Author

@andycui97 The "expected" files were generated by running the test scenarios over local Jupyter notebook sessions; after manually inspecting/testing they look good and runnable, these files were copied to the test folder as "expected" results.

@yoonspark yoonspark force-pushed the implement-pipeline-writers-for-artifact-collection branch from 72c90f4 to 2947ee5 Compare August 8, 2022 19:41
Comment on lines 1 to 2
lineapy==0.1.4
lineapy==0.1.4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are there duplicates here? i think we ensure that we dont have duplicate ones. also for lineapy the plugins dont include the exact version for lineapy (the tests break every time we try to release because the versions wont match)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it is concatenating requirements from each session. In this case, it is from two sessions, so it is duplicated. I've fixed the problem.

However, it actually raises an interesting question. What should we do if the requirements for two sessions conflict?

@lionsardesai lionsardesai changed the base branch from main to 158-userdef-classmethods-mutate August 10, 2022 18:11
@lionsardesai lionsardesai changed the base branch from 158-userdef-classmethods-mutate to main August 10, 2022 18:11
Comment on lines +17 to +23
dag = DAG(
dag_id="{{ DAG_NAME }}_dag",
schedule_interval="{{ SCHEDULE_INTERVAL }}",
max_active_runs={{ MAX_ACTIVE_RUNS }},
catchup={{ CATCHUP }},
default_args=default_dag_args,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could simplify this to use context manager that auto adds dags to the operators. see: https://airflow.apache.org/docs/apache-airflow/stable/tutorial.html

if seems that using context manager is preferred but i leave it to @mingjerli and @andycui97 to determine if this is backwards compatible to sufficiently old airflow version.

Copy link
Contributor

@lionsardesai lionsardesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

talked to @yoonspark, and holding off the merge till changes are done.

keep_lineapy_save: bool = False,
pipeline_name: str = "pipeline",
output_dir: str = ".",
dag_config: Optional[AirflowDagConfig] = {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should generalize this later. for now we can proceed as is.

Copy link
Contributor

@lionsardesai lionsardesai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lg!

@yoonspark yoonspark changed the title Implement pipeline writers for artifact collection (include PR-737) Implement pipeline writers for artifact collection (includes PR-737) Aug 17, 2022
@yoonspark yoonspark merged commit 1301da8 into main Aug 17, 2022
@yoonspark yoonspark deleted the implement-pipeline-writers-for-artifact-collection branch August 17, 2022 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants