Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test that DagFileProcessor can operator against on a Serialized DAG #8739

Merged
merged 1 commit into from
May 30, 2020

Conversation

ashb
Copy link
Member

@ashb ashb commented May 6, 2020

As part of the scheduler HA work we are going to want to separate the
parsing from the scheduling, so this changes the tests to ensure that
the important methods of DagFileProcessor can do everything the need to
when given a SerializedDAG, not just a DAG. i.e. that we have correctly
serialized all the necessary fields.


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg boring-cyborg bot added the area:Scheduler Scheduler or dag parsing Issues label May 6, 2020
@kaxil
Copy link
Member

kaxil commented May 6, 2020

Looks like it works fine :)

@ashb ashb marked this pull request as ready for review May 7, 2020 10:38
@ashb ashb marked this pull request as draft May 7, 2020 10:52
tests/jobs/test_scheduler_job.py Outdated Show resolved Hide resolved
@ashb ashb force-pushed the can-schedule-on-s10n-dag branch 2 times, most recently from 81e6ae8 to e29c35c Compare May 15, 2020 09:04
@ashb ashb force-pushed the can-schedule-on-s10n-dag branch from b36b276 to f3dbc38 Compare May 21, 2020 09:51
@ashb ashb marked this pull request as ready for review May 21, 2020 15:39
@ashb
Copy link
Member Author

ashb commented May 21, 2020

Just the Travis/Kube tests failing now. And they have been doing that for a while now :(

@ashb ashb force-pushed the can-schedule-on-s10n-dag branch from 26110c3 to 06932e0 Compare May 25, 2020 13:15
As part of the scheduler HA work we are going to want to separate the
parsing from the scheduling, so this changes the tests to ensure that
the important methods of DagFileProcessor can do everything the need to
when given a SerializedDAG, not just a DAG. i.e. that we have correctly
serialized all the necessary fields.
@ashb ashb force-pushed the can-schedule-on-s10n-dag branch from 06932e0 to da07181 Compare May 29, 2020 16:42
@@ -117,7 +116,6 @@ def test_remove_stale_dags(self):
self.assertFalse(SDM.has_dag(stale_dag.dag_id))
self.assertTrue(SDM.has_dag(fresh_dag.dag_id))

@mock.patch('airflow.models.serialized_dag.STORE_SERIALIZED_DAGS', True)
def test_bulk_sync_to_db(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this test work now, this doesn't store serialized DAG anymore, does it ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean with the patch, it would only sync and write DAGs to DAGModel table and assert that queries count is 7, is that the intention

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SDM.bulk_sync_to_db will now always write to the DB when called.

Before it would return if the config was false. Now the check is moved in to DagBag.bulk_sync:

        if self.store_serialized_dags:
            SerializedDagModel.bulk_sync_to_db(self.dags.values())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This test isn't great, but that wasn't changed by this)

@kaxil kaxil merged commit 735bf45 into apache:master May 30, 2020
@kaxil kaxil deleted the can-schedule-on-s10n-dag branch May 30, 2020 16:36
@kaxil kaxil mentioned this pull request Jul 15, 2020
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:Scheduler Scheduler or dag parsing Issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants