Relax mandatory requirement for start_date when schedule=None #35356

Merged
eladkal merged 6 commits into apache:main from make-start-date-optional on Nov 28, 2023

Conversation

vishnucoder1
Contributor

@vishnucoder1 vishnucoder1 commented Nov 1, 2023

As part of this PR, the existing start_date validation is removed so that DAGs with schedule=None no longer require a start_date.

Closes: #35199

dag = DAG("dag_without_start_date")
dag.add_task(BaseOperator(task_id="task_without_start_date"))
dagrun = dag.create_dagrun(
    state=State.RUNNING, run_type=DagRunType.SCHEDULED, execution_date=DEFAULT_DATE

Contributor

How can DagRunType be scheduled? With scheduled runs we must have a start_date.

Member

How can DagRunType be scheduled? With scheduled runs we must have a start_date.

IMHO it should be ok if catchup is False 🤔

Contributor

The scheduler creates runs with type scheduled. The fix in this PR is for manual runs that are created by the user.

The catchup parameter has no effect on manual runs.

Contributor Author

Updated DagRunType.SCHEDULED to DagRunType.MANUAL

Member

Yes, but I wonder if we can do the same thing for dags with schedule!=None if the catchup is set to False, wdyt?

Contributor

I don't think so, because the first run depends on the start_date. If you set it in the past and the interval has already passed when the DAG is activated, it will create a run. If you set it to a future date, the interval is not yet complete, so no run will be created when the DAG is set to active.
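
The timing argument in this comment can be sketched in a few lines. This is a deliberately simplified model, not Airflow's scheduler or timetable code: `first_run_is_due` is a hypothetical helper, and it assumes a fixed-length interval and that a run becomes due once the first full interval after `start_date` has elapsed.

```python
from datetime import datetime, timedelta


def first_run_is_due(start_date: datetime, interval: timedelta, now: datetime) -> bool:
    """Return True once the first full interval after start_date has elapsed.

    Hypothetical helper for illustration only; it is not part of Airflow.
    """
    return start_date + interval <= now


# Assumed moment at which the DAG is activated.
now = datetime(2023, 11, 3, 12, 0)
daily = timedelta(days=1)

# start_date in the past and one interval has passed: a run is due on activation.
past_due = first_run_is_due(datetime(2023, 11, 1), daily, now)

# start_date in the future: the first interval has not completed, so no run yet.
future_due = first_run_is_due(datetime(2023, 11, 10), daily, now)
```

Under this model, a missing start_date leaves the first interval undefined, which is why the reply treats it as a required input for scheduled runs.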

Member

yes, but we can consider that this condition is always met when the user does not provide a start date.

@eladkal eladkal added this to the Airflow 2.7.4 milestone Nov 3, 2023
@kaxil kaxil added the "full tests needed" label (We need to run full set of tests for this PR to merge) Nov 5, 2023
@uranusjr
Member

uranusjr commented Nov 6, 2023

Do we have a test for a non-None schedule raising an exception?

@vishnucoder1
Contributor Author

vishnucoder1 commented Nov 8, 2023

@uranusjr Do we need to fail explicitly for a non-None schedule with an empty start date? In this case, it will start the job on the next schedule.

@vishnucoder1
Contributor Author

I also want to mention that previously the start date and the task date were both checked, as part of add task, before raising the start date error. However, the schedule will not be empty at that point, because it will hold the default schedule interval value.

@vishnucoder1
Contributor Author

@uranusjr @eladkal @hussein-awala
I need your input here. Do we need to fail explicitly for a non-None schedule with an empty start date? Should the task date also be considered? Checking the start date, the task date, and the schedule together is not possible, because the schedule takes the default interval value when a task is added.

@uranusjr
Member

uranusjr commented Nov 17, 2023

If I remember correctly, we currently fail explicitly for that combination, so it's better to keep that behaviour. Having a DAG that is created silently but can never fire can be confusing for users.

Task start dates are fine, just consider the DAG date.
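
A minimal sketch of the rule agreed on here, assuming the check looks only at the DAG-level start date and ignores task-level dates. `check_dag_start_date` is a hypothetical name, not the function changed in this PR, and the schedule is modelled as a plain string or None. The error message mirrors the ValueError shown in the failing tests reported further down this thread.

```python
from datetime import datetime
from typing import Optional


def check_dag_start_date(schedule: Optional[str], dag_start_date: Optional[datetime]) -> None:
    """Raise if a scheduled DAG has no DAG-level start_date.

    Hypothetical helper for illustration; task-level start dates are ignored.
    """
    if schedule is not None and dag_start_date is None:
        raise ValueError("DAG is missing the start_date parameter")


# schedule=None: start_date is optional, so this passes silently.
check_dag_start_date(schedule=None, dag_start_date=None)

# A schedule together with a start_date is also accepted.
check_dag_start_date(schedule="@daily", dag_start_date=datetime(2023, 11, 1))

# A schedule without a start_date still fails explicitly.
try:
    check_dag_start_date(schedule="@daily", dag_start_date=None)
    raised = False
except ValueError:
    raised = True
```

Failing early in the scheduled case avoids the situation described above, where a DAG is accepted but can never fire.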

@vishnucoder1
Contributor Author

Thanks for the input. Changes are added.

@eladkal
Contributor

eladkal commented Nov 17, 2023

Tests are failing:

ERROR tests/providers/amazon/aws/operators/test_athena.py::TestAthenaOperator::test_return_value - ValueError: DAG is missing the start_date parameter
ERROR tests/providers/amazon/aws/operators/test_datasync.py::TestDataSyncOperatorCreate::test_return_value - ValueError: DAG is missing the start_date parameter
ERROR tests/providers/amazon/aws/operators/test_datasync.py::TestDataSyncOperatorGetTasks::test_return_value - ValueError: DAG is missing the start_date parameter
ERROR tests/providers/amazon/aws/operators/test_datasync.py::TestDataSyncOperatorUpdate::test_return_value - ValueError: DAG is missing the start_date parameter
ERROR tests/providers/amazon/aws/operators/test_datasync.py::TestDataSyncOperator::test_return_value - ValueError: DAG is missing the start_date parameter
ERROR tests/providers/amazon/aws/operators/test_datasync.py::TestDataSyncOperatorDelete::test_return_value - ValueError: DAG is missing the start_date parameter
ERROR tests/providers/amazon/aws/operators/test_dms.py::TestDmsDescribeTasksOperator::test_describe_tasks_return_value - ValueError: DAG is missing the start_date parameter
========== 394 passed, 1861 skipped, 35 warnings, 7 errors in 48.17s ===========

@vishnucoder1
Contributor Author

Fixed the failing tests

airflow/utils/helpers.py (outdated review thread, resolved)
@eladkal eladkal added the "type:improvement" label (Changelog: Improvements) Nov 25, 2023
@eladkal eladkal merged commit 930f165 into apache:main Nov 28, 2023
72 checks passed
@vishnucoder1 vishnucoder1 deleted the make-start-date-optional branch November 29, 2023 17:34
ephraimbuddy pushed a commit that referenced this pull request Dec 5, 2023
* Relax mandatory requirement for start_date when schedule=None

* Updated run_type in unit tests

* Added check for empty start_date and non empty schedule

* Fix the build failures

* Fix the build failures

* Update based on review comments

(cherry picked from commit 930f165)
Labels
full tests needed (We need to run full set of tests for this PR to merge)
type:improvement (Changelog: Improvements)
6 participants