[QUARANTINE] The test_scheduler_verify_pool_full test is quarantined #17224

Closed
potiuk opened this issue Jul 26, 2021 · 0 comments · Fixed by #19860
Labels
kind:bug (This is clearly a bug), Quarantine (Issues that are occasionally failing and are quarantined)

Comments


potiuk commented Jul 26, 2021

The test fails occasionally with the stack trace below, so I am marking it as quarantined.

  _______________ TestSchedulerJob.test_scheduler_verify_pool_full _______________
  
  self = <tests.jobs.test_scheduler_job.TestSchedulerJob object at 0x7fbaaaba0f40>
  
      def test_scheduler_verify_pool_full(self):
          """
          Test task instances not queued when pool is full
          """
          dag = DAG(dag_id='test_scheduler_verify_pool_full', start_date=DEFAULT_DATE)
      
          BashOperator(
              task_id='dummy',
              dag=dag,
              owner='airflow',
              pool='test_scheduler_verify_pool_full',
              bash_command='echo hi',
          )
      
          dagbag = DagBag(
              dag_folder=os.path.join(settings.DAGS_FOLDER, "no_dags.py"),
              include_examples=False,
              read_dags_from_db=True,
          )
          dagbag.bag_dag(dag=dag, root_dag=dag)
          dagbag.sync_to_db()
      
          session = settings.Session()
          pool = Pool(pool='test_scheduler_verify_pool_full', slots=1)
          session.add(pool)
          session.flush()
      
          dag = SerializedDAG.from_dict(SerializedDAG.to_dict(dag))
          SerializedDagModel.write_dag(dag)
      
          self.scheduler_job = SchedulerJob(executor=self.null_exec)
          self.scheduler_job.processor_agent = mock.MagicMock()
      
          # Create 2 dagruns, which will create 2 task instances.
          dr = dag.create_dagrun(
              run_type=DagRunType.SCHEDULED,
              execution_date=DEFAULT_DATE,
              state=State.RUNNING,
          )
  >       self.scheduler_job._schedule_dag_run(dr, session)
  
  tests/jobs/test_scheduler_job.py:2108: 
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  airflow/jobs/scheduler_job.py:1020: in _schedule_dag_run
      dag = dag_run.dag = self.dagbag.get_dag(dag_run.dag_id, session=session)
  airflow/utils/session.py:67: in wrapper
      return func(*args, **kwargs)
  airflow/models/dagbag.py:186: in get_dag
      self._add_dag_from_db(dag_id=dag_id, session=session)
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  
  self = <airflow.models.dagbag.DagBag object at 0x7fbaaab95100>
  dag_id = 'test_scheduler_verify_pool_full'
  session = <sqlalchemy.orm.session.Session object at 0x7fbaaad9d0a0>
  
      def _add_dag_from_db(self, dag_id: str, session: Session):
          """Add DAG to DagBag from DB"""
          from airflow.models.serialized_dag import SerializedDagModel
      
          row = SerializedDagModel.get(dag_id, session)
          if not row:
  >           raise SerializedDagNotFound(f"DAG '{dag_id}' not found in serialized_dag table")
  E           airflow.exceptions.SerializedDagNotFound: DAG 'test_scheduler_verify_pool_full' not found in serialized_dag table
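
The trace ends in a lookup that raises when no row is returned. The issue itself does not identify a root cause, so the following is only a generic illustration (not a diagnosis of this failure) of one way a lookup can miss a row that a test believes it has written: the write is still uncommitted on a different connection. It uses the standard-library sqlite3 module rather than Airflow or SQLAlchemy, and the exception class, helper names, and table layout are hypothetical stand-ins.

```python
import os
import sqlite3
import tempfile


class SerializedDagNotFound(Exception):
    """Stand-in for the exception in the trace (not Airflow's class)."""


def get_serialized(conn, dag_id):
    """Look up a row by dag_id and raise if it is absent, mirroring the trace."""
    row = conn.execute(
        "SELECT data FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    if not row:
        raise SerializedDagNotFound(f"DAG '{dag_id}' not found in serialized_dag table")
    return row[0]


def demo():
    """Write on one connection, read on another, before and after commit."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        writer = sqlite3.connect(path)
        reader = sqlite3.connect(path)
        try:
            writer.execute(
                "CREATE TABLE serialized_dag (dag_id TEXT PRIMARY KEY, data TEXT)"
            )
            writer.commit()

            # The INSERT opens a transaction on the writer; nothing is committed yet.
            writer.execute(
                "INSERT INTO serialized_dag VALUES (?, ?)", ("demo_dag", "{}")
            )
            try:
                get_serialized(reader, "demo_dag")
                results.append("found before commit")
            except SerializedDagNotFound:
                results.append("missing before commit")

            # Once the writer commits, the other connection can see the row.
            writer.commit()
            get_serialized(reader, "demo_dag")
            results.append("found after commit")
        finally:
            reader.close()
            writer.close()
    return results
```

Calling demo() exercises both lookups; the point of the sketch is that a test which writes and reads through different sessions needs the write to be committed (or both steps to share one session) before the read.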
potiuk added the kind:bug and Quarantine labels Jul 26, 2021
potiuk added a commit to potiuk/airflow that referenced this issue Jul 26, 2021
potiuk added a commit that referenced this issue Jul 26, 2021
potiuk added a commit to potiuk/airflow that referenced this issue Dec 11, 2021
The scheduler job tests were quite flaky, and some of them were
already quarantined (especially the query count tests). This PR
improves stability in the following ways:

* cleans the database between tests for TestSchedulerJob to avoid
  side effects
* forces the UTC timezone in tests where dates lacked timezone info
* updates the number of queries expected in the query count tests
* stabilizes the order in which tasks are retrieved where tests
  depended on it
* adds more stack trace levels (5) to help compare where extra
  methods were called
* increases the number of scheduler runs where needed
* adds session.flush() where it was missing
* adds a requirement to have serialized DAGs ready when needed
* increases the number of dag runs to process in tests that could
  be "too slow" compared to the fast processing of dag runs
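
Two of the items in the list above, cleaning the database between tests and making retrieval order deterministic, can be sketched in isolation. This is a minimal illustration using the standard-library unittest and sqlite3 modules, not the code from the PR; the table names and the helpers make_db, clear_db, and task_ids are hypothetical.

```python
import sqlite3
import unittest

# Hypothetical table names, chosen only for this sketch.
TABLES = ("dag_run", "task_instance")


def make_db():
    """Create an in-memory database with one-column tables."""
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(f"CREATE TABLE {table} (id TEXT PRIMARY KEY)")
    return conn


def clear_db(conn):
    """Delete all rows so that each test starts from an empty database."""
    for table in TABLES:
        conn.execute(f"DELETE FROM {table}")
    conn.commit()


def task_ids(conn):
    """Return task ids in a fixed order; ORDER BY avoids order-dependent asserts."""
    return [row[0] for row in conn.execute("SELECT id FROM task_instance ORDER BY id")]


class TestWithCleanDb(unittest.TestCase):
    conn = None

    @classmethod
    def setUpClass(cls):
        # One database shared by all tests in the class, as a shared test DB would be.
        cls.conn = make_db()

    @classmethod
    def tearDownClass(cls):
        cls.conn.close()

    def setUp(self):
        # Runs before every test, removing rows left behind by earlier tests.
        clear_db(self.conn)

    def test_first(self):
        self.conn.execute("INSERT INTO task_instance VALUES ('b'), ('a')")
        self.assertEqual(task_ids(self.conn), ["a", "b"])

    def test_second(self):
        # Without the cleanup in setUp, rows from test_first could leak in here.
        self.assertEqual(task_ids(self.conn), [])
```

Because the cleanup runs in setUp, both tests pass regardless of the order in which they execute.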

Hopefully:

* Fixes: apache#18777
* Fixes: apache#17291
* Fixes: apache#17224
* Fixes: apache#15255
* Fixes: apache#15085
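
The timezone item in the commit message above can be illustrated with the standard library alone. This is a sketch of the general idea of giving every test date an explicit UTC timezone; the helper name ensure_utc and the sample date are arbitrary choices for the example and are not taken from the PR.

```python
from datetime import datetime, timedelta, timezone


def ensure_utc(value):
    """Return value as an aware datetime in UTC.

    Naive datetimes are assumed to already represent UTC; aware ones are converted.
    """
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


# An arbitrary naive date, as a test might define one without timezone info.
naive = datetime(2016, 1, 1)
aware = ensure_utc(naive)

# The same instant expressed with a +02:00 offset converts to the same UTC value.
offset = datetime(2016, 1, 1, 2, tzinfo=timezone(timedelta(hours=2)))
converted = ensure_utc(offset)
```

Python refuses ordering comparisons between naive and aware datetimes (it raises TypeError), which is one reason tests that mix the two can behave differently depending on the environment.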
potiuk added a commit that referenced this issue Dec 13, 2021
* Restore stability and unquarantine all test_scheduler_job tests

* Update tests/jobs/test_scheduler_job.py

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

potiuk added a commit that referenced this issue Jan 22, 2022
* Restore stability and unquarantine all test_scheduler_job tests

(cherry picked from commit 9b277db)
jedcunningham pushed a commit that referenced this issue Jan 27, 2022
* Restore stability and unquarantine all test_scheduler_job tests

(cherry picked from commit 9b277db)