Skip to content

fix: sanitize Dag processor metric file names#67029

Open
anmolxlight wants to merge 1 commit into
apache:mainfrom
anmolxlight:fix/67028-dag-processing-metric-names
Open

fix: sanitize Dag processor metric file names#67029
anmolxlight wants to merge 1 commit into
apache:mainfrom
anmolxlight:fix/67028-dag-processing-metric-names

Conversation

@anmolxlight
Copy link
Copy Markdown
Contributor

Summary

  • Normalize the Dag file stem used in dag_processing.last_run.seconds_ago.* metric names.
  • Add a regression test covering a Dag file name containing a space.

Test Plan

  • uv run --project airflow-core pytest airflow-core/tests/unit/dag_processing/test_manager.py::TestDagFileProcessorManager::test_log_file_processing_stats_sanitizes_last_run_seconds_ago_metric_name -xvs
  • uv run --project airflow-core pytest airflow-core/tests/unit/dag_processing/test_manager.py::TestDagFileProcessorManager::test_send_file_processing_statsd_timing airflow-core/tests/unit/dag_processing/test_manager.py::TestDagFileProcessorManager::test_log_file_processing_stats_sanitizes_last_run_seconds_ago_metric_name -q
  • uv run --project airflow-core pytest airflow-core/tests/unit/dag_processing/test_manager.py -q
  • uv run prek run ruff --files airflow-core/src/airflow/dag_processing/manager.py airflow-core/tests/unit/dag_processing/test_manager.py
  • uv run prek run ruff-format --files airflow-core/src/airflow/dag_processing/manager.py airflow-core/tests/unit/dag_processing/test_manager.py

Closes #67028

Copy link
Copy Markdown
Contributor

@SameerMesiah97 SameerMesiah97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. Just one question and a tiny nit.

)

@mock.patch("airflow.dag_processing.manager.stats.gauge")
def test_log_file_processing_stats_sanitizes_last_run_seconds_ago_metric_name(self, statsd_gauge_mock):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_log_file_processing_stats_sanitizes_last_run_seconds_ago_metric_name(self, statsd_gauge_mock):
def test_log_file_processing_stats_normalizes_metric_name(self, statsd_gauge_mock):

num_dags = stat.num_dags
num_errors = stat.import_errors
file_name = Path(file.rel_path).stem
file_name = normalize_name_for_stats(Path(file.rel_path).stem)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is _log_file_processing_stats() the only place where Dag file names are incorporated into metric names?

Since the bug is fundamentally about unsanitized file-derived metric segments, it’d be good to verify there aren’t other DAG-processing stats paths still using Path(...).stem or rel_path directly without normalize_name_for_stats().

@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:DAG-processing ready for maintainer review Set after triaging when all criteria pass.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dag processor error

3 participants