Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DagFileProcessor produces invalid metric tags #30716

Open
1 of 2 tasks
sungwy opened this issue Apr 18, 2023 · 10 comments · May be fixed by #42934
Open
1 of 2 tasks

DagFileProcessor produces invalid metric tags #30716

sungwy opened this issue Apr 18, 2023 · 10 comments · May be fixed by #42934
Assignees
Labels

Comments

@sungwy
Copy link
Contributor

sungwy commented Apr 18, 2023

Apache Airflow version

2.6.0b1

What happened

The recently added dag_processing.processes file_path metric tag always fails to publish the metric tag because file path delimiter '/' is not a valid character according to the stat_name_default_handler

airflow.exceptions.InvalidStatsNameException: The stat name (dag_processing.processes,file_path=/mnt/c/Users/user/Documents/GitHub/airflow-dir/test_dag.py,action=finish) has to be composed of ASCII alphabets, numbers, or the underscore, dot, or dash characters.
[2023-04-18T12:21:39.738-0400] {stats.py:245} ERROR - Invalid stat name: dag_processing.processes,file_path=/mnt/c/Users/user/Documents/GitHub/airflow-dir/test_dag.py,action=start.
Traceback (most recent call last):
File "/mnt/c/Users/user/Documents/GitHub/airflow-dir/venv/lib/python3.9/site-packages/airflow/stats.py", line 242, in wrapper
stat = handler_stat_name_func(stat)
File "/mnt/c/Users/user/Documents/GitHub/airflow-dir/venv/lib/python3.9/site-packages/airflow/stats.py", line 210, in stat_name_default_handler
raise InvalidStatsNameException(
airflow.exceptions.InvalidStatsNameException: The stat name (dag_processing.processes,file_path=/mnt/c/Users/user/Documents/GitHub/airflow-dir/test_dag.py,action=start) has to be composed of ASCII alphabets, numbers, or the underscore, dot, or dash characters.
[2023-04-18T12:21:51.375-0400] {stats.py:245} ERROR - Invalid stat name: dag_processing.processes,file_path=/mnt/c/Users/user/Documents/GitHub/airflow-dir/test_dag.py,action=finish.

What you think should happen instead

Although it is not a fatal error it feels erroneous that the default stats name handler is not able to support the metric tag out of the box.

We do have the following parameters that allows a user to get around this issue:

  1. stat_name_handler
  2. statsd_disabled_tags

But, I would like to advocate that we include '/' as a supported character to stat_name_default_handler, or sanitize the file_path value to use a supported character instead. It would feel more intuitive for a new user using the feature to have metric tags work correctly with the default configurations, rather than needing to implement their own stat_name_handler to work around the issue.

Examples Metrics:
https://github.com/apache/airflow/blob/main/airflow/dag_processing/manager.py#L998
https://github.com/apache/airflow/blob/main/airflow/dag_processing/processor.py#L767

How to reproduce

Enable stats with:

[metrics]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = 
statsd_influxdb_enabled = True

Operating System

Red Hat Enterprise Linux Server 7.6 (Maipo)

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@sungwy sungwy added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Apr 18, 2023
@sungwy sungwy changed the title Metrics Tag produces invalid tags DagFileProcessor produces invalid metric tags Apr 18, 2023
@potiuk potiuk added this to the Airflow 2.6.0 milestone Apr 18, 2023
@potiuk potiuk added good first issue and removed needs-triage label for new issues that we didn't triage yet labels Apr 18, 2023
@eladkal eladkal added the affected_version:2.6 Issues Reported for 2.6 label Apr 18, 2023
@Gowthami03B
Copy link

Can I take this one?

@potiuk
Copy link
Member

potiuk commented Apr 19, 2023

Sure

@eladkal eladkal removed this from the Airflow 2.6.1 milestone Apr 28, 2023
Copy link

This issue has been automatically marked as stale because it has been open for 365 days without any activity. There has been several Airflow releases since last activity on this issue. Kindly asking to recheck the report against latest Airflow version and let us know if the issue is reproducible. The issue will be closed in next 30 days if no further activity occurs from the issue author.

@shalberd
Copy link

shalberd commented Jun 21, 2024

@eladkal @potiuk similar error in Airflow 2.8.2 dag-processor pod/container

 {dag_processor_job_runner.py:60} INFO - Starting the Dag Processor Job
[2024-06-18T02:50:07.781+0000] {validators.py:101} ERROR - Invalid stat name: dag_processing.last_duration.random error 2-0424133757V
/python3.8/site-packages/airflow/metrics/validators.py", line 185, in stat_name_default_handler
    raise InvalidStatsNameException(
airflow.exceptions.InvalidStatsNameException: The stat name (dag_processing.last_run.seconds_ago.random error 2-0424133757V5) has to be composed of ASCII alphabets, numbers, or the underscore, dot, or dash characters.

@Gowthami03B Gowthami03B removed their assignment Jun 26, 2024
@ares-b
Copy link

ares-b commented Jul 11, 2024

Same error in Airflow 2.9.1

ERROR - Invalid stat name: dag_processing.processes,file_path=/opt/airflow/dags/live/sha/appconf/contracts/contracts.py,action=start.
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/metrics/validators.py", line 134, in wrapper
    stat = handler_stat_name_func(stat)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/metrics/validators.py", line 221, in stat_name_default_handler
    raise InvalidStatsNameException(
airflow.exceptions.InvalidStatsNameException: The stat name (dag_processing.processes,file_path=/opt/airflow/dags/live/sha/appconf/contracts/contracts.py,action=start) has to be composed of ASCII alphabets, numbers, or the underscore, dot, or dash characters.

@potiuk
Copy link
Member

potiuk commented Jul 11, 2024

Yes. It still waits for someone who will investigate and fix it. Can be anyone - even those who experience it (actually it would be best as they could easily test if it's fixed).

@josix
Copy link
Contributor

josix commented Aug 13, 2024

I could help investigate it, @Lee-W could you help assign it to me, thanks!

@Lee-W
Copy link
Member

Lee-W commented Aug 13, 2024

Sure thing 🙂

@Jeoffreybauvin
Copy link

Same error in 2.10.0 :(.

@eladkal eladkal added affected_version:2.10 Issues Reported for 2.10 area:metrics and removed affected_version:2.6 Issues Reported for 2.6 labels Aug 31, 2024
@awdavidson awdavidson linked a pull request Oct 11, 2024 that will close this issue
@awdavidson
Copy link
Contributor

Can we not allow / character in the validator? I've applied this to a local instance which addresses the issue and I can see the metrics as expected in prometheus

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.