Skip to content

Metrics Improvement Project - bootstrap metrics registry#43618

Closed
dannyl1u wants to merge 5 commits intoapache:mainfrom
dannyl1u:metrics-registry
Closed

Metrics Improvement Project - bootstrap metrics registry#43618
dannyl1u wants to merge 5 commits intoapache:mainfrom
dannyl1u:metrics-registry

Conversation

@dannyl1u
Copy link
Contributor

@dannyl1u dannyl1u commented Nov 3, 2024

Metrics Improvement Project: #42881

After #43340 is merged, we can check in the get_name function that the metric_name exists in the metrics-registry.yaml file, and throw an error if it doesn't. This will enforce that the metrics-registry.yaml exists as a single source of truth for all metrics names. The metrics names from https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/logging-monitoring/metrics.html# will be generated from the metrics-registry.yaml file, ensuring that the documentation for metrics names is always up-to-date.

The following bash script was used to grep all metric names from the codebase, excluding files ignored by git and tests.

git ls-files | grep -vE '^tests/|^tests_common/' | xargs -I {} grep -oE '\.(incr|decr|gauge|timer|timing)\("([^"]+)"' "{}" | awk -F'"' '{
    type = ($1 ~ /\.incr|\.decr/) ? "counter" : ($1 ~ /\.gauge/) ? "gauge" : "timer";
    print "- name: "$2"\n  type: "type"\n  description: \"\""
}' > metrics-registry.yaml

TODOs:

  • populate description fields from documentation
  • cross reference metrics-registry.yaml with documentation to ensure no metrics were missed during the initial grep
  • implement existence check when a metric name is used, blocked by Metrics Improvement Project #42881

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

Copy link
Contributor

@ferruzzi ferruzzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your method for finding them looks right to me, and the next steps you laid out sound good. Left a formatting suggestion to look into.

# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about using sections to organize the registry? I think YAML calls them nested levels. For example:

asset:
    - name: updates
      type: counter
      description: ""
dag_processing:
    - name: manager_stalls
      type: counter
      description: ""
    - name: other_callback_count
      type: counter
      description: ""
... etc...

Maybe experiment with that and with what the code to process it would look like before we commit to a flat structure like this? We have a bunch of .yml files int he codebase you can look at for inspiration and to see how they are parsed in code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it looks neater that way but my rational for the flat structure is that the metrics docs is a flat list, so we save some parsing/de-parsing by having a flat structure.
Let me know what you think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if we sort it alphabetically, the registry will be naturally "organized" into the respective groups as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You make some good points. I can see it working both ways, let's see if anyone else has an opinion. Many people won't review a PR which is marked as a "draft" assuming it's still early WIP, so you might want to drop a message on Slack asking for thoughts.

@github-actions
Copy link

github-actions bot commented Jan 2, 2025

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 2, 2025
@ferruzzi
Copy link
Contributor

ferruzzi commented Jan 2, 2025

still blocked waiting for #43340

@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 3, 2025
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Feb 17, 2025
@github-actions github-actions bot closed this Feb 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale Stale PRs per the .github/workflows/stale.yml policy file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants