Bug 1885856: Exporting registry v1 protocol usage metric #949
@ricardomaraschini: This pull request references Bugzilla bug 1885856, which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/bugzilla refresh
@ricardomaraschini: This pull request references Bugzilla bug 1885856, which is invalid:
/cherrypick 4.6
@ricardomaraschini: once the present PR merges, I will cherry-pick it on top of 4.6 in a new PR and assign it to you.
/bugzilla refresh
@ricardomaraschini: This pull request references Bugzilla bug 1885856, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Documentation/data-collection.md
Outdated
# (image-registry-team) v1_image_imports_total counts the number of image stream imports made
# using registry protocol v1, we are moving forward with deprecating this version hence we
# need to know how often it is used.
- '{__name__="v1_image_imports_total"}'
Isn't the full name of the metric apiserver_v1_image_imports_total?
Thanks @lilic! I wasn't sure whether I should add the "subsystem" to it; going to fix it right now.
Name here should be as it appears in Prometheus, as this is just a match selector for the metrics (series) themselves.
Note, in case you missed it, here are the full docs on how to get approval for this: https://docs.google.com/document/d/1a6n5iBGM2QaIQRg9Lw4-Npj6QY9--Hpx3XYut-BrUSY/edit tl;dr: You need to get Clayton's 👍 on it; we already had a look at the actual metric. Let me know if any step doesn't make sense, happy to clarify! :)
Thanks for helping me on this one!
/assign @smarterclayton
Documentation/data-collection.md
Outdated
# (image-registry-team) apiserver_v1_image_imports_total counts the number of image stream
# imports made using registry protocol v1, we are moving forward with deprecating this version
# hence we need to know how often it is used.
- '{__name__="apiserver_v1_image_imports_total"}'
Is this present by default in every OpenShift cluster? I did not see it on mine.
Yes, it is there by default in all clusters. It will show up as soon as you import an image from an image registry using protocol v1. This protocol version has already been deprecated by Docker, so we don't expect this metric to show up.
@lilic will it expose the metric labels? We only need sum(apiserver_v1_image_imports_total); the labels may contain sensitive data (names of private registries and repositories).
You have to create a recording rule doing the sum() then.
@simonpasquier I have sent over a new commit that adds :sum at the end of the metric we are querying. Could you please confirm this is the right way of achieving the desired outcome?
Have you defined the recording rule that generates apiserver_v1_image_imports:sum somewhere (typically in the operator managing the component emitting the apiserver_v1_image_imports_total metric)?
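For readers following along, a recording rule of the kind being asked about might look roughly like the sketch below. It assumes the rule is shipped as a Prometheus Operator PrometheusRule resource; the metadata name, namespace, and group name are hypothetical placeholders, and only the record and expr values come from this thread. The rule actually used for this metric may differ.

```yaml
# Sketch only: metadata.name, metadata.namespace and the group name are
# hypothetical placeholders, not taken from this PR.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: image-registry-rules            # hypothetical
  namespace: openshift-image-registry   # hypothetical
spec:
  groups:
  - name: imageregistry.rules           # hypothetical
    rules:
    # sum() without a "by" clause aggregates away every label, so the
    # recorded series carries no registry or repository names.
    - record: apiserver_v1_image_imports:sum
      expr: sum(apiserver_v1_image_imports_total)
```

Because the aggregation drops all labels, allowing only the recorded series to be collected would address the concern raised above about private registry and repository names appearing in labels.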
@simonpasquier no, could you please point me to a sample where this is done?
@simonpasquier I have created the rule here; could you take a look? openshift/cluster-image-registry-operator#626
/hold we need a recording rule
/retest 502 Bad Gateway
@@ -230,6 +230,10 @@ data:
- '{__name__="che_workspace_start_time_seconds_count"}'
# (cloud credential operator, @openshift/openshift-team-hive) Track current mode the cloud-credentials-operator is functioning under.
- '{__name__="cco_credentials_mode"}'
# (image-registry-team) :apiserver_v1_image_imports:sum counts the number of image stream
# imports made using registry protocol v1, we are moving forward with deprecating this version
I would rephrase as
"we are deprecating this version of the protocol and will use this metric to identify how many users are impacted by deprecation and when usage has reached zero".
That way someone reading this (someone who cares what data we send) understands how we will use the data.
Changed, tks!
Can we have this merged?
/bugzilla cc-qa
@wzheng1: This pull request references Bugzilla bug 1885856, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact:
LGTM based on https://bugzilla.redhat.com/show_bug.cgi?id=1885856#c3
/retest
This PR exports the sum of the v1_image_imports_total metric. We use this metric to count the number of image streams imported using registry v1, as we are planning to deprecate this version soon.
/test e2e-agnostic
/test e2e-agnostic-upgrade
/retest
Could someone please LGTM this?
@ricardomaraschini you need an explicit approval from Clayton; after that we can do an lgtm. I don't see it in his comment #949 (comment)
@@ -238,6 +238,10 @@ data:
- '{__name__="che_workspace_start_time_seconds_count"}'
# (cloud credential operator, @openshift/openshift-team-hive) Track current mode the cloud-credentials-operator is functioning under.
- '{__name__="cco_credentials_mode"}'
# (image-registry-team) :apiserver_v1_image_imports:sum counts the number of image stream
Why is there no prefix on this metric? Was there a reason for that?
I can't actually understand why the metric name is bad (I saw Lili's question above).
@smarterclayton the name was suggested by @simonpasquier at openshift/cluster-image-registry-operator#626 (comment)
It looks like this is following a convention from upstream docs that we don't follow elsewhere, so we have one metric that is different from all the others.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ricardomaraschini, simonpasquier, smarterclayton The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold
/hold cancel
@ricardomaraschini: All pull requests linked via external trackers have merged: Bugzilla bug 1885856 has been moved to the MODIFIED state.
@ricardomaraschini: cannot checkout 4.6: error checking out 4.6: exit status 1. output: error: pathspec '4.6' did not match any file(s) known to git
/cherrypick release-4.6
@ricardomaraschini: #949 failed to apply on top of branch "release-4.6":
This PR exports the v1_image_imports_total metric. We use this metric to
count the number of image streams imported using registry v1, as we are
planning to deprecate this version soon.