Bug 1885856: Exporting registry v1 protocol usage metric #949

Merged
merged 1 commit into from Dec 1, 2020

ricardomaraschini
Contributor

This PR exports the v1_image_imports_total metric. We use this metric to
count the number of image streams imported using registry protocol v1, as
we are planning to deprecate this version soon.

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

openshift-ci-robot added the bugzilla/severity-unspecified and bugzilla/invalid-bug labels on Oct 7, 2020
@openshift-ci-robot
Contributor

@ricardomaraschini: This pull request references Bugzilla bug 1885856, which is invalid:

  • expected the bug to target the "4.7.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1885856: Exporting registry v1 protocol usage metric

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ricardomaraschini
Contributor Author

/bugzilla refresh

openshift-ci-robot removed the bugzilla/severity-unspecified label on Oct 7, 2020
@openshift-ci-robot
Contributor

@ricardomaraschini: This pull request references Bugzilla bug 1885856, which is invalid:

  • expected the bug to target the "4.7.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/bugzilla refresh

openshift-ci-robot added the bugzilla/severity-low label on Oct 7, 2020
@ricardomaraschini
Contributor Author

/cherrypick 4.6

@openshift-cherrypick-robot

@ricardomaraschini: once the present PR merges, I will cherry-pick it on top of 4.6 in a new PR and assign it to you.

In response to this:

/cherrypick 4.6

@ricardomaraschini
Contributor Author

/assign @dmage @lilic

ricardomaraschini changed the title from "Bug 1885856: Exporting registry v1 protocol usage metric" to "WIP - Bug 1885856: Exporting registry v1 protocol usage metric" on Oct 7, 2020
openshift-ci-robot added the do-not-merge/work-in-progress label on Oct 7, 2020
@ricardomaraschini
Contributor Author

/bugzilla refresh

openshift-ci-robot added the bugzilla/valid-bug label on Oct 7, 2020
@openshift-ci-robot
Contributor

@ricardomaraschini: This pull request references Bugzilla bug 1885856, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

/bugzilla refresh

openshift-ci-robot removed the bugzilla/invalid-bug label on Oct 7, 2020

# (image-registry-team) v1_image_imports_total counts the number of image stream imports made
# using registry protocol v1, we are moving forward with deprecating this version hence we
# need to know how often it is used.
- '{__name__="v1_image_imports_total"}'
Contributor

Isn't the full name of the metric apiserver_v1_image_imports_total?

Contributor Author

Thanks @lilic! I wasn't sure whether I should add the "subsystem" to it; going to fix it right now.

Contributor

The name here should be exactly as it appears in Prometheus, since this is just a match selector for the metrics (series) themselves.

@lilic
Contributor

lilic commented Oct 7, 2020

Note, in case you missed it, here are the full docs on how to get approval for this: https://docs.google.com/document/d/1a6n5iBGM2QaIQRg9Lw4-Npj6QY9--Hpx3XYut-BrUSY/edit

tl;dr: You need to get Clayton's 👍 on it; we already had a look at the actual metric. Let me know if any step doesn't make sense, happy to clarify! :)

@ricardomaraschini
Contributor Author

> Note, in case you missed it, here are the full docs on how to get approval for this: https://docs.google.com/document/d/1a6n5iBGM2QaIQRg9Lw4-Npj6QY9--Hpx3XYut-BrUSY/edit
>
> tl;dr: You need to get Clayton's 👍 on it; we already had a look at the actual metric. Let me know if any step doesn't make sense, happy to clarify! :)

Thanks for helping me on this one!

@ricardomaraschini
Contributor Author

/assign @smarterclayton

# (image-registry-team) apiserver_v1_image_imports_total counts the number of image stream
# imports made using registry protocol v1, we are moving forward with deprecating this version
# hence we need to know how often it is used.
- '{__name__="apiserver_v1_image_imports_total"}'
Contributor

Is this present by default in every OpenShift cluster? I did not see it on mine.

Contributor Author

Yes, it is there by default in all clusters. It will show up as soon as you import an image from an image registry using protocol v1. This protocol version has already been deprecated by Docker, and we don't expect this metric to show up.

Member

@lilic will it expose the metric labels? We only need sum(apiserver_v1_image_imports_total); the labels may contain sensitive data (names of private registries and repositories).

Contributor

Then you have to create a recording rule that does the sum().

Contributor Author

@simonpasquier I have sent over a new commit that adds :sum at the end of the metric we are querying. Could you please confirm this is the right way of achieving the desired outcome?

Contributor

Have you defined the recording rule that generates the apiserver_v1_image_imports:sum somewhere (typically in the operator managing the component emitting the apiserver_v1_image_imports_total metric)?
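
For readers unfamiliar with recording rules, a minimal sketch of what such a rule could look like as a Prometheus Operator `PrometheusRule` object follows. The record name and the source metric are taken from the question above; the object name, namespace, and group name are hypothetical placeholders, and this is not necessarily the rule that was eventually written.

```yaml
# Sketch only: metadata and group names below are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: image-registry-rules            # hypothetical object name
  namespace: openshift-image-registry   # assumption: namespace of the owning operator
spec:
  groups:
    - name: imageregistry.rules         # hypothetical group name
      rules:
        # sum() without a by() clause aggregates away every label, so the
        # recorded series carries only the total and no registry or
        # repository names.
        - record: apiserver_v1_image_imports:sum
          expr: sum(apiserver_v1_image_imports_total)
```

With a rule like this in place, the telemetry match selector can point at the recorded series instead of the raw, labelled counter.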

Contributor Author

@simonpasquier no, could you please point me to a sample where this is done?

Contributor Author

@simonpasquier I have created the rule here; could you take a look? openshift/cluster-image-registry-operator#626

ricardomaraschini changed the title from "WIP - Bug 1885856: Exporting registry v1 protocol usage metric" to "Bug 1885856: Exporting registry v1 protocol usage metric" on Oct 7, 2020
openshift-ci-robot removed the do-not-merge/work-in-progress label on Oct 7, 2020
@ricardomaraschini
Contributor Author

/hold

we need a recording rule

openshift-ci-robot added the do-not-merge/hold label on Oct 13, 2020
openshift-ci-robot added the lgtm and approved labels on Oct 14, 2020
openshift-ci-robot removed the lgtm label on Oct 23, 2020
@ricardomaraschini
Contributor Author

/retest

502 Bad Gateway

@@ -230,6 +230,10 @@ data:
- '{__name__="che_workspace_start_time_seconds_count"}'
# (cloud credential operator, @openshift/openshift-team-hive) Track current mode the cloud-credentials-operator is functioning under.
- '{__name__="cco_credentials_mode"}'
# (image-registry-team) :apiserver_v1_image_imports:sum counts the number of image stream
# imports made using registry protocol v1, we are moving forward with deprecating this version
Contributor

I would rephrase as

"we are deprecating this version of the protocol and will use this metric to identify how many users are impacted by deprecation and when usage has reached zero".

That way someone reading this (someone who cares what data we send) understands how we will use the data.
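
Applied to the hunk above, the suggested wording might read roughly as follows. This is only a sketch: the series name is copied from the comment line in the hunk, the selector line is an assumption modelled on the neighbouring entries, and the exact entry that was merged is not shown in this thread.

```yaml
# (image-registry-team) :apiserver_v1_image_imports:sum counts the number of image stream
# imports made using registry protocol v1. We are deprecating this version of the protocol
# and will use this metric to identify how many users are impacted by deprecation and when
# usage has reached zero.
# Assumption: the selector matches the recorded (aggregated) series by its exact name.
- '{__name__=":apiserver_v1_image_imports:sum"}'
```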

Contributor Author

Changed, tks!

Contributor Author

Can we have this merged?

@wzheng1

wzheng1 commented Oct 29, 2020

/bugzilla cc-qa

@openshift-ci-robot
Contributor

@wzheng1: This pull request references Bugzilla bug 1885856, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @wzheng1

In response to this:

/bugzilla cc-qa

@wzheng1

wzheng1 commented Oct 29, 2020

@ricardomaraschini
Contributor Author

/retest

This PR exports the sum of the v1_image_imports_total metric. We use this
metric to count the number of image streams imported using registry
protocol v1, as we are planning to deprecate this version soon.
@ricardomaraschini
Contributor Author

/test e2e-agnostic

@ricardomaraschini
Contributor Author

/test e2e-agnostic-upgrade

@ricardomaraschini
Contributor Author

/retest

@ricardomaraschini
Contributor Author

Could someone please LGTM this?

@lilic
Contributor

lilic commented Nov 13, 2020

@ricardomaraschini you need an explicit approval from Clayton; after that we can do an lgtm. I don't see it in his comment #949 (comment)

@@ -238,6 +238,10 @@ data:
- '{__name__="che_workspace_start_time_seconds_count"}'
# (cloud credential operator, @openshift/openshift-team-hive) Track current mode the cloud-credentials-operator is functioning under.
- '{__name__="cco_credentials_mode"}'
# (image-registry-team) :apiserver_v1_image_imports:sum counts the number of image stream
Contributor

Why is there no prefix on this metric? Was there a reason for that?

Contributor

I can't actually understand why the metric name is bad (I saw Lili's question above)

Member

Contributor

It looks like this is following a convention from upstream docs that we don't follow elsewhere, so we have one metric that is different from all the others.

@smarterclayton
Contributor

/lgtm
/approve

openshift-ci-robot added the lgtm label on Dec 1, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ricardomaraschini, simonpasquier, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@smarterclayton
Contributor

/hold

@smarterclayton
Contributor

/hold cancel

openshift-ci-robot removed the do-not-merge/hold label on Dec 1, 2020
openshift-merge-robot merged commit 5785987 into openshift:master on Dec 1, 2020
@openshift-ci-robot
Contributor

@ricardomaraschini: All pull requests linked via external trackers have merged:

Bugzilla bug 1885856 has been moved to the MODIFIED state.

In response to this:

Bug 1885856: Exporting registry v1 protocol usage metric

@openshift-cherrypick-robot

@ricardomaraschini: cannot checkout 4.6: error checking out 4.6: exit status 1. output: error: pathspec '4.6' did not match any file(s) known to git

In response to this:

/cherrypick 4.6

@ricardomaraschini
Contributor Author

/cherrypick release-4.6

@openshift-cherrypick-robot

@ricardomaraschini: #949 failed to apply on top of branch "release-4.6":

Applying: Exporting registry v1 protocol usage metric
Using index info to reconstruct a base tree...
M	Documentation/data-collection.md
M	Documentation/sample-metrics.md
M	Documentation/telemeter_query
M	manifests/0000_50_cluster-monitoring-operator_04-config.yaml
Falling back to patching base and 3-way merge...
Auto-merging manifests/0000_50_cluster-monitoring-operator_04-config.yaml
CONFLICT (content): Merge conflict in manifests/0000_50_cluster-monitoring-operator_04-config.yaml
Auto-merging Documentation/telemeter_query
CONFLICT (content): Merge conflict in Documentation/telemeter_query
Auto-merging Documentation/sample-metrics.md
CONFLICT (content): Merge conflict in Documentation/sample-metrics.md
Auto-merging Documentation/data-collection.md
CONFLICT (content): Merge conflict in Documentation/data-collection.md
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Exporting registry v1 protocol usage metric
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-4.6

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
9 participants