Skip to content

Conversation

@feorlen
Copy link
Collaborator

@feorlen feorlen commented Sep 19, 2024

More rearranging to feature metrics v3 information:

  • Remove metrics v2 and Grafana dashboard pages from the TOC, access by link only.
  • Replace examples that use v2 metrics with corresponding v3 metrics.
  • Consolidate remaining references to v2 behavior/metrics on the metrics v2 page.
  • Update mc admin prometheus generate.
  • Update mc admin prometheus metrics.

Update: v2 isn't actually deprecated.

  • "un-deprecate" (more-or-less) but it's still demoted.

Note: refs to the new mc admin prometheus parameters don't work yet and cause build warnings. Likely require a fixup PR after merge, because intersphinx.

Staged:

Metrics and alerts
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-and-alerts.html

Prometheus
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/collect-minio-metrics-using-prometheus.html

Deprecated v2
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-v2-deprecated.html#minio-metrics-v2

Deprecated (?) Grafana dashboards
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/grafana.html

mc admin prometheus generate
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-generate.html

mc admin prometheus metrics
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-metrics.html

@feorlen feorlen marked this pull request as draft September 19, 2024 21:23
Use :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` to generate a scrape configuration for each :ref:`type of metric <minio-metrics-and-alerts-available-metrics>` you want to scrape with Prometheus.

- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
For example, the following command scrapes all version 3 audit metrics for the MinIO cluster:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this "for v3" stuff is clumsy, but I'm concerned users will forget which they are using and follow incorrect instructions. Ideas?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.

static_configs:
- targets: [minio.example.net]
To scrape multiple types of metrics, run :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` for each type and add the ``job_name`` section to the ``scrape_configs`` in your Prometheus configuration.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is how it works? I found examples in the prom docs that have multiple job_name sections.

Also, do we call parts of yaml files "sections?"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are nodes officially, but that doesn't seem very widely known or user friendly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, nodes is a conflict with other terms for our product.
So, section works well enough, I think?

Comment on lines +79 to +80
global:
scrape_interval: 60s
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this global: so if when somebody copies this into their own config there's less a chance they cause a problem with a too-small interval.

Comment on lines +97 to +98
If needed, edit the generated configuration for your environment.
Common changes include:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of stuff here for an unordered list. But tables suck. Ideas?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.

- Set the ``job_name`` to a value associated to the MinIO deployment.

minio_node_drive_latency_us{job-"minio-job"}[5m]
Use a unique value for each job to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is accurate?

Comment on lines +145 to +155
.. code-block:: shell
:class: copyable
* - ``minio_node_drive_used_bytes``
- Total storage used on a drive.
minio_system_drive_used_bytes{job-"minio-job"}[5m]
minio_system_drive_used_inodes{job-"minio-job"}[5m]
* - ``minio_node_drive_errors_timeout``
- Total number of drive timeout errors since server start.
minio_cluster_usage_buckets_total_bytes{job-"minio-job"}[5m]
minio_cluster_usage_buckets_objects_count{job-"minio-job"}[5m]
* - ``minio_node_drive_errors_availability``
- Total number of drive I/O errors, permission denied and timeouts since server start.
minio_api_requests_total{job-"minio-job"}[5m]
minio_api_requests_errors_total{job-"minio-job"}[5m]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced the original v2 metrics with something that looked likely from v3. Are there better examples?

rules:
- alert: NodesOffline
expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
expr: avg_over_time(minio_cluster_health_nodes_offline_count{job="minio-job"}[5m]) > 0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with what appeared to be a rational v2-v3 swap

- alert: DisksOffline
expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
expr: avg_over_time(minio_system_drive_offline_count{job="minio-job"}[5m]) > 0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with what appeared to be a rational v2-v3 swap

- Set :envvar:`MINIO_PROMETHEUS_URL` to the URL of the Prometheus service
- Set :envvar:`MINIO_PROMETHEUS_JOB_ID` to the unique job ID assigned to the collected metrics

MinIO Grafana Dashboard
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove reference to Grafana entirely in the v3 text. Moved to the v2 page.

For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.


Configure Prometheus to Collect and Alert using MinIO Metrics
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not bothering to make the headings sentence case on this deprecated page

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No actually I'll go back and do that. Because Reasons

Comment on lines 54 to 55
[--api_version v3] \
[TYPE --bucket <bucket name> --api_version v3]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is a reasonable way to represent optional parameters that themselves have required parameters?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔
I think I'd list them as separate, optional flags. Then in the description of them mention they are required if using TYPE. And in the TYPE description, add that it requires those two flags.


Example
-------
Examples
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to add examples that would clarify the complexity of invoking two incompatible APIs with the same command. Suggestions welcome.

contents *must* have consistent encryption, use the
:mc:`mc mv` mc with the :mc-cmd:`~mc mv --encrypt` or
:mc-cmd:`~mc mv --encrypt-key` arguments to manually modify the
:mc:`mc mv` mc with the :mc-cmd:`~mc mv --enc-c` argument to manually modify the
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive-by build warning fix

Copy link
Collaborator

@djwfyi djwfyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions for your consideration.
We should also verify whether v2 really is deprecated. If it isn't, we need to back off that language throughout.

Use :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` to generate a scrape configuration for each :ref:`type of metric <minio-metrics-and-alerts-available-metrics>` you want to scrape with Prometheus.

- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
For example, the following command scrapes all version 3 audit metrics for the MinIO cluster:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.

- MinIO deployments started with :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``"public"`` can omit the ``bearer_token`` field.

minio_node_drive_total{job-"minio-job"}[5m]
- Set the ``scheme`` to http for MinIO deployments not using TLS.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is the value you're setting, seems like http should also be monotyped.

Generate a v3 cluster metrics config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment:
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster type metrics for a MinIO deployment:

Generate a v3 bucket replication metrics config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:
The following example generates a scrape configuration for v3 replication type metrics of bucket ``mybucket``:

I was confused by cluster metrics earlier. I don't think the type is as needed here, but if you agree to put it with the cluster metrics, then have to be consistent.

Use :mc-cmd:`mc admin prometheus generate` to generate a scrape configuration that collects bucket metrics for a MinIO deployment:

Generate a default metrics v2 config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So perhaps consider having headings like:

Examples
----------

v3 Examples
~~~~~~~~~~

v3 example 1
++++++++++


v2 examples
~~~~~~~~~~

etc.

Comment on lines +97 to +98
If needed, edit the generated configuration for your environment.
Common changes include:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.

@feorlen feorlen changed the title Relocate additional metrics v3 content Relocate additional metrics v2 content Sep 20, 2024
@feorlen
Copy link
Collaborator Author

feorlen commented Nov 25, 2024

Closing this to start over on undeprecation with a current branch

@feorlen
Copy link
Collaborator Author

feorlen commented Feb 12, 2025

I don't remember why this draft issue wasn't actually closed. The undeprecation was handled with #1375

Any additional tweaks can be done with a new PR.

@feorlen feorlen closed this Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants