quincy: dashboard: 2nd backport batch #44899

Merged
merged 21 commits into from Feb 15, 2022

Conversation

epuertat

@epuertat epuertat commented Feb 4, 2022

#44812 cephadm/box: fix remove image tar error
#44609 mgr/dashboard: perform daemon actions
#44059 mgr/dashboard: monitoring: refactor into ceph-mixin
#43707 mgr: Fix ceph_daemon label in ceph_rgw_* metrics
#44384 mgr/dashboard: cephadm e2e job: display info on error & other improvements
#44920 cephadm: change shared_folder directory for prometheus and grafana
#44825 mgr/dashboard: fix for cephadm e2e failing because of rgw commands getting stuck
#44857 doc: update kcli test env documentation
#44934 mgr/dashboard: set appropriate baseline branch for applitools
#44956 mgr/dashboard: change the readFile to readFileSync
#44808 mgr/dashboard: support snmp-gateway service creation from UI
#44849 mgr/dashboard: Directories Menu Can't Use on Ceph File System Dashboard

@epuertat epuertat added this to the quincy milestone Feb 4, 2022
@epuertat epuertat requested a review from a team February 4, 2022 17:10
@epuertat epuertat requested review from a team as code owners February 4, 2022 17:10
@epuertat epuertat requested review from callithea and aaSharma14 and removed request for a team February 4, 2022 17:10
@github-actions github-actions bot added this to In progress in Dashboard Feb 4, 2022
@epuertat

epuertat commented Feb 4, 2022

Dashboard automation moved this from In progress to Reviewer approved Feb 7, 2022
@github-actions

github-actions bot commented Feb 9, 2022

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

The `DaemonStateCollection` used to always contain the daemon name in its
`DaemonKey`, but since ceph#40220 (or more specifically
afc3375), the RadosGW registers with its
instance ID instead (`rados.get_instance_id()`).

As a result, the `ceph_rgw_*` metrics returned by `ceph-mgr` through the
`prometheus` module have their `ceph_daemon` label include that ID instead of
the daemon name, e.g.

```
ceph_rgw_req{ceph_daemon="rgw.127202"}
```

instead of

```
ceph_rgw_req{ceph_daemon="rgw.my-hostname.rgw0"}
```

This commit adds the daemon name from `state->metadata["id"]` if available, as
`service.name` in the JSON document returned by `dump_server()`.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 2db1aaa)
Now that the RadosGW returns its instance ID instead of its daemon name,
replace the `ceph_daemon` label with an `instance_id` label on the `rgw`
metrics.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 9f6573b)
BenoitKnecht and others added 14 commits February 9, 2022 20:48
With the `ceph_daemon` label now replaced by `instance_id` on all `ceph_rgw_*`
metrics, we need to update the Grafana dashboards to get that label back from
`ceph_rgw_metadata` using this type of construct:

```
ceph_rgw_req * on (instance_id) group_left(ceph_daemon) ceph_rgw_metadata
```

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit adc36de)
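
To illustrate what the `on (instance_id) group_left(ceph_daemon)` construct above does, here is a minimal Python sketch of many-to-one label matching. It is only a simplified model of the PromQL semantics, not Prometheus or Ceph code; the function name `mul_group_left` and the sample values are made up for illustration.

```python
from typing import Dict, List, Tuple

# A sample is (labels, value). All sample data below is hypothetical.
Sample = Tuple[Dict[str, str], float]


def mul_group_left(many: List[Sample], one: List[Sample],
                   on: str, include: str) -> List[Sample]:
    """Simplified model of `many * on(<on>) group_left(<include>) one`."""
    # Index the "one" side (here: the metadata series) by the join label.
    one_by_key = {labels[on]: (labels, value) for labels, value in one}

    result = []
    for labels, value in many:
        match = one_by_key.get(labels.get(on))
        if match is None:
            continue  # samples without a partner are dropped
        one_labels, one_value = match
        # Keep the labels of the "many" side (minus the metric name) and
        # copy the requested label over from the "one" side.
        out = {k: v for k, v in labels.items() if k != "__name__"}
        out[include] = one_labels[include]
        # The metadata series is assumed to have the value 1 here, so the
        # product equals the original sample value.
        result.append((out, value * one_value))
    return result


rgw_req = [({"__name__": "ceph_rgw_req", "instance_id": "127202"}, 42.0)]
rgw_metadata = [({"__name__": "ceph_rgw_metadata",
                  "instance_id": "127202",
                  "ceph_daemon": "rgw.my-hostname.rgw0"}, 1.0)]

joined = mul_group_left(rgw_req, rgw_metadata,
                        on="instance_id", include="ceph_daemon")
print(joined)
```

The joined sample carries both `instance_id` and `ceph_daemon`, which is what lets the dashboards keep filtering by daemon name.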
Some of the expressions modified in c402903 were not covered by any tests,
especially those in the `radosgw-detail.json` dashboard.

This commit fills in those gaps.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit 2daaa05)
Add golang as a build dependency, needed to build the golang project used in
the test for monitoring/ceph-mixin.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit e102620)
A mixin is a way to bundle dashboards, Prometheus rules and alerts into a
jsonnet package. Shifting to a mixin will allow easier integration with
the monitoring automation that some users may use.

This commit moves `/monitoring/grafana/dashboards` and
`/monitoring/prometheus` to `/monitoring/ceph-mixin`. The Prometheus alerts
were also converted to Jsonnet in an automated way (from yaml to json
to jsonnet). This commit minimises changes to the generated files
and should change neither the dashboards nor the Prometheus alerts.

In the future, some configuration will also be added to jsonnet to add
more functionality to the dashboards or alerts (e.g. multi-cluster).

Fixes: https://tracker.ceph.com/issues/53374
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 98236e3)
As this new version was released only recently, it is not yet available in
every distro we use. We now build jsonnet from source so that we can use this
new version. This commit could be reverted later, once the new version is
available everywhere.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit ecaf907)
Build jsonnet and jb in the tests so that we can build ceph without
internet access and still be able to run the tests needed for monitoring
using jsonnet tools.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 8ff1e6b)
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
Fixes: https://tracker.ceph.com/issues/50322
(cherry picked from commit 239f884)
Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
(cherry picked from commit 71c4935)
Fixes: https://tracker.ceph.com/issues/54105
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 8feb2b8)
…ments

- Fix: ensure that on_error trap is called (display more info on error).
- Set static IPs to VMs.
- Remove domain in cluster definition to avoid side effects of potential dns misconfiguration.
- Minor improvements.

Fixes: https://tracker.ceph.com/issues/53991

Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 39af61e)
After ceph#44059, the monitoring/prometheus
and monitoring/grafana/dashboards directories were moved to
monitoring/ceph-mixin. That broke the shared_folders in the cephadm
bootstrap script.

Changed all the instances of monitoring/prometheus and
monitoring/grafana/dashboards to monitoring/ceph-mixin.

Also renamed all the instances of prometheus_alerts.yaml to
prometheus_alerts.yml.

Fixes: https://tracker.ceph.com/issues/54176
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 27592b7)
…tting stuck

Delaying the rgw service creation in the tests until the cluster is
healthy

also changing the node_ip_offset to 110 because in the jenkins I saw

Fixes: https://tracker.ceph.com/issues/54030
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 347fb2e)
All the dashboard PRs are checked against a baseline branch called
'default' in the visual regression testing. This will cause issues when
testing PRs in different branches. For example, our master and pacific
branches currently have to save two different screenshots, since the two
of them differ slightly.

Also disabling the applitools logs because they are too 'noisy'.

Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 40c902a)
Apparently the readFile I added in ceph#44934 is async, and that's not what
we want, so changing it to the synchronous call, readFileSync.

Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit cbfdd55)
Fixes: https://tracker.ceph.com/issues/54034
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit ad6fcfc)
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 81c93a2)
Fixes: https://tracker.ceph.com/issues/54034
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 76dcf6a)
@epuertat

epuertat commented Feb 10, 2022

Added exception handling to opendir() in cephfs.py for directories with no execute permission.

Fixes: https://tracker.ceph.com/issues/51611
Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>
(cherry picked from commit ea1af54)
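
As a rough illustration of the kind of exception handling described above, the sketch below lists sub-directories and degrades gracefully when a directory cannot be opened. It is an analogy using the local filesystem and the standard library's `os.scandir`, not the actual change in cephfs.py or the CephFS Python binding; the function name `list_subdirs` and its `opendir` parameter are hypothetical.

```python
import os
from typing import Callable, Iterable, List


def list_subdirs(path: str,
                 opendir: Callable[[str], Iterable] = os.scandir) -> List[str]:
    """Return the names of the sub-directories of `path`, sorted.

    If the directory cannot be opened because of missing permissions,
    return an empty list instead of letting the error propagate.
    """
    try:
        entries = opendir(path)
    except PermissionError:
        # Assumption for this sketch: an unreadable directory is treated
        # as having no listable children.
        return []
    # Consuming the iterator fully also releases the directory handle.
    return sorted(entry.name for entry in entries if entry.is_dir())
```

Passing the opener as a parameter keeps the error path easy to exercise without having to create a directory with restricted permissions.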
@epuertat

jenkins test make check

@epuertat

jenkins test dashboard

@epuertat

To be merged on Monday

@nizamial09 nizamial09 moved this from Reviewer approved to Ready-to-merge in Dashboard Feb 15, 2022
@epuertat epuertat merged commit d31f0b1 into ceph:quincy Feb 15, 2022
Dashboard automation moved this from Ready-to-merge to Done Feb 15, 2022
@epuertat epuertat deleted the wip-dashboard-quincy-backports2 branch February 15, 2022 11:32
8 participants