mgr/dashboard: introduce grafana frontend e2e testing #45811

nizamial09 · 2022-04-07T13:42:03Z

Used the https://www.npmjs.com/package/@grafana/e2e npm packages and
followed
https://github.com/grafana/grafana/blob/main/contribute/style-guides/e2e.md
to understand the style of the grafana e2e testing.

In this PR I introduce the tests for the Hosts Overall
Performance and also RGW per Daemon and Overall Performance.

Description for a prometheus internal server error
After we increase/decrease the count of the node-exporter, we get a 500
Internal Server Error from the api/prometheus/rules endpoint. On further
debugging, it's caused by the JSON decoder: I guess the input passed to
json.loads() is not JSON-formatted. To fix that issue I can either add
error handling around the json.loads() call or move the json.loads() into
the already existing try block. I went for the second approach here.
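
A minimal sketch of the second approach, for illustration only. This is not the actual code in controllers/prometheus.py: `ProxyError` and `parse_proxy_response` are hypothetical names, and the real try block may catch different exceptions. It only shows how decoding inside the try block turns a non-JSON body into a handled error rather than an unhandled exception.

```python
import json


class ProxyError(Exception):
    """Hypothetical stand-in for whatever error the dashboard raises."""


def parse_proxy_response(raw_body: bytes):
    """Decode a proxied response body, handling non-JSON content."""
    try:
        # json.loads() is inside the try block, so an empty body or an
        # HTML error page is caught here instead of escaping as an
        # unhandled JSONDecodeError (which ends up as a 500).
        content = json.loads(raw_body)
    except json.JSONDecodeError as e:
        raise ProxyError(f"non-JSON response from Prometheus: {e}") from e
    return content
```

An empty body reproduces the "Expecting value: line 1 column 1 (char 0)" message, which in this sketch is wrapped in `ProxyError`.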

ERROR TRACEBACK:

Apr 26 08:32:51 ceph-node-00 ceph-mgr[11429]: [dashboard ERROR exception] Internal Server Error
                                              Traceback (most recent call last):
                                                File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 47, in dashboard_exception_handler
                                                  return handler(*args, **kwargs)
                                                File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
                                                  return self.callable(*self.args, **self.kwargs)
                                                File "/usr/share/ceph/mgr/dashboard/controllers/_base_controller.py", line 263, in inner
                                                  ret = func(*args, **kwargs)
                                                File "/usr/share/ceph/mgr/dashboard/controllers/_rest_controller.py", line 191, in wrapper
                                                  return func(*vpath, **params)
                                                File "/usr/share/ceph/mgr/dashboard/controllers/prometheus.py", line 79, in rules
                                                  return self.prometheus_proxy('GET', '/rules', params)
                                                File "/usr/share/ceph/mgr/dashboard/controllers/prometheus.py", line 34, in prometheus_proxy
                                                  verify=Settings.PROMETHEUS_API_SSL_VERIFY)
                                                File "/usr/share/ceph/mgr/dashboard/controllers/prometheus.py", line 58, in _proxy
                                                  content = json.loads(response.content)
                                                File "/lib64/python3.6/json/__init__.py", line 354, in loads
                                                  return _default_decoder.decode(s)
                                                File "/lib64/python3.6/json/decoder.py", line 339, in decode
                                                  obj, end = self.raw_decode(s, idx=_w(s, 0).end())
                                                File "/lib64/python3.6/json/decoder.py", line 357, in raw_decode
                                                  raise JSONDecodeError("Expecting value", s, err.value) from None
                                              json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Apr 26 08:32:51 ceph-node-00 ceph-mgr[11429]: [dashboard INFO request] [::ffff:192.168.100.1:55554] [GET] [200] [0.032s] [admin] [583.0B] /api/prometheus
Apr 26 08:32:51 ceph-node-00 ceph-mgr[11429]: [dashboard ERROR request] [::ffff:192.168.100.1:55528] [GET] [500] [0.043s] [admin] [513.0B] /api/prometheus/rules
Apr 26 08:32:51 ceph-node-00 ceph-mgr[11429]: [dashboard ERROR request] [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "797162d6-0f9b-4d94-89c1-cdc919a6fd3b"}

Fixes: https://tracker.ceph.com/issues/54356
Signed-off-by: Nizamudeen A <nia@redhat.com>

Checklist

Tracker (select at least one)
- References tracker ticket
- Very recent bug; references commit where it was introduced
- New feature (ticket optional)
- Doc update (no ticket needed)
- Code cleanup (no ticket needed)
Component impact
- Affects Dashboard, opened tracker ticket
- Affects Orchestrator, opened tracker ticket
- No impact that needs to be tracked
Documentation (select at least one)
- Updates relevant documentation
- No doc update is appropriate
Tests (select at least one)
- Includes unit test(s)
- Includes integration test(s)
- Includes bug reproducer
- No tests

Show available Jenkins commands

jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows

nizamial09 · 2022-04-07T13:43:35Z

jenkins test dashboard

nizamial09 · 2022-04-07T13:43:43Z

jenkins test dashboard cephadm

nizamial09 · 2022-04-08T11:16:45Z

...gr/dashboard/frontend/cypress/integration/orchestrator/workflow/06-cluster-check.e2e-spec.ts

+      const dashboardArr: Input[] = [
+        {
+          id: 'GRAFANA_API_URL',
+          newValue: 'https://192.168.100.100:3000',


@epuertat @aaSharma14 recently some changes went into cephadm which mean that the FQDN is fetched for some config URLs. Our kcli configuration causes the FQDN to be something like https://ceph-node-00.ceph-dashboard:3000, and this doesn't work. So for now I had to manually change the grafana-api-url.

github-actions · 2022-04-13T13:26:55Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

nizamial09 · 2022-04-18T09:54:58Z

jenkins test make check

avanthakkar · 2022-04-19T16:25:11Z

jenkins test api

avanthakkar · 2022-04-19T16:31:07Z

src/pybind/mgr/dashboard/ci/cephadm/bootstrap-cluster.sh

@@ -11,7 +11,7 @@ mkdir -p /etc/ceph
 mon_ip=$(ifconfig eth0  | grep 'inet ' | awk '{ print $2}')

 bootstrap_extra_options='--allow-fqdn-hostname --dashboard-password-noupdate'
-bootstrap_extra_options_not_expanded='--skip-monitoring-stack'
+bootstrap_extra_options_not_expanded=''


Looks great!
Can we instead remove all these lines (14-17), as we don't need extra options anyway? Or we can just comment these lines out, maybe with a comment saying that they are temporarily commented out and that any extra options needed in the future can be added here. Wdyt @nizamial09 ?

Done! @avanthakkar

epuertat

Really nice job @nizamial09 ! Just some suggestions & comments over there, but overall looks good to me.

epuertat · 2022-04-08T09:40:50Z

src/pybind/mgr/dashboard/ci/cephadm/bootstrap-cluster.sh

@@ -11,7 +11,7 @@ mkdir -p /etc/ceph
 mon_ip=$(ifconfig eth0  | grep 'inet ' | awk '{ print $2}')

 bootstrap_extra_options='--allow-fqdn-hostname --dashboard-password-noupdate'
-bootstrap_extra_options_not_expanded='--skip-monitoring-stack'
+bootstrap_extra_options_not_expanded=''


I didn't realize we weren't running this... So technically our cephadm CI didn't cover the Grafana image PR. Well, it's good to have this enabled. Are we somehow testing that the monitoring stack is healthy? Grafana, prometheus, node-exporter,... are up?

I disabled it previously because of the mgr restart that occurs when prometheus is started, which interrupts the dashboard e2e. Now I have found a workaround for it. I also introduced a check in this PR to verify that all the monitoring stack services are up and running.

epuertat · 2022-04-08T09:50:42Z

src/pybind/mgr/dashboard/ci/cephadm/run-cephadm-e2e-tests.sh

+# otherwise creating the prometheus service cause
+# mgr restart and cause the dashboard to not accessible
+# briefly.
+sleep 120


You know I'm not a fan of unconditional waits: if this finishes in 60 seconds, we'll waste 1 minute waiting; if it takes longer, it'll fail. If this starts failing, we'll simply increase the timer to 3 minutes, 4 minutes, and so on.

Instead, what about actively checking whether all required services are ok? Having pre-conditions or assertions (or waits based on those) prior to starting a test is a good practice that stabilizes complex (integration, e2e) tests. Conditionally waiting up to... 5 or 10 minutes (or the average wait + 3 times the std dev, which is the 3-σ bound, or >99% for a "process following a normal distribution") is ok, as long as it's conditional (most of the time it's less than that).

I agree, and I've made some changes here. Can you take a look and see if it's okay?
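
For illustration, the conditional wait described above could be sketched roughly as below. This is not the code that went into run-cephadm-e2e-tests.sh; `wait_until` and `timeout_from_samples` are hypothetical helpers, and the condition callable stands in for whatever check reports the monitoring services as running.

```python
import statistics
import time
from typing import Callable, Sequence


def timeout_from_samples(samples: Sequence[float]) -> float:
    """Upper bound for the wait: mean + 3 standard deviations.

    Needs at least two observed wait times (in seconds).
    """
    return statistics.mean(samples) + 3 * statistics.stdev(samples)


def wait_until(condition: Callable[[], bool], timeout: float,
               interval: float = 5.0) -> float:
    """Poll condition() until it is true and return the elapsed seconds.

    Raises TimeoutError if the condition is still false after `timeout`
    seconds, so a slow start fails loudly instead of silently.
    """
    start = time.monotonic()
    while True:
        if condition():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            raise TimeoutError(f"condition not met within {timeout:.0f}s")
        time.sleep(interval)
```

With this shape, a run where the services come up after 60 seconds only waits about 60 seconds, while the timeout stays generous enough to cover the slow runs.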

epuertat · 2022-04-08T09:51:45Z

src/pybind/mgr/dashboard/frontend/cypress.json

@@ -18,5 +18,6 @@
      "mochaFile": "cypress/reports/results-[hash].xml"
    }
  },
-  "retries": 1
+  "retries": 1,
+  "chromeWebSecurity": false


Why is it needed (I guess iframe or cookie-related?) and... are there any consequences of this?

It's because of the iframe, yes, and we need to bypass that issue. I went through the official cypress docs (the workarounds are not really applicable for the iframe). I also searched for a while for any serious consequences and couldn't find one. Still, I removed chromeWebSecurity from the cypress.json file and limited its use to the cephadm e2e tests.

src/pybind/mgr/dashboard/frontend/cypress/integration/orchestrator/grafana/grafana.feature

epuertat · 2022-04-19T16:50:06Z

src/pybind/mgr/dashboard/frontend/cypress/plugins/index.js

+    log({ message, optional }) {
+      optional ? console.log(message, optional) : console.log(message);
+      return null;


What's this for?

The grafana-e2e commands like e2e.components.Panels.Panel.title... were failing because they couldn't find a command called log. So I had to manually add it to our code. Then they started passing.

nizamial09 · 2022-04-25T06:50:36Z

jenkins test dashboard cephadm

src/pybind/mgr/dashboard/controllers/prometheus.py

epuertat · 2022-04-26T10:57:48Z

src/pybind/mgr/dashboard/ci/cephadm/bootstrap-cluster.sh

+# commenting the below lines. Uncomment it when any extra options are
+# needed for the bootstrap.
+# bootstrap_extra_options_not_expanded=''
+# {% if expanded_cluster is not defined %}
+#   bootstrap_extra_options+=" ${bootstrap_extra_options_not_expanded}"
+# {% endif %}


The extra_options lines are just commented out for now since we are not passing any extra options. They might be useful later if we decide to pass some extra options, so I thought I'd leave them here. Also, Avan suggested commenting them out. They may be useful locally too: if we want to start a cluster without the monitoring stack, we can just add the skip option here.

src/pybind/mgr/dashboard/frontend/cypress/support/commands.ts

nizamial09 · 2022-04-27T07:37:39Z

jenkins test dashboard

nizamial09 requested a review from a team as a code owner April 7, 2022 13:42

nizamial09 requested review from avanthakkar and pereman2 and removed request for a team April 7, 2022 13:42

github-actions bot added dashboard pybind labels Apr 7, 2022

nizamial09 changed the title ~~mgr/dashboard: introduce grafana frontend e2e testing~~ mgr/dashboard: introduce grafana frontend e2e testing Apr 7, 2022

nizamial09 force-pushed the grafana-e2e branch from 52a22d7 to 73cec6d Compare April 7, 2022 18:21

github-actions bot added this to In progress in Dashboard Apr 7, 2022

nizamial09 force-pushed the grafana-e2e branch 7 times, most recently from 9517018 to 276e979 Compare April 8, 2022 11:11

nizamial09 force-pushed the grafana-e2e branch 3 times, most recently from 5cc5c41 to e79bd58 Compare April 8, 2022 14:06

aaSharma14 requested review from Sarthak0702, aaSharma14 and epuertat April 11, 2022 09:22

github-actions bot added the needs-rebase label Apr 13, 2022

nizamial09 force-pushed the grafana-e2e branch from e79bd58 to c5b0727 Compare April 18, 2022 06:05

github-actions bot removed the needs-rebase label Apr 18, 2022

Dashboard automation moved this from In progress to Reviewer approved Apr 19, 2022

avanthakkar approved these changes Apr 19, 2022

epuertat approved these changes Apr 19, 2022

nizamial09 force-pushed the grafana-e2e branch 2 times, most recently from ea6284e to 3fad102 Compare April 25, 2022 06:39

nizamial09 force-pushed the grafana-e2e branch 7 times, most recently from d3ddd9e to 49c1d25 Compare April 26, 2022 10:55

epuertat approved these changes Apr 26, 2022

nizamial09 force-pushed the grafana-e2e branch 4 times, most recently from 9b6bbe3 to cf998b3 Compare April 26, 2022 15:48

nizamial09 force-pushed the grafana-e2e branch from cf998b3 to 447952c Compare April 26, 2022 17:38

nizamial09 force-pushed the grafana-e2e branch from 447952c to 672e27c Compare April 27, 2022 06:34

nizamial09 moved this from Reviewer approved to Ready-to-merge in Dashboard Apr 27, 2022

epuertat merged commit 88622d2 into ceph:master Apr 27, 2022

Dashboard automation moved this from Ready-to-merge to Done Apr 27, 2022

epuertat deleted the grafana-e2e branch April 27, 2022 12:09

This was referenced Aug 19, 2022

quincy: mgr/dashboard: grafana frontend e2e testing and update cypress #47703

Merged

pacific: mgr/dashboard: grafana frontend e2e testing and update cypress #47721

Merged
