o11y: unavailable ranges in console is coloured green #122014

Closed
ajstorm opened this issue Apr 9, 2024 · 2 comments · Fixed by #123120
Labels
A-cluster-observability Related to cluster observability C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) good first issue O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster P-3 Issues/test failures with no fix SLA T-observability

Comments

@ajstorm
Collaborator

ajstorm commented Apr 9, 2024

On the drt-large cluster we're in the process of recovering from a cluster outage. As a result, there are some unavailable ranges. In the summary on the metric page we see this:

image

What's odd about this is that there are 3K unavailable ranges, and yet the font is green. I'd think that for any number of unavailable ranges, this should be coloured red.

Jira issue: CRDB-37664

@ajstorm ajstorm added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-cluster-observability Related to cluster observability O-testcluster Issues found or occurred on a test cluster, i.e. a long-running internal cluster P-3 Issues/test failures with no fix SLA T-observability labels Apr 9, 2024
@theloneexplorerquest
Contributor

Hi @ajstorm, I would like to work on this issue. Is it still available? 😊

@theloneexplorerquest
Contributor

theloneexplorerquest commented Apr 23, 2024

I think I have found where the issue is:
https://github.com/cockroachdb/cockroach/blob/master/pkg/ui/workspaces/db-console/src/views/shared/components/summaryBar/summarybar.styl#L101
It looks like this number is always green, no matter what the input value is. I will raise a PR soon.
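
For readers following along, here is a minimal sketch of the behaviour requested in this issue: choose the color from the count instead of always using green. This is not the code from the eventual PR. The function names and the CSS class names are hypothetical and are not taken from the CockroachDB codebase.

```typescript
// Hypothetical helper: all names here are illustrative assumptions,
// not identifiers from the db-console source.
type RangeStatusColor = "red" | "green";

// Returns the color for the unavailable-ranges count:
// red for any nonzero count, green when the count is zero.
function unavailableRangesColor(count: number): RangeStatusColor {
  return count > 0 ? "red" : "green";
}

// Builds a CSS class list for the summary value.
// The base class name and modifier scheme are assumed for this sketch.
function unavailableRangesClassName(count: number): string {
  const base = "summary-stat__value";
  return `${base} ${base}--${unavailableRangesColor(count)}`;
}
```

Under this sketch, a count in the thousands, like the roughly 3K unavailable ranges reported above, would be rendered in red, and only a count of zero would stay green.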

theloneexplorerquest added a commit to theloneexplorerquest/cockroach that referenced this issue Apr 26, 2024
Modify the summary bar to change the color of the unavailable ranges count. When the number of unavailable ranges is greater than zero, it is displayed in red; when it is zero, it is displayed in green.

Fix: cockroachdb#122014

Release note (ui): Changed the color of unavailable ranges on the summary bar to red when nonzero; ranges are green when zero.
theloneexplorerquest added a commit to theloneexplorerquest/cockroach that referenced this issue May 1, 2024
Modify the summary bar to change the color of the unavailable ranges count. When the number of unavailable ranges is greater than zero, it is displayed in red; when it is zero, it is displayed in green.

Fix: cockroachdb#122014

Release note (ui): Changed the color of unavailable ranges on the summary bar to red when nonzero; ranges are green when zero.
theloneexplorerquest added a commit to theloneexplorerquest/cockroach that referenced this issue May 3, 2024
Set the color of unavailable ranges on the summary bar to red when nonzero.

Fix: cockroachdb#122014

Release note (ui change): The color of unavailable ranges on the summary bar is now red when nonzero.
theloneexplorerquest added a commit to theloneexplorerquest/cockroach that referenced this issue May 10, 2024
Set the color of unavailable ranges on the summary bar to red when nonzero.

Fix: cockroachdb#122014

Release note (ui change): The color of unavailable ranges on the summary bar is now red when nonzero.
theloneexplorerquest pushed a commit to theloneexplorerquest/cockroach that referenced this issue May 10, 2024
Set the color of unavailable ranges on the summary bar to red when nonzero.

Fix: cockroachdb#122014

Release note (ui change): The color of unavailable ranges on the summary bar is now red when nonzero.
craig bot pushed a commit that referenced this issue May 21, 2024
119416: pkg/util/eventagg: general aggregation framework for reduction of event cardinality r=dhartunian a=abarganier

**Reviewer note: review commit-wise**

The eventagg package is (currently) a proof of concept ("POC") that aims to provide an easy-to-use library that standardizes the way in which we aggregate Observability event data in CRDB. The goal is to eventually emit that data as "exhaust" from CRDB, which downstream systems can consume to build Observability features that do not rely on CRDB's own availability to aid in debugging & investigations. Additionally, we want to provide facilities for code within CRDB to consume this same data, such that it can also power features internally.

This pull request contains work to create the aggregation mechanism in `pkg/util/eventagg`.

These facilities provide a way of aggregating notable events to reduce cardinality before performing further processing and/or structured logging.

In addition to the framework, a toy SQL Stats example is provided in `pkg/sql/sqlstats/aggregate.go`, which shows the current developer experience when using the APIs.

See `pkg/util/eventagg/doc.go` for more details.

Since this feature is currently experimental, it's gated by the `COCKROACH_ENABLE_STRUCTURED_EVENTS` environment variable, which is disabled by default.

---

Release note: none

Epic: CRDB-35919

123120: ui: Highlight unavailable ranges in red on the summary bar with nonzero r=abarganier a=theloneexplorerquest

Modify the summary bar to change the color of the unavailable ranges count. When the number of unavailable ranges is greater than zero, it is displayed in red; when it is zero, it is displayed in green.

Fix: #122014

Release note (ui): Changed the color of unavailable ranges on the summary bar to red when nonzero; ranges are green when zero.

124160: roachtest: add test for admission control disk bandwidth  r=sumeerbhola a=aadityasondhi

This test runs a single-node target cluster that has two workloads
running on it. The lower-priority workload (qos=background) is very
bandwidth intensive and, without the AC bandwidth limiter, would saturate
the provisioned bandwidth (controlled using cgroups).

This test shows how setting the cluster setting
`kvadmission.store.provisioned-bandwidth` limits the disk bandwidth
usage of lower-priority work and shapes it to the value set in the
setting.

Fixes #121576.

Release note: None


124293: tools: switch md5 cmd name based on existence  r=dt a=dt

Release note: none.
Epic: none.

124348: backupccl: download pre restore data in cluster restore r=dt a=msbutler

This patch adds the pre-restore data spans to the list of spans to download.
While these pre-restore spans map to data in the temporary system table
database that is then rewritten to the actual system table, the download job
ought to download all external data linked into the cluster on principle.

Fixes #124330

Release note: none

124403: roachtest: use first transient error when checking for flakes r=srosenberg a=renatolabs

Previously, roachtest would only look at the outermost error in a chain that matched a `TransientError` (or `ErrorWithOwnership`) when checking for flakes. However, that is in most cases *not* what we want: if a transient error wraps another transient error, the actual reason for the failure is the original (wrapped) error.

Informs: #123887

Release note: None

124486: kvclient: add WithFiltering option to rangefeed client r=nvanbenschoten,msbutler a=stevendanna

This adds a WithFiltering option to the rangefeed client that passes through the option to the underlying rangefeed.

Epic: none
Release note: None

124491: raft: remove RawNode.TickQuiesced r=pav-kv a=nvanbenschoten

This commit removes the `(*RawNode).TickQuiesced` method. The method was deprecated back in etcd-io/raft#62 and has not been in use since 2018.

Epic: None
Release note: None

Co-authored-by: Alex Barganier <abarganier@cockroachlabs.com>
Co-authored-by: theloneexplorerquest <theloneexplorerquest@gmail.com>
Co-authored-by: Aaditya Sondhi <20070511+aadityasondhi@users.noreply.github.com>
Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
Co-authored-by: Renato Costa <renato@cockroachlabs.com>
Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig craig bot closed this as completed in bc8bc7e May 21, 2024