Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge release-23.1.18-rc to release-23.1: released CockroachDB version 23.1.18. Next version: 23.1.19 #122235

Conversation

cockroach-teamcity
Copy link
Member

Release note: None
Epic: None
Release justification: non-production (release infra) change.

yuzefovich and others added 15 commits March 20, 2024 17:21
This commit makes it so that we also now lose the references from the
monitor being `Stop`ped to its siblings to aid GC. This was omitted by
mistake originally and _perhaps_ could make GC's job harder.

Epic: None

Release note: None
…ort-release-23.1.18-rc-120789

release-23.1.18-rc: util/mon: lose more references on Stop
This commit is a reduced version of 33b1b61.
It is being backported in order to avoid any possible downsides of using
COCKROACH_ENABLE_MONITOR_TREE env var which, if disabled, makes it so
that we don't increment `numChildren` but keep decrementing it.

This is effectively a philosophical fix of an issue that shouldn't
manifest itself in any way.

Kudos to Nathan for identifying this.

Epic: None

Release note: None
…n: 23.1.18

Release note: None
Epic: None
Release justification: non-production (release infra) change.
…versions-release-23.1.18-rc-v23.1.17-irt5

release-23.1.18-rc: released CockroachDB version 23.1.17. Next version: 23.1.18
This introduces some validation to the `roachtest_nightly_impl` script
to achieve two goals:

* **Reduce the number of tests run every night**. We have seen a quick
increase in the number of legitimate release branches created recently
due to the baking period policy and perhaps a more quick paced patch
release schedule for existing release series. Currently, we have 7
active release branches, with a new one being created soon. This
volume of concurrent cluster creation leads to a lot of cluster
creation errors ("zone exhausted" errors on GCE) and sometimes DNS
errors as well. In this commit, we implement a policy where `-rc`
branches only run 40% of the test suite by default.

* **Avoid running the full suite on accidentally created release
branches**. It is not immediately obvious that creating a branch with
a `release-` prefix leads to a roachtest nightly build being created
for that branch. The roachtest script is updated in this commit to
check for these "invalid" release branches and abort execution
early. It is still possible to force execution by setting an
environment variable.

Epic: none

Release note: None
…ort-release-23.1.18-rc-120921

release-23.1.18-rc: util/mon: remove somewhat redundant numChildren field
We already have an env var for this, but it requires a node restart to
apply. Having a cluster setting will allow us to disable the monitor tree
tracking without the node restart, but it might also help us prove that
it's Go GC deficiency to blame for the suspected memory leak: namely, if
we find a cluster that has the leak and disable the tree tracking via
the cluster setting, if it's Go GC deficiency, the leak would be cleaned
up; if it's CRDB memory leak, then the leak will remain but will stop
growing.

Release note: None
Fixes: cockroachdb#121362

cockroachdb#118680 introduced code to
align the timeseries charts on the statement details page to the time
range set by the time picker, instead of the time range that we had
available data for.

The code contained a bug where it would set the start & end of the
XScale to `Moment.prototype.valueOf` instead of the result of *calling*
`Moment.prototype.valueOf()`. This causes the start & end timestamps to
be set to functions, instead of unix timestamps, which broke the charts.

This PR simply invokes the function.

Release note (bug fix): The timeseries graphs shown on the SQL Activity
statement details page in DB Console will now render properly, after
fixing a bug related to setting the time range of the charts.
…-rc-121366

release-23.1.18-rc: pkg/ui: properly derive XScale in statement details page
…ort-release-23.1.18-rc-121746

release-23.1.18-rc: util/mon: add cluster setting for disabling monitor tree tracking
This commit is partial revert of 88ebd70.
Until that change, we attempted to perform memory accounting for txn
fingeprint IDs stored in the cache for each session, but we never
initialized the bytes monitor, so it didn't actually count towards the
root SQL memory budget. In that change, we derived an account from the
"session" monitor to fix that.

However, this exposed another problem with how accounting was done:
namely, on `Cache.Add` call we always grow the account and on
`Cache.OnEvicted` we shrink the account. The problem is that if the txn
fingerprint ID already exists in the cache, we still call `Cache.Add`
(growing the account) but we will never shrink it because we didn't add
a new entry. As a result, until the session is closed, we'll keep on
accumulating the leak.

The fix in this commit is simple - just remove any attempt for memory
accounting for this txn fingerprint ID cache. Effectively, this brings
us back to the state of how we were before the change mentioned above
(no accounting done) without the overhead of creating redundant
BytesMonitor / BoundAccount objects (since they served no real purpose).
Not having any accounting done for this cache seems acceptable given
that the cache stores up to 100 txns (by default), and each txn results
in about 56B of usage, so we'll have about 5KiB of unaccounted for (per
session) memory usage. We have much larger omissions elsewhere, so for
now I left a TODO to add memory accounting in the future.

Release note (bug fix): CockroachDB could previously "leak" reported
memory usage (as accounted by the internal memory accounting system, the
limit for which is configured via --max-sql-memory flag) on long-running
sessions that issue many (hundreds of thousands or more) transactions.
This, in turn, could result in "root: memory budget exceeded" errors for
other queries. The bug is present in versions 23.1.17 and 23.2.3 and is
now fixed.
…-rc-121847

release-23.1.18-rc: sql: fix leak in memory accounting around TxnFingerprintIDCache
…n 23.1.18. Next version: 23.1.19

Release note: None
Epic: None
Release justification: non-production (release infra) change.
@cockroach-teamcity
Copy link
Member Author

This change is Reviewable

@celiala celiala merged commit af76bcf into cockroachdb:release-23.1 Apr 11, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants