Releases: redpanda-data/redpanda
v23.3.8
Features
- #16941 `rpk redpanda config bootstrap` now supports bootstrapping your advertised addresses configuration. by @r-vasquez in #16942
Bug Fixes
- Fix a crash that happened when a cluster that was partially in recovery mode tried to upload consumer offsets to cloud storage. by @ztlpn in #17022
- Return an HTTP 400 error code instead of a 500 when deploying a transform to a topic that doesn't exist. by @rockwotj in #17018
- Schema Registry: Deleted schemas no longer reappear after certain compaction patterns on the `_schemas` topic. by @BenPope in #17094
- #16679 Retains control batches from transactions to preserve transaction boundaries. This prevents some (very unlikely) scenarios where aborted data is read. by @bharathv in #17100
- PR #17093 [v23.3.x] c/topic_table: replaced partition metadata map with chunked_vector by @mmaslankaprv
- PR #17099 [v23.3.x] storage: ensure monotonic stable offset updates by @nvartolomei
- PR #17111 [v23.3.x] cloud_storage_clients: classify request_timeout as retriable by @nvartolomei
Improvements
- #16815 Node-wide throughput throttling is now fair and responsive. by @BenPope in #16848
- #16993 cluster: Avoid oversize allocs for topic creation and configuration by @BenPope in #17012
- #17107 `rpk profile` has been reworked in an attempt to be simpler; see PR #17038 for more detail. by @twmb in #17108
- PR #17115 [v23.3.x] Using `contiguous_range_map` in `partition_leaders_table` by @mmaslankaprv
- PR #17120 [v23.3.x] rpk profile: a few more fixes by @twmb
Full Changelog: v23.3.7...v23.3.8
v23.3.7
Features
- You can create namespaces in Redpanda cloud using rpk cloud namespace. by @r-vasquez in [#16777](https://github.com/redpanda-data/redpanda/pull/16777)
- #16570 [#16572](https://github.com/redpanda-data/redpanda/issues/16572) Publish log (i.e. stderr/stdout) output from data transforms exclusively to an internally managed Redpanda topic (`_redpanda.transform_logs`). Data transform logs will no longer appear in broker logs. by @oleiman in #16663
- #16895 Add Prometheus metrics for data transforms logging by @oleiman in #16913
Bug Fixes
- Fixes a plausible correctness issue with idempotent requests during replication failures. by @bharathv in #16749
- #16129 Fixes a bug in SASL user deletion and update where usernames containing a + symbol could not be deleted. by @pgellert in [#16811](https://github.com/redpanda-data/redpanda/pull/16811)
- #16659 Fixes a bug in the tiered storage time-based query implementation that could result in a consumer hang when consuming very old data. by @andrwng in #16660
- #16717 Fixed a few oversized allocations for some admin server endpoints. by @rockwotj in #16719
- #16884 Fixed deleting Data Transforms with names that had URL unsafe characters by @rockwotj in #16885
- #16937 Fixes a bug in windowed compaction that could cause Redpanda to crash when an error occurs while reading batches. by @andrwng in [#16940](https://github.com/redpanda-data/redpanda/pull/16940)
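
Two of the fixes above (#16129 and #16884) involve names containing characters that are special in URLs. The notes do not describe the root cause of either bug, so the following is only a general illustration of the pitfall, written in Python with the standard library and not taken from Redpanda or rpk. The helper name `encode_path_segment` and the sample username are hypothetical.

```python
from urllib.parse import quote, unquote, unquote_plus

def encode_path_segment(name: str) -> str:
    """Percent-encode a name so it is safe as a single URL path segment."""
    # safe="" also encodes "/", so the name cannot split into two segments.
    return quote(name, safe="")

username = "alice+ops"  # hypothetical username containing "+"
encoded = encode_path_segment(username)

print(encoded)                 # alice%2Bops
print(unquote(encoded))        # alice+ops  (round-trips correctly)
print(unquote_plus(username))  # alice ops  (an unencoded "+" read as a space)
```

The last line shows how a name sent without encoding can be misread: form-style decoding turns a literal `+` into a space, so the server would look up a different name than the one the client meant.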
Improvements
- Adds observability into producer evictions in each shard. by @bharathv in [#16839](https://github.com/redpanda-data/redpanda/pull/16839)
- Fix large wasm module deployments by @rockwotj in #16767
- Increase `data_transforms_logging_buffer_capacity_bytes` from 100KiB to 500KiB by @oleiman in #16977
- Large allocations are now logged by default (similar to reactor stalls) by @StephanDollberg in #16844
- #16795 Added ability to change transactional manager topic properties by @mmaslankaprv in #16968
- #16831 get_cluster_uuid returns a correctly formatted string by @andijcr in #16832
- #16888 Data Transform builds in rpk now use tinygo v0.31.1 by @rockwotj in #16889
- #16947 Better control of memory usage in the storage layer. by @mmaslankaprv in #16963
- #16997 Added `EHOSTUNREACH` to retry-able error code list by @michael-redpanda in #16998
- Optimized updating leadership metadata with health reports by @mmaslankaprv in [#16709](https://github.com/redpanda-data/redpanda/pull/16709)
- Prevent large allocation in partition balancer code by @mmaslankaprv in [#16939](https://github.com/redpanda-data/redpanda/pull/16939)
- rpk: Remove 10s timeout in `rpk profile create` by @r-vasquez in [#16852](https://github.com/redpanda-data/redpanda/pull/16852)
- PR #16682 [v23.3.x] Implement async_for_each by @travisdowns
- PR #16688 [v23.3.x] Add forward iterator to async_for_each by @travisdowns
- PR #16691 [v23.3.x] Rethrow on unknown exceptions in fetch handler by @ballard26
- PR #16784 [v23.3.x] c/leaders: trigger leadership notification when term changes by @mmaslankaprv
- PR #16801 [v23.3.x] c/topic_table_probe: use btree_map in topic table probe by @mmaslankaprv
- PR #16829 [v23.3.x] rpk: update help text of decommission-status by @daisukebe
- PR #16891 [v23.3.x] cmake: upgrade tinygo compiler by @rockwotj
- PR #16894 [v23.3.x] cloud_storage: Improve stale_reader test by @Lazin
- PR #16897 [v23.3.x] Fixed background apply fiber race condition in `raft::state_machine_manager` by @mmaslankaprv
- PR #16903 [v23.3.x] cloud_storage: various non-functional changes by @andrwng
- PR #16908 Revert "[v23.3.x] rm_stm/idempotency: fix the producer lock scope" by @bharathv
- PR #16935 [v23.3.x] fix for cluster_config_test.py::test_aliasing by @andijcr
- PR #16965 [v23.3.x] Ensure `fragment_vector` fragments are always <= 128KiB by @ballard26
Full Changelog: v23.3.6...v23.3.7
v23.3.6
Bug Fixes
- Fix a bug where config values that were reset to their defaults were ignored by Redpanda until the next restart. by @ztlpn in #16638
- Prevent detecting leader epoch advancement when state is not up to date by @mmaslankaprv in [#16573](https://github.com/redpanda-data/redpanda/pull/16573)
- #16621 Avoid a large contiguous allocation when creating thousands of topics in a single CreateTopics request. by @travisdowns in #16622
- #16627 #16628 `rpk tune --output-script`: Add a missing new line in the ballast file tuner when using the `--output-script` flag by @r-vasquez in #16629
Improvements
- Validate transform code at deploy time to ensure the correct SDK is used. by @rockwotj in [#16498](https://github.com/redpanda-data/redpanda/pull/16498)
- #16627 #16628 `rpk tune --output-script`: rpk now creates a file for you if the provided file does not exist. by @r-vasquez in [#16629](https://github.com/redpanda-data/redpanda/pull/16629)
- PR #16546 [v23.3.x] rpc: Add config flag to enable/disable compression for replies by @StephanDollberg
- PR #16565 [v23.3.x] introduce chunked_vector by @rockwotj
- PR #16569 [v23.3.x] rpc: Disable compression for internal rpc replies by @StephanDollberg
Full Changelog: v23.3.5...v23.3.6
v23.2.26
Bug Fixes
- Fix a bug where config values that were reset to their defaults were ignored by Redpanda until the next restart. by @ztlpn in #16641
- #16624 #16625 `rpk tune --output-script`: Add a missing new line in the ballast file tuner when using the `--output-script` flag by @r-vasquez in #16626
Improvements
- #16624 #16625 `rpk tune --output-script`: rpk now creates a file for you if the provided file does not exist. by @r-vasquez in [#16626](https://github.com/redpanda-data/redpanda/pull/16626)
- PR #16493 [v23.2.x] Fixed large allocation in `kafka::wait_for_leaders` by @mmaslankaprv
- PR #16550 [v23.2.x] rpc: Add config flag to enable/disable compression for replies by @StephanDollberg
- PR #16641 [v23.2.x] config: update bindings when properties are reset by @andijcr
Full Changelog: v23.2.25...v23.2.26
v23.3.5
Features
- #16411 You can print a schema now using `rpk registry schema get --print-schema`. by @r-vasquez in #16412
- PR #16292 [v23.3.x] "enable by default spillover manifest" testing followups by @andijcr
Bug Fixes
- Aggregates partitions in some cloud storage metrics when the `aggregate_metrics` cluster config is set to true. by @ballard26 in #16344
- #16350 rpk: fixed a bug where the `--password` flag could not be used along with the new configuration flag `-X pass` in clusters where basic authentication was enabled. by @r-vasquez in #16351
- #16389 Prevent oversized allocation with large amounts of controller metadata by @rockwotj in #16390
- #16391 Fixes a bug that may prevent redpanda from shutting down cleanly when auditing is enabled by @graphcareful in #16392
- #16393 Fix graceful shutdown of the TS archive area retention procedure. by @Lazin in #16394
- #16404 Fixes issue that causes the connection to hang when an unsupported compression type is passed via an incremental_alter_configs request by @graphcareful in #16406
- #16450 #16451 Fix an issue where create topics responses would show incorrect partition count and replication factor by @oleiman in #16452
- #16500 Fix assertion triggered by interleaving of log flush and log truncation followed by append by @Lazin in #16501
- #16517 Fix timequery error that triggered full partition scan by @Lazin in #16518
Improvements
- Adds a new cluster configuration property `fetch_read_strategy`. This property determines which fetch execution strategy Redpanda will use to fulfill a fetch request. The newly introduced `non_polling` execution strategy is the default for this property with the `polling` strategy being included to make backporting possible. by @ballard26 in #16484
- Improved handling of follower fetching offset validation when used with relaxed consistency by @mmaslankaprv in #16522
- Improves observability by allowing Redpanda to detect that some internal processes are stuck. by @Lazin in #16476
- Introduces a new non-polling fetch execution strategy that decreases CPU utilization of fetch requests and fetch request latency. by @ballard26 in #16484
- Publish total reclaimable space to avoid stuck decommission scenario. by @dotnwat in #16422
- SIMD instructions are generated by default for WebAssembly binaries when building with `rpk`. by @rockwotj in #16403
- #16362 rpk cluster health: now `--exit-when-healthy` enables `--watch` when provided. by @r-vasquez in #16363
- PR #16294 [v23.3.x] transform-sdk/rust: borrow output record by @rockwotj
- PR #16300 [v23.3.x] Transform logging data model by @oleiman
- PR #16316 [v23.3.x] transform-sdk/rust: update rustdocs by @rockwotj
- PR #16345 [v23.3.x] Wrapped logging with `vlog` macro in places that missed it by @mmaslankaprv
- PR #16401 [v23.3.x] cloud_storage_clients/client_pool: handle broken _self_config_barrier by @andijcr
- PR #16432 [v23.3.x] Fixed large allocation in `kafka::wait_for_leaders` by @mmaslankaprv
- PR #16436 [v23.3.x] Introduce `transform::logging::manager` by @oleiman
- PR #16439 [v23.3.x] rptest: Fix s3.copy_object when running on GCP by @savex
- PR #16492 [v23.3.x] k/metadata: guesstimate leader when information is not yet present by @mmaslankaprv
- PR #16528 Use the original fetch impl by default in backports by @ballard26
- PR #16531 [v23.3.x] Increase default value of `rpc_client_connections_per_peer` to 32 by @ballard26
Full Changelog: v23.3.4...v23.3.5
v23.2.25
Bug Fixes
- #16348 rpk: fixed a bug where the `--password` flag could not be used along with the new configuration flag `-X pass` in clusters where basic authentication was enabled. by @r-vasquez in #16349
- #16387 Prevent oversized allocation with large amounts of controller metadata by @rockwotj in #16388
- #16405 Fixes issue that causes the connection to hang when an unsupported compression type is passed via an incremental_alter_configs request by @graphcareful in #16407
- #16447 #16448 Fix an issue where create topics responses would show incorrect partition count and replication factor by @oleiman in #16449
- #16479 Fix timequery error that triggered full partition scan by @Lazin in #16520
- PR #16435 [v23.2.x] Fixed large allocation in `kafka::wait_for_leaders` by @mmaslankaprv
Improvements
- Improved handling of follower fetching offset validation when used with relaxed consistency by @mmaslankaprv in #16530
- Improves observability by allowing Redpanda to detect that some internal processes are stuck. by @Lazin in #16477
- #16423 Publish total reclaimable space to avoid stuck decommission scenario. by @dotnwat in #16442
- PR #16219 [v23.2.x] Introduced partition shutdown watchdog timer by @mmaslankaprv
- PR #16494 [v23.2.x] Wrapped logging with `vlog` macro in places that missed it by @mmaslankaprv
Full Changelog: v23.2.24...v23.2.25
v23.3.4
Bug Fixes
- Fix the starter code for Rust projects in `rpk transform init` by @rockwotj in #16194
- Report runtime public metrics by task queue for all cores, not just core 0 by @rockwotj in #16203
- #16271 Fixes a bug that caused read replicas to report the wrong value for the `redpanda_kafka_max_offset` metric. by @andrwng in #16272
Improvements
- Smaller memory footprint when running a large number of topics with small partition counts by @mmaslankaprv in #16266
- PR #16182 [v23.3.x] Rename max_client_count to max_connection_count by @travisdowns
- PR #16195 [v23.3.x] Introduce wasm::logger by @oleiman
- PR #16216 [v23.3.x] Introduced partition shutdown watchdog timer by @mmaslankaprv
- PR #16236 [v23.3.x] c/topics_dispatcher: do not guesstimate leader ids by @mmaslankaprv
- PR #16255 [v23.3.x] admin api: skip partition info in /brokers end point by @bharathv
Full Changelog: v23.3.3...v23.3.4
v23.2.24
Bug Fixes
- #16274 Fixes a bug that caused read replicas to report the wrong value for the `redpanda_kafka_max_offset` metric. by @andrwng in #16275
- PR #16221 [v23.2.x] Fixed skipping application of raft snapshot by @mmaslankaprv
- PR #16239 [v23.2.x] c/topics_dispatcher: do not guesstimate leader ids by @mmaslankaprv
- PR #16282 [v23.2.x] cloud_storage: hold gate in hydration by @andrwng
Improvements
- Smaller memory footprint when running a large number of topics with small partition counts by @mmaslankaprv in #16267
- PR #16254 [v23.2.x] admin api: skip partition info in /brokers end point by @bharathv
Full Changelog: v23.2.23...v23.2.24
v23.3.3
Features
- Spillover manifests are enabled by default for clusters that did not explicitly set a value or null by @andijcr in #16174
Bug Fixes
- PR #16178 [v23.3.x] c/log_eviction_stm: do not request snapshot if already progressed by @mmaslankaprv
- Fix internal RPC client connection stall after more than 2^32 requests are sent. by @ztlpn in #16176
- Fix large allocation in partition manifest. by @dotnwat in #16188
- Fix tiered-storage housekeeping problem that may cause replaced segments to pile up if the spillover is enabled. by @Lazin in #16167
- Fix tiered-storage housekeeping problem that may cause replaced segments to pile up if the spillover is enabled. by @Lazin in #16170
- Protect against a very rare scenario where after node restart, some of the partition replicas hosted on that node could not take part in leader elections. by @ztlpn in #16080
- #15811 Several additional metrics will have their "partition" label aggregated away (i.e., into a single series per remaining label set with no partition label, whose value is the sum of all input series with the same label set and different partition labels). This is already the default behavior for most metrics, but this change extends it to almost all remaining metrics. by @travisdowns in #16094
- #16093 Several additional metrics will have their "partition" label aggregated away (i.e., into a single series per remaining label set with no partition label, whose value is the sum of all input series with the same label set and different partition labels). This is already the default behavior for most metrics, but this change extends it to almost all remaining metrics. by @travisdowns in #16100
- fixed incorrect fetch offset validation by @mmaslankaprv in #16169
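
The "partition" label aggregation described in the two metrics entries above can be sketched as follows. This is a minimal Python illustration with made-up series, not Redpanda's implementation; the function name `aggregate_away` and the topic names and values are hypothetical.

```python
from collections import defaultdict

def aggregate_away(series, label="partition"):
    """Sum series that differ only in `label`, dropping that label."""
    totals = defaultdict(float)
    for labels, value in series:
        # Key on the remaining labels; sort for a stable, hashable key.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        totals[key] += value
    return dict(totals)

# Hypothetical input: one series per (topic, partition) pair.
series = [
    ({"topic": "orders", "partition": "0"}, 10.0),
    ({"topic": "orders", "partition": "1"}, 5.0),
    ({"topic": "payments", "partition": "0"}, 7.0),
]

for key, total in aggregate_away(series).items():
    print(dict(key), total)
# {'topic': 'orders'} 15.0
# {'topic': 'payments'} 7.0
```

Three input series collapse into two: one per remaining label set, each holding the sum over its partitions.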
Improvements
- Add a dedicated CPU scheduling policy for Data Transforms by @rockwotj in #16139
- Reduces the number of allocations performed by the auditing subsystem by @graphcareful in #16147
- `rpk transform deploy --file` now supports `https://` URLs by @rockwotj in #16063
- PR #16111 [v23.3.x] archival: Start housekeeping jobs after STM sync by @Lazin
- PR #16103 [v23.3.x] Add at: in Top-N alloc site output by @travisdowns
Full Changelog: v23.3.2...v23.3.3
v23.2.23
Bug Fixes
- Fix internal RPC client connection stall after more than 2^32 requests are sent. by @ztlpn in #16175
- Fix large allocation in partition manifest. by @dotnwat in #16191
- Fix tiered-storage housekeeping problem that may cause replaced segments to pile up if the spillover is enabled. by @Lazin in #16166
- Fix tiered-storage housekeeping problem that may cause replaced segments to pile up if the spillover is enabled. by @Lazin in #16171
- Have fetch handler ensure rack awareness is enabled before performing follower fetching by @michael-redpanda in #15914
- Protect against a very rare scenario where after node restart, some of the partition replicas hosted on that node could not take part in leader elections. by @ztlpn in #16082
- #15839 Safer handling of unknown properties in local state by @andijcr in #15874
- #15925 Prevent oversized allocs when group fetching from many partitions. by @rockwotj in #15926
- ext4 is no longer incorrectly detected as ext2 (all of ext2, 3 and 4 are assumed to be ext4). by @travisdowns in #15812
- fixed incorrect fetch offset validation by @mmaslankaprv in #16168
- PR #15816 [v23.2.x] c/archival_stm: do not reset _last_replicate on timeout by @nvartolomei
- PR #15989 [v23.2.x] r/offset_translator: remove unsafe bootstrap code by @ztlpn
Improvements
- #15829 Added new metric to provide Follower Fetching feature observability by @mmaslankaprv in #15831
- PR #16099 [v23.2.x] archival: Use explicit types to encode upload candidate creation result by @abhijat
- PR #16104 [v23.2.x] Add at: in Top-N alloc site output by @travisdowns
Full Changelog: v23.2.22...v23.2.23