Features
- Allow use of `rpk cluster config get` in cloud clusters. by @andresaristizabal in #26161
- PR #26078 [v25.1.x] make ntp_callbacks actually support multiple callbacks by @bashtanov
- PR #26183 [v25.1.x] c/archival: wakeup upload loop after flush by @ztlpn
- PR #26211 [v25.1.x] Improve safe pause resume by @Lazin
- PR #26225 [v25.1.x] kafka/debug: add a debug end point for offset_for_leader_epoch by @bharathv
- PR #26229 [v25.1.x] `storage`: output full `segment` in `WARN` log in `offset_to_filepos.cc` by @WillemKauf
- PR #26438 Backport #26426 to v25.1.x by @wdberkeley
Bug Fixes
- Allow partition balancing to operate when space management is enabled but the local target capacity is unset. by @ztlpn in #26304
- Enable TCP keepalive for cloud storage connections. by @Lazin in #26411
- Fix Redpanda crash if `partition_autobalancing_concurrent_moves` was set to 0. by @ztlpn in #26304
- Properly set TLS SNI information for Iceberg REST catalog connections. by @wdberkeley in #26370
- Several Iceberg REST catalog configurations are now correctly marked as needing restart. by @andrwng in #26218
- When Tiered Storage is paused and data is allowed to expire from local storage, there will be gaps between the last offset in tiered storage and the first offset in local storage. If local storage was truncated in the middle of a segment (e.g. by time-based retention or via trim-prefix/delete records commands), tiered storage might get stuck with the following exception: `Failed to schedule upload: std::runtime_error (ntp {kafka/foo/0}: log offset N is outside the translation range (starting at M > N))`. This is fixed by adjusting the upload start offset to the first available and valid offset. Although the segment might hold a bit more data, other information about that data (i.e. offset translation) is gone with prefix truncation. by @nvartolomei in #26066
- #26191 Fixes a bug in which a broker would crash during sliding window compaction when started with `log_compaction_use_sliding_window=false` and its value was later set to `true` without restarting. by @WillemKauf in #26197
- `partition_autobalancing_mode=off` now stops on-demand partition rebalance as well. by @ztlpn in #26304
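
The upload start offset adjustment described in the Tiered Storage fix above can be illustrated with a minimal Python sketch. This is not Redpanda's implementation: `LocalLogState`, its fields, and `adjust_upload_start` are hypothetical names introduced only for illustration, and the sketch assumes the fix amounts to clamping the requested start offset to the first offset that is both present locally and covered by offset translation.

```python
from dataclasses import dataclass


@dataclass
class LocalLogState:
    """Hypothetical view of a partition's local log (not a Redpanda type)."""
    start_offset: int              # first offset still present in local storage
    translation_start_offset: int  # first offset with offset-translation state


def adjust_upload_start(requested: int, log: LocalLogState) -> int:
    """Clamp the upload start to the first available and translatable offset."""
    first_valid = max(log.start_offset, log.translation_start_offset)
    return max(requested, first_valid)


# Assumed scenario: tiered storage ends at offset 99, so the next upload
# would start at N = 100, but local storage was prefix-truncated mid-segment
# and translation state now starts at M = 250 (M > N).
log = LocalLogState(start_offset=250, translation_start_offset=250)
print(adjust_upload_start(100, log))  # 250
print(adjust_upload_start(300, log))  # 300
```

In this sketch a request that falls before the translation range is moved forward to its start instead of failing, while a request already inside the range is left unchanged.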
Improvements
- Adds the `storage_log_adjacent_segments_compacted` metric for better observability into adjacent segment compaction. by @WillemKauf in #26203
- Allow changing `redpanda.iceberg.mode` dynamically at runtime by @bharathv in #26171
- Improved handling of rf=1 partitions in health reporting by @mmaslankaprv in #26101
- In AlterPartitionReassignmentsResponse, the per-partition response uses the REASSIGNMENT_IN_PROGRESS error code if a reassignment is requested while Partition Balancer is moving partition replicas. by @bashtanov in #26347
- Made it easier to detect and diagnose node operation issues by @mmaslankaprv in #26147
- Swap out an internal data structure in the `storage` layer to prevent oversized allocations and crashes when a large number of `segment`s are present in a `partition`. by @WillemKauf in #26134
- Better observability of state machine shutdown issues by @mmaslankaprv in #26413
- rpk debug bundle: improve reliability of debug bundle collection in k8s environments. by @r-vasquez in #26164
- rpk: `decommission-status` reports reallocation failure details by @daisukebe in #26253
- rpk: introduce logger to our `rpk registry` commands. (Works with -v) by @r-vasquez in #26251
- PR #26081 [v25.1.x] archival: Fix archival_stm_snapshot installation by @Lazin
- PR #26145 [v25.1.x] r/consensus: do not block leadership completely in maintenance mode by @mmaslankaprv
- PR #26168 [v25.1.x] storage: CORE-10056: Remove contiguous allocations in lock_manager by @wdberkeley
- PR #26267 [v25.1.x] Fix archival STM shutdown race by @bashtanov
- PR #26275 [v25.1.x] r/consensus: stop consumable offset monitor by @mmaslankaprv
- PR #26327 [v25.1.x] datalake: hold gate when interacting with catalog in schema manager by @mmaslankaprv
Full Changelog: v25.1.4...v25.1.5