2 changes: 1 addition & 1 deletion config.toml
@@ -55,7 +55,7 @@ rdi_redis_gears_version = "1.2.6"
rdi_debezium_server_version = "2.3.0.Final"
rdi_db_types = "cassandra|mysql|oracle|postgresql|sqlserver"
rdi_cli_latest = "latest"
-rdi_current_version = "1.14.1"
+rdi_current_version = "1.15.0"

[params.clientsConfig]
"Python"={quickstartSlug="redis-py"}
@@ -144,9 +144,6 @@ processors:
# Time (in ms) after which data will be read from stream even if
# read_batch_size was not reached.
# duration: 100
# Data type to use in Redis target database: `hash` for Redis Hash,
# `json` for JSON (which requires the RedisJSON module).
# target_data_type: hash
# The batch size for writing data to the target Redis database. Should be
# less than or equal to the read_batch_size.
# write_batch_size: 200
@@ -155,8 +152,26 @@
# Max size of the deduplication set (default: 1024).
# dedup_max_size: <DEDUP_MAX_SIZE>
# Error handling strategy: ignore - skip, dlq - store rejected messages
# in a dead letter queue.
# error_handling: dlq
# Dead letter queue max messages per stream.
# dlq_max_messages: 1000
# Data type to use in Redis target database: `hash` for Redis Hash,
# `json` for JSON (which requires the RedisJSON module).
# target_data_type: hash
# Number of processes to use when syncing initial data.
# initial_sync_processes: 4
# Checks if the batch has been written to the replica shard.
# wait_enabled: false
# Timeout in milliseconds when checking write to the replica shard.
# wait_timeout: 1000
# Ensures that a batch has been written to the replica shard and keeps
# retrying if not.
# retry_on_replica_failure: true
# Enable merge as the default strategy for writing JSON documents.
# json_update_strategy: merge
# Use native JSON merge if the target RedisJSON module supports it.
# use_native_json_merge: true
```
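Taken together, a minimal uncommented `processors` section based on the commented template above might look like this (the values are illustrative, not tuning recommendations):

```yaml
processors:
  # Store target data as JSON and merge updates via the native JSON.MERGE path.
  target_data_type: json
  json_update_strategy: merge
  use_native_json_merge: true
  # Send rejected messages to a dead letter queue instead of dropping them.
  error_handling: dlq
  dlq_max_messages: 1000
  # Use four parallel processes for the initial snapshot.
  initial_sync_processes: 4
```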

## Sections
@@ -178,11 +178,27 @@ it without the `noexec` option. See
or your company policy forbids installing there. You can
select a different directory for the K3s installation using the
`--installation-dir` option with `install.sh`:
```bash
sudo ./install.sh --installation-dir <custom-installation-directory>
```
{{< /note >}}

**Advanced**: You can also pass custom K3s parameters to the installer using the
`INSTALL_K3S_EXEC` environment variable. For example, to set the kubeconfig file
permissions to be readable by all users:

```bash
sudo INSTALL_K3S_EXEC='--write-kubeconfig-mode=644' ./install.sh
```

You can combine multiple K3s options in the `INSTALL_K3S_EXEC` variable. See the
[K3s documentation](https://docs.k3s.io/installation/configuration) for a full list of
available options.
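For example, to combine the kubeconfig permission setting with a custom node name in a single install (the `--node-name` value here is purely illustrative):

```bash
sudo INSTALL_K3S_EXEC='--write-kubeconfig-mode=644 --node-name=rdi-node-1' ./install.sh
```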

{{< warning >}}Only modify K3s parameters if you understand exactly what you are changing
and why. Incorrect K3s configuration can cause RDI installation to fail or result in an
unstable deployment. {{< /warning >}}


The RDI installer collects all necessary configuration details and alerts you to potential issues,
offering options to abort, apply fixes, or provide additional information.
@@ -53,6 +53,7 @@ Configuration settings that control how data is processed, including batch sizes
| **dlq_max_messages**<br/>(DLQ message limit) | `integer`, `string` | Maximum number of messages to store in dead letter queue per stream<br/>Default: `1000`<br/>Pattern: `^\${.*}$`<br/>Minimum: `1`<br/> | |
| **target_data_type**<br/>(Target Redis data type) | `string` | Data type to use in Redis: hash for Redis Hash, json for RedisJSON (requires RedisJSON module)<br/>Default: `"hash"`<br/>Pattern: `^\${.*}$\|hash\|json`<br/> | |
| **json_update_strategy** | `string` | (DEPRECATED)<br/>Property 'json_update_strategy' will be deprecated in future releases. Use 'on_update' job-level property to define the json update strategy.<br/>Default: `"replace"`<br/>Pattern: `^\${.*}$\|replace\|merge`<br/> | |
| **use_native_json_merge**<br/>(Use native JSON merge) | `boolean` | Controls whether to use the native `JSON.MERGE` command (when `true`) or Lua scripts (when `false`) for JSON merge operations. Introduced in RDI 1.15.0. The native command provides 2x performance improvement but handles null values differently:<br/><br/>**Previous behavior (Lua merge)**: When merging `{"field1": "value1", "field2": "value2"}` with `{"field2": null, "field3": "value3"}`, the result was `{"field1": "value1", "field2": null, "field3": "value3"}` (null value is preserved)<br/><br/>**New behavior (JSON.MERGE)**: The same merge produces `{"field1": "value1", "field3": "value3"}` (null value removes the field, following [RFC 7396](https://datatracker.ietf.org/doc/html/rfc7396))<br/><br/>**Note**: The native `JSON.MERGE` command requires RedisJSON 2.6.0 or higher. If the target database has an older version of RedisJSON, RDI will automatically fall back to using Lua-based merge operations regardless of this setting.<br/><br/>**Impact**: If your application logic distinguishes between a field with a `null` value and a missing field, you may need to adjust your data handling. This follows the JSON Merge Patch RFC standard but differs from the previous Lua implementation. Set to `false` to revert to the previous Lua-based merge behavior if needed.<br/>Default: `true`<br/> | |
| **initial_sync_processes** | `integer`, `string` | Number of parallel processes for performing initial data synchronization<br/>Default: `4`<br/>Pattern: `^\${.*}$`<br/>Minimum: `1`<br/>Maximum: `32`<br/> | |
| **idle_sleep_time_ms**<br/>(Idle sleep interval) | `integer`, `string` | Time in milliseconds to sleep between processing batches when idle<br/>Default: `200`<br/>Pattern: `^\${.*}$`<br/>Minimum: `1`<br/>Maximum: `999999`<br/> | |
| **idle_streams_check_interval_ms**<br/>(Idle streams check interval) | `integer`, `string` | Time in milliseconds between checking for new streams when processor is idle<br/>Default: `1000`<br/>Pattern: `^\${.*}$`<br/>Minimum: `1`<br/>Maximum: `999999`<br/> | |
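As the `Pattern` column (`^\${.*}$`) indicates, most of these fields accept either a literal value or a `${...}` template reference (presumably resolved from the deployment environment); the variable name below is hypothetical:

```yaml
processors:
  dlq_max_messages: ${RDI_DLQ_MAX_MESSAGES}
  target_data_type: json
  use_native_json_merge: false
```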
@@ -0,0 +1,72 @@
---
Title: Redis Data Integration release notes 1.15.0 (October 2025)
alwaysopen: false
categories:
- docs
- operate
- rs
description: |
Flink collector for Spanner enabled by default for improved user experience.
Enhanced high availability with configurable leader election and standby mode.
Support for sharded RDI Redis databases.
Improved configuration validation and monitoring capabilities.
Better resource management and security enhancements.
linkTitle: 1.15.0 (October 2025)
toc: 'true'
weight: 976
---

RDI's mission is to help Redis customers sync Redis Enterprise with live data from their slow disk-based databases to:

- Meet the required speed and scale of read queries and provide an excellent and predictable user experience.
- Save resources and time when building pipelines and coding data transformations.
- Reduce the total cost of ownership by saving money on expensive database read replicas.

RDI keeps the Redis cache up to date with changes in the primary database, using a [_Change Data Capture (CDC)_](https://en.wikipedia.org/wiki/Change_data_capture) mechanism.
It also lets you _transform_ the data from relational tables into convenient and fast data structures that match your app's requirements. You specify the transformations using a configuration system, so no coding is required.

## What's New in 1.15.0

{{<warning>}}
**Breaking change when using JSON with `json_update_strategy: merge`**

RDI now uses the native `JSON.MERGE` command instead of Lua scripts for JSON merge operations. While this provides significant performance improvements (2x faster), there is a **functional difference** in how null values are handled:

- **Previous behavior (Lua merge)**: When merging `{"field1": "value1", "field2": "value2"}` with `{"field2": null, "field3": "value3"}`, the result was `{"field1": "value1", "field2": null, "field3": "value3"}` (null value is preserved)
- **New behavior (JSON.MERGE)**: The same merge produces `{"field1": "value1", "field3": "value3"}` (null value removes the field, following [RFC 7396](https://datatracker.ietf.org/doc/html/rfc7396))

**Impact**: If your application logic distinguishes between a field with a `null` value and a missing field, you may need to adjust your data handling. This follows the JSON Merge Patch RFC standard but differs from the previous Lua implementation.

**Configuration**: You can control this behavior using the `use_native_json_merge` property in the processors section of your configuration. Set it to `false` to revert to the previous Lua-based merge behavior if needed.
{{</warning>}}
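The difference between the two merge behaviors can be modeled with a short Python sketch (this imitates the semantics for illustration only; it is not RDI's implementation):

```python
def lua_style_merge(target: dict, patch: dict) -> dict:
    """Previous RDI behavior, modeled as a shallow update: explicit nulls are kept."""
    result = dict(target)
    result.update(patch)  # a None value is stored as-is
    return result


def json_merge_patch(target, patch):
    """RFC 7396 semantics, as followed by JSON.MERGE: null deletes the field."""
    if not isinstance(patch, dict):
        return patch  # a non-object patch replaces the target wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null removes the field entirely
        else:
            result[key] = json_merge_patch(result.get(key, {}), value)
    return result


doc = {"field1": "value1", "field2": "value2"}
patch = {"field2": None, "field3": "value3"}

print(lua_style_merge(doc, patch))   # {'field1': 'value1', 'field2': None, 'field3': 'value3'}
print(json_merge_patch(doc, patch))  # {'field1': 'value1', 'field3': 'value3'}
```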

- **Native JSON merge for improved performance**: RDI now automatically uses the native `JSON.MERGE` command from RedisJSON 2.6.0+ instead of Lua scripts for JSON merge operations, providing 2x performance improvement. This feature is enabled by default and can be controlled via the `use_native_json_merge` property in the processors section of the configuration. **Note**: If the target Redis database has RedisJSON version lower than 2.6.0, the processor will automatically revert to using the Lua-based merge implementation.
- **Support for sharded Redis databases**: RDI now supports using a multi-sharded Redis Enterprise database as the RDI database, resolving cross-slot violations when reading from streams.
- **Enhanced processor performance metrics**: Detailed performance metrics are now exposed through the metrics exporter and statistics endpoint, with separate tracking for transformation time and write time.
- **Resource management improvements**: Collector and processor pods now support configurable resource requests, limits, and node affinity/tolerations for better cluster resource utilization.
- The `collector` defaults to 1 CPU and 1024Mi memory (requests), with limits of 4 CPUs and 4096Mi memory.
- The `processor` defaults to 1 CPU and 512Mi memory (requests), with limits of 4 CPUs and 3072Mi memory.

- **Leadership status monitoring**: New metrics expose leadership status and pipeline phase information for better monitoring of HA deployments.
- The `rdi_operator_is_leader` metric tracks the operator's current leadership status: `1` indicates the instance is the leader, `0` indicates it is not.
- The `rdi_operator_pipeline_phase` metric reports the current pipeline phase, which is one of `Active`, `Inactive`, `Resetting`, `Pending`, or `Error`.
- **Improved configuration validation**: Stricter validation of `config.yaml` and `jobs.yaml` files helps catch configuration errors earlier in the deployment process.
- **Custom K3s installation options**: The installer now supports passing custom arguments to K3s installation for more flexible on-premises deployments.
- Example: `sudo INSTALL_K3S_EXEC='--write-kubeconfig-mode=644' ./install.sh`
- **Workload Identity authentication**: Added support for Google Cloud Workload Identity authentication to Google Cloud Storage (GCS), eliminating the need for service account JSON files when using GCS-based leader election.
- **New processor performance metrics**: Transformation and write times are now exposed separately:
  - `{namespace}_processor_process_time_ms_total` - Total time spent in the processor (transform + write)
  - `{namespace}_processor_transform_time_ms_total` - Time spent transforming data
  - `{namespace}_processor_write_time_ms_total` - Time spent writing data to Redis
- **All processor metrics exposed**: Processor metrics that were previously available only through the RDI CLI status and the API statistics endpoint are now exported as well.
- **Enhanced statistics endpoint**: The statistics endpoint now includes new metrics for transform and process time.
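If these counters are scraped by Prometheus, a query along the following lines (with the `{namespace}` prefix replaced by your deployment's actual metric namespace, here assumed to be `rdi`) would estimate the share of processing time spent on transformation:

```
rate(rdi_processor_transform_time_ms_total[5m])
  / rate(rdi_processor_process_time_ms_total[5m])
```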

### Bug Fixes and Stability Improvements

- **Fixed task reconciliation errors**: Resolved "The ID argument cannot be a complete ID because xadd-id-uniqueness-mode is strict" errors during task reconciliation when using an RDI Redis database with strict XADD id uniqueness mode.
- **Fixed Debezium unavailable values**: Addressed issues where `__debezium_unavailable_value` was appearing in Redis data.
- **Improved operator stability**: Disabled operator webhooks by default to simplify deployments and reduce potential issues.

## Limitations

RDI can write data to a Redis Active-Active database. However, it doesn't support writing data to two or more Active-Active replicas. Writing data from RDI to several Active-Active replicas could easily harm data integrity because RDI is not synchronized with the source database commits.