From 2039c4f3a35778313aa7751606949335adc66bcb Mon Sep 17 00:00:00 2001 From: Justin Castilla Date: Wed, 1 Jun 2022 16:15:21 -0700 Subject: [PATCH 1/3] adds transcripts to ru301 pages w/ video --- .../index-basic-replication.mdx | 24 ++++++++- .../introduction/index-index-introduction.mdx | 15 +++++- .../redis-at-scale/index-redis-at-scale.mdx | 26 +++++++++- .../introduction/index-introduction.mdx | 24 ++++++++- .../introduction/index-introduction.mdx | 13 ++++- .../index-persistence-options-in-redis.mdx | 27 +++++++++- .../clustering-in-redis/index-scalability.mdx | 49 ++++++++++++++++++- .../index-command-line-tool.mdx | 24 ++++++++- .../redis-clients/index-redis-clients.mdx | 19 ++++++- .../index-redis-server-overview.mdx | 19 ++++++- 10 files changed, 230 insertions(+), 10 deletions(-) diff --git a/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx b/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx index 010a318af2..66864ea1f1 100644 --- a/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx +++ b/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx @@ -8,4 +8,26 @@ isEditable: false import useBaseUrl from '@docusaurus/useBaseUrl'; - \ No newline at end of file +
+ +
+
+
+

Replication in Redis follows a simple primary-replica model where the replication happens in one direction - from the primary to one or multiple replicas. Data is only written to the primary instance and replicas are kept in sync so that they’re exact copies of the primaries.

+

To create a replica, you instantiate a Redis server instance with the configuration directive replicaof set to the address and port of the primary instance. Once the replica instance is up and running, the replica will try to sync with the primary. To transfer all of its data as efficiently as possible, the primary instance will produce a compacted version of the data in a snapshot (.rdb) file and send it to the replica.

The replica will then read the snapshot file and load all of its data into memory, which will bring it to the same state the primary instance had at the moment of creating the .rdb file. When the loading stage is done, the primary instance will send the backlog of any write commands run since the snapshot was made. Finally, the primary instance will send the replica a live stream of all subsequent commands.

By default, replication is asynchronous. This means that if you send a write command to Redis (1) you will receive your acknowledged response first (2), and only then will the command be replicated to the replica (3).

If the primary goes down after acknowledging a write but before the write can be replicated, then you might have data loss. To avoid this, the client can use the WAIT command. This command blocks the current client until all of the previous write commands are successfully transferred and acknowledged by at least some specified number of replicas.

For example, if we send the command WAIT 2 0, the client will block (will not return a response to the client) until all of the previous write commands issued on that connection have been written to at least 2 replicas. The second argument (0) will instruct the server to block indefinitely, but we could set it to a number (in milliseconds) so that it times out after a while and returns the number of replicas that successfully acknowledged the commands.

Replicas are read-only. 
This means that you can configure your clients to read from them, but you cannot write data to them. If you need additional read throughput, you can configure your Redis client to read from replicas as well as from your primary node. However, it's often easier just to scale out your cluster. This lets you scale reads and writes without writing any complex client logic. + +Also, you should know about Active-Active, an advanced feature of Redis Enterprise and Redis Cloud. Active-Active replicates entire databases across geographically-distributed clusters. With Active-Active, you can write locally to any replica databases, and those writes will be reflected globally. Something to keep in mind when you're really scaling out! + diff --git a/docs/operate/redis-at-scale/high-availability/introduction/index-index-introduction.mdx b/docs/operate/redis-at-scale/high-availability/introduction/index-index-introduction.mdx index 91d42c883a..37b948106c 100644 --- a/docs/operate/redis-at-scale/high-availability/introduction/index-index-introduction.mdx +++ b/docs/operate/redis-at-scale/high-availability/introduction/index-index-introduction.mdx @@ -8,4 +8,17 @@ isEditable: false import useBaseUrl from '@docusaurus/useBaseUrl'; - \ No newline at end of file +
+ +
+
+
+

High availability is a computing concept describing systems that guarantee a high level of uptime. Such systems are designed to be fault-tolerant and highly dependable, operating continuously without intervention and with no single point of failure.

+ +What does this mean for Redis specifically? Well, it means that if your primary Redis server fails, a backup will kick in, and you, as a user, will see little to no disruption in the service. There are two components needed for this to be possible: replication and automatic failover. + +Replication is the continuous copying of data from a primary database to a backup, or a replica database. The two databases are usually located on different physical servers, so that we can have a functional copy of our data in case we lose the server where our primary database sits. + +But having a backup of our data is not enough for high availability. We also have to have a mechanism that will automatically kick in and redirect all requests towards the replica in the event that the primary fails. This mechanism is called automatic failover. + +In the rest of this section we’ll see how Redis handles replication and which automatic failover solutions it offers. Let’s dig in. diff --git a/docs/operate/redis-at-scale/index-redis-at-scale.mdx b/docs/operate/redis-at-scale/index-redis-at-scale.mdx index c5833c1b5b..066c0cbb07 100644 --- a/docs/operate/redis-at-scale/index-redis-at-scale.mdx +++ b/docs/operate/redis-at-scale/index-redis-at-scale.mdx @@ -11,9 +11,33 @@ import RedisCard from '@site/src/theme/RedisCard'; ## Welcome
- +
+
+
+

The world's data is growing exponentially. That exponential growth means that database systems must scale. This is a course about running Redis, one of the most popular databases, at scale.

+

So, how do you run Redis at scale? There are two general answers to this question, and it's important that we address them right away. That's because the easiest and most common way to run Redis at scale is to let someone else manage your Redis deployment for you.

The convenience of "database-as-a-service" offerings means that you don't have to know much about how your database scales, and that saves a lot of time and potential false starts.

We at Redis Labs offer Redis Cloud, a highly available cloud-based Redis service that provides a lot of features you can't find anywhere else, like Active-Active geo-distribution.

Redis Cloud is also really easy to use and has a free tier so you can get going quickly.
So, that's the first answer. To run Redis these days, you might just use a fully-managed offering like Redis Cloud.
But not everyone can or wants to use a cloud-hosted database.

There are a bunch of reasons for this. For example, maybe you're a large enterprise with your own data centers and dedicated ops teams. Or perhaps you're building a mission-critical application whose SLAs are so rigid that you need to be able to dig deeply into any potential performance issue. This often rules out cloud-based deployments, since the cloud hides away the hardware and networks you're operating in. In this case, you're deploying Redis on your own. And for that, you need to know how Redis scales.

Learning this isn't just useful; it's also genuinely interesting. Sharding, replication, high availability, and disaster recovery are all important concepts that anyone can understand with the right explanation. These concepts aren't rocket science. They're no harder to understand than basic high school math, and knowing about them makes you a better developer.
In this course, we'll look closely at how open source Redis scales. And you'll learn by doing, as we present a lot of the ideas through hands-on labs. 
</div>
+

These ideas will apply whether you're deploying open source Redis on your own or managing a Redis Enterprise cluster - which is, ultimately, what you'll want to reach for if you ever outgrow open source Redis. These are some important topics to consider during your time with this course. But let's first learn how to walk before we run.

We sincerely hope you enjoy what you learn with us about scaling Redis, and as always, it's our pleasure to help.

## Course Overview

This course is broken up into units covering topics around scaling Redis for production deployment. Scaling means more than just performance. We have tried to identify key topics that will help you have a performant, stable, and secure deployment of Redis. This course is divided into the following units:

diff --git a/docs/operate/redis-at-scale/observability/introduction/index-introduction.mdx index 38cfdc0391..a8f4c81cf9 100644 --- a/docs/operate/redis-at-scale/observability/introduction/index-introduction.mdx +++ b/docs/operate/redis-at-scale/observability/introduction/index-introduction.mdx @@ -8,4 +8,26 @@ isEditable: false

import useBaseUrl from '@docusaurus/useBaseUrl';

- \ No newline at end of file
+<div
+ +
+
+
+

The last thing you want to do after successfully deploying and scaling Redis is to be stuck working on the weekend because performance is down or the service is unavailable!

+

If you're running a managed service like Redis Cloud, you won't have to worry about these questions as much. But even then, it's still worthwhile to know about certain key Redis metrics.

Some of the questions you always want to be able to answer include:
- Is Redis up and running right now?
- Where is my Redis capacity at?
- Is Redis accessible at this moment?
- Is Redis performing the way we expect?
- When failures occur… what exactly happened to Redis?

Then of course you must ask...
- How can I find this out ahead of time?

Let's dig into these questions and more as we look into observability with Redis.

diff --git a/docs/operate/redis-at-scale/persistence-and-durability/introduction/index-introduction.mdx index b1e49c95dd..6d1bb92394 100644 --- a/docs/operate/redis-at-scale/persistence-and-durability/introduction/index-introduction.mdx +++ b/docs/operate/redis-at-scale/persistence-and-durability/introduction/index-introduction.mdx @@ -8,4 +8,15 @@ isEditable: false

import useBaseUrl from '@docusaurus/useBaseUrl';

- \ No newline at end of file
+<div
+ +
+
+
+

Hello! Congrats on completing section 1. Section 2 is a bit shorter but contains some important information on persistence and durability.

+ +As I am sure you know, Redis serves all data directly from memory. But Redis is also capable of persisting data to disk. Persistence preserves data in the event of a server restart. + +In the following video and exercise, we'll look at the options for persisting data to disk. We'll show you how to enable persistence, and you'll then do a hands-on exercise setting up snapshots of your Redis instance. + +Good luck, and we'll see you in the next sections. diff --git a/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx b/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx index 7cf22f1b60..4089ac27a4 100644 --- a/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx +++ b/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx @@ -8,4 +8,29 @@ isEditable: false import useBaseUrl from '@docusaurus/useBaseUrl'; - \ No newline at end of file +
+ +
+
+
+

If a Redis server that only stores data in RAM is restarted, all data is lost. To prevent such data loss, there needs to be some mechanism for persisting the data to disk; +Redis provides two of them: snapshotting and an append-only file, or AOF. You can configure your Redis instances to use either of the two, or a combination of both.

+

When a snapshot is created, the entire point-in-time view of the dataset is written to persistent storage in a compact .rdb file. You can set up recurring backups, for example every 1, 12, or 24 hours, and use these backups to easily restore different versions of the data set in case of disasters. You can also use these snapshots to create a clone of the server, or simply leave them in place for a future restart.

Creating a .rdb file requires a lot of disk I/O. If performed in the main Redis process, this would reduce the server’s performance. That’s why this work is done by a forked child process. But even forking can be time-consuming if the dataset is large. This may result in decreased performance or in Redis failing to serve clients for a few milliseconds or even up to a second for very large datasets. Understanding this should help you decide whether this solution makes sense for your requirements.

You can configure the name and location of the .rdb file with the dbfilename and dir configuration directives, either through the redis.conf file, or through the redis-cli as explained in [Section 1 Unit 2](/operate/redis-at-scale/talking-to-redis/configuring-a-redis-server). And of course you can configure how often you want to create a snapshot. Here’s an excerpt from the redis.conf file showing the default values.

As an example, this configuration will make Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys changed in that period. While snapshotting is a great strategy for the use cases explained above, it leaves a huge possibility for data loss. You can configure snapshots to run every few minutes, or after X writes against the database, but if the server crashes you lose all the writes since the last snapshot was taken. In many use cases, that kind of data loss can be acceptable, but in many others it is absolutely not. For all of those other use cases Redis offers the AOF persistence option. 
+

AOF, or append-only file, works by logging every incoming write command to disk as it happens. These commands can then be replayed at server startup, to reconstruct the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. The AOF approach provides greater durability than snapshotting, and allows you to configure how often file syncs happen.

Depending on your durability requirements (or how much data you can afford to lose), you can choose which fsync policy is the best for your use case:

- fsync every write: The safest policy. The write is acknowledged to the client only after it has been written to the AOF file and flushed to disk. Since in this approach we are writing to disk synchronously, we can expect a much higher latency than usual.
- fsync every second: The default policy. Fsync is performed asynchronously, in a background thread, so write performance is still high. Choose this option if you need high performance and can afford to lose up to one second's worth of writes.
- no fsync: In this case Redis will log the command to the file descriptor, but will not force the OS to flush the data to disk. If the OS crashes we can lose a few seconds of data (normally Linux will flush data every 30 seconds with this configuration, but the exact timing is up to the kernel's tuning).

The relevant configuration directives for AOF are shown on the screen. AOF contains a log of all the operations that modified the database in a format that's easy to understand and parse. When the file gets too big, Redis can automatically rewrite it in the background, compacting it in a way that only the latest state of the data is preserved. 
</div>
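As a sketch, the directives discussed on this page might be combined in redis.conf like this (the values are the illustrative ones from the text, not recommendations):

```conf
# Snapshotting: dump to disk every 60 seconds if at least 1000 keys changed
save 60 1000
dbfilename dump.rdb
dir ./

# Append-only file, with the default fsync-every-second policy
appendonly yes
appendfsync everysec   # other values: always (every write), no (let the OS decide)
```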
+ diff --git a/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx b/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx index 01198f8f20..6f2fa1ff25 100644 --- a/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx +++ b/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx @@ -9,4 +9,51 @@ isEditable: false import useBaseUrl from '@docusaurus/useBaseUrl'; - \ No newline at end of file +
+ +
+
+
+

Before we jump into the details, let's first address the elephant in the room: DBaaS offerings, or "database-as-a-service" in the cloud. No doubt, it's useful to know how Redis scales and how you might deploy it. But deploying and maintaining a Redis cluster is a fair amount of work. So if you don't want to deploy and manage Redis yourself, then consider signing up for Redis Cloud, our managed service, and let us do the scaling for you. Of course, that route is not for everyone. And as I said, there's a lot to learn here, so let's dive in.

+

We'll start with scalability. Here's one definition:
> “Scalability is the property of a system to handle a growing amount of work by adding resources to the system.”
 [Wikipedia](https://en.wikipedia.org/wiki/Scalability)

The two most common scaling strategies are vertical scaling and horizontal scaling. Vertical scaling, also called “scaling up”, means adding more resources like CPU or memory to your server. Horizontal scaling, or “scaling out”, implies adding more servers to your pool of resources. It's the difference between just getting a bigger server and deploying a whole fleet of servers.

Let's take an example. Suppose you have a server with 128 GB of RAM, but you know that your database will need to store 300 GB of data. In this case, you’ll have two choices: you can either add more RAM to your server so it can fit the 300GB dataset, or you can add two more servers and split the 300GB of data between the three of them. Hitting your server’s RAM limit is one reason you might want to scale up, or out, but reaching the performance limit in terms of throughput, or operations per second, is also an indicator that scaling is necessary.

Since Redis is mostly single-threaded, it cannot make use of the multiple cores of your server’s CPU for command processing. But if we split the data between two Redis servers, our system can process requests in parallel, increasing the throughput to almost 200% of a single server's. In fact, performance will scale close to linearly by adding more Redis servers to the system. This database architectural pattern of splitting data between multiple servers for the purpose of scaling is called sharding. The resulting servers that hold chunks of the data are called shards.

This performance increase sounds amazing, but it doesn’t come without some cost: if we divide and distribute our data across two shards, which are just two Redis server instances, how will we know where to look for each key? 
We need to have a way to consistently map a key to a specific shard. There are multiple ways to do this and different databases adopt different strategies. The one Redis chose is called “Algorithmic sharding” and this is how it works:

In order to find the shard on which a key lives, we compute a numeric hash value out of the key name and modulo divide it by the total number of shards. Because we are using a deterministic hash function, the key “foo” will always end up on the same shard, as long as the number of shards stays the same.

But what happens if we want to increase our shard count even further, a process commonly called “resharding”? Let’s say we add one new shard so that our total number of shards is three. When a client tries to read the key “foo” now, they will run the hash function and modulo divide by the number of shards, as before, but this time the number of shards is different and we’re modulo dividing with three instead of two. Understandably, the result may be different, pointing us to the wrong shard!

Resharding is a common issue with the algorithmic sharding strategy and can be solved by rehashing all the keys in the keyspace and moving them to the shard appropriate to the new shard count. This is not a trivial task, though, and it can require a lot of time and resources, during which the database will not be able to reach its full performance or might even become unavailable.

Redis chose a very simple approach to solving this problem: it introduced a new, logical unit that sits between a key and a shard, called a hash slot.

One shard can contain many hash slots, and a hash slot contains many keys.
The total number of hash slots in a database is always 16384 (16K). This time, the modulo division is not done with the number of shards anymore, but instead with the number of hash slots, which stays the same even when resharding. The end result gives us the position of the hash slot where the key we’re looking for lives. 
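The two schemes described above can be sketched in a few lines of Python. Redis Cluster's hash function is CRC16 (the XMODEM variant) of the key name; the hash-tag refinement (`{...}` in key names) is omitted here for brevity, and the toy `naive_shard` function shows why plain modulo-by-shard-count breaks under resharding:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM (poly 0x1021, init 0x0000), the hash Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    """Map a key to one of the 16384 fixed hash slots."""
    return crc16_xmodem(key) % 16384

def naive_shard(key: bytes, num_shards: int) -> int:
    """Algorithmic sharding without hash slots: breaks when num_shards changes."""
    return crc16_xmodem(key) % num_shards

# The slot for a key never changes, because 16384 is fixed...
print(hash_slot(b"foo"))
# ...but the naive shard assignment may change when we reshard from 2 to 3 shards:
print(naive_shard(b"foo", 2), naive_shard(b"foo", 3))
```

Because `hash_slot` always divides by the fixed 16384, a key's slot never moves; only the slot-to-shard assignment changes during resharding.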
And when we do need to reshard, we simply move hash slots from one shard to another, distributing the data as required across the different Redis instances.

Now that we know what sharding is and how it works in Redis, we can finally introduce Redis Cluster. Redis Cluster provides a way to run a Redis installation where data is automatically split across multiple Redis servers, or shards. Redis Cluster also provides high availability. So, if you're deploying Redis Cluster, you don't need (or use) Redis Sentinel.

Redis Cluster can detect when a primary shard fails and promote a replica to a primary without any manual intervention from the outside. How does it do it? How does it know that a primary shard has failed, and how does it promote its replica to be the new primary shard? We need to have replication enabled. Say we have one replica for every primary shard. If all our data is divided between three Redis servers, we would need a six-member cluster, with three primary shards and three replicas.

All six shards are connected to each other over TCP and constantly PING each other and exchange messages using a binary protocol. These messages contain information about which shards have responded with a PONG, so are considered alive, and which haven’t.

When enough shards report that a certain primary shard is not responding to them, they can agree to trigger a failover and promote the shard’s replica to become the new primary. How many shards need to agree that a shard is offline before a failover is triggered? Well, that’s configurable and you can set it up when you create a cluster, but there are some very important guidelines that you need to follow.

If you have an even number of shards in the cluster, say six, and there’s a network partition that divides the cluster in two, you'll then have two groups of three shards. 
The group on the left side will not be able to talk to the shards from the group on the right side, so the cluster will think that they are offline and it will trigger a failover of any primary shards, resulting in a left side with all primary shards. On the right side, the three shards will see the shards on the left as offline, and will trigger a failover on any primary shard that was on the left side, resulting in a right side of all primary shards. Both sides, thinking they have all the primaries, will continue to receive client requests that modify data, and that is a problem, because maybe client A sets the key “foo” to “bar” on the left side, but a client B sets the same key’s value to “baz” on the right side.

When the network partition is removed and the shards try to rejoin, we will have a conflict, because we have two shards holding different data, each claiming to be the primary, and we wouldn’t know which data is valid.

This is called a split-brain situation, and it is a very common issue in the world of distributed systems. A popular solution is to always keep an odd number of shards in your cluster, so that when you get a network split, the left and right groups will do a count and see if they are in the bigger or the smaller group (also called majority or minority). If they are in the minority, they will not try to trigger a failover and will not accept any client write requests. 
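A toy sketch of the majority rule just described (this illustrates the reasoning only, not Redis Cluster's actual implementation):

```python
def can_act(group_size: int, total_shards: int) -> bool:
    """A partition side may trigger failovers and keep accepting writes
    only if it holds a strict majority of the cluster's shards."""
    return group_size > total_shards // 2

# An even cluster of 6 split 3/3: neither side holds a majority, so with
# this rule neither fails over; without it, both sides would.
print(can_act(3, 6))
# An odd cluster of 7 can only split unevenly, e.g. 4/3, so exactly one
# side keeps working:
print(can_act(4, 7), can_act(3, 7))
```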
+ + + diff --git a/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx b/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx index b7833ac7cf..3c8b16b70e 100644 --- a/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx +++ b/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx @@ -8,4 +8,26 @@ isEditable: false import useBaseUrl from '@docusaurus/useBaseUrl'; - \ No newline at end of file +
+ +
+
+
+

redis-cli is a command-line tool used to interact with the Redis server. Most package managers include redis-cli as part of the redis package. It can also be compiled from source, and you'll find the source code in the Redis repository on GitHub.

+

There are two ways to use redis-cli:
- an interactive mode where the user types commands and sees the replies;
- a command mode where the command is provided as an argument to redis-cli, executed, and results sent to the standard output.

Let’s use the CLI to connect to a Redis server running at 172.22.0.3 and port 7000. The arguments -h and -p are used to specify the host and port to connect to. They can be omitted if your server is running on the default host "localhost" port 6379.

The redis-cli provides some useful productivity features. For example, you can scroll through your command history by pressing the up and down arrow keys. You can also use the TAB key to autocomplete a command, saving even more keystrokes. Just type the first few letters of a command and keep pressing TAB until the command you want appears on screen.

Once you have the command name you want, the CLI will display syntax hints about the arguments so you don’t have to remember all of them, or open up the Redis command documentation.

These three tips can save you a lot of time and take you a step closer to being a power user.

You can do much more with redis-cli, like sending output to a file, scanning for big keys, getting continuous stats, monitoring commands, and so on. For a much more detailed explanation, refer to the documentation.

diff --git a/docs/operate/redis-at-scale/talking-to-redis/redis-clients/index-redis-clients.mdx index 0cf577d98b..d72a318672 100644 --- a/docs/operate/redis-at-scale/talking-to-redis/redis-clients/index-redis-clients.mdx +++ b/docs/operate/redis-at-scale/talking-to-redis/redis-clients/index-redis-clients.mdx @@ -8,4 +8,21 @@ isEditable: false

import useBaseUrl from '@docusaurus/useBaseUrl';

- \ No newline at end of file
+<div
+ +
+
+
+

Redis has a client-server architecture and uses a request-response model. Applications send requests to the Redis server, which processes them and returns responses for each. The role of a Redis client library is to act as an intermediary between your application and the Redis server.

+

Client libraries perform the following duties:
- Implement the Redis wire protocol - the format used to send requests to and receive responses from the Redis server
- Provide an idiomatic API for using Redis commands from a particular programming language
- Manage the connection to Redis

Redis clients communicate with the Redis server over TCP, using a protocol called [RESP](https://redis.io/docs/reference/protocol-spec/) (REdis Serialization Protocol) designed specifically for Redis.

The RESP protocol is simple and text-based, so it is easily read by humans, as well as machines. A common request/response would look something like this. Note that we're using netcat here to send raw protocol:

This simple, well-documented protocol has resulted in Redis clients for almost every language you can think of. The [redis.io](https://redis.io/docs/clients/) client page lists over 200 client libraries for more than 50 programming languages.

diff --git a/docs/operate/redis-at-scale/talking-to-redis/redis-server-overview/index-redis-server-overview.mdx index b487cb60f5..f141009ab5 100644 --- a/docs/operate/redis-at-scale/talking-to-redis/redis-server-overview/index-redis-server-overview.mdx +++ b/docs/operate/redis-at-scale/talking-to-redis/redis-server-overview/index-redis-server-overview.mdx @@ -8,4 +8,21 @@ isEditable: false

import useBaseUrl from '@docusaurus/useBaseUrl';

- \ No newline at end of file
+
+
+
+
+

As you might already know, Redis is an open source data structure server written in C. You can store multiple data types, like strings, hashes, and streams, and access them by a unique key name.

+

For example, if you have a string value “Hello World” saved under the key name “greeting”, you can access it by running the GET command followed by the key name - greeting. All keys in a Redis database are stored in a flat keyspace. There is no enforced schema or naming policy, and the responsibility for organizing the keyspace is left to the developer.

The speed Redis is famous for is mostly due to the fact that Redis stores and serves data entirely from RAM rather than from disk, as most other databases do. Another contributing factor is its predominantly single-threaded nature: single-threading avoids race conditions and CPU-heavy context switching associated with threads.

Indeed, this means that open source Redis can’t take advantage of the processing power of multiple CPU cores, although CPU is rarely the bottleneck with Redis. You are more likely to bump up against memory or network limitations before hitting any CPU limitations. That said, Redis Enterprise does let you take advantage of all of the cores on a single machine.

Let’s now look at exactly what happens behind the scenes with every Redis request. When a client sends a request to a Redis server, the request is first read from the socket, then parsed and processed, and finally the response is written back to the socket and sent to the user. Reading from and especially writing to a socket are expensive operations, so in Redis version 6.0 multi-threaded I/O was introduced. When this feature is enabled, Redis can delegate the time spent reading and writing to I/O sockets over to other threads, freeing up cycles for storing and retrieving data and boosting overall performance by up to a factor of two for some workloads.

Throughout the rest of the section, you’ll learn how to use the Redis command line interface, how to configure your Redis server, and how to choose and tune your Redis client library. 
</div>
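The greeting example above, written out as a redis-cli session sketch:

```
127.0.0.1:6379> SET greeting "Hello World"
OK
127.0.0.1:6379> GET greeting
"Hello World"
```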
+ + + From daf63f28d887e3ee0f83ca2161996a05c1a755c6 Mon Sep 17 00:00:00 2001 From: Justin Castilla Date: Tue, 7 Jun 2022 16:56:09 -0700 Subject: [PATCH 2/3] formatted transcripts for reading --- .../index-basic-replication.mdx | 14 ++++++-------- .../redis-at-scale/index-redis-at-scale.mdx | 6 ++---- .../introduction/index-introduction.mdx | 2 +- .../introduction/index-introduction.mdx | 2 +- .../index-persistence-options-in-redis.mdx | 12 ++++++------ .../clustering-in-redis/index-scalability.mdx | 17 ++++++++--------- .../index-command-line-tool.mdx | 10 +++++----- 7 files changed, 29 insertions(+), 34 deletions(-) diff --git a/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx b/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx index 66864ea1f1..9d19f9eb83 100644 --- a/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx +++ b/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx @@ -15,19 +15,17 @@ import useBaseUrl from '@docusaurus/useBaseUrl';

Replication in Redis follows a simple primary-replica model where the replication happens in one direction - from the primary to one or multiple replicas. Data is only written to the primary instance and replicas are kept in sync so that they’re exact copies of the primaries.

-To create a replica, you instantiate a Redis server instance with the configuration directive replicaof set to the address and port of the primary instance. once the replica instance is up and running, the replica will try to sync with the primary. To transfer all of its data as efficiently as possible, the primary instance will produce a compacted version of the data in a snapshot ( .rdb) file and send it to the replica. +To create a replica, you instantiate a Redis server instance with the configuration directive replicaof set to the address and port of the primary instance. once the replica instance is up and running, the replica will try to sync with the primary. To transfer all of its data as efficiently as possible, the primary instance will produce a compacted version of the data in a snapshot (.rdb) file and send it to the replica. -The replica will then read the snapshot file and load all of its data into memory, which will bring it to the same state the primary instance had at the moment of creating the rdb file. When the loading stage is done, the primary instance will send the backlog of any write commands run since the snapshot was made. Finally, the primary instance will send the replica a live stream of all subsequent commands. +The replica will then read the snapshot file and load all of its data into memory, which will bring it to the same state the primary instance had at the moment of creating the .rdb file. When the loading stage is done, the primary instance will send the backlog of any write commands run since the snapshot was made. Finally, the primary instance will send the replica a live stream of all subsequent commands. -By default, replication is asynchronous. This means that if you send a write command to Redis (1) you will receive your acknowledged response first (2), and only then will the command be replicated to the replica (3). +By default, replication is asynchronous. 
This means that if you send a write command to Redis you will receive your acknowledged response first, and only then will the command be replicated to the replica. -If the primary goes down after acknowledging a write but before the write can be replicated, then you might have data loss. To avoid this, the client can use the WAIT command. This command blocks the current client until all of the previous write commands are successfully transferred and acknowledged by at least some specified number of replicas. - - -For example, if we send the command WAIT 2 0, the client will block (will not return a response to the client) until all of the previous write commands issued on that connection have been written to at least 2 replicas. The second argument (0) will instruct the server to block indefinitely, but we could set it to a number (in milliseconds) so that it times out after a while and returns the number of replicas that successfully acknowledged the commands. +If the primary goes down after acknowledging a write but before the write can be replicated, then you might have data loss. To avoid this, the client can use the WAIT command. This command blocks the current client until all of the previous write commands are successfully transferred and acknowledged by at least some specified number of replicas. +For example, if we send the command WAIT 2 0, the client will block (will not return a response to the client) until all of the previous write commands issued on that connection have been written to at least 2 replicas. The second argument - 0 - will instruct the server to block indefinitely, but we could set it to a number (in milliseconds) so that it times out after a while and returns the number of replicas that successfully acknowledged the commands. Replicas are read-only. This means that you can configure your clients to read from them, but you cannot write data to them. 
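The WAIT example above, sketched as a hypothetical redis-cli session (it assumes a primary with two reachable replicas; the integer reply is the number of replicas that acknowledged the preceding writes):

```
127.0.0.1:6379> SET greeting "Hello World"
OK
127.0.0.1:6379> WAIT 2 0
(integer) 2
```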
If you need additional read throughput, you can configure your Redis client to read from replicas as well as from your primary node. However, it's often easier just to scale out your cluster. This lets you scale reads and writes without writing any complex client logic. -Also, you should know about Active-Active, an advanced feature of Redis Enterprise and Redis Cloud. Active-Active replicates entire databases across geographically-distributed clusters. With Active-Active, you can write locally to any replica databases, and those writes will be reflected globally. Something to keep in mind when you're really scaling out! +Also, you should know about Active-Active, an advanced feature of Redis Enterprise and Redis Cloud. Active-Active replicates entire databases across geographically-distributed clusters. With Active-Active, you can write locally to any replica databases, and those writes will be reflected globally. Something to keep in mind when you're really scaling out! diff --git a/docs/operate/redis-at-scale/index-redis-at-scale.mdx b/docs/operate/redis-at-scale/index-redis-at-scale.mdx index 066c0cbb07..0c048308e1 100644 --- a/docs/operate/redis-at-scale/index-redis-at-scale.mdx +++ b/docs/operate/redis-at-scale/index-redis-at-scale.mdx @@ -22,11 +22,9 @@ So, how do you run Redis at scale? There are two general answers to this questio The convenience of "database-as-a-service" offerings means that you don't have to know much about how your database scales, and that saves a lot of time and potential false starts. -We at Redis Labs offer Redis Cloud, a highly available cloud-based Redis service that provides a lot of features you can't find anywhere else, like active-active, geo-distribution. +We at Redis offer Redis Cloud, a highly available cloud-based Redis service that provides a lot of features you can't find anywhere else, like active-active, geo-distribution. -Redis Cloud is also really easy to use and has a free tier so you can get going quickly. 
-So, that's the first answer. To run Redis these days, you might just use a fully-managed offering like Redis Cloud. -But not everyone can or wants to use a cloud-hosted database. +Redis Cloud is also really easy to use and has a free tier so you can get going quickly. So, that's the first answer. To run Redis these days, you might just use a fully-managed offering like Redis Cloud. But not everyone can or wants to use a cloud-hosted database. There are a bunch of reasons for this. For example, maybe you're a large enterprise with your own data centers and dedicated ops teams. Or perhaps you're a mission-critical application whose SLAs are so rigid that you need to be able to dig deeply into any potential performance issue. This often rules out cloud-based deployments, since the cloud hides away the hardware and networks you're operating in. In this case, you're deploying Redis on your own. And for that, you need to know how Redis scales. diff --git a/docs/operate/redis-at-scale/observability/introduction/index-introduction.mdx b/docs/operate/redis-at-scale/observability/introduction/index-introduction.mdx index a8f4c81cf9..2a1c237eb2 100644 --- a/docs/operate/redis-at-scale/observability/introduction/index-introduction.mdx +++ b/docs/operate/redis-at-scale/observability/introduction/index-introduction.mdx @@ -25,7 +25,7 @@ Some of the question you always want to be able to answer include: - When failures occur… what exactly happened to Redis? Then of course you must ask... -- How can I find this out ahead of time? +- How can I find this out ahead of time? Let's dig into these questions and more as we look into observability with Redis. 
diff --git a/docs/operate/redis-at-scale/persistence-and-durability/introduction/index-introduction.mdx b/docs/operate/redis-at-scale/persistence-and-durability/introduction/index-introduction.mdx index 6d1bb92394..216f03c7fa 100644 --- a/docs/operate/redis-at-scale/persistence-and-durability/introduction/index-introduction.mdx +++ b/docs/operate/redis-at-scale/persistence-and-durability/introduction/index-introduction.mdx @@ -13,7 +13,7 @@ import useBaseUrl from '@docusaurus/useBaseUrl';

-

Hello! Congrats on completing section 1. Section 2 is a bit shorter but contains some important information on persistence and durability.

+

Hello! Congrats on completing Section 1. Section 2 is a bit shorter but contains some important information on persistence and durability.

As I am sure you know, Redis serves all data directly from memory. But Redis is also capable of persisting data to disk. Persistence preserves data in the event of a server restart. diff --git a/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx b/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx index 4089ac27a4..7921a08bfe 100644 --- a/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx +++ b/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx @@ -14,13 +14,13 @@ import useBaseUrl from '@docusaurus/useBaseUrl';

If a Redis server that only stores data in RAM is restarted, all data is lost. To prevent such data loss, there needs to be some mechanism for persisting the data to disk; -Redis provides two of them: snapshotting and an append-only file, or AOF. You can configure your Redis instances to use either of the two, or a combination of both.

+Redis provides two of them: snapshotting and an append-only file, or AOF. You can configure your Redis instances to use either of the two, or a combination of both.
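As a sketch, both options side by side in redis.conf (the directives are real; the values are illustrative and are explained in the rest of this page):

```
# Snapshotting: dump the dataset to dump.rdb every 60 seconds
# if at least 1000 keys changed in that period
save 60 1000
dbfilename dump.rdb
dir /var/lib/redis

# Append-only file, flushed to disk once per second (the default policy)
appendonly yes
appendfsync everysec
```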

-When a snapshot is created, the entire point-in-time view of the dataset is written to persistent storage in a compact dot rdb file. You can set up recurring backups, for example every 1, 12, or 24 hours and use these backups to easily restore different versions of the data set in case of disasters. You can also use these snapshots to create a clone of the server, or simply leave them in place for a future restart. +When a snapshot is created, the entire point-in-time view of the dataset is written to persistent storage in a compact .rdb file. You can set up recurring backups, for example every 1, 12, or 24 hours and use these backups to easily restore different versions of the data set in case of disasters. You can also use these snapshots to create a clone of the server, or simply leave them in place for a future restart. Creating a .rdb file requires a lot of disk I/O. If performed in the main Redis process, this would reduce the server’s performance. That’s why this work is done by a forked child process. But even forking can be time-consuming if the dataset is large. This may result in decreased performance or in Redis failing to serve clients for a few milliseconds or even up to a second for very large datasets. Understanding this should help you decide whether this solution makes sense for your requirements. -You can configure the name and location of the .rdb file with the dbfilename and dir configuration directives, either through the redis.conf file, or through the redis-cli as explained in [Section 1 Unit 2](http://localhost:3000/operate/redis-at-scale/talking-to-redis/configuring-a-redis-server). And of course you can configure how often you want to create a snapshot. Here’s an excerpt from the redis.conf file showing the default values. 
+You can configure the name and location of the .rdb file with the dbfilename and dir configuration directives, either through the redis.conf file, or through the redis-cli as explained in [Section 1 Unit 2](http://localhost:3000/operate/redis-at-scale/talking-to-redis/configuring-a-redis-server). And of course you can configure how often you want to create a snapshot. Here’s an excerpt from the redis.conf file showing the default values. As an example, this configuration will make Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys changed in that period. While snapshotting is a great strategy for the use cases explained above, it leaves a huge possibility for data loss. You can configure snapshots to run every few minutes, or after X writes against the database, but if the server crashes you lose all the writes since the last snapshot was taken. In many use cases, that kind of data loss can be acceptable, but in many others it is absolutely not. For all of those other use cases Redis offers the AOF persistence option. @@ -28,9 +28,9 @@ AOF, or append-only file works by logging every incoming write command to disk a Depending on your durability requirements (or how much data you can afford to lose), you can choose which fsync policy is the best for your use case: -- fsync every write - The safest policy: The write is acknowledged to the client only after it has been written to the AOF file and flushed to disk. Since in this approach we are writing to disk synchronously, we can expect a much higher latency than usual. -- fsync every second: The default policy. Fsync is performed asynchronously, in a background thread, so write performance is still high. Choose this option if you need high performance and can afford to lose up to one second worth of writes. -- no fsync: In this case Redis will log the command to the file descriptor, but will not force the OS to flush the data to disk. 
If the OS crashes we can lose a few seconds of data (Normally Linux will flush data every 30 seconds with this configuration, but it's up to the kernel’s exact tuning.). +- fsync every write: The safest policy: The write is acknowledged to the client only after it has been written to the AOF file and flushed to disk. Since in this approach we are writing to disk synchronously, we can expect a much higher latency than usual. +- fsync every second: The default policy. Fsync is performed asynchronously, in a background thread, so write performance is still high. Choose this option if you need high performance and can afford to lose up to one second worth of writes. +- no fsync: In this case Redis will log the command to the file descriptor, but will not force the OS to flush the data to disk. If the OS crashes we can lose a few seconds of data (Normally Linux will flush data every 30 seconds with this configuration, but it's up to the kernel’s exact tuning.). The relevant configuration directives for AOF are shown on the screen. AOF contains a log of all the operations that modified the database in a format that’s easy to understand and parse. When the file gets too big, Redis can automatically rewrite it in the background, compacting it in a way that only the latest state of the data is preserved. diff --git a/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx b/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx index 6f2fa1ff25..af7316adb4 100644 --- a/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx +++ b/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx @@ -20,28 +20,27 @@ We'll start with scalability. 
Here's one definition:

> “Scalability is the property of a system to handle a growing amount of work by adding resources to the system.” [Wikipedia](https://en.wikipedia.org/wiki/Scalability)

-The two most common scaling strategies are vertical scaling and horizontal scaling. Vertical scaling, or also called “Scaling Up”, means adding more resources like CPU or memory to your server. Horizontal scaling, or “Scaling out”, implies adding more servers to your pool of resources. It's the difference between just getting a bigger server and deploying a whole fleet of servers. 
+The two most common scaling strategies are vertical scaling and horizontal scaling. Vertical scaling, also called “Scaling Up”, means adding more resources like CPU or memory to your server. Horizontal scaling, or “Scaling out”, implies adding more servers to your pool of resources. It's the difference between just getting a bigger server and deploying a whole fleet of servers. 

Let's take an example. Suppose you have a server with 128 GB of RAM, but you know that your database will need to store 300 GB of data. In this case, you’ll have two choices: you can either add more RAM to your server so it can fit the 300GB dataset, or you can add two more servers and split the 300GB of data between the three of them. Hitting your server’s RAM limit is one reason you might want to scale up, or out, but reaching the performance limit in terms of throughput, or operations per second, is also an indicator that scaling is necessary. 

-Since Redis is mostly single-threaded, Redis cannot make use of the multiple cores of your server’s CPU for command processing. But if we split the data between two Redis servers, our system can process requests in parallel, increasing the throughput by almost 200%. In fact, performance will scale close to linearly by adding more Redis servers to the system. 
This database architectural pattern of splitting data between multiple servers for the purpose of scaling is called sharding. The resulting servers that hold chunks of the data are called shards. +Since Redis is mostly single-threaded, Redis cannot make use of the multiple cores of your server’s CPU for command processing. But if we split the data between two Redis servers, our system can process requests in parallel, increasing the throughput by almost 200%. In fact, performance will scale close to linearly by adding more Redis servers to the system. This database architectural pattern of splitting data between multiple servers for the purpose of scaling is called sharding. The resulting servers that hold chunks of the data are called shards. This performance increase sounds amazing, but it doesn’t come without some cost: if we divide and distribute our data across two shards, which are just two Redis server instances, how will we know where to look for each key? We need to have a way to consistently map a key to a specific shard. There are multiple ways to do this and different databases adopt different strategies. The one Redis chose is called “Algorithmic sharding” and this is how it works: In order to find the shard on which a key lives we compute a numeric hash value out of the key name and modulo divide it by the total number of shards. Because we are using a deterministic hash function the key “foo” will always end up on the same shard, as long as the number of shards stays the same. -But what happens if we want to increase our shard count even further, a process commonly called “resharding”? Let’s say we add one new shard so that our total number of shards is three. When a client tries to read the key “foo” now, they will run the hash function and modulo divide by the number of shards, as before, but this time the number of shards is different and we’re modulo dividing with three instead of two. 
Understandably, the result may be different, pointing us to the wrong shard! +But what happens if we want to increase our shard count even further, a process commonly called resharding? Let’s say we add one new shard so that our total number of shards is three. When a client tries to read the key “foo” now, they will run the hash function and modulo divide by the number of shards, as before, but this time the number of shards is different and we’re modulo dividing with three instead of two. Understandably, the result may be different, pointing us to the wrong shard! Resharding is a common issue with the algorithmic sharding strategy and can be solved by rehashing all the keys in the keyspace and moving them to the shard appropriate to the new shard count. This is not a trivial task, though, and it can require a lot of time and resources, during which the database will not be able to reach its full performance or might even become unavailable. -Redis chose a very simple approach to solving this problem: it introduced a new, logical unit that sits between a key and a shard, called a hash slot. +Redis chose a very simple approach to solving this problem: it introduced a new, logical unit that sits between a key and a shard, called a hash slot. -One shard can contain many hash slots, and a hash slot contains many keys. -The total number of hash slots in a database is always 16384 (16K). This time, the modulo division is not done with the number of shards anymore, but instead with the number of hash slots, that stays the same even when resharding and the end result will give us the position of the hash slot where the key we’re looking for lives. And when we do need to reshard, we simply move hash slots from one shard to another, distributing the data as required across the different redis instances. +One shard can contain many hash slots, and a hash slot contains many keys. The total number of hash slots in a database is always 16384 (16K). 
This time, the modulo division is not done with the number of shards anymore, but instead with the number of hash slots, which stays the same even when resharding. The end result gives us the position of the hash slot where the key we’re looking for lives. And when we do need to reshard, we simply move hash slots from one shard to another, distributing the data as required across the different Redis instances.

-Now that we know what sharding is and how it works in Redis, we can finally introduce Redis Cluster. Redis Cluster provides a way to run a Redis installation where data is automatically split across multiple Redis servers, or shards. Redis Cluster also provides high availability. So, if you're deploying Redis Cluster, you don't need (or use) Redis Sentinel. 
+Now that we know what sharding is and how it works in Redis, we can finally introduce Redis Cluster. Redis Cluster provides a way to run a Redis installation where data is automatically split across multiple Redis servers, or shards. Redis Cluster also provides high availability. So, if you're deploying Redis Cluster, you don't need (or use) Redis Sentinel.

-Redis Cluster can detect when a primary shard fails and promote a replica to a primary without any manual intervention from the outside. How does it do it? How does it know that a primary shard has failed, and how does it promote its replica to be the new primary shard? We need to have replication enabled. Say we have one replica for every primary shard. 
If all our data is divided between three Redis servers, we would need a six-member cluster, with three primary shards and three replicas. 
+Redis Cluster can detect when a primary shard fails and promote a replica to a primary without any manual intervention from the outside. How does it do it? How does it know that a primary shard has failed, and how does it promote its replica to be the new primary shard? We need to have replication enabled. Say we have one replica for every primary shard. If all our data is divided between three Redis servers, we would need a six-member cluster, with three primary shards and three replicas.

All 6 shards are connected to each other over TCP and constantly PING each other and exchange messages using a binary protocol. These messages contain information about which shards have responded with a PONG, so are considered alive, and which haven’t.

@@ -51,7 +50,7 @@ If you have an even number of shards in the cluster, say six, and there’s a ne

When the network partition is removed and the shards try to rejoin, we will have a conflict, because we have two shards - holding different data - each claiming to be the primary, and we wouldn’t know which data is valid.

-This is called a split brain situation, and is a very common issue in the world of distributed systems. A popular solution is to always keep an odd number of shards in your cluster, so that when you get a network split, the left and right group will do a count and see if they are in the bigger or the smaller group (also called majority or minority). If they are in the minority, they will not try to trigger a failover and will not accept any client write requests. 
+This is called a split brain situation, and is a very common issue in the world of distributed systems. A popular solution is to always keep an odd number of shards in your cluster, so that when you get a network split, the left and right group will do a count and see if they are in the bigger or the smaller group (also called majority or minority). If they are in the minority, they will not try to trigger a failover and will not accept any client write requests. Here's the bottom line: to prevent split-brain situations in Redis Cluster, always keep an odd number of primary shards and two replicas per primary shard. 
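The sharding and hash-slot arithmetic described above can be sketched in Python. This is a toy model: real Redis Cluster hashes keys with CRC16 (CRC32 stands in here as the deterministic hash) and assigns slot ranges to shards explicitly rather than by formula.

```python
import binascii

NUM_SLOTS = 16384  # fixed hash-slot count, independent of the shard count

def hash_slot(key: str) -> int:
    # Deterministic hash of the key name, modulo the slot count.
    # (Real Redis Cluster uses CRC16; CRC32 is a stand-in here.)
    return binascii.crc32(key.encode()) % NUM_SLOTS

def slot_to_shard(num_shards: int) -> list:
    # Toy assignment: spread the 16384 slots over the shards
    # in contiguous, roughly equal ranges.
    return [slot * num_shards // NUM_SLOTS for slot in range(NUM_SLOTS)]

slot = hash_slot("foo")          # the slot for "foo" never changes...
with_two = slot_to_shard(2)      # ...only the slot-to-shard mapping does
with_three = slot_to_shard(3)
print(slot == hash_slot("foo"))          # True
print(with_two[slot], with_three[slot])  # shard before and after resharding
```

Resharding then means moving whole hash slots between shards; no key is ever rehashed, because the modulo divisor (16384) never changes.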
diff --git a/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx b/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx index 3c8b16b70e..a2a6b3936e 100644 --- a/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx +++ b/docs/operate/redis-at-scale/talking-to-redis/command-line-tool/index-command-line-tool.mdx @@ -16,18 +16,18 @@ import useBaseUrl from '@docusaurus/useBaseUrl';

Redis-cli is a command line tool used to interact with the Redis server. Most package managers include redis-cli as part of the redis package. It can also be compiled from source, and you'll find the source code in the Redis repository on GitHub.

There are two ways to use redis-cli : -- an interactive mode where the user types commands and sees the replies; -- a command mode where the command is provided as an argument to redis-cli, executed, and results sent to the standard output. +- an interactive mode where the user types commands and sees the replies; +- a command mode where the command is provided as an argument to redis-cli, executed, and results sent to the standard output. -Let’s use the CLI to connect to a Redis server running at 172.22.0.3 and port 7000. The arguments -h and -p are used to specify the host and port to connect to. They can be omitted if your server is running on the default host "localhost" port 6379. +Let’s use the CLI to connect to a Redis server running at 172.22.0.3 and port 7000. The arguments -h and -p are used to specify the host and port to connect to. They can be omitted if your server is running on the default host "localhost" port 6379. -The redis-cli provides some useful productivity features. For example, you can scroll through your command history by pressing the up and down arrow keys. You can also use the TAB key to autocomplete a command, saving even more keystrokes. Just type the first few letters of a command and keep pressing TAB until the command you want appears on screen. +The redis-cli provides some useful productivity features. For example, you can scroll through your command history by pressing the up and down arrow keys. You can also use the TAB key to autocomplete a command, saving even more keystrokes. Just type the first few letters of a command and keep pressing TAB until the command you want appears on screen. Once you have the command name you want, the CLI will display syntax hints about the arguments so you don’t have to remember all of them, or open up the Redis command documentation. These three tips can save you a lot of time and take you a step closer to being a power user. 
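The two modes described above, as a hypothetical shell session against the example server at 172.22.0.3 port 7000:

```
$ redis-cli -h 172.22.0.3 -p 7000
172.22.0.3:7000> PING
PONG
$ redis-cli -h 172.22.0.3 -p 7000 PING
PONG
```

The first command opens the interactive prompt; the second runs PING in command mode and writes the reply to standard output.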
-You can do much more with redis-cli, like sending output to a file, scanning for big keys, get continuous stats, monitor commands and so on. For a much more detailed explanation refer to the documentation +You can do much more with redis-cli, like sending output to a file, scanning for big keys, get continuous stats, monitor commands and so on. For a much more detailed explanation refer to the [documentation](https://redis.io/docs/manual/cli/). From 40e339f11403bc1787a6bafc9b4504fdd85b92cb Mon Sep 17 00:00:00 2001 From: Justin Castilla Date: Wed, 8 Jun 2022 09:28:06 -0700 Subject: [PATCH 3/3] updates typos --- .../basic-replication/index-basic-replication.mdx | 2 +- docs/operate/redis-at-scale/index-redis-at-scale.mdx | 3 +-- .../index-persistence-options-in-redis.mdx | 2 +- .../scalability/clustering-in-redis/index-scalability.mdx | 2 +- .../talking-to-redis/redis-clients/index-redis-clients.mdx | 2 +- .../redis-server-overview/index-redis-server-overview.mdx | 6 +++--- 6 files changed, 8 insertions(+), 9 deletions(-) diff --git a/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx b/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx index 9d19f9eb83..53d09f3e3e 100644 --- a/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx +++ b/docs/operate/redis-at-scale/high-availability/basic-replication/index-basic-replication.mdx @@ -15,7 +15,7 @@ import useBaseUrl from '@docusaurus/useBaseUrl';

Replication in Redis follows a simple primary-replica model where the replication happens in one direction - from the primary to one or multiple replicas. Data is only written to the primary instance and replicas are kept in sync so that they’re exact copies of the primaries.

-To create a replica, you instantiate a Redis server instance with the configuration directive replicaof set to the address and port of the primary instance. once the replica instance is up and running, the replica will try to sync with the primary. To transfer all of its data as efficiently as possible, the primary instance will produce a compacted version of the data in a snapshot (.rdb) file and send it to the replica. +To create a replica, you instantiate a Redis server instance with the configuration directive replicaof set to the address and port of the primary instance. Once the replica instance is up and running, the replica will try to sync with the primary. To transfer all of its data as efficiently as possible, the primary instance will produce a compacted version of the data in a snapshot (.rdb) file and send it to the replica. The replica will then read the snapshot file and load all of its data into memory, which will bring it to the same state the primary instance had at the moment of creating the .rdb file. When the loading stage is done, the primary instance will send the backlog of any write commands run since the snapshot was made. Finally, the primary instance will send the replica a live stream of all subsequent commands. diff --git a/docs/operate/redis-at-scale/index-redis-at-scale.mdx b/docs/operate/redis-at-scale/index-redis-at-scale.mdx index 0c048308e1..fd0d1fb65b 100644 --- a/docs/operate/redis-at-scale/index-redis-at-scale.mdx +++ b/docs/operate/redis-at-scale/index-redis-at-scale.mdx @@ -28,8 +28,7 @@ Redis Cloud is also really easy to use and has a free tier so you can get going There are a bunch of reasons for this. For example, maybe you're a large enterprise with your own data centers and dedicated ops teams. Or perhaps you're a mission-critical application whose SLAs are so rigid that you need to be able to dig deeply into any potential performance issue. 
This often rules out cloud-based deployments, since the cloud hides away the hardware and networks you're operating in. In this case, you're deploying Redis on your own. And for that, you need to know how Redis scales.

-Learning this isn't just useful; it's also genuinely interesting. Sharding, replication, high availability, and disaster recovery are all important concepts that anyone can understand with the right explanation. These concepts aren't rocket science. They're no harder to understand than basic high school math, and knowing about them makes you a better developer.
-In this course, we'll look closely at how open source Redis scales. And you'll learn by doing, as we present a lot of the ideas through hands-on labs.
+Learning this isn't just useful; it's also genuinely interesting. Sharding, replication, high availability, and disaster recovery are all important concepts that anyone can understand with the right explanation. These concepts aren't rocket science. They're no harder to understand than basic high school math, and knowing about them makes you a better developer. In this course, we'll look closely at how open source Redis scales. And you'll learn by doing, as we present a lot of the ideas through hands-on labs.

These ideas will apply whether you're deploying open source Redis on your own or managing a Redis Enterprise cluster - which is, ultimately, what you'll want to reach for if you ever outgrow open source Redis. These are some important topics to consider during your time with this course. But let's first learn how to walk before we run. 
diff --git a/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx b/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx
index 7921a08bfe..49372892e9 100644
--- a/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx
+++ b/docs/operate/redis-at-scale/persistence-and-durability/persistence-options-in-redis/index-persistence-options-in-redis.mdx
@@ -20,7 +20,7 @@ When a snapshot is created, the entire point-in-time view of the dataset is writ
Creating a .rdb file requires a lot of disk I/O. If performed in the main Redis process, this would reduce the server’s performance. That’s why this work is done by a forked child process. But even forking can be time-consuming if the dataset is large. This may result in decreased performance or in Redis failing to serve clients for a few milliseconds or even up to a second for very large datasets. Understanding this should help you decide whether this solution makes sense for your requirements.
-You can configure the name and location of the .rdb file with the dbfilename and dir configuration directives, either through the redis.conf file, or through the redis-cli as explained in [Section 1 Unit 2](http://localhost:3000/operate/redis-at-scale/talking-to-redis/configuring-a-redis-server). And of course you can configure how often you want to create a snapshot. Here’s an excerpt from the redis.conf file showing the default values.
+You can configure the name and location of the .rdb file with the dbfilename and dir configuration directives, either through the redis.conf file, or through the redis-cli as explained in [Section 1 Unit 2](https://developer.redis.com/operate/redis-at-scale/talking-to-redis/configuring-a-redis-server). And of course you can configure how often you want to create a snapshot. 
Here’s an excerpt from the redis.conf file showing the default values. As an example, this configuration will make Redis automatically dump the dataset to disk every 60 seconds if at least 1000 keys changed in that period. While snapshotting is a great strategy for the use cases explained above, it leaves a huge possibility for data loss. You can configure snapshots to run every few minutes, or after X writes against the database, but if the server crashes you lose all the writes since the last snapshot was taken. In many use cases, that kind of data loss can be acceptable, but in many others it is absolutely not. For all of those other use cases Redis offers the AOF persistence option. diff --git a/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx b/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx index af7316adb4..f26979a0de 100644 --- a/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx +++ b/docs/operate/redis-at-scale/scalability/clustering-in-redis/index-scalability.mdx @@ -14,7 +14,7 @@ import useBaseUrl from '@docusaurus/useBaseUrl';
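The redis.conf excerpt referenced in the snapshot discussion above is not reproduced in this transcript. A sketch matching the 60-seconds/1000-keys example given there would be:

```conf
# Create a snapshot if at least 1000 keys changed in the last 60 seconds
save 60 1000

# Name and location of the snapshot file (defaults shown)
dbfilename dump.rdb
dir ./
```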

-

Before we jump into the details, let's first address the elephant in the room: DBaaS offerings, or "database-as-a-service" in the cloud. No doubt, it's useful to know how Redis scales and how you might deploy it. But deploying and maintaining a Redis cluster is a fair amount of work. So if you don't want to deploy and manage Redis yourself, then consider signing up for a Redis Cloud, our managed service, and let us do the scaling for you. Of course, that route is not for everyone. And as I said, there's a lot to learn here, so let's dive in.

+

Before we jump into the details, let's first address the elephant in the room: DBaaS offerings, or "database-as-a-service" in the cloud. No doubt, it's useful to know how Redis scales and how you might deploy it. But deploying and maintaining a Redis cluster is a fair amount of work. So if you don't want to deploy and manage Redis yourself, then consider signing up for Redis Cloud, our managed service, and let us do the scaling for you. Of course, that route is not for everyone. And as I said, there's a lot to learn here, so let's dive in.

We'll start with scalability. Here's one definition: > “Scalability is the property of a system to handle a growing amount of work by adding resources to the system.” diff --git a/docs/operate/redis-at-scale/talking-to-redis/redis-clients/index-redis-clients.mdx b/docs/operate/redis-at-scale/talking-to-redis/redis-clients/index-redis-clients.mdx index d72a318672..2d6d85e99b 100644 --- a/docs/operate/redis-at-scale/talking-to-redis/redis-clients/index-redis-clients.mdx +++ b/docs/operate/redis-at-scale/talking-to-redis/redis-clients/index-redis-clients.mdx @@ -18,8 +18,8 @@ import useBaseUrl from '@docusaurus/useBaseUrl'; Client libraries perform the following duties: - Implement the Redis wire protocol - the format used to send requests to and receive responses from the Redis server - Provide an idiomatic API for using Redis commands from a particular programming language -Managing the connection to Redis +## Managing the connection to Redis Redis clients communicate with the Redis server over TCP, using a protocol called [RESP](https://redis.io/docs/reference/protocol-spec/) (REdis Serialization Protocol) designed specifically for Redis. The RESP protocol is simple and text-based, so it is easily read by humans, as well as machines. A common request/response would look something like this. Note that we're using netcat here to send raw protocol: diff --git a/docs/operate/redis-at-scale/talking-to-redis/redis-server-overview/index-redis-server-overview.mdx b/docs/operate/redis-at-scale/talking-to-redis/redis-server-overview/index-redis-server-overview.mdx index f141009ab5..cbbc736a74 100644 --- a/docs/operate/redis-at-scale/talking-to-redis/redis-server-overview/index-redis-server-overview.mdx +++ b/docs/operate/redis-at-scale/talking-to-redis/redis-server-overview/index-redis-server-overview.mdx @@ -14,15 +14,15 @@ import useBaseUrl from '@docusaurus/useBaseUrl';
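The raw RESP exchange mentioned in the client-library hunk above can be sketched in Python. Note that encode_command is a hypothetical helper for illustration, not part of any client library:

```python
def encode_command(*args: str) -> bytes:
    """Encode a Redis command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]  # array header: number of arguments
    for arg in args:
        data = arg.encode()
        # each argument is a bulk string: $<length>\r\n<bytes>\r\n
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

# What a client actually writes to the socket for: GET greeting
print(encode_command("GET", "greeting"))
# → b'*2\r\n$3\r\nGET\r\n$8\r\ngreeting\r\n'
```

The server's reply to a successful GET is also framed as a bulk string, e.g. `$11\r\nHello World\r\n`, which is why the protocol is easy to read with netcat.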

As you might already know, Redis is an open source data structure server written in C. You can store multiple data types, like strings, hashes, and streams and access them by a unique key name.

-For example, if you have a string value “Hello World” saved under the key name “greeting”, you can access it by running the GET command followed by the key name - greeting. All keys in a Redis database are stored in a flat keyspace. There is no enforced schema or naming policy, and the responsibility for organizing the keyspace is left to the developer. +For example, if you have a string value “Hello World” saved under the key name “greeting”, you can access it by running the GET command followed by the key name - greeting. All keys in a Redis database are stored in a flat keyspace. There is no enforced schema or naming policy, and the responsibility for organizing the keyspace is left to the developer. -The speed Redis is famous for is mostly due to the fact that Redis stores and serves data entirely from RAM memory instead of disk, as most other databases do. Another contributing factor is its predominantly single-threaded nature: single-threading avoids race conditions and CPU-heavy context switching associated with threads. +The speed Redis is famous for is mostly due to the fact that Redis stores and serves data entirely from RAM instead of disk, as most other databases do. Another contributing factor is its predominantly single-threaded nature: single-threading avoids race conditions and CPU-heavy context switching associated with threads. Indeed, this means that open source Redis can’t take advantage of the processing power of multiple CPU cores, although CPU is rarely the bottleneck with Redis. You are more likely to bump up against memory or network limitations before hitting any CPU limitations. That said, Redis Enterprise does let you take advantage of all of the cores on a single machine. Let’s now look at exactly what happens behind the scenes with every Redis request. 
When a client sends a request to a Redis server, the request is first read from the socket, then parsed and processed and finally, the response is written back to the socket and sent to the user. The reading and especially writing to a socket are expensive operations, so in Redis version 6.0 multi-threaded I/O was introduced. When this feature is enabled, Redis can delegate the time spent reading and writing to I/O sockets over to other threads, freeing up cycles for storing and retrieving data and boosting overall performance by up to a factor of two for some workloads. -Throughout the rest of the section, you’ll learn how to use the Redis command line interface, how to configure your Redis server, and how to choose and tune your Redis client library. +Throughout the rest of the section, you’ll learn how to use the Redis command-line interface, how to configure your Redis server, and how to choose and tune your Redis client library.
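The multi-threaded I/O feature described above is disabled by default and must be turned on explicitly. A redis.conf sketch enabling it (the thread count here is illustrative; it should stay below the machine's core count):

```conf
# Enable multi-threaded I/O (Redis 6.0+)
io-threads 4
# By default only writes are offloaded to I/O threads; reads can be too:
io-threads-do-reads yes
```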