# Introduction to Redis
* Redis is an in-memory data structure store, used as a database, cache and message broker. It supports data structures such as `strings, hashes, lists, sets, sorted sets` with range queries, `bitmaps, hyperloglogs, geospatial indexes` with radius queries and `streams`. Redis has built-in `replication, Lua scripting, LRU eviction, transactions` and different levels of `on-disk persistence`, and provides high availability via `Redis Sentinel` and automatic partitioning with `Redis Cluster`.

* In order to achieve its outstanding performance, Redis works with an in-memory dataset. Depending on your use case, you can persist it either by `dumping the dataset to disk` every once in a while, or by `appending each command to a log`. Persistence can be optionally disabled, if you just need a feature-rich, networked, in-memory cache.

* Redis also supports trivial-to-setup `master-slave asynchronous replication`, with very fast non-blocking first synchronization, auto-reconnection with partial resynchronization on net split.

* Other features include:
    * `Transactions`
    * `Pub/Sub`
    * `Lua scripting`
    * `Keys with a limited time-to-live`
    * `LRU eviction of keys`
    * `Automatic failover`
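* Links for all the highlighted terms can be found here: https://redis.io/topics/introduction
* Optional reading for understanding how Redis is implemented (required reading once you have plenty of hands-on practice):
    * Redis server communication protocol: https://redis.io/topics/protocol
    * How Redis strings are implemented: https://redis.io/topics/internals-sds
    * How the Redis event library is implemented: https://redis.io/topics/internals-rediseventlib
    * How Redis virtual memory is implemented: https://redis.io/topics/internals-vm
    * Redis RDB dump file format: https://github.com/sripathikrishnan/redis-rdb-tools/wiki/Redis-RDB-Dump-File-Format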

****

# Data types
* Command reference links: https://redis.io/commands, https://www.cheatography.com/tasjaevan/cheat-sheets/redis/
* Redis keys are strings:
    * Redis keys are binary safe, this means that you can use any binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string is also a valid key.
    * Very long keys are not a good idea; hashing them (for example with SHA1) is better, especially from the perspective of memory and bandwidth.
    * Very short keys are often not a good idea either: the small saving in space rarely justifies the loss of readability.
    * Try to stick with a schema. For instance "object-type:id" is a good idea, as in "user:1000". Dots or dashes are often used for multi-word fields, as in "comment:1234:reply.to" or "comment:1234:reply-to".
    * The maximum allowed key size is 512 MB.
* `Commands supported by all types: EXISTS, DEL, TYPE, EXPIRE, PERSIST, TTL, PEXPIRE, PTTL`
    * Information about expires is replicated and persisted on disk, so time virtually passes while your Redis server is stopped (Redis saves the date at which a key will expire). In other words, the expiry countdown keeps running even while the server is shut down.
* Redis Strings:
    * A value can't be bigger than 512 MB.
    * `Supported commands: SET, GET, INCR, INCRBY, DECR, DECRBY, GETSET, MSET, MGET`
* Redis Lists:
    * Redis lists are implemented via Linked Lists.
        * For a database system it is crucial to be able to add elements to a very long list in a very fast way.
        * Lists can be kept at a constant length in constant time, and accessing small ranges towards the head or the tail of the list is a constant-time operation.
    * Use case:
        * Reliable queue or Circular list with command `RPOPLPUSH, BRPOPLPUSH`
        * Remember the latest updates posted by users into a social network.
        * Communication between processes, using a consumer-producer pattern where the producer pushes items into a list, and a consumer (usually a worker) consumes those items and executes actions.
    * `Supported commands: LPUSH, RPUSH, LRANGE, LPOP, RPOP, LTRIM, BRPOP, BLPOP, RPOPLPUSH, BRPOPLPUSH, LREM, LLEN`
        * `LRANGE` uses a closed interval: both the start and stop indexes are inclusive.
        * `BRPOP, BLPOP`:
            * Clients are served in an ordered way: the first client that blocked waiting for a list, is served first when an element is pushed by some other client, and so forth.
            * The return value is different compared to `RPOP`: it is a two-element array since it also includes the name of the key, because `BRPOP` and `BLPOP` are able to block waiting for elements from multiple lists.
            * If the timeout is reached, NULL is returned.
        * <b>Automatic creation and removal of keys for Lists, Sets, Sorted Sets and Hashes.</b>
* Redis Hashes:
    * Redis hashes look exactly how one might expect a "hash" to look, with field-value pairs.
    * `Supported commands: HMSET, HGET, HMGET, HINCRBY`
        * Full list: https://redis.io/commands#hash
* Redis Sets:
    * Redis Sets are unordered collections of strings that support operations such as the intersection, union or difference between multiple sets.
    * `Supported commands: SADD, SINTER, SPOP, SUNIONSTORE, SCARD, SRANDMEMBER`
* Redis Sorted sets:
    * Sort rules:
        * If A and B are two elements with a different score, then A > B if A.score is > B.score.
        * If A and B have exactly the same score, then A > B if the A string is lexicographically greater than the B string. A and B strings can't be equal since sorted sets only have unique elements.
    * `Supported commands: ZADD, ZRANGE, ZREVRANGE, ZRANGEBYSCORE, ZREMRANGEBYSCORE, ZLEXCOUNT, ZRANGEBYLEX`
* Bitmaps:
    * Use case:
        * Real time analytics of all kinds.
        * Storing space efficient but high performance boolean information associated with object IDs.
    * `Supported commands: SETBIT, GETBIT, BITOP, BITCOUNT, BITPOS`
* HyperLogLogs:
    * A HyperLogLog is a probabilistic data structure used in order to count unique things.
    * `Supported commands: PFADD, PFCOUNT`
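
The sorted set ordering rules listed above (score first, then the member string as a tie-breaker) can be sketched in a few lines of Python. This is only an illustration of the comparison rule, not how Redis stores sorted sets; the function name `sorted_set_order` and the dict-based input are assumptions made for this example.

```python
def sorted_set_order(scores):
    """Return (member, score) pairs in the order a sorted set would keep them.

    `scores` maps each unique member string to its numeric score.
    """
    # Primary key: the score. Tie-breaker: byte-wise comparison of the members,
    # which is how "lexicographically greater" is interpreted in this sketch.
    return sorted(scores.items(), key=lambda item: (item[1], item[0].encode("utf-8")))


if __name__ == "__main__":
    ranking = {"bob": 10.0, "alice": 10.0, "carol": 7.5}
    for member, score in sorted_set_order(ranking):
        print(member, score)
```

Because members are unique, the tie-breaker never has to deal with two equal strings, which matches the rule that A and B can't be equal.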

****

### Redis Streams: An implementation of a message queue
* <b>The stream key is not removed even when there is nothing left in it.</b>
* The Stream is a new data type introduced with Redis 5.0, which models a log data structure in a more abstract way. The essence of the log is still intact, however: Redis streams are primarily an <b>append only</b> data structure.
* A stream entry is not just a string, but is instead composed of one or multiple field-value pairs. This way, each entry of a stream is already structured, like an append only file written in CSV format where multiple separated fields are present in each line.
* Entry ID: `<millisecondsTime>-<sequenceNumber>`. IDs are monotonically increasing: every new entry added has a higher ID than all the past entries.
    * The milliseconds time part is the local time of the Redis node generating the stream ID. However, if the current milliseconds time happens to be smaller than the previous entry time, the previous entry time is used instead, so if a clock jumps backward the monotonically incrementing ID property still holds.
    * The sequence number is used for entries created in the same millisecond.
* Use case:
    * One obvious way is to mimic what we normally do with the Unix command tail -f, that is, we may start to listen in order to get the new messages that are appended to the stream. Note that unlike the blocking list operations of Redis, where a given element reaches a single client blocked in a pop-style operation like BLPOP, with streams we want multiple consumers to see the new messages appended to the Stream, just as many tail -f processes can see what is added to a log.
    * Another natural query mode is to get messages by ranges of time, or alternatively to iterate the messages using a cursor to incrementally check all the history.
    * As a stream of messages that can be partitioned to multiple consumers that are processing such messages, so that groups of consumers can only see a subset of the messages arriving in a single stream. In this way, it is possible to scale the message processing across different consumers, without single consumers having to process all the messages: each consumer will just get different messages to process.
* Consumer groups: 
    * Each message is served to a different consumer so that it is not possible that the same message is delivered to multiple consumers.
    * Consumers are identified, within a consumer group, by a name, which is a case-sensitive string that the clients implementing consumers must choose. This means that even after a disconnect, the stream consumer group retains all the state, since the client will claim again to be the same consumer. However, this also means that it is up to the client to provide a unique identifier.
    * Each consumer group has the concept of the first ID never consumed so that, when a consumer asks for new messages, it can provide just messages that were never delivered previously.
    * Consuming a message, however, requires an explicit acknowledgment using a specific command, to say: this message was correctly processed, so it can be evicted from the consumer group.
    * A consumer group tracks all the messages that are currently pending, that is, messages that were delivered to some consumer of the consumer group, but are yet to be acknowledged as processed. Thanks to this feature, when accessing the history of messages of a stream, each consumer will only see messages that were delivered to it.
* Special IDs:
    * `>`: messages never delivered to other consumers so far
    * `$`: XREAD should use as last ID the maximum ID already stored in the stream, so that we will receive only new messages, starting from the time we started listening. This is similar to the tail -f Unix command in some way.
    * `0`: with `STREAMS mystream 0` we ask for all the messages in the stream mystream having an ID greater than 0-0.
* `Supported commands: XADD, XLEN, XRANGE, XREVRANGE, XREAD, XGROUP, XREADGROUP, XACK, XPENDING, XINFO, XTRIM, XDEL`
    * `XADD mystream * sensor-id 1234 temperature 19.8`
    * `XREAD`:
        * A stream can have multiple clients (consumers) waiting for data. Every new item, by default, will be delivered to every consumer that is waiting for data in a given stream. This behavior is different than blocking lists, where each consumer will get a different element.
        * While in Pub/Sub messages are fire and forget and are never stored anyway, and while when using blocking lists, when a message is received by the client it is popped (effectively removed) from the list, streams work in a fundamentally different way. All the messages are appended in the stream indefinitely (unless the user explicitly asks to delete entries): different consumers will know what is a new message from its point of view by remembering the ID of the last message received.
        * Streams Consumer Groups provide a level of control that Pub/Sub or blocking lists cannot achieve, with different groups for the same stream, explicit acknowledgment of processed items, the ability to inspect the pending items, claiming of unprocessed messages, and coherent history visibility for each single client, which is only able to see its private past history of messages.
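
The entry ID rules described earlier in this section (milliseconds time plus sequence number, with protection against a clock that jumps backward) can be sketched as follows. This is a toy model under stated assumptions, not the Redis implementation: the class name `StreamIdGenerator`, the helper `parse_id`, and passing the clock value in as `now_ms` are all choices made for the example.

```python
class StreamIdGenerator:
    """Toy generator for `<millisecondsTime>-<sequenceNumber>` entry IDs."""

    def __init__(self):
        self.last_ms = 0
        self.last_seq = 0

    def next_id(self, now_ms):
        if now_ms > self.last_ms:
            # Clock moved forward: start a new millisecond with sequence 0.
            self.last_ms = now_ms
            self.last_seq = 0
        else:
            # Same millisecond, or the clock jumped backward: keep the previous
            # time part and bump the sequence so the ID still increases.
            self.last_seq += 1
        return f"{self.last_ms}-{self.last_seq}"


def parse_id(entry_id):
    """Split an ID into an (ms, seq) tuple so IDs compare numerically."""
    ms, seq = entry_id.split("-")
    return int(ms), int(seq)


if __name__ == "__main__":
    gen = StreamIdGenerator()
    # The third timestamp simulates a clock that jumped backward by 1 ms.
    for now_ms in (1000, 1000, 999, 1001):
        print(gen.next_id(now_ms))
```

Comparing IDs as `(ms, seq)` tuples rather than as strings matters, since a plain string comparison would order `"999-0"` after `"1000-0"`.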

****

# Features
* <b>pipelining</b>: A Request/Response server can be implemented so that it is able to process new requests even if the client didn't already read the old responses. This way it is possible to send multiple commands to the server without waiting for the replies at all, and finally read the replies in a single step. This is called pipelining.
* <b>Pub/Sub</b>: SUBSCRIBE, UNSUBSCRIBE and PUBLISH implement the Publish/Subscribe messaging paradigm where (citing Wikipedia) senders (publishers) are not programmed to send their messages to specific receivers (subscribers). Rather, published messages are characterized into channels, without knowledge of what (if any) subscribers there may be. Subscribers express interest in one or more channels, and only receive messages that are of interest, without knowledge of what (if any) publishers there are. This decoupling of publishers and subscribers can allow for greater scalability and a more dynamic network topology. Messages sent by other clients to these channels will be pushed by Redis to all the subscribed clients.
    * `Supported commands: PSUBSCRIBE, PUBLISH, PUBSUB, PUNSUBSCRIBE, SUBSCRIBE, UNSUBSCRIBE`
    * Please note that redis-cli will not accept any commands once in subscribed mode and can only quit the mode with Ctrl-C.
* <b>Redis Lua scripting</b>: EVAL and EVALSHA are used to evaluate scripts using the Lua interpreter built into Redis starting from version 2.6.0. Similar to stored functions/procedures in a database, scripts do not need to be recompiled every time, as Redis uses an internal caching mechanism.
    * All keys that a script accesses should be passed in as arguments rather than hardcoded inside the script, so that Redis Cluster can forward your request to the appropriate cluster node.
    * Redis uses the same Lua interpreter to run all the commands. Also Redis guarantees that a script is executed in an atomic way: no other script or Redis command will be executed while a script is being executed. 
    * If a Redis command fails, `redis.call()` raises a Lua error, while `redis.pcall()` traps the error and returns it to the script as a value, so it can be handled or returned to the client.
    * `whole scripts replication` and `script effects replication`: the two replication modes for scripts.
    * Scripts should never try to access the external system, like the file system or any other system call. A script should only operate on Redis data and passed arguments. Scripts are also subject to a maximum execution time (<b>five seconds by default</b>).
    * `Supported commands: EVAL, EVALSHA, SCRIPT DEBUG, SCRIPT EXISTS, SCRIPT FLUSH, SCRIPT KILL, SCRIPT LOAD`
* <b>Using Redis as an LRU cache</b>
    * LRU (Least Recently Used): evict the keys that were accessed least recently.
    * LFU (Least Frequently Used): evict the keys that are accessed least often.
    * The maxmemory configuration directive is used in order to configure Redis to use a specified amount of memory for the data set. It is possible to set the configuration directive using the redis.conf file, or later using the CONFIG SET command at runtime.
    * Setting maxmemory to zero results in no memory limit. This is the default behavior for 64 bit systems, while 32 bit systems use an implicit memory limit of 3GB.
    * When the specified amount of memory is reached, it is possible to select among different behaviors, called policies. Redis can just return errors for commands that could result in more memory being used, or it can evict some old data in order to return back to the specified limit every time new data is added.
    * Eviction policies: The exact behavior Redis follows when the maxmemory limit is reached is configured using the maxmemory-policy configuration directive.
        * `noeviction`: return errors when the memory limit was reached and the client is trying to execute commands that could result in more memory to be used (most write commands, but DEL and a few more exceptions).
        * `allkeys-lru`: evict keys by trying to remove the less recently used (LRU) keys first, in order to make space for the new data added.
        * `volatile-lru`: evict keys by trying to remove the less recently used (LRU) keys first, but only among keys that have an expire set, in order to make space for the new data added.
        * `allkeys-random`: evict keys randomly in order to make space for the new data added.
        * `volatile-random`: evict keys randomly in order to make space for the new data added, but only evict keys with an expire set.
        * `volatile-ttl`: evict keys with an expire set, and try to evict keys with a shorter time to live (TTL) first, in order to make space for the new data added.
        * `volatile-lfu`: Evict using approximated LFU among the keys with an expire set.
        * `allkeys-lfu`: Evict any key using approximated LFU.
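        * `CONFIG SET maxmemory-samples <count>`: sets the precision of the approximated LRU algorithm.
        * `lfu-log-factor`: controls how quickly the access counter saturates. The larger the value, the more accesses are needed to reach saturation; the counter ranges from 0 to 255.
        * `lfu-decay-time`: sets the decay period in minutes. Each time this period elapses, the access counter of a key is reduced; the keys with the lowest counters are evicted first.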
* <b>Redis transactions</b>
    * MULTI, EXEC, DISCARD and WATCH are the foundation of transactions in Redis. They allow the execution of a group of commands in a single step, with two important guarantees:
        * All the commands in a transaction are serialized and executed sequentially. This guarantees that the commands are executed as a single isolated operation.
        * Either all of the commands or none are processed, so a Redis transaction is also atomic. 
    * Errors inside a transaction:
        * A command may fail to be queued, so there may be an error before EXEC is called.
            * Starting with Redis 2.6.5, the server will remember that there was an error during the accumulation of commands, and will refuse to execute the transaction, returning an error during EXEC and discarding the transaction automatically.
            * Before Redis 2.6.5 the behavior was to execute the transaction with just the subset of commands queued successfully in case the client called EXEC regardless of previous errors. 
        * A command may fail after EXEC is called
            * Errors happening after EXEC instead are not handled in a special way: all the other commands will be executed even if some command fails during the transaction.
            * It's important to note that <b>even when a command fails, all the other commands in the queue are processed – Redis will not stop the processing of commands.</b>
    * `Supported commands: DISCARD, EXEC, MULTI, UNWATCH, WATCH`
* <b>Partitioning</b>
    * It allows for much larger databases, using the sum of the memory of many computers. Without partitioning you are limited to the amount of memory a single computer can support.
    * It allows scaling the computational power to multiple cores and multiple computers, and the network bandwidth to multiple computers and network adapters.
* <b>Distributed locks with Redis</b>: http://localhost:8888/notebooks/java-concept/architecture.ipynb#%E5%88%86%E5%B8%83%E5%BC%8F%E9%94%81%E7%AE%97%E6%B3%95:
* Redis Keyspace Notifications: monitor the events of operations performed on keys; for details see: https://redis.io/topics/notifications
* <b>Secondary indexing with Redis</b>: building secondary indexes with Redis; for a practical guide see: https://redis.io/topics/indexes
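
The two kinds of transaction errors described in the Redis transactions notes above (a command that fails to be queued versus a command that fails after EXEC) can be modelled with a small in-memory sketch. Everything here is hypothetical and for illustration only: `TinyTransaction`, its two-command `ARITY` table and the dict used as a keyspace are assumptions, and no real Redis client or server is involved.

```python
class TinyTransaction:
    """Toy model of MULTI/EXEC error handling (Redis >= 2.6.5 behaviour)."""

    ARITY = {"SET": 2, "INCR": 1}  # only two commands, for illustration

    def __init__(self, store):
        self.store = store      # plain dict standing in for the keyspace
        self.queue = []
        self.dirty = False      # set when a command could not be queued

    def queue_command(self, name, *args):
        if self.ARITY.get(name) != len(args):
            # Unknown command or wrong number of arguments: error before EXEC.
            self.dirty = True
            return "ERR"
        self.queue.append((name, args))
        return "QUEUED"

    def execute(self):
        queued, self.queue = self.queue, []
        if self.dirty:
            # Queueing failed earlier, so the whole transaction is discarded.
            raise RuntimeError("EXECABORT transaction discarded")
        results = []
        for name, args in queued:
            try:
                results.append(self._run(name, args))
            except ValueError as error:
                # A failure after EXEC does not stop the remaining commands.
                results.append(error)
        return results

    def _run(self, name, args):
        if name == "SET":
            key, value = args
            self.store[key] = value
            return "OK"
        (key,) = args  # INCR
        value = int(self.store.get(key, "0")) + 1  # ValueError if not an integer
        self.store[key] = str(value)
        return value


if __name__ == "__main__":
    store = {}
    tx = TinyTransaction(store)
    tx.queue_command("SET", "a", "abc")
    tx.queue_command("INCR", "a")      # fails at execution time
    tx.queue_command("SET", "b", "1")  # still runs
    print(tx.execute(), store)
```

In this sketch the failed `INCR` leaves its error in the result list while the following `SET` is still applied, which mirrors the point that Redis does not stop processing, and does not roll back, when a queued command fails.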

****

# Distributed Feature
* <b>Master-Slave Replication</b>: https://redis.io/topics/replication
    * How Redis replication works
        * Every Redis master has a replication ID: it is a large pseudo random string that marks a given history of the dataset. Each master also keeps an offset that increments for every byte of replication stream that is produced to be sent to slaves, in order to update the state of the slaves with the new changes modifying the dataset. The replication offset is incremented even if no slave is actually connected.
        * When slaves connect to masters, they use the `PSYNC` command in order to send their old master replication ID and the offsets they processed so far. This way the master can send just the incremental part needed. However, if there is not enough backlog in the master buffers, or if the slave is referring to a history (replication ID) which is no longer known, then a full resynchronization happens: in this case the slave will get a full copy of the dataset, from scratch.
        * This is how a full synchronization works in more detail: The master starts a background saving process in order to produce an RDB file. At the same time it starts to buffer all new write commands received from the clients. When the background saving is complete, the master transfers the database file to the slave, which saves it on disk, and then loads it into memory. The master will then send all buffered commands to the slave. This is done as a stream of commands and is in the same format as the Redis protocol itself.
    * Read-only slave
        * Since Redis 2.6, slaves support a read-only mode that is enabled by default.
        * Writable slaves before version 4.0 were incapable of expiring keys with a time to live set.
        * Computing slow Set or Sorted set operations and storing them into local keys is a use case for writable slaves that was observed multiple times.
        * Since Redis 4.0 slave writes are only local, and are not propagated to sub-slaves attached to the instance. Sub-slaves instead will always receive the replication stream identical to the one sent by the top-level master to the intermediate slaves.
    * Allow writes only with N attached replicas
        * Redis slaves ping the master every second, acknowledging the amount of replication stream processed.
        * A Redis master remembers the last time it received a ping from each slave.
        * The user can configure a minimum number of slaves that have a lag not greater than a maximum number of seconds (the `min-slaves-to-write` and `min-slaves-max-lag` directives).
        * If there are at least N slaves, with a lag less than M seconds, then the write will be accepted.
    * How Redis replication deals with expires on keys
        * Slaves don't expire keys, instead they wait for masters to expire the keys. When a master expires a key (or evicts it because of LRU), it synthesizes a DEL command which is transmitted to all the slaves.
        * However, because of master-driven expire, sometimes slaves may still have in memory keys that are already logically expired, since the master was not able to provide the DEL command in time. In order to deal with that, the slave uses its logical clock to report that a key does not exist, but only for read operations that don't violate the consistency of the data set (as new commands from the master will arrive). In this way slaves avoid reporting logically expired keys as still existing. In practical terms, an HTML fragments cache that uses slaves to scale will avoid returning items that are already older than the desired time to live.
        * During Lua script execution no key expirations are performed. As a Lua script runs, conceptually the time in the master is frozen, so that a given key will either exist or not for all the time the script runs. This prevents keys from expiring in the middle of a script, and is needed in order to send the same script to the slave in a way that is guaranteed to have the same effects in the data set.
* <b>Redis Sentinel</b>: https://redis.io/topics/sentinel, https://redis.io/topics/sentinel-clients
    * Sentinels by default run listening for connections to TCP port 26379
    * You need at least three Sentinel instances for a robust deployment.
    * The three Sentinel instances should be placed into computers or virtual machines that are believed to fail in an independent way. So for example different physical servers or Virtual Machines executed on different availability zones.
    * Redis Sentinel provides monitoring, notification and automatic failover, and acts as a configuration provider for clients.
        * Monitoring. Sentinel constantly checks if your master and slave instances are working as expected.
        * Notification. Sentinel can notify the system administrator, or other computer programs, via an API, that something is wrong with one of the monitored Redis instances.
        * Automatic failover. If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other additional slaves are reconfigured to use the new master, and the applications using the Redis server are informed about the new address to use when connecting.
        * Configuration provider. Sentinel acts as a source of authority for clients service discovery: clients connect to Sentinels in order to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address.
* <b>Redis cluster</b>: 
    * <b>In Redis Cluster nodes are responsible for holding the data, and taking the state of the cluster, including mapping keys to the right nodes. Cluster nodes are also able to auto-discover other nodes, detect non-working nodes, and promote slave nodes to master when needed in order to continue to operate when a failure occurs.</b>
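        * A cluster has multiple masters, each of which can have slaves, so the master-slave model is part of the cluster. Failure detection and slave promotion are handled by the cluster nodes themselves rather than by separate Sentinel instances.
        * Simply put, Redis Cluster mode is roughly Master-Slave Replication plus the failover role of Redis Sentinel, but without anything that proxies or resolves addresses for clients: clients need to cache the mapping between hash slots and nodes themselves.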
    * Cluster topology
        * Every Redis Cluster node requires two open TCP ports: the normal Redis TCP port used to serve clients, for example 6379, plus the port obtained by adding 10000 to the data port, so 16379 in the example.
        * Redis Cluster is a full mesh where every node is connected with every other node using a TCP connection. In a cluster of N nodes, every node has N-1 outgoing TCP connections, and N-1 incoming connections.
        * Every node is connected to every other node in the cluster using the cluster bus. Nodes use a gossip protocol to propagate information about the cluster in order to discover new nodes, to send ping packets to make sure all the other nodes are working properly, and to send cluster messages needed to signal specific conditions. 
    * There are 16384 hash slots in Redis Cluster, and to compute what is the hash slot of a given key, we simply take the CRC16 of the key modulo 16384.
        * Key hash tags: force different keys to be mapped to the same hash slot, and therefore to the same node: https://redis.io/topics/cluster-spec#keys-hash-tags
    * Load balancing on the Redis server side: https://redis.io/topics/cluster-spec#redirection-and-resharding
        * In Redis Cluster nodes don't proxy commands to the right node in charge for a given key, but instead they redirect clients to the right nodes serving a given portion of the key space.
        * The client is in theory free to send requests to all the nodes in the cluster, getting redirected if needed, so the client is not required to hold the state of the cluster. However clients that are able to cache the map between keys and nodes can improve the performance in a sensible way.
            * <b>Normally slave nodes will redirect clients to the authoritative master for the hash slot involved in a given command, however clients can use slaves in order to scale reads using the READONLY command.</b>
    * <b>Redis Cluster availability check</b>: https://redis.io/topics/cluster-tutorial#redis-cluster-master-slave-model
    * <b>Redis Cluster is not able to guarantee strong consistency</b>: https://redis.io/topics/cluster-tutorial#redis-cluster-consistency-guarantees
    * <b>Cluster operation: https://redis.io/topics/cluster-tutorial; cluster theory: https://redis.io/topics/cluster-spec</b>
    * Replica migration: addresses the low resilience of a setup where each master has only a single slave
        * The cluster will try to migrate a replica from the master that has the greatest number of replicas in a given moment.
        * To benefit from replica migration you just have to add a few more replicas to a single master in your cluster; it does not matter which master.
        * There is a configuration parameter that controls the replica migration feature that is called cluster-migration-barrier: you can read more about it in the example redis.conf file provided with Redis Cluster.
    * <b>Redis Cluster uses a concept similar to the Raft algorithm's "term", called an epoch in Redis Cluster, which is used during slave promotion</b>
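
The hash slot rule noted above (CRC16 of the key modulo 16384, with hash tags to keep related keys together) can be sketched in Python. This is a minimal illustration under stated assumptions: it relies on the standard library's `binascii.crc_hqx` with an initial value of 0 producing the same CRC16 variant that the cluster specification describes, and it applies the hash tag rule as described in the linked specification (only a non-empty substring between the first `{` and the following `}` is hashed). The names `crc16`, `key_slot` and `HASH_SLOTS` are made up for the example.

```python
from binascii import crc_hqx

HASH_SLOTS = 16384


def crc16(data):
    # crc_hqx implements the CRC-CCITT polynomial; with an initial value of 0
    # this is assumed to match the CRC16 variant used by Redis Cluster.
    return crc_hqx(data, 0)


def key_slot(key):
    """Map a key (bytes) to a hash slot, honouring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16(key) % HASH_SLOTS


if __name__ == "__main__":
    for key in (b"user:1000", b"{user1000}.following", b"{user1000}.followers"):
        print(key, key_slot(key))
```

A client that caches the slot-to-node map could use a function like this to pick the right node up front instead of relying on redirections.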

****

# Administration
* <b>Redis-cli</b>: https://redis.io/topics/rediscli
* <b>Configuration</b>: https://redis.io/topics/config
* <b>Administration suggestions</b>: https://redis.io/topics/admin
* <b>Redis Security rules</b>: https://redis.io/topics/security
* <b>Redis latency monitoring</b>: https://redis.io/topics/latency-monitor, https://redis.io/topics/latency
* <b>Redis benchmark</b>: https://redis.io/topics/benchmarks
* <b>Redis Persistence</b>:
    * The `RDB persistence` performs point-in-time <b>snapshots of your dataset</b> at specified intervals.
    * The `AOF persistence` logs every write operation received by the server; these operations are played again at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself, in an <b>append-only fashion</b>. Redis is able to rewrite the log in the background when it gets too big. There are three fsync policies:
        * fsync every time a new command is appended to the AOF. Very very slow, very safe.
        * fsync every second. Fast enough (in 2.4 likely to be as fast as snapshotting), and you can lose 1 second of data if there is a disaster.
        * Never fsync, just put your data in the hands of the Operating System. The fastest and least safe method.
    * If you wish, you can disable persistence completely, if you want your data to just exist as long as the server is running.
    * It is possible to combine both AOF and RDB in the same instance. Notice that, in this case, when Redis restarts the AOF file will be used to reconstruct the original dataset since it is guaranteed to be the most complete.
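
The difference between the two persistence styles above can be sketched with a toy in-memory store: a snapshot only captures the dataset at one point in time, while replaying an append-only log reconstructs every write. This is an illustration only and says nothing about the real RDB or AOF file formats; `TinyStore`, `save_snapshot` and `replay` are hypothetical names, and the "log" is a Python list rather than a file.

```python
class TinyStore:
    """Toy key-value store that keeps both a snapshot and an append-only log."""

    def __init__(self):
        self.data = {}
        self.aof = []        # every write operation, in arrival order
        self.snapshot = {}   # last point-in-time copy of the dataset

    def set(self, key, value):
        self.data[key] = value
        self.aof.append(("SET", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.aof.append(("DEL", key))

    def save_snapshot(self):
        # RDB-style: copy the whole dataset as it is right now.
        self.snapshot = dict(self.data)


def replay(aof):
    """AOF-style recovery: rebuild the dataset by re-running the logged writes."""
    data = {}
    for command, key, *rest in aof:
        if command == "SET":
            data[key] = rest[0]
        elif command == "DEL":
            data.pop(key, None)
    return data


if __name__ == "__main__":
    store = TinyStore()
    store.set("a", "1")
    store.save_snapshot()
    store.set("b", "2")   # written after the snapshot
    store.delete("a")
    print("snapshot:", store.snapshot)
    print("replayed log:", replay(store.aof))
```

In this sketch the snapshot misses the writes made after it was taken, whereas the replayed log matches the live dataset, which is the reason the log is preferred on restart when both are enabled.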
    