2.0 GA (v2.0.0)

@K-Jo K-Jo released this 17 Sep 17:12
· 1424 commits to master since this release
d421035

This is the GA release for RediSearch 2.0. This release includes several improvements in performance and usability over RediSearch 1.0. These improvements necessitate a few backward-breaking changes to the API.

Highlights

For this release, we changed the way in which the search indexes are kept in sync with your data. In RediSearch 1.x, you had to manually add data to your indexes using the FT.ADD command. In RediSearch 2.x, your data is indexed automatically based on a key pattern.

These changes are designed to enhance developer productivity, and to ensure that
your search indexes are always kept in sync with your data. To support this, we've
made a few changes to the API.
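
As a rough illustration of the new flow, the sketch below builds the argument list for an FT.CREATE call that follows hashes by key prefix, plus an ordinary HSET that would then be picked up automatically. The helper name build_ft_create, the index name idx:movies, the movie: prefix, and the schema fields are all made up for this example. Only the prefix condition is covered here, not filters.

```python
from typing import List, Sequence, Tuple


def build_ft_create(index: str, prefixes: Sequence[str],
                    schema: Sequence[Tuple[str, str]]) -> List[str]:
    """Build FT.CREATE arguments for an index that follows hashes by key prefix.

    Hypothetical helper for illustration; it is not part of RediSearch or any client.
    """
    if not prefixes:
        raise ValueError("this sketch expects at least one key prefix")
    args = ["FT.CREATE", index, "ON", "HASH",
            "PREFIX", str(len(prefixes)), *prefixes, "SCHEMA"]
    for field, field_type in schema:
        args += [field, field_type]
    return args


# Index name, prefix, and fields are assumptions chosen for the example.
create_cmd = build_ft_create("idx:movies", ["movie:"],
                             [("title", "TEXT"), ("year", "NUMERIC")])

# In 2.x there is no FT.ADD step: a plain HSET on a matching key is indexed.
hset_cmd = ["HSET", "movie:1", "title", "Heat", "year", "1995"]

print(" ".join(create_cmd))
print(" ".join(hset_cmd))
```

With a client such as redis-py, these lists could be sent with execute_command(*create_cmd), assuming a Redis 6 server with RediSearch 2.x loaded.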

In addition to simplifying indexing, RediSearch 2.0 allows you to scale a single index over multiple Redis shards using the Redis cluster API.

Finally, RediSearch 2.x keeps its indexes outside of the main Redis key space. Improvements to the indexing code have increased query performance 2.4x.

You can read more details in the RediSearch 2.0 announcement blog post, and you can get started by checking out this quick start blog post.

Details

  • When you create an index, you must specify a prefix condition and/or a filter. This determines which hashes RediSearch will index.
  • Several RediSearch commands now map to their Redis equivalents: FT.ADD -> HSET, FT.DEL -> DEL (equivalent to FT.DEL with the DD flag in RediSearch 1.x), FT.GET -> HGETALL, FT.MGET -> HGETALL.
  • RediSearch indexes no longer reside within the key space, and the indexes are no longer saved to the RDB.
  • You can upgrade from RediSearch 1.x to RediSearch 2.x.
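
The command mapping above can be sketched as a small translation function, which may help when auditing 1.x client code. This is only an illustration: translate_legacy is a hypothetical helper, it assumes the 1.x argument order (index, document ID, then score for FT.ADD, with field pairs after the FIELDS keyword), and it simply drops the score and other FT.ADD options.

```python
from typing import List


def translate_legacy(args: List[str]) -> List[List[str]]:
    """Map a RediSearch 1.x command to the plain Redis command(s) it now corresponds to.

    Hypothetical helper; FT.ADD options such as the score are dropped in this sketch.
    """
    name = args[0].upper()
    if name == "FT.ADD":
        # Assumed 1.x shape: FT.ADD <index> <doc_id> <score> [options] FIELDS <f> <v> ...
        upper = [a.upper() for a in args]
        fields_at = upper.index("FIELDS", 4)  # raises ValueError if FIELDS is missing
        return [["HSET", args[2], *args[fields_at + 1:]]]
    if name == "FT.DEL":
        # DEL also removes the hash, like FT.DEL with the DD flag in 1.x.
        return [["DEL", args[2]]]
    if name in ("FT.GET", "FT.MGET"):
        # One HGETALL per requested document ID.
        return [["HGETALL", doc_id] for doc_id in args[2:]]
    raise ValueError(f"no mapping known for {args[0]}")


legacy = ["FT.ADD", "idx", "doc:1", "1.0", "REPLACE", "FIELDS", "title", "hello"]
for cmd in translate_legacy(legacy):
    print(" ".join(cmd))
```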

Noteworthy changes

  • #1246: geodistance function for FT.AGGREGATE APPLY operation.
  • #1394: Expired documents (TTL) will be removed from the index.
  • #1394: Optimization to avoid reindexing documents when non-indexed fields are updated.
  • After index creation, an initial scan starts for existing documents. You can check the status of this scan by calling FT.INFO and looking at the indexing and percent_indexed values. While indexing is true, queries return partial results.
  • #1435: NOINITIALINDEX flag on FT.CREATE to skip the initial scan of documents on index creation.
  • #1401: Support for upgrading from v1.x and for reading RDBs created by RediSearch 1.x (more information).
  • #1445: Support for load event. This event indexes documents when they are loaded from RDB, ensuring that indexes are fully available when RDB loading is complete (available from Redis 6.0.7 and above).
  • #1384: FT.DROPINDEX, which by default does not delete documents underlying the index (see deprecated FT.DROP).
  • #1385: Add index definition to FT.INFO response.
  • #1097: Add Hindi snowball stemmer.
  • The FT._LIST command returns a list of all available indices. Note that this is a temporary command, as indicated by the _ in the name, so it's not documented. We're working on a SCAN-like command for databases with many indexes.
  • The RediSearch version will appear in Redis as 20000, which is equivalent to 2.0.0 in semantic versioning. Since the version of a module in Redis is numeric, we cannot explicitly add a GA flag.
  • RediSearch 2.x requires Redis 6.0 or later.
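
To check on the initial scan mentioned above, a client needs to read the indexing and percent_indexed values out of the FT.INFO reply. The sketch below assumes the reply arrives as a flat list of alternating keys and values, that values may be bytes, and that indexing is reported as "1" while the scan is running. The exact value formats are assumptions, and the sample reply is invented for the example.

```python
from typing import Any, Dict, List


def info_to_dict(reply: List[Any]) -> Dict[str, Any]:
    """Pair up a flat [key, value, key, value, ...] reply, decoding bytes to str."""
    def text(item: Any) -> Any:
        return item.decode() if isinstance(item, bytes) else item
    return {text(reply[i]): text(reply[i + 1])
            for i in range(0, len(reply) - 1, 2)}


def scan_in_progress(info: Dict[str, Any]) -> bool:
    """True while the initial scan runs, assuming 'indexing' is reported as 1."""
    return str(info.get("indexing", "0")) == "1"


def scan_progress(info: Dict[str, Any]) -> float:
    """Return percent_indexed as a float; its scale is whatever the server reports."""
    return float(info.get("percent_indexed", 0))


# Invented sample reply; a real FT.INFO reply carries many more fields.
sample_reply = [b"index_name", b"idx:movies",
                b"indexing", b"1",
                b"percent_indexed", b"0.5"]
info = info_to_dict(sample_reply)

if scan_in_progress(info):
    print("initial scan still running; query results may be partial")
```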

Behavior changes

Please familiarize yourself with these changes before upgrading to RediSearch 2.0:

  • #1381: FT.SYNADD is removed; use FT.SYNUPDATE instead. FT.SYNUPDATE requires both
    an index name and a synonym group ID. This ID can be any ASCII string.
  • #1437: Documents that expire during query execution time will not appear in the results (but might have been counted in the number of produced documents).
  • #1221: Synonyms support for lower case. This can result in a different result set on FT.SEARCH when using synonyms.
  • RediSearch will not index hashes whose fields do not match an existing index schema. You can see the number of hashes that were not indexed in the hash_indexing_failures value returned by FT.INFO. The requirement for adding support for partial indexing and blocking is captured here: #1455.
  • Removed support for NOSAVE (for details see v1.6 docs).
  • RDB loading will take longer due to the index not being persisted.
  • Field names in the query syntax are now case-sensitive.
  • Deprecated commands:
    • FT.DROP (replaced by FT.DROPINDEX, which by default keeps the documents)
    • FT.ADD (mapped to HSET for backward compatibility)
    • FT.DEL (mapped to DEL for backward compatibility)
    • FT.GET (mapped to HGETALL for backward compatibility)
    • FT.MGET (mapped to HGETALL for backward compatibility)
  • Removed commands:
    • FT.ADDHASH (no longer makes sense)
    • FT.SYNADD (see #1381)
    • FT.OPTIMIZE (see v1.6 docs)

Scaling a single index over multiple shards with the open source Redis cluster API

Previously, a single RediSearch index, and its documents, had to reside on a single shard. This meant that dataset size and throughput were bound to what a single Redis process could handle.

Redis Enterprise offered the ability to distribute documents in a clustered database and aggregate the results at query time. This fan-out and aggregation is handled by a component called the “coordinator”, which is now also available under the same Redis Source Available License for all Redis OSS users in its own repository, RSCoordinator.

Notes:

  • The version inside Redis will be 20000, or 2.0.0 in semantic versioning. Since the version of a module in Redis is numeric, we could not add a GA flag.
  • Requires Redis v6 or above.
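
The numeric version can be turned back into a semantic version with a little arithmetic. The sketch below assumes the encoding is major * 10000 + minor * 100 + patch, which is consistent with 20000 standing for 2.0.0 but is not spelled out in these notes. The function name is hypothetical.

```python
def decode_module_version(number: int) -> str:
    """Decode a numeric module version, assuming major*10000 + minor*100 + patch."""
    major, rest = divmod(number, 10000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


print(decode_module_version(20000))  # prints 2.0.0
```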