Implement subset of RediSearch #431

romange · 2022-10-24T17:39:18Z

Quoting a Node.js user, they want to create an index like this:

import { SchemaFieldTypes } from 'redis';
import { redis } from '..';

redis.ft
    .CREATE(
        'index:channels',
        {
            '$.id': {
                type: SchemaFieldTypes.TEXT,
                AS: 'id',
                SORTABLE: true,
            },
            '$.guildId': {
                type: SchemaFieldTypes.TEXT,
                AS: 'guildId',
                SORTABLE: true,
            },
            '$.position': {
                type: SchemaFieldTypes.NUMERIC as any,
                AS: 'position',
                SORTABLE: true,
            },
        },
        {
            ON: 'JSON',
            PREFIX: 'channels',
        }
    )
    .catch(() => null);

and then be able to query it like this:

await utils.redisSearch('index:channels', `@guildId:${id}`);

Note: we do not need full-text search, stemming, query rewriting, or other language-related features.
Instead, this task is about formal, semi-structured querying that will provide a lot of value for people who use RedisJSON.

The task is a super task that should be broken down into smaller sub-projects:

Auto indexing (FT.CREATE)
Building a query AST tree with all the operators we support.
Executing a query without query plan optimizations.
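
To make the second and third sub-projects more concrete, here is a minimal TypeScript sketch of a query AST and a naive executor that scans every document, with no query planning at all. The node shapes, the `matches` and `search` names, and the sample documents are assumptions made for illustration; they are not taken from Dragonfly's code base.

```typescript
// Illustrative only: node shapes and names are assumptions, not Dragonfly's internals.
type Doc = Record<string, string | number>;

type Query =
  | { kind: "term"; field: string; value: string }              // @field:value
  | { kind: "range"; field: string; min: number; max: number }  // @field:[min max]
  | { kind: "and"; children: Query[] }                          // a b
  | { kind: "or"; children: Query[] }                           // a | b
  | { kind: "not"; child: Query };                              // -a

// Evaluate one AST node against one document.
function matches(q: Query, doc: Doc): boolean {
  switch (q.kind) {
    case "term": {
      const v = doc[q.field];
      return v !== undefined && String(v) === q.value;
    }
    case "range": {
      const v = doc[q.field];
      return typeof v === "number" && v >= q.min && v <= q.max;
    }
    case "and":
      return q.children.every((c) => matches(c, doc));
    case "or":
      return q.children.some((c) => matches(c, doc));
    case "not":
      return !matches(q.child, doc);
  }
}

// Naive execution: visit every document and return the keys that match.
function search(q: Query, docs: Record<string, Doc>): string[] {
  return Object.keys(docs).filter((key) => matches(q, docs[key]));
}

// Roughly "@guildId:42 @position:[0 5]" over three sample documents.
const channels: Record<string, Doc> = {
  "channels:1": { id: "1", guildId: "42", position: 3 },
  "channels:2": { id: "2", guildId: "42", position: 9 },
  "channels:3": { id: "3", guildId: "7", position: 1 },
};
const query: Query = {
  kind: "and",
  children: [
    { kind: "term", field: "guildId", value: "42" },
    { kind: "range", field: "position", min: 0, max: 5 },
  ],
};
console.log(search(query, channels));
```

A full scan like this is the simplest correct baseline; per-field indexes and query planning can be layered on later without changing the AST.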

iko1 · 2022-10-26T19:58:16Z

I think you meant to write RediSearch in the title, not JsonSearch.

romange · 2022-10-27T04:52:35Z

You are right, but since the subset of functionality I want to focus on is within JSON, this mistake makes sense 😄

romange · 2023-04-29T13:05:01Z

Could be a great MVP for query part

sirfz · 2023-05-06T06:58:08Z

I'm not a RediSearch user (yet) but have been very interested in it recently as it seems to be exactly what I need for my use case.

In particular, the vector similarity search can be a killer feature to have in dragonfly.

In general, RediSearch seems to be an all around great feature and having it in dragonfly, in my opinion, would bring lots of adoption. Just my 2 cents

totorofly · 2023-05-09T15:14:16Z

My project relies heavily on the combination of RediSearch and RedisJSON, running roughly 100-300 FT.SEARCH commands per second over 2-3 million records to obtain results that meet various conditions. Additionally, the TTL of my 200-300 million records is only around 180-480 seconds, so the write and read load on the Redis cluster is quite high (FT.SEARCH must return within 300ms on average). As a result, I had to build a Redis cluster to meet these demands, which makes the overall maintenance cost relatively high. Therefore, I'm looking for an architecture that can achieve the same effect at a lower cost. If DragonFlyDB can provide full-text search capabilities similar to RediSearch and RedisJSON, I would be willing to give it a try.

romange · 2023-05-09T15:24:59Z

Can you provide an example for a typical query that you send? Do you need word stemming, multiple languages support in full text search?

totorofly · 2023-05-09T15:34:05Z

Can you provide an example for a typical query that you send? Do you need word stemming, multiple languages support in full text search?

romange · 2023-05-09T15:48:17Z

Looks like a structured search, do not see here any full text-search requirements but maybe i am missing something.

…

totorofly · 2023-05-09T15:58:27Z

Looks like a structured search, do not see here any full text-search requirements but maybe i am missing something.
…
On Tue, May 9, 2023, 18:34, 0.618 wrote:

Can you provide an example for a typical query that you send? Do you need word stemming, multiple languages support in full text search?

I currently do not need to use stemming because my project is mainly to help users match mobile phone numbers related to their favorite numbers. Since the matching is all about numeric strings, even if it is a Chinese project, I don't need to use Chinese, just numbers and English letters. Here is my search example:

FT.SEARCH tm @.:lastABABAB|anyABCDABCD|lastAAABBB|lastAABBCC|lastABCABC|lastABCDDBCAXXX|anyAABBCC|anyAAABBB|lastAAAAB|lastAAAAA|anyABCDEF|anyAAAAA|lastAABBB|lastABCDABDCXXX|lastABCDBACD|lastABCDBACDXXX|lastABCDDCBA|lastAAAA|lastABCDACBDXXX|lastABBA|lastABBCBB|lastABCDABDC|anyAAAA|lastAABB|anyABABAB|anyABCABC|anyAAAAB|midAAAA|anyAAABB|anyABBCBB|lastABABtu368|anyAABBB|lastABBB|lastABABtu613|lastABAB|lastABABtu850|midBAAA|lastABCD|midAAAB|lastAABCC|midAABB|lastBrithYear758799|lastAAAB|midABCD|midABAB|any888|any666|lastAABAAXXX|lastABCCBAXX|anyABAB|anyABABtu368|anyBrithYear758799|anyABBA|anyAABB|anyABABtu850|anyABABtu613|lastABB|anyABCD|lastXAXAXAXA|lastXAXAXA|lastABAC|lastAXAXAX|anyAAA|head1889|lastABACAD|anyABBCDD|lastAXAXAXAX @ttlInSecond:[1683364311 +inf] @providerCode:jyxf @status:1 @preOrderTime:[-inf (1683364009] @TouchCode:P00000035328 @province:beijing @city:beijing ' LIMIT 0 0

I use full-text search by decomposing all possible combinations of a phone number into individual keys in a JSON object and then setting these JSON keys as index values in RediSearch. For example, in order to match phone numbers similar to a user's license plate number, I specifically decompose various possible combinations of 5 consecutive digits of the target phone number, as follows:

{
    "phone": "13085669245",
    "status": "1",
    "owner": "",
    "ttlInSecond": 0,
    "preOrderTime": 0,
    "providerCode": "beijing",
    "rule_car": {
        "any5": "69245",
        "any4": "X9245,6X245,69X45,692X5,6924X",
        "any3": "XX245,X9X45,X92X5,X924X,6XX45,6X2X5,6X24X,69XX5,69X4X,692XX",
        "tail5": "69245",
        "tail4": "9245",
        "tail3": "245",
        "continuous5": "66924,56692,85669,08566,30856,13085",
        "continuous4": "6924,6692,5669,8566,0856,3085,1308",
        "continuous3": "924,692,669,566,856,085,308,130"
    }
}
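
The decomposition in the sample document can be reproduced with a few lines of code. The sketch below is inferred purely from the `rule_car` values shown above: `buildRuleCar`, `windows`, and `masks` are hypothetical names, and the rules (wildcards over the last five digits, sliding windows listed right to left) are a reading of the example rather than the author's actual implementation.

```typescript
// Hypothetical helpers, inferred from the sample document; not the author's code.

// All sliding windows of `size` digits, ordered from the right end to the left.
function windows(digits: string, size: number): string[] {
  const out: string[] = [];
  for (let start = digits.length - size; start >= 0; start--) {
    out.push(digits.slice(start, start + size));
  }
  return out;
}

// Every way to replace `count` positions of `s` with the wildcard "X".
function masks(s: string, count: number, from = 0): string[] {
  if (count === 0) return [s];
  const out: string[] = [];
  for (let i = from; i < s.length; i++) {
    const masked = s.slice(0, i) + "X" + s.slice(i + 1);
    out.push(...masks(masked, count - 1, i + 1));
  }
  return out;
}

function buildRuleCar(phone: string): Record<string, string> {
  const last5 = phone.slice(-5);
  const rule: Record<string, string> = {
    any5: last5,
    any4: masks(last5, 1).join(","), // one wildcard in the last five digits
    any3: masks(last5, 2).join(","), // two wildcards in the last five digits
  };
  for (const size of [5, 4, 3]) {
    const all = windows(phone, size);
    rule["tail" + size] = all[0] ?? "";                   // rightmost window
    rule["continuous" + size] = all.slice(1).join(",");   // every other window
  }
  return rule;
}

console.log(buildRuleCar("13085669245"));
```

Precomputing these strings at write time moves the pattern-matching cost out of the query path, which is why the FT.SEARCH call only has to match exact tokens.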

sirfz · 2023-05-09T16:15:49Z

In my case, fulltext search is the least interesting feature of RediSearch. I'm more interested in running queries like:

FT.SEARCH items-index "(@brand:xxx @model:xxx)=>[KNN 10 @vector $vector as score]" ...

which translates to something like: for all items that match brand xxx and model xxx, get me the top 10 closest items to the given $vector. Stemming/normalization could be useful for the attribute filter, I guess, but the power is more about searching multiple attributes and returning other attributes/columns (and of course the vector similarity search is great).
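
As a rough illustration of that filter-then-rank behaviour, here is a brute-force TypeScript sketch: apply the attribute filter first, then keep the k items closest to the query vector. The `Item` shape, the function names, and the use of squared Euclidean distance are assumptions for the example; a real engine would normally use a vector index instead of scanning every candidate.

```typescript
// Illustrative brute-force version of "(@brand:xxx @model:xxx)=>[KNN k @vector $vector]".
// Item shape and function names are assumptions made for this sketch.
interface Item {
  id: string;
  brand: string;
  model: string;
  vector: number[];
}

// Squared Euclidean distance; ordering by it is the same as ordering by L2 distance.
function squaredL2(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Filter by attributes first, then return the k nearest survivors with their scores.
function filteredKnn(
  items: Item[],
  filter: (item: Item) => boolean,
  queryVector: number[],
  k: number
): { id: string; score: number }[] {
  return items
    .filter(filter)
    .map((item) => ({ id: item.id, score: squaredL2(item.vector, queryVector) }))
    .sort((x, y) => x.score - y.score)
    .slice(0, k);
}

const catalog: Item[] = [
  { id: "a", brand: "xxx", model: "xxx", vector: [0, 0] },
  { id: "b", brand: "xxx", model: "xxx", vector: [3, 4] },
  { id: "c", brand: "yyy", model: "xxx", vector: [0, 0.1] },
  { id: "d", brand: "xxx", model: "xxx", vector: [1, 0] },
];
const nearest = filteredKnn(
  catalog,
  (item) => item.brand === "xxx" && item.model === "xxx",
  [0, 0],
  2
);
console.log(nearest);
```

Item "c" is close to the query vector but is excluded because it fails the brand filter, which is the point of combining structured filters with similarity search.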

totorofly · 2023-05-11T15:42:00Z

In Redis Cluster mode, I am unable to simultaneously call FT.SEARCH and JSON.SET operations within a single Lua script, because doing so involves different hosts and different slots, and cross-slot combination operations are not supported. This is one of the areas where I think Redis Cluster mode is not as perfect as it could be.

romange · 2023-06-04T10:26:15Z

@sirfz hey, can you DM me on Discord? I am curious to hear more about your use case.

romange · 2023-09-07T18:00:52Z

@sirfz @dwzkit we will have an experimental version of FT.SEARCH in v1.10 (next release).
Would you like to try it out?

totorofly · 2023-09-08T01:21:32Z

@sirfz @dwzkit we will have an experimental version of FT.SEARCH in v1.10 (next release). Would you like to try it out?

I'm sorry, I've been busy with other projects recently and may not have time to experiment for the next two months.

romange changed the title ~~Implement subset of JsonSearch~~ Implement subset of RediSearch Apr 29, 2023

romange assigned dranikpg Apr 29, 2023

romange mentioned this issue May 4, 2023

feat: simple AST for search #1175

Merged

romange mentioned this issue Sep 7, 2023

Request for Compatibility with Open-Source Search Engines and JSON Processing Libraries #1229

Closed
