FT.SEARCH output incorrect (FT.ADD not creating tokens correctly).. #168
Comments
The index is limited to 32 text fields and unlimited numeric fields. However, if some of them are used only for sorting, I can release a fix that makes them not count toward those 32. Will that work for you?
You could perhaps solve it by indexing the yes/no fields as numeric rather than text, for example.
Thanks Dvir for your prompt response. Yes, it would be great if you could release a fix that makes them not count toward those 32. Also, in the above example there are 25 text fields and 8 numeric fields, so ideally it should work (as numeric fields would not be counted)?
The numeric fields might be mistakenly counted in those 32 as well; let me check. Either way, it should not fail silently. I'll try to release a fix for this ASAP.
Yeah, there was a bug and they were counted together. I fixed it and am now writing a test for a huge schema. It should be pushed soon and you can try it.
OK, it should work now. Please pull master and try.
Yes, working fine now. Thanks.
@arora-kushal nice! There are tests for a big schema now (64 fields), and the actual limitation is 1024 fields (I should document it somewhere). Also, it no longer fails silently: it tells you if there are more than 32 text fields. I would have allowed more, but I'm marking the field ids for each term in each document with a bitmask that is either 8, 16, 24 or 32 bits wide, to allow filtering on many fields at constant speed. The encoding scheme doesn't allow more than 32 bits for the field mask at the moment, and most people will probably not use more than 8. I do, however, want to add a new type of field, which I call a "tag field", much like indexing a VARCHAR field in SQL. It will be just like a text field, but it is not tokenized and cannot be referred to without the field name, so it doesn't need a bit mask. This will allow unlimited fields for things like (in your schema)
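The field-mask idea described above can be illustrated with a small sketch. This is not RediSearch's actual code: the names `field_bit`, `mask_width_bytes` and `matches` are hypothetical, and the only details taken from the comment are the 32-field cap and the 8/16/24/32-bit mask widths.

```python
# Illustrative sketch only; all function names here are hypothetical.
MAX_TEXT_FIELDS = 32  # cap on text fields stated in the comment above


def field_bit(field_index: int) -> int:
    """Return the mask bit for the text field at a zero-based index."""
    if not 0 <= field_index < MAX_TEXT_FIELDS:
        # Reject out-of-range fields loudly instead of failing silently.
        raise ValueError(f"text field index {field_index} does not fit in a 32-bit mask")
    return 1 << field_index


def mask_width_bytes(mask: int) -> int:
    """Smallest width (1 to 4 bytes, i.e. 8/16/24/32 bits) that can hold the mask."""
    return max(1, (mask.bit_length() + 7) // 8)


def matches(term_mask: int, query_mask: int) -> bool:
    """Constant-time check: does the term occur in any of the queried fields?"""
    return (term_mask & query_mask) != 0


# A term that occurs in text fields 3 and 5 of a document.
term_mask = field_bit(3) | field_bit(5)
print(matches(term_mask, field_bit(5)))    # field 5 is queried and present
print(matches(term_mask, field_bit(4)))    # field 4 is queried but absent
print(mask_width_bytes(term_mask))         # both bits fit in a single byte
```

A field with index 32 or higher has no bit available in such a mask, which is why a schema with more text fields needs either an explicit error or a field type that does not rely on the mask.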
Thanks for explaining it in detail. Looking forward to this solution, as it will be very useful for us.
Hi Dvir, do you have any update on this? We are building a solution where the schema has more than 32 text fields and requires filtering, so the solution you mentioned above would be very useful for us.
@arora-kushal Hi, it's not ready yet. I've started implementing it but have been tied up with other things. I don't want to promise anything, but it's coming soon.
We are building a similar solution where we have around 60 text fields on which searching/filtering can be applied. We will create two indexes for a schema containing fields as follows: Any suggestions will be highly appreciated.
@dvirsky Could you please tell us whether we are thinking in the right direction, or whether there is a better workaround?
It's a hack, indeed, but it will work.
As long as there aren't too many results to load on the first query (a few thousand should be fine), it will work, and it can be a good temporary solution.
Hi,
We are facing an issue in RediSearch 0.21.0: the FT.SEARCH command returns the wrong output.
Could anyone please help me with this? The reproduction steps are given below.
Thanks !
## Commands to setup:
"FT.CREATE" "VVV" "SCHEMA" "VVId_s" "TEXT" "SORTABLE" "PrimaryBIN_s" "TEXT" "SORTABLE" "StreetAddress_s" "TEXT" "SORTABLE" "PrimaryAddress_s" "TEXT" "SORTABLE" "CreationDate_l" "NUMERIC" "SORTABLE" "NormalizedSeverity_s" "TEXT" "SORTABLE" "SourceSeverity_s" "TEXT" "SORTABLE" "ClosedDate_l" "NUMERIC" "SORTABLE" "Description_s" "TEXT" "SORTABLE" "RemediationAgency_s" "TEXT" "SORTABLE" "Group_s" "TEXT" "SORTABLE" "SiteCompliId_l" "NUMERIC" "SORTABLE" "Bin_s" "TEXT" "SORTABLE" "Conditions_s" "TEXT" "SORTABLE" "Owned_s" "TEXT" "SORTABLE" "IsCondition_s" "TEXT" "SORTABLE" "VVCost_s" "TEXT" "SORTABLE" "CAPTracker_s" "TEXT" "SORTABLE" "Fixer_s" "TEXT" "SORTABLE" "OverallStatus_s" "TEXT" "SORTABLE" "CombinedStatus_s" "TEXT" "SORTABLE" "IsExcluded_s" "TEXT" "SORTABLE" "UnitNumber_s" "TEXT" "SORTABLE" "Borough_s" "TEXT" "SORTABLE" "DateAddedtoBCS_l" "NUMERIC" "SORTABLE" "UpdatedBy_s" "TEXT" "SORTABLE" "UpdatedAt_l" "NUMERIC" "SORTABLE" "Created_At_l" "NUMERIC" "SORTABLE" "Created_At_Date_l" "NUMERIC" "SORTABLE" "RemediationOwner_s" "TEXT" "SORTABLE" "VVType_s" "TEXT" "SORTABLE" "IsActive_l" "NUMERIC" "SORTABLE" "IsActiveText_s" "TEXT" "SORTABLE"
"FT.ADD" "VVV" "ss:d:{#ClientList}50" "1" "FIELDS" "VVId_s" "123456789K" "PrimaryBIN_s" "2" "StreetAddress_s" "1800" "PrimaryAddress_s" "1800" "CreationDate_l" "6" "NormalizedSeverity_s" "days" "SourceSeverity_s" "NO" "ClosedDate_l" "222" "Description_s" "Work" "RemediationAgency_s" "U" "Group_s" "2" "SiteCompliId_l" "0" "Bin_s" "2" "Conditions_s" "2" "Owned_s" "Private" "IsCondition_s" "No" "VVCost_s" "2" "CAPTracker_s" "2" "Fixer_s" "2" "OverallStatus_s" "O" "CombinedStatus_s" "Open" "IsExcluded_s" "No" "UnitNumber_s" "2" "Borough_s" "BBBB" "DateAddedtoBCS_l" "232" "UpdatedBy_s" "2" "UpdatedAt_l" "32" "Created_At_l" "6" "Created_At_Date_l" "6" "RemediationOwner_s" "U" "VVType_s" "C" "IsActive_l" "2" "IsActiveText_s" "Yes"
Search Command : "FT.SEARCH" "VVV" "@IsActiveText_s:(Yes)"
Expected output : "ss:d:{#ClientList}50"
Actual Output (INCORRECT): 0
However, if the above commands use fewer fields, the output is correct. For example:
"FT.CREATE" "VVV" "SCHEMA" "VVId_s" "TEXT" "SORTABLE" "IsActiveText_s" "TEXT" "SORTABLE"
"FT.ADD" "VVV" "ss:d:{#ClientList}50" "1" "FIELDS" "VVId_s" "123456789K" "IsActiveText_s" "Yes"
Search Command : "FT.SEARCH" "VVV" "@IsActiveText_s:(Yes)"
Expected output : "ss:d:{#ClientList}50"
Actual Output (CORRECT): "ss:d:{#ClientList}50"
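To see how close a schema is to the 32-text-field limit mentioned in the comments, the field types in an FT.CREATE argument list can be counted before the command is sent. The helper below is a hypothetical sketch, not part of RediSearch or any client library, and it assumes the simplified layout used in this report: each field is a name, a type (TEXT or NUMERIC), and an optional SORTABLE flag.

```python
# Hypothetical helper; assumes fields are "name TYPE [SORTABLE]" as in this report.
def count_field_types(create_args):
    """Count TEXT and NUMERIC fields in the argument list of an FT.CREATE command."""
    tokens = create_args[create_args.index("SCHEMA") + 1:]
    counts = {"TEXT": 0, "NUMERIC": 0}
    i = 0
    while i < len(tokens):
        if i + 1 >= len(tokens):
            raise ValueError(f"field {tokens[i]!r} has no type")
        field_type = tokens[i + 1].upper()
        if field_type not in counts:
            raise ValueError(f"unsupported field type {tokens[i + 1]!r}")
        counts[field_type] += 1
        i += 2
        # Skip the optional SORTABLE flag that may follow the type.
        if i < len(tokens) and tokens[i].upper() == "SORTABLE":
            i += 1
    return counts


# The reduced schema from the report above.
small_schema = [
    "FT.CREATE", "VVV", "SCHEMA",
    "VVId_s", "TEXT", "SORTABLE",
    "IsActiveText_s", "TEXT", "SORTABLE",
]
counts = count_field_types(small_schema)
print(counts)
print("within text field limit:", counts["TEXT"] <= 32)
```

Run against the full schema in the setup commands, the same count gives 25 text and 8 numeric fields, 33 in total, which is consistent with the explanation in the comments that numeric fields were mistakenly counted toward the limit of 32.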