
Missing records in search query, Manticore 5.0.2 #912

Closed
kalsan opened this issue Oct 14, 2022 · 33 comments

Comments

@kalsan

kalsan commented Oct 14, 2022

Describe the bug
A clear and concise description of what the bug is.

To Reproduce

  1. Set up Manticore 5.0.2 with ThinkingSphinx
  2. Add hundreds of thousands of records
  3. Let it run in a rather busy production environment for a few weeks
  4. Suddenly, records are missing in search results (the records were reported earlier and they were not deleted in the meantime)

Expected behavior

Return all records

Describe the environment:

  • Manticore 5.0.2 348514c86@220530 dev
  • 64-bit Linux system

Messages from log files:

No relevant log entries. Manticore is of the opinion that everything works as expected.

Additional context

Restarting Manticore does not help. Re-indexing a record type fixes the problem for that type for the next few days or weeks, after which the problem occurs again. Occurrence is random and unpredictable.

I conducted experiments suggesting that this is a Manticore and not a ThinkingSphinx bug, see pat/thinking-sphinx#1230 (comment)

@tomatolog
Contributor

It would be better to provide a reproducible example.

@kalsan
Author

kalsan commented Oct 14, 2022

I'm afraid this bug is not reproducible in my lab setup, even with an identical database. It requires a fair amount of reads and/or writes to occur.

@tomatolog
Contributor

Could you check the index with indextool? If the check passes, could you upload the index files to our FTP, along with the query that should return the missing documents and the IDs of the documents that are missing?

@kalsan
Author

kalsan commented Nov 25, 2022

Thank you for your advice. My first step was to downgrade Manticore to 4.2.0_211223.15e927b28-1 in order to see whether it is affected as well. As the bug takes days to weeks to manifest, this issue evolves slowly. The bug also occurred this week, so 4.2 is affected as well.

I'll look into indextool.

@sanikolaev added the waiting ("Waiting for the original poster (in most cases) or something else") label on Nov 25, 2022
@kalsan
Author

kalsan commented Feb 24, 2023

The problem showed up today. Indextool reports: check passed. Since this produced no usable result, I'll check if and how I can obtain the requested data for the FTP upload without violating confidentiality rules. Thank you for your patience and sorry for the long process.

@sanikolaev
Collaborator

@kalsan
We've switched to S3. Here are the instructions on how to use it: https://manual.manticoresearch.com/Reporting_bugs#Uploading-your-data

@kalsan
Author

kalsan commented Feb 28, 2023

@sanikolaev after a thorough inspection of the index files, I'm certain that they are not the problem. I copied the same index files in both broken and working state. Both work perfectly fine in a lab, but the broken one only returns 16 records where the working one returns 33. The file sizes are 38.3 KB (broken) vs 50.7 KB (working). Conclusion: the records are effectively no longer there. Thus, the index files will be worthless to you, as they will simply contain roughly half of the data and be indistinguishable from a manual deletion of rows.

Trying another strategy. Is there a way to log all writing transactions (not just queries and events), in particular deletion statements?

@tomatolog
Contributor

tomatolog commented Feb 28, 2023

You could set the undocumented searchd.query_log_commands = 1 along with searchd.query_log_format = sphinxql in your config and restart the daemon. After that, all incoming SphinxQL statements will be logged into query.log.
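
For reference, a minimal sketch of how those two settings could look in the searchd section of the config (the query_log path is an assumption; adjust it to your installation):

searchd
{
    # log all incoming statements in SphinxQL format (path is illustrative)
    query_log          = /var/log/manticore/query.log
    query_log_format   = sphinxql
    # undocumented: also log data-modifying commands such as REPLACE/DELETE
    query_log_commands = 1
}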

@kalsan
Author

kalsan commented Mar 3, 2023

Thanks for the info. The logging is now set up appropriately and produces the DELETE log lines I was hoping for. If the application is responsible for the lost records, it will show up in the logs. I will review the results next week.

@kalsan
Author

kalsan commented Mar 6, 2023

The logging produced an interesting result:

/* Mon Mar  6 03:18:47.064 2023 conn 5971 */ REPLACE INTO foo_core(id, `sphinx_internal_class_name`, ...) VALUES (123456, 'Foo', ...) # error=unknown column: 'sphinx_internal_class_name'

We have seen unknown column: 'sphinx_internal_class_name' before; it's a bug that occurs randomly when workers update a real-time index. The bug is independent of the actual data and does not occur when starting an index job in the foreground. The bug has existed in Sphinx for years and apparently still exists in Manticore. Behaviors when unknown column: 'sphinx_internal_class_name' occurs:

  • Sphinx: throws an error
  • Manticore: throws a warning, and as the item could not be inserted, the record is lost from the search index, producing this issue.

This means that the following issues are all related:

The reason this problem is not reproducible in a lab is that it only occurs when very large amounts of text are in the index.

@kalsan
Author

kalsan commented Mar 6, 2023

Additional info:

  • The problem only occurs when running mass-updates on data from a background job.
  • Mass-updating from a console over SSH with output being generated (currently, thinking-sphinx produces a dot for every update) avoids the problem.
  • => This might be a timing issue where a Manticore/Sphinx buffer overflows when it gets too much data at once; perhaps sending dots over SSH slows it down enough to prevent it?
  • Our search indices are just below 2 GB; we have about 100'000 records in the index.

@sanikolaev
Collaborator

Manticore: throws a warning

How do I reproduce it? For me it throws an error:

mysql> drop table if exists foo_core; create table foo_core(a int); REPLACE INTO foo_core(id, `sphinx_internal_class_name`) VALUES (123456, 'Foo');
--------------
drop table if exists foo_core
--------------

Query OK, 0 rows affected (0.01 sec)

--------------
create table foo_core(a int)
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
REPLACE INTO foo_core(id, `sphinx_internal_class_name`) VALUES (123456, 'Foo')
--------------

ERROR 1064 (42000): unknown column: 'sphinx_internal_class_name'
mysql>

@kalsan
Author

kalsan commented Apr 4, 2023

@sanikolaev thanks for looking into this. sphinx_internal_class_name is a field generated by the ThinkingSphinx Gem (https://github.com/pat/thinking-sphinx/).

After months of debugging, I still haven't found a way of reliably reproducing this issue. However, it appears to be resolved with Sphinx 3.5.1, released in February. As the cause of the issue is still unclear, I'd suggest closing this issue and letting it be until someone else also runs into this.

@sanikolaev removed the waiting ("Waiting for the original poster (in most cases) or something else") label on Apr 5, 2023
@akostadinov

akostadinov commented Nov 17, 2023

It's a somewhat strange decision to close this because Sphinx 3.5.1 fixed it: Sphinx 3.x is unusable because it is not open source, and its fixes do not propagate to Manticore.

The reason I'm actually here is that with Manticore 6.2.12 I face this issue with total_pages: pat/thinking-sphinx#1213

I'm not sure how to reproduce it exactly without thinking-sphinx, but it appears to be a difference between total and total_found in search results, which apparently worked differently in Sphinx.

Does any Manticore developer have an idea why that might be?

@tomatolog
Contributor

You could reopen the issue with a reproducible case.

I see no point in keeping the ticket open without any progress on the first step: getting a reproducible example that shows the case. Without such a case we cannot start an investigation and provide a fix.

@akostadinov

@tomatolog, that makes sense. Here I have something; I believe this is the same issue. Let me know if you want a new issue opened. Here's how to reproduce.

Run the Manticore server once and Sphinx 2.2.11 once, each with the attached configuration file (manticore.conf.gz). Also add the following data to both:

REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (126, 'company1', '7', 'company1', ' ', 'foo7@example.net', '', '', '', '7', 7, 'Account', 0, 1701871556, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (144, 'company2', '8', 'company2', ' ', 'foo8@example.net', '', '', '', '8', 8, 'Account', 0, 1701871556, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (162, 'company3', '9', 'company3', ' ', 'foo9@example.net', '', '', '', '9', 9, 'Account', 0, 1701871557, 2, 2, 'approved');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (180, 'company4', '10', 'company4', ' ', 'foo10@example.net', '', '', '', '10', 10, 'Account', 0, 1701871557, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (198, 'company5', '11', 'company5', ' ', 'foo11@example.net', '', '', '', '11', 11, 'Account', 0, 1701871557, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (216, 'company6', '12', 'company6', ' ', 'foo12@example.net', '', '', '', '12', 12, 'Account', 0, 1701871557, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (234, 'company7', '13', 'company7', ' ', 'foo13@example.net', '', '', '', '13', 13, 'Account', 0, 1701871557, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (252, 'company8', '14', 'company8', ' ', 'foo14@example.net', '', '', '', '14', 14, 'Account', 0, 1701871557, 2, 2, 'approved');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (270, 'company9', '15', 'company9', ' ', 'foo15@example.net', '', '', '', '15', 15, 'Account', 0, 1701871557, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (288, 'company10', '16', 'company10', ' ', 'foo16@example.net', '', '', '', '16', 16, 'Account', 0, 1701871557, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (306, 'company11', '17', 'company11', ' ', 'foo17@example.net', '', '', '', '17', 17, 'Account', 0, 1701871557, 2, 2, 'created');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (324, 'company12', '18', 'company12', ' ', 'foo18@example.net', '', '', '', '18', 18, 'Account', 0, 1701871558, 2, 2, 'approved');
REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (324, 'company12', '18', 'company12', ' ', 'foo18@example.net', '', '', '', '18', 18, 'Account', 0, 1701871558, 2, 2, 'created');

Then run these commands on both:

SELECT * FROM account_core limit 2,4;
SHOW META;

The result for manticore is:

+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| total          | 6     |
| total_found    | 6     |
| total_relation | gte   |
| time           | 0.001 |
+----------------+-------+

While for sphinx it is:

+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| total         | 12    |
| total_found   | 12    |
| time          | 0.000 |
+---------------+-------+

I'm not sure whether the data is the problem or only the metadata. For sure the metadata is a problem, as mentioned in pat/thinking-sphinx#1213. And as far as I can tell, according to the documentation, thinking-sphinx is using and should be using total. See https://sphinxsearch.com/blog/2014/03/28/basics-of-paginating-results/

Thank you!

@tomatolog
Contributor

It's a planned feature, implicit cutoff, described in our manual:

⚠️ BREAKING CHANGE: Implicit cutoff. Manticore now doesn't spend time and resources processing data you don't need in the result set which will be returned. The downside is that it affects total_found in SHOW META and hits.total in JSON output. It is now only accurate in case you see total_relation: eq while total_relation: gte means the actual number of matching documents is greater than the total_found value you've got. To retain the previous behaviour you can use search option cutoff=0 , which makes total_relation always eq .

If you use the default sorting, implicit cutoff is on.

@akostadinov

That's very important to know, thank you. I didn't see it in the migration notes because apparently it came later. Can it be set somehow globally to avoid the need to change the application?

One other question, are there any other changes that would interfere with pagination as described in the blog post I linked in my previous comment?

@akostadinov

akostadinov commented Dec 7, 2023

Actually, I just tried it again with

SELECT * FROM account_core limit 2,4 OPTION cutoff=0;

And the result is

+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| total          | 6     |
| total_found    | 12    |
| total_relation | eq    |
| time           | 0.000 |
+----------------+-------+
4 rows in set (0,001 sec)

Why does total still equal 6? I'm inserting 12 records.

P.S. Thinking Sphinx already does cutoff=0 in pat/thinking-sphinx#1239.

@sanikolaev
Collaborator

Why does still total equal 6? I'm inserting 12 records?

Because total is calculated as limit + offset. What you need to be looking at is total_found, not total, and it's 12 in your case. We recently discussed in the Manticore team that total is confusing, and we are thinking about removing it entirely in future releases.

Can it be set somehow globally to avoid the need to change the application?

Unfortunately not. Another approach, by the way, is to just do select count(*) like you would in MySQL/PostgreSQL etc. SHOW META is a nice feature, but it comes with some limitations.
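
For illustration, a sketch against the account_core table from the reproduction above (the numbers match the META output already shown in this thread):

SELECT * FROM account_core LIMIT 2,4 OPTION cutoff=0; SHOW META;
/* total = 6 (offset 2 + limit 4), total_found = 12 (all matching documents) */

SELECT COUNT(*) FROM account_core;
/* returns 12, like COUNT(*) would in MySQL/PostgreSQL */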

@akostadinov

akostadinov commented Dec 7, 2023

@sanikolaev , in my previous comment I linked to this sphinx doc:
https://sphinxsearch.com/blog/2014/03/28/basics-of-paginating-results/

They specifically advise against using total_found for pagination because it may show a higher number than the actual results that would be returned, and they mandate total for that purpose.

Is Manticore's meaning of total (and by extension pagination) deliberately different from Sphinx's, and is it documented somewhere so projects can rely on its stability over time?

P.S. Actually, if you consider removing total, maybe make it compatible with Sphinx 2.2.11 instead, to make migration smoother?

@sanikolaev
Collaborator

@sanikolaev , in my previous comment I linked to this sphinx doc:
https://sphinxsearch.com/blog/2014/03/28/basics-of-paginating-results/

When we do the pagination we should consider the total, because if we use total_found we might try to display more pages than we can actually show

I have another opinion: the user can decide for themselves how many pages/docs they want to show. If total_found is too high, it's not a big deal to do "if total_found > 1000, total_found = 1000" in the code.

Is Manticore's meaning of total (and by extension pagination) deliberately different than sphinx' and is it documented somewhere so projects can rely on it's stability over time?

@akostadinov

The point is that the user cannot show all total_found records if desired, as far as I understand. Or is this possible with Manticore?

And again, is there any technical difficulty in making total compatible with Sphinx? If total is to be removed, from a user's standpoint it would be much nicer to instead make it return what Sphinx returns. Or, if all records of total_found are accessible to the user in Manticore, then perhaps make total and total_found return the same number, to allow the same pagination logic as described for Sphinx. Presently, pagination using thinking-sphinx is not working because of this difference.

PS cutoff is fine, thinking-sphinx already sets that to 0

PPS I honestly can't understand the SHOW META documentation about total. I don't see details about its meaning in a paginated query. In the cutoff documentation I only see total_found mentioned.

@sanikolaev
Collaborator

The point is that user cannot show all total_found records if desired as far as I understand. Or is this possible with manticore?

By "show all total_found records" do you mean that if total_found is 1 billion, then you can't get 1B docs at once? It's possible, but you'll have to increase max_matches manually.

If total is to be removed, from a user standpoint of view it would be much nicer to instead make it return what sphinx returns.

Only from a Sphinx user's standpoint.

perhaps make total and total_found return the same number

It would be even more confusing than now. We'll of course think about backwards compatibility if we decide to remove total and will perhaps add some searchd.show_meta_total = 1 to keep it for those who can't change their app for some reason, but that wouldn't be the default.

PS cutoff is fine, pat/thinking-sphinx#1239 already sets that to 0

Then total_found should be accurate. If it's not, we need a reproducible example to fix it.

Presently pagination using thinking-sphinx is not working because of this difference.

total_found is to be used.

PPS I honestly can't understand the SHOW_META documentation about total. I don't see details about its meaning in a paginated query.

That's why we are thinking about removing it altogether. It's meaningless.

In the cutoff documentation I only see total_found mentioned.

I've created this task to improve the docs: #1670. PRs are welcome. The docs are here: https://github.com/manticoresoftware/manticoresearch/tree/master/manual

@akostadinov

By "show all total_found records" do you mean that if total_found is 1 billion, then you can't get 1B docs at once? It's possible, but you'll have to increase max_matches manually.

That's not really a solution. I'm talking about pagination throughout, so it is not about viewing them all at once. And I don't even care so much that the user can't see them all; it is more about UI design. With pagination, users usually see the first couple of pages and the last couple of pages when many pages are available.

And it is common for users to go to the last page. If total_found showed a number that cannot be retrieved, then showing the last page would fail and break the UI.

So it is necessary to have a META value that indicates the maximum number of records that can ever be retrieved.

Modifying max_matches makes things just too complicated. It can't be made one-size-fits-all. For small installations it would be fine, and then all of a sudden, with increased data, things would start breaking.

From the standpoint of anyone implementing pagination, some way to get the maximum accessible count from the metadata is important. I would argue that total_found is useless if it shows numbers that cannot ever be retrieved. Why do I need to know the number if I can't get that many records (except for curiosity maybe)?

@sanikolaev
Collaborator

With pagination usually users see first couple of pages and last couple of pages when many pages are available.
And it is common for the users to go to the last page

Amazon: (screenshot)
Ebay: (screenshot)
Google: (screenshot) + autoload of the next page
Yahoo: (screenshot)
Duckduckgo: (screenshot)
Quora, Reddit, Youtube: autoload, no pagination, no search results count
Walmart: (screenshot)

I'm not sure how common this feature is. It may have been more prevalent in previous decades, but it doesn't seem to be the case now. Why would I care about what's on the 50th page? The best results are typically found within the first few pages. Even clicking "next page" is a hassle since this can be automated by scrolling down to the end of the page. Displaying something like 1 2 ... 49 50 is not user-friendly. Showing an accurate number of results may make sense in some use cases. In most cases, however, approximations like 1000+, over 1000, or another similar format (like in Google and Yahoo) are sufficient. What's important nowadays is displaying the most relevant results on the first page and ensuring quick response times. That's the approach taken by platforms like DuckDuckGo, Quora, Reddit, and YouTube.

I could find one website which shows the last page and lets me visit it, but when I visit it I see more pages 🤦 - Stackoverflow: (screenshots)

Quite confusing behaviour.

From any pagination user's standpoint, having some way to access the maximum number of pages via metadata is important.

If you want to cap the document count at 1000, similar to how total worked in Sphinx, just use cutoff=1000 and refer to total_found.
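
For example, a sketch using the account_core table from the reproduction above:

SELECT * FROM account_core LIMIT 2,4 OPTION cutoff=1000; SHOW META;
/* processing stops once 1000 matches are collected, so total_found never exceeds 1000
   (with only 12 rows in the table it would simply be 12) */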

@sanikolaev
Collaborator

Modifying max_matches makes things just too complicated

Valid point. The task to simplify it is #1607

@akostadinov

I'm not sure the provided examples can be classified as typical applications or that they provide a full picture of the use cases.

Here's a specific use case from the project I'm working on:

(screenshot of the accounts table)

This is a table of available accounts. Some installations may have 10 accounts, some thousands (other objects are listed in a similar way, but one example should illustrate it well enough). The table can be filtered by multiple criteria (including full-text search) as well as sorted. It must be easy for users to scroll through many pages. IMHO it's not practical to change the page selection elements depending on whether full-text search was used or not. Maybe it's good technically, but I expect confusion from the users.

If the server already knows the maximum accessible number, why shouldn't it provide it through metadata, subject to limitations for performance reasons that app developers can override, as they currently can with cutoff? It would just be a courtesy to reduce the complexity of app code and configuration.

My understanding is that there is no technical difficulty for Manticore to provide a maximum accessible value in META via something like

total_or_whatever_other_name: ( total_found < max_matches ? total_found : max_matches )

The task you created is nice and would resolve the use case in an alternative way. Thank you for it! I expect, though, that ops would prefer to set a high, non-overridable max_matches limit instead of risking reckless developers breaking the server with a bad query.

@akostadinov

If total is to be removed, from a user standpoint of view it would be much nicer to instead make it return what sphinx returns.

Only from Sphinx user's standpoint.

It would be better from any user's perspective. Presently that information is not available at all.

If you want, add another metadata field with the same information (the maximum possible matches); then it will just be a matter of Sphinx vs non-Sphinx users. But libraries can also abstract this for the developer in this way, without the need to add and match configuration options on the server and client.

@sanikolaev
Collaborator

BTW in newer Sphinx versions the total in SHOW META also doesn't reflect the current max_matches:

root@21ffe1f7144c:/sphinx-3.6.1/bin# ./searchd -v
Sphinx 3.6.1 (commit c9dbedab)
Copyright (c) 2001-2023, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)

MySQL [(none)]> select * from t; show meta;
+------+
| id   |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   16 |
|   17 |
|   18 |
|   19 |
|   20 |
+------+
20 rows in set (0.001 sec)

+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| total         | 20    |
| total_found   | 10001 |
| time          | 0.000 |
+---------------+-------+
3 rows in set (0.000 sec)

@akostadinov

akostadinov commented Jan 30, 2024

total | 20 appears correct to me, no?

Regardless, such a meta value would be useful, whether we call it found or something else. I don't care so much whether it is compatible with Sphinx or not (although it would be easier if it were). To me it is simply a no-brainer that such a value is useful, because once the configuration is in the server, there is no additional configuration needed on the client side.

FYI, for now I have decided to hardcode the value of 1000 in a monkey patch. This is not cool, but I see no other feasible option. The dynamic max_matches (when/if it becomes available) is unsafe if too many records are found, so I prefer to hardcode the limit instead of relying on luck.

ThinkingSphinx::Masks::PaginationMask.prepend(Module.new do
  def total_pages
    return 0 unless search.meta['total_found']

    # 1000 is the default server max_matches value. We should stay at or below the server setting here.
    @total_pages ||= ([total_entries, 1000].min / search.per_page.to_f).ceil
  end
end)

@sanikolaev
Collaborator

total | 20 appears correct to me, no?

The same behaviour is in Manticore, but previously you insisted that total should show how many documents you can paginate through without increasing max_matches.

@akostadinov

@sanikolaev, I assume 10001 is all matching records, but max_matches is 20, which is what total accounts for. Or is it different?

For an explanation of the issue, please check my comment starting at #912 (comment)

The difference with Sphinx 2.2.11 comes when using limits for pagination purposes. For normal queries this metadata is irrelevant.

The behavior is definitely different from Sphinx 2.2.11, because the thinking-sphinx library properly paginates with it while it doesn't with Manticore. See pat/thinking-sphinx#1239

Some people reported that it also worked with 3.x but it is not open source so it is not an option for me.

As I said, it is really not that important whether Manticore is fully Sphinx-compatible. But this is useful information, and I assume it is easy for the server to return: the server already has total_found and already knows its configured max_matches, so it is a simple calculation to return it in some metadata, be it as found or something else.

The workaround is to have a client-side option for the total number of records that can be returned, which has to be kept in sync with the server configuration by the developer/ops.
