-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(mango
): prevent occasional duplication of paginated text
results
#4782
Conversation
7770d75
to
6a6d9c0
Compare
The problem with search-backed mango queries is the execute/1 function that slurps up all the matches in a loop before passing them over for processing. We should fix that. If we do, then there will be no need for |
ignore last comment. re-reading this and happy that execute/1, execute/3, handle_hit/3 exit when we've seen |
cc38c8f
to
c43f45b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding extra tests and types, @pgj!
I did notice however that now dialyzer complains more compared to main:
=> mango (dialyze)
src/mango_crud.erl:103:1: Function explain/3 has no local return
src/mango_cursor.erl:109:1: Function extract_candidate_indexes/1 has no local return
src/mango_cursor.erl:168:9: The call mango_cursor:tag_elems
(#{'ranking' := _, 'usable' := 'false'},
NotUsableIndexes :: sets:set(_)) breaks the contract
(properties(), sets:set(#idx{})) -> [{#idx{}, properties()}]
src/mango_cursor.erl:260:1: Function extract_selector_hints/1 will never be called
src/mango_cursor.erl:304:1: Function explain/1 has no local return
src/mango_cursor_text.erl:79:1: Function explain/1 has no local return
src/mango_cursor_text.erl:86:41: The call mango_cursor_text:get_partition
(Opts :: any(),
'null') breaks the contract
(Options, Partition) -> Partition
when Partition :: partition(), Options :: cursor_options()
src/mango_cursor_text.erl:93:1: Function execute/3 has no local return
src/mango_cursor_text.erl:106:41: The call mango_cursor_text:get_partition
(Opts :: any(),
'nil') breaks the contract
(Options, Partition) -> Partition
when Partition :: partition(), Options :: cursor_options()
src/mango_cursor_text.erl:153:1: Function execute/1 will never be called
src/mango_cursor_text.erl:171:1: Function search_docs/1 will never be called
src/mango_cursor_text.erl:187:1: Function handle_hits/2 will never be called
src/mango_cursor_text.erl:195:1: Function handle_hit/3 will never be called
src/mango_cursor_text.erl:242:1: Function apply_user_fun/2 will never be called
src/mango_cursor_text.erl:263:1: Function sort_query/2 will never be called
src/mango_cursor_text.erl:288:2: Invalid type specification for function mango_cursor_text:get_partition/2. The success typing is
(_, 'nil' | 'null') -> any()
src/mango_cursor_text.erl:300:1: Function get_bookmark/1 will never be called
src/mango_cursor_text.erl:309:1: Function update_bookmark/2 will never be called
src/mango_cursor_text.erl:318:1: Function pack_bookmark/1 will never be called
src/mango_cursor_text.erl:343:1: Function ddocid/1 will never be called
src/mango_cursor_text.erl:352:1: Function update_query_args/1 will never be called
src/mango_cursor_text.erl:363:1: Function get_limit/1 will never be called
src/mango_cursor_text.erl:374:1: Function get_json_docs/2 will never be called
src/mango_idx.erl:56:2: Invalid type specification for function mango_idx:get_usable_indexes/4. The success typing is
(_, _, maybe_improper_list(), _) ->
{[any()],
#{'all_indexes' := sets:set(_),
'global_indexes' := sets:set(_),
'partition_indexes' := sets:set(_),
'usability_map' := [any()],
'usable_indexes' := sets:set(_)}}
vs
% make dialyze apps=mango
==> mango (build-plt)
WARN: ''build-plt'' command does not apply to directory /Users/nvatama/asf/rel
WARN: ''build-plt'' command does not apply to directory /Users/nvatama/asf
==> mango (dialyze)
src/mango_crud.erl:103:1: Function explain/3 has no local return
src/mango_cursor.erl:109:1: Function extract_candidate_indexes/1 has no local return
src/mango_cursor.erl:168:9: The call mango_cursor:tag_elems
(#{'ranking' := _, 'usable' := 'false'},
NotUsableIndexes :: sets:set(_)) breaks the contract
(properties(), sets:set(#idx{})) -> [{#idx{}, properties()}]
src/mango_cursor.erl:260:1: Function extract_selector_hints/1 will never be called
src/mango_cursor.erl:304:1: Function explain/1 has no local return
src/mango_idx.erl:56:2: Invalid type specification for function mango_idx:get_usable_indexes/4. The success typing is
(_, _, maybe_improper_list(), _) ->
{[any()],
#{'all_indexes' := sets:set(_),
'global_indexes' := sets:set(_),
'partition_indexes' := sets:set(_),
'usability_map' := [any()],
'usable_indexes' := sets:set(_)}}
Maybe some type specs need adjusting and/or it's discovered some genuine bugs. Unless the tweaks to the type specs are minimal to fix to make the dialyzer happy, perhaps the extra types and associated fixes should be in a separate PR.
Also noticed during DocId
extraction we were not actually getting a DocId but instead a Doc with a single _id
field. See a few extra comments about it inline.
9ceb84c
to
ea4343e
Compare
Thanks @nickva for the comment on the type specifications. I forgot to check them with Dialyzer before marking the PR ready for review, although I believe most of them are logically right. Except for maybe the one about the partitions. The series of "will never be called" and "has no local return" warnings are probably due to the number of By taking your advice, I have removed the type specifications from this PR and I will create another one, possibly together with a refactor. |
index = Index, | ||
ranges = null, | ||
trace = Trace, | ||
selector = selector, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A selector here obviously doesn't matter and can be anything, however in general in tests, if possible it might be better use proper types even when they don't matter and when it's not too hard. So maybe a selector might be {[]}
. skip
should be perhaps 0 or some plausible integer like 53
etc. Don't have to change in the current PR just in general I think it's nicer to respect types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to go with this approach because the functions that receive these values will be mocked. Even if we would respect the types this way it does not matter much because the values of those types are not used for their original purpose — for example, the 0
integer becomes a simple token only. On the other hand, non-standard, out-of-type values can warrant that the mocked function was called with that specific instance and not by accident through some other functions.
Another and more important reason for using atoms in place of regular values is that if the code blows up somewhere while the test is running it means that given value was attempted for use while it should not have been. I admit that this might be a bit more nit-picky, but hey, why not to establish such guarantees in the tests or catch such potentially unintended changes?
class Concurrently(object): | ||
def __init__(self, thread, thread_args, start=True): | ||
self.thread = threading.Thread(target=self.wrapper, args=(thread, thread_args)) | ||
self.return_value = None | ||
if start: | ||
self.start() | ||
|
||
def wrapper(self, body, args): | ||
self.return_value = body(*args) | ||
|
||
def start(self): | ||
self.thread.start() | ||
|
||
def get_result(self): | ||
self.thread.join() | ||
return self.return_value | ||
|
||
def join(self): | ||
self.thread.join() | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nice abstraction!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, Just noticed a few minor style issues in tests.
We should also get @rnewson's opinion before merging. He is more familiar with this area of the codebase.
ea4343e
to
445a992
Compare
445a992
to
08190b5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch is neatly done, bu I have a doubt arond bookmarks.
The Hit
at line 176 is used to update the bookmark but it might be ignored at line 186
(as its a duplicate).
Prior to this PR the incoming Hit
its always right to update the bookmark, even if the selector doesn't match, as we know we don't want search results earlier than that (they'll fail the selector next time).
With this PR I worry that we'll regress the bookmark if a duplication occurs (a doc that appeared on page 1 that also appears on page 2 due to an update while paginating).
@rnewson I have checked the use of --- a/src/mango/src/mango_cursor_text.erl
+++ b/src/mango/src/mango_cursor_text.erl
@@ -184,17 +184,17 @@ handle_hit(CAcc0, Sort, Doc) ->
execution_stats = Stats,
documents_seen = Seen
} = CAcc0,
- CAcc1 = update_bookmark(CAcc0, Sort),
Stats1 = mango_execution_stats:incr_docs_examined(Stats),
couch_stats:increment_counter([mango, docs_examined]),
- CAcc2 = CAcc1#cacc{execution_stats = Stats1},
- case mango_selector:match(CAcc2#cacc.selector, Doc) of
+ CAcc1 = CAcc0#cacc{execution_stats = Stats1},
+ case mango_selector:match(CAcc1#cacc.selector, Doc) of
true ->
DocId = mango_doc:get_field(Doc, <<"_id">>),
case sets:is_element(DocId, Seen) of
true ->
- CAcc2;
+ CAcc1;
false ->
+ CAcc2 = update_bookmark(CAcc1, Sort),
CAcc3 = CAcc2#cacc{
documents_seen = sets:add_element(DocId, Seen)
},
@@ -216,7 +216,7 @@ handle_hit(CAcc0, Sort, Doc) ->
end
end;
false ->
- CAcc2
+ CAcc1
end. Did you mean something like this? (*) Except for |
I was unclear. I am questioning whether we should still update the bookmark when we detect a duplicate or not. with this PR we will. But I think we shouldn't? |
Yeah, that sounds like a good idea to me. My diff above does that exactly. Thanks for the clarification. |
095451b
to
b4847f3
Compare
It does seem wrong to update the bookmark with the duplicate IDs. We're effectively getting index results from a newer database snapshot than the one we opened in the request (the index got ahead of the database, so to speak). So, we wouldn't want to update the bookmark or do anything with the duplicate IDs except just ignore them. If we reworked these indexes to store document IDs and update sequences, when we read those documents we would check that |
Thanks @nickva for the comment, that is an important piece of information. |
perhaps it would be prudent to record |
@rnewson in the scope of this PR? I am not that experienced with |
no, not in this PR. I will take care of that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
I think this may be the simplest way to eliminate duplicates in the result.
However, we may still get index updates from an index that is now ahead of the db snapshot we opened in the current request. In other words, we're only papering over the symptoms here, so to speak. Changing that would require keeping track of sequences which in the context of Clouseau is trickier since it involves backward compatibility clauses.
When an interleaved update happens to a `text` index while it is being queried, the affected documents might appear duplicated in the collected results. That is because `search` results are paginated where the boundaries might move due to the updates and the document might show up once more on subsequent queries. Mitigate the situation by tracking and losing duplicated documents while the results are being streamed to the user.
This is a port of a fix applied to the `text` cursor where occasionally duplicated documents were filtered out. This happens when moving between pages and an interleaved update is applied.
b4847f3
to
0d128cf
Compare
@rnewson are okay with the PR? |
+1 |
When an interleaved update happens to a
text
index while it is being queried, the affected documents might appear duplicated in the collected results. That is becausesearch
results are paginated where the boundaries might move due to the updates and the document might show up once more on subsequent queries.Mitigate the situation by tracking and losing duplicated documents while the results are being streamed to the user.
In addition to that, the
mango
unit test coverage has increased:mango_cursor_text
: 0% ⟶ 100%Testing recommendations
make eunit apps=mango make mango-test MANGO_TEST_OPTS="24-text-paginated-test"
Note that
mango-test
requires a running Clouseau instance that is hooked up, whileeunit
does not.The change would remove duplicated documents from the response, which users will notice but ideally it should not be considered a breaking change but a welcome fix. There shall be no data loss.
Checklist