Description
Parent/child queries have an undesirable property: given a parent/child query Q, updating a document in segment A might change the set of matching documents in another segment B.
This is an issue because it means that parent/child queries and filters cannot be cached per segment, so we had to add logic to make sure these queries don't get cached, either directly or as part of a cached parent filter (e.g. under a cached bool filter). The propagation logic can be a bit fragile, so I think we should work on a better fix.
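To make the cross-segment dependency concrete, here is a toy sketch in Python. It is not Lucene or Elasticsearch code: segments are modeled as plain lists of dicts, and `has_child_matches`, `is_blue`, and the document fields are hypothetical names chosen for illustration. The update is simplified to replacing the contents of segment A.

```python
# Toy model only: a "segment" is a list of dicts, not a Lucene segment.
def has_child_matches(segment, all_segments, child_predicate):
    """Return ids of parents in `segment` that have a matching child in ANY segment."""
    matching_parent_ids = {
        doc["parent"]
        for seg in all_segments
        for doc in seg
        if doc["type"] == "child" and child_predicate(doc)
    }
    return sorted(
        doc["id"]
        for doc in segment
        if doc["type"] == "parent" and doc["id"] in matching_parent_ids
    )


def is_blue(doc):
    # Hypothetical child query: match children whose color is blue.
    return doc["color"] == "blue"


# The parent lives in segment B, its child lives in segment A.
segment_b = [{"id": "p1", "type": "parent"}]
segment_a = [{"id": "c1", "type": "child", "parent": "p1", "color": "red"}]

before = has_child_matches(segment_b, [segment_a, segment_b], is_blue)

# Simplified update: only the child in segment A changes, segment B is untouched.
segment_a_updated = [{"id": "c1", "type": "child", "parent": "p1", "color": "blue"}]

after = has_child_matches(segment_b, [segment_a_updated, segment_b], is_blue)
```

Segment B is byte-for-byte identical in both calls, yet its set of matches differs, so a result cached for segment B alone would be stale after the update.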
One idea could be to change the abstraction we use to match documents from a single Lucene query to something that could perform several Lucene queries. For instance, in the case of has_child, we could have a first query that collects parent ids and then build a new query based on these ids. This is the same execution logic, but each query on its own would depend solely on data that is stored in the current segment, so they would be cacheable (even though it might not be a good idea to cache them).
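The two-step idea could look roughly like the following toy sketch in Python. It is only an illustration under assumptions: segments are plain lists of dicts rather than Lucene segments, and `collect_parent_ids`, `ids_query`, and `has_child` are hypothetical names, not existing Elasticsearch or Lucene APIs.

```python
# Toy model only: a "segment" is a list of dicts, not a Lucene segment.
def collect_parent_ids(segments, child_predicate):
    """Phase 1: run the child query on each segment and collect parent ids."""
    parent_ids = set()
    for segment in segments:
        parent_ids.update(
            doc["parent"]
            for doc in segment
            if doc["type"] == "child" and child_predicate(doc)
        )
    return frozenset(parent_ids)


def ids_query(segment, parent_ids):
    """Phase 2: a plain id query. Its result for a segment depends only on
    that segment's documents and on the explicit id set it was built from."""
    return [
        doc["id"]
        for doc in segment
        if doc["type"] == "parent" and doc["id"] in parent_ids
    ]


def has_child(segments, child_predicate):
    """Chain the two phases; returns one list of matching parent ids per segment."""
    parent_ids = collect_parent_ids(segments, child_predicate)
    return [ids_query(segment, parent_ids) for segment in segments]


segments = [
    [
        {"id": "c1", "type": "child", "parent": "p1", "color": "blue"},
        {"id": "p2", "type": "parent"},
    ],
    [
        {"id": "p1", "type": "parent"},
        {"id": "c2", "type": "child", "parent": "p2", "color": "red"},
    ],
]

result = has_child(segments, lambda doc: doc["color"] == "blue")
```

In this sketch the second-phase query carries the collected ids as part of its own definition, so a changed child produces a different query rather than silently changing the result of the same query on an unchanged segment.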