Sync writes/deletes on sub case search indices. #34673
Conversation
…b indices and add tests
@@ -1155,6 +1156,12 @@ def scroll(self, *args, **kw):
    def search(self, *args, **kw):
        return self.primary.search(*args, **kw)

    def _get_case_search_sub_index_docs(self, action):
This function seems to be at the wrong abstraction level. ElasticMultiplexAdapter should not need to know about any particular index by name except for the ones it's configured to multiplex via primary_adapter/secondary_adapter.
I agree with you. I had a similar thought while I was working on it.
The ElasticMultiplexer calls its own bulk method, not that of the primary and secondary adapters. We have separate classes for all other adapters, but for multiplexed indices every multiplexed adapter is an instance of ElasticMultiplexAdapter, which has access to the primary and secondary indices.
In this case we are actually dealing with another adapter inside the multiplex adapter, which is neither the primary nor the secondary index.
I traced the entire call stack to find where to hook the index request so that writes can be multiplexed to sub-indices, and this seemed to be the most appropriate place, so I ended up doing it this way.
This also seemed a bit odd to me. Looking at the implementation again, it seems the multiplex adapter doesn't actually call the indexing methods on the primary or secondary but instead uses its own logic, which is why changes need to be applied here as well.
But I do agree with @millerdev that it would be ideal if the multiplex adapter didn't have to know about this. One option could be to have _render_bulk_action return a list of actions instead of a single one. That would allow the primary adapter to generate the actions for both the main and sub index.
(I'm OK with the current implementation for the short term)
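The suggested refactor — having _render_bulk_action return a list of actions so one render call can target both the main index and a sub index — might look roughly like this. The names here (SUB_INDEX_BY_DOMAIN, render_bulk_actions) are illustrative assumptions, not the actual commcare-hq API:

```python
# Hedged sketch of _render_bulk_action returning a list of actions instead
# of a single one, so the primary adapter can emit actions for both the
# main index and any configured sub index.
SUB_INDEX_BY_DOMAIN = {'co-carecoordination-perf': 'case_search_bha'}

def render_bulk_actions(index_name, doc):
    action = {'_index': index_name, '_id': doc['_id'], '_source': doc}
    actions = [action]
    sub_index = SUB_INDEX_BY_DOMAIN.get(doc.get('domain'))
    if sub_index:
        # same action payload, routed to the configured sub index
        actions.append(dict(action, _index=sub_index))
    return actions
```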
@@ -68,20 +70,24 @@ def process_change(self, change):

    if change.deleted and change.id:
        doc = change.get_document()
        domain = doc.get('domain') if doc else None
        if not domain:
            meta = getattr(change, 'metadata')
dot-syntax is equivalent and preferred to using getattr:

    meta = getattr(change, 'metadata')   # current
    meta = change.metadata               # suggested
This is to ensure that if 'metadata' is None or not set, it will not error out.
To prevent an error, we would need to specify a default option in getattr in case the attribute does not exist.
Added default value in 2a81942 to prevent an error from being raised, which is the reason to use getattr here.
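The getattr behavior being discussed can be illustrated with a minimal sketch; the class names are illustrative stand-ins for pillow change objects, not real HQ classes:

```python
# Two-argument getattr raises AttributeError when the attribute is missing
# (just like dot-syntax); the three-argument form returns the default
# instead, which is the reason getattr with a default is used here.
class BareChange:
    """A change object with no metadata attribute."""

class ChangeWithMeta:
    metadata = {'domain': 'example-domain'}

meta = getattr(BareChange(), 'metadata', None)       # None, no error raised
meta2 = getattr(ChangeWithMeta(), 'metadata', None)  # the metadata dict
```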
I do still feel a bit uncomfortable with the prospect of domain maybe being missing - that means when that situation arises, we won't multiplex.
Possible alternatives that come to mind are:
- We identify specific circumstances where that occurs, and make available an explicit signal that it's okay to not multiplex in this instance (assuming it is okay)
- Since this is only for deletions, could we propagate the deletion to all indices? Assuming it noops if the doc isn't present, that seems like it'd be in line with our integrity model.
We may also want to log an exception even if we don't fail hard.
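The second alternative above — propagating the deletion to all indices and relying on delete being a no-op when the doc is absent — can be sketched as follows. The classes and names here are illustrative, not the actual adapter code:

```python
# Sketch of "propagate deletes to all indices": deleting a doc that isn't
# present is a no-op, so issuing the delete everywhere is safe and stays
# in line with the integrity model described above.
class FakeIndex:
    def __init__(self):
        self.docs = {}

    def delete(self, doc_id):
        # No-op when absent, mirroring a 404-tolerant Elasticsearch delete;
        # returns whether the doc was actually present.
        return self.docs.pop(doc_id, None) is not None

def propagate_delete(indices, doc_id):
    """Delete doc_id from every index; return how many actually held it."""
    return sum(1 for ix in indices if ix.delete(doc_id))
```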
I added f84af4a to at least log an error if we attempt to delete a case from case search without specifying a domain. Based on a Slack conversation with Ethan, it seems pretty unlikely that we will find ourselves in a scenario where the domain is None. Currently on production there aren't any cases with a domain set to None, which means any soft deletion will have a domain available, since the Postgres doc is still available to query. For hard deletions, where the Postgres doc is no longer available, we can be fairly confident that the ChangeMeta will be populated with the domain based on publish_case_deleted. This is because our hard deletion logic only deletes a case if the provided case_id and domain match; so if the caller didn't specify a domain, the case deletion would fail, and as stated previously, we don't currently have any cases where a domain is None.
First pass review. I can make the minor changes needed so that this PR is ready for Monday.
Approving with the rationale that if any issues around writing to the BHA index come up after merging, that is still low risk as we can stop reading from that index by updating a toggle.
(My approval is worthless to GitHub since I made changes here too)
A couple of nits and questions; I am not super familiar with the adapter methods, but this all looks reasonable.
""" | ||
The CASE_SEARCH_SUB_INDICES should look like this: | ||
{ | ||
'co-carecoordination-perf': { |
nit: rather than giving an example, this would be more helpful if it explained what the fields do/mean. For example, I think the first is a domain, but only because I recognize that domain, and I am not sure what the other two fields control from looking here.
@@ -97,7 +105,7 @@ def process_change(self, change):
        return

    if doc.get('doc_type') is not None and doc['doc_type'].endswith("-Deleted"):
-        self._delete_doc_if_exists(change.id)
+        self._delete_doc_if_exists(change.id, domain=doc.get('domain'))
why does this not need to check the change meta like the above did?
I think it is because this isn't a deletion, and therefore there should still be a valid DB object when we call change.get_document().
The CASE_SEARCH_SUB_INDICES should look like this:
{
    'co-carecoordination-perf': {
        'index_cname': 'case_search_bha',
where is the cname mapped to an actual index name?
The case_search_bha_adapter effectively maps cname to the actual index name. So using the cname defined in settings, we can grab the adapter as done here: https://github.com/dimagi/commcare-hq/pull/34673/files#diff-7ba2ac2cd9c42581b497ab0c7cb5ea77288d3a8d7e8583cfc2c35a4da9e063b6R232
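A rough sketch of that cname-to-adapter mapping: the registry, adapter class, and concrete index name below are illustrative assumptions, not the actual commcare-hq code.

```python
# Hedged sketch: an adapter effectively maps a cname to the actual index
# name, so given the cname from settings we can look up the adapter and,
# through it, the real index. All names here are made up for illustration.
ADAPTERS_BY_CNAME = {}

class Adapter:
    def __init__(self, cname, index_name):
        self.cname = cname
        self.index_name = index_name
        ADAPTERS_BY_CNAME[cname] = self  # register on construction

case_search_bha_adapter = Adapter('case_search_bha', 'case-search-bha-example')

def index_name_for_cname(cname):
    """Resolve a settings cname to the adapter's actual index name."""
    return ADAPTERS_BY_CNAME[cname].index_name
```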
if isinstance(doc, dict):
    return doc["domain"]
if hasattr(doc, 'domain'):
    return doc.domain
should this raise an exception if neither case is true?
I stumbled upon a case where this is also used by deletes, where we don't have a doc but have doc_ids.
So I have just checked for these cases, assuming all valid case search objects should have a domain. If they don't have a domain they can be safely ignored, as they don't belong to the BHA index.
if hasattr(doc, 'domain'):
    return doc.domain

def index(self, doc, refresh=False):
Do the 'update' and 'delete' methods also need similar treatment?
Deletes are handled a level up in pillows code and we don't use updates with the case-search index.
Acknowledging that we will follow up on this to find a better long-term solution, but we want to get a first implementation of multiplexing merged so that we can test using the new index. For that reason, I'm going to merge this PR.
Product Description
This PR builds on #34663 and adds the ability to sync writes and deletes to sub case search indices.
It uses a Django setting called CASE_SEARCH_SUB_INDICES. This setting will be used to decide which sub-index the duplicate writes should be routed to.
This PR in some ways combines commits from #34602, with certain modifications to make the changes less error-prone.
Review by commit 🐡
Feature Flag
NA
Safety Assurance
Most of the changes (except the pillow delete) are covered by tests and have been tested locally by running case-pillow, submitting forms, and ensuring that data is present in both indices, then archiving the form to ensure that the case is deleted from both case search indices.
Another thing to note here is that the changes are added in such a way that if the setting is not enabled for the domain, they should have no effect. I am also putting this on staging to test out the changes with co-carecoordination-test on staging.
Automated test coverage
Tests are added for most of the changes.
QA Plan
Given the nature of the change, it would be a good idea to get QA done, but given the time sensitivity of the work, I am not sure if we should.
Rollback instructions
Labels & Review