
Sync writes/deletes on sub case search indices. #34673

Merged

merged 10 commits into master from ap/sync-sub-indices on May 26, 2024
Conversation

AmitPhulera
Contributor

@AmitPhulera AmitPhulera commented May 23, 2024

Product Description

This PR builds on #34663 and adds the ability to sync writes and deletes to sub-case search indices.
It uses a Django setting called CASE_SEARCH_SUB_INDICES, which will look something like:

{
    'co-carecoordination-perf': {
        'index_cname': 'case_search_bha',
        'multiplex_writes': True,
    }
}

This setting is used to decide which sub-index duplicate writes should be routed to.
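As a rough illustration of how such a setting could drive routing (the helper name and lookup logic here are hypothetical, not the PR's actual implementation):

```python
# Hypothetical sketch of routing duplicate writes based on the
# CASE_SEARCH_SUB_INDICES setting shown above; the helper name and
# structure are illustrative, not the PR's actual code.
CASE_SEARCH_SUB_INDICES = {
    'co-carecoordination-perf': {
        'index_cname': 'case_search_bha',
        'multiplex_writes': True,
    }
}


def sub_index_cname_for_domain(domain):
    """Return the sub-index cname if writes for this domain should be duplicated."""
    config = CASE_SEARCH_SUB_INDICES.get(domain, {})
    if config.get('multiplex_writes'):
        return config['index_cname']
    return None
```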

This PR combines commits from #34602, with some modifications to make the changes less error-prone.

Review by commit 🐡

Feature Flag

NA

Safety Assurance

Most of the changes (except the pillow delete) are covered by tests and have been tested locally by running case-pillow, submitting forms, and verifying that the data is present in both indices, then archiving the form to verify that the case is deleted from both case-search indices.

Another thing to note is that the changes are added in such a way that if the setting is not enabled on the domain, they have no effect. I am also putting this on staging to test the changes with co-carecoordination-test.

Automated test coverage

Tests are added for most of the changes.

QA Plan

Given the nature of the change it would be a good idea to get QA done, but given the time sensitivity of the work, I am not sure if we should.

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

@AmitPhulera AmitPhulera mentioned this pull request May 23, 2024
4 tasks
@AmitPhulera AmitPhulera requested a review from snopoke May 23, 2024 13:44
@@ -1155,6 +1156,12 @@ def scroll(self, *args, **kw):
def search(self, *args, **kw):
    return self.primary.search(*args, **kw)

def _get_case_search_sub_index_docs(self, action):
Contributor

This function seems to be at the wrong abstraction level. ElasticMultiplexAdapter should not need to know about any particular index by name except for the ones it's configured to multiplex via primary_adapter/secondary_adapter.

Contributor Author

I agree with you. I had a similar thought while I was working on it.

The ElasticMultiplexer calls its own bulk method, not the primary or secondary adapter's. We have separate classes for all other adapters, but every multiplexed adapter is an instance of ElasticMultiplexAdapter, which has access to the primary and secondary indices. In this case we are actually dealing with another adapter inside the multiplex adapter that is neither the primary nor the secondary index.

I traced the entire call stack to find where to hook into the index request so that writes could be multiplexed to sub-indices, and this seemed to be the most appropriate place. So I ended up doing it this way.

Contributor

This also seemed a bit odd to me. Looking at the implementation again, it seems the multiplex adapter doesn't actually call the indexing methods on the primary or secondary but instead uses its own logic, which is why changes need to be applied here as well.

But I do agree with @millerdev that it would be ideal if the multiplex adapter didn't have to know about this. One option could be to have _render_bulk_action return a list of actions instead of a single one. That would allow the primary adapter to generate the actions for both the main and sub index.

(I'm OK with the current implementation for the short term)
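A minimal sketch of the alternative suggested above, with the render step returning a list of bulk actions, one per target index (all names and the action dict shape here are assumed for illustration, not the PR's actual code):

```python
# Sketch of the suggestion above: have the primary adapter render a *list*
# of bulk actions, one per target index, so the multiplex adapter needs no
# special knowledge of sub-indices. Names and shapes are illustrative.
def render_bulk_actions(doc, index_name, sub_indices_by_domain):
    def action(index):
        return {
            '_op_type': 'index',
            '_index': index,
            '_id': doc['_id'],
            '_source': doc,
        }

    actions = [action(index_name)]
    # Duplicate the write to a configured sub-index, if any, for this domain
    sub_index = sub_indices_by_domain.get(doc.get('domain'))
    if sub_index:
        actions.append(action(sub_index))
    return actions
```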

@@ -68,20 +70,24 @@ def process_change(self, change):

if change.deleted and change.id:
    doc = change.get_document()
    domain = doc.get('domain') if doc else None
    if not domain:
        meta = getattr(change, 'metadata')
Contributor

dot-syntax is equivalent and preferred to using getattr

Suggested change:
- meta = getattr(change, 'metadata')
+ meta = change.metadata

Contributor Author

This is to ensure that if 'metadata' is None or not set it will not error out.

Contributor

To prevent an error, we would need to specify a default option in getattr in case the attribute does not exist.

Contributor

Added default value in 2a81942 to prevent an error from being raised, which is the reason to use getattr here.
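For context, a standalone illustration of the distinction discussed in this thread:

```python
# getattr without a default raises AttributeError just like dot-syntax does;
# only getattr with a default degrades gracefully when the attribute is absent.
class Change:
    pass


change = Change()

try:
    change.metadata              # dot-syntax: raises AttributeError
except AttributeError:
    dot_syntax_raised = True

try:
    getattr(change, 'metadata')  # no default: also raises AttributeError
except AttributeError:
    getattr_raised = True

meta = getattr(change, 'metadata', None)  # with default: returns None instead
```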

Contributor

I do still feel a bit uncomfortable with the prospect of domain maybe being missing - that means when that situation arises, we won't multiplex.

Possible alternatives that come to mind are:

  1. We identify specific circumstances where that occurs, and make available an explicit signal that it's okay to not multiplex in this instance (assuming it is okay)
  2. Since this is only for deletions, could we propagate the deletion to all indices? Assuming it noops if the doc isn't present, that seems like it'd be in line with our integrity model.

We may also want to log an exception even if we don't fail hard.

Contributor

I added f84af4a to at least log an error if we attempt to delete a case from case search without specifying a domain. Based on a slack conversation with Ethan, it seems pretty unlikely that we will find ourselves in a scenario where the domain is None. Currently on production, there aren't any cases with a domain set to None, which means any soft deletion will have a domain available since the postgres doc is still available to query. For hard deletions where the postgres doc is no longer available, we can be fairly confident that the ChangeMeta will be populated with the domain based on publish_case_deleted. This is because our hard deletion logic only deletes a case if the provided case_id and domain match, so if the caller didn't specify a domain, the case deletion would fail, and as stated previously, we don't currently have any cases where a domain is None.

Contributor

@gherceg gherceg left a comment

First pass review. I can make the minor changes needed so that this PR is ready for Monday.

Contributor

@gherceg gherceg left a comment

Approving with the rationale that if any issues around writing to the BHA index come up after merging, that is still low risk as we can stop reading from that index by updating a toggle.

(My approval is worthless to github since I made changes here too)

Contributor

@calellowitz calellowitz left a comment

A couple of nits and questions. I am not super familiar with the adapter methods, but this all looks reasonable.

"""
The CASE_SEARCH_SUB_INDICES should look like this:
{
    'co-carecoordination-perf': {
Contributor

nit: rather than giving an example, this would be more helpful if it explained what the fields do/mean. For example, I think the first is a domain, but only because I recognize that domain, and I am not sure what the other two fields control from looking here.

@@ -97,7 +105,7 @@ def process_change(self, change):
return

if doc.get('doc_type') is not None and doc['doc_type'].endswith("-Deleted"):
    self._delete_doc_if_exists(change.id)
    self._delete_doc_if_exists(change.id, domain=doc.get('domain'))
Contributor

why does this not need to check the change meta like the above did?

Contributor

I think it is because this isn't a deletion, and therefore there should still be a valid DB object when we call change.get_document().

The CASE_SEARCH_SUB_INDICES should look like this:
{
'co-carecoordination-perf': {
'index_cname': 'case_search_bha',
Contributor

where is the cname mapped to an actual index name?

Contributor

The case_search_bha_adapter effectively maps cname to the actual index name. So using the cname defined in settings, we can grab the adapter as done here: https://github.com/dimagi/commcare-hq/pull/34673/files#diff-7ba2ac2cd9c42581b497ab0c7cb5ea77288d3a8d7e8583cfc2c35a4da9e063b6R232

if isinstance(doc, dict):
    return doc["domain"]
if hasattr(doc, 'domain'):
    return doc.domain
Contributor

should this raise an exception if neither case is true?

Contributor Author

I stumbled upon a case where this is also used by deletes, where we don't have a doc but have doc_ids.
So I have just checked for these cases, assuming all valid case search objects have a domain. If they don't have a domain, they can be safely ignored as they don't belong to the BHA index.
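That behavior could be sketched as follows (a hedged illustration, not the PR's exact helper; `.get` is used here to avoid a KeyError on dicts without a domain):

```python
# Sketch of the behavior described above: return the domain when available,
# otherwise None, so docs without a domain are skipped rather than raising.
def get_domain(doc):
    if isinstance(doc, dict):
        return doc.get("domain")
    if hasattr(doc, 'domain'):
        return doc.domain
    return None
```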

if hasattr(doc, 'domain'):
    return doc.domain

def index(self, doc, refresh=False):
Contributor

Do the 'update' and 'delete' methods also need similar treatment?

Contributor Author

Deletes are handled a level up in the pillow code, and we don't use updates with the case-search index.

@gherceg
Contributor

gherceg commented May 26, 2024

Acknowledging that we will follow up on this to find a better long-term solution, but we want to get a first implementation of multiplexing merged so that we can test using the new index. For that reason, I'm going to merge this PR.

@gherceg gherceg merged commit 63a9782 into master May 26, 2024
13 checks passed
@gherceg gherceg deleted the ap/sync-sub-indices branch May 26, 2024 22:51