delete doesn't sync if using relate #473

Closed
sachinnagesh opened this issue Jan 5, 2021 · 10 comments

@sachinnagesh

sachinnagesh commented Jan 5, 2021

Hi @rwynn, I have a scenario similar to #150.
I am using:

monstache 6.7.0
mongodb version mongo:4.2.9
elasticsearch:7.7.0

My config looks something like this,

mongo-url =""
elasticsearch-urls =[]
direct-read-namespaces=["user-db.users_view"] #copy view to es index completely
change-stream-namespaces=["user-db.users","user-db.user_address_details"]

gzip = true
stats = true
index-stats = true
elasticsearch-max-conns = 2
elasticsearch-max-docs = 1000
dropped-collections = false
dropped-databases = false
replay = false
resume = true
resume-write-unsafe = false
resume-name = "default"
resume-strategy = 1
file-highlighting = true
verbose = true
cluster-name = "MONSTACHE_CLUSTER"
exit-after-direct-reads = false
direct-read-split-max = -1
direct-read-stateful = true
elasticsearch-retry = true
prune-invalid-json = true

[gtm-settings]
buffer-duration = "100ms"

[[mapping]]
namespace = "user-db.users_view"
index = "user-db.users_view-index"

[[relate]]
namespace = "user-db.users"
with-namespace = "user-db.users_view"
keep-src = false

[[relate]]
namespace = "user-db.user_address_details"
with-namespace = "user-db.users"
src-field = "field1"
match-field = "field1"
keep-src = false

The view 'user-db.users_view' is built on user-db.users and user-db.user_address_details. One user can have multiple address records. Adding a new address record or updating an existing one works fine. But when an address record is deleted, the corresponding record in the index user-db.users_view-index is not updated.

@rwynn
Owner

rwynn commented Jan 5, 2021

Hi @sachinnagesh, monstache attempts to propagate a delete event when relating using this function. Are you able to trace what is not happening correctly in that function?

@rwynn
Owner

rwynn commented Jan 5, 2021

If you are not relating by _id, this may not work unless you set keep-src on all user_address_details documents so that they can be looked up for matching. Once a document is deleted in MongoDB, the information needed to make the connection is lost.

@sachinnagesh
Author

sachinnagesh commented Jan 6, 2021

@rwynn I am not relating by _id. I tried turning keep-src on; in that case the whole record is deleted from user-db.users_view-index, but I am expecting an update.

My record looks something like this

user-db.users

{
	"_id" : "12345qwerty",
	"field1" : "11111",
	"fname" : "Sachin",
	"lname" : "N"
}

user-db.user_address_details

{
	"_id" : "998877",
	"field1" : "11111", //this will match with field1 of record in user-db.users collection with which it belong
	"street" : "NDA road",
	"building_name" : "Shree Niwas",
	"pincode" : 223311
}
{
	"_id" : "990022",
	"field1" : "11111", //this will match with field1 of record in user-db.users collection with which it belong
	"street" : "FC Road",
	"building_name" : "Venture",
	"pincode" : 431122
}

I created a view in which the record looks like this:
user-db.users_view

{
	"_id" : "12345qwerty",
	"field1" : "11111",
	"fname" : "Sachin",
	"lname" : "N",
	"address_details" : [
		{
			"_id" : "998877",
			"field1" : "11111", //this will match with field1 of record in user-db.users collection with which it belong
			"street" : "NDA road",
			"building_name" : "Shree Niwas",
			"pincode" : 223311
		},
		{
			"_id" : "990022",
			"field1" : "11111", //this will match with field1 of record in user-db.users collection with which it belong
			"street" : "FC Road",
			"building_name" : "Venture",
			"pincode" : 431122
		}
	]
}

Suppose I delete the record with _id 990022 from the user-db.user_address_details collection. In this case the record in user-db.users_view looks like this:

{
	"_id" : "12345qwerty",
	"field1" : "11111",
	"fname" : "Sachin",
	"lname" : "N",
	"address_details" : [
		{
			"_id" : "998877",
			"field1" : "11111", //this will match with field1 of record in user-db.users collection with which it belong
			"street" : "NDA road",
			"building_name" : "Shree Niwas",
			"pincode" : 223311
		}
	]
}

I expect the same change to be applied to user-db.users_view-index in Elasticsearch, but nothing is updated with my config.

@rwynn
Owner

rwynn commented Jan 6, 2021

When you delete something in MongoDB, the change event only carries the _id of that document, not the entire document as with insert/update. Monstache will detect that you are not relating on _id (the best case) and fall back to looking up the document in Elasticsearch by _id to get the field1 value before it is deleted there as well. This requires all your addresses to be indexed in Elasticsearch.

If this succeeds it will queue a resync of the related document after looking it up by the value of field1, in your case the user.
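
To make the flow described here concrete, the following is a simplified in-memory model, not monstache's actual code. The function name and the plain-object "index" are hypothetical and only illustrate why the deleted address must still be present in Elasticsearch for the lookup to succeed.

```javascript
// In-memory model only; names are illustrative, not monstache internals.
// deleteEvent      : a MongoDB delete change event (carries only documentKey)
// indexedAddresses : stand-in for the Elasticsearch copies, keyed by _id
// users            : stand-in for the related collection (user-db.users)
function relatedIdsForDelete(deleteEvent, indexedAddresses, users, srcField, matchField) {
  // A delete change event has no document body, only the key.
  var id = deleteEvent.documentKey._id;

  // Fallback: recover the join value from the copy still held in the index.
  var indexed = indexedAddresses[id];
  if (!indexed) {
    // Corresponds to the "no hits for deleted document" situation:
    // without the indexed copy there is no way to find the related user.
    return { error: "no hits for deleted document " + id, resync: [] };
  }

  // Find every related document whose match field equals the join value.
  var joinValue = indexed[srcField];
  var resync = users
    .filter(function (u) { return u[matchField] === joinValue; })
    .map(function (u) { return u._id; });

  return { error: null, resync: resync };
}
```

With the sample data from this thread, deleting address 990022 resolves to user 12345qwerty only if the address is still indexed. If addresses were never indexed (for example because keep-src is off), the lookup fails and nothing is queued.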

I would need more information about what is failing in findDeletedSrcDoc. Are any of the errors from that function being reported?

@rwynn
Owner

rwynn commented Jan 6, 2021

A more intrusive option for your application: when you delete an address, also perform a multi-document update of a timestamp on all user documents where user.field1 == address.field1.
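
A minimal sketch of that multi-document update, written in mongosh style where collection calls are synchronous. The function name and the `touched_at` field are assumptions made for illustration; any field that changes on the user document should serve the same purpose. The idea is that the update emits an ordinary update event on user-db.users, which the existing [[relate]] from user-db.users to user-db.users_view should then pick up.

```javascript
// mongosh-style sketch (collection calls are synchronous in the shell).
// `touched_at` is a hypothetical field name, not something monstache requires.
function deleteAddressAndTouchUsers(db, addressId) {
  var addresses = db.getCollection("user_address_details");
  var users = db.getCollection("users");

  // Read the join key first: the delete change event will only carry _id.
  var address = addresses.findOne({ _id: addressId });
  if (!address) {
    return { deleted: 0, touched: 0 };
  }

  var del = addresses.deleteOne({ _id: addressId });

  // Multi-document update: every user sharing field1 gets a fresh timestamp,
  // which produces normal update events on the users collection.
  var upd = users.updateMany(
    { field1: address.field1 },
    { $set: { touched_at: new Date() } }
  );

  return { deleted: del.deletedCount, touched: upd.modifiedCount };
}
```

In mongosh this could be called as `deleteAddressAndTouchUsers(db.getSiblingDB("user-db"), "990022")`. The delete and the update are two separate operations, so an application that needs them to succeed or fail together would have to wrap them in a transaction.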

@sachinnagesh
Author

@rwynn Thank you. We went with the multi-document update.

@irfanbacker

irfanbacker commented Jun 14, 2021

Hi @rwynn. I have the same issue, even though I have indexed both collections in es. For some reason, there is still an error saying 'No hits for deleted document' with an object ID which I suppose is the ID of the state.

ERROR 2021/06/14 17:08:54 Found no hits for deleted document 60c73d2e784805481c486342

Config:

direct-read-namespaces = ["test.ideas", "test.events", "test.votes"]

change-stream-namespaces = ["test.ideas", "test.events", "test.votes"]

[[mapping]]
namespace = "test.ideas"
index = "ideas"

[[mapping]]
namespace = "test.events"
index = "events"

[[relate]]
namespace = "test.votes"
with-namespace = "test.ideas"
src-field = "ideaId"
match-field = "_id"

[[script]]
namespace = "test.ideas"
script = """
module.exports = function(idea) {
    if (idea.eventId) {
        idea.eventId = findId(idea.eventId, {
          database: "test",
          collection: "events"
        });
        idea.eventId.description = undefined;
        idea.eventId.__v = undefined;
    }
    var votes = pipe([
      { $match: {ideaId: idea._id} },
      { $count: 'count' },
    ],{ database: "test", collection: "votes"});
    if(votes.length != 0) idea.votes = votes[0].count;
    else idea.votes = 0;
    return idea;
}
"""

Also, for some reason, additional indices with names of the form 'objectID_namespace' are being created in es for all collections, along with an error in the bulk response:

ERROR 2021/06/14 16:55:50 Bulk response item: {"_index":"60b487f58f9bae631d0dd39d_test.votes","_type":"_doc","_id":"60bf503a66c184292cb9620d","_version":6971659866272169994,"result":"not_found","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":8,"_primary_term":1,"status":404}

yellow open 60b487f58f9bae631d0dd39d_test.events VSS96kVyTJOGOHzFydHitg 1 1 4 1 13.4kb 13.4kb
yellow open monstache.stats.2021-06-14 SqbBstDmRs6FKgM46GJ3IA 1 1 28 0 18.4kb 18.4kb
yellow open 60b487f58f9bae631d0dd39d_test.ideas 63Rllz2bQSCwliJJRVdzHg 1 1 13 2 30.2kb 30.2kb
yellow open 60b487f58f9bae631d0dd39d_test.votes 8mbpjBsUT7-0aq1jzkd0Qg 1 1 7 14 11kb 11kb
yellow open ideas x3CnElw_Tke-qzc_nuFPig 1 1 13 1 57.4kb 57.4kb
yellow open test.votes VZyuvI97Qv-5_xBqR1i3Sg 1 1 15 1 11.9kb 11.9kb
yellow open events tAdx49i4Rk-mCgkgwmg76w 1 1 9 0 16.1kb 16.1kb

Please clarify if I'm doing something wrong. Also, I didn't understand what you were referring to as a multi-document update.

@rwynn
Owner

rwynn commented Jun 15, 2021

@irfanbacker regarding the index name in Elasticsearch, are you by chance using MongoDB Atlas free tier (shared cluster)? There was a previous issue which is very similar.

I filed a ticket with MongoDB a long time ago and it did get resolved at one point, but I am not sure whether there has been a regression.

@irfanbacker

irfanbacker commented Jun 15, 2021

@rwynn, thanks for the quick reply. I am using the free tier of Atlas. Unlike that issue, I wasn't using resume, only replay.
I tried running monstache without specifying the change-stream namespaces directly and, for some reason, the extra indices with an objectID in their name stopped being created. But the problems still persist. I now realise that the problem isn't related to [[relate]] at all, because deletion still had problems even after I removed the [[relate]] configs. While relate was configured, it showed this error when deleting the document:
ERROR 2021/06/15 10:07:26 Found no hits for deleted document 60c82e4e784805481c486345
When I removed the [[relate]] configs it didn't show any error, but it still didn't delete the document from es.

Also, after I removed the change-stream namespaces, the bulk response item error persisted, but now with a proper index name rather than one containing an objectID:

ERROR 2021/06/15 10:13:38 Bulk response item: {"_index":"test.votes","_type":"_doc","_id":"60c82e0a784805481c486344","_version":6973874828151357462,"result":"not_found","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":24,"_primary_term":1,"status":404}

The issue you mentioned still occurs when resume is true. I'll try looking into that as well, but there are other problems too, as mentioned above.

EDIT: Just realised that the collection test.votes and some others aren't being fully synced on replay. Should this be filed as a separate issue?

@irfanbacker

irfanbacker commented Jun 15, 2021

@rwynn
I looked a bit further into what is happening. I was using replay = true together with change-stream-namespaces. I think that was the cause of most of the errors mentioned above.

Now everything works perfectly except on the test.votes collection, where relate is configured. The same 'no hits' error showed up on delete because inserts weren't being written to es. This was caused by not setting keep-src = true for relate; I thought it was true by default. Adding that fixed the error. Using resume still creates the additional indices, which could be a separate issue related to Atlas. But the relate trigger isn't running now, with or without resume, which takes us back to the issue in this thread.

I have set keep-src to true, but the related document still isn't synced on deletion. There is no error output on the console. With verbose = true I can see that the relate isn't triggered at all: when the doc is deleted, only a request to delete that test.votes doc is sent. No request to update the test.ideas doc is sent.
