Enable users to control collections during merging #190
Comments
Proposal: allow a {
"options": {
"algorithms": {
"collectionStrategy": {
"function": "myCollectionBuilder",
"at": "/some/dir/myCode.sjs"
}
}
}
}
The collection strategy function could get an object (SJS) or map (XQuery) where the keys are the URIs of the source documents and the values are the collections that each source document is in. The function would return an array (SJS) or sequence (XQuery) of strings with the names of the collections to add to the new document. The default strategy (as happens now) would be the union of source document collections plus
Matching already allows callers to specify a filter-query to narrow down which documents should be considered for matching. I think you're also saying we should just rely on the filter query and forget the requirement that docs be in the |
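The collection strategy function proposed above could look roughly like the following in SJS-style JavaScript. This is only a sketch: the thread does not define the exact argument shape or how the module would be exported, so the plain-object input, the exclusion of "toBeMastered", and the added "masterPerson" collection are assumptions for illustration (the two collection names are borrowed from the workflow in the issue description).

```javascript
// Hypothetical custom collection strategy, following the proposal's description.
// Assumed input: a plain object whose keys are source document URIs and whose
// values are arrays of the collections each source document is in.
function myCollectionBuilder(sourceCollections) {
  // Illustrative choice: collections that should not be carried forward.
  const excluded = new Set(["toBeMastered"]);
  const result = new Set();

  for (const uri of Object.keys(sourceCollections)) {
    for (const coll of sourceCollections[uri]) {
      if (!excluded.has(coll)) {
        result.add(coll);
      }
    }
  }

  // Illustrative choice: a custom collection for the merged document.
  result.add("masterPerson");

  // Return the names of the collections to add to the new document.
  return Array.from(result);
}

const collections = myCollectionBuilder({
  "/source/1.json": ["toBeMastered", "crm"],
  "/source/2.json": ["toBeMastered", "billing"]
});
console.log(collections); // [ 'crm', 'billing', 'masterPerson' ]
```

A function like this replaces the default union behavior only for the new document; how collections on the source documents are handled is a separate concern in the thread.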
The collection strategy should also control the collections applied when archiving. Currently when merging, all source docs get put into the |
Configuration will control which collections are applied to documents at various times, and will be part of the merge options.
<algorithms xmlns="http://marklogic.com/smart-mastering/merging">
<collections>
<on-merge function="union" at="/some/dir/code.xqy" ns="some-namespace"/>
<on-archive function="remove-content-coll" at="/some/dir/code.xqy" ns="some-namespace"/>
<on-no-match function="add-content-coll" at="/some/dir/code.xqy" ns="some-namespace"/>
<on-notification function="add-notification-coll" at="/some/dir/code.xqy" ns="some-namespace"/>
</collections>
</algorithms>
For each type of content strategy, we'll define an API that can be used to make custom strategies. |
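One way to picture the four configured events is as a lookup from event name to strategy function. The sketch below is not the Smart Mastering API: the function signature, the `collectionsFor` helper, and the "content" and "notification" collection names are all hypothetical, and each strategy's behavior is only inferred from the function names in the configuration above.

```javascript
// Hypothetical collection names; the thread does not specify them.
const CONTENT_COLL = "content";
const NOTIFICATION_COLL = "notification";

// Union of all collections across the documents involved, without duplicates.
// Assumed input: { uri: [collection, ...] }.
function union(collectionsByUri) {
  const seen = new Set();
  for (const uri of Object.keys(collectionsByUri)) {
    for (const coll of collectionsByUri[uri]) {
      seen.add(coll);
    }
  }
  return Array.from(seen);
}

// Add one collection to a list if it is not already present.
function withCollection(colls, name) {
  return colls.includes(name) ? colls : colls.concat([name]);
}

// One strategy per event, keyed by the element names used in the configuration.
const strategies = {
  "on-merge": (byUri) => union(byUri),
  "on-archive": (byUri) => union(byUri).filter((c) => c !== CONTENT_COLL),
  "on-no-match": (byUri) => withCollection(union(byUri), CONTENT_COLL),
  "on-notification": (byUri) => withCollection(union(byUri), NOTIFICATION_COLL)
};

// Hypothetical dispatcher: pick the strategy for an event and apply it.
function collectionsFor(event, collectionsByUri) {
  const strategy = strategies[event];
  if (!strategy) {
    throw new Error("Unknown collection event: " + event);
  }
  return strategy(collectionsByUri);
}

console.log(collectionsFor("on-archive", { "/a.json": ["content", "crm"] }));
```

Keeping every strategy behind the same signature is what would let a custom function be swapped in per event through the `function` and `at` attributes.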
To restate: The goal is for users to write merge algorithms/logic for metadata, including collections. Correct? |
This story covers collections; there's another one for permissions. But yeah, that's the idea |
Currently, a merged document inherits the collections from all source documents being merged, with the "mdm-merged" collection being added.
To support business workflows, Smart Mastering should let users choose how collections are handled during a merge, including which collections are added to and which are removed from a merged document. Without control over collections, batch processing requires more complex queries to isolate the documents to process. Being able to put merged documents into custom collections also lets users master multiple entity types in the same database, instead of having every entity type end up in the "mdm-merged" collection.
Ideally, this type of workflow should be supported:
(1) User puts all documents to be mastered in a "toBeMastered" collection
(2) A batch process runs by selecting documents in the "toBeMastered" collection
(3) New documents are put into a user-specified collection, like "masterPerson", with the option to bring forward collections from the source documents or not.
(4) Users can remove collections, like "toBeMastered", from merged and original documents, so the next time the batch process selects documents in the "toBeMastered" collection, it won't rerun against the same documents.
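The four steps above can be modeled with a small in-memory sketch. It uses a `Map` as a stand-in for the database and makes no MarkLogic calls; the `selectByCollection` and `runBatch` helpers are hypothetical. Real matching is skipped too: every selected document is treated as part of one merge, purely to show how the collections change.

```javascript
// In-memory stand-in for a database: uri -> { collections: [...] }.
// Step 1: documents to be mastered start out in "toBeMastered".
const store = new Map([
  ["/person/1.json", { collections: ["toBeMastered", "crm"] }],
  ["/person/2.json", { collections: ["toBeMastered", "billing"] }]
]);

// Return the URIs of all documents in the given collection.
function selectByCollection(db, name) {
  const uris = [];
  for (const [uri, doc] of db.entries()) {
    if (doc.collections.includes(name)) {
      uris.push(uri);
    }
  }
  return uris;
}

// Hypothetical batch run. Returns the merged URI, or null if nothing was selected.
function runBatch(db, mergedUri) {
  // Step 2: select the documents waiting to be mastered.
  const uris = selectByCollection(db, "toBeMastered");
  if (uris.length === 0) {
    return null;
  }

  // Step 4: remove "toBeMastered" from the originals so they are not reprocessed.
  for (const uri of uris) {
    const doc = db.get(uri);
    doc.collections = doc.collections.filter((c) => c !== "toBeMastered");
  }

  // Step 3: put the new document into a user-specified collection,
  // here without bringing forward any source collections.
  db.set(mergedUri, { collections: ["masterPerson"] });
  return mergedUri;
}

console.log(runBatch(store, "/merged/person-1.json")); // first run merges
console.log(runBatch(store, "/merged/person-2.json")); // second run finds nothing
```

Because the second run selects no documents, the batch process can be rescheduled freely without rerunning against documents that were already mastered.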