Is your feature request related to a problem? Please describe.
When we ingest the same datatypes from different sources, we may ingest one record from an old source and another from a recent source, each with different property values. The current implementation builds a merge query that simply overwrites the properties on matched nodes, so the final values depend only on the order in which the records were ingested. I would like a way to conditionally change properties on a node.
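To make the problem concrete, here is a minimal in-memory sketch of an unconditional overwrite. It does not use the real ingestion code; the `ingest` function and the sample records are hypothetical and only mimic the "overwrite on match" behaviour described above.

```python
def ingest(records):
    """Apply records in order, overwriting properties on every match."""
    nodes = {}
    for record in records:
        node = nodes.setdefault(record["key"], {})
        # Mirrors an unconditional ON MATCH SET: incoming values always win.
        node.update(record)
    return nodes


# Hypothetical records for the same node, one from an old source, one recent.
old = {"key": "a", "date_created": "2020-01-01", "status": "stale"}
new = {"key": "a", "date_created": "2024-01-01", "status": "fresh"}

print(ingest([old, new])["a"]["status"])  # recent record ingested last
print(ingest([new, old])["a"]["status"])  # old record ingested last
```

The two calls end with different `status` values for the same pair of records, which is the order dependence this request is trying to remove.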
Describe the solution you'd like
When I create an interpretation within my pipeline, I would like to declare the following:
merge_condition: latest # The strategy to apply; other options could exist. `latest` means the greater value wins. Default: None
condition_key: date_created # Key of the node property we are comparing against.
condition_value: !!python/jmespath date_created # The value pulled from the incoming record.
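The intended semantics of these options could be sketched as follows. This is only an illustration of "greater value wins"; the function name `should_overwrite` is hypothetical, and treating a missing existing value as "always overwrite" is an assumption not stated above.

```python
from typing import Any, Optional


def should_overwrite(merge_condition: Optional[str], incoming: Any, existing: Any) -> bool:
    """Decide whether the incoming record may overwrite the matched node."""
    if merge_condition is None:
        # Default: keep today's behaviour and always overwrite.
        return True
    if merge_condition == "latest":
        # Assumption: a node with no stored value is always overwritten.
        if existing is None:
            return True
        return incoming > existing
    raise ValueError(f"Unknown merge_condition: {merge_condition!r}")


# ISO-formatted dates compare correctly as plain strings.
print(should_overwrite("latest", "2024-01-01", "2020-01-01"))
print(should_overwrite("latest", "2020-01-01", "2024-01-01"))
```

Here `incoming` corresponds to `condition_value` and `existing` to the node property named by `condition_key`.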
This would then modify the merge query which currently performs the following for source nodes:
MERGE (node:$node_type {key: $key})
ON CREATE
SET node.param = param.value
ON MATCH
SET node.param = param.value
I would like it to create the following:
MERGE (node:$node_type {key: $key})
ON CREATE
SET node.param = param.value
ON MATCH
SET node.condition = CASE WHEN $condition THEN true ELSE false END // We need a variable for the condition in some way
SET node.param = CASE WHEN node.condition THEN param.value ELSE node.param END
// Find a way to unset node.condition
Where condition in our case would be (date_created > node.date_created)
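One possible way to generate such a query without a temporary `node.condition` property is to inline the comparison into each `CASE` expression. The sketch below is not the project's query builder: `build_merge_query`, the `$props` map parameter, and the `IS NULL` fallback for nodes that lack the compared property are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def build_merge_query(
    node_type: str,
    properties: Sequence[str],
    condition_key: Optional[str] = None,
) -> str:
    """Build a MERGE query that only overwrites when the incoming value is greater."""
    # Labels and property names are interpolated into the query text, so they
    # are assumed to come from trusted pipeline configuration, not from records.
    # The compared property is placed last so that, if SET items are applied in
    # order, every other CASE is evaluated before the stored value changes.
    ordered = [p for p in properties if p != condition_key]
    if condition_key in properties:
        ordered.append(condition_key)

    on_create = ", ".join(f"node.{p} = $props.{p}" for p in ordered)
    if condition_key is None:
        # No condition configured: unconditional overwrite, as today.
        on_match = on_create
    else:
        condition = (
            f"node.{condition_key} IS NULL "
            f"OR $props.{condition_key} > node.{condition_key}"
        )
        on_match = ", ".join(
            f"node.{p} = CASE WHEN {condition} THEN $props.{p} ELSE node.{p} END"
            for p in ordered
        )
    return (
        f"MERGE (node:{node_type} {{key: $key}})\n"
        f"ON CREATE SET {on_create}\n"
        f"ON MATCH SET {on_match}"
    )


print(build_merge_query("Person", ["date_created", "name"], "date_created"))
```

Because the condition is repeated inside each `CASE`, nothing extra is written to the node and there is nothing to unset afterwards. If a temporary property were used instead, Cypher's `REMOVE node.condition` would be the way to clear it.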
Describe alternatives you've considered
One alternative is to schedule my pipelines so that the recent data is ingested after the old data, which ensures the most recent values win.
Another is to create an interpreter in my pipeline that queries the database for the current value and then conditionally writes to the database, but this takes too long.
Additional context