
deduper

Nathan Richardson edited this page Jan 9, 2019 · 14 revisions
Deduper

Icon

metl deduper 48x48 color

Use When

Model-based input records must be deduplicated before being sent to downstream components.

Samples

Remove Duplicates

Description

The Deduper removes duplicate records from a message or set of messages in a given unit of work. Once the Deduper has received the complete unit of work, it repackages the records from the inbound message(s) into a set of output messages based on the configured number of records per message, and sends those deduped outbound messages to the downstream component(s). Records may be deduped by Entity (full record) or by Attribute (selected column(s)).
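The behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not Metl's actual implementation: the function name `dedupe_and_repackage`, the representation of a message as a list of dictionaries, and the choice to keep the first occurrence of each record are all assumptions made for the example.

```python
from typing import Dict, Hashable, List

# Hypothetical representation: a record is a dict of attribute name -> value,
# and a message is a list of records.
Record = Dict[str, Hashable]
Message = List[Record]


def dedupe_and_repackage(messages: List[Message], rows_per_message: int) -> List[Message]:
    """Entity-style dedupe across a whole unit of work, then re-batch the result."""
    if rows_per_message < 1:
        raise ValueError("rows_per_message must be at least 1")

    seen = set()
    unique: List[Record] = []
    for message in messages:
        for record in message:
            # Full-record comparison: every attribute takes part in the key.
            key = tuple(sorted(record.items()))
            if key not in seen:
                seen.add(key)
                unique.append(record)

    # Repackage the surviving records into messages of at most rows_per_message rows.
    return [
        unique[start:start + rows_per_message]
        for start in range(0, len(unique), rows_per_message)
    ]
```

Note that the output message boundaries depend only on the configured row count, not on how the records were grouped in the inbound messages.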

Inbound Message Type

Model Based Message

Output Message Type

Model Based Message

Control Message Handling

Input: When the Unit of Work Boundary is received, the dedupe process begins.

Output: A single control message will be forwarded to downstream components once all messages have been processed through this step.

Properties
Name Description

Input Model

Error Suspense Step

The name of a linked component to which failed messages are forwarded so that processing can continue.

Enabled

Dedupe Type

The type of dedupe to execute. Options are ENTITY or ATTRIBUTE. ENTITY compares full records, while ATTRIBUTE looks for duplicates based only on the attributes selected in the component editor screen.
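To illustrate the difference between the two dedupe types, the sketch below builds the comparison key that each mode would use. It is a hedged example only: `dedupe_key` and the `key_attributes` parameter are hypothetical names, and records are assumed to be plain dictionaries rather than Metl's own data structures.

```python
from typing import Dict, Hashable, Sequence, Tuple


def dedupe_key(
    record: Dict[str, Hashable],
    dedupe_type: str,
    key_attributes: Sequence[str] = (),
) -> Tuple:
    """Return the value two records must share to count as duplicates."""
    if dedupe_type == "ENTITY":
        # Full-record compare: all attributes and values are part of the key.
        return tuple(sorted(record.items()))
    if dedupe_type == "ATTRIBUTE":
        # Only the selected attributes are part of the key.
        return tuple(record[name] for name in key_attributes)
    raise ValueError(f"Unknown dedupe type: {dedupe_type}")


a = {"id": 1, "name": "Ann"}
b = {"id": 1, "name": "Bob"}

# Under ENTITY the records differ; under ATTRIBUTE on "id" they are duplicates.
entity_match = dedupe_key(a, "ENTITY") == dedupe_key(b, "ENTITY")
attribute_match = dedupe_key(a, "ATTRIBUTE", ["id"]) == dedupe_key(b, "ATTRIBUTE", ["id"])
```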

Preserve Record

Applicable only when dedupe type is Attribute. Options are 'First Record' or 'Last Record'. If a duplicate record is found and 'First Record' is chosen then the first matched record is forwarded. If 'Last Record' is chosen, then the final record matching the selected attribute(s) is forwarded and the earlier record(s) are dropped.
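The effect of the two Preserve Record options can be sketched as below. This is an illustrative Python example under stated assumptions, not the component's real code: the function name `dedupe_by_attributes` is hypothetical, and the order in which surviving records are emitted (here, the position at which each key first appeared) is an assumption the source does not specify.

```python
from typing import Dict, Hashable, List, Sequence

Record = Dict[str, Hashable]


def dedupe_by_attributes(
    records: List[Record],
    key_attributes: Sequence[str],
    preserve: str = "First Record",
) -> List[Record]:
    """Keep one record per distinct combination of the selected attributes."""
    kept: Dict[tuple, Record] = {}
    for record in records:
        key = tuple(record[name] for name in key_attributes)
        if preserve == "First Record":
            # Later duplicates are ignored.
            kept.setdefault(key, record)
        elif preserve == "Last Record":
            # Later duplicates overwrite earlier ones.
            kept[key] = record
        else:
            raise ValueError(f"Unknown preserve option: {preserve}")
    return list(kept.values())


rows = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "new"},
    {"id": 1, "status": "updated"},
]
first = dedupe_by_attributes(rows, ["id"], "First Record")
last = dedupe_by_attributes(rows, ["id"], "Last Record")
```

With 'First Record', the record for `id` 1 keeps the status `new`; with 'Last Record', it carries the status `updated`.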

Rows Per Message

Log Input

Log Output

Inbound Queue Capacity

Component Editor

Double-clicking the Deduper component in the flow opens the Deduper editor, shown below.

deduper editor

The Deduper editor displays the entities contained within the input model in a list. To select which attribute to dedupe on, select the Entity that the desired attribute belongs to and click the 'Edit Columns' button at the top.

In the window that appears, select the checkbox next to each attribute to use as the dedupe key, as shown below.

deduper attribute editor
