This repository is for reference only and is illustrative in its purpose. Please do not use it in production as is; use it as a guide or starting point for a production-level implementation.
This project illustrates the use of Nats.io to perform a "server-wide" or "targeted" message deduplication strategy using the built-in Nats.io Subject Wildcards. Most pub/sub architectures need message deduplication based on a TTL, as in "duplicate messages arriving within the same timeframe should only invoke subscribers once". This implementation provides that capability through subject filtering, and it also lets an individual message set its own TTL dynamically through a message header.
This example uses a generic subscriber filtering on "dedupe.>". If a NATS Subject named "us.east.regional" is the target subject for deduplication, publishing to the "dedupe.us.east.regional" NATS Subject will deduplicate based on a TTL (default 1s) and, once the TTL elapses, post to "us.east.regional" in a FIFO fashion (this can be modified, see below).
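The subject mapping described above can be sketched in a few lines. This is a minimal illustration, not the repository's actual code: the helper name `targetSubject` and the constant `DEDUPE_PREFIX` are hypothetical, and the sketch only assumes that the target subject is whatever follows the "dedupe." prefix.

```javascript
// Hypothetical helper (not taken from the repository): derives the target
// subject from a subject that matched the "dedupe.>" wildcard subscription.
const DEDUPE_PREFIX = 'dedupe.';

function targetSubject(subject) {
  // "dedupe.>" requires at least one token after the prefix.
  if (!subject.startsWith(DEDUPE_PREFIX) || subject.length === DEDUPE_PREFIX.length) {
    throw new Error(`Not a dedupe subject: ${subject}`);
  }
  return subject.slice(DEDUPE_PREFIX.length);
}

console.log(targetSubject('dedupe.us.east.regional')); // us.east.regional
```

Keeping the mapping as a pure prefix strip means any existing subject can opt in to deduplication without extra configuration.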
Before you continue, ensure you have met the following requirements:
- Nats Server or Nats Docker Server installed and running
- If installing the Go Server, Go must be installed
- nats-cli installed
- Go will need to be installed via Chocolatey or Brew
- NodeJS v18 or higher installed
- Npm installed
This repository uses dotenv. Feel free to create a .env file to override any of the following settings.
- NATS_SERVER : The Nats server that will be facilitating Pub-Sub (defaults to localhost)
- DEDUPE_TTL_MS : window, in milliseconds, in which duplicate messages will be prevented from publishing (defaults to 1000, i.e. 1s)
- DEDUPE_LIFO : ensures that the last duplicate message in is the message published (defaults to false)
- X_DEDUPE_TTL_MS : NATS message header key used to set a per-message TTL (defaults to X_DEDUPE_TTL_MS)
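As a rough sketch of how these settings could be resolved with their documented defaults, consider the function below. It is an assumption-laden illustration rather than the repository's code: `loadConfig` and its property names are hypothetical, the environment is passed in as a plain object (dotenv would normally populate `process.env` first), and treating only the string "true" as enabling LIFO is an assumption.

```javascript
// Hypothetical sketch: resolve the documented settings from an environment
// object, falling back to the defaults listed above.
function loadConfig(env = process.env) {
  const ttl = Number.parseInt(env.DEDUPE_TTL_MS ?? '', 10);
  return {
    server: env.NATS_SERVER || 'localhost',
    // Fall back to 1000 ms when the value is missing or not a positive integer.
    ttlMs: Number.isFinite(ttl) && ttl > 0 ? ttl : 1000,
    // Assumption: only the literal string "true" (any case) enables LIFO.
    lifo: String(env.DEDUPE_LIFO).toLowerCase() === 'true',
    ttlHeader: env.X_DEDUPE_TTL_MS || 'X_DEDUPE_TTL_MS',
  };
}

console.log(loadConfig({ DEDUPE_TTL_MS: '5000', DEDUPE_LIFO: 'true' }));
```

Validating the TTL up front keeps a malformed .env entry from silently disabling the deduplication window.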
- 'cd' to the root of this repository (where it was cloned)
- OPTIONAL Create a file in the root named .env
- Add environment variables above if needed
- run npm install from the command line
- open a terminal to the root of this repository and run :
- npm run dedupe
- allow the subscription a few additional seconds to bind; 503 errors may occur while it is binding
- open another terminal to the root of this repository and run :
- nats subscribe us.east.regional
- allow the subscription a few additional seconds to bind; 503 errors may occur while it is binding
- open another terminal to the root of this repository and run :
- nats publish dedupe.us.east.regional 'Some random information'
- a per-message TTL can be set with a message header (example of a 5 second message-level TTL)
- nats publish dedupe.us.east.regional 'Some random information' -H X-DEDUPE-TTL-MS:5000
- nats publish dedupe.us.east.regional 'Some random information'
user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:49 Published 26 bytes to "dedupe.us.east.regional"
user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:50 Published 26 bytes to "dedupe.us.east.regional"
user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:50 Published 26 bytes to "dedupe.us.east.regional"
user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:51 Published 26 bytes to "dedupe.us.east.regional"
user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:51 Published 26 bytes to "dedupe.us.east.regional"
user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:51 Published 26 bytes to "dedupe.us.east.regional"
user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:52 Published 26 bytes to "dedupe.us.east.regional"
user@computer nats-dedupe-subscriber % nats subscribe us.east.regional
22:25:39 Subscribing on us.east.regional
[#1] Received on "us.east.regional"
Chemical Spill on Level 2
- JetStream does provide time-scoped deduplication. However, more information is needed on the granularity and flexibility it offers. This solution is also applicable for those not using JetStream
- The TTL is a SLIDING TTL, where a new message resets the deduplication timeout
- Matching is performed using the MD5 checksum of the NATS message payload as a Uint8Array (Byte[]), which allows a checksum on all payload types: JSON, string, number, protobuf
- Use && between duplicate commands to push multiple messages
- This repository is heavily commented to provide context as to the what and why. If you are in VS Code, feel free to collapse all comments if they are obtrusive
- On Mac -> Press ⌘ + K then ⌘ + /
- On Windows & Linux -> Press Ctrl + K then Ctrl + /
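The sliding TTL and MD5 matching notes above can be illustrated with a small, self-contained sketch. It is not the repository's implementation: `createDeduper`, `offer`, and the injected `publish` callback are hypothetical names, no NATS client is used, and keying on the subject plus the checksum is an assumption made for this example.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical sketch of a sliding-TTL deduper. `publish` is any callback
// that forwards the message, e.g. to the target NATS subject.
function createDeduper(publish, defaultTtlMs = 1000) {
  const pending = new Map(); // key -> timer handle

  return function offer(subject, payload, ttlMs = defaultTtlMs) {
    // The MD5 of the raw bytes identifies duplicates regardless of payload type.
    const checksum = createHash('md5').update(payload).digest('hex');
    const key = `${subject}:${checksum}`;

    // Sliding TTL: a duplicate cancels the running timer and starts a new one.
    clearTimeout(pending.get(key));
    const timer = setTimeout(() => {
      pending.delete(key);
      publish(subject, payload);
    }, ttlMs);
    pending.set(key, timer);
  };
}

// Usage: three identical messages inside the window result in one publish.
const offer = createDeduper((subject, payload) => {
  console.log(`publish ${subject}: ${Buffer.from(payload).toString()}`);
}, 50);
const bytes = new TextEncoder().encode('Chemical Spill on Level 2');
offer('us.east.regional', bytes);
offer('us.east.regional', bytes);
offer('us.east.regional', bytes);
```

Because every duplicate restarts the timer, a steady stream of identical messages delays publication until the stream pauses for a full TTL; the optional third argument to `offer` mirrors the per-message TTL header.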