Nats.io Deduplication Subscriber, automatically provides TTL deduplication capabilities to existing subjects w/o modification

kamauwashington/nats-deduplication-subscriber

Nats.io Deduplication Subscriber

This repository is purely for reference and is illustrative in purpose. Please do not use it in production as is; use it as a guide or starting point for a production-level implementation.

This project illustrates the use of Nats.io to perform a "server-wide" or "targeted" message deduplication strategy using the built-in Nats.io Subject Wildcards. Most pub/sub architectures need message deduplication based on TTL, as in "duplicate messages arriving within the same timeframe should only invoke subscribers once". This implementation provides that capability through subject filtering, and the TTL can also be set dynamically per message through message headers.

This example uses a generic subscriber filtering on "dedupe.>". If a NATS Subject named "us.east.regional" is the target subject for deduplication, publishing to the "dedupe.us.east.regional" NATS Subject deduplicates messages based on a TTL (default 1s) and, once the TTL is reached, posts to "us.east.regional" in a FIFO fashion (this can be modified, see below).
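The subject mapping described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper name targetSubject and the constant DEDUPE_PREFIX are hypothetical, and the sketch assumes the target subject is simply the incoming subject with the leading "dedupe." token removed.

```typescript
// Hypothetical helper (not taken from the repository): derives the subject
// to republish on from a subject matched by the "dedupe.>" wildcard.
const DEDUPE_PREFIX = "dedupe.";

function targetSubject(subject: string): string | null {
  // Subjects outside the dedupe namespace are not handled here.
  if (!subject.startsWith(DEDUPE_PREFIX)) return null;
  const target = subject.slice(DEDUPE_PREFIX.length);
  // "dedupe." alone has no target subject.
  return target.length > 0 ? target : null;
}

console.log(targetSubject("dedupe.us.east.regional")); // us.east.regional
```

Because the ">" wildcard matches one or more trailing tokens, a single subscription on "dedupe.>" can serve any number of target subjects without changes to their existing subscribers.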

Prerequisites

Before you continue, ensure you have met the following requirements:

Environment Variables

This repository uses dotenv; feel free to create a .env file to override the defaults of the following variables.

  • NATS_SERVER : The NATS server that facilitates pub/sub (defaults to localhost)
  • DEDUPE_TTL_MS : Window, in milliseconds, during which duplicate messages are prevented from publishing (defaults to 1s)
  • DEDUPE_LIFO : When true, the last duplicate message received is the message published (defaults to false)
  • X_DEDUPE_TTL_MS : NATS message header key for a message-specific TTL (defaults to X_DEDUPE_TTL_MS)
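As a rough sketch of how these variables and their defaults could be read, assuming dotenv has already populated the environment: the DedupeConfig interface and loadConfig function are hypothetical names used only for illustration, and treating only the string "true" (case-insensitive) as enabling LIFO is an assumption.

```typescript
// Hypothetical configuration shape; field names are illustrative only.
interface DedupeConfig {
  natsServer: string;
  ttlMs: number;
  lifo: boolean;
  ttlHeaderKey: string;
}

// Reads the documented variables from an environment-like object and
// falls back to the documented defaults when a value is missing or invalid.
function loadConfig(env: Record<string, string | undefined>): DedupeConfig {
  const ttl = Number(env.DEDUPE_TTL_MS);
  return {
    natsServer: env.NATS_SERVER ?? "localhost",
    ttlMs: Number.isFinite(ttl) && ttl > 0 ? ttl : 1000, // 1s default
    lifo: (env.DEDUPE_LIFO ?? "false").toLowerCase() === "true", // assumption: only "true" enables LIFO
    ttlHeaderKey: env.X_DEDUPE_TTL_MS ?? "X_DEDUPE_TTL_MS",
  };
}
```

In a Node application this would typically be called as loadConfig(process.env) after dotenv has loaded the .env file.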

Running the Application

  1. 'cd' to the root of this repository (where it was cloned)
  2. OPTIONAL: Create a file named .env in the root
    • Add the environment variables above if needed
  3. Run npm install from the command line
  4. Open a terminal at the root of this repository and run:
    • npm run dedupe
    • Allow the subscription a few additional seconds to bind; 503 errors may occur during this binding time
  5. Open another terminal at the root of this repository and run:
    • nats subscribe us.east.regional
    • Allow the subscription a few additional seconds to bind; 503 errors may occur during this binding time
  6. Open another terminal at the root of this repository and run:
    • nats publish dedupe.us.east.regional 'Some random information'
      • A message-specific TTL can be set with a header (example of a 5 second message-level TTL):
      • nats publish dedupe.us.east.regional 'Some random information' -H X-DEDUPE-TTL-MS:5000
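The per-message TTL in the last step could be resolved along these lines. This is a sketch under assumptions: resolveTtlMs is a hypothetical name, the header value is assumed to arrive as a string (or undefined when the header is absent), and falling back to the default on a non-numeric or non-positive value is an illustrative choice rather than documented behavior.

```typescript
// Hypothetical helper: picks the TTL for one message, preferring the
// message header value and falling back to the configured default.
function resolveTtlMs(headerValue: string | undefined, defaultTtlMs: number): number {
  if (headerValue === undefined) return defaultTtlMs;
  const parsed = Number(headerValue.trim());
  // Assumption: invalid or non-positive header values are ignored.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : defaultTtlMs;
}

console.log(resolveTtlMs("5000", 1000)); // 5000
console.log(resolveTtlMs(undefined, 1000)); // 1000
```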

Visualize

Sending multiple messages quickly

user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:49 Published 26 bytes to "dedupe.us.east.regional"

user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:50 Published 26 bytes to "dedupe.us.east.regional"

user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:50 Published 26 bytes to "dedupe.us.east.regional"

user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:51 Published 26 bytes to "dedupe.us.east.regional"

user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:51 Published 26 bytes to "dedupe.us.east.regional"

user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:51 Published 26 bytes to "dedupe.us.east.regional"

user@computer nats-dedupe-subscriber % nats publish dedupe.us.east.regional 'Chemical Spill on Level 2'
22:25:52 Published 26 bytes to "dedupe.us.east.regional"

Resulting Deduplication

user@computer nats-dedupe-subscriber % nats subscribe us.east.regional
22:25:39 Subscribing on us.east.regional 

[#1] Received on "us.east.regional"
Chemical Spill on Level 2

Notes

  • JetStream does provide time-scoped deduplication. However, its granularity and flexibility have not been evaluated here. This solution also applies to those not using JetStream
  • The TTL is a SLIDING TTL, where a new duplicate message resets the deduplication timeout
  • Matching is performed using the MD5 checksum of the NATS message Uint8Array (Byte[]), which allows checksums on all payload types: JSON, string, number, protobuf
  • To push multiple messages quickly, chain duplicate publish commands with && in the shell
  • This repository is heavily commented to provide context as to what and why; if in VS Code, feel free to collapse all comments if they are obtrusive
    • On Mac -> Press ⌘ + K then ⌘ + /
    • On Windows & Linux -> Press Ctrl + K then Ctrl + /
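
The sliding TTL and MD5 matching described in the notes can be sketched as below. This is not the repository's implementation: SlidingDeduper, Message, offer, and flush are hypothetical names, and the sketch assumes a message is held until its window closes and then published once. Time is passed in explicitly to keep the example deterministic; a real subscriber would more likely use timers.

```typescript
import { createHash } from "node:crypto";

// Hypothetical message shape used only for this sketch.
interface Message {
  subject: string;
  data: Uint8Array;
  headers?: Record<string, string>;
}

interface Pending {
  message: Message;  // the message that will eventually be published
  expiresAt: number; // time (ms) at which the deduplication window closes
}

class SlidingDeduper {
  private readonly pending = new Map<string, Pending>();
  private readonly defaultTtlMs: number;
  private readonly lifo: boolean;

  constructor(defaultTtlMs: number = 1000, lifo: boolean = false) {
    this.defaultTtlMs = defaultTtlMs;
    this.lifo = lifo;
  }

  // Record a message seen at time `now` (ms). A duplicate resets the window.
  offer(message: Message, now: number, ttlMs: number = this.defaultTtlMs): void {
    // Duplicates are identified by subject plus MD5 of the raw payload bytes.
    const key = message.subject + "|" + createHash("md5").update(message.data).digest("hex");
    const existing = this.pending.get(key);
    if (existing) {
      existing.expiresAt = now + ttlMs; // sliding TTL
      // Payloads are byte-identical, so LIFO only changes which
      // message's metadata (for example headers) is kept.
      if (this.lifo) existing.message = message;
    } else {
      this.pending.set(key, { message, expiresAt: now + ttlMs });
    }
  }

  // Return (and forget) every message whose window has closed by `now`.
  flush(now: number): Message[] {
    const due: Message[] = [];
    this.pending.forEach((entry, key) => {
      if (entry.expiresAt <= now) {
        due.push(entry.message);
        this.pending.delete(key);
      }
    });
    return due;
  }
}
```

With the default 1s TTL, offering the same payload at 0 ms, 500 ms, and 1200 ms keeps the window open until 2200 ms, at which point flush returns a single message.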
