More lineage dispatcher types #156

Closed
wajda opened this issue Dec 10, 2020 · 0 comments

wajda commented Dec 10, 2020

In addition to the existing REST (sync) dispatcher, there is a demand for:

  • Async implementation (Kafka)
  • Offline implementation (HDFS)

All dispatchers could be extracted into their own modules to avoid including unused dependencies in the bundle. For now, however, only the Http and Kafka dispatchers require extra dependencies. Kafka is considered the default production implementation anyway, and the Http one requires only a single extra dependency: ScalaJ, a tiny (~200 KB) library with no dependencies of its own. So extracting them into separate bundles doesn't seem likely to bring much benefit, and would rather complicate the project and deployment structure.


UPDATE

All embedded dispatchers should also be pre-configured in the default properties file, so that activating the dispatcher of choice only requires specifying the root dispatcher by name, without having to specify the class name. E.g.:

spline.lineageDispatcher=kafka

or, to send the lineage data to several dispatchers (e.g. for logging or backup purposes):

# will send the lineage to both the Http and Console dispatchers
spline.lineageDispatcher=composite
spline.lineageDispatcher.composite.dispatchers=http,console
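
As a rough, language-neutral illustration of the composite idea (not Spline's actual implementation), the sketch below parses a comma-separated `dispatchers` value and forwards each lineage record to every named delegate. The `Dispatcher` type, `parse_names`, `make_composite` and the in-memory registry are all hypothetical names introduced for this example.

```python
from typing import Callable, Dict, List

# Hypothetical: a dispatcher is anything that accepts one lineage record.
Dispatcher = Callable[[dict], None]


def parse_names(value: str) -> List[str]:
    """Split a value such as 'http, console' into ['http', 'console']."""
    return [name.strip() for name in value.split(",") if name.strip()]


def make_composite(names: List[str], registry: Dict[str, Dispatcher]) -> Dispatcher:
    """Build a dispatcher that forwards every record to each named delegate."""
    # An unknown name fails early with a KeyError, before anything is sent.
    delegates = [registry[name] for name in names]

    def send(lineage: dict) -> None:
        for delegate in delegates:
            delegate(lineage)

    return send


# Stand-ins for the real Http and Console dispatchers: they just record input.
sent_http: List[dict] = []
sent_console: List[dict] = []
registry: Dict[str, Dispatcher] = {
    "http": sent_http.append,
    "console": sent_console.append,
}

composite = make_composite(parse_names("http,console"), registry)
composite({"plan": "example"})
```

The sketch calls the delegates in order and lets the first failure propagate. Whether one failing dispatcher should prevent the others from receiving the lineage is a design decision this issue does not specify.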

Attaching a custom dispatcher is as easy as this:

# register a custom dispatcher
spline.lineageDispatcher.my-custom-dispatcher.className=org.example.MyCustomLineageDispatcherImpl

# and attach it to the dispatcher pipeline in one of the two variants:
# ... either as a composite dispatcher component (1)
spline.lineageDispatcher.composite.dispatchers=my-custom-dispatcher, ...(other dispatchers)...
# ... or as a root dispatcher (2)
spline.lineageDispatcher=my-custom-dispatcher
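
The name-based lookup described above could work roughly as follows. This is a minimal, language-neutral sketch, assuming the defaults file simply holds one `spline.lineageDispatcher.<name>.className` entry per embedded dispatcher and that user properties override the defaults. The functions `resolve_class_name` and `resolve_root`, and the contents of `DEFAULTS`, are hypothetical; only the property keys and class names come from this issue.

```python
from typing import Dict

PREFIX = "spline.lineageDispatcher"

# Assumed shape of the bundled defaults (two of the embedded dispatchers).
DEFAULTS: Dict[str, str] = {
    f"{PREFIX}.http.className":
        "za.co.absa.spline.harvester.dispatcher.HttpLineageDispatcher",
    f"{PREFIX}.console.className":
        "za.co.absa.spline.harvester.dispatcher.ConsoleLineageDispatcher",
}


def resolve_class_name(props: Dict[str, str], name: str) -> str:
    """Map a dispatcher name to the class registered under it."""
    key = f"{PREFIX}.{name}.className"
    if key not in props:
        raise KeyError(f"no dispatcher registered as {name!r} (missing {key})")
    return props[key]


def resolve_root(user_props: Dict[str, str]) -> str:
    """Resolve the root dispatcher's class; user properties win over defaults."""
    props = {**DEFAULTS, **user_props}
    return resolve_class_name(props, props[PREFIX])


# Variant (2) from above: a custom dispatcher used as the root dispatcher.
user_props = {
    PREFIX: "my-custom-dispatcher",
    f"{PREFIX}.my-custom-dispatcher.className":
        "org.example.MyCustomLineageDispatcherImpl",
}
custom_class = resolve_root(user_props)
http_class = resolve_root({PREFIX: "http"})
```

Because the custom registration is just another `className` entry, embedded and user-defined dispatchers go through the same lookup path.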

The embedded dispatchers' names will be as follows:

| Name | Class |
|------|-------|
| composite | za.co.absa.spline.harvester.dispatcher.CompositeLineageDispatcher |
| http | za.co.absa.spline.harvester.dispatcher.HttpLineageDispatcher |
| kafka | KafkaLineageDispatcher (#166) |
| console | za.co.absa.spline.harvester.dispatcher.ConsoleLineageDispatcher |
| logging | za.co.absa.spline.harvester.dispatcher.LoggingLineageDispatcher |
| hdfs | za.co.absa.spline.harvester.dispatcher.HDFSLineageDispatcher |