
Full HDFS persistence support (read/write) #23

Closed
wajda opened this issue Feb 15, 2018 · 3 comments

Comments

@wajda
Contributor

wajda commented Feb 15, 2018

Currently the HDFS persistence factory can only write the lineage data to a file, but never reads it back. To make full use of this type of persistence we need to be able to read and parse the written lineage JSON, so that it can be linked with descendant lineages and visualized in the Spline UI.
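A minimal sketch of what the missing read side might look like, assuming the lineage is stored as a single JSON document on HDFS. The `HdfsLineageReader` object, the method name, and the `_LINEAGE` file path in the usage note are illustrative only, not the actual Spline API:

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

import scala.io.Source

object HdfsLineageReader {

  // Hypothetical read counterpart to the existing HDFS lineage writer:
  // loads the raw lineage JSON previously written to HDFS so it can later
  // be parsed and linked with descendant lineages.
  def readLineageJson(hdfsUri: String, lineagePath: String): String = {
    val fs = FileSystem.get(new URI(hdfsUri), new Configuration())
    val in = fs.open(new Path(lineagePath))
    try Source.fromInputStream(in, "UTF-8").mkString
    finally in.close()
  }
}
```

Usage would be along the lines of `HdfsLineageReader.readLineageJson("hdfs://namenode:8020", "/data/foo/_LINEAGE")`, with the parsing into the lineage model happening downstream.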

@GeorgiChochov

The ability to persist lineage offline and import it after the fact would be quite useful.
I know you have ideas for how to expand on this, @wajda.

@wajda
Contributor Author

wajda commented Aug 8, 2019

We could export/import the lineage.
Imaginary example:

  1. I have a file A on HDFS, and its lineage in Spline
  2. I want to copy A to a different infrastructure (network, cluster, cloud, you name it), monitored by another independent Spline instance, and I don't want to lose the lineage of A
  3. Using the Spline UI or CLI I dump A's full lineage to a file and carry it with A to the new infrastructure
  4. There the lineage dump can be imported manually (again via the UI or CLI), or even automatically on the first Spline-tracked read from A (a rough sketch of this round trip follows the list).
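A rough sketch of steps 3–4 under the same assumptions as above: the lineage dump is just a JSON file that travels alongside A, and the import call on the destination side is a placeholder, since no such Spline API exists yet:

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

object LineageCarryOver {

  // Hypothetical carry-over of a previously dumped lineage JSON file from
  // the source cluster to the destination cluster, alongside the data file A.
  def carryLineage(srcUri: String, dstUri: String, lineageDump: String): Unit = {
    val conf = new Configuration()
    val srcFs = FileSystem.get(new URI(srcUri), conf)
    val dstFs = FileSystem.get(new URI(dstUri), conf)

    // Step 3: copy the dump to the new infrastructure (deleteSource = false).
    FileUtil.copy(srcFs, new Path(lineageDump), dstFs, new Path(lineageDump), false, conf)

    // Step 4 would parse the dump and register it with the local Spline
    // instance; that import step is hypothetical and not shown here.
  }
}
```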

@wajda
Contributor Author

wajda commented Dec 10, 2020

Superseded by #815 and AbsaOSS/spline-spark-agent#156
