ATLAS-3661 Create 'spark_column_lineage' type and relationship definition #93
…tion to add support of column level lineage
sarathsubramanian
left a comment
Changes look good. +1. Thanks @vladhlinsky.
HeartSaVioR
left a comment
LGTM
+1 for the PR. Thanks, @vladhlinsky.
Which version of spark-atlas-connector is used?
What version does spark-atlas-connector use?
I think Cloudera has not open-sourced it yet and is completely hiding the implementation of the column lineage harvester!
Maybe. How about Apache?





What changes were proposed in this pull request?
Create the spark_column_lineage type and relationship definition to add support for column-level lineage for CREATE TABLE AS SELECT ... statements and views. Column-level lineage refers to lineage created between the input and output columns.
For example, for a query that creates employee_ctas from employee, lineage is created from employee to employee_ctas, and also from employee.id to employee_ctas.id.
How was this patch tested?
Manually, using a modified version of Spark Atlas Connector:
1100-spark_model.json is updated with the proposed changes.
Atlas is restarted.
spark_column_lineage entity is created.
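To make the difference between table-level and column-level lineage concrete, here is a minimal Python sketch. It is not how Atlas or the Spark Atlas Connector represents lineage. The function name ctas_lineage is hypothetical, and it assumes a CREATE TABLE AS SELECT that copies each selected column under the same name.

```python
from typing import List, Tuple

Edge = Tuple[str, str]  # (source, target)


def ctas_lineage(source_table: str, target_table: str,
                 columns: List[str]) -> Tuple[Edge, List[Edge]]:
    """Return the table-level edge and the column-level edges for a
    hypothetical CTAS that selects `columns` unchanged from `source_table`."""
    # Table-level lineage: one edge from the input table to the output table.
    table_edge = (source_table, target_table)
    # Column-level lineage: one edge per selected column, input to output.
    column_edges = [
        (f"{source_table}.{col}", f"{target_table}.{col}") for col in columns
    ]
    return table_edge, column_edges


# Names taken from the PR description's example (employee, employee_ctas, id).
table_edge, column_edges = ctas_lineage("employee", "employee_ctas", ["id"])
print(table_edge)
print(column_edges)
```

In this simplified model, the proposed spark_column_lineage entity would correspond to one of the column-level edges, while the existing process-level lineage corresponds to the table-level edge.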