UDF lineage? #181
Current support for Scala UDFs in Spline

Scala UDFs are now captured and stored as expressions, and their inputs and outputs appear to be correct. Few details about the function itself are recorded, however: for example, a UDF that consists of several expressions is represented as a single UDF expression. From a lineage point of view this is still a lot of detail, since even expression-level lineage is provided. When the UDF is a pure function (it uses only its inputs to compute its output), everything seems to work already. The problem arises when the UDF reads other, external data: that data source is not captured.

Adding additional lineage info

Post Processing Filters seem to be a good solution to this problem. A Scala UDF can be selected by name using the information that is already captured. An annotation or some other way of marking the function seems unnecessary, since the name already identifies it. If marking is needed after all, a wrapping function could be created that carries the additional info for Spline and simply delegates the call to the wrapped UDF.

Since the function can be found inside a filter, the filter can add further expressions and modify the captured one to better represent the actual lineage. Adding another data source may be more complicated, because Spline expects all expression data to come from operations, but it can still be done via a filter. (It would help here to have an example of what kind of external data is used.)

We could create filters that perform the common tasks for UDFs and leave users to add whatever additional info they want to provide. Each such filter is only useful for its intended use case, though, whereas the general filter that is already available can be modified by each user as they please.
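To make the filter idea more concrete, here is a minimal sketch of selecting a UDF by name in an expression tree and attaching the external data sources it reads. It deliberately does not use Spline's real classes or filter interface: `Expr`, `UdfCall`, `Generic`, `AttrRef`, `annotate`, the `externalSources` key, the UDF name `enrichCountry` and the example URI are all hypothetical names invented for illustration. A real filter would have to be written against the agent's actual post-processing API.

```scala
// Hypothetical, simplified model of captured expressions.
// These are NOT Spline's actual classes; they only illustrate the idea.
object UdfLineageSketch {

  sealed trait Expr
  final case class AttrRef(name: String) extends Expr
  final case class Generic(name: String, children: Seq[Expr]) extends Expr
  final case class UdfCall(
      name: String,
      children: Seq[Expr],
      extra: Map[String, Seq[String]] = Map.empty // additional lineage info
  ) extends Expr

  // User-supplied declaration: UDF name -> external sources that UDF reads.
  type ExternalSources = Map[String, Seq[String]]

  // Walks the tree; every UDF call whose name is declared in `sources`
  // gets the declared URIs attached under the (assumed) key "externalSources".
  def annotate(expr: Expr, sources: ExternalSources): Expr = expr match {
    case u: UdfCall =>
      val kids = u.children.map(annotate(_, sources))
      sources.get(u.name) match {
        case Some(uris) =>
          u.copy(children = kids, extra = u.extra + ("externalSources" -> uris))
        case None =>
          u.copy(children = kids)
      }
    case Generic(name, children) =>
      Generic(name, children.map(annotate(_, sources)))
    case a: AttrRef =>
      a
  }

  def main(args: Array[String]): Unit = {
    val captured =
      Generic("alias", Seq(UdfCall("enrichCountry", Seq(AttrRef("country_code")))))
    val declared: ExternalSources =
      Map("enrichCountry" -> Seq("jdbc:postgresql://example-host/reference"))
    println(annotate(captured, declared))
  }
}
```

Keying the declaration on the UDF name mirrors the observation above that the name is already a sufficient identifier, so no annotation on the function itself is needed. UDFs that are not declared pass through unchanged.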
To be investigated.
Originally posted by @vidma in AbsaOSS/spline#114 (comment)