Writing Delta in Spark: when Spline parses the Spark lineage, an error is reported #609
Can you post the whole exception? Preferably as text.
ok
From the log it seems this is caused by loading some class. Can you try to run the same thing without the Spline Agent? I would also check that the proper Delta version is in use: https://docs.delta.io/latest/releases.html
Yes, thanks.
@cerveada My Spark task failed, but the lineage still comes out. Is this normal?
Yes, Spline is able to capture the lineage of both successful and failed jobs.
@cerveada So I have another question, about the lineage sent to Kafka:
There could be some issue with the new YAML config. I will check that. As a workaround, you can set the same property via the spark-submit configuration; that should fix the issue for you.
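For reference, a hypothetical spark-submit invocation passing Spline settings as `--conf` options instead of YAML. The exact property from the discussion above is not shown in the thread; the names below (the `spark.spline.lineageDispatcher` family) follow the spline-spark-agent README, and the topic name, broker address, and job file are placeholders — verify everything against the agent version you use:

```shell
# Sketch: pass Spline agent settings on the command line instead of YAML.
# Property names follow the spline-spark-agent README; values are placeholders.
spark-submit \
  --conf "spark.spline.lineageDispatcher=kafka" \
  --conf "spark.spline.lineageDispatcher.kafka.topic=spline-lineage" \
  --conf "spark.spline.lineageDispatcher.kafka.producer.bootstrap.servers=broker:9092" \
  my_job.py
```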
Thanks.
All classes in this package represent a lineage that the agent captured. This is an internal model for the agent. It will later be transformed into a similar "output" model defined by the producer API. As the last step, the "output" model is converted to JSON and sent to the server.
Like this JSON — can you help me find out what the "expressions" model means? By the way, can spline-spark-agent send the parsed lineage to a third-party system, like DataHub?

```json
"expressions": {
  "constants": [
    {
      "id": "expr-0",
      "dataType": "75fe27b9-9a00-5c7d-966f-33ba32333133",
      "extra": {
        "simpleClassName": "Literal",
        "_typeHint": "expr.Literal"
      },
      "value": 1
    },
    {
      "id": "expr-1",
      "dataType": "75fe27b9-9a00-5c7d-966f-33ba32333133",
      "extra": {
        "simpleClassName": "Literal",
        "_typeHint": "expr.Literal"
      },
      "value": 1
    },
    {
      "id": "expr-2",
      "dataType": "75fe27b9-9a00-5c7d-966f-33ba32333133",
      "extra": {
        "simpleClassName": "Literal",
        "_typeHint": "expr.Literal"
      },
      "value": 1
    },
    {
      "id": "expr-3",
      "dataType": "75fe27b9-9a00-5c7d-966f-33ba32333133",
      "extra": {
        "simpleClassName": "Literal",
        "_typeHint": "expr.Literal"
      },
      "value": 1
    }
  ]
}
```
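As a rough illustration of how a downstream consumer might read the "constants" section of such a payload — a minimal Python sketch, not part of Spline itself, with the payload abbreviated to a single constant:

```python
import json

# A fragment of the lineage JSON from the comment above, wrapped in outer
# braces to make it a standalone document and abbreviated to one constant.
payload = """
{
  "expressions": {
    "constants": [
      {
        "id": "expr-0",
        "dataType": "75fe27b9-9a00-5c7d-966f-33ba32333133",
        "extra": {"simpleClassName": "Literal", "_typeHint": "expr.Literal"},
        "value": 1
      }
    ]
  }
}
"""

doc = json.loads(payload)
# Each entry in "constants" describes a literal value used in the Spark plan:
# "id" is referenced from other parts of the plan, "dataType" points to a type
# definition elsewhere in the execution plan, and "value" is the literal itself.
for const in doc["expressions"]["constants"]:
    print(const["id"], const["value"])  # expr-0 1
```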
https://firststr.com/2021/04/26/spark-compute-lineage-to-datahub/
Maybe this discussion will help #563
Yes, check the documentation: https://github.com/AbsaOSS/spline-spark-agent#dispatchers
We don't provide any support for external servers other than Spline, but using the dispatchers should allow you to send the data wherever you want. You can always create a custom dispatcher if the provided ones are not enough.
Thank you very much for your reply.
In Kafka, each lineage process receives 2 messages. Is there any concurrency or ordering problem when dealing with the executionPlan and the executionEvent? I mean, if I parse the executionEvent, can I only confirm from the executionPlan whether the Spark task was successful?
I was wrong; it is necessary to judge whether the task succeeded by checking whether there is error information in the executionEvent.
That's right. Any non-null value in the "error" property means there was an error.
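That check can be sketched in a few lines of Python (the field names here assume an executionEvent deserialized into a plain dict; `planId` is only a hypothetical illustrative field):

```python
def is_failed(execution_event: dict) -> bool:
    """Return True if the executionEvent reports a failure.

    Per the discussion above, any non-null "error" property means
    the Spark job failed; a missing or null "error" means success.
    """
    return execution_event.get("error") is not None

print(is_failed({"planId": "abc", "error": None}))                 # False
print(is_failed({"planId": "abc", "error": {"message": "boom"}}))  # True
```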
@cerveada Hi, I have another question.
Spark on YARN:
What Spark and Yarn versions are you using? I'm closing this issue now, as it turned into a thread where different kinds of issues are discussed.
@wajda Spark 3.1.2 and Yarn 3.1
Submit code:
This code:
Who can help me?