Array/Map return types for UDF do not work correctly. #112
Comments
Looks like the array return type has a bug. I will fix it asap. I have a local fix working as follows:

    var udf = Udf<string, string[]>((str) => new[] { str, str + str });
    df.Select(Explode(udf(df["name"]))).Show();

The original table:
After exploding:
|
A few more related questions.
|
No, not at the moment. However, we'd like to understand the use case. Can you explain the scenario where you want this? (A sample scenario with some snippets would be best.)
From what I understand, you want to iterate over the result set? If so, have you considered using ToLocalIterator, which returns an IEnumerable? |
I think @guruvonline meant having IEnumerable as the return type of a UDF. Yes, this will be supported: var udf = Udf<string, IEnumerable<string>>((str) => new[] { str, str + str }); |
I have added a new feature request with an example scenario. |
I also get this error, and the workaround doesn't work.
I always get the following error stack:
I looked at the source code, [DataType.scala](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala). I am totally new to Scala, and I can't understand why "array" falls into the wrong switch-case branch.
Is there any workaround for me? If possible, I can modify Microsoft.Spark locally to make it work. |
Finally, I understand the Scala code, and I have fixed it locally in Microsoft.Spark. I will send a PR later. |
@danny8002 there is already a PR for this: #114. |
I have a scenario with structured streaming input where, for each event/row, I have to apply custom logic (a function) that can return multiple rows.
It looks like UDF return types support only basic types, not lists/arrays.
Is there any workaround for this?
As a sample, my UDF is something like the one below, so that I can explode the result to create multiple rows.
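```csharp
using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

// Returns the input and its upper-cased form so the result can be exploded into multiple rows.
Func<Column, Column> ToUpperList = Udf<string, string[]>((arg) =>
{
    var ret = new string[] { arg, arg.ToUpper() };
    return ret;
});
```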