[SPARK-46586][SQL] Support s.c.immutable.ArraySeq as customCollectionCls in MapObjects #44591

Closed
wants to merge 4 commits into from

Conversation

panbingkun
Contributor

@panbingkun panbingkun commented Jan 4, 2024

What changes were proposed in this pull request?

This PR adds support for s.c.immutable.ArraySeq (scala.collection.immutable.ArraySeq) as customCollectionCls in MapObjects.

Why are the changes needed?

Because s.c.immutable.ArraySeq is a commonly used type in Scala 2.13, we should support it.

Does this PR introduce any user-facing change?

Yes. MapObjects now supports s.c.immutable.ArraySeq.

How was this patch tested?

  • Added a new unit test for ArraySeq in UDFSuite, and updated ObjectExpressionsSuite to cover MapObjects.
  • Passes GitHub Actions (GA).

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Jan 4, 2024
panbingkun and others added 2 commits January 4, 2024 16:35
@LuciferYang
Contributor

GA has passed. Could you please update the PR description? @panbingkun

@panbingkun
Contributor Author

GA has passed. Could you please update the PR description? @panbingkun

Done.

@panbingkun panbingkun marked this pull request as ready for review January 5, 2024 01:38
@LuciferYang
Contributor

cc @cloud-fan @viirya @MaxGekk @dongjoon-hyun FYI

@@ -938,6 +939,14 @@ case class MapObjects private(
          executeFuncOnCollection(input).foreach(builder += _)
          mutable.ArraySeq.make(builder.result())
        }
      case Some(cls) if classOf[immutable.ArraySeq[_]].isAssignableFrom(cls) =>
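For readers unfamiliar with the pattern in this hunk, here is a minimal sketch in plain Scala 2.13 of dispatching on a requested collection class and building the result through an array builder. It is not Spark's implementation: `mapInto` and its parameters are hypothetical names chosen for illustration, and the real `MapObjects` works on Catalyst expressions and generated code rather than on ordinary Scala functions.

```scala
import scala.collection.{immutable, mutable}
import scala.reflect.ClassTag

// Hypothetical helper (not part of Spark): applies `f` to every element of
// `input` and materializes the result in the collection type named by `cls`.
def mapInto[A, B: ClassTag](input: Seq[A], cls: Option[Class[_]])(f: A => B): Any =
  cls match {
    case Some(c) if classOf[mutable.ArraySeq[_]].isAssignableFrom(c) =>
      val builder = mutable.ArrayBuilder.make[B]
      input.foreach(a => builder += f(a))
      mutable.ArraySeq.make(builder.result())
    case Some(c) if classOf[immutable.ArraySeq[_]].isAssignableFrom(c) =>
      val builder = mutable.ArrayBuilder.make[B]
      input.foreach(a => builder += f(a))
      // Wrapping without a copy is safe here because the freshly built
      // array is never shared outside this method.
      immutable.ArraySeq.unsafeWrapArray(builder.result())
    case _ =>
      // Fallback for this sketch only: keep whatever Seq type `map` returns.
      input.map(f)
  }

val result = mapInto(Seq(1, 2, 3), Some(classOf[immutable.ArraySeq[_]]))(_ * 10)
```

The mutable and immutable `ArraySeq` classes are unrelated in the class hierarchy, so each needs its own `isAssignableFrom` check, which is why the hunk adds a separate branch.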
Contributor

Not related to this PR, but this MapObjects expression keeps getting more complicated. Can we rewrite it as a RuntimeReplaceable using StaticInvoke/Invoke, so that we can implement it in pure Scala?

Contributor Author

Okay, let me give it a try.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

@dongjoon-hyun
Member

Merged to master. Thank you, @panbingkun and @cloud-fan.
Please proceed with the rewriting task as a separate JIRA.

Member

@viirya viirya left a comment

Looks good to me.

      Row(ArrayBuffer(100)))

    val myUdf2 = udf((a: immutable.ArraySeq[Int]) =>
      immutable.ArraySeq.unsafeWrapArray[Int](a.appended(5).appended(6).toArray))
Contributor

Super nit: isn't it more common to use :+ to create a new immutable collection with the extra elements appended?
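To illustrate the nit, a small sketch using only the Scala 2.13 standard library (no Spark involved). The value names are made up for this example. In the standard library `:+` is an alias for `appended`, and on an `immutable.ArraySeq` both already return an `immutable.ArraySeq`.

```scala
import scala.collection.immutable

val base = immutable.ArraySeq(1, 2, 3)

// Both forms return a new collection and leave `base` unchanged.
val viaMethod = base.appended(5).appended(6)
val viaOperator = base :+ 5 :+ 6

// The static type is already immutable.ArraySeq[Int], so this assignment
// compiles without converting to an array and wrapping it again.
val typed: immutable.ArraySeq[Int] = viaOperator
```

Under that reading, the UDF body in the test could presumably be shortened to `a :+ 5 :+ 6`, though that is a suggestion based on the types rather than something shown in this thread.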

Contributor Author

Okay, let me improve it.

vinodkc pushed a commit to vinodkc/spark that referenced this pull request Jan 5, 2024
…ionCls` in `MapObjects`

### What changes were proposed in this pull request?
This PR adds support for `s.c.immutable.ArraySeq` (`scala.collection.immutable.ArraySeq`) as `customCollectionCls` in `MapObjects`.

### Why are the changes needed?
Because `s.c.immutable.ArraySeq` is a commonly used type in Scala 2.13, we should support it.

### Does this PR introduce _any_ user-facing change?
Yes. `MapObjects` now supports `s.c.immutable.ArraySeq`.

### How was this patch tested?
- Added a new unit test for `ArraySeq` in `UDFSuite`, and updated `ObjectExpressionsSuite` to cover `MapObjects`.
- Passes GitHub Actions (GA).

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#44591 from panbingkun/SPARK-46586.

Lead-authored-by: panbingkun <panbingkun@baidu.com>
Co-authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
LuciferYang pushed a commit that referenced this pull request Jan 12, 2024
### What changes were proposed in this pull request?
This PR is a follow-up to #44591.

### Why are the changes needed?
To fix the issues raised in the review comments.
<img width="908" alt="image" src="https://github.com/apache/spark/assets/15246973/deec66a1-1bd1-44bc-9c16-f1fe8bbb14e8">

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Passes GitHub Actions (GA).
- Manual testing.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #44702 from panbingkun/SPARK-46586_fix.

Authored-by: panbingkun <panbingkun@baidu.com>
Signed-off-by: yangjie01 <yangjie01@baidu.com>