Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-45338][CORE][SQL] Replace scala.collection.JavaConverters to scala.jdk.CollectionConverters #43126

Closed
wants to merge 5 commits into from

Conversation

Hisoka-X
Copy link
Member

What changes were proposed in this pull request?

Since scala 2.13.0, scala.collection.JavaConverters mark as deprecated. So this PR change all scala.collection.JavaConverters to scala.jdk.CollectionConverters.

Why are the changes needed?

Remove deprecated scala.collection.JavaConverters.

Does this PR introduce any user-facing change?

No

How was this patch tested?

passed CI

Was this patch authored or co-authored using generative AI tooling?

No

@Hisoka-X
Copy link
Member Author

cc @dongjoon-hyun @LuciferYang

@LuciferYang
Copy link
Contributor

also cc @srowen

@LuciferYang
Copy link
Contributor

spark/scalastyle-config.xml

Lines 254 to 259 in 2aa06fc

<!-- As of SPARK-9613 JavaConversions should be replaced with JavaConverters -->
<check customId="javaconversions" level="error" class="org.scalastyle.scalariform.TokenChecker" enabled="true">
<parameters><parameter name="regex">JavaConversions</parameter></parameters>
<customMessage>Instead of importing implicits in scala.collection.JavaConversions._, import
scala.collection.JavaConverters._ and use .asScala / .asJava methods</customMessage>
</check>

Will this check rule be fixed in another pr? Or fixed together in this one?

@Hisoka-X
Copy link
Member Author

spark/scalastyle-config.xml

Lines 254 to 259 in 2aa06fc

<!-- As of SPARK-9613 JavaConversions should be replaced with JavaConverters -->
<check customId="javaconversions" level="error" class="org.scalastyle.scalariform.TokenChecker" enabled="true">
<parameters><parameter name="regex">JavaConversions</parameter></parameters>
<customMessage>Instead of importing implicits in scala.collection.JavaConversions._, import
scala.collection.JavaConverters._ and use .asScala / .asJava methods</customMessage>
</check>

Will this check rule be fixed in another pr? Or fixed together in this one?

Let me change it in this PR.

@LuciferYang
Copy link
Contributor

I move the parent to SPARK-45314

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the change involves many files, this is fine to me.

@srowen srowen closed this in b7763a7 Sep 27, 2023
@srowen
Copy link
Member

srowen commented Sep 27, 2023

Merged to master

@Hisoka-X
Copy link
Member Author

Thanks @LuciferYang @srowen

@Hisoka-X Hisoka-X deleted the replace-java-converter branch September 27, 2023 13:38
@@ -37,7 +37,7 @@ private[python] class GaussianMixtureModelWrapper(model: GaussianMixtureModel) {
val modelGaussians = model.gaussians.map { gaussian =>
Array[Any](gaussian.mu, gaussian.sigma)
}
SerDe.dumps(JavaConverters.seqAsJavaListConverter(modelGaussians).asJava)
SerDe.dumps(modelGaussians.toSeq.asJava)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hisoka-X

On second thought, maybe we should replace modelGaussians.toSeq.asJava with immutable.ArraySeq.unsafeWrapArray(modelGaussians).asJava here, as modelGaussians.toSeq would result in an extra Array.copyOf, which is more memory-consuming.

WDYT @srowen

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about we should do this, if we do that, should we change other toSeq? Which toSeq need to be modified and which ones do not? Are there any rules?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to optimize the several cases involved in this PR at #43172 first. For other similar issues, we can fix them together in the future. The current version has suppressed many compilation warnings, and if my memory serves me right, this includes parts related to this case.

@@ -87,7 +87,7 @@ private[hive] class SparkGetColumnsOperation(
}.toMap

if (isAuthV2Enabled) {
val privObjs = seqAsJavaListConverter(getPrivObjs(db2Tabs)).asJava
val privObjs = getPrivObjs(db2Tabs).toSeq.asJava
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this .toSeq is redundant

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me fix it. Thanks for review again!

@@ -68,7 +68,7 @@ private[hive] class SparkGetFunctionsOperation(
if (isAuthV2Enabled) {
// authorize this call on the schema objects
val privObjs =
HivePrivilegeObjectUtils.getHivePrivDbObjects(seqAsJavaListConverter(matchingDbs).asJava)
HivePrivilegeObjectUtils.getHivePrivDbObjects(matchingDbs.toSeq.asJava)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this .toSeq is redundant

srowen pushed a commit that referenced this pull request Sep 29, 2023
### What changes were proposed in this pull request?
This is a follow up PR for #43126 , remove useless  invoke `toSeq`

### Why are the changes needed?
Remove useless convert.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
exist test

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43172 from Hisoka-X/SPARK-45338-followup-remove-toseq.

Authored-by: Jia Fan <fanjiaeminem@qq.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
dongjoon-hyun pushed a commit that referenced this pull request Oct 1, 2023
…WrapArray` instead of `array.toSeq`

### What changes were proposed in this pull request?
In #43126, there is a change from `JavaConverters.seqAsJavaListConverter(array).asJava` to `array.toSeq.asJava`. Since `array.toSeq` will involve an extra `Array.copyOf`, and the involved scenario in the subsequent processing is to directly convert the result of `array.toSeq.asJava` into an `Array[Byte]` result without the possibility of modifying the array content, so this pr replaces `array.toSeq` with `immutable.ArraySeq.unsafeWrapArray(array)` to avoid the extra `Array.copyOf`.

### Why are the changes needed?
In the current scenario, there is no risk of array data being modified, and `immutable.ArraySeq.unsafeWrapArray(array)` compared to `array.toSeq` avoids an extra `Array.copyOf`, bringing certain performance benefits and reducing memory usage.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43171 from LuciferYang/SPARK-45338-FOLLOWUP.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
@beliefer
Copy link
Contributor

beliefer commented Oct 8, 2023

@Hisoka-X Thank you! I got the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants