Skip to content

[SPARK-29080][CORE][SPARKR] Support R file extension case-insensitively#25778

Closed
Loquats wants to merge 4 commits intoapache:masterfrom
Loquats:r-case
Closed

[SPARK-29080][CORE][SPARKR] Support R file extension case-insensitively#25778
Loquats wants to merge 4 commits intoapache:masterfrom
Loquats:r-case

Conversation

@Loquats
Copy link
Contributor

@Loquats Loquats commented Sep 12, 2019

What changes were proposed in this pull request?

Make r file extension check case insensitive for spark-submit.

Why are the changes needed?

spark-submit does not accept .r files as R scripts. Some codebases have r files that end with lowercase file extensions. It is inconvenient to use spark-submit with lowercase extension R files. The error is not very clear (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L232).

$ ./bin/spark-submit examples/src/main/r/dataframe.r
Exception in thread "main" org.apache.spark.SparkException: Cannot load main class from JAR file:/Users/dongjoon/APACHE/spark-release/spark-2.4.4-bin-hadoop2.7/examples/src/main/r/dataframe.r

Does this PR introduce any user-facing change?

Yes. spark-submit can now be used to run R scripts with .r file extension.

How was this patch tested?

Manual.

$ mv examples/src/main/r/dataframe.R examples/src/main/r/dataframe.r
$ ./bin/spark-submit examples/src/main/r/dataframe.r

@dongjoon-hyun dongjoon-hyun changed the title [WIP] Make r file extension check case insensitive [WIP][CORE][SPARKR] Make r file extension check case insensitive Sep 12, 2019
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check the coding style quickly, you can use dev/scalastyle in your PR directory. Please try to run it and fix the errors. Thanks.

@Loquats
Copy link
Contributor Author

Loquats commented Sep 13, 2019

@dongjoon-hyun Thanks for the tip. I ran dev/scalastyle and fixed the scalastyle errors. Also tried testing this in build/sbt with:

project core
test
project launcher
test

The tests for these projects pass.
I'm not sure what project the resource-manager files are part of, but I ran the following and they also pass:

testOnly *ClientSuite
testOnly *ApplicationMasterSuite

Please let me know if there is anything else I should test.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Sep 13, 2019

Thank you for update. Let's trigger a full test.

@dongjoon-hyun
Copy link
Member

ok to test

@dongjoon-hyun dongjoon-hyun changed the title [WIP][CORE][SPARKR] Make r file extension check case insensitive [SPARK-29080][CORE][SPARKR] Support RMake r file extension check case insensitive Sep 13, 2019
@dongjoon-hyun
Copy link
Member

@Loquats . BTW, do you have Apache JIRA account? You can create an Apache Spark JIRA issue there. Please file a JIRA issue and use the JIRA ID in the PR title next time.

For this issue, I created one for you and updated this PR title.

@Loquats
Copy link
Contributor Author

Loquats commented Sep 13, 2019

@dongjoon-hyun Thanks for creating the JIRA issue. I just created an Apache JIRA account, and I will remember to create a JIRA issue next time. My username is andyzhang.

@SparkQA
Copy link

SparkQA commented Sep 14, 2019

Test build #110575 has finished for PR 25778 at commit 623d15f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Thank you for informing your JIRA id. When this PR is merged, your ID will be added to Apache Spark contributor group and SPARK-29080 will be assigned to you.

@dongjoon-hyun dongjoon-hyun self-requested a review September 14, 2019 04:39
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29080][CORE][SPARKR] Support RMake r file extension check case insensitive [SPARK-29080][CORE][SPARKR] Support R file extension case-insensitively Sep 14, 2019
@SparkQA
Copy link

SparkQA commented Sep 15, 2019

Test build #110601 has finished for PR 25778 at commit 8075892.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Sep 15, 2019

I updated the PR description for the others because this will become a commit log.
Thank you for your first contribution, @Loquats . And, thank you for review and approval, @felixcheung .
Merged to master.

@dongjoon-hyun
Copy link
Member

You are added to the Apache Spark contributor group and SPARK-29080 is assigned to you.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm too

@Loquats
Copy link
Contributor Author

Loquats commented Sep 16, 2019

Thanks all for reviewing this PR!

Gaurangi94 pushed a commit to Gaurangi94/spark that referenced this pull request Sep 2, 2020
### What changes were proposed in this pull request?

Make r file extension check case insensitive for spark-submit.

### Why are the changes needed?

spark-submit does not accept `.r` files as R scripts. Some codebases have r files that end with lowercase file extensions. It is inconvenient to use spark-submit with lowercase extension R files. The error is not very clear (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L232).

```
$ ./bin/spark-submit examples/src/main/r/dataframe.r
Exception in thread "main" org.apache.spark.SparkException: Cannot load main class from JAR file:/Users/dongjoon/APACHE/spark-release/spark-2.4.4-bin-hadoop2.7/examples/src/main/r/dataframe.r
```

### Does this PR introduce any user-facing change?

Yes. spark-submit can now be used to run R scripts with `.r` file extension.

### How was this patch tested?

Manual.

```
$ mv examples/src/main/r/dataframe.R examples/src/main/r/dataframe.r
$ ./bin/spark-submit examples/src/main/r/dataframe.r
```

Closes apache#25778 from Loquats/r-case.

Authored-by: Andy Zhang <yue.zhang@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Change-Id: I57ca6068c04ac3cdfd8606dbb48d5192bff1749d
Reviewed-on: https://bigdataoss-internal-review.googlesource.com/c/third_party/apache/spark/+/17710
Reviewed-by: Igor Dvorzhak <idv@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants