Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-43739][BUILD] Upgrade commons-io to 2.12.0 #41271

Closed
wants to merge 5 commits into from

Conversation

panbingkun
Copy link
Contributor

@panbingkun panbingkun commented May 23, 2023

What changes were proposed in this pull request?

The pr aims to upgrade common-io from 2.11.0 to 2.12.0.

Why are the changes needed?

common-io new version includes some improvement & bug fixed, eg

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA.

@github-actions github-actions bot added the BUILD label May 23, 2023
@LuciferYang
Copy link
Contributor

Please don't upgrade commons-crypto, there is an existing issue for Apple Silicon.

#40082 (comment)

@panbingkun
Copy link
Contributor Author

#40082 (comment)

Ok, Let me revert it.

@panbingkun panbingkun changed the title [SPARK-43739][BUILD] Upgrade commons-io & commons-crypto to newest version [SPARK-43739][BUILD] Upgrade commons-io to 2.12.0 May 23, 2023
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check the avro failures and re-trigger the failed pipeline?

[info] *** 2 TESTS FAILED ***
[error] Failed: Total 294, Failed 2, Errors 0, Passed 292, Ignored 2
[error] Failed tests:
[error] 	org.apache.spark.sql.avro.AvroV1Suite
[error] 	org.apache.spark.sql.avro.AvroV2Suite

@panbingkun
Copy link
Contributor Author

Could you check the avro failures and re-trigger the failed pipeline?


[info] *** 2 TESTS FAILED ***

[error] Failed: Total 294, Failed 2, Errors 0, Passed 292, Ignored 2

[error] Failed tests:

[error] 	org.apache.spark.sql.avro.AvroV1Suite

[error] 	org.apache.spark.sql.avro.AvroV2Suite

OK,let me check it

@@ -1349,7 +1349,7 @@ abstract class AvroSuite
spark.read.format("avro").load("*/*/*/*/*/*/*/something.avro")
}

intercept[FileNotFoundException] {
intercept[IOException] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a breaking change because this is a kind of information loss.
Could you mention this to the PR description?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what is the background of this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not merge it until I have investigated it thoroughly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@LuciferYang LuciferYang May 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
intercept[IOException] {
intercept[java.nio.file.NoSuchFileException] {

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@panbingkun
Copy link
Contributor Author

panbingkun commented May 23, 2023

  • 2.11.0 FileUtils.touch, Call stack as follows
    1.FileUtils.touch
    2.FileUtils.setLastModified
    3.File.setLastModified(timeMillis)
    4.FileSystem.setLastModifiedTime(this, time)
    5.UnixFileSystem.setLastModifiedTime --> Found that the file does not exist, throw FileNotFoundException
image
  • 2.12.0 FileUtils.touch, Call stack as follows
    1.FileUtils.touch
    2.PathUtils.touch
    3.java.nio.file.Files.createFile
    4.java.nio.file.Files.newByteChannel
    5.sun.nio.fs.UnixFileSystemProvider.newByteChannel
    6.sun.nio.fs.UnixChannelFactory.newFileChannel
    7.sun.nio.fs.UnixChannelFactory.open
    8.UnixNativeDispatcher.open
    9.(Native)UnixNativeDispatcher.open0 -> throw UnixException
    10.UnixException.rethrowAsIOException
    10.UnixException.translateToIOException --> Convert to java.nio.file.NoSuchFileException throw it
image image

@panbingkun panbingkun changed the title [SPARK-43739][BUILD] Upgrade commons-io to 2.12.0 [WIP][SPARK-43739][BUILD] Upgrade commons-io to 2.12.0 May 23, 2023
@dongjoon-hyun
Copy link
Member

  1. Please put the analysis to the PR description.
  2. +1 for NoSuchFileException because it's better than a general IOException.

@dongjoon-hyun dongjoon-hyun marked this pull request as draft May 23, 2023 15:40
@panbingkun panbingkun marked this pull request as ready for review May 24, 2023 01:29
@panbingkun
Copy link
Contributor Author

  1. Please put the analysis to the PR description.
  2. +1 for NoSuchFileException because it's better than a general IOException.

This is done.

@panbingkun
Copy link
Contributor Author

panbingkun commented May 24, 2023

@dongjoon-hyun @HyukjinKwon @LuciferYang
In fact, there is a problem with this UT. Its original intention was to test if there is no file with .avro extensions in the directory, and the read should fail. However, this UT triggered the error as FileUtils.touch instead of spark.read.format("avro").load(dir.toString).

After this PR is completed, I will submit a new PR to fix this issue.

The root cause for the failure of this case is that the parent directory was not created. When FileUtils.touch is called in version 1.11.0, it just throws java.io.FileNotFoundException, which covers the error.

@panbingkun panbingkun changed the title [WIP][SPARK-43739][BUILD] Upgrade commons-io to 2.12.0 [SPARK-43739][BUILD] Upgrade commons-io to 2.12.0 May 24, 2023
dongjoon-hyun pushed a commit that referenced this pull request May 24, 2023
…lid path throws exception'

### What changes were proposed in this pull request?
The pr aims to fix bug in AvroSuite for 'reading from invalid path throws exception'.

### Why are the changes needed?
- As discussed and analyzed in [41271](#41271 (comment))

- There is a problem with this UT. Its original intention was to test if there is no file with .avro extensions in the directory, and the read should fail. However, this UT triggered the error as FileUtils.touch instead of spark.read.format("avro").load(dir.toString).The root cause for the failure of this case is that the parent directory was not created. When FileUtils.touch is called in version 1.11.0, it just throws java.io.FileNotFoundException, which covers the error.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

Closes #41289 from panbingkun/SPARK-43767.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I merged #41289 . Please rebase this PR to the master branch.

@panbingkun
Copy link
Contributor Author

I merged #41289 . Please rebase this PR to the master branch.

This is done.

@LuciferYang
Copy link
Contributor

LuciferYang commented May 24, 2023

2.Note: The error exception of the FileUtils.touch method has been changed from java.io.FileNotFoundException to java.nio.file.NoSuchFileException, changes from PR: [Add PathUtils.touch(Path)](https://github.com/apache/commons-io/commit/fd7c8182d2117d01f43ccc9fe939105f834ba672). The analysis process is as follows:

still need this part in description? @panbingkun

@dongjoon-hyun
Copy link
Member

Right, @LuciferYang . We can remove that from this PR description.

@panbingkun
Copy link
Contributor Author

Right, @LuciferYang . We can remove that from this PR description.

Ok, Let me update it.

@dongjoon-hyun
Copy link
Member

Thank you, @panbingkun and all! Merged to master.

czxm pushed a commit to czxm/spark that referenced this pull request Jun 12, 2023
…lid path throws exception'

### What changes were proposed in this pull request?
The pr aims to fix bug in AvroSuite for 'reading from invalid path throws exception'.

### Why are the changes needed?
- As discussed and analyzed in [41271](apache#41271 (comment))

- There is a problem with this UT. Its original intention was to test if there is no file with .avro extensions in the directory, and the read should fail. However, this UT triggered the error as FileUtils.touch instead of spark.read.format("avro").load(dir.toString).The root cause for the failure of this case is that the parent directory was not created. When FileUtils.touch is called in version 1.11.0, it just throws java.io.FileNotFoundException, which covers the error.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

Closes apache#41289 from panbingkun/SPARK-43767.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
czxm pushed a commit to czxm/spark that referenced this pull request Jun 12, 2023
### What changes were proposed in this pull request?
The pr aims to upgrade common-io from 2.11.0 to 2.12.0.

### Why are the changes needed?
common-io new version includes some improvement & bug fixed, eg
- apache/commons-io#450
- apache/commons-io#368
- [Add PathUtils.touch(Path)](apache/commons-io@fd7c818) The error exception of the FileUtils.touch method has been changed from `java.io.FileNotFoundException` to `java.nio.file.NoSuchFileException`
- common-io 2.11.0 VS 2.12.0
apache/commons-io@rel/commons-io-2.11.0...rel/commons-io-2.12.0

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

Closes apache#41271 from panbingkun/SPARK-43739.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants