-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-22635][SQL][ORC] FileNotFoundException while reading ORC files containing special characters #19844
Conversation
… containing special characters
@dongjoon-hyun @gatorsmile you helped reviewing SPARK-22146. Might you please help reviewing this too? Sorry for pinging you directly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -59,8 +59,9 @@ class OrcFileFormat extends FileFormat with DataSourceRegister with Serializable | |||
sparkSession: SparkSession, | |||
options: Map[String, String], | |||
files: Seq[FileStatus]): Option[StructType] = { | |||
val fileNames = files.map(_.getPath.toString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a big deal but just curious - why did you name it fileNames
BTW? I thought it's going to be just like paths
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because they are String
and not Path
s. Actually this change is not needed, I changed it only because it was easier for me while debugging. I can put it back as it was before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean it's not the name of a file though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I reverted the change, thanks.
Test build #84292 has finished for PR 19844 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM.
Thank you, @mgaido91 .
Test build #84304 has finished for PR 19844 at commit
|
ping, @cloud-fan , too. |
Merged to master. |
thanks @HyukjinKwon , may I kindly ask you to backport also to branch-2.2? Thanks. |
I think SPARK-22146 is not backported though? |
Ah, it was mistake of fixed version in the JIRA. Sure, will push it too. |
… containing special characters ## What changes were proposed in this pull request? SPARK-22146 fix the FileNotFoundException issue only for the `inferSchema` method, ie. only for the schema inference, but it doesn't fix the problem when actually reading the data. Thus nearly the same exception happens when someone tries to use the data. This PR covers fixing the problem also there. ## How was this patch tested? enhanced UT Author: Marco Gaido <mgaido@hortonworks.com> Closes #19844 from mgaido91/SPARK-22635.
Merged to branch-2.2 too. |
thank you @HyukjinKwon |
… containing special characters ## What changes were proposed in this pull request? SPARK-22146 fix the FileNotFoundException issue only for the `inferSchema` method, ie. only for the schema inference, but it doesn't fix the problem when actually reading the data. Thus nearly the same exception happens when someone tries to use the data. This PR covers fixing the problem also there. ## How was this patch tested? enhanced UT Author: Marco Gaido <mgaido@hortonworks.com> Closes apache#19844 from mgaido91/SPARK-22635.
What changes were proposed in this pull request?
SPARK-22146 fix the FileNotFoundException issue only for the
inferSchema
method, ie. only for the schema inference, but it doesn't fix the problem when actually reading the data. Thus nearly the same exception happens when someone tries to use the data. This PR covers fixing the problem also there.How was this patch tested?
enhanced UT