[SPARK-19260] Spaces or "%20" in path parameter are not correctly handled with… #16614
.map { d => Utils.resolveURI(d).toString }
.getOrElse(DEFAULT_LOG_DIR)

private val logDir = URLDecoder.decode(logURIString, "UTF-8")
Hm, this is either a local path, in which case it shouldn't be URI-escaped, or it's a URI, in which case it can be parsed with java.net.URI, right? I think we take the latter approach in most other places in the code.
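To illustrate why URL-decoding is not a safe substitute for URI parsing, here is a minimal sketch. It is not code from this PR; the object name `DecodeVsUriSketch` and the sample path are made up for illustration. `URLDecoder` implements form decoding, so it also turns `+` into a space, while `java.net.URI.getPath` only undoes percent-escapes:

```scala
import java.net.{URI, URLDecoder}

object DecodeVsUriSketch {
  // Hypothetical sample: an escaped file URI whose path contains a literal '+'.
  val escaped = "file:/logs/a%20b+c"

  /** Form-style decoding: "%20" becomes a space, but so does '+'. */
  def viaUrlDecoder(s: String): String = URLDecoder.decode(s, "UTF-8")

  /** URI parsing: only percent-escapes in the path are decoded. */
  def viaUri(s: String): String = new URI(s).getPath

  def main(args: Array[String]): Unit = {
    println(viaUrlDecoder(escaped)) // file:/logs/a b c
    println(viaUri(escaped))        // /logs/a b+c
  }
}
```

A directory whose real name contains `+` or a literal `%20` would therefore be mangled by the decoder, which is one reason to prefer the URI route.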
Yes, you are right. I think the logic in Utils.resolveURI(path: String) is not correct. When the path is "hdfs://namenode:9000/a b", "new URI(path)" throws a URISyntaxException, and resolveURI falls back to returning a URI with the "file" scheme instead of the "hdfs" scheme.
To avoid this, maybe we can use Path.getFileSystem to get the right filesystem. Thank you.
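The fallback described above can be reproduced with the JDK alone. The following is a simplified stand-in, not Spark's actual `Utils.resolveURI`; the names `UriFallbackSketch`, `naiveResolve`, and `quoted` are hypothetical. It also shows that the multi-argument `java.net.URI` constructor escapes the space instead of throwing:

```scala
import java.io.File
import java.net.{URI, URISyntaxException}

object UriFallbackSketch {
  /** Simplified "parse as URI, else treat as local file" resolver. */
  def naiveResolve(path: String): URI =
    try {
      val uri = new URI(path)
      if (uri.getScheme != null) uri
      else new File(path).getAbsoluteFile.toURI
    } catch {
      // An unescaped space makes the single-argument constructor throw,
      // so the whole string is reinterpreted as a local file name.
      case _: URISyntaxException => new File(path).getAbsoluteFile.toURI
    }

  /** The five-argument constructor quotes illegal characters itself. */
  def quoted(scheme: String, authority: String, path: String): URI =
    new URI(scheme, authority, path, null, null)

  def main(args: Array[String]): Unit = {
    println(naiveResolve("hdfs://namenode:9000/a").getScheme)   // hdfs
    println(naiveResolve("hdfs://namenode:9000/a b").getScheme) // file
    println(quoted("hdfs", "namenode:9000", "/a b"))            // hdfs://namenode:9000/a%20b
  }
}
```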
Test build #3543 has finished for PR 16614 at commit
…dled with HistoryServer
ebca3ed
to
e5ac5ff
Test build #3549 has finished for PR 16614 at commit
@@ -47,7 +47,7 @@ class FsHistoryProviderSuite extends SparkFunSuite with BeforeAndAfter with Matc
   private var testDir: File = null

   before {
-    testDir = Utils.createTempDir()
+    testDir = Utils.createTempDir(namePrefix = s"a b%20c+d")
Hm, because this affects the whole suite, I wonder if it's more sensible to test this behavior once in a specific test case, rather than test the behavior everywhere. It works, and I don't have a big problem with it, just looks a little surprising to read.
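A rough sketch of what an isolated check for this corner case could look like, using only the JDK rather than Spark's `Utils.createTempDir` or ScalaTest. The object and method names are hypothetical. It creates one directory whose name contains a space, a literal `%20`, and a `+`, then verifies that the directory's URI maps back to the same path:

```scala
import java.io.File
import java.nio.file.{Files, Paths}

object SpecialCharDirSketch {
  /** Creates a temp dir with the given prefix and checks that its URI maps back to it. */
  def roundTrips(prefix: String): Boolean = {
    val dir = Files.createTempDirectory(prefix)
    try {
      val uri = dir.toUri // the space and the '%' are percent-escaped here
      Paths.get(uri) == dir && new File(uri).isDirectory
    } finally {
      Files.delete(dir) // the directory is empty, so a plain delete suffices
    }
  }

  def main(args: Array[String]): Unit =
    println(roundTrips("a b%20c+d"))
}
```

Keeping the unusual name inside a single test like this leaves the rest of the suite on an ordinary temp directory.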
Ping @zuotingbing
@zuotingbing see my last comment. I can fix this separately if you aren't able to follow up.
@srowen Happy Chinese New Year!
OK, it's a little funny to me to not isolate the test for this particular corner case but it's not unreasonable.
It would be great if you could merge this to master and close this PR.
Merged to master
Thanks!
…dled with…
JIRA Issue: https://issues.apache.org/jira/browse/SPARK-19260
## What changes were proposed in this pull request?
1. "spark.history.fs.logDirectory" now supports paths containing space characters and "%20" sequences.
2. As usual, if the runtime classpath includes hdfs-site.xml and core-site.xml, a supplied path that does not contain a scheme, e.g. "/test", should be treated as an HDFS path rather than a local path, since the path parameter is a Hadoop directory.
## How was this patch tested?
Updated the unit tests and ran some manual tests.
local:
./sbin/start-history-server.sh "file:/a b"
./sbin/start-history-server.sh "/abc%20c" (without hdfs-site.xml, core-site.xml)
./sbin/start-history-server.sh "/a b" (without hdfs-site.xml, core-site.xml)
./sbin/start-history-server.sh "/a b/a bc%20c" (without hdfs-site.xml, core-site.xml)
hdfs:
./sbin/start-history-server.sh "hdfs:/namenode:9000/a b"
./sbin/start-history-server.sh "/a b" (with hdfs-site.xml, core-site.xml)
./sbin/start-history-server.sh "/a b/a bc%20c" (with hdfs-site.xml, core-site.xml)
Author: zuotingbing <zuo.tingbing9@zte.com.cn>
Closes apache#16614 from zuotingbing/SPARK-19260.