[SPARK-16411][SQL][STREAMING] Add textFile to Structured Streaming. #14087
ScrapCodes wants to merge 2 commits into apache:master
Test build #61908 has finished for PR 14087 at commit
No, actually it should be text files.
Test build #61976 has finished for PR 14087 at commit
@marmbrus Do you think this is useful?
/cc @tdas
@tdas Ping!
Test build #66261 has finished for PR 14087 at commit
This change seems unrelated and takes us out of sync with the batch version. I don't think "interface" here means a JVM interface, but rather an interface in the API sense.
Understood, thanks for the correction!
Should "text files" be plural here? The API would be more intuitive if it copied the non-streaming equivalent, with a varargs method that accepts multiple paths.
I am happy to be corrected, as I just followed the existing convention here. Since this class does not have a varargs method for any of the other APIs, I was hesitant to add one myself.
It might be weird to add varargs, since the streaming case would always be to watch a directory (not list a bunch of files). I think it's fine to leave it out for now.
This is existing, but it's a little odd that the methods in this file talk about loading files rather than watching directories of files and processing them as they appear.
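To make the distinction above concrete, here is a minimal sketch (not code from this PR) contrasting the two readers: the streaming `textFile` is pointed at one directory and picks up files as they appear, while the batch `textFile` accepts an explicit varargs list of paths. The object name, the `drop` helper, the file names, and the `watched_lines` query name are all made up for illustration, and the sketch assumes a local Spark build that includes this API.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path, StandardCopyOption}

import org.apache.spark.sql.SparkSession

// Hypothetical sketch: names below are illustrative, not taken from the PR.
object WatchDirectorySketch {

  // Write the file in a staging directory first, then move it into the
  // watched directory, so the stream never lists a half-written file.
  private def drop(staging: Path, watched: Path, name: String, content: String): Unit = {
    val tmp = Files.write(staging.resolve(name), content.getBytes(UTF_8))
    Files.move(tmp, watched.resolve(name), StandardCopyOption.ATOMIC_MOVE)
  }

  // Returns (lines seen by the streaming query, lines read by the batch reader).
  def run(spark: SparkSession): (Seq[String], Seq[String]) = {
    import spark.implicits._

    val watched = Files.createTempDirectory("watched")
    val staging = Files.createTempDirectory("staging")

    // Streaming: a single path, treated as a directory to watch.
    val lines = spark.readStream.textFile(watched.toString)
    val query = lines.writeStream
      .format("memory")            // in-memory sink, convenient for small demos
      .queryName("watched_lines")  // name of the temp table the sink exposes
      .outputMode("append")
      .start()

    try {
      drop(staging, watched, "a.txt", "first\n")
      query.processAllAvailable()
      // A file that arrives later is picked up by the same running query.
      drop(staging, watched, "b.txt", "second\n")
      query.processAllAvailable()

      val streamed = spark.table("watched_lines").as[String].collect().sorted.toSeq

      // Batch: an explicit list of files, passed as varargs.
      val batch = spark.read
        .textFile(watched.resolve("a.txt").toString, watched.resolve("b.txt").toString)
        .collect().sorted.toSeq

      (streamed, batch)
    } finally {
      query.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("watch-directory-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```

Both readers end up with the same lines here; the difference is that the streaming query did not need to know the file names in advance, which is why a varargs overload is less natural on the streaming side.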
Test build #66303 has finished for PR 14087 at commit
Test build #66376 has finished for PR 14087 at commit
test("read from textfile") {
  withTempDirs { case (src, tmp) =>
    val textStream = spark.readStream.textFile(src.getCanonicalPath)
    val filtered = textStream.filter($"value" contains "keep")
One last comment. I'd use the typed API here since that is the whole point of textFile vs text.
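As a rough illustration of the suggestion above (a sketch, not the test as merged), the difference between the two styles looks like this. `textFile` returns a `Dataset[String]`, so the filter can be an ordinary Scala function on `String`; `text` returns a `DataFrame` with a single `value` column, which is filtered with a `Column` expression. The object and method names are hypothetical.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical sketch: object and method names are illustrative only.
object TypedTextFileSketch {

  // Typed: the predicate is a plain function on String, checked at compile time.
  def typedKeep(spark: SparkSession, dir: String): Dataset[String] =
    spark.readStream.textFile(dir).filter(line => line.contains("keep"))

  // Untyped: the predicate is a Column expression on the "value" column.
  def untypedKeep(spark: SparkSession, dir: String): DataFrame = {
    import spark.implicits._
    spark.readStream.text(dir).filter($"value".contains("keep"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("typed-textfile-sketch").getOrCreate()
    try {
      val dir = Files.createTempDirectory("textfile-src").toString
      // Both are streaming Datasets with one string column; only the static type differs.
      println(typedKeep(spark, dir).schema.simpleString)
      println(untypedKeep(spark, dir).schema.simpleString)
    } finally {
      spark.stop()
    }
  }
}
```

With the typed form, a misspelled column name is no longer possible and the lambda is type-checked against `String`, which is the practical benefit of `textFile` over `text`.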
Test build #66498 has finished for PR 14087 at commit
Thanks, merging to master.
## What changes were proposed in this pull request?
Adds the textFile API, which already exists in DataFrameReader and serves the same purpose.
## How was this patch tested?
Added a corresponding test case.
Author: Prashant Sharma <prashsh1@in.ibm.com>
Closes apache#14087 from ScrapCodes/textFile.