diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 9b9177d44145f..d478042dea5c8 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -510,7 +510,20 @@ Here are the details of all the sources in Spark.
         <b>File source</b>
         <code>path</code>: path to the input directory, and common to all file formats.
-        <br/>
+        <br/>
+        <code>maxFilesPerTrigger</code>: maximum number of new files to be considered in every trigger (default: no max)
+        <br/>
+        <code>latestFirst</code>: whether to process the latest new files first, useful when there is a large backlog of files (default: false)
+        <br/>
+        <code>fileNameOnly</code>: whether to check new files based only on the filename instead of on the full path (default: false). With this set to `true`, the following files would be considered the same file, because their filenames, "dataset.txt", are the same:
+        <br/>
+        · "file:///dataset.txt"<br/>
+        · "s3://a/dataset.txt"<br/>
+        · "s3n://a/b/dataset.txt"<br/>
+        · "s3a://a/b/c/dataset.txt"<br/>
+        <br/>
+        <br/><br/>
+
         For file-format-specific options, see the related methods in <code>DataStreamReader</code>
         (Scala/Java/Python/R).
@@ -1234,18 +1247,7 @@ Here are the details of all the sinks in Spark.
         Append
         <code>path</code>: path to the output directory, must be specified.
-        <br/>
-        <code>maxFilesPerTrigger</code>: maximum number of new files to be considered in every trigger (default: no max)
-        <br/>
-        <code>latestFirst</code>: whether to processs the latest new files first, useful when there is a large backlog of files (default: false)
-        <br/>
-        <code>fileNameOnly</code>: whether to check new files based on only the filename instead of on the full path (default: false). With this set to `true`, the following files would be considered as the same file, because their filenames, "dataset.txt", are the same:
-        <br/>
-        · "file:///dataset.txt"<br/>
-        · "s3://a/dataset.txt"<br/>
-        · "s3n://a/b/dataset.txt"<br/>
-        · "s3a://a/b/c/dataset.txt"<br/>
-        <br/>
+        <br/>
         <br/>
         For file-format-specific options, see the related methods in <code>DataFrameWriter</code>
         (Scala/Java/Python/R).
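Since this patch only relocates the option descriptions from the file sink row to the file source row, a usage sketch may help confirm where the options now belong. The following minimal Scala example is not part of the patch: it shows the three relocated options being set on the file source through `DataStreamReader`; the input directory, trigger cap, and console sink are hypothetical choices for illustration.

```scala
import org.apache.spark.sql.SparkSession

object FileSourceOptionsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("FileSourceOptionsExample")
      .master("local[*]")
      .getOrCreate()

    // The relocated options belong to the *source* (DataStreamReader), not the sink.
    val lines = spark.readStream
      .format("text")
      .option("maxFilesPerTrigger", "100") // cap on new files picked up per trigger (hypothetical value)
      .option("latestFirst", "true")       // drain a large backlog newest-first
      .option("fileNameOnly", "true")      // track seen files by filename only, not full URI
      .load("/tmp/streaming-input")        // hypothetical input directory

    // The file sink takes only `path` (plus format-specific options); shown here
    // with a console sink purely to make the example runnable.
    val query = lines.writeStream
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```

With `fileNameOnly` enabled as above, the four URIs listed in the source row would all be treated as the same already-seen file, since each ends in "dataset.txt".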