Skip to content

Commit

Permalink
[SPARK-18760][SQL] Consistent format specification for FileFormats
Browse files Browse the repository at this point in the history
## What changes were proposed in this pull request?
This patch fixes the format specification in explain for file sources (Parquet and Text formats are the only two that are different from the rest):

Before:
```
scala> spark.read.text("test.text").explain()
== Physical Plan ==
*FileScan text [value#15] Batched: false, Format: org.apache.spark.sql.execution.datasources.text.TextFileFormatxyz, Location: InMemoryFileIndex[file:/scratch/rxin/spark/test.text], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>
```

After:
```
scala> spark.read.text("test.text").explain()
== Physical Plan ==
*FileScan text [value#15] Batched: false, Format: Text, Location: InMemoryFileIndex[file:/scratch/rxin/spark/test.text], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>
```

Also closes #14680.

## How was this patch tested?
Verified in spark-shell.

Author: Reynold Xin <rxin@databricks.com>

Closes #16187 from rxin/SPARK-18760.

(cherry picked from commit 5f894d2)
Signed-off-by: Reynold Xin <rxin@databricks.com>
  • Loading branch information
rxin committed Dec 8, 2016
1 parent a035644 commit 9483242
Show file tree
Hide file tree
Showing 3 changed files with 7 additions and 4 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ class ParquetFileFormat

override def shortName(): String = "parquet"

override def toString: String = "ParquetFormat"
override def toString: String = "Parquet"

override def hashCode(): Int = getClass.hashCode()

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,8 @@ class TextFileFormat extends TextBasedFileFormat with DataSourceRegister {

override def shortName(): String = "text"

override def toString: String = "Text"

private def verifySchema(schema: StructType): Unit = {
if (schema.size != 1) {
throw new AnalysisException(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,8 @@ import org.apache.spark.sql.test.SharedSQLContext
import org.apache.spark.sql.types._
import org.apache.spark.util.Utils

class FileStreamSourceTest extends StreamTest with SharedSQLContext with PrivateMethodTester {
abstract class FileStreamSourceTest
extends StreamTest with SharedSQLContext with PrivateMethodTester {

import testImplicits._

Expand Down Expand Up @@ -848,13 +849,13 @@ class FileStreamSourceSuite extends FileStreamSourceTest {
val explainWithoutExtended = q.explainInternal(false)
// `extended = false` only displays the physical plan.
assert("Relation.*text".r.findAllMatchIn(explainWithoutExtended).size === 0)
assert("TextFileFormat".r.findAllMatchIn(explainWithoutExtended).size === 1)
assert(": Text".r.findAllMatchIn(explainWithoutExtended).size === 1)

val explainWithExtended = q.explainInternal(true)
// `extended = true` displays 3 logical plans (Parsed/Optimized/Optimized) and 1 physical
// plan.
assert("Relation.*text".r.findAllMatchIn(explainWithExtended).size === 3)
assert("TextFileFormat".r.findAllMatchIn(explainWithExtended).size === 1)
assert(": Text".r.findAllMatchIn(explainWithExtended).size === 1)
} finally {
q.stop()
}
Expand Down

0 comments on commit 9483242

Please sign in to comment.