[SPARK-33089][SQL] make avro format propagate Hadoop config from DS options to underlying HDFS file system

### What changes were proposed in this pull request?

In `AvroUtils`'s `inferSchema()`, propagate Hadoop config from DS options to underlying HDFS file system.

### Why are the changes needed?

There is a bug when running:
```scala
spark.read.format("avro").options(conf).load(path)
```
The underlying file system does not receive the `conf` options, because `inferSchema()` builds its Hadoop configuration without them.
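
To illustrate what "propagating" means here, the following is a minimal, self-contained sketch rather than Spark code. It models a Hadoop configuration as a plain `Map`; the names `ConfPropagationSketch`, `sessionConf`, `confWithoutOptions`, and `confWithOptions` are hypothetical and exist only for this example. The real fix works on an `org.apache.hadoop.conf.Configuration` through `newHadoopConfWithOptions`, whose exact behavior is not shown in this commit.

```scala
// Simplified model: a Hadoop configuration is treated as a key/value map.
object ConfPropagationSketch {
  type Conf = Map[String, String]

  // Hypothetical stand-in for the session-level Hadoop configuration.
  val sessionConf: Conf = Map("fs.defaultFS" -> "file:///")

  // Before the fix: the per-read options are ignored entirely.
  def confWithoutOptions(options: Map[String, String]): Conf = sessionConf

  // After the fix: per-read options are overlaid on the session configuration,
  // so a key present in the options wins over the session value.
  def confWithOptions(options: Map[String, String]): Conf = sessionConf ++ options

  def main(args: Array[String]): Unit = {
    val options = Map("ds_option" -> "value")
    println(confWithoutOptions(options).get("ds_option")) // None
    println(confWithOptions(options).get("ds_option"))    // Some(value)
  }
}
```

In this model, a file system that reads `ds_option` from its configuration only sees the value in the second case, which mirrors the behavior the patch is after.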

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added a unit test to `AvroSuite`.

Closes #29971 from yuningzh-db/avro_options.

Authored-by: Yuning Zhang <yuning.zhang@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
yuningzh-db authored and HyukjinKwon committed Oct 8, 2020
1 parent 39510b0 commit bbc887b
Showing 2 changed files with 11 additions and 1 deletion.
@@ -43,7 +43,7 @@ private[sql] object AvroUtils extends Logging {
       spark: SparkSession,
       options: Map[String, String],
       files: Seq[FileStatus]): Option[StructType] = {
-    val conf = spark.sessionState.newHadoopConf()
+    val conf = spark.sessionState.newHadoopConfWithOptions(options)
     val parsedOptions = new AvroOptions(options, conf)
 
     if (parsedOptions.parameters.contains(ignoreExtensionKey)) {
@@ -1802,6 +1802,16 @@ abstract class AvroSuite extends QueryTest with SharedSparkSession with NestedDa
       assert(version === SPARK_VERSION_SHORT)
     }
   }
+
+  test("SPARK-33089: should propagate Hadoop config from DS options to underlying file system") {
+    withSQLConf(
+      "fs.file.impl" -> classOf[FakeFileSystemRequiringDSOption].getName,
+      "fs.file.impl.disable.cache" -> "true") {
+      val conf = Map("ds_option" -> "value")
+      val path = "file:" + testAvro.stripPrefix("file:")
+      spark.read.format("avro").options(conf).load(path)
+    }
+  }
 }
 
 class AvroV1Suite extends AvroSuite {
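
The new test depends on `FakeFileSystemRequiringDSOption`, which this diff does not define. A minimal sketch of what such a helper could look like follows, assuming `hadoop-common` is on the classpath. The body is an assumption made for illustration, not the class as it exists in the Spark code base: it extends Hadoop's `LocalFileSystem` and refuses to initialize unless `ds_option` has reached the configuration it is handed.

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.LocalFileSystem

// Sketch of a test-only file system that only initializes when the data
// source option is present in the Hadoop configuration passed to it.
class FakeFileSystemRequiringDSOption extends LocalFileSystem {
  override def initialize(name: URI, conf: Configuration): Unit = {
    super.initialize(name, conf)
    // Throws IllegalArgumentException if "ds_option" was not propagated.
    require(conf.get("ds_option", "") == "value")
  }
}
```

With a helper of this shape, the test needs no explicit assertion: registering the class under `fs.file.impl` and disabling the file system cache forces a fresh `initialize` call during `load(path)`, so the read itself fails if the option is missing.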