[SPARK-33089][SQL] make avro format propagate Hadoop config from DS options to underlying HDFS file system #29971

Closed

yuningzh-db
Contributor

What changes were proposed in this pull request?

In AvroUtils.inferSchema(), propagate the Hadoop configuration from the data source (DS) options to the underlying HDFS file system.

Why are the changes needed?

There is a bug when running:

spark.read.format("avro").options(conf).load(path)

The underlying file system does not receive the options passed in conf.
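
To make the intended behavior concrete, here is a minimal sketch of overlaying per-read options on a Hadoop configuration before it is handed to the file system. It is not the code from this patch: the `OptionPropagationSketch` object, the `withOptions` helper, and the `example.fs.option` key are hypothetical names chosen for illustration, and the only library calls used are the `org.apache.hadoop.conf.Configuration` constructors plus `set` and `get`.

```scala
import org.apache.hadoop.conf.Configuration

object OptionPropagationSketch {
  // Hypothetical helper (not the PR's code): overlay per-read options on a copy
  // of the base Hadoop configuration, leaving the base configuration untouched.
  def withOptions(base: Configuration, options: Map[String, String]): Configuration = {
    val merged = new Configuration(base) // copy constructor clones existing entries
    options.foreach { case (key, value) =>
      // Configuration.set rejects null values, so skip them.
      if (value != null) merged.set(key, value)
    }
    merged
  }

  def main(args: Array[String]): Unit = {
    val base = new Configuration(false) // false: do not load default resources
    val options = Map("example.fs.option" -> "from-read-options") // hypothetical key

    val conf = withOptions(base, options)
    println(conf.get("example.fs.option")) // prints "from-read-options"
    println(base.get("example.fs.option")) // prints "null": base is unchanged
  }
}
```

Copying the base configuration rather than mutating it keeps options supplied for one read from leaking into other reads that share the same session.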

Does this PR introduce any user-facing change?

No.

How was this patch tested?

A unit test was added.

@AmplabJenkins

Can one of the admins verify this patch?

HyukjinKwon pushed a commit that referenced this pull request Oct 8, 2020
…ptions to underlying HDFS file system

### What changes were proposed in this pull request?

In `AvroUtils`'s `inferSchema()`, propagate Hadoop config from DS options to underlying HDFS file system.

### Why are the changes needed?

There is a bug that when running:
```scala
spark.read.format("avro").options(conf).load(path)
```
The underlying file system will not receive the `conf` options.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

unit test added

Closes #29971 from yuningzh-db/avro_options.

Authored-by: Yuning Zhang <yuning.zhang@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit bbc887b)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@HyukjinKwon
Member

HyukjinKwon commented Oct 8, 2020

GitHub Actions build passed. Merged to master and branch-3.0.

@yuningzh-db yuningzh-db deleted the avro_options branch October 12, 2020 17:40
holdenk pushed a commit to holdenk/spark that referenced this pull request Oct 27, 2020