[SPARK-13503][SQL] Support to specify the (writing) option for compression codec for TEXT #11384

Closed
HyukjinKwon wants to merge 1 commit

Conversation

HyukjinKwon
Member

What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-13503
This PR lets the TEXT data source compress its output through a write option, instead of requiring Hadoop configurations to be set manually.
Resolving a codec from its name works similarly to #10805 and #10858.
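
To illustrate the idea, here is a minimal sketch of resolving a codec by name and turning a write option into Hadoop output settings. It is not the code from this patch: the function names, the `compression` option key, and the exact set of short names are assumptions made for illustration. The Hadoop class names and configuration keys are the standard ones.

```python
# Illustrative sketch only; not Spark's actual implementation.
# Short codec names mapped to Hadoop codec classes (set assumed for illustration).
SHORT_CODEC_NAMES = {
    "bzip2": "org.apache.hadoop.io.compress.BZip2Codec",
    "gzip": "org.apache.hadoop.io.compress.GzipCodec",
    "lz4": "org.apache.hadoop.io.compress.Lz4Codec",
    "snappy": "org.apache.hadoop.io.compress.SnappyCodec",
}


def resolve_codec_class_name(name):
    """Return the codec class name for a short name (case-insensitive).

    Names not found in the table are passed through unchanged, on the
    assumption that they are already fully qualified class names.
    """
    return SHORT_CODEC_NAMES.get(name.lower(), name)


def compression_settings(options):
    """Translate a writer's options dict into Hadoop output settings.

    The "compression" key is a hypothetical option name for this sketch.
    Returns an empty dict when no codec was requested.
    """
    codec = options.get("compression")
    if codec is None:
        return {}
    return {
        "mapreduce.output.fileoutputformat.compress": "true",
        "mapreduce.output.fileoutputformat.compress.type": "BLOCK",
        "mapreduce.output.fileoutputformat.compress.codec":
            resolve_codec_class_name(codec),
    }
```

With something like this in place, a caller would presumably write `df.write.option("compression", "gzip").text(path)` rather than setting the Hadoop keys by hand, assuming the option key is `compression` as the review comments further down suggest.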

How was this patch tested?

This was tested with unit tests, and with dev/run_tests for coding style.

@HyukjinKwon
Member Author

cc @rxin

@rxin
Contributor

rxin commented Feb 26, 2016

LGTM pending tests.

@SparkQA

SparkQA commented Feb 26, 2016

Test build #52025 has finished for PR 11384 at commit 4b999fa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

@rxin Would you merge this if it looks okay?

@rxin
Contributor

rxin commented Feb 26, 2016

I've merged this into master. Can you submit a PR to document the compression option in DataFrameReader? We should document it for all sources, from JSON to CSV to text.

@asfgit asfgit closed this in 9812a24 Feb 26, 2016
@HyukjinKwon
Member Author

@rxin I guess you meant DataFrameWriter. Sure. I will do it for CSV too, since these would just be comments.

3 participants