Releases: databricks/spark-csv

Version 1.5.0

05 Sep 23:45

Last major release of the CSV data source for Apache Spark.

Version 1.4.0

04 Mar 19:46

CSV Data Source for Spark Version 1.4.0

Features

  • Support for specifying custom date format for date and time (by @HyukjinKwon)
  • Support for quote mode (by @tobithiel)
  • Support for Boolean in type inference (by @yucheng1992)
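
As a rough illustration of how these features might be used from Python, the sketch below reads a CSV file with a custom date format and schema inference, then writes it back with an explicit quote mode. The option names (`dateFormat`, `inferSchema`, `quoteMode`), the `ALL` quote mode value, and the helper names `read_csv` and `write_csv` are assumptions made for this example; check them against the README of the version you run.

```python
# A minimal sketch, assuming the option names below match the spark-csv README.
# READ_OPTIONS, WRITE_OPTIONS, read_csv and write_csv are hypothetical names
# introduced for this example only.

SOURCE = "com.databricks.spark.csv"

READ_OPTIONS = {
    "header": "true",
    # With inference on, true/false columns may be inferred as Boolean (1.4.0).
    "inferSchema": "true",
    # Pattern in java.text.SimpleDateFormat syntax (assumed), used for dates and times.
    "dateFormat": "yyyy-MM-dd HH:mm:ss",
}

WRITE_OPTIONS = {
    "header": "true",
    # Assumed value: quote every field on output.
    "quoteMode": "ALL",
}


def read_csv(sql_context, path):
    """Load a CSV file into a DataFrame using the options above."""
    return (sql_context.read
            .format(SOURCE)
            .options(**READ_OPTIONS)
            .load(path))


def write_csv(df, path):
    """Write a DataFrame as CSV using the options above."""
    (df.write
       .format(SOURCE)
       .options(**WRITE_OPTIONS)
       .save(path))
```

With an existing `sqlContext`, `read_csv(sqlContext, "events.csv")` would return the parsed DataFrame; the file name is a placeholder.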

Improvements

  • Fixed setting the escape character when quoteMode is NONE (by @d18s)
  • Several documentation improvements (by @HyukjinKwon and @tanwanirahul)
  • Fixed a negative index bug when the user specifies a manual schema and the data has missing fields (by @HyukjinKwon)
  • Support for blank lines (by @HyukjinKwon)
  • Support for short names in compression codec (by @HyukjinKwon)
  • Fixed a corner case with type inference (by @tanwanirahul)
  • Support for dropping malformed rows for TimestampType and DateType types (by @HyukjinKwon)
  • Support for escaping null value during parsing (by @addisonj)
  • Testing against Spark 1.6.0 (by @HyukjinKwon)

Version 1.3.0

20 Nov 20:14

Spark-csv 1.3.0 adds the following:

Features

  • Printing content of records when failing or dropping
  • Support for Timestamps in type inference
  • Support for roundtrip null values of any type (by @andy327)
  • Parsing double and float according to Locale if default cast fails (by @gasparms)
  • Support for nullable quote character (by @jamesblau)
  • Using system line separator for parsing (by Glenn Murray)
  • Support for pruned scan for required fields (by @HyukjinKwon)
  • Support for compressionCodec in all languages (by @msperlich)
  • Support for data source short name with Spark 1.5+
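
A possible way to combine two of these features from Python is sketched below: writing compressed output while mapping nulls to a marker string so they can be read back as nulls. The option names `codec` and `nullValue`, the `NA` marker, and the helper name `write_compressed` are assumptions for this example rather than confirmed API; the codec class is the standard Hadoop gzip codec.

```python
# A minimal sketch, assuming "codec" and "nullValue" are the option names
# spark-csv accepts. COMPRESSED_WRITE_OPTIONS and write_compressed are
# hypothetical names introduced for this example only.

COMPRESSED_WRITE_OPTIONS = {
    "header": "true",
    # Marker written in place of nulls; reading with the same option
    # would be expected to restore them (assumed round-trip behavior).
    "nullValue": "NA",
    # Fully qualified Hadoop compression codec class.
    "codec": "org.apache.hadoop.io.compress.GzipCodec",
}


def write_compressed(df, path, source="com.databricks.spark.csv"):
    """Write a DataFrame as gzip-compressed CSV.

    `source` is a parameter so that, on Spark 1.5+, a data source short
    name can be passed instead of the full package name.
    """
    (df.write
       .format(source)
       .options(**COMPRESSED_WRITE_OPTIONS)
       .save(path))
```

Keeping the options in a single dictionary makes it easy to reuse the same `nullValue` marker on the read side.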

Improvements

  • Several documentation updates and fixes (by multiple community contributors)
  • Several automated code style and compatibility tests (by @JoshRosen)
  • Unit test consolidation (by @brkyvz)