# Removing an unnecessary note in migration guide
MaxGekk committed Aug 17, 2018
1 parent 2d8e754 commit 96a94cc
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion docs/sql-programming-guide.md
@@ -1894,7 +1894,6 @@ working with timestamps in `pandas_udf`s to get the best performance, see
- In version 2.3 and earlier, CSV rows are considered malformed if at least one column value in the row is malformed. The CSV parser drops such rows in the DROPMALFORMED mode or outputs an error in the FAILFAST mode. Since Spark 2.4, a CSV row is considered malformed only when it contains malformed column values requested from the CSV datasource; other values can be ignored. For example, suppose a CSV file contains the "id,name" header and one row "1234". In Spark 2.4, selecting the id column yields a row with the single column value 1234, but in Spark 2.3 and earlier the result is empty in the DROPMALFORMED mode. To restore the previous behavior, set `spark.sql.csv.parser.columnPruning.enabled` to `false`.
- Since Spark 2.4, file listing for statistics computation is done in parallel by default. This can be disabled by setting `spark.sql.parallelFileListingInStatsComputation.enabled` to `false`.
- Since Spark 2.4, metadata files (e.g. Parquet summary files) and temporary files are not counted as data files when calculating table size during statistics computation.
- Since Spark 2.4, text-based datasources like CSV and JSON don't parse input lines if the required schema pushed down to the datasources is empty. The schema can be empty in the case of the count() action. For example, Spark 2.3 and earlier versions fail on JSON files with invalid encoding, but Spark 2.4 returns the total number of lines in the file. To restore the previous behavior, where the underlying parser is always invoked even for an empty schema, set `spark.sql.legacy.bypassParserForEmptySchema` to `true`. This option will be removed in Spark 3.0.
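The legacy flags mentioned in the notes above can be toggled per session. A minimal PySpark sketch, assuming a Spark 2.4 runtime (the application name is chosen for illustration); this is a configuration fragment, not a definitive recipe:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("legacy-migration-settings").getOrCreate()

# Restore the pre-2.4 CSV behavior: a row is malformed if any of its
# column values is malformed, not only the values actually selected.
spark.conf.set("spark.sql.csv.parser.columnPruning.enabled", "false")

# List files for statistics computation serially instead of in parallel.
spark.conf.set("spark.sql.parallelFileListingInStatsComputation.enabled", "false")

# Always invoke the underlying parser, even when the required schema is
# empty (e.g. for count()). This option is slated for removal in Spark 3.0.
spark.conf.set("spark.sql.legacy.bypassParserForEmptySchema", "true")
```

Because these are SQL configurations, they could equally be set with `spark.sql("SET key=value")` or in `spark-defaults.conf`.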

## Upgrading From Spark SQL 2.3.0 to 2.3.1 and above

