Background
Follow-up for #2281.
PR #2281 enables Parquet correctness suite wrappers and Spark correctness tests for additional Spark versions. To keep those jobs green, it also adds version-specific exclude, excludeByPrefix, and disable entries in AuronSparkTestSettings.scala.
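As a rough illustration of how the three kinds of entries differ, the sketch below models them in Python. It is not the API of AuronSparkTestSettings.scala: the SuiteSettings class and its should_run method are hypothetical, and the semantics (exclude matches a full test name, excludeByPrefix matches the start of a test name, disable turns off the whole suite and records a reason) are assumptions inferred from the entries tracked in this issue.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SuiteSettings:
    """Hypothetical model of one suite's entries; not the real Scala API."""

    excluded: Set[str] = field(default_factory=set)
    excluded_prefixes: List[str] = field(default_factory=list)
    disabled_reason: Optional[str] = None

    def exclude(self, *names: str) -> "SuiteSettings":
        # Assumed semantics: skip tests whose full name matches exactly.
        self.excluded.update(names)
        return self

    def exclude_by_prefix(self, *prefixes: str) -> "SuiteSettings":
        # Assumed semantics: skip tests whose name starts with the prefix.
        self.excluded_prefixes.extend(prefixes)
        return self

    def disable(self, reason: str) -> "SuiteSettings":
        # Assumed semantics: skip the whole suite and keep the reason.
        self.disabled_reason = reason
        return self

    def should_run(self, test_name: str) -> bool:
        if self.disabled_reason is not None:
            return False
        if test_name in self.excluded:
            return False
        return not any(test_name.startswith(p) for p in self.excluded_prefixes)


# Entries taken from this issue; the test names passed to should_run are illustrative.
v1_filter = (
    SuiteSettings()
    .exclude_by_prefix("filter pushdown -")
    .exclude("SPARK-17091: Convert IN predicate to Parquet filter push-down")
)
io_suite = SuiteSettings().disable("Native execution can crash in Spark 4")

print(v1_filter.should_run("filter pushdown - boolean"))  # False
print(v1_filter.should_run("some other test"))            # True
print(io_suite.should_run("some other test"))             # False
```

Under these assumptions, a disable entry hides every test in a suite, so it is likely to need the closest review when entries are revisited.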
Scope
The goal of this issue is to track those exclusions so they are not lost after #2281 is merged. Each entry should eventually be reviewed and then fixed, re-enabled, or kept excluded with a clear reason.
Tracked entries by version:
- Spark 3.1: 47 entries
- Spark 3.2: 70 entries
- Spark 3.4: 80 entries
- Spark 3.5: 83 entries
- Spark 4.0: 79 entries
- Spark 4.1: 89 entries
Total tracked entries: 448.
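The total can be re-checked whenever an entry is fixed or re-enabled. A minimal sketch, with the counts copied from the list above and the tracked_entries name chosen only for illustration:

```python
# Per-version counts copied from the "Tracked entries by version" list.
tracked_entries = {
    "Spark 3.1": 47,
    "Spark 3.2": 70,
    "Spark 3.4": 80,
    "Spark 3.5": 83,
    "Spark 4.0": 79,
    "Spark 4.1": 89,
}

total = sum(tracked_entries.values())
print(f"Total tracked entries: {total}")  # Total tracked entries: 448
```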
Tracking
The list below was extracted from the #2281 diff for AuronSparkTestSettings.scala. It includes Parquet suite exclusions and the Spark correctness suite exclusions added when enabling those jobs.
Full exclusion list by Spark version
spark31
AuronParquetIOSuite
AuronParquetInteroperabilitySuite
AuronParquetProtobufCompatibilitySuite
AuronParquetQuerySuite
AuronParquetSchemaSuite
AuronParquetThriftCompatibilitySuite
AuronParquetV1FilterSuite
AuronParquetV1PartitionDiscoverySuite
AuronParquetV1QuerySuite
AuronParquetV2FilterSuite
AuronParquetV2QuerySuite
spark32
AuronDataFrameAggregateSuite
AuronParquetIOSuite
AuronParquetInteroperabilitySuite
AuronParquetProtobufCompatibilitySuite
AuronParquetQuerySuite
AuronParquetRebaseDatetimeSuite
AuronParquetRebaseDatetimeV1Suite
AuronParquetRebaseDatetimeV2Suite
AuronParquetSchemaSuite
AuronParquetThriftCompatibilitySuite
AuronParquetV1FilterSuite
AuronParquetV1PartitionDiscoverySuite
AuronParquetV1QuerySuite
AuronParquetV2FilterSuite
AuronParquetV2QuerySuite
spark34
AuronDataFrameSuite
AuronParquetFieldIdIOSuite
AuronParquetIOSuite
AuronParquetInteroperabilitySuite
AuronParquetProtobufCompatibilitySuite
AuronParquetQuerySuite
AuronParquetRebaseDatetimeSuite
AuronParquetRebaseDatetimeV1Suite
AuronParquetRebaseDatetimeV2Suite
AuronParquetSchemaSuite
AuronParquetThriftCompatibilitySuite
AuronParquetV1FilterSuite
AuronParquetV1PartitionDiscoverySuite
AuronParquetV1QuerySuite
AuronParquetV2FilterSuite
AuronParquetV2QuerySuite
spark35
AuronDataFrameAggregateSuite
AuronDataFrameSuite
AuronParquetFieldIdIOSuite
AuronParquetIOSuite
AuronParquetInteroperabilitySuite
AuronParquetProtobufCompatibilitySuite
AuronParquetQuerySuite
AuronParquetRebaseDatetimeSuite
AuronParquetRebaseDatetimeV1Suite
AuronParquetRebaseDatetimeV2Suite
AuronParquetSchemaSuite
AuronParquetThriftCompatibilitySuite
AuronParquetV1FilterSuite
AuronParquetV1PartitionDiscoverySuite
AuronParquetV1QuerySuite
AuronParquetV2FilterSuite
AuronParquetV2QuerySuite
spark40
AuronDataFrameFunctionsSuite
AuronDateFunctionsSuite
AuronMathFunctionsSuite
AuronMiscFunctionsSuite
AuronStringFunctionsSuite
AuronDataFrameAggregateSuite
AuronDatasetAggregatorSuite
AuronTypedImperativeAggregateSuite
AuronDataFrameSuite
AuronParquetAvroCompatibilitySuite
AuronParquetColumnIndexSuite
AuronParquetEncodingSuite
AuronParquetFieldIdIOSuite
AuronParquetIOSuite
AuronParquetInteroperabilitySuite
AuronParquetPartitionDiscoverySuite
AuronParquetProtobufCompatibilitySuite
AuronParquetQuerySuite
AuronParquetRebaseDatetimeSuite
AuronParquetRebaseDatetimeV1Suite
AuronParquetRebaseDatetimeV2Suite
AuronParquetSchemaPruningSuite
AuronParquetSchemaSuite
AuronParquetThriftCompatibilitySuite
AuronParquetV1FilterSuite
AuronParquetV1PartitionDiscoverySuite
AuronParquetV1QuerySuite
AuronParquetV1SchemaPruningSuite
AuronParquetV2FilterSuite
AuronParquetV2PartitionDiscoverySuite
AuronParquetV2QuerySuite
AuronParquetV2SchemaPruningSuite
spark41
AuronDataFrameFunctionsSuite
AuronDateFunctionsSuite
AuronMathFunctionsSuite
AuronMiscFunctionsSuite
AuronStringFunctionsSuite
AuronDataFrameAggregateSuite
AuronDatasetAggregatorSuite
AuronTypedImperativeAggregateSuite
AuronDataFrameSuite
AuronParquetAvroCompatibilitySuite
AuronParquetColumnIndexSuite
AuronParquetEncodingSuite
AuronParquetFieldIdIOSuite
AuronParquetFileFormatSuite
AuronParquetFileFormatV1Suite
AuronParquetFileFormatV2Suite
AuronParquetIOSuite
AuronParquetInteroperabilitySuite
AuronParquetPartitionDiscoverySuite
AuronParquetProtobufCompatibilitySuite
AuronParquetQuerySuite
AuronParquetRebaseDatetimeSuite
AuronParquetRebaseDatetimeV1Suite
AuronParquetRebaseDatetimeV2Suite
AuronParquetSchemaPruningSuite
AuronParquetSchemaSuite
AuronParquetThriftCompatibilitySuite
AuronParquetV1FilterSuite
AuronParquetV1PartitionDiscoverySuite
AuronParquetV1QuerySuite
AuronParquetV1SchemaPruningSuite
AuronParquetV2FilterSuite
AuronParquetV2PartitionDiscoverySuite
AuronParquetV2QuerySuite
AuronParquetV2SchemaPruningSuite
Notes
Larger groups can be split into smaller issues or PRs.
The list below repeats the suites above with the individual exclude, excludeByPrefix, and disable entries extracted from the #2281 diff for AuronSparkTestSettings.scala.
Exclusion entries by Spark version and suite
spark31
AuronParquetIOSuite
- exclude: read dictionary encoded decimals written as INT32
- exclude: read dictionary encoded decimals written as INT64
- exclude: read dictionary encoded decimals written as FIXED_LEN_BYTE_ARRAY
- exclude: read dictionary and plain encoded timestamp_millis written as INT64
- exclude: SPARK-31159: compatibility with Spark 2.4 in reading dates/timestamps
- exclude: SPARK-31159: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
AuronParquetInteroperabilitySuite
- exclude: parquet timestamp conversion
AuronParquetProtobufCompatibilitySuite
- exclude: unannotated array of primitive type
- exclude: unannotated array of struct
- exclude: struct with unannotated array
- exclude: unannotated array of struct with unannotated array
- exclude: unannotated array of string
AuronParquetQuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
AuronParquetSchemaSuite
- exclude: schema mismatch failure error message for parquet reader
- exclude: schema mismatch failure error message for parquet vectorized reader
AuronParquetThriftCompatibilitySuite
- exclude: Read Parquet file generated by parquet-thrift
AuronParquetV1FilterSuite
- excludeByPrefix: filter pushdown -
- exclude: Filters should be pushed down for vectorized Parquet reader at row group level
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
AuronParquetV1PartitionDiscoverySuite
- exclude: read partitioned table - partition key included in Parquet file
- exclude: read partitioned table - with nulls and partition keys are included in Parquet file
- exclude: SPARK-18108 Parquet reader fails when data column types conflict with partition ones
- exclude: SPARK-21463: MetadataLogFileIndex should respect userSpecifiedSchema for partition cols
AuronParquetV1QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: returning batch for wide table
AuronParquetV2FilterSuite
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
AuronParquetV2QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: returning batch for wide table
spark32
AuronDataFrameAggregateSuite
- exclude: SPARK-34837: Support ANSI SQL intervals by the aggregate function avg
AuronParquetIOSuite
- exclude: SPARK-34817: Read UINT_64 as Decimal from parquet
- exclude: SPARK-35640: read binary as timestamp should throw schema incompatible error
- exclude: SPARK-35640: int as long should throw schema incompatible error
- exclude: read dictionary encoded decimals written as INT32
- exclude: read dictionary encoded decimals written as INT64
- exclude: read dictionary encoded decimals written as FIXED_LEN_BYTE_ARRAY
- exclude: read dictionary and plain encoded timestamp_millis written as INT64
- exclude: SPARK-36726: test incorrect Parquet row group file offset
- exclude: SPARK-34167: read LongDecimals with precision < 10, VectorizedReader true
- exclude: SPARK-34167: read LongDecimals with precision < 10, VectorizedReader false
AuronParquetInteroperabilitySuite
- exclude: parquet timestamp conversion
AuronParquetProtobufCompatibilitySuite
- exclude: unannotated array of primitive type
- exclude: unannotated array of struct
- exclude: struct with unannotated array
- exclude: unannotated array of struct with unannotated array
- exclude: unannotated array of string
AuronParquetQuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
AuronParquetRebaseDatetimeSuite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV1Suite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV2Suite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetSchemaSuite
- exclude: schema mismatch failure error message for parquet reader
- exclude: schema mismatch failure error message for parquet vectorized reader
- exclude: SPARK-40819: parquet file with TIMESTAMP(NANOS, true) (with nanosAsLong=true)
- exclude: SPARK-40819: parquet file with TIMESTAMP(NANOS, true) (with default nanosAsLong=false)
AuronParquetThriftCompatibilitySuite
- exclude: Read Parquet file generated by parquet-thrift
AuronParquetV1FilterSuite
- excludeByPrefix: SPARK-40280: filter pushdown -
- excludeByPrefix: filter pushdown -
- exclude: Filters should be pushed down for vectorized Parquet reader at row group level
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
- exclude: Support Parquet column index
- exclude: SPARK-34562: Bloom filter push down
AuronParquetV1PartitionDiscoverySuite
- exclude: read partitioned table - partition key included in Parquet file
- exclude: read partitioned table - with nulls and partition keys are included in Parquet file
- exclude: SPARK-18108 Parquet reader fails when data column types conflict with partition ones
- exclude: SPARK-21463: MetadataLogFileIndex should respect userSpecifiedSchema for partition cols
AuronParquetV1QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: returning batch for wide table
- exclude: SPARK-39833: pushed filters with count()
- exclude: SPARK-39833: pushed filters with project without filter columns
AuronParquetV2FilterSuite
- excludeByPrefix: SPARK-40280: filter pushdown -
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
- exclude: Support Parquet column index
AuronParquetV2QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: returning batch for wide table
spark34
AuronDataFrameSuite
- exclude: SPARK-41048: Improve output partitioning and ordering with AQE cache
AuronParquetFieldIdIOSuite
- exclude: Parquet reads infer fields using field ids correctly
- exclude: absence of field ids
- exclude: SPARK-38094: absence of field ids: reading nested schema
- exclude: multiple id matches
- exclude: read parquet file without ids
- exclude: global read/write flag should work correctly
AuronParquetIOSuite
- exclude: vectorized reader: missing all struct fields
- exclude: SPARK-34817: Read UINT_64 as Decimal from parquet
- exclude: SPARK-35640: read binary as timestamp should throw schema incompatible error
- exclude: SPARK-35640: int as long should throw schema incompatible error
- exclude: read dictionary encoded decimals written as INT32
- exclude: read dictionary encoded decimals written as INT64
- exclude: read dictionary encoded decimals written as FIXED_LEN_BYTE_ARRAY
- exclude: read dictionary and plain encoded timestamp_millis written as INT64
- exclude: SPARK-40128 read DELTA_LENGTH_BYTE_ARRAY encoded strings
- exclude: SPARK-36726: test incorrect Parquet row group file offset
- exclude: SPARK-34167: read LongDecimals with precision < 10, VectorizedReader true
- exclude: SPARK-34167: read LongDecimals with precision < 10, VectorizedReader false
AuronParquetInteroperabilitySuite
- exclude: parquet timestamp conversion
AuronParquetProtobufCompatibilitySuite
- exclude: unannotated array of primitive type
- exclude: unannotated array of struct
- exclude: struct with unannotated array
- exclude: unannotated array of struct with unannotated array
- exclude: unannotated array of string
AuronParquetQuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: row group skipping doesn't overflow when reading into larger type
AuronParquetRebaseDatetimeSuite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV1Suite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV2Suite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetSchemaSuite
- exclude: schema mismatch failure error message for parquet reader
- exclude: schema mismatch failure error message for parquet vectorized reader
- exclude: SPARK-40819: parquet file with TIMESTAMP(NANOS, true) (with nanosAsLong=true)
- exclude: SPARK-40819: parquet file with TIMESTAMP(NANOS, true) (with default nanosAsLong=false)
AuronParquetThriftCompatibilitySuite
- exclude: Read Parquet file generated by parquet-thrift
AuronParquetV1FilterSuite
- excludeByPrefix: SPARK-40280: filter pushdown -
- excludeByPrefix: filter pushdown -
- exclude: Filters should be pushed down for vectorized Parquet reader at row group level
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
- exclude: Support Parquet column index
- exclude: SPARK-34562: Bloom filter push down
AuronParquetV1PartitionDiscoverySuite
- exclude: read partitioned table - partition key included in Parquet file
- exclude: read partitioned table - with nulls and partition keys are included in Parquet file
- exclude: SPARK-18108 Parquet reader fails when data column types conflict with partition ones
- exclude: SPARK-21463: MetadataLogFileIndex should respect userSpecifiedSchema for partition cols
AuronParquetV1QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: row group skipping doesn't overflow when reading into larger type
- exclude: returning batch for wide table
- exclude: SPARK-39833: pushed filters with count()
- exclude: SPARK-39833: pushed filters with project without filter columns
AuronParquetV2FilterSuite
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
- excludeByPrefix: SPARK-40280: filter pushdown -
- exclude: Support Parquet column index
AuronParquetV2QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: returning batch for wide table
spark35
AuronDataFrameAggregateSuite
- exclude: SPARK-16484: hll_*_agg + hll_union negative tests
- exclude: SPARK-43876: Enable fast hashmap for distinct queries
AuronDataFrameSuite
- exclude: SPARK-41048: Improve output partitioning and ordering with AQE cache
AuronParquetFieldIdIOSuite
- exclude: Parquet reads infer fields using field ids correctly
- exclude: absence of field ids
- exclude: SPARK-38094: absence of field ids: reading nested schema
- exclude: multiple id matches
- exclude: read parquet file without ids
- exclude: global read/write flag should work correctly
AuronParquetIOSuite
- exclude: vectorized reader: missing all struct fields
- exclude: SPARK-34817: Read UINT_64 as Decimal from parquet
- exclude: SPARK-35640: read binary as timestamp should throw schema incompatible error
- exclude: SPARK-35640: int as long should throw schema incompatible error
- exclude: read dictionary encoded decimals written as INT32
- exclude: explode nested lists crossing a rowgroup boundary
- exclude: read dictionary encoded decimals written as INT64
- exclude: read dictionary encoded decimals written as FIXED_LEN_BYTE_ARRAY
- exclude: read dictionary and plain encoded timestamp_millis written as INT64
- exclude: SPARK-40128 read DELTA_LENGTH_BYTE_ARRAY encoded strings
- exclude: SPARK-36726: test incorrect Parquet row group file offset
- exclude: SPARK-34167: read LongDecimals with precision < 10, VectorizedReader true
- exclude: SPARK-34167: read LongDecimals with precision < 10, VectorizedReader false
AuronParquetInteroperabilitySuite
- exclude: parquet timestamp conversion
AuronParquetProtobufCompatibilitySuite
- exclude: unannotated array of primitive type
- exclude: unannotated array of struct
- exclude: struct with unannotated array
- exclude: unannotated array of struct with unannotated array
- exclude: unannotated array of string
AuronParquetQuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: row group skipping doesn't overflow when reading into larger type
AuronParquetRebaseDatetimeSuite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV1Suite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV2Suite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetSchemaSuite
- exclude: schema mismatch failure error message for parquet reader
- exclude: schema mismatch failure error message for parquet vectorized reader
- exclude: SPARK-40819: parquet file with TIMESTAMP(NANOS, true) (with nanosAsLong=true)
- exclude: SPARK-40819: parquet file with TIMESTAMP(NANOS, true) (with default nanosAsLong=false)
AuronParquetThriftCompatibilitySuite
- exclude: Read Parquet file generated by parquet-thrift
AuronParquetV1FilterSuite
- excludeByPrefix: SPARK-40280: filter pushdown -
- excludeByPrefix: filter pushdown -
- exclude: Filters should be pushed down for vectorized Parquet reader at row group level
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
- exclude: Support Parquet column index
- exclude: SPARK-34562: Bloom filter push down
AuronParquetV1PartitionDiscoverySuite
- exclude: read partitioned table - partition key included in Parquet file
- exclude: read partitioned table - with nulls and partition keys are included in Parquet file
- exclude: SPARK-18108 Parquet reader fails when data column types conflict with partition ones
- exclude: SPARK-21463: MetadataLogFileIndex should respect userSpecifiedSchema for partition cols
AuronParquetV1QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: row group skipping doesn't overflow when reading into larger type
- exclude: returning batch for wide table
- exclude: SPARK-39833: pushed filters with count()
- exclude: SPARK-39833: pushed filters with project without filter columns
AuronParquetV2FilterSuite
- excludeByPrefix: SPARK-40280: filter pushdown -
- exclude: SPARK-31026: Parquet predicate pushdown for fields having dots in the names
- exclude: Filters should be pushed down for Parquet readers at row group level
- exclude: SPARK-23852: Broken Parquet push-down for partially-written stats
- exclude: SPARK-17091: Convert IN predicate to Parquet filter push-down
- exclude: SPARK-25207: exception when duplicate fields in case-insensitive mode
- exclude: Support Parquet column index
AuronParquetV2QuerySuite
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: returning batch for wide table
spark40
AuronDataFrameFunctionsSuite
- disable: Native execution can crash after ParquetQuery in Spark 4
AuronDateFunctionsSuite
- exclude: SPARK-30668: use legacy timestamp parser in to_timestamp
AuronMathFunctionsSuite
- disable: Native execution can crash in Spark 4
AuronMiscFunctionsSuite
- exclude: reflect and java_method
AuronStringFunctionsSuite
- exclude: string concat
- exclude: string concat_ws
- exclude: UTF-8 string validate
- exclude: RegExpReplace throws the right exception when replace fails on a particular row
AuronDataFrameAggregateSuite
- disable: Native execution can crash in Spark 4
AuronDatasetAggregatorSuite
- disable: Native dataset aggregators fail in Spark 4
AuronTypedImperativeAggregateSuite
- disable: Native execution can crash after ParquetQuery in Spark 4
AuronDataFrameSuite
- disable: Native execution can crash in Spark 4
AuronParquetAvroCompatibilitySuite
- exclude: required primitives
- exclude: optional primitives
- exclude: non-nullable arrays
- exclude: SPARK-10136 array of primitive array
- exclude: map of primitive array
- exclude: various complex types
- exclude: SPARK-9407 Push down predicates involving Parquet ENUM columns
AuronParquetColumnIndexSuite
- exclude: reading from unaligned pages - test filters
- exclude: test reading unaligned pages - test all types (dict encode)
- exclude: SPARK-36123: reading from unaligned pages - test filters with nulls
- exclude: test reading unaligned pages - test all types
- exclude: reading unaligned pages - struct type
AuronParquetEncodingSuite
- disable: Native execution can crash in Spark 4
AuronParquetFieldIdIOSuite
- disable: Native parquet field id reads fail in Spark 4
AuronParquetIOSuite
- disable: Native execution can crash in Spark 4
AuronParquetInteroperabilitySuite
- disable: Native execution can crash in Spark 4
AuronParquetPartitionDiscoverySuite
- exclude: read partitioned table - normal case
- exclude: Resolve type conflicts - decimals, dates and timestamps in partition column
AuronParquetProtobufCompatibilitySuite
- exclude: unannotated array of primitive type
- exclude: unannotated array of struct
- exclude: struct with unannotated array
- exclude: unannotated array of struct with unannotated array
- exclude: unannotated array of string
AuronParquetQuerySuite
- exclude: simple select queries
- exclude: appending
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
AuronParquetRebaseDatetimeSuite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV1Suite
- disable: Spark 4 test resources use jar paths unsupported by Hadoop Path
AuronParquetRebaseDatetimeV2Suite
- disable: Spark 4 test resources use jar paths unsupported by Hadoop Path
AuronParquetSchemaPruningSuite
- disable: Native parquet schema pruning reads fail in Spark 4
AuronParquetSchemaSuite
- disable: Native execution can crash in Spark 4
AuronParquetThriftCompatibilitySuite
- disable: Spark 4 test resources use jar paths unsupported by Hadoop Path
AuronParquetV1FilterSuite
- disable: Native execution can crash in Spark 4
AuronParquetV1PartitionDiscoverySuite
- exclude: read partitioned table - normal case
- exclude: read partitioned table - partition key included in Parquet file
- exclude: read partitioned table - with nulls and partition keys are included in Parquet file
- exclude: SPARK-18108 Parquet reader fails when data column types conflict with partition ones
- exclude: SPARK-21463: MetadataLogFileIndex should respect userSpecifiedSchema for partition cols
AuronParquetV1QuerySuite
- exclude: simple select queries
- exclude: appending
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: returning batch for wide table
- exclude: SPARK-39833: pushed filters with count()
- exclude: SPARK-39833: pushed filters with project without filter columns
AuronParquetV1SchemaPruningSuite
- disable: Native parquet schema pruning reads fail in Spark 4
AuronParquetV2FilterSuite
- disable: Native execution can crash in Spark 4
AuronParquetV2PartitionDiscoverySuite
- exclude: read partitioned table - normal case
- exclude: SPARK-22109: Resolve type conflicts between strings and timestamps in partition column
AuronParquetV2QuerySuite
- exclude: simple select queries
- exclude: appending
- exclude: self-join
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: returning batch for wide table
AuronParquetV2SchemaPruningSuite
- disable: Native parquet schema pruning reads fail in Spark 4
spark41
AuronDataFrameFunctionsSuite
- disable: Native execution can crash after ParquetQuery in Spark 4
AuronDateFunctionsSuite
- exclude: SPARK-30668: use legacy timestamp parser in to_timestamp
AuronMathFunctionsSuite
- disable: Native execution can crash in Spark 4
AuronMiscFunctionsSuite
- exclude: reflect and java_method
AuronStringFunctionsSuite
- exclude: string concat
- exclude: string concat_ws
- exclude: UTF-8 string validate
- exclude: RegExpReplace throws the right exception when replace fails on a particular row
AuronDataFrameAggregateSuite
- disable: Native execution can crash in Spark 4
AuronDatasetAggregatorSuite
- disable: Native dataset aggregators fail in Spark 4
AuronTypedImperativeAggregateSuite
- disable: Native execution can crash after ParquetQuery in Spark 4
AuronDataFrameSuite
- disable: Native execution can crash in Spark 4
AuronParquetAvroCompatibilitySuite
- exclude: required primitives
- exclude: optional primitives
- exclude: non-nullable arrays
- exclude: SPARK-10136 array of primitive array
- exclude: map of primitive array
- exclude: various complex types
- exclude: SPARK-9407 Push down predicates involving Parquet ENUM columns
AuronParquetColumnIndexSuite
- exclude: reading from unaligned pages - test filters
- exclude: test reading unaligned pages - test all types (dict encode)
- exclude: SPARK-36123: reading from unaligned pages - test filters with nulls
- exclude: test reading unaligned pages - test all types
- exclude: reading unaligned pages - struct type
AuronParquetEncodingSuite
- disable: Native execution can crash in Spark 4
AuronParquetFieldIdIOSuite
- disable: Native parquet field id reads fail in Spark 4
AuronParquetFileFormatSuite
- exclude: Write and read back TIME values
AuronParquetFileFormatV1Suite
- exclude: Write and read back TIME values
AuronParquetFileFormatV2Suite
- exclude: Write and read back TIME values
AuronParquetIOSuite
- disable: Native execution can crash in Spark 4
AuronParquetInteroperabilitySuite
- disable: Native execution can crash in Spark 4
AuronParquetPartitionDiscoverySuite
- exclude: read partitioned table - normal case
- exclude: Infer the TIME data type from partition values
AuronParquetProtobufCompatibilitySuite
- exclude: unannotated array of primitive type
- exclude: unannotated array of struct
- exclude: struct with unannotated array
- exclude: unannotated array of struct with unannotated array
- exclude: unannotated array of string
AuronParquetQuerySuite
- exclude: simple select queries
- exclude: appending
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: create table with TIME
AuronParquetRebaseDatetimeSuite
- exclude: SPARK-31159, SPARK-37705: compatibility with Spark 2.4/3.2 in reading dates/timestamps
- exclude: SPARK-31159, SPARK-37705: rebasing timestamps in write
- exclude: SPARK-31159: rebasing dates in write
- exclude: SPARK-35427: datetime rebasing in the EXCEPTION mode
AuronParquetRebaseDatetimeV1Suite
- disable: Spark 4 test resources use jar paths unsupported by Hadoop Path
AuronParquetRebaseDatetimeV2Suite
- disable: Spark 4 test resources use jar paths unsupported by Hadoop Path
AuronParquetSchemaPruningSuite
- disable: Native parquet schema pruning reads fail in Spark 4
AuronParquetSchemaSuite
- disable: Native execution can crash in Spark 4
AuronParquetThriftCompatibilitySuite
- disable: Spark 4 test resources use jar paths unsupported by Hadoop Path
AuronParquetV1FilterSuite
- disable: Native execution can crash in Spark 4
AuronParquetV1PartitionDiscoverySuite
- exclude: read partitioned table - normal case
- exclude: Infer the TIME data type from partition values
- exclude: read partitioned table - partition key included in Parquet file
- exclude: read partitioned table - with nulls and partition keys are included in Parquet file
- exclude: SPARK-18108 Parquet reader fails when data column types conflict with partition ones
- exclude: SPARK-21463: MetadataLogFileIndex should respect userSpecifiedSchema for partition cols
AuronParquetV1QuerySuite
- exclude: simple select queries
- exclude: appending
- exclude: create table with TIME
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: Enabling/disabling ignoreCorruptFiles
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: SPARK-34212 Parquet should read decimals correctly
- exclude: returning batch for wide table
- exclude: SPARK-39833: pushed filters with count()
- exclude: SPARK-39833: pushed filters with project without filter columns
AuronParquetV1SchemaPruningSuite
- disable: Native parquet schema pruning reads fail in Spark 4
AuronParquetV2FilterSuite
- disable: Native execution can crash in Spark 4
AuronParquetV2PartitionDiscoverySuite
- exclude: read partitioned table - normal case
- exclude: Infer the TIME data type from partition values
- exclude: _SUCCESS should not break partitioning discovery
- exclude: Resolve type conflicts - decimals, dates and timestamps in partition column
- exclude: SPARK-22109: Resolve type conflicts between strings and timestamps in partition column
AuronParquetV2QuerySuite
- exclude: simple select queries
- exclude: appending
- exclude: self-join
- exclude: create table with TIME
- exclude: SPARK-10634 timestamp written and read as INT64 - truncation
- exclude: SPARK-26677: negated null-safe equality comparison should not filter matched row groups
- exclude: Migration from INT96 to TIMESTAMP_MICROS timestamp type
- exclude: returning batch for wide table
AuronParquetV2SchemaPruningSuite
- disable: Native parquet schema pruning reads fail in Spark 4
Larger groups can be split into smaller issues or PRs.
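One possible way to find candidate groups is to bucket the disable entries by their stated reason, so that suites sharing a reason can go into one follow-up issue. The sketch below is a hypothetical helper, not part of the Auron repository: the name group_by_reason and the (suite, kind, detail) tuple layout are assumptions for illustration, and the sample entries are a small subset copied from the list above.

```python
from collections import defaultdict

# Sample (suite, kind, detail) entries copied from the tracked list.
# The tuple layout is an assumption made for this sketch; extend as needed.
ENTRIES = [
    ("AuronMathFunctionsSuite", "disable", "Native execution can crash in Spark 4"),
    ("AuronParquetIOSuite", "disable", "Native execution can crash in Spark 4"),
    ("AuronParquetRebaseDatetimeV1Suite", "disable",
     "Spark 4 test resources use jar paths unsupported by Hadoop Path"),
    ("AuronParquetThriftCompatibilitySuite", "disable",
     "Spark 4 test resources use jar paths unsupported by Hadoop Path"),
    ("AuronParquetFileFormatSuite", "exclude", "Write and read back TIME values"),
    ("AuronParquetQuerySuite", "exclude", "create table with TIME"),
]


def group_by_reason(entries):
    """Bucket disabled suites by reason and excluded tests by suite."""
    disabled = defaultdict(list)  # reason -> suites disabled for that reason
    excluded = defaultdict(list)  # suite -> excluded test names
    for suite, kind, detail in entries:
        if kind == "disable":
            disabled[detail].append(suite)
        else:
            excluded[suite].append(detail)
    return dict(disabled), dict(excluded)


disabled, excluded = group_by_reason(ENTRIES)
for reason, suites in disabled.items():
    print(f"{reason}: {', '.join(suites)}")
```

Each key in disabled then maps to one candidate follow-up issue, while excluded keeps the per-suite test names for finer-grained review.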