Branch: master
Commits on Sep 24, 2019
  1. [SPARK-28292][SQL] Enable Injection of User-defined Hint

    gatorsmile authored and cloud-fan committed Sep 24, 2019
    ### What changes were proposed in this pull request?
    Move the rule `RemoveAllHints` after the batch `Resolution`.
    
    ### Why are the changes needed?
    User-defined hints can be resolved by the rules injected via `extendedResolutionRules` or `postHocResolutionRules`.
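
    A minimal sketch of such an injection (illustrative only: `ResolveMyHint` and the hint name `MY_HINT` are hypothetical; the wiring uses the existing `SparkSessionExtensions.injectPostHocResolutionRule` API):

    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, UnresolvedHint}
    import org.apache.spark.sql.catalyst.rules.Rule

    // Hypothetical rule: strip a custom hint named MY_HINT once it is handled.
    // A real rule would attach the hint's information to the plan instead.
    case class ResolveMyHint(spark: SparkSession) extends Rule[LogicalPlan] {
      override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
        case UnresolvedHint("MY_HINT", _, child) => child
      }
    }

    val spark = SparkSession.builder()
      .master("local[*]")
      .withExtensions(_.injectPostHocResolutionRule(ResolveMyHint))
      .getOrCreate()
    ```

    Before this change, `RemoveAllHints` ran before such injected rules and discarded the hint first.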
    
    ### Does this PR introduce any user-facing change?
    No
    
    ### How was this patch tested?
    Added a test case
    
    Closes #25746 from gatorsmile/moveRemoveAllHints.
    
    Authored-by: Xiao Li <gatorsmile@gmail.com>
    Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Commits on Sep 3, 2019
  1. [SPARK-28961][HOT-FIX][BUILD] Upgrade Maven from 3.6.1 to 3.6.2

    gatorsmile committed Sep 3, 2019
    ### What changes were proposed in this pull request?
    This PR upgrades the Maven version from 3.6.1 to 3.6.2.
    
    ### Why are the changes needed?
    All the builds are broken because Maven 3.6.1 is no longer available at http://ftp.wayne.edu/apache//maven/maven-3/:
    
    - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-maven-hadoop-3.2/485/
    - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-maven-hadoop-2.7/10536/
    
    ![image](https://user-images.githubusercontent.com/11567269/64196667-36d69100-ce39-11e9-8f93-40eb333d595d.png)
    
    ### Does this PR introduce any user-facing change?
    No
    
    ### How was this patch tested?
    N/A
    
    Closes #25665 from gatorsmile/upgradeMVN.
    
    Authored-by: Xiao Li <gatorsmile@gmail.com>
    Signed-off-by: Xiao Li <gatorsmile@gmail.com>
Commits on Aug 23, 2019
  1. Revert "[SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables"

    gatorsmile authored and dongjoon-hyun committed Aug 23, 2019
    
    This reverts commit 485ae6d.
    
    Closes #25563 from gatorsmile/revert.
    
    Authored-by: Xiao Li <gatorsmile@gmail.com>
    Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Commits on Aug 2, 2019
  1. [SPARK-28532][SPARK-28530][SQL][FOLLOWUP] Inline doc for FixedPoint(1) batches "Subquery" and "Join Reorder"

    gatorsmile and yeshengm committed Aug 2, 2019
    
    ## What changes were proposed in this pull request?
    Explained why "Subquery" and "Join Reorder" optimization batches should be `FixedPoint(1)`, which was introduced in SPARK-28532 and SPARK-28530.
    
    ## How was this patch tested?
    
    Existing UTs.
    
    Closes #25320 from yeshengm/SPARK-28530-followup.
    
    Lead-authored-by: Xiao Li <gatorsmile@gmail.com>
    Co-authored-by: Yesheng Ma <kimi.ysma@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Jul 12, 2019
  1. [SPARK-28361][SQL][TEST] Test equality of generated code with id in class name

    gatorsmile authored and dongjoon-hyun committed Jul 12, 2019
    
    A code gen test in WholeStageCodeGenSuite was flaky because it used the codegen metrics class to test if the generated code for equivalent plans was identical under a particular flag. This patch switches the test to compare the generated code directly.
    
    N/A
    
    Closes #25131 from gatorsmile/WholeStageCodegenSuite.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Commits on May 31, 2019
  1. [SPARK-27773][FOLLOW-UP] Fix Checkstyle failure

    gatorsmile authored and dongjoon-hyun committed May 31, 2019
    ## What changes were proposed in this pull request?
    
    https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-lint/
    
    ```
    Checkstyle checks failed at following occurrences:
    [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleServiceMetrics.java:[99] (sizes) LineLength: Line is longer than 100 characters (found 104).
    [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleServiceMetrics.java:[101] (sizes) LineLength: Line is longer than 100 characters (found 101).
    [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleServiceMetrics.java:[103] (sizes) LineLength: Line is longer than 100 characters (found 102).
    [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleServiceMetrics.java:[105] (sizes) LineLength: Line is longer than 100 characters (found 103).
    ```
    
    ## How was this patch tested?
    N/A
    
    Closes #24760 from gatorsmile/updateYarnShuffleServiceMetrics.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Commits on May 23, 2019
  1. [SPARK-27770][SQL][PART 1] Port AGGREGATES.sql

    gatorsmile authored and jiangxb1987 committed May 23, 2019
    ## What changes were proposed in this pull request?
    
    This PR is to port AGGREGATES.sql from PostgreSQL regression tests. https://github.com/postgres/postgres/blob/02ddd499322ab6f2f0d58692955dc9633c2150fc/src/test/regress/sql/aggregates.sql#L1-L143
    
    The expected results can be found in the link: https://github.com/postgres/postgres/blob/master/src/test/regress/expected/aggregates.out
    
    While porting the test cases, we found three PostgreSQL-specific features that do not exist in Spark SQL:
    - https://issues.apache.org/jira/browse/SPARK-27765: Type Casts: expression::type
    - https://issues.apache.org/jira/browse/SPARK-27766: Data type: POINT(x, y)
    - https://issues.apache.org/jira/browse/SPARK-27767: Built-in function: generate_series
    
    We also found two bugs:
    - https://issues.apache.org/jira/browse/SPARK-27768: Infinity, -Infinity, NaN should be recognized in a case insensitive manner
    - https://issues.apache.org/jira/browse/SPARK-27769: Handling of sublinks within outer-level aggregates.
    
    This PR also fixes the error message when the column can't be resolved.
    
    For running the regression tests, this PR also adds three tables `aggtest`, `onek` and `tenk1` from the PostgreSQL data sets: https://github.com/postgres/postgres/tree/02ddd499322ab6f2f0d58692955dc9633c2150fc/src/test/regress/data
    
    ## How was this patch tested?
    N/A
    
    Closes #24640 from gatorsmile/addTestCase.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Commits on May 13, 2019
  1. [MINOR][REPL] Remove dead code of Spark Repl in Scala 2.11

    gatorsmile authored and cloud-fan committed May 13, 2019
    ## What changes were proposed in this pull request?
    
    Remove the dead code of the Spark REPL for Scala 2.11, since we only support Scala 2.12 now.
    
    ## How was this patch tested?
    N/A
    
    Closes #24588 from gatorsmile/removeDeadCode.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Commits on May 2, 2019
  1. [SPARK-27618][SQL][FOLLOW-UP] Unnecessary access to externalCatalog

    gatorsmile authored and dongjoon-hyun committed May 2, 2019
    ## What changes were proposed in this pull request?
    This PR is to add test cases for ensuring that we do not have unnecessary access to externalCatalog.
    
    In the future, we can follow these examples to improve our test coverage in this area.
    
    ## How was this patch tested?
    N/A
    
    Closes #24511 from gatorsmile/addTestcaseSpark-27618.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Commits on Apr 24, 2019
  1. [SPARK-27460][FOLLOW-UP][TESTS] Fix flaky tests

    2 people authored and cloud-fan committed Apr 24, 2019
    ## What changes were proposed in this pull request?
    
    This patch fixes several sources of test flakiness.
    
    ## How was this patch tested?
    N/A
    
    Closes #24434 from gatorsmile/fixFlakyTest.
    
    Lead-authored-by: gatorsmile <gatorsmile@gmail.com>
    Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
    Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Commits on Apr 17, 2019
  1. [SPARK-27479][BUILD] Hide API docs for org.apache.spark.util.kvstore

    gatorsmile committed Apr 17, 2019
    ## What changes were proposed in this pull request?
    
    The API docs should not include the "org.apache.spark.util.kvstore" package because they are internal private APIs. See the doc link: https://spark.apache.org/docs/latest/api/java/org/apache/spark/util/kvstore/LevelDB.html
    
    ## How was this patch tested?
    N/A
    
    Closes #24386 from gatorsmile/rmDoc.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Apr 5, 2019
  1. [SPARK-27393][SQL] Show ReusedSubquery in the plan when the subquery is reused

    gatorsmile committed Apr 5, 2019
    
    ## What changes were proposed in this pull request?
    With this change, we can easily identify the plan difference when a subquery is reused.
    
    When the reuse is enabled, the plan looks like
    ```
    == Physical Plan ==
    CollectLimit 1
    +- *(1) Project [(Subquery subquery240 + ReusedSubquery Subquery subquery240) AS (scalarsubquery() + scalarsubquery())#253]
       :  :- Subquery subquery240
       :  :  +- *(2) HashAggregate(keys=[], functions=[avg(cast(key#13 as bigint))], output=[avg(key)#250])
       :  :     +- Exchange SinglePartition
       :  :        +- *(1) HashAggregate(keys=[], functions=[partial_avg(cast(key#13 as bigint))], output=[sum#256, count#257L])
       :  :           +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13]
       :  :              +- Scan[obj#12]
       :  +- ReusedSubquery Subquery subquery240
       +- *(1) SerializeFromObject
          +- Scan[obj#12]
    ```
    
    When the reuse is disabled, the plan looks like
    ```
    == Physical Plan ==
    CollectLimit 1
    +- *(1) Project [(Subquery subquery286 + Subquery subquery287) AS (scalarsubquery() + scalarsubquery())#299]
       :  :- Subquery subquery286
       :  :  +- *(2) HashAggregate(keys=[], functions=[avg(cast(key#13 as bigint))], output=[avg(key)#296])
       :  :     +- Exchange SinglePartition
       :  :        +- *(1) HashAggregate(keys=[], functions=[partial_avg(cast(key#13 as bigint))], output=[sum#302, count#303L])
       :  :           +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13]
       :  :              +- Scan[obj#12]
       :  +- Subquery subquery287
       :     +- *(2) HashAggregate(keys=[], functions=[avg(cast(key#13 as bigint))], output=[avg(key)#298])
       :        +- Exchange SinglePartition
       :           +- *(1) HashAggregate(keys=[], functions=[partial_avg(cast(key#13 as bigint))], output=[sum#306, count#307L])
       :              +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#13]
       :                 +- Scan[obj#12]
       +- *(1) SerializeFromObject
          +- Scan[obj#12]
    ```
    
    ## How was this patch tested?
    Modified the existing test.
    
    Closes #24258 from gatorsmile/followupSPARK-27279.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Mar 31, 2019
  1. [SPARK-27244][CORE][TEST][FOLLOWUP] toDebugString redacts sensitive information

    gatorsmile authored and dongjoon-hyun committed Mar 31, 2019
    
    ## What changes were proposed in this pull request?
    This PR is a follow-up of #24196. It improves the test case by using the parameters that are used in actual scenarios.
    
    ## How was this patch tested?
    N/A
    
    Closes #24257 from gatorsmile/followupSPARK-27244.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Commits on Dec 17, 2018
  1. [SPARK-20636] Add the rule TransposeWindow to the optimization batch

    gatorsmile committed Dec 17, 2018
    ## What changes were proposed in this pull request?
    
    This PR is a follow-up of PR #17899. It adds the rule TransposeWindow to the optimizer batch.
    
    ## How was this patch tested?
    The existing tests.
    
    Closes #23222 from gatorsmile/followupSPARK-20636.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
  2. [SPARK-26327][SQL][FOLLOW-UP] Refactor the code and restore the metrics name

    gatorsmile committed Dec 17, 2018
    
    ## What changes were proposed in this pull request?
    
    - The original comment about `updateDriverMetrics` is not right.
    - Refactor the code to ensure `selectedPartitions` has been set before sending the driver-side metrics.
    - Restore the original name, which is more general and extendable.
    
    ## How was this patch tested?
    The existing tests.
    
    Closes #23328 from gatorsmile/followupSpark-26142.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Dec 10, 2018
  1. [SPARK-26307][SQL] Fix CTAS when INSERT a partitioned table using Hive serde

    gatorsmile authored and cloud-fan committed Dec 10, 2018
    
    ## What changes were proposed in this pull request?
    
    This is a Spark 2.3 regression introduced in #20521. We should add the partition info for InsertIntoHiveTable in CreateHiveTableAsSelectCommand. Otherwise, we hit the following error when running the newly added test case:
    
    ```
    [info] - CTAS: INSERT a partitioned table using Hive serde *** FAILED *** (829 milliseconds)
    [info]   org.apache.spark.SparkException: Requested partitioning does not match the tab1 table:
    [info] Requested partitions:
    [info] Table partitions: part
    [info]   at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.processInsert(InsertIntoHiveTable.scala:179)
    [info]   at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:107)
    ```
    
    ## How was this patch tested?
    
    Added a test case.
    
    Closes #23255 from gatorsmile/fixCTAS.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Commits on Nov 27, 2018
  1. [SPARK-25860][SPARK-26107][FOLLOW-UP] Rule ReplaceNullWithFalseInPredicate

    gatorsmile authored and dbtsai committed Nov 27, 2018
    
    ## What changes were proposed in this pull request?
    
    Based on #22857 and #23079, this PR makes a few updates:
    
    - Limit the data type of the NULL literal to Boolean.
    - Limit the input data type of `replaceNullWithFalse` to Boolean; throw an exception in testing mode.
    - Create a new file for the rule ReplaceNullWithFalseInPredicate.
    - Update the description of this rule.
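
    The premise of the rule can be illustrated with a small sketch (illustrative only; assumes an active `SparkSession` named `spark`): in a predicate position, a Boolean-typed NULL behaves like FALSE, which is what makes the rewrite safe there.

    ```scala
    import spark.implicits._

    val df = Seq(1, 2, 3).toDF("a")
    // IF(a = 1, NULL, TRUE) evaluates to NULL for a = 1; that row is filtered
    // out exactly as if the condition had been FALSE, so the rule may replace
    // the Boolean NULL with FALSE inside the filter condition.
    df.filter("IF(a = 1, NULL, TRUE)").show()  // rows with a = 2 and a = 3
    ```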
    
    ## How was this patch tested?
    Added a test case
    
    Closes #23139 from gatorsmile/followupSpark-25860.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: DB Tsai <d_tsai@apple.com>
Commits on Nov 26, 2018
  1. [SPARK-26168][SQL] Update the code comments in Expression and Aggregate

    gatorsmile authored and cloud-fan committed Nov 26, 2018
    ## What changes were proposed in this pull request?
    This PR improves the code comments to document some common traits and pitfalls of expressions.
    
    ## How was this patch tested?
    N/A
    
    Closes #23135 from gatorsmile/addcomments.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Wenchen Fan <wenchen@databricks.com>
  2. [SPARK-26169] Create DataFrameSetOperationsSuite

    gatorsmile authored and cloud-fan committed Nov 26, 2018
    ## What changes were proposed in this pull request?
    
    Create a new suite DataFrameSetOperationsSuite for the test cases of DataFrame/Dataset's set operations.
    
    Also, add test cases of NULL handling for Array Except and Array Intersect.
    
    ## How was this patch tested?
    N/A
    
    Closes #23137 from gatorsmile/setOpsTest.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Commits on Nov 25, 2018
  1. [SPARK-25908][SQL][FOLLOW-UP] Add back unionAll

    gatorsmile committed Nov 25, 2018
    ## What changes were proposed in this pull request?
    This PR adds back `unionAll`, which is widely used. The name is also consistent with ANSI SQL. We also have the corresponding `intersectAll` and `exceptAll`, which were introduced in Spark 2.4.
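
    A quick sketch of the semantics (illustrative; assumes an active `SparkSession` named `spark`):

    ```scala
    import spark.implicits._

    val df = Seq(1, 1, 2).toDF("a")
    df.unionAll(df).count()          // 6: keeps duplicates, same as union
    df.union(df).count()             // 6: UNION ALL semantics
    df.union(df).distinct().count()  // 2: SQL's UNION DISTINCT semantics
    ```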
    
    ## How was this patch tested?
    Added a test case in DataFrameSuite
    
    Closes #23131 from gatorsmile/addBackUnionAll.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Nov 12, 2018
  1. [SPARK-26005][SQL] Upgrade ANTLR from 4.7 to 4.7.1

    gatorsmile committed Nov 12, 2018
    ## What changes were proposed in this pull request?
    Based on the release notes of ANTLR 4.7.1 (https://github.com/antlr/antlr4/releases), upgrade our parser to 4.7.1.
    
    ## How was this patch tested?
    N/A
    
    Closes #23005 from gatorsmile/upgradeAntlr4.7.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Nov 9, 2018
  1. [SPARK-25988][SQL] Keep names unchanged when deduplicating the column names in Analyzer

    gatorsmile committed Nov 9, 2018
    
    ## What changes were proposed in this pull request?
    When queries reference column names with inconsistent case, users might hit various errors. Below is a typical test failure they can hit.
    ```
    Expected only partition pruning predicates: ArrayBuffer(isnotnull(tdate#237), (cast(tdate#237 as string) >= 2017-08-15));
    org.apache.spark.sql.AnalysisException: Expected only partition pruning predicates: ArrayBuffer(isnotnull(tdate#237), (cast(tdate#237 as string) >= 2017-08-15));
    	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.prunePartitionsByFilter(ExternalCatalogUtils.scala:146)
    	at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.listPartitionsByFilter(InMemoryCatalog.scala:560)
    	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listPartitionsByFilter(SessionCatalog.scala:925)
    ```
    
    ## How was this patch tested?
    Added two test cases.
    
    Closes #22990 from gatorsmile/fix1283.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Oct 16, 2018
  1. [SPARK-25674][FOLLOW-UP] Update the stats for each ColumnarBatch

    gatorsmile authored and cloud-fan committed Oct 16, 2018
    ## What changes were proposed in this pull request?
    This PR is a follow-up of #22594. This alternative avoids unneeded computation in the hot code path.
    
    - For row-based scan, we keep the original way.
    - For the columnar scan, we just need to update the stats after each batch.
    
    ## How was this patch tested?
    N/A
    
    Closes #22731 from gatorsmile/udpateStatsFileScanRDD.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Commits on Oct 14, 2018
  1. [SPARK-25372][YARN][K8S][FOLLOW-UP] Deprecate and generalize keytab / principal config

    gatorsmile authored and HyukjinKwon committed Oct 14, 2018
    
    ## What changes were proposed in this pull request?
    Update the next version of Spark from 2.5 to 3.0
    
    ## How was this patch tested?
    N/A
    
    Closes #22717 from gatorsmile/followupSPARK-25372.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: hyukjinkwon <gurwls223@apache.org>
  2. [SPARK-25727][SQL] Add outputOrdering to otherCopyArgs in InMemoryRelation

    gatorsmile authored and dongjoon-hyun committed Oct 14, 2018
    
    ## What changes were proposed in this pull request?
    Add `outputOrdering` to `otherCopyArgs` in InMemoryRelation so that this field is copied during tree transformations.
    
    ```
        val data = Seq(100).toDF("count").cache()
        data.queryExecution.optimizedPlan.toJSON
    ```
    
    The above code can generate the following error:
    
    ```
    assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
    +- LocalTableScan [value#176]
    ,None), Statistics(sizeInBytes=12.0 B, hints=none)
    java.lang.AssertionError: assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
    +- LocalTableScan [value#176]
    ,None), Statistics(sizeInBytes=12.0 B, hints=none)
    	at scala.Predef$.assert(Predef.scala:170)
    	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonFields(TreeNode.scala:611)
    	at org.apache.spark.sql.catalyst.trees.TreeNode.org$apache$spark$sql$catalyst$trees$TreeNode$$collectJsonValue$1(TreeNode.scala:599)
    	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonValue(TreeNode.scala:604)
    	at org.apache.spark.sql.catalyst.trees.TreeNode.toJSON(TreeNode.scala:590)
    ```
    
    ## How was this patch tested?
    
    Added a test
    
    Closes #22715 from gatorsmile/copyArgs1.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Commits on Oct 13, 2018
  1. [SPARK-25714] Fix Null Handling in the Optimizer rule BooleanSimplification

    gatorsmile committed Oct 13, 2018
    
    ## What changes were proposed in this pull request?
    ```Scala
        val df1 = Seq(("abc", 1), (null, 3)).toDF("col1", "col2")
        df1.write.mode(SaveMode.Overwrite).parquet("/tmp/test1")
        val df2 = spark.read.parquet("/tmp/test1")
        df2.filter("col1 = 'abc' OR (col1 != 'abc' AND col2 == 3)").show()
    ```
    
    Before the PR, the query returns both rows. After the fix, it returns only `Row("abc", 1)`. This fixes a bug in NULL handling in BooleanSimplification that was introduced in the Spark 1.6 release.
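
    Why the old simplification was unsound can be sketched with a hand-rolled three-valued logic model (plain Scala, not Spark code; `None` models SQL NULL). For the `(null, 3)` row, `col1 = 'abc'` is NULL, so `a OR (NOT a AND b)` is NULL and the row is filtered, while the over-eager rewrite to `a OR b` yields TRUE and wrongly keeps it:

    ```scala
    def or3(a: Option[Boolean], b: Option[Boolean]): Option[Boolean] = (a, b) match {
      case (Some(true), _) | (_, Some(true)) => Some(true)    // TRUE dominates OR
      case (Some(false), Some(false))        => Some(false)
      case _                                 => None          // NULL otherwise
    }
    def and3(a: Option[Boolean], b: Option[Boolean]): Option[Boolean] = (a, b) match {
      case (Some(false), _) | (_, Some(false)) => Some(false) // FALSE dominates AND
      case (Some(true), Some(true))            => Some(true)
      case _                                   => None
    }
    def not3(a: Option[Boolean]): Option[Boolean] = a.map(!_)

    val a: Option[Boolean] = None       // col1 = 'abc' for the NULL row
    val b: Option[Boolean] = Some(true) // col2 == 3

    or3(a, and3(not3(a), b))  // None: row correctly filtered out
    or3(a, b)                 // Some(true): the bad rewrite wrongly keeps the row
    ```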
    
    ## How was this patch tested?
    Added test cases
    
    Closes #22702 from gatorsmile/fixBooleanSimplify2.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Oct 9, 2018
  1. [SPARK-25559][FOLLOW-UP] Add comments for partial pushdown of conjuncts in Parquet

    gatorsmile authored and dbtsai committed Oct 9, 2018
    
    ## What changes were proposed in this pull request?
    This is a follow-up of #22574. It renames the parameter and adds comments.
    
    ## How was this patch tested?
    N/A
    
    Closes #22679 from gatorsmile/followupSPARK-25559.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: DB Tsai <d_tsai@apple.com>
Commits on Oct 6, 2018
  1. [SPARK-25671] Build external/spark-ganglia-lgpl in Jenkins Test

    gatorsmile committed Oct 6, 2018
    ## What changes were proposed in this pull request?
    Currently, we do not build external/spark-ganglia-lgpl in Jenkins tests when the code is changed.
    
    ## How was this patch tested?
    N/A
    
    Closes #22658 from gatorsmile/buildGanglia.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
  2. [MINOR] Clean up the joinCriteria in SQL parser

    gatorsmile committed Oct 6, 2018
    ## What changes were proposed in this pull request?
    Clean up the joinCriteria parsing in the parser by directly using identifierList.
    
    ## How was this patch tested?
    N/A
    
    Closes #22648 from gatorsmile/cleanupJoinCriteria.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
  3. [SPARK-25655][BUILD] Add -Pspark-ganglia-lgpl to the scala style check.

    gatorsmile authored and HyukjinKwon committed Oct 6, 2018
    ## What changes were proposed in this pull request?
    Our lint check failed with the following errors:
    ```
    [INFO] --- scalastyle-maven-plugin:1.0.0:check (default)  spark-ganglia-lgpl_2.11 ---
    error file=/home/jenkins/workspace/spark-master-maven-snapshots/spark/external/spark-ganglia-lgpl/src/main/scala/org/apache/spark/metrics/sink/GangliaSink.scala message=
          Are you sure that you want to use toUpperCase or toLowerCase without the root locale? In most cases, you
          should use toUpperCase(Locale.ROOT) or toLowerCase(Locale.ROOT) instead.
          If you must use toUpperCase or toLowerCase without the root locale, wrap the code block with
          // scalastyle:off caselocale
          .toUpperCase
          .toLowerCase
          // scalastyle:on caselocale
         line=67 column=49
    error file=/home/jenkins/workspace/spark-master-maven-snapshots/spark/external/spark-ganglia-lgpl/src/main/scala/org/apache/spark/metrics/sink/GangliaSink.scala message=
          Are you sure that you want to use toUpperCase or toLowerCase without the root locale? In most cases, you
          should use toUpperCase(Locale.ROOT) or toLowerCase(Locale.ROOT) instead.
          If you must use toUpperCase or toLowerCase without the root locale, wrap the code block with
          // scalastyle:off caselocale
          .toUpperCase
          .toLowerCase
          // scalastyle:on caselocale
         line=71 column=32
    Saving to outputFile=/home/jenkins/workspace/spark-master-maven-snapshots/spark/external/spark-ganglia-lgpl/target/scalastyle-output.xml
    ```
    
    See https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-lint/8890/
    
    ## How was this patch tested?
    N/A
    
    Closes #22647 from gatorsmile/fixLint.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: hyukjinkwon <gurwls223@apache.org>
Commits on Oct 2, 2018
  1. [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

    gatorsmile committed Oct 2, 2018
    ## What changes were proposed in this pull request?
    
    This patch is to bump the master branch version to 3.0.0-SNAPSHOT.
    
    ## How was this patch tested?
    N/A
    
    Closes #22606 from gatorsmile/bump3.0.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Commits on Sep 26, 2018
  1. [SPARK-24324][PYTHON][FOLLOW-UP] Rename the Conf to spark.sql.legacy.execution.pandas.groupedMap.assignColumnsByName

    gatorsmile and HyukjinKwon committed Sep 26, 2018
    
    ## What changes were proposed in this pull request?
    
    Add the `legacy` prefix to `spark.sql.execution.pandas.groupedMap.assignColumnsByPosition` and rename it to `spark.sql.legacy.execution.pandas.groupedMap.assignColumnsByName`.
    
    ## How was this patch tested?
    The existing tests.
    
    Closes #22540 from gatorsmile/renameAssignColumnsByPosition.
    
    Lead-authored-by: gatorsmile <gatorsmile@gmail.com>
    Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
    Signed-off-by: hyukjinkwon <gurwls223@apache.org>
Commits on Sep 23, 2018
  1. [MINOR][PYSPARK] Always Close the tempFile in _serialize_to_jvm

    gatorsmile authored and HyukjinKwon committed Sep 23, 2018
    ## What changes were proposed in this pull request?
    
    Always close the tempFile after `serializer.dump_stream(data, tempFile)` in `_serialize_to_jvm`.
    
    ## How was this patch tested?
    
    N/A
    
    Closes #22523 from gatorsmile/fixMinor.
    
    Authored-by: gatorsmile <gatorsmile@gmail.com>
    Signed-off-by: hyukjinkwon <gurwls223@apache.org>