
[SPARK-25135][SQL] Insert datasource table may all null when select from view #22124

Closed · wants to merge 7 commits

Conversation
Conversation

@wangyum (Member) commented Aug 16, 2018

What changes were proposed in this pull request?

How to reproduce:

val path = "/tmp/spark/parquet"
val cnt = 30
spark.range(cnt).selectExpr("id as col1").write.mode("overwrite").parquet(path)
spark.sql(s"CREATE TABLE table1(col1 bigint) using parquet location '$path'")
spark.sql("create view view1 as select col1 from table1 where col1 > -20")
// The column name of table2 differs in case from the column name of view1.
spark.sql("create table table2 (COL1 BIGINT) using parquet")
// The insert query selects COL1 so that the query's column name matches the target table's column name.
spark.sql("insert overwrite table table2 select COL1 from view1")
// At this point the target table's column name is uppercase, but the column name written to the Parquet file keeps the view's lower-case name, so every value reads back as null.
spark.table("table2").show
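
For reference: with this bug, the final show() prints null for every row instead of the values 0 to 29. A rough sketch of the output (exact console formatting may differ):

+----+
|COL1|
+----+
|null|
|null|
| ...|
+----+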

The root cause appears when inserting into a table from a query that contains a view, for example:

insert overwrite table table2 select COL1 from view1

the execution plan is optimized to:

=== Applying Rule org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-c0
 Database: default                                                                        Database: default
 Table: table2                                                                            Table: table2
 Owner: yumwang                                                                           Owner: yumwang
 Created Time: Sun Aug 19 22:54:11 PDT 2018                                               Created Time: Sun Aug 19 22:54:11 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                         Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                            Type: MANAGED
 Provider: parquet                                                                        Provider: parquet
 Table Properties: [transient_lastDdlTime=1534744451]                                     Table Properties: [transient_lastDdlTime=1534744451]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-c044   Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-c0444fd8-772b-4841-9182-3c
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe               Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat               InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat             OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                             Storage Properties: [serialization.format=1]
 Schema: root                                                                             Schema: root
-- COL1: long (nullable = true)                                                         |-- COL1: long (nullable = true)
!), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@5c0e0003, [COL1#6L]      ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@5c0e0003, [col1#5L]
!+- Project [col1#5L AS col1#6L]                                                          +- Project [col1#5L]
    +- Filter (col1#5L > -20)                                                                +- Filter (col1#5L > -20)
       +- Relation[col1#5L] parquet 

At this point outputColumns is col1#5L, so Spark uses col1 as the column name in the Parquet file.

Before SPARK-22834, allColumns was queryExecution.analyzed.output, which is not optimized.
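
To illustrate why the lower-case name in the data file leads to nulls, here is a standalone sketch (illustrative only, not code from this PR; the path is arbitrary, and the exact behavior depends on the Spark version's Parquet field-name resolution):

// Write a file whose Parquet schema uses the lower-case name `col1`.
spark.range(5).selectExpr("id as col1").write.mode("overwrite").parquet("/tmp/spark/parquet_case")
// Read it back while asking for the upper-case name `COL1`: the requested field is not found
// in the file's schema, so every value comes back as null -- the same effect seen on table2.
spark.read.schema("COL1 BIGINT").parquet("/tmp/spark/parquet_case").show()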

I have three ways to solve this issue:

  1. Replace the column name from the query with the view's actual column name, but it is difficult to handle all cases. Please see the third commit.

  2. Do not remove redundant aliases if the plan is a Command, because the aliases may still be needed.

  3. Change this line from case a if resolver(a.name, name) => a.withName(name) to case a if resolver(a.name, name) => a. But this change causes some test failures:

[info] - order-by-nulls-ordering.sql *** FAILED *** (4 seconds, 789 milliseconds)
[info]   Expected "struct<[COL1:int,COL2:int,COL]3:int>", but got "struct<[col1:int,col2:int,col]3:int>" Schema did not match for query #6
[info]   SELECT COL1, COL2, COL3 FROM spark_10747 ORDER BY COL3 ASC NULLS FIRST, COL2 (SQLQueryTestSuite.scala:246)

This PR uses the second approach to fix the issue.
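
As a rough sketch of the shape of that change to RemoveRedundantAliases (the exact diff is quoted in the review comments below):

// Skip alias removal when the plan is a Command (e.g. InsertIntoHadoopFsRelationCommand),
// so the aliases that carry the target table's column names are kept.
def apply(plan: LogicalPlan): LogicalPlan = plan match {
  case c: Command => c
  case _ => removeRedundantAliases(plan, AttributeSet.empty)
}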

How was this patch tested?

unit tests

@wangyum (Member Author) commented Aug 16, 2018

cc @gengliangwang

@SparkQA commented Aug 16, 2018

Test build #94861 has finished for PR 22124 at commit 276879c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -490,7 +490,8 @@ object DDLPreprocessingUtils {
       case (expected, actual) =>
         if (expected.dataType.sameType(actual.dataType) &&
           expected.name == actual.name &&
-          expected.metadata == actual.metadata) {
+          expected.metadata == actual.metadata &&
+          expected.exprId.id == actual.exprId.id) {
Contributor

why does this fix the problem?

Member Author

This is not a correct change. Please ignore it.

@@ -901,6 +901,12 @@ class Analyzer(
       // If the projection list contains Stars, expand it.
       case p: Project if containsStar(p.projectList) =>
         p.copy(projectList = buildExpandedProjectList(p.projectList, p.child))
+      case p @ Project(projectList, _ @ SubqueryAlias(_, view: View))
Member Author

Replace [COL1#10L, COL2#11L] with [col1#10L, col2#11L], so that DDLPreprocessingUtils.castAndRenameQueryOutput takes effect and renames the columns to match the target table's column names. Then the Parquet column names match the table column names and we can read them.
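
For context, a simplified sketch of what DDLPreprocessingUtils.castAndRenameQueryOutput does (reconstructed for illustration, not the exact Spark source): it keeps a query attribute untouched only when its name, type and metadata already match the target column, otherwise it renames (and casts) it, which is why lower-casing COL1#6L back to col1#6L lets the rename kick in:

import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Cast, NamedExpression}
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}

// Sketch only: match each query output attribute against the target table's attribute.
def castAndRenameQueryOutput(query: LogicalPlan, expectedOutput: Seq[Attribute]): LogicalPlan = {
  val newChildOutput: Seq[NamedExpression] = expectedOutput.zip(query.output).map {
    case (expected, actual) =>
      if (expected.dataType.sameType(actual.dataType) &&
          expected.name == actual.name &&
          expected.metadata == actual.metadata) {
        actual  // already matches the target column; keep it as-is (no rename happens)
      } else {
        // rename (and cast if needed) to the target table's column name
        Alias(Cast(actual, expected.dataType), expected.name)()
      }
  }
  if (newChildOutput == query.output) query else Project(newChildOutput, query)
}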


@cloud-fan (Contributor)

can you explain how this bug happens and what's the root cause?

@wangyum (Member Author) commented Aug 17, 2018

Thanks @cloud-fan, I updated the description.

@SparkQA commented Aug 17, 2018

Test build #94882 has finished for PR 22124 at commit bb93ca0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

So DDLPreprocessingUtils.castAndRenameQueryOutput() matches and does not do any rename.
So the column in the target table's Parquet file stays lower case.

Why are the Parquet file columns lower-cased? The root project has names upper-cased, doesn't it?

@wangyum (Member Author) commented Aug 18, 2018

The root project should be consistent with the schema of the target table. But it is inconsistent now.

Before this PR:
dataColumns:
col1#8L,col2#9L
plan:

*(1) Project [col1#8L, col2#9L]
+- *(1) Filter (isnotnull(col1#8L) && (col1#8L > -20))
   +- *(1) FileScan parquet default.table1[col1#8L,col2#9L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/tmp/yumwang/spark/parquet], PartitionFilters: [], PushedFilters: [IsNotNull(col1), GreaterThan(col1,-20)], ReadSchema: struct<col1:bigint,col2:bigint>

After this PR:
dataColumns:
COL1#14L,COL2#15L
plan:

*(1) Project [col1#8L AS COL1#14L, col2#9L AS COL2#15L]
+- *(1) Filter (isnotnull(col1#8L) && (col1#8L > -20))
   +- *(1) FileScan parquet default.table1[col1#8L,col2#9L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/tmp/yumwang/spark/parquet], PartitionFilters: [], PushedFilters: [IsNotNull(col1), GreaterThan(col1,-20)], ReadSchema: struct<col1:bigint,col2:bigint>

Before SPARK-22834
dataColumns:
COL1#19L,COL2#20L

queryExecution:

== Parsed Logical Plan ==
Project [COL1#19L, COL2#20L]
+- SubqueryAlias view1
   +- View (`default`.`view1`, [col1#19L,col2#20L])
      +- Project [col1#15L, col2#16L]
         +- Filter (col1#15L > cast(-20 as bigint))
            +- SubqueryAlias table1
               +- Relation[col1#15L,col2#16L] parquet

== Analyzed Logical Plan ==
COL1: bigint, COL2: bigint
Project [COL1#19L, COL2#20L]
+- SubqueryAlias view1
   +- View (`default`.`view1`, [col1#19L,col2#20L])
      +- Project [cast(col1#15L as bigint) AS col1#19L, cast(col2#16L as bigint) AS col2#20L]
         +- Project [col1#15L, col2#16L]
            +- Filter (col1#15L > cast(-20 as bigint))
               +- SubqueryAlias table1
                  +- Relation[col1#15L,col2#16L] parquet

== Optimized Logical Plan ==
Filter (isnotnull(col1#15L) && (col1#15L > -20))
+- Relation[col1#15L,col2#16L] parquet

== Physical Plan ==
*Project [col1#15L, col2#16L]
+- *Filter (isnotnull(col1#15L) && (col1#15L > -20))
   +- *FileScan parquet default.table1[col1#15L,col2#16L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/tmp/yumwang/spark/parquet], PartitionFilters: [], PushedFilters: [IsNotNull(col1), GreaterThan(col1,-20)], ReadSchema: struct<col1:bigint,col2:bigint>

@SparkQA commented Aug 18, 2018

Test build #94921 has finished for PR 22124 at commit 9b16ff0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

But it is inconsistent now.

Can you point out in the codebase where the inconsistency comes from?

@wangyum (Member Author) commented Aug 18, 2018

@gengliangwang (Member)

Hi @wangyum, thanks for working on this.
Can you simplify the reproducing case? E.g. selecting only one column should be enough.
Also, in the PR description, the column names SITE_ID and LEAF_CATEG_ID somehow come from nowhere.

@cloud-fan (Contributor)

@wangyum I know it's from #20020, but do you know which line of code / which method causes it? We must fully understand the bug before fixing it.

def apply(plan: LogicalPlan): LogicalPlan = {
  plan match {
    case c: Command => c
    case _ => removeRedundantAliases(plan, AttributeSet.empty)
  }
}
Contributor

I don't get it. For the query

*(1) Project [col1#8L AS COL1#14L, col2#9L AS COL2#15L]
+- *(1) Filter (isnotnull(col1#8L) && (col1#8L > -20))
   +- *(1) FileScan parquet default.table1[col1#8L,col2#9L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/tmp/yumwang/spark/parquet], PartitionFilters: [], PushedFilters: [IsNotNull(col1), GreaterThan(col1,-20)], ReadSchema: struct<col1:bigint,col2:bigint>

Why is the alias treated as redundant? The name does change, doesn't it?

Member Author

Yes, this is correct. Without this PR, RemoveRedundantAliases works like this:

=== Applying Rule org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-a  InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-acf
 Database: default                                                                                                               Database: default
 Table: table2                                                                                                                   Table: table2
 Owner: yumwang                                                                                                                  Owner: yumwang
 Created Time: Mon Aug 20 03:03:52 PDT 2018                                                                                      Created Time: Mon Aug 20 03:03:52 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                       Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                   Type: MANAGED
 Provider: hive                                                                                                                  Provider: hive
 Table Properties: [transient_lastDdlTime=1534759432]                                                                            Table Properties: [transient_lastDdlTime=1534759432]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-acf0-8b2736665d26/table2   Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-acf0-8b2736665d26/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                      Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                      InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                    OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                    Storage Properties: [serialization.format=1]
 Partition Provider: Catalog                                                                                                     Partition Provider: Catalog
 Schema: root                                                                                                                    Schema: root
-- COL1: long (nullable = true)                                                                                                |-- COL1: long (nullable = true)
-- COL2: long (nullable = true)                                                                                                |-- COL2: long (nullable = true)
!), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@60582d55, [COL1#10L, COL2#11L]                                  ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@60582d55, [col1#8L, col2#9L]
!+- Project [col1#8L AS col1#10L, col2#9L AS col2#11L]                                                                           +- Project [col1#8L, col2#9L]
    +- Filter (col1#8L > -20)                                                                                                       +- Filter (col1#8L > -20)
       +- Relation[col1#8L,col2#9L] parquet                                                                                            +- Relation[col1#8L,col2#9L] parquet

Member Author

For example:

val path = "/tmp/spark/parquet"
val cnt = 30
spark.range(cnt).selectExpr("id as col1").write.mode("overwrite").parquet(path)
spark.sql(s"CREATE TABLE table1(col1 bigint) using parquet location '$path'")
spark.sql("create view view1 as select col1 from table1 where col1 > -20")
// The column name of table2 differs in case from the column name of view1.
spark.sql("create table table2 (COL1 BIGINT) using parquet")
// The insert query selects COL1 so that the query's column name matches the target table's column name.
spark.sql("insert overwrite table table2 select COL1 from view1")

The execution plan transformation trace:

=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
!'Project ['id AS col1#2]                   Project [id#0L AS col1#2L]
 +- Range (0, 30, step=1, splits=Some(1))   +- Range (0, 30, step=1, splits=Some(1))
                
17:02:55.061 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.CleanupAliases ===
 Project [id#0L AS col1#2L]                 Project [id#0L AS col1#2L]
 +- Range (0, 30, step=1, splits=Some(1))   +- Range (0, 30, step=1, splits=Some(1))
                
17:02:59.174 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.execution.datasources.DataSourceAnalysis ===
!'CreateTable `table1`, ErrorIfExists   CreateDataSourceTableCommand `table1`, false
                
17:02:59.909 WARN org.apache.hadoop.hive.metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
17:03:00.094 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations ===
 'Project ['col1]                     'Project ['col1]
 +- 'Filter ('col1 > -20)             +- 'Filter ('col1 > -20)
!   +- 'UnresolvedRelation `table1`      +- 'SubqueryAlias `default`.`table1`
!                                           +- 'UnresolvedCatalogRelation `default`.`table1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                
17:03:00.254 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.execution.datasources.FindDataSourceTable ===
 'Project ['col1]                                                                                                      'Project ['col1]
 +- 'Filter ('col1 > -20)                                                                                              +- 'Filter ('col1 > -20)
!   +- 'SubqueryAlias `default`.`table1`                                                                                  +- SubqueryAlias `default`.`table1`
!      +- 'UnresolvedCatalogRelation `default`.`table1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe         +- Relation[col1#5L] parquet
                
17:03:00.267 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
 'Project ['col1]                         'Project ['col1]
!+- 'Filter ('col1 > -20)                 +- 'Filter (col1#5L > -20)
    +- SubqueryAlias `default`.`table1`      +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet             +- Relation[col1#5L] parquet
                
17:03:00.306 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts ===
 'Project ['col1]                         'Project ['col1]
!+- 'Filter (col1#5L > -20)               +- Filter (col1#5L > cast(-20 as bigint))
    +- SubqueryAlias `default`.`table1`      +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet             +- Relation[col1#5L] parquet
                
17:03:00.309 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
!'Project ['col1]                            Project [col1#5L]
 +- Filter (col1#5L > cast(-20 as bigint))   +- Filter (col1#5L > cast(-20 as bigint))
    +- SubqueryAlias `default`.`table1`         +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet                +- Relation[col1#5L] parquet
                
17:03:00.314 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.ResolveTimeZone ===
 Project [col1#5L]                           Project [col1#5L]
 +- Filter (col1#5L > cast(-20 as bigint))   +- Filter (col1#5L > cast(-20 as bigint))
    +- SubqueryAlias `default`.`table1`         +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet                +- Relation[col1#5L] parquet
                
17:03:00.383 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.execution.datasources.DataSourceAnalysis ===
!'CreateTable `table2`, ErrorIfExists   CreateDataSourceTableCommand `table2`, false
                
17:03:00.729 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations ===
 'Project ['col1]                     'Project ['col1]
 +- 'Filter ('col1 > -20)             +- 'Filter ('col1 > -20)
!   +- 'UnresolvedRelation `table1`      +- 'SubqueryAlias `default`.`table1`
!                                           +- 'UnresolvedCatalogRelation `default`.`table1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
                
17:03:00.730 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.execution.datasources.FindDataSourceTable ===
 'Project ['col1]                                                                                                      'Project ['col1]
 +- 'Filter ('col1 > -20)                                                                                              +- 'Filter ('col1 > -20)
!   +- 'SubqueryAlias `default`.`table1`                                                                                  +- SubqueryAlias `default`.`table1`
!      +- 'UnresolvedCatalogRelation `default`.`table1`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe         +- Relation[col1#5L] parquet
                
17:03:00.731 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
 'Project ['col1]                         'Project ['col1]
!+- 'Filter ('col1 > -20)                 +- 'Filter (col1#5L > -20)
    +- SubqueryAlias `default`.`table1`      +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet             +- Relation[col1#5L] parquet
                
17:03:00.734 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts ===
 'Project ['col1]                         'Project ['col1]
!+- 'Filter (col1#5L > -20)               +- Filter (col1#5L > cast(-20 as bigint))
    +- SubqueryAlias `default`.`table1`      +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet             +- Relation[col1#5L] parquet
                
17:03:00.735 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
!'Project ['col1]                            Project [col1#5L]
 +- Filter (col1#5L > cast(-20 as bigint))   +- Filter (col1#5L > cast(-20 as bigint))
    +- SubqueryAlias `default`.`table1`         +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet                +- Relation[col1#5L] parquet
                
17:03:00.737 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.ResolveTimeZone ===
 Project [col1#5L]                           Project [col1#5L]
 +- Filter (col1#5L > cast(-20 as bigint))   +- Filter (col1#5L > cast(-20 as bigint))
    +- SubqueryAlias `default`.`table1`         +- SubqueryAlias `default`.`table1`
       +- Relation[col1#5L] parquet                +- Relation[col1#5L] parquet
                
17:03:00.742 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations ===
 'InsertIntoTable 'UnresolvedRelation `table2`, true, false   'InsertIntoTable 'UnresolvedRelation `table2`, true, false
 +- 'Project ['COL1]                                          +- 'Project ['COL1]
!   +- 'UnresolvedRelation `view1`                               +- SubqueryAlias `default`.`view1`
!                                                                   +- View (`default`.`view1`, [col1#6L])
!                                                                      +- Project [col1#5L]
!                                                                         +- Filter (col1#5L > cast(-20 as bigint))
!                                                                            +- SubqueryAlias `default`.`table1`
!                                                                               +- Relation[col1#5L] parquet
                
17:03:00.744 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
 'InsertIntoTable 'UnresolvedRelation `table2`, true, false   'InsertIntoTable 'UnresolvedRelation `table2`, true, false
!+- 'Project ['COL1]                                          +- Project [COL1#6L]
    +- SubqueryAlias `default`.`view1`                           +- SubqueryAlias `default`.`view1`
       +- View (`default`.`view1`, [col1#6L])                       +- View (`default`.`view1`, [col1#6L])
          +- Project [col1#5L]                                         +- Project [col1#5L]
             +- Filter (col1#5L > cast(-20 as bigint))                    +- Filter (col1#5L > cast(-20 as bigint))
                +- SubqueryAlias `default`.`table1`                          +- SubqueryAlias `default`.`table1`
                   +- Relation[col1#5L] parquet                                 +- Relation[col1#5L] parquet
                
17:03:00.768 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations ===
!'InsertIntoTable 'UnresolvedRelation `table2`, true, false   'InsertIntoTable 'UnresolvedCatalogRelation `default`.`table2`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, true, false
 +- Project [COL1#6L]                                         +- Project [COL1#6L]
    +- SubqueryAlias `default`.`view1`                           +- SubqueryAlias `default`.`view1`
       +- View (`default`.`view1`, [col1#6L])                       +- View (`default`.`view1`, [col1#6L])
          +- Project [col1#5L]                                         +- Project [col1#5L]
             +- Filter (col1#5L > cast(-20 as bigint))                    +- Filter (col1#5L > cast(-20 as bigint))
                +- SubqueryAlias `default`.`table1`                          +- SubqueryAlias `default`.`table1`
                   +- Relation[col1#5L] parquet                                 +- Relation[col1#5L] parquet
                
17:03:00.852 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.execution.datasources.FindDataSourceTable ===
!'InsertIntoTable 'UnresolvedCatalogRelation `default`.`table2`, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, true, false   'InsertIntoTable Relation[COL1#7L] parquet, true, false
 +- Project [COL1#6L]                                                                                                                       +- Project [COL1#6L]
    +- SubqueryAlias `default`.`view1`                                                                                                         +- SubqueryAlias `default`.`view1`
       +- View (`default`.`view1`, [col1#6L])                                                                                                     +- View (`default`.`view1`, [col1#6L])
          +- Project [col1#5L]                                                                                                                       +- Project [col1#5L]
             +- Filter (col1#5L > cast(-20 as bigint))                                                                                                  +- Filter (col1#5L > cast(-20 as bigint))
                +- SubqueryAlias `default`.`table1`                                                                                                        +- SubqueryAlias `default`.`table1`
                   +- Relation[col1#5L] parquet                                                                                                               +- Relation[col1#5L] parquet
                
DataSourceStrategy 1:COL1#8L
DataSourceStrategy 2:COL1#6L
17:03:00.896 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.execution.datasources.DataSourceAnalysis ===
!'InsertIntoTable Relation[COL1#7L] parquet, true, false   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
!+- Project [COL1#6L]                                      Database: default
!   +- SubqueryAlias `default`.`view1`                     Table: table2
!      +- View (`default`.`view1`, [col1#6L])              Owner: yumwang
!         +- Project [col1#5L]                             Created Time: Mon Aug 20 17:03:00 PDT 2018
!            +- Filter (col1#5L > cast(-20 as bigint))     Last Access: Wed Dec 31 16:00:00 PST 1969
!               +- SubqueryAlias `default`.`table1`        Created By: Spark 2.4.0-SNAPSHOT
!                  +- Relation[col1#5L] parquet            Type: MANAGED
!                                                          Provider: parquet
!                                                          Table Properties: [transient_lastDdlTime=1534809780]
!                                                          Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
!                                                          Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
!                                                          InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
!                                                          OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
!                                                          Storage Properties: [serialization.format=1]
!                                                          Schema: root
!                                                           |-- COL1: long (nullable = true)
!                                                          ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
!                                                          +- Project [COL1#6L]
!                                                             +- SubqueryAlias `default`.`view1`
!                                                                +- View (`default`.`view1`, [col1#6L])
!                                                                   +- Project [col1#5L]
!                                                                      +- Filter (col1#5L > cast(-20 as bigint))
!                                                                         +- SubqueryAlias `default`.`table1`
!                                                                            +- Relation[col1#5L] parquet
                
17:03:00.916 WARN org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.AliasViewChild ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                               |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
 +- Project [COL1#6L]                                                                                                                                                                                                                                                                                                                                           +- Project [COL1#6L]
    +- SubqueryAlias `default`.`view1`                                                                                                                                                                                                                                                                                                                             +- SubqueryAlias `default`.`view1`
       +- View (`default`.`view1`, [col1#6L])                                                                                                                                                                                                                                                                                                                         +- View (`default`.`view1`, [col1#6L])
!         +- Project [col1#5L]                                                                                                                                                                                                                                                                                                                                           +- Project [cast(col1#5L as bigint) AS col1#6L]
!            +- Filter (col1#5L > cast(-20 as bigint))                                                                                                                                                                                                                                                                                                                      +- Project [col1#5L]
!               +- SubqueryAlias `default`.`table1`                                                                                                                                                                                                                                                                                                                            +- Filter (col1#5L > cast(-20 as bigint))
!                  +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                                   +- SubqueryAlias `default`.`table1`
!                                                                                                                                                                                                                                                                                                                                                                                    +- Relation[col1#5L] parquet
                
yumwang123:COL1#6L
17:03:00.949 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                               |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
 +- Project [COL1#6L]                                                                                                                                                                                                                                                                                                                                           +- Project [COL1#6L]
!   +- SubqueryAlias `default`.`view1`                                                                                                                                                                                                                                                                                                                             +- View (`default`.`view1`, [col1#6L])
!      +- View (`default`.`view1`, [col1#6L])                                                                                                                                                                                                                                                                                                                         +- Project [cast(col1#5L as bigint) AS col1#6L]
!         +- Project [cast(col1#5L as bigint) AS col1#6L]                                                                                                                                                                                                                                                                                                                +- Project [col1#5L]
!            +- Project [col1#5L]                                                                                                                                                                                                                                                                                                                                           +- Filter (col1#5L > cast(-20 as bigint))
!               +- Filter (col1#5L > cast(-20 as bigint))                                                                                                                                                                                                                                                                                                                      +- Relation[col1#5L] parquet
!                  +- SubqueryAlias `default`.`table1`                                                                                                                                                                                                                                                                                                          
!                     +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                              
                
17:03:00.959 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.analysis.EliminateView ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
 |-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                                |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
 +- Project [COL1#6L]                                                                                                                                                                                                                                                                                                                                           +- Project [COL1#6L]
!   +- View (`default`.`view1`, [col1#6L])                                                                                                                                                                                                                                                                                                                         +- Project [cast(col1#5L as bigint) AS col1#6L]
!      +- Project [cast(col1#5L as bigint) AS col1#6L]                                                                                                                                                                                                                                                                                                                +- Project [col1#5L]
!         +- Project [col1#5L]                                                                                                                                                                                                                                                                                                                                           +- Filter (col1#5L > cast(-20 as bigint))
!            +- Filter (col1#5L > cast(-20 as bigint))                                                                                                                                                                                                                                                                                                                      +- Relation[col1#5L] parquet
!               +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                    
                
17:03:00.975 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ColumnPruning ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
 |-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                                |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
!+- Project [COL1#6L]                                                                                                                                                                                                                                                                                                                                           +- Project [cast(col1#5L as bigint) AS col1#6L]
!   +- Project [cast(col1#5L as bigint) AS col1#6L]                                                                                                                                                                                                                                                                                                                +- Filter (col1#5L > cast(-20 as bigint))
!      +- Project [col1#5L]                                                                                                                                                                                                                                                                                                                                           +- Relation[col1#5L] parquet
!         +- Filter (col1#5L > cast(-20 as bigint))                                                                                                                                                                                                                                                                                                             
!            +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                       
                
17:03:00.980 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ConstantFolding ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
 |-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                                |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
 +- Project [cast(col1#5L as bigint) AS col1#6L]                                                                                                                                                                                                                                                                                                                +- Project [cast(col1#5L as bigint) AS col1#6L]
!   +- Filter (col1#5L > cast(-20 as bigint))                                                                                                                                                                                                                                                                                                                      +- Filter (col1#5L > -20)
       +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                                   +- Relation[col1#5L] parquet
                
17:03:01.047 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.SimplifyCasts ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
 |-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                                |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
!+- Project [cast(col1#5L as bigint) AS col1#6L]                                                                                                                                                                                                                                                                                                                +- Project [col1#5L AS col1#6L]
    +- Filter (col1#5L > -20)                                                                                                                                                                                                                                                                                                                                      +- Filter (col1#5L > -20)
       +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                                   +- Relation[col1#5L] parquet
                
17:03:01.058 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
 |-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                                |-- COL1: long (nullable = true)
!), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]
!+- Project [col1#5L AS col1#6L]                                                                                                                                                                                                                                                                                                                                +- Project [col1#5L]
    +- Filter (col1#5L > -20)                                                                                                                                                                                                                                                                                                                                      +- Filter (col1#5L > -20)
       +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                                   +- Relation[col1#5L] parquet
                
17:03:01.061 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ColumnPruning ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
 |-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                                |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]
!+- Project [col1#5L]                                                                                                                                                                                                                                                                                                                                           +- Filter (col1#5L > -20)
!   +- Filter (col1#5L > -20)                                                                                                                                                                                                                                                                                                                                      +- Relation[col1#5L] parquet
!      +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                             
                
17:03:01.116 WARN org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$2: 
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                               |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]
!+- Filter (col1#5L > -20)                                                                                                                                                                                                                                                                                                                                      +- Filter (isnotnull(col1#5L) && (col1#5L > -20))
    +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                                   +- Relation[col1#5L] parquet
                
queryExecution:

== Parsed Logical Plan ==
'InsertIntoTable 'UnresolvedRelation `table2`, true, false
+- 'Project ['COL1]
   +- 'UnresolvedRelation `view1`

== Analyzed Logical Plan ==
InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
Database: default
Table: table2
Owner: yumwang
Created Time: Mon Aug 20 17:03:00 PDT 2018
Last Access: Wed Dec 31 16:00:00 PST 1969
Created By: Spark 2.4.0-SNAPSHOT
Type: MANAGED
Provider: parquet
Table Properties: [transient_lastDdlTime=1534809780]
Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Storage Properties: [serialization.format=1]
Schema: root
|-- COL1: long (nullable = true)
), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
+- Project [COL1#6L]
   +- SubqueryAlias `default`.`view1`
      +- View (`default`.`view1`, [col1#6L])
         +- Project [cast(col1#5L as bigint) AS col1#6L]
            +- Project [col1#5L]
               +- Filter (col1#5L > cast(-20 as bigint))
                  +- SubqueryAlias `default`.`table1`
                     +- Relation[col1#5L] parquet

== Optimized Logical Plan ==
InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
Database: default
Table: table2
Owner: yumwang
Created Time: Mon Aug 20 17:03:00 PDT 2018
Last Access: Wed Dec 31 16:00:00 PST 1969
Created By: Spark 2.4.0-SNAPSHOT
Type: MANAGED
Provider: parquet
Table Properties: [transient_lastDdlTime=1534809780]
Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Storage Properties: [serialization.format=1]
Schema: root
|-- COL1: long (nullable = true)
), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]
+- Filter (isnotnull(col1#5L) && (col1#5L > -20))
   +- Relation[col1#5L] parquet

== Physical Plan ==
Execute InsertIntoHadoopFsRelationCommand InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
Database: default
Table: table2
Owner: yumwang
Created Time: Mon Aug 20 17:03:00 PDT 2018
Last Access: Wed Dec 31 16:00:00 PST 1969
Created By: Spark 2.4.0-SNAPSHOT
Type: MANAGED
Provider: parquet
Table Properties: [transient_lastDdlTime=1534809780]
Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Storage Properties: [serialization.format=1]
Schema: root
|-- COL1: long (nullable = true)
), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]
+- *(1) Project [col1#5L]
   +- *(1) Filter (isnotnull(col1#5L) && (col1#5L > -20))
      +- *(1) FileScan parquet default.table1[col1#5L] Batched: true, Format: Parquet, Location: InMemoryFileIndex[file:/tmp/spark/parquet], PartitionFilters: [], PushedFilters: [IsNotNull(col1), GreaterThan(col1,-20)], ReadSchema: struct<col1:bigint>

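The per-rule "=== Applying Rule ... ===" diffs below (and in the excerpt above) look like the plan-change output that Catalyst's RuleExecutor emits at TRACE level. A minimal way to capture them, assuming the default log4j 1.x setup that Spark 2.x ships with:

import org.apache.log4j.{Level, Logger}

// Print the side-by-side before/after plan for every rule that changes the plan.
// This only adjusts logging; it does not change planning behavior.
Logger.getLogger("org.apache.spark.sql.catalyst.rules.RuleExecutor").setLevel(Level.TRACE)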
The three main changes are:

=== Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences ===
 'InsertIntoTable 'UnresolvedRelation `table2`, true, false   'InsertIntoTable 'UnresolvedRelation `table2`, true, false
!+- 'Project ['COL1]                                          +- Project [COL1#6L]
    +- SubqueryAlias `default`.`view1`                           +- SubqueryAlias `default`.`view1`
       +- View (`default`.`view1`, [col1#6L])                       +- View (`default`.`view1`, [col1#6L])
          +- Project [col1#5L]                                         +- Project [col1#5L]
             +- Filter (col1#5L > cast(-20 as bigint))                    +- Filter (col1#5L > cast(-20 as bigint))
                +- SubqueryAlias `default`.`table1`                          +- SubqueryAlias `default`.`table1`
                   +- Relation[col1#5L] parquet                                 +- Relation[col1#5L] parquet
=== Applying Rule org.apache.spark.sql.catalyst.analysis.AliasViewChild ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                               |-- COL1: long (nullable = true)
 ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]
 +- Project [COL1#6L]                                                                                                                                                                                                                                                                                                                                           +- Project [COL1#6L]
    +- SubqueryAlias `default`.`view1`                                                                                                                                                                                                                                                                                                                             +- SubqueryAlias `default`.`view1`
       +- View (`default`.`view1`, [col1#6L])                                                                                                                                                                                                                                                                                                                         +- View (`default`.`view1`, [col1#6L])
!         +- Project [col1#5L]                                                                                                                                                                                                                                                                                                                                           +- Project [cast(col1#5L as bigint) AS col1#6L]
!            +- Filter (col1#5L > cast(-20 as bigint))                                                                                                                                                                                                                                                                                                                      +- Project [col1#5L]
!               +- SubqueryAlias `default`.`table1`                                                                                                                                                                                                                                                                                                                            +- Filter (col1#5L > cast(-20 as bigint))
!                  +- Relation[col1#5L] parquet                                                                                                                                                                                                                                                                                                                                   +- SubqueryAlias `default`.`table1`
!                                                                                                                                                                                                                                                                                                                                                                                    +- Relation[col1#5L] parquet
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases ===
 InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(   InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2, false, Parquet, Map(serialization.format -> 1, path -> file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2), Overwrite, CatalogTable(
 Database: default                                                                                                                                                                                                                                                                                                                                              Database: default
 Table: table2                                                                                                                                                                                                                                                                                                                                                  Table: table2
 Owner: yumwang                                                                                                                                                                                                                                                                                                                                                 Owner: yumwang
 Created Time: Mon Aug 20 17:03:00 PDT 2018                                                                                                                                                                                                                                                                                                                     Created Time: Mon Aug 20 17:03:00 PDT 2018
 Last Access: Wed Dec 31 16:00:00 PST 1969                                                                                                                                                                                                                                                                                                                      Last Access: Wed Dec 31 16:00:00 PST 1969
 Created By: Spark 2.4.0-SNAPSHOT                                                                                                                                                                                                                                                                                                                               Created By: Spark 2.4.0-SNAPSHOT
 Type: MANAGED                                                                                                                                                                                                                                                                                                                                                  Type: MANAGED
 Provider: parquet                                                                                                                                                                                                                                                                                                                                              Provider: parquet
 Table Properties: [transient_lastDdlTime=1534809780]                                                                                                                                                                                                                                                                                                           Table Properties: [transient_lastDdlTime=1534809780]
 Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2                                                                                                                                                                                                                                  Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-04d554d2-7ddb-4e13-b065-164afe065972/table2
 Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe                                                                                                                                                                                                                                                                                     Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
 InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat                                                                                                                                                                                                                                                                                     InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
 OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat                                                                                                                                                                                                                                                                                   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
 Storage Properties: [serialization.format=1]                                                                                                                                                                                                                                                                                                                   Storage Properties: [serialization.format=1]
 Schema: root                                                                                                                                                                                                                                                                                                                                                   Schema: root
-- COL1: long (nullable = true)                                                                                                                                                                                                                                                                                                                               |-- COL1: long (nullable = true)
!), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [COL1#6L]                                                                                                                                                                                                                                                                            ), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@c613921e, [col1#5L]
!+- Project [col1#5L AS col1#6L]                                                                                                                                                                                                                                                                                                                                +- Project [col1#5L]
    +- Filter (col1#5L > -20)                                                                                                                                                                                                                                                                                                                                      +- Filter (col1#5L > -20)
       +- Relation[col1#5L] parquet

We need COL1#6L, but after some optimization, the outputColumns changed to col1#5L.
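To make the mechanism concrete, here is a toy model in plain Scala (not Catalyst code; the names and expression ids are simplified for illustration) of how removing the redundant col1#5L AS col1#6L alias remaps the command's recorded output columns:

// Simplified stand-in for an attribute reference such as col1#5L.
case class Attr(name: String, exprId: Int)

// After the view's cast is simplified, the alias col1#5L AS col1#6L keeps the
// attribute's name, so a RemoveRedundantAliases-style step drops it and
// rewrites every reference to exprId 6 to point at attribute #5 instead.
val underlying = Attr("col1", 5)
val remap: Map[Int, Attr] = Map(6 -> underlying)

// The insert command recorded its output columns as [COL1#6L] (the target
// table's casing). Remapping by exprId turns that into col1#5L, so the files
// are written with the lower-case name "col1" while table2's schema says "COL1".
val recordedOutput = Seq(Attr("COL1", 6))
val afterOptimizer = recordedOutput.map(a => remap.getOrElse(a.exprId, a))
// afterOptimizer == List(Attr("col1", 5))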

@SparkQA

SparkQA commented Aug 20, 2018

Test build #94948 has finished for PR 22124 at commit c5a015c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 20, 2018

Test build #94955 has finished for PR 22124 at commit 419a874.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

@wangyum please don't rush into code changes; it's more efficient to come up with a good solution before doing any coding work.

We need COL1#6L, but after some optimization ...

This is the key. We must find out which optimizer rule caused it and how.
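One way to confirm a suspected optimizer rule without changing any code, assuming this build already has Spark 2.4's spark.sql.optimizer.excludedRules and that the suspected rule is excludable, is to disable it and re-run the reproduction:

// Sketch only: exclude RemoveRedundantAliases, then re-run the insert/select
// reproduction and check whether table2 still comes back as all nulls.
spark.conf.set(
  "spark.sql.optimizer.excludedRules",
  "org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases")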

@SparkQA

SparkQA commented Aug 21, 2018

Test build #95006 has finished for PR 22124 at commit 72bde20.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Aug 30, 2018

Close it. I have created a new PR.
