
[SPARK-18482][SQL] make sure Spark can access the table metadata created by older version of spark #16003

Closed · wants to merge 4 commits

Conversation

cloud-fan
Contributor

What changes were proposed in this pull request?

In Spark 2.1, we did a lot of refactoring of `HiveExternalCatalog` and related code paths. This refactoring may introduce external behavior changes and break backward compatibility, e.g. http://issues.apache.org/jira/browse/SPARK-18464

To avoid future compatibility problems in `HiveExternalCatalog`, this PR dumps some typical table metadata from tables created by 2.0, and tests whether it can be recognized by the current version of Spark.
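The testing idea can be sketched in a few lines. This is an illustrative, self-contained sketch only, not the actual suite: the table and reader names are made up, and the reader logic is a simplification of what `HiveExternalCatalog` does. The one real detail is the `spark.sql.sources.provider` property key, which data source tables store in the metastore; a table without it is treated as a Hive table.

```scala
// Sketch only (assumed names; not the actual suite). The idea: freeze raw
// metadata exactly as an older release wrote it, then assert that the
// current reader still accepts it.
object CompatSketch extends App {
  // A stand-in for a metastore table: just its raw property map.
  case class RawTable(properties: Map[String, String])

  // Simplified reader: current Spark must tolerate old metadata, e.g. treat
  // a table without the data source provider property as a Hive table.
  def provider(t: RawTable): String =
    t.properties.getOrElse("spark.sql.sources.provider", "hive")

  // Metadata "dumped" from tables created by an older release.
  val oldDataSourceTable = RawTable(Map("spark.sql.sources.provider" -> "parquet"))
  val oldHiveTable       = RawTable(Map.empty)

  assert(provider(oldDataSourceTable) == "parquet")
  assert(provider(oldHiveTable) == "hive")
}
```

Because the raw metadata is pinned down as literal values rather than regenerated by the current writer, a later refactoring that changes what the reader accepts will fail the test even if the writer changed in lockstep.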

How was this patch tested?

Test-only change.

@@ -1370,47 +1370,4 @@ class MetastoreDataSourcesSuite extends QueryTest with SQLTestUtils with TestHiv
sparkSession.sparkContext.conf.set(DEBUG_MODE, previousValue)
}
}

test("SPARK-17470: support old table that stores table location in storage properties") {
Contributor Author


This is a useless test. It was added by #15024, which tried to change the way we store table locations in the metastore. After code review, #15024 reverted that change but forgot to remove this test.

}
}

test("SPARK-18464: support old table which doesn't store schema in table properties") {
Contributor Author


It's covered by the new test suite.
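For context on why such old tables need special handling: data source tables created by newer releases store the table schema as a JSON string split across several table properties (to stay under metastore value-length limits), while very old tables lack these keys entirely, so the reader must fall back to the schema recorded in the metastore itself. Below is a rough, self-contained sketch of that property layout; the property key names are the real ones, but the code is a simplification, not `HiveExternalCatalog`'s actual logic.

```scala
// Simplified sketch of how the schema JSON is split across table properties
// and reassembled. Returning None from read models an old table (SPARK-18464)
// with no schema in properties at all.
object SchemaPropsSketch extends App {
  val NumPartsKey = "spark.sql.sources.schema.numParts"
  private def partKey(i: Int) = s"spark.sql.sources.schema.part.$i"

  // Split a schema JSON string into fixed-size chunks, one property each.
  def write(schemaJson: String, maxLen: Int): Map[String, String] = {
    val parts = schemaJson.grouped(maxLen).toVector
    parts.zipWithIndex.map { case (p, i) => partKey(i) -> p }.toMap +
      (NumPartsKey -> parts.length.toString)
  }

  // Reassemble the parts; None means the caller must fall back to the
  // schema stored in the metastore itself.
  def read(props: Map[String, String]): Option[String] =
    props.get(NumPartsKey).map { n =>
      (0 until n.toInt).map(i => props(partKey(i))).mkString
    }

  val json = """{"type":"struct","fields":[]}"""
  assert(read(write(json, maxLen = 8)).contains(json))
  assert(read(Map.empty).isEmpty)
}
```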

@cloud-fan
Contributor Author

cc @yhuai @gatorsmile

@SparkQA

SparkQA commented Nov 24, 2016

Test build #69134 has finished for PR 16003 at commit 8f8c5da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class HiveExternalCatalogCompatibilitySuite extends QueryTest with TestHiveSingleton

import org.apache.spark.util.Utils


class HiveExternalCatalogCompatibilitySuite extends QueryTest with TestHiveSingleton {
Contributor


To make the name super long...

HiveExternalCatalogBackwardCompatibilitySuite

@SparkQA

SparkQA commented Nov 25, 2016

Test build #69145 has finished for PR 16003 at commit 6f785e3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class HiveExternalCatalogBackwardCompatibilitySuite extends QueryTest with TestHiveSingleton

@rxin
Contributor

rxin commented Nov 25, 2016

The main thing I'd add is a comment explaining what version of Spark would generate those table props.

@gatorsmile
Member

How about Spark 2.1 altering the table metadata created by Spark 2.0?

@SparkQA

SparkQA commented Nov 27, 2016

Test build #69200 has started for PR 16003 at commit a42b8b9.

@SparkQA

SparkQA commented Nov 27, 2016

Test build #3440 has finished for PR 16003 at commit a42b8b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.



// Raw table metadata dumped from tables created by Spark 2.0. Note that all Spark
// versions prior to 2.1 would generate the same raw table metadata for a given table.
Member


I briefly checked 1.6. Most of them are the same, but some changes are only available in 2.0. For example, `locationUri = Some(defaultTablePath("tbl7") + "-__PLACEHOLDER__")` was added in #13270.

@gatorsmile
Member

LGTM except a minor comment. We can address it in a separate PR for checking 1.6.

@SparkQA

SparkQA commented Nov 28, 2016

Test build #69217 has finished for PR 16003 at commit 117f532.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Nov 28, 2016

Merging in master/branch-2.1.

@asfgit asfgit closed this in fc2c13b Nov 28, 2016
asfgit pushed a commit that referenced this pull request Nov 28, 2016
…ted by older version of spark


Author: Wenchen Fan <wenchen@databricks.com>

Closes #16003 from cloud-fan/test.

(cherry picked from commit fc2c13b)
Signed-off-by: Reynold Xin <rxin@databricks.com>
robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 2, 2016
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017