Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spark] Support dropping the CHECK constraints table feature #2987

Merged
merged 4 commits into from
Jun 13, 2024

Conversation

tomvanbussel
Copy link
Contributor

@tomvanbussel tomvanbussel commented Apr 27, 2024

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR adds support for dropping the checkConstraints table feature using the ALTER TABLE ... DROP FEATURE command. It throws an error if the table still contains CHECK constraints (as dropping a feature should never change the logical state of a table).

How was this patch tested?

Added a test to CheckConstraintsSuite

Does this PR introduce any user-facing changes?

Yes, ALTER TABLE ... DROP FEATURE checkConstraints is now supported.

"DELTA_CANNOT_DROP_CHECK_CONSTRAINT_FEATURE" : {
"message" : [
"Cannot drop the CHECK constraints table feature.",
"The following constraints must be dropped first: <constraints>."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that this a only a writer feature which means that can be safely ignored by readers that do not understand it. In that sense, dropping the feature should be allowed even if there are constraints in the table. They just need to be ignored. However, we do no need add a guard (if it is not already there) to only check constraints when the feature is enabled/supported. That is for writers that understand the feature.

On the other hand, if we do want to force this behavior it means the feature cannot be removed until the history is truncated. History validation and check is only done at the moment for all reader+writer features but there are plans to allow it for selected writer features.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add to that, it is good to drop constraints before dropping the feature but we should not check the history if there are constraints, i.e. actionUsesFeature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requiring users to drop all constraints first before allowing the feature to be dropped is intentional. Otherwise, some users might not realize that dropping the checkConstraints feature will drop all check constraints on their table, or they may not realize that there are check constraints defined on their table. In general it's a good principle that dropping (or adding) a feature should never affect the logical state of a table, only the physical state of a table. For example: Dropping the DV feature does not undelete rows, and dropping the CM feature does not revert name changes of columns.

We do not require all constraints to be removed from the history (see the added test), only that the current state of the table does not contain any check constraints. I have implemented actionUsesFeature, but AFAIK it is never called right now, as historyContainsFeature is only called for reader features (and this is a writer feature).

Copy link
Contributor

@andreaschat-db andreaschat-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of comments.

@@ -49,6 +49,13 @@ object Constraints {
/** A SQL expression to check for when writing out data. */
case class Check(name: String, expression: Expression) extends Constraint

def getCheckConstraintNames(metadata: Metadata): Seq[String] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Brackets can be omitted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep the brackets here, as this is a multi-line method. This helps avoid unhappy surprises in the future when modifying the code. The rest of this class also use brackets for multi-line methods (even when not needed).

override def removeFeatureTracesIfNeeded(): Boolean = {
val checkConstraintNames = Constraints.getCheckConstraintNames(table.initialSnapshot.metadata)
if (checkConstraintNames.isEmpty) return false
throw DeltaErrors.cannotDropCheckConstraintFeature(checkConstraintNames)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope in the future we have a nice way of showing a warning message before the execution of commands like this. So the user can understand about the consequences of the command before proceeding. Now we are forced to make him drop the constraints one by one. Not the best CUJ.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that dropping the constraints one-by-one is not great. On the other hand, I do like asking the user to explicitly drop these constraints themselves. A warning can be easily missed/skipped. Perhaps we can add a ALTER TABLE ... DROP CONSTRAINTS syntax in the future that allows dropping multiple constraints at once?

Copy link
Contributor

@andreaschat-db andreaschat-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@allisonport-db allisonport-db merged commit 5a6f382 into delta-io:master Jun 13, 2024
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants