-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Spark] Support dropping the CHECK constraints table feature #2987
Changes from 3 commits
01dd813
b265699
4c289e0
223571d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,6 +24,7 @@ import org.apache.spark.sql.delta.catalog.DeltaTableV2 | |
import org.apache.spark.sql.delta.commands.{AlterTableSetPropertiesDeltaCommand, AlterTableUnsetPropertiesDeltaCommand, DeltaReorgTableCommand, DeltaReorgTableMode, DeltaReorgTableSpec} | ||
import org.apache.spark.sql.delta.commands.columnmapping.RemoveColumnMappingCommand | ||
import org.apache.spark.sql.delta.commands.optimize.OptimizeMetrics | ||
import org.apache.spark.sql.delta.constraints.Constraints | ||
import org.apache.spark.sql.delta.managedcommit.ManagedCommitUtils | ||
import org.apache.spark.sql.delta.metering.DeltaLogging | ||
import org.apache.spark.sql.delta.util.{Utils => DeltaUtils} | ||
|
@@ -378,3 +379,12 @@ case class ColumnMappingPreDowngradeCommand(table: DeltaTableV2) | |
true | ||
} | ||
} | ||
|
||
case class CheckConstraintsPreDowngradeTableFeatureCommand(table: DeltaTableV2) | ||
extends PreDowngradeTableFeatureCommand { | ||
override def removeFeatureTracesIfNeeded(): Boolean = { | ||
tomvanbussel marked this conversation as resolved.
Show resolved
Hide resolved
|
||
val checkConstraintNames = Constraints.getCheckConstraintNames(table.initialSnapshot.metadata) | ||
if (checkConstraintNames.isEmpty) return false | ||
throw DeltaErrors.cannotDropCheckConstraintFeature(checkConstraintNames) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I hope in the future we have a nice way of showing a warning message before the execution of commands like this. So the user can understand about the consequences of the command before proceeding. Now we are forced to make him drop the constraints one by one. Not the best CUJ. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed that dropping the constraints one-by-one is not great. On the other hand, I do like asking the user to explicitly drop these constraints themselves. A warning can be easily missed/skipped. Perhaps we can add a |
||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -49,6 +49,13 @@ object Constraints { | |
/** A SQL expression to check for when writing out data. */ | ||
case class Check(name: String, expression: Expression) extends Constraint | ||
|
||
def getCheckConstraintNames(metadata: Metadata): Seq[String] = { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: Brackets can be omitted. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I prefer to keep the brackets here, as this is a multi-line method. This helps avoid unhappy surprises in the future when modifying the code. The rest of this class also use brackets for multi-line methods (even when not needed). |
||
metadata.configuration.keys.collect { | ||
case key if key.toLowerCase(Locale.ROOT).startsWith("delta.constraints.") => | ||
key.stripPrefix("delta.constraints.") | ||
}.toSeq | ||
} | ||
|
||
/** | ||
* Extract CHECK constraints from the table properties. Note that some CHECK constraints may also | ||
* come from schema metadata; these constraints were never released in a public API but are | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that this a only a writer feature which means that can be safely ignored by readers that do not understand it. In that sense, dropping the feature should be allowed even if there are constraints in the table. They just need to be ignored. However, we do no need add a guard (if it is not already there) to only check constraints when the feature is enabled/supported. That is for writers that understand the feature.
On the other hand, if we do want to force this behavior it means the feature cannot be removed until the history is truncated. History validation and check is only done at the moment for all reader+writer features but there are plans to allow it for selected writer features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add to that, it is good to drop constraints before dropping the feature but we should not check the history if there are constraints, i.e.
actionUsesFeature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requiring users to drop all constraints first before allowing the feature to be dropped is intentional. Otherwise, some users might not realize that dropping the
checkConstraints
feature will drop all check constraints on their table, or they may not realize that there are check constraints defined on their table. In general it's a good principle that dropping (or adding) a feature should never affect the logical state of a table, only the physical state of a table. For example: Dropping the DV feature does not undelete rows, and dropping the CM feature does not revert name changes of columns.We do not require all constraints to be removed from the history (see the added test), only that the current state of the table does not contain any check constraints. I have implemented
actionUsesFeature
, but AFAIK it is never called right now, ashistoryContainsFeature
is only called for reader features (and this is a writer feature).