Validation refactoring + exception redesign #374
Merged
This is a refactoring of the validation/conversion machinery that aims to
It introduces some new features made possible by the redesign and a few bugfixes as well.
Since the diff is pretty big, I'm going to explain most changes below.
List of changes
Combines the validation and conversion code paths. Recursive validation of nested items and submodels is now provided by `MultiType.convert()` (formerly `to_native()`), obviating the need for parallel, validation-specific logic.
`validate_items()` methods on `ListType` and `DictType` are removed.
`validate_model()` method on `ModelType` is removed.
Enables simultaneous conversion and validation during data import. The statement `FooModel(<raw data>, validate=True)` now runs conversion and validation in one go, collecting both types of errors. If the data is likely to clear at least the conversion step, this is more efficient than sequential processing, which requires traversing the entire dataset twice.
Here are some timing results from a simple import+validation testcase:
Improves the exception design by implementing a bit more structure (see Redesign the conversion/validation exceptions #369).
Fixes the conversion of `ModelType` fields when the input is already a model instance. Previously, the instance would be returned unchanged even if it contained unconverted fields or submodels.
If the process originates from `Model.__init__()`, a new instance is created for each submodel, and the input instance is simply used as the raw data, as would happen with a `dict` input. Otherwise, such as during validation, the existing instance is updated in place to preserve object identity.
Makes the valid part of a failing dataset available to application code as `DataError.partial_data`.
This information used to exist as `ModelConversionError.partial_data` but was only accessible to functions calling `import_loop` directly. It also wasn't aggregated to include submodels. Both of these issues are fixed, so the entire tree of valid data is now included with the final error raised from `Model.validate()` or similar.
Fixes a bug where the first failing field-level validator would prevent the rest of a compound field's validators from running.
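To illustrate the last fix, here is a minimal sketch of the "run every validator and collect all failures" behavior. It is a toy model only: `ValidationError`, `run_validators`, `is_positive` and `is_even` are hypothetical stand-ins written for this example, not the library's actual classes or code.

```python
class ValidationError(Exception):
    """Toy stand-in for a field-level validation error (not the library's class)."""


def run_validators(value, validators):
    """Run every validator on value and collect all failure messages.

    Hypothetical helper: a failing validator is recorded instead of
    short-circuiting the validators that come after it.
    """
    errors = []
    for validator in validators:
        try:
            validator(value)
        except ValidationError as exc:
            errors.append(str(exc))
    return errors


def is_positive(value):
    if value <= 0:
        raise ValidationError("must be positive")


def is_even(value):
    if value % 2:
        raise ValidationError("must be even")


# Both validators fail for -3, and both failures are reported.
print(run_validators(-3, [is_positive, is_even]))
```

With the old short-circuiting behavior, only the first message would have been reported for this input.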
Fixes a bug(?) that caused model-level validation to be skipped entirely if any errors occurred during field-level validation.
From now on, only those fields that failed the initial validation are skipped, while others will have their model-level validators run normally.
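The new rule can be sketched roughly as follows. This is a simplified illustration under assumed names (`validate_model`, the validator dictionaries, and the example fields are all hypothetical), not the code in this PR: validators here return an error string or `None` instead of raising.

```python
def validate_model(data, field_validators, model_validators):
    """Toy two-stage validation (hypothetical, not the library's code).

    field_validators: field name -> function(value) returning an error string or None
    model_validators: field name -> function(data, value) returning an error string or None
    """
    errors = {}
    # Stage 1: field-level validation.
    for name, check in field_validators.items():
        if name in data:
            message = check(data[name])
            if message:
                errors[name] = message
    # Stage 2: model-level validation, skipping only the fields that already failed.
    for name, check in model_validators.items():
        if name in errors or name not in data:
            continue
        message = check(data, data[name])
        if message:
            errors[name] = message
    return errors


def not_empty(value):
    return None if value else "must not be empty"


def high_not_below_low(data, value):
    low = data.get("low")
    if low is not None and value < low:
        return "must not be less than low"
    return None


# 'name' fails at field level, yet the model-level check on 'high' still runs.
result = validate_model(
    {"name": "", "low": 5, "high": 3},
    field_validators={"name": not_empty},
    model_validators={"high": high_not_below_low},
)
print(result)
```

In this sketch the failure on `name` no longer hides the model-level problem with `high`; both end up in the same error dictionary.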
Removes `MultiType.validate()`. The `BaseType` implementation now works universally.
Removes an obsolete validator `_check_for_unknown_fields()` from the `validate` module.
Fixes a number of tests that were broken due to unreachable assertions inside `with pytest.raises()` statements.
Changes `Context` back to a custom class from `namedtuple`. Turns out strict immutability is highly impractical because sometimes you have to establish a partial or empty context outside of `import_loop` to retain a reference to it. The new `Context` class implements a `__setattr__()` that allows setting each attribute just once, which provides the required degree of immutability.
The instant conversion of `ModelType` fields when populated through assignment (as in `model_instance.submodel = {'x': 1}`) is now accomplished through a hook that types themselves may implement. This gets rid of the `isinstance` stuff inside `FieldDescriptor.__set__()` and eliminates the bidirectional module dependency.
Compatibility
This is mostly just a reorganization that should not cause any huge trouble in a typical setting.
Potential issues are likely to be related to the redesigned exceptions:
The switch from `list` to `dict` as the representation of nested `ListType` errors (Return index of invalid objects #307) is clearly a backward-incompatible change, but it's done for a good reason.
Since `ValidationError` is a field-specific construct, aggregate error classes are no longer its subclasses. This will not work:
Using `ModelValidationError` here will continue to work, as it's declared a synonym of `DataError`, as is `ModelConversionError`.
If a derivative of `ListType` or `DictType` has overridden `validate_items()`, it would now cause validation to occur twice, since the original types no longer need this method. The same applies to `ModelType` and `validate_model()`.
If a `Model` subclass has overridden the `validate()` method in the anticipation that it will be called for every submodel throughout a data tree, it will not work as expected. `Model.validate()` only initiates the validation process and is no longer invoked for submodels.
Previously, the `data` dictionary passed to model-level validators would be guaranteed to contain entries for all fields because `Undefined` objects would substitute for missing values (since Add support for undefined values #372).
As both of these issues have now been remedied, validators wanting to look at other fields on the model will have to take care not to blindly access keys that might not exist.
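As a rough sketch of what defensive access could look like, the function below checks one field against another without assuming the other key is present. It is written as a plain function with hypothetical field names (`start`, `end`) and returns an error string rather than raising, so it does not reflect the library's actual validator signature.

```python
def validate_end(data, value):
    """Hypothetical model-level check: 'end' must not precede 'start'."""
    # 'start' may be absent from data, so use .get() instead of data["start"].
    start = data.get("start")
    if start is None:
        # Nothing to compare against; leave the verdict to other checks.
        return None
    if value < start:
        return "end must not precede start"
    return None


print(validate_end({"end": 5}, 5))               # no 'start' key, no KeyError
print(validate_end({"start": 10, "end": 5}, 5))  # comparison is possible and fails
```

Indexing with `data["start"]` in the first call would raise `KeyError`, which is the failure mode validators now need to guard against.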