Sealing a table or the whole schema means that:
data that cannot be coerced into existing tables and columns will be dropped (filtered out)
provide an option to load the "bad data" into separate tables (i.e. as a JSON blob)
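To make the two points above concrete, here is a minimal sketch in plain Python. It is not dlt's API: `SealedColumns` and `split_rows` are hypothetical names, and the choice to reject a whole row (rather than only the offending value) is an assumption, since the issue does not specify the granularity.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical sealed table schema: column name -> coercion function.
# Illustration only; this is not dlt's schema format.
SealedColumns = Dict[str, Callable[[Any], Any]]


def split_rows(
    rows: List[Dict[str, Any]], columns: SealedColumns
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Return (rows that fit the sealed schema, rejected rows as JSON blobs)."""
    good: List[Dict[str, Any]] = []
    bad: List[str] = []
    for row in rows:
        try:
            # An unknown column would require schema evolution, so reject the row.
            unknown = set(row) - set(columns)
            if unknown:
                raise ValueError(f"unknown columns: {sorted(unknown)}")
            good.append({name: columns[name](value) for name, value in row.items()})
        except (TypeError, ValueError):
            # Keep the complete original row so it can go to a "bad data" table.
            bad.append(json.dumps(row, sort_keys=True))
    return good, bad


sealed = {"id": int, "name": str}
rows = [
    {"id": "1", "name": "alice"},           # coercible: "1" becomes 1
    {"id": "not-a-number", "name": "bob"},  # cannot be coerced to int
    {"id": 3, "name": "carol", "age": 30},  # would add a new column
]
good, bad = split_rows(rows, sealed)
```

Here `good` holds only the first row, and `bad` holds the other two as JSON strings that could be written to a separate table.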
The difficulties
To seal the schema, it must be known, but we typically let dlt infer the schema. Are there any requirements that would make sealing easier, or do we just require modifying the schema at runtime?
Minimum requirements from Adrian:
At the very least, we should be able to toggle "contract_mode=On" (let's use something relatable rather than seal?)
In this mode,
If nothing has been loaded yet, the schema can be created and can evolve
If something has already been loaded, the schema may not evolve
This mode can be toggled on/off to allow temporary evolution
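The three rules for this mode can be sketched as a small state check. This is only an illustration under assumptions: `ContractSchema`, `SchemaFrozenError`, `has_loaded_data` and `add_column` are hypothetical names, not part of dlt.

```python
from dataclasses import dataclass, field
from typing import Dict


class SchemaFrozenError(Exception):
    """Raised when a change is attempted while the contract forbids evolution."""


@dataclass
class ContractSchema:
    # Hypothetical model of a schema with a contract toggle; not dlt's Schema class.
    contract_mode: bool = False
    has_loaded_data: bool = False
    columns: Dict[str, str] = field(default_factory=dict)  # name -> data type

    def can_evolve(self) -> bool:
        # Evolution is blocked only when the contract is on AND data was loaded.
        return not (self.contract_mode and self.has_loaded_data)

    def add_column(self, name: str, data_type: str) -> None:
        if self.columns.get(name) == data_type:
            return  # identical definition, nothing changes
        if not self.can_evolve():
            raise SchemaFrozenError(f"cannot add or change column {name!r}")
        self.columns[name] = data_type


schema = ContractSchema(contract_mode=True)
schema.add_column("id", "bigint")      # nothing loaded yet: schema may be created
schema.has_loaded_data = True          # first load happened

blocked = False
try:
    schema.add_column("email", "text")  # contract on and data loaded: rejected
except SchemaFrozenError:
    blocked = True

schema.contract_mode = False           # toggle off for temporary evolution
schema.add_column("email", "text")
schema.contract_mode = True            # toggle back on
```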
What should happen when schema is not allowed to evolve?
Any operation that would cause additions to the original schema should fail
The data should simply not be loaded
Any operation that would change hints should fail. This includes key, performance, and nullable hints, so basically all changes
This does not relate to dlt's normaliser: the normaliser is still expected to type and normalise the data. The restriction applies to the schema only.
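One way to picture the failure rules above is a comparison between the stored schema and the schema a new load would produce. The sketch below is an assumption-laden illustration, not dlt code: `find_violations`, the nested-dict schema shape, and the hint names used in the example are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical schema shape: table -> column -> hints.
TableSchema = Dict[str, Dict[str, Dict[str, Any]]]


def find_violations(stored: TableSchema, proposed: TableSchema) -> List[str]:
    """List every addition or hint change the proposed schema would introduce."""
    violations: List[str] = []
    for table, columns in proposed.items():
        if table not in stored:
            violations.append(f"new table: {table}")
            continue
        for column, hints in columns.items():
            if column not in stored[table]:
                violations.append(f"new column: {table}.{column}")
            elif hints != stored[table][column]:
                # Any difference counts: keys, performance hints, nullability.
                violations.append(f"changed hints: {table}.{column}")
    return violations


stored = {
    "users": {"id": {"data_type": "bigint", "nullable": False, "primary_key": True}}
}
proposed = {
    "users": {
        "id": {"data_type": "bigint", "nullable": True, "primary_key": True},
        "email": {"data_type": "text", "nullable": True},
    },
    "orders": {"id": {"data_type": "bigint", "nullable": False}},
}
violations = find_violations(stored, proposed)
```

A load would then be refused whenever `violations` is non-empty, and the list itself could be surfaced to the user.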
Open questions by @sh-rp:
How does this integrate with providing a schema.yaml?
Should we enable sealing / freezing individual table chains with this PR? If so, how should we do it? Via the resource decorator and if so, does this get saved into the stored schema?
If we load "bad data" into an additional destination, should we store the complete data there?
Should the trace indicate whether the schema is sealed, and should we maybe add schema change output info to the trace? This would be very nice for the user playing with dlt to see what is going on under the hood imho.