@abmo-x
Contributor

abmo-x commented Nov 28, 2022

Update schema currently does not check whether a field is optional or required, and adds all new fields as optional. This leads to incorrect schema evolution, where a required field gets added to a table schema as optional.

With this change, a required field can only be added if 'allowIncompatibleChanges' is enabled.

@github-actions github-actions bot added the core label Nov 28, 2022
@rdblue
Contributor

rdblue commented Nov 28, 2022

@abmo-x, the behavior you're describing doesn't sound correct. If you union two schemas together, any field not in both schemas should be optional. The only case where a field in a union result would not be optional is if both schemas have it as required, right?
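The rule stated in this comment can be sketched in a few lines. This is only an illustrative model, not Iceberg's implementation: the class `UnionRuleSketch`, the method `unionByName`, and the representation of a schema as a map from field name to a "required" flag are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: a schema is a map from field name to "is required".
// This is NOT Iceberg's API; it only illustrates the rule described above.
final class UnionRuleSketch {
    static Map<String, Boolean> unionByName(Map<String, Boolean> a, Map<String, Boolean> b) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> e : a.entrySet()) {
            String name = e.getKey();
            // Required only when both schemas contain the field and both mark it required.
            boolean required = e.getValue() && b.containsKey(name) && b.get(name);
            result.put(name, required);
        }
        for (String name : b.keySet()) {
            // Fields present only in b are missing from a, so they become optional.
            result.putIfAbsent(name, false);
        }
        return result;
    }
}
```

Under this model, a field that is required in v2 but absent from v1 always comes out optional, which is the case the pull request description is concerned with.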

@dramaticlly
Contributor

dramaticlly commented Nov 28, 2022

The only case where a field in a union result would not be optional is if both schemas have it as required, right?

If that's the case (I think I kind of agree), it looks like a new field can never be added as required through a later update-schema call, given that the prior schema will always be missing the new field.

@abmo-x
Contributor Author

abmo-x commented Nov 29, 2022

@abmo-x, the behavior you're describing doesn't sound correct. If you union two schemas together, any field not in both schemas should be optional. The only case where a field in a union result would not be optional is if both schemas have it as required, right?

I agree that any field not in both should be optional. What's your suggestion when a user makes an incompatible change to their Avro schema v2 by adding a required field and uses table.updateSchema().unionByNameWith(v2) to update the table schema? In this case the field gets added as optional to the table, whereas the Avro schema has it as required.

  • Should the update fail instead of adding the new required field as optional?
  • Should the user avoid unionByNameWith and instead diff the v1 and v2 schemas and call table.updateSchema().allowIncompatibleChanges().addRequiredColumn directly?
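As a rough sketch of the diffing step in the second option, the following finds the fields that are required in v2 and absent from v1. The class `SchemaDiffSketch`, the method `newRequiredFields`, and the map-based schema representation are hypothetical and stand in for real schema objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model: a schema is a map from field name to "is required".
final class SchemaDiffSketch {
    // Returns the names of fields that v2 adds as required, in v2's iteration order.
    static List<String> newRequiredFields(Map<String, Boolean> v1, Map<String, Boolean> v2) {
        List<String> added = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : v2.entrySet()) {
            boolean missingFromV1 = !v1.containsKey(e.getKey());
            if (missingFromV1 && e.getValue()) {
                added.add(e.getKey());
            }
        }
        return added;
    }
}
```

Each name returned here would be a candidate for the explicit table.updateSchema().allowIncompatibleChanges().addRequiredColumn call mentioned above; the remaining new fields could still be added as optional.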

@abmo-x abmo-x closed this Dec 1, 2022

3 participants