Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fastavro gives a default value for unions with nulls that don't have a default value #49

Closed
movermeyer opened this issue May 19, 2016 · 4 comments

Comments

@movermeyer
Copy link

@movermeyer movermeyer commented May 19, 2016

This is somewhat related to #48. Given this example schema:

protocol test {
  record a {
    int x;
    union {null, int} y;
  }
}

the following should fail:

>>> data = StringIO.StringIO()
>>> fastavro.writer(data, schema, [{'val': {'x': 1}}])

However, it does not, giving a default value of null to a field ('y') without a default value.

This stems from the use of .get(), which returns None if the key is not found.
The field does not have a default value, so datum.get('default') returns None, which is then used as the default value.

This is just mixing up the lack of a value, with a value of null. Instead, the code should check for the field's existence in the datum, and return false if it doesn't exist.

This is related to #37, in that having write_record call validate() on itself would catch this. Although I imagine the overhead might raise some ire, so there are some options.

Some pull requests will be made in a bit.

@movermeyer
Copy link
Author

@movermeyer movermeyer commented May 19, 2016

In general, this is a problem with not checking to make sure that the datum contains a value for each field or the field has a default.

In this specific case, it results in data being made up.

movermeyer pushed a commit to movermeyer/fastavro that referenced this issue May 19, 2016
Have the record validate itself before writing out.
e-heller added a commit to e-heller/fastavro that referenced this issue May 22, 2016
Create a `NoValue` sentinel object for this case. Previously, `None`
would be deduced, but Python's `None` is conflated with the Avro type
`null` and can causes issues with `union` records that contain a `null`
type. See fastavro#49
@movermeyer
Copy link
Author

@movermeyer movermeyer commented May 24, 2016

I like @e-heller's solution better. Solves this problem and shouldn't have noticeable impact on performance.

e-heller added a commit to e-heller/fastavro that referenced this issue May 27, 2016
For any `x = object()`, `bool(x) == True`. However we do not want the
special `NoValue` sentinel to be mistaken as any valid type. Implement
a `__bool__` method on the `_NoValue` class to raise a TypeError.
See: fastavro#49
tebeka added a commit that referenced this issue Jun 6, 2016
@tebeka
Copy link
Collaborator

@tebeka tebeka commented Jun 12, 2016

0.9.10 should fix this

@tebeka tebeka closed this Jun 12, 2016
@movermeyer
Copy link
Author

@movermeyer movermeyer commented Jun 12, 2016

Looks good. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants