Skip to content
This repository has been archived by the owner on Dec 20, 2023. It is now read-only.

Event Serialization code should handle NULL strings correctly #591

Open
mrjerryjohns opened this issue May 14, 2020 · 3 comments
Open

Event Serialization code should handle NULL strings correctly #591

mrjerryjohns opened this issue May 14, 2020 · 3 comments

Comments

@mrjerryjohns
Copy link
Contributor

For WDM events, the expectation on applications that emit them is to memset them to 0 before using them. This protects the application against additions made to those events being made in schema before they may have had a chance to update their code (or they may not even care to set that field).

For fields of type string, the serialization utils in Weave do not correctly handle a NULL pointer and blit out an empty string for that field in the payload. This has caused crashes in application code.

In keeping with the theme that 'memset' is the right way to initialize an event, then this should follow suit as well.

@jaylogue
Copy link
Contributor

Your explanation doesn't make it clear what the correct behavior should be. Are you saying that the code should blit out an empty string in the payload for NULL field?

Why an empty string, instead of an absent field? What about cases where an empty string violates the schema?

If a string field is added to an event, but the application hasn't been updated to set it, then that field must have been marked optional in the schema, and therefore should simply be elided from the encoding.

@mrjerryjohns
Copy link
Contributor Author

The goal of initialization here is to setup safe, clear defaults in the constructed data. Similar to how the default would be 0 for integers, false for booleans, empty for arrays, we need a 'default' for strings such that it would be a NULL-string by default (i.e "").

Why an empty string, instead of an absent field? What about cases where an empty string violates the schema?

That is not for Weave to enforce - by that analogy, we should be setting up every non-optional field as part of 'initialization' to be 'unset', and force the application to set them explicitly. Clearly we're not doing that for some WDL types like integers, booleans, so picking strings to make this enforcement happen is not consistent.

Certainly having the code crash as a way to discover that isn't right. Having a mode where a validator runs to ensure that the application has set every non-optional field might be a better way to catch this (a feature only available in debug builds perhaps).

If a string field is added to an event, but the application hasn't been updated to set it, then that field must have been marked optional in the schema, and therefore should simply be elided from the encoding.

This likely applies more broadly than just strings, (optional bools, integers too). In that case, would it be that all optional fields are by default on initialization, 'not set'? Perhaps an issue for another day...

@jaylogue
Copy link
Contributor

Certainly having the code crash as a way to discover that isn't right.

Couldn't it just return an error?

I can live with sending an empty string here. But I certainly don't agree that silently sending the wrong thing is somehow better than overtly crashing (with a clear explanation why) when the programmer forgets to initialize a field.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants