-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for tracking unknown fields. #129
Conversation
- Every message has a new field containing a list of unknown fields: ```unknownFields :: Message msg => Lens' msg [TaggedValue]``` - Unknown fields are preserved by `decodeMessage`, `encodeMessage`, and `showMessage` - Unknown fields still cause an error for `readMessage`. A few TODOs: - For now, unknown groups are printed sub-optimally by `showMessage`: the start/end group tags (and everything in between) all get displayed as individual fields, rather than being organized into a sub-struct. - The `discardUnknownFields` function isn't recursive, unlike in other languages. - The Ord instance doesn't try to do anything special, just treating the unknown fields as a list of values. If it really matters then `discardUnknownFields` can help resolve the ambiguity.
@@ -14,6 +14,7 @@ module Data.ProtoLens.Arbitrary | |||
) where | |||
|
|||
import Data.ProtoLens.Message | |||
import Data.ProtoLens.Encoding.Wire |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need this import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@@ -53,20 +53,23 @@ parseMessage :: forall msg . Message msg => Parser () -> Parser msg | |||
parseMessage end = do | |||
(msg, unsetFields) <- loop def requiredFields | |||
if Map.null unsetFields | |||
then return $ reverseRepeatedFields fields msg | |||
then return $ over unknownFields reverse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why reversed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nevermind. it's immediately afterwards :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line causes the list to be accumulated in reversed order:
https://github.com/google/proto-lens/pull/129/files/137e2aab090f96cd7d254531d41cd6537b28ca22#diff-f6d9e63733ce67e5fce8bbe122336aa5R64
It's similar to what we do for repeated fields:
https://github.com/google/proto-lens/pull/129/files/137e2aab090f96cd7d254531d41cd6537b28ca22#diff-f6d9e63733ce67e5fce8bbe122336aa5R126
A little bit of a hack, but at least isolated to this file.
Originally we used Seq
for repeated fields, but it seemed to make the API unnecessarily complicated. (And right now it's tricky to have a different type just for accumulation during decode.)
else fail $ "Missing required fields " | ||
++ show (map fieldDescriptorName | ||
$ Map.elems $ unsetFields) | ||
where | ||
fields = fieldsByTag descriptor | ||
addUnknown :: TaggedValue -> msg -> msg | ||
addUnknown !f = over unknownFields (\(!xs) -> f : xs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might be cleaner as over unknownFields (f :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs the bang pattern or else horrible space leaks happen. I've sent out #130 to add a helper function and use it elsewhere in this file; once that's in I'll use it here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, missed the bang. makes sense
requiredFields = Map.filter isRequired fields | ||
loop :: msg -> Map.Map Tag (FieldDescriptor msg) | ||
-> Parser (msg, Map.Map Tag (FieldDescriptor msg)) | ||
loop msg unsetFields = ((msg, unsetFields) <$ end) | ||
<|> do | ||
tv@(TaggedValue tag _) <- getTaggedValue | ||
case Map.lookup tag fields of | ||
Nothing -> loop msg unsetFields | ||
Nothing -> (loop $! addUnknown tv msg) unsetFields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the strictness? to avoid big thunks if a big message is entirely unknown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, pretty much. In particular, the addUnknown
thunk might otherwise hold onto the old value of msg
, which can be painful if there are a bunch of unknown fields in a row (e.g.: an unknown, unpacked repeated field).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great!
- Every message has a new field containing a list of unknown fields: ```unknownFields :: Message msg => Lens' msg [TaggedValue]``` - Unknown fields are preserved by `decodeMessage`, `encodeMessage`, and `showMessage` - Unknown fields still cause an error for `readMessage`. A few TODOs: - For now, unknown groups are printed sub-optimally by `showMessage`: the start/end group tags (and everything in between) all get displayed as individual fields, rather than being organized into a sub-struct. - The `discardUnknownFields` function isn't recursive, unlike in other languages. - The Ord instance doesn't try to do anything special, just treating the unknown fields as a list of values. If it really matters then `discardUnknownFields` can help resolve the ambiguity.
- Every message has a new field containing a list of unknown fields: ```unknownFields :: Message msg => Lens' msg [TaggedValue]``` - Unknown fields are preserved by `decodeMessage`, `encodeMessage`, and `showMessage` - Unknown fields still cause an error for `readMessage`. A few TODOs: - For now, unknown groups are printed sub-optimally by `showMessage`: the start/end group tags (and everything in between) all get displayed as individual fields, rather than being organized into a sub-struct. - The `discardUnknownFields` function isn't recursive, unlike in other languages. - The Ord instance doesn't try to do anything special, just treating the unknown fields as a list of values. If it really matters then `discardUnknownFields` can help resolve the ambiguity.
unknownFields :: Message msg => Lens' msg [TaggedValue]
decodeMessage
,encodeMessage
,and
showMessage
readMessage
.A few TODOs:
showMessage
: thestart/end group tags (and everything in between) all get displayed as
individual fields, rather than being organized into a sub-struct.
discardUnknownFields
function isn't recursive, unlike in otherlanguages.
the unknown fields as a list of values. If it really matters then
discardUnknownFields
can help resolve the ambiguity.