
Prophy python codec cleanup #16

Merged
merged 19 commits into from
Jan 15, 2019
Conversation

kamichal
Collaborator

@kamichal kamichal commented Aug 2, 2018

A step ahead towards:

  • creating wire pattern stamps that can be hashed with md5 into a unique version identifier
  • making debuggable hex dumps in case of decode failures
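The stamping idea can be sketched roughly as follows. This is a toy model, not prophy's actual API: the `wire_stamp` function and the `(name, kind, size)` brick tuples are hypothetical placeholders for whatever the flattened wire description would contain.

```python
import hashlib

def wire_stamp(bricks):
    """Hash a flattened wire-pattern description into a version identifier.

    `bricks` is assumed to be an iterable of (name, kind, size) tuples
    describing every field and padding on the wire, in order.
    """
    digest = hashlib.md5()
    for name, kind, size in bricks:
        digest.update("{}:{}:{}".format(name, kind, size).encode())
    return digest.hexdigest()
```

Any change to a field's type or size changes a tuple, which changes the md5 digest, so two peers can compare stamps to detect wire-incompatible schemas.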

@kamichal kamichal self-assigned this Aug 2, 2018
@aurzenligl
Owner

I'll need a while to read through it, most probably on the weekend.

@coveralls

coveralls commented Aug 3, 2018

Coverage Status

Coverage increased (+0.05%) to 99.336% when pulling 7a2e7cd on wire_interface_versioning into 1c60fa2 on master.

@kamichal kamichal changed the base branch from pep8_propagation to master August 3, 2018 08:28
_SIZE = padding_size

@staticmethod
def encode_mock():
Owner

This seems unused.

@classmethod
def get_descriptor(cls):
"""
FIXME: I'm afraid it rapes YAGNI rule
Owner
It's a public mechanism to introspect struct/union layout. It's needed by users who want to implement non-standard algorithms on messages. I see that bricks_walk is essentially the same thing, but instead of returning a descriptor for the underlying struct/union, it just flattens everything. I suppose it's better to preserve the structure and return a descriptor for the nested type, instead of unpacking it into a set of "bricks".

Rationale: "bricking" may be good for stamping the schema of a prophy message, but it's not good for custom algorithms which print, transform, check, fill, clone, etc. messages. All algorithms currently implemented in the Python prophy codec use recursion and e.g. call the proper struct method if a member of a struct is another struct. The "descriptor" or - as you called it - "brick" mechanism should probably allow the user to do the same, so that boundaries between structs are not removed.

See ListFields and Descriptor for reference on protobuf reflection API, which this get_descriptor API is supposed to be modelled after:
https://developers.google.com/protocol-buffers/docs/reference/python/google.protobuf.message.Message-class
https://developers.google.com/protocol-buffers/docs/reference/python/

So... I'd propose to stick to the "descriptor" terminology and design, without flattening, and just extend this mechanism, so that it's complete. "Bricking" (meaning: flattening all fields of message) is just one of many algorithms which can be implemented when one has message descriptors publicly available (and then it could be implemented as free function, because all details relevant to "bricking" algorithm would be available publicly in single message type).
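The "preserve the structure" approach can be sketched as a recursive visitor over nested descriptors. This is a toy model (plain lists standing in for prophy descriptors, and a hypothetical `walk` helper), not the real reflection API:

```python
def walk(descriptor, visit, depth=0):
    """Recursively visit a nested descriptor without flattening it.

    A descriptor is modeled as a list of (name, type) pairs, where a type
    is either another such list (a nested struct) or a scalar name.
    Struct boundaries are kept: nested structs are entered via recursion,
    mirroring how the codec's algorithms call into member structs.
    """
    for name, tp in descriptor:
        visit(depth, name, tp)
        if isinstance(tp, list):   # member is itself a struct: recurse
            walk(tp, visit, depth + 1)

struct = [("x", "u32"), ("y", "u32")]
host = [("a", struct), ("b", struct)]
walk(host, lambda depth, name, tp: print("  " * depth + name))
```

A flattening "bricking" pass is then just one possible `visit` callback, implementable as a free function on top of this, without a second parallel mechanism.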

Collaborator Author

  • 1st: In protobuf:

ListFields(self)
Returns a list of (FieldDescriptor, value) tuples for all fields in the message which are not empty.

Fields are returned with their values. get_descriptor returns neither values nor paddings, if any are present.

  • 2nd: To determine a unique identifier for the class stamp I need to iterate through each brick of each subclass and flatten it. Consider this:
class Struct(prophy.with_metaclass(prophy.struct_generator, prophy.struct)):
    _descriptor = [("x", prophy.u32),
                   ("y", prophy.u32)]

class Host(prophy.with_metaclass(prophy.struct_generator, prophy.struct)):
    _descriptor = [("a", Struct),
                   ("b", Struct)]

What if Struct.x changes type to, say, u64?
Iterating through Host.get_descriptor() will not notice any difference:
[('a', <class 'Struct'>, ('STRUCT', 4)), ('b', <class 'Struct'>, ('STRUCT', 4))]
Yes, then I could go recursively deeper, flatten the list, turn it into strings and then meld it up with hashlib.md5 to get a wire interface identifier. It's possible, but not satisfying, because my point is to also dig into byte-stream decode/encode debugging. Then getting information about paddings becomes essential: walk across the wire and tell/report what happened.
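The "go recursively deeper and flatten" step can be sketched like this. Again a toy model: plain lists stand in for prophy descriptors, and `flatten` is a hypothetical helper, not part of prophy:

```python
def flatten(descriptor, prefix=""):
    """Recursively flatten a nested descriptor into dotted leaf entries.

    A descriptor is modeled as a list of (name, type) pairs, where a type
    is either another such list (a nested struct) or a scalar name.
    """
    for name, tp in descriptor:
        path = prefix + "." + name if prefix else name
        if isinstance(tp, list):          # nested struct: recurse into it
            for leaf in flatten(tp, path):
                yield leaf
        else:
            yield (path, tp)

struct = [("x", "u32"), ("y", "u32")]
host = [("a", struct), ("b", struct)]
print(list(flatten(host)))
# [('a.x', 'u32'), ('a.y', 'u32'), ('b.x', 'u32'), ('b.y', 'u32')]
```

Because leaf types appear in the flattened output, a change of Struct.x from u32 to u64 is visible here even though Host's top-level descriptor looks unchanged.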

But going back to the Host.Struct case. What if Struct becomes optional? Host.get_descriptor will return exactly the same: [('a', <class 'Struct'>, ('STRUCT', 4)), ('b', <class 'Struct'>, ('STRUCT', 4))]
The optional field will not be included.

I don't know if get_descriptor is used by anybody. I assume it is, so I decided to make a similar mechanism alongside it - iterating via bricks_walk.

Owner

ListFields(self)
Returns a list of (FieldDescriptor, value) tuples for all fields in the message which are not empty.

Fields are returned with their values. get_descriptor returns neither values nor paddings, if any are present.

Yes, the current get_descriptor is quite limited and should be enhanced. In defence of protobuf - they don't have the problem of padding, so they don't have to describe it. And when it comes to optional fields... there is some API in protobuf which allows getting a list of all fields, not only the filled ones, but I don't quite remember which ;) I used protobuf as an example - to explain that it publicly exposes details about its schema in the Python codec, so that users can do virtually anything with protobuf messages without the necessity of changing the protobuf implementation (which would be a hassle).

But going back to the Host.Struct case. What if Struct becomes optional? Host.get_descriptor will return exactly the same: [('a', <class 'Struct'>, ('STRUCT', 4)), ('b', <class 'Struct'>, ('STRUCT', 4))]
The optional field will not be included.

This is a shortcoming which can be corrected in the get_descriptor mechanism - it can imho return all paddings and details like whether a field is optional or not. I don't see any problem with the metadata that you propose in the brick mechanism - the only thing that is questionable is flattening. My point is: it's easier to flatten a nested structure than to nest a flattened list, so it seems to me that keeping get_descriptor (but enhancing it with all necessary details) - and then flattening it - would:

  • fulfil all requirements that the bricking algorithm does as it is implemented now,
  • allow all kinds of usages which require nested descriptors - and that is a nice thing to have.

It's a bit more legwork, but only a bit ;)

I don't know if get_descriptor is used by anybody. I assume it is, so I decided to make a similar mechanism alongside it - iterating via bricks_walk.

I think it was added by @Jiish when he was working at Nokia as a trainee - for his master's thesis. I'm not sure if it's used in production, but I'm sure that there is a need for a fully fledged get_descriptor mechanism which would return all the greasy details about a prophy schema. Otherwise people would (and still do!) depend on private prophy details, which is obviously the wrong thing to do.

All in all - I wouldn't worry about changing the get_descriptor API, but I think that enhancing it to the point where other uses (like walking through flattened "bricks") are possible leaves us with a prophy that is more useful for further purposes, while adding a parallel mechanism similar to get_descriptor makes things more confusing and harder to maintain, because there'd be two competing mechanisms for doing essentially the same thing.

@aurzenligl
Owner

The refactoring of the Python codec is a very good idea, things got cleaner now. I have a couple of suggestions regarding the placement of functionalities throughout modules.

  • I'd probably move base_array to container.py.
  • I'd probably move (keep?) enum in scalar.py.
  • I'd probably retain the composite.py name instead of data_types.py, which doesn't define all data types.
  • I'd probably rename desc_item.py to descriptor.py, and store all descriptor-related things there (reflection API, etc.).
  • I'd probably rename descriptor_item_type to field, which sounds simpler and to the point (the statement "a descriptor is composed of fields" rings the bell).
  • I'd probably move codec_kind to descriptor.py - since that's the place with things common to all data types.

All in all - the separation of generators and descriptors from the big composite blob is definitely an improvement.

"""
Base metaclass type intended to validate, supplement and create all
prophy_data_object classes.
All of this magic happen at the very moment of the created class' import.
Owner

Well, all of this magic happens at the very moment the class definition gets interpreted, which usually happens at codec module import. In unit tests we delay the moment of class creation until a fixture function is called - which happens way after import.

@kamichal
Collaborator Author

kamichal commented Aug 6, 2018

Sometimes I struggle with class names. There are 38 class definitions (excluding tests) in the prophy package and only ProphyError is written in camel case.
You asked me to rename descriptor_item_type to field, but that's a common variable name used extensively in the package, so it collides.

I can either call it e.g. field_type or use the PEP8 convention, i.e. Field. Of course that breaks the existing convention, and even PEP8 recommends keeping an existing convention.

So I would like to ask to break the convention and rename the classes to CamelCase in the whole prophy package. Of course the external interface (prophy usage) will not change, because prophy/__init__.py allows renaming exported definitions (e.g. from .generators import EnumGenerator as enum_generator).
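The aliasing trick can be sketched in a self-contained way. The class body below is a stand-in (the real EnumGenerator lives in prophy's generators module); the point is only the rename-on-export pattern:

```python
class EnumGenerator(type):
    """CamelCase internal name, following PEP8 for class names."""

# Backwards-compatible snake_case alias, as prophy/__init__.py would
# expose it (e.g. `from .generators import EnumGenerator as enum_generator`):
enum_generator = EnumGenerator

# Both names refer to the very same class object.
assert enum_generator is EnumGenerator
```

Existing user code importing prophy.enum_generator keeps working unchanged while the package-internal names move to CamelCase.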

Do you, Krzysztof want me to do that? 💃

@aurzenligl
Owner

I can either call it e.g. field_type or use the PEP8 convention, i.e. Field. Of course that breaks the existing convention, and even PEP8 recommends keeping an existing convention.

That's definitely OK. I don't care whether it's field_type or field or Field or FieldType, I'd just want the core part of this type name to be some form of the word field :D

So I would like to ask to break the convention and rename the classes to CamelCase in the whole prophy package. Of course the external interface (prophy usage) will not change, because prophy/__init__.py allows renaming exported definitions (e.g. from .generators import EnumGenerator as enum_generator).

Both ways are OK - all in snake_case (against PEP8) or all in CamelCase (aligned with PEP8). I agree that keeping backwards compatibility by exposing the snake_case names is a good idea.

Do you, Krzysztof want me to do that?

If you feel like it - then do it ;) Code will become yet more clear.

@kamichal kamichal changed the title Wire interface versioning Prophy python codec cleanup Jan 13, 2019
@kamichal
Collaborator Author

I decided to discontinue the "wire interface versioning" implementation on this side (in prophy) and to implement it in prophyc instead. I removed the fishy part with the descriptor iterator (the so-called "wire bricks walk"). So this pull request consists only of refactoring.

@aurzenligl aurzenligl left a comment

A lot of good work @kamichal :) I've read the diff - looks fine for pulling.

@kamichal
Collaborator Author

kamichal commented Jan 14, 2019

Thanks. It's ready to be merged. Version bump and release can wait for "babel part 1".

@kamichal kamichal merged commit 094e29b into master Jan 15, 2019
@kamichal kamichal deleted the wire_interface_versioning branch January 15, 2019 21:43