Serialization and rendering #15

Open
mjtamlyn opened this issue Apr 4, 2016 · 14 comments

Comments

@mjtamlyn
Owner

mjtamlyn commented Apr 4, 2016

Serialization is strictly the process of getting from a rich object to a bytestring (encoded in some format). Forms in particular make that bytestring rather complicated, and the deserialization step is funky. I'm not sure of the "right" terminology but I think we should think about two separate steps.

Firstly, we want to deconstruct and normalise our data set to a small subset of types. Some of these types are obvious - list, dict, str, int, float, bool, NoneType etc. For others it is less clear whether they should be supported at this level - for example decimal.Decimal, collections.OrderedDict, datetime.datetime, set, or even more domain specific types like prices.Price. It's hard to know where to draw the line here - there's no perfect common set of data types which is supported by all possible encodings (renderers). Given that the HTMLForm/x-www-form-urlencoded renderer will need to do some custom transformations even for simple types, as it depends on how things are being rendered by the widget (e.g. ['on', ''] -> [True, False] for checkboxes), I think it's clear we need to consider this as two steps. Perhaps we have a "core" set of data types which are valid return types from a serialization object and which all renderers "must" support, but we allow renderers to become aware of other data types so they can support them too.

Pseudocode:

s = MyNativeTypesSerializer(initial)
data = s.serialize()
json_blob = JSONRenderer().render(data)
html_form = FormRenderer(widgets).render(data)
JSONRenderer().can_render(data)  # True

s = MyRichSerializer(initial)
data = s.serialize()
assert isinstance(data['created'], datetime)
JSONRenderer().can_render(data)  # False

JSONRenderer.register_type_encoder(datetime, some_transformation_func)
JSONRenderer().can_render(data)  # True
# OR
JSONRenderer(type_encoders={datetime: some_transformation_func}).can_render(data)  # True

This obviously has some overlap with DjangoJSONEncoder which knows how to handle various data types such as UUID, datetime, lazy strings etc.

Whilst it would be possible, it should be strongly advised that you do not have type encoders for high level types such as instances of models.

Note that it's entirely possible here I'm just rehashing renderers/parsers from DRF, perhaps with some other extensibility ideas and we can pretty much use those for rendering. I think my main change is that perhaps I'd like to see .render(structure, data) or something so that the form renderer is aware of the underlying configuration of the form for example. I probably need to do more research into how DRF handles these layers, and the content negotiation for them.
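A minimal, runnable sketch of the `can_render` / `type_encoders` idea from the pseudocode above. Everything here is an assumption made for illustration: this `JSONRenderer` is a hypothetical class (not DRF's), the choice of "core" types is arbitrary, and only the constructor variant of type encoder registration is shown.

```python
import json
from datetime import datetime

# Assumed "core" scalar types that every renderer must support.
CORE_TYPES = (str, int, float, bool, type(None))


class JSONRenderer:
    """Hypothetical renderer for illustration; not DRF's JSONRenderer."""

    def __init__(self, type_encoders=None):
        # Maps a rich type to a function reducing it to a core type.
        self.type_encoders = dict(type_encoders or {})

    def _encoder_for(self, value):
        for rich_type, encoder in self.type_encoders.items():
            if isinstance(value, rich_type):
                return encoder
        return None

    def can_render(self, data):
        # Walk containers recursively; leaves must be core types or
        # have a registered type encoder.
        if isinstance(data, dict):
            return all(
                isinstance(key, str) and self.can_render(value)
                for key, value in data.items()
            )
        if isinstance(data, (list, tuple)):
            return all(self.can_render(item) for item in data)
        return isinstance(data, CORE_TYPES) or self._encoder_for(data) is not None

    def _default(self, value):
        # json.dumps calls this only for values it cannot encode itself.
        encoder = self._encoder_for(value)
        if encoder is None:
            raise TypeError(f"Cannot render {type(value).__name__}")
        return encoder(value)

    def render(self, data):
        return json.dumps(data, default=self._default, sort_keys=True)


data = {"name": "example", "created": datetime(2016, 4, 4, 12, 0)}
plain = JSONRenderer()
aware = JSONRenderer(type_encoders={datetime: datetime.isoformat})

print(plain.can_render(data))
print(aware.can_render(data))
print(aware.render(data))
```

The renderer without encoders rejects the `datetime` value, while the one given an encoder accepts and renders it. Whether registration should be per instance or per class is left open here, as in the original pseudocode.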

@phalt

phalt commented Apr 4, 2016

Could the serializer accept the renderer / encoder as an option? Or have a default?

For an API representation:

class APIUserSerializer(Serializer):

    renderer_class = JSONRenderer

...

For a template form:

class FormUserSerializer(Serializer):

    renderer_class = HTMLFormRenderer

...

Then doing something like:

s = APIUserSerializer(data=data)
s.serialize()
>>> # Output is in the format defined by the renderer

This would allow for some neater grouping and also give some customisation, such as providing your own renderer. This allows renderers to handle specific datatypes like Decimal if they choose to.

We're almost definitely doing a rehash of some components of DRF, but for me this whole idea started as I saw benefit of some parts of DRF being useful in Core Django :)
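For illustration, the `renderer_class` idea could be sketched roughly as below. This is only a sketch under assumptions: the `Serializer` base class, the per-instance `renderer` override, and both renderer classes are hypothetical stand-ins, and `FormEncodedRenderer` simply URL-encodes the data rather than producing an HTML form.

```python
import json
from urllib.parse import urlencode


class JSONRenderer:
    """Hypothetical renderer: simple data -> JSON text."""

    def render(self, data):
        return json.dumps(data, sort_keys=True)


class FormEncodedRenderer:
    """Hypothetical stand-in for a form renderer: simple data -> query string."""

    def render(self, data):
        return urlencode(sorted(data.items()))


class Serializer:
    # Default renderer; subclasses override this attribute.
    renderer_class = JSONRenderer

    def __init__(self, data, renderer=None):
        self.data = data
        # An explicit renderer instance wins over the class-level default.
        self.renderer = renderer or self.renderer_class()

    def serialize(self):
        return self.renderer.render(self.data)


class APIUserSerializer(Serializer):
    renderer_class = JSONRenderer


class FormUserSerializer(Serializer):
    renderer_class = FormEncodedRenderer


user = {"name": "Ada", "age": 36}
print(APIUserSerializer(data=user).serialize())
print(FormUserSerializer(data=user).serialize())
```

The same data goes through two serializers and comes out in the format chosen by each renderer; passing `renderer=` to the constructor swaps it per call.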

@mjtamlyn
Owner Author

mjtamlyn commented Apr 4, 2016

It's an interesting question where the limits of the core object's responsibilities lie. My ideas are currently more around the Serializer class knowing the "rules" about itself, but delegating everything it does to other pluggable layers - validation, data extraction, rendering etc.

I think the main thing I'd like to try to do, if we can, is make it possible to do the above easily if you want - it's about a framework. DRF is a good framework in itself, but in reality validation and data extraction/de-extraction are still quite entangled. In django.forms, rendering and parsing are also entangled in the same layers.

@MoritzS
Collaborator

MoritzS commented Apr 4, 2016

The need to serialize rich objects will definitely arise but I don't think that should be the job of a serializer (i.e. a thing that converts a simple, almost dict-like structure with simple value types (str, int, etc.) to something else like JSON or an HTML form).
It is impossible to account for all different kinds of objects in a serializer so I think the serializers should only deal with simple values. Also you would possibly need to duplicate quite a lot of code if you want all serializers to support a new rich type.
What about another layer called something like "reducers" that only deals with turning rich values into simple ones? That way you just need to write a new "reducer" for your rich type and can then immediately use all serializers. A reducer could maybe also optionally support "unreducing" to make deserialization possible.
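A rough sketch of what such a "reducer" layer might look like. All names here (`Reducer`, `rich_type`, `reduce`, `unreduce`, `reduce_data`) are hypothetical and chosen only for this example.

```python
from datetime import date
from decimal import Decimal


class Reducer:
    """Hypothetical base class: turns one rich type into a simple value."""

    rich_type = None

    def reduce(self, value):
        raise NotImplementedError

    def unreduce(self, value):
        raise NotImplementedError


class DecimalReducer(Reducer):
    rich_type = Decimal

    def reduce(self, value):
        return str(value)

    def unreduce(self, value):
        return Decimal(value)


class DateReducer(Reducer):
    rich_type = date

    def reduce(self, value):
        return value.isoformat()

    def unreduce(self, value):
        return date.fromisoformat(value)


def reduce_data(data, reducers):
    """Recursively replace rich values with their reduced form."""
    if isinstance(data, dict):
        return {key: reduce_data(value, reducers) for key, value in data.items()}
    if isinstance(data, list):
        return [reduce_data(item, reducers) for item in data]
    for reducer in reducers:
        if isinstance(data, reducer.rich_type):
            return reducer.reduce(data)
    return data  # already a simple value


rich = {"price": Decimal("9.99"), "released": date(2016, 4, 4), "tags": ["a"]}
simple = reduce_data(rich, [DecimalReducer(), DateReducer()])
print(simple)
```

Note that "unreducing" is the harder direction: once a `Decimal` has become a plain string, something else (a field definition, for example) has to say which reducer applies to which key.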

@mjtamlyn
Owner Author

mjtamlyn commented Apr 4, 2016

The issue with that is that the set of "rich"er data types supported depends on the transport encoding. In particular, x-www-form-urlencoded ONLY supports strings and arrays: no booleans, nested dicts etc.
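This loss of type information can be seen with the standard library alone. The sample data below is made up for the example.

```python
from urllib.parse import parse_qs, urlencode

data = {"name": "Ada", "active": True, "tags": ["a", "b"]}

# doseq=True emits one key=value pair per list item.
encoded = urlencode(data, doseq=True)

# Decoding gives back only lists of strings: the boolean is now "True".
decoded = parse_qs(encoded)

print(encoded)
print(decoded)
```

After the round trip every value is a list of strings, so any notion of booleans or single versus multiple values has to be reconstructed from knowledge the encoding itself does not carry.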

@phalt

phalt commented Apr 4, 2016

The reducer in this case could be a simple type of renderer? Encoding the data from an external data type to an internal version (like your unreducer) could also be encompassed within this.

@MoritzS
Collaborator

MoritzS commented Apr 4, 2016

Yes, I guess the reducer would actually be a simple renderer itself. So you would then have something like:

               +------------+                       +------------+
rich data ---> |  reducing  | ---> simple data ---> |  JSON      | ---> JSON
               |  renderer  |                       |  renderer  |
               +------------+                       +------------+

On second thought, I'm not sure that should be the case: if you start chaining renderers, this violates the idea of "every renderer gets the same kind of predefined input data".

@luzfcb

luzfcb commented Apr 4, 2016

I do not know where to post this, but this project vaguely reminded me of another project: https://github.com/WiserTogether/django-remote-forms

@LilyFoote
Collaborator

In core Django, I think the job of converting simple data to/from rich internal data is done by the Field.to_python and Field.prepare_value methods.

@MoritzS
Collaborator

MoritzS commented Apr 4, 2016

Widget.value_from_datadict() and Widget.render() should also be mentioned here. That would be specific to an HTML serializer, though.
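To illustrate why this pair is HTML-specific, here is a simplified stand-in for a checkbox widget. It is not Django's `CheckboxInput`: the class name is hypothetical, `render` takes fewer arguments than Django's, and the logic is reduced to the bare idea.

```python
from html import escape


class CheckboxWidget:
    """Hypothetical, simplified widget; not django.forms.CheckboxInput."""

    def value_from_datadict(self, data, files, name):
        # Browsers omit unchecked checkboxes from the submission, so a
        # missing or empty value is read as False.
        return bool(data.get(name, ""))

    def render(self, name, value):
        checked = " checked" if value else ""
        return f'<input type="checkbox" name="{escape(name)}"{checked}>'


widget = CheckboxWidget()
print(widget.value_from_datadict({"agree": "on"}, {}, "agree"))
print(widget.value_from_datadict({}, {}, "agree"))
print(widget.render("agree", True))
```

Both directions depend on how browsers submit checkboxes, which is knowledge a JSON renderer would never need.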

@mjtamlyn
Owner Author

mjtamlyn commented Apr 4, 2016

@luzfcb Thanks a lot for that! I didn't know about that project (or the 2012 talk). That's definitely one of the problems we hope to tackle - only with an underlying data structure which is more abstract, rather than tied to the implementation of django.forms.Form.

@MoritzS @Ian-Foote Yes, and that Field.to_python stuff is actually kinda HTML form specific. I'm not sure there's any benefit to a separate reducing renderer and JSON encoder, although there is some logic to treating them as distinct steps within the "rendering" phase. I think in reality the way the reducer works is dependent on the encoding, so it kinda belongs as part of the encoder.

@phalt

phalt commented Apr 5, 2016

@MoritzS yes, that flow looks sensible.

I'm worried that by sticking to "forms" we're missing out on the flexibility and potential of this change, and we should consider html form representations as another rendering option for the data, as @mjtamlyn points out.

@MoritzS
Collaborator

MoritzS commented Apr 5, 2016

We're not sticking to HTML forms at all. In fact, we thought about it exactly as you described: just another rendering option alongside JSON or other formats.

@phalt

phalt commented Apr 5, 2016

💃

@LilyFoote
Collaborator

A few different form libraries have been mentioned on #17 which I think are actually more in scope for this issue:
