Specifying different serializers for input and output #1563

Closed
foresmac opened this issue May 2, 2014 · 26 comments

Comments

@foresmac

foresmac commented May 2, 2014

I find that, particularly when it comes to POST methods, I often need a different serializer for input than for output. E.g., for a particular model I may need only two or three input values, but the server will calculate or retrieve additional values for other fields on the model, and all those values need to get back to the client.

So far, my method has been to override get_serializer_class() to specify a separate input serializer for the request, and then override create() to use a different serializer for my output. That pattern works, but it took me some time to figure it out because the docs don't really suggest that option; the assumption that the generic APIViews are built around is that you specify one serializer and that is used for both input and output. I agree this generally works in the common case, but using my method breaks a little bit of the CBV magic. In particular, it can be difficult to troubleshoot if you make a mistake specifying a custom output serializer.

I propose two solutions:

  1. Build a consistent way to optionally specify different input and output serializers, or
  2. Add some additional documentation explaining the presumption that one serializer is intended for input and output, and best practices for overriding that in the case that the default behavior doesn't suit your use case.
@thedrow
Contributor

thedrow commented May 2, 2014

You need to use the read_only/write_only kwargs if you want to return different data but allow writing to other attributes.
If you need entirely different formatting, or even different field classes for the same attribute, then yes, you need to use two different serializers.
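
To make the read-only/write-only distinction concrete, here is a framework-free toy model of what the two flags mean when one serializer serves both directions. It is only a sketch and does not use DRF: `Field`, `ToySerializer`, and the `upload_token` field are hypothetical stand-ins. In DRF itself the corresponding field arguments are `read_only=True` and `write_only=True`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a serializer field with direction flags."""
    read_only: bool = False   # shown in output, ignored on input
    write_only: bool = False  # accepted on input, hidden from output


class ToySerializer:
    """Toy model of one serializer used for input and output (not DRF code)."""
    fields = {
        "item_url": Field(),                     # both directions
        "upload_token": Field(write_only=True),  # input only (hypothetical)
        "shopper": Field(read_only=True),        # output only, set by the server
    }

    @classmethod
    def to_internal(cls, payload):
        # Input side: keep known fields the client is allowed to write.
        return {k: v for k, v in payload.items()
                if k in cls.fields and not cls.fields[k].read_only}

    @classmethod
    def to_output(cls, record):
        # Output side: keep known fields the client is allowed to read.
        return {k: v for k, v in record.items()
                if k in cls.fields and not cls.fields[k].write_only}
```

With this model, a client-supplied `shopper` value is dropped on input, and `upload_token` never appears in the response, which is the effect the two kwargs are meant to give a single serializer.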

@carltongibson
Collaborator

Hi @foresmac,

You've hit on one of the... erm... learning points of DRF. This sort of thing comes up on StackOverflow a number of times.

Overriding get_serializer_class() works well enough, so I'd favour your Option 2.

If you fancy drafting up some changes in a pull request — or doing a blog post or something else like that — that would be cool.

In the meantime I'll close this particular issue.

@foresmac
Author

foresmac commented May 6, 2014

@carltongibson:

Overriding get_serializer_class() only works if you're using a different serializer for different HTTP methods, right? Is there a way to override it so that it returns a different serializer for input on a request vs output on a Response?

@carltongibson
Collaborator

@foresmac, I see: a slightly different case. I think the short answer is "Not automagically, not currently". I imagine the simplest thing (if this is really necessary) is setting the response data with your output serialiser by hand. (But you've found your own way via create(), it seems.)

if this is really necessary

You really can get a long way with read only fields and so on — I can certainly believe there are cases where this isn't enough but I'm not sure at all that such cases would fall in the 80:20 that needs to be served (in the core) by DRF.

If you think we're missing something, I recommend you explain it in depth, show where the code would change, show what use-cases would be resolved by it — if it sounds good, then open a pull request to that effect so that it can be reviewed.

If you fancy taking a pop at Option 2 — that'll always be well received.

@foresmac
Author

foresmac commented May 6, 2014

Yeah, I'm fine either way, honestly. I just found it difficult to figure out from the docs how to do what I wanted, but that may just as much be my misunderstanding of how DRF is designed to work.

Here's a sample so you can see what I am talking about; feel free to note if I'm doing something monumentally stupid and that there is/should be a better way in DRF itself that I'm just missing.

from rest_framework import generics, status
from rest_framework.response import Response

from rack.models import RackItem
from rack.serializers import RackItemSerializer, NewRackItemSerializer


class ListCreateRackItem(generics.ListCreateAPIView):
    model = RackItem

    def get_serializer_class(self):
        if self.request.method == 'POST':
            return NewRackItemSerializer
        return RackItemSerializer

    def get_queryset(self):
        return RackItem.objects.filter(shopper=self.request.user)

    def create(self, request, *args, **kwargs):
        serializer = self.get_serializer(data=request.DATA)

        if not serializer.is_valid():
            return Response(
                serializer.errors, status=status.HTTP_400_BAD_REQUEST)

        item = RackItem.objects.create(
            shopper=request.user,
            item_url=serializer.data['item_url'],
            item_image_url=serializer.data['item_image_url'])

        result = RackItemSerializer(item)
        return Response(result.data, status=status.HTTP_201_CREATED)


class GetUpdateDeleteRackItem(generics.RetrieveUpdateDestroyAPIView):
    model = RackItem
    serializer_class = RackItemSerializer

    def get_queryset(self):
        return RackItem.objects.filter(shopper=self.request.user)

and the serializers themselves:

from rest_framework import serializers

from models import RackItem


class RackItemSerializer(serializers.ModelSerializer):
    class Meta:
        model = RackItem


class NewRackItemSerializer(serializers.Serializer):
    item_url = serializers.URLField()
    item_image_url = serializers.URLField()

@foresmac
Author

foresmac commented May 6, 2014

The gist here is that I'm only getting some small bit of information to create a rack item; the server itself generates all the other fields on the model and stores them in the DB. But, I want my endpoint to spit out all that info when a new item is created.

The pattern isn't too difficult, but all the docs seem to make a lot of assumptions that everyone is doing the 80% common case, and what's there makes it hard to see what to override or where to achieve other ends. If doing what I'm doing isn't common enough to address in the code, I'm more than happy to provide some sample code and an explanation of how it works and why.

@carltongibson
Collaborator

@foresmac — it looks to me like you're switching the serialisers in the most sensible way.

However, I'd guess you could get the same result by marking the server-provided fields (shopper in your example) as read-only in RackItemSerializer and then just using that. I guess it's just a question of what you prefer in the end.

@thedrow
Contributor

thedrow commented May 7, 2014

I think we should state that we prefer read only fields in order to DRY up the code.
Creating two serializers seems redundant.
@foresmac What do you think?

@foresmac
Author

foresmac commented May 7, 2014

There will be more fields that are editable later—mostly some boolean fields that record some user actions with the model—so I'm not sure that solves the problem in the long term. Agree that I probably should be making better use of read_only, though. Maybe setting required explicitly in some cases may also help? I'm not sure, TBH.

I'm used to creating a Django form for input validation, then building a dict of key/value pairs and calling json.dumps() on it for output. The whole concept of a serializer that works both ways was completely foreign to me before using DRF.

@cyclecycle

cyclecycle commented Jun 27, 2017

The answer I came here to find turned out to be:

If read_only doesn't provide the control you need, and you want to customise the input or output validation logic, you should override to_internal_value() or to_representation(), respectively.

In this case, you'd use to_internal_value() to customise the generation of validated_data from whatever the client provides. If the subsequent built-in call to YourModel.objects.create(**validated_data) doesn't work, you can then override create().

@frenetic

I'm trying to tackle this issue right now.
I'm going to give this package a try: https://github.com/vintasoftware/drf-rw-serializers.

@naderalexan

Your approach seems logical. Why not just make a mixin out of it and reuse it? Something along the lines of:

from rest_framework import status
from rest_framework.response import Response


class DifferentOutInViewsetSerializers:
    """
    Mixin for allowing the use of different serializers for responses and
    requests for update/create
    """
    request_serializer_update = None
    response_serializer_update = None

    request_serializer_create = None
    response_serializer_create = None

    def get_serializer_class(self):
        if self.action == 'update':
            return self.request_serializer_update
        elif self.action == 'create':
            return self.request_serializer_create
        return super().get_serializer_class()

    def create(self, request, *args, **kwargs):
        serializer = self.get_serializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        self.perform_create(serializer)

        response_serializer = self.response_serializer_create(
            instance=serializer.instance)

        headers = self.get_success_headers(response_serializer.data)
        return Response(
            response_serializer.data,
            status=status.HTTP_201_CREATED, headers=headers)

    def update(self, request, *args, **kwargs):
        partial = kwargs.pop('partial', False)
        instance = self.get_object()
        serializer = self.get_serializer(instance, data=request.data, partial=partial)
        serializer.is_valid(raise_exception=True)
        self.perform_update(serializer)

        if getattr(instance, '_prefetched_objects_cache', None):
            # If 'prefetch_related' has been applied to a queryset, we need to
            # forcibly invalidate the prefetch cache on the instance.
            instance._prefetched_objects_cache = {}

        response_serializer = self.response_serializer_update(
            instance=serializer.instance)
        return Response(response_serializer.data)

@Moulde

Moulde commented Oct 2, 2018

I find myself in need of this as well, but in my case it's not for POST requests but for a GET request that returns a bunch of non-model objects, and I need a bunch of filter parameters that are not directly related to the fields on the objects.

I'm not sure if I'm doing something weird or missing something entirely, as I can't be the first to have an endpoint that is not model CRUD and needs to generate its documentation from the endpoint.

I can just retrieve the arguments from the request object, but I'm trying to have my API be self-documenting by using the API documentation generator in drf, or the drf-yasg package, hence my reason for wanting to use the serializers for specifying the parameters.
Is there another way to describe the endpoint parameters besides serializers?

Sorry if this is not within the scope of this issue.

@frenetic

frenetic commented Oct 2, 2018

Previously I said I was going to use https://github.com/vintasoftware/drf-rw-serializers.
After 6 months using it, it has been really helpful. For most problems stated here, this library will help.

@Moulde Take a look at the lib I'm using. If it does not suit you, try implementing the method get_serializer_class - http://www.django-rest-framework.org/api-guide/generic-views/#get_serializer_classself.

@foresmac
Author

foresmac commented Oct 2, 2018

@frenetic This seems to solve a problem that `get_serializer_class` can already solve most of the time. What @Moulde (who presents a slightly different use-case than I previously mentioned) is saying is that there are times when you want/need to use different serializers for the Request vs the Response. And being able to take advantage of automatic documentation is an important consideration here in my mind, beyond the desire to avoid boilerplate code.

@Moulde

Moulde commented Jan 22, 2019

Yes, basically I think DRF has less than ideal support for non-model views, where you want to use a serializer to describe the interface so that the automatic documentation can be generated.
An example could be an endpoint for filtering/searching (external) non-model data.

@joshowen

joshowen commented Jul 13, 2019

there are times when you want/need to use different serializers for the Request vs the Response

This was a big reason to use https://github.com/limdauto/drf_openapi (when it was still maintained). It would be awesome if it was possible to differentiate request and response schemas in django-rest-framework. There are a lot of times we're extending another api which has these characteristics.

@matthewarmand

I do think that having APIView support for different serializers for input and output would be a killer feature. It might even be easy to implement; for example a parameter could be supplied to get_serializer so that get_serializer_class knows the context (input vs output). This seems like it'd be fairly unobtrusive, and would allow for apps needing this functionality to follow the usual advice of "override get_serializer_class".
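
As a rough illustration of that idea, the sketch below passes a direction through get_serializer so that an overridden get_serializer_class can branch on it. This is not an existing DRF feature: `BaseView` is a hypothetical stand-in for a generic view (not DRF's GenericAPIView), the `direction` parameter is invented for the example, and the two serializer classes are placeholders.

```python
class BaseView:
    """Hypothetical stand-in for a generic view; not DRF's GenericAPIView."""
    serializer_class = None

    def get_serializer_class(self, direction=None):
        # Default behaviour: one serializer class for both directions.
        return self.serializer_class

    def get_serializer(self, *args, direction=None, **kwargs):
        # Forward the (invented) direction so overrides can branch on it.
        serializer_class = self.get_serializer_class(direction=direction)
        return serializer_class(*args, **kwargs)


class InputSerializer:
    """Placeholder for the serializer that validates request data."""
    def __init__(self, data=None):
        self.data = data


class OutputSerializer:
    """Placeholder for the serializer that renders the response."""
    def __init__(self, instance=None):
        self.instance = instance


class ThingView(BaseView):
    serializer_class = OutputSerializer

    def get_serializer_class(self, direction=None):
        # The usual "override get_serializer_class" advice, now direction-aware.
        if direction == "input":
            return InputSerializer
        return super().get_serializer_class(direction=direction)
```

Views that never pass a direction keep the single-serializer behaviour, which is what would make such a change unobtrusive.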

In the absence of that, I wanted to highlight a couple current django-rest-framework features I didn't see in this thread that might help someone else coming through here later on. If it's workable for you to include all your desired input and output fields into one Model, you can use a ModelSerializer to specify read-only and write-only fields such that you effectively get different input and output schemas.
(Versions: djangorestframework 3.10.3, Django 2.2.5, Python 3.7.4)

So a Model like this:

from django.db import models

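class ThingModel(models.Model):
    # CharField requires max_length; 255 is an arbitrary illustrative value.
    input_field = models.CharField(max_length=255)
    output_date = models.DateTimeField()
    output_id = models.IntegerField()
    output_string = models.CharField(max_length=255)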

And a Serializer like this:

from rest_framework import serializers
from .models import ThingModel

class ThingSerializer(serializers.ModelSerializer):
    class Meta:
        model = ThingModel
        fields = [
            "input_field",
            "output_date",
            "output_id",
            "output_string",
        ]
        read_only_fields = [
            "output_date",
            "output_id",
            "output_string",
        ]
        extra_kwargs = {"input_field": {"write_only": True}}

Will produce the following OpenAPI docs when generateschema is used:

...
  /urlpath/:
    post:
      operationId: CreateThing
      parameters: []
      requestBody:
        content:
          application/json:
            schema:
              properties:
                input_field:
                  type: string
                  writeOnly: true
              required:
              - input_field
      responses:
        '200':
          content:
            application/json:
              schema:
                properties:
                  output_date:
                    type: string
                    format: date-time
                    readOnly: true
                  output_id:
                    type: integer
                    readOnly: true
                  output_string:
                    type: string
                    readOnly: true
                required: []
          description: ''

This helps when you want one endpoint, within the context of a single method (eg POST), to have different input and output schemas, if the whole object can be expressed as a Model. It'd be great to have similar functionality without having to use a ModelSerializer.

@fronbasal

What is the "state of the art" solution for this problem? Has there been any progress since 2014?

@matthewarmand

@fronbasal We've been accomplishing this with Serializers by using write_only=True and read_only=True on each serializer Field as appropriate. I'm not sure what the generated Swagger docs look like for that, but it does accomplish different schemas for input and output to a sufficient extent for our purposes.

I'm not sure whether the maintainers have a different/better solution in mind

@fronbasal

@matthewarmand All right, thank you very much!

@michaelhenry

michaelhenry commented Jun 6, 2020

@fronbasal I have come up with a simple solution: just pass the saved model instance to another serializer.

Here is an example:

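from rest_framework import status
from rest_framework.exceptions import ValidationError
from rest_framework.response import Response


def create(self, request):
    # Validate the request body with the input serializer.
    serializer = InputSerializer(data=request.data)
    if not serializer.is_valid():
        return Response(
            {'errors': serializer.errors},
            status=status.HTTP_400_BAD_REQUEST)
    try:
        # Save, then render the new instance with the output serializer.
        output_serializer = OutputSerializer(serializer.save())
    except Exception as e:
        raise ValidationError({"detail": e})
    return Response(output_serializer.data, status=status.HTTP_201_CREATED)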

Cheers

@fronbasal

@michaelhenry That's a really clever solution! Love it.

@fechnert

@cyclecycle

If read_only doesn't provide the control you need, and you want to customize the input or output validation logic, you should override to_internal_value() or to_representation(), respectively.

Thank you, this just saved my day. My use case is that I want to display more data about an object's owner when retrieving it, but only supply the owner's id when creating new entries.

This was rather tedious to accomplish, because using nested serializers (even when providing all of the required data) raises the following AssertionError:

AssertionError: The `.create()` method does not support writable nested fields by default.

So instead I had to override to_internal_value() so that the request data is treated as containing the user instance, while the response contains the nested owner information.

See the code for implementation details:
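from rest_framework import serializers
from rest_framework.exceptions import ValidationError

# User and SomeModel are the project's own models; their imports are not shown.


class OwnerSerializer(serializers.Serializer):
    """Serializer for the related/nested owner information"""

    id = serializers.UUIDField()
    username = serializers.CharField()

    def to_internal_value(self, data):
        # On input, accept only the owner's UUID and resolve it to a User.
        if not isinstance(data, str):
            raise ValidationError("This value needs to be the owner's UUID")

        return User.objects.get(id=data)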


class SomeModelSerializer(serializers.ModelSerializer):
    """The actual model with a relation to a user as owner"""

    owner = OwnerSerializer()

    class Meta:
        model = SomeModel
        fields = [
            'title',
            'owner',
            ...
        ]

The request to create a new object can look like this:

{
    "title": "Some title",
    "owner": "9c3faee7-2ce1-4b86-a58c-45f70a042125"
}

While the response contains all desired nested information about the related owner after creation

{
    "title": "Some title",
    "owner": {
        "id": "9c3faee7-2ce1-4b86-a58c-45f70a042125",
        "username": "some_user0815"
    }
}

@ulgens

ulgens commented Apr 5, 2022

I'd recommend an even cleaner way (for view layer):

    def create(self, request, *args, **kwargs):
        # self.get_serializer_class() returns the correct serializer for writing
        input_serializer = self.get_serializer(data=request.data)  
        input_serializer.is_valid(raise_exception=True)
        self.perform_create(input_serializer)

        output_serializer = ReadSerializer(input_serializer.instance)
        headers = self.get_success_headers(output_serializer.data)
        return Response(output_serializer.data, status=status.HTTP_201_CREATED, headers=headers)

@sshishov
Contributor

sshishov commented Aug 2, 2023

Can we include the work done in drf-rw-serializers in the core, at least for viewsets? It would be great to distinguish between read and write serializers; when that distinction is not needed, we can fall back to get_serializer_class, which in turn falls back to the serializer_class attribute by default.

People are adding a lot of workarounds here just to solve the issue with data and to generate proper OpenAPI schemas.
It would also be great for DRF core to include "filter serializers" for proper schema generation, as well as "error serializers", since currently all of this has to be done by hand, for instance in drf-spectacular.
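
The fallback chain described above can be sketched without any framework as follows. This is only an illustration: the mixin and its `read_serializer_class` / `write_serializer_class` attributes are hypothetical names chosen for the example and have not been checked against the actual API of drf-rw-serializers or DRF.

```python
class ReadWriteSerializerMixin:
    """Hypothetical mixin; names are illustrative, not a real DRF API."""
    serializer_class = None
    read_serializer_class = None
    write_serializer_class = None

    def get_serializer_class(self):
        # The ordinary hook, which defaults to the serializer_class attribute.
        return self.serializer_class

    def get_read_serializer_class(self):
        # Use the dedicated read serializer, else fall back to the usual hook.
        return self.read_serializer_class or self.get_serializer_class()

    def get_write_serializer_class(self):
        # Use the dedicated write serializer, else fall back to the usual hook.
        return self.write_serializer_class or self.get_serializer_class()


class DefaultSerializer:
    """Placeholder serializer class."""


class WriteSerializer:
    """Placeholder serializer class used only for input."""


class ThingView(ReadWriteSerializerMixin):
    # Only the write side is specialised; reads use the default serializer.
    serializer_class = DefaultSerializer
    write_serializer_class = WriteSerializer
```

A view that sets neither of the two extra attributes behaves exactly as it does today, so the distinction would be opt-in.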
