Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request.data empty when multipart/form-data POSTed #3951

Closed
CaffeineFueled opened this issue Feb 18, 2016 · 55 comments
Closed

Request.data empty when multipart/form-data POSTed #3951

CaffeineFueled opened this issue Feb 18, 2016 · 55 comments

Comments

@CaffeineFueled
Copy link

CaffeineFueled commented Feb 18, 2016

Django==1.9.2
djangorestframework==3.3.2

This is essentially a repost of #3814. There is already some research and discussion attached to that issue that I will not rehash, but the author closed out the item once a workaround was found.

Is that workaround the long term solution? Is the answer to never touch the Django Request before DRF has a chance to do its thing? This can be difficult to control in projects with third party middleware.

The issue seems to be that, when POSTing data, Django is exhausting the underlying request stream if the WSGIRequest is accessed prior to initialize_request in the DRF View dispatcher. PUTs seem to be unaffected.

A totally contrived example that demonstrates the issue. First example, presuming no middleware Request meddling...

class TestView(APIView):

    def post(self, request, *args, **kwargs):
        # request.data, .FILES, and .POST are populated
        # request._request.FILES and .POST are empty
        return Response()

    def put(self, request, *args, **kwargs):
        # request.data, .FILES, and .POST are populated
        # request._request.FILES and .POST are empty
        return Response()

Second example, forcibly evaluating the Django request before dispatching.

class TestView(APIView):

    def dispatch(self, request, *args, **kwargs):
        p = request.POST  # Force evaluation of the Django request
        return super().dispatch(request, *args, **kwargs)

    def post(self, request, *args, **kwargs):
        # request.data, .FILES, and .POST are empty
        # request._request.FILES and .POST are populated
        return Response()

    def put(self, request, *args, **kwargs):
        # request.data, .FILES, and .POST are populated
        # request._request.FILES and .POST are empty
        return Response()

Cheers!

@foslock
Copy link

foslock commented Feb 18, 2016

+1 here. The workaround for the previously closed issue is not a solution for the issue at hand. The truth is that there very well is a use case for accessing the Django POST QueryDict in middleware before DRF can consume it.

@xordoquy
Copy link
Collaborator

xordoquy commented Feb 18, 2016

As there is for streaming content too which breaks the use of middleware introspecting the request content.
Moreover, if a middleware do look at the request content, it means it'll bypasses at least partially the content negotiation. At this point, it's probably a good option not to use DRF views and explicitly feed the serializer if required.

@xordoquy
Copy link
Collaborator

xordoquy commented Feb 19, 2016

I'd also be interested in learning a bit more in your solution about this issue - in particular how to preserve the compatibility with the current behavior.

@CaffeineFueled
Copy link
Author

CaffeineFueled commented Feb 21, 2016

The closest I've come to a "solution" is to add an additional check in the _parse method of the DRF Request:

...
        try:
            parsed = parser.parse(stream, media_type, self.parser_context)
            # If Django got to the POST/FILES data before we did, just use theirs.
            if (len(parsed.data) + len(parsed.files)) < 1:
                parsed.data = self._request.POST
                parsed.files = self._request.FILES
        except:
            # If we get an exception during parsing, fill in empty data and
...

I'm not sure how dangerous it is to use Django's POST data directly like this (rather than making a copy). They seem to be essentially read only once parsed.

While this retains the current Django WSGIRequest behavior, it leaves it in an inconsistent state insofar as POSTs may or may not have POST/FILE data depending on whether or not the request was touched prior to DRF Request parsing.

@rsalmaso
Copy link
Contributor

rsalmaso commented Feb 24, 2016

Just hit this on production today...
As workaround, I've put a check in our base GenericAPIView instead of custom middleware
(for compatibility must keep both override methods)

    def initialize_request(self, request, *args, **kwargs):
       request = super().initialize_request(request, *args, **kwargs)
       if request.path.startswith("/api/3/") and request.method == "POST":
          request.method = request.META.get("HTTP_X_HTTP_METHOD_OVERRIDE", request.POST.get("_method", request.method)).upper()
       return request

@alukach
Copy link
Contributor

alukach commented Mar 14, 2016

Also running into this problem. It appears to have been introduced in 3.2 (was not experiencing it while using 3.1.1).

Does anyone know if there is a way to reset the request object to its original state after examination or possibly duplicate the object without exhausting the POST data stream?

It feels like DRF should be capable of using an already-examined WSGIRequest.

@dannybrowne86
Copy link

dannybrowne86 commented Mar 22, 2016

+1
I have run into a similar issue. I created a new nested_route decorator that acts a lot like list_route and detail_route. I usually redirect to another viewset to handle the nested path. The issue was that the stream was being read too early (by the parent viewset dispatch), so I needed to ensure that the initialize_request function was only called once for a given request.

Working off of @rsalmaso's comment above, I overrode the initialize_request method in the children viewsets with the following.

def initialize_request(self, request, *args, **kwargs):
        if not isinstance(request, Request):
            request = super().initialize_request(request, *args, **kwargs)
        return request

This works fine, but I think having some sort of attribute for either ignoring the initialize_request function within dispatch or having a global check to only run initialize_request iff isinstance(request, rest_framework.request.Request) == False. I'm happy to prepare the PR with tests if that could be acceptable.

@jab3z
Copy link

jab3z commented Mar 25, 2016

I have the same issue. For now I have reverted to drf 3.2

@LouWii
Copy link

LouWii commented Mar 30, 2016

I've spent almost 3 hours searching about why my data wouldn't be parsed correctly. I first thought the issue was on my Angular App code. And I finally found this issue. Damn, almost banged my head on my desk.

The only work around working for me is getting back to 3.2.5. I'm surprised this issue has no real fix ?

@xordoquy
Copy link
Collaborator

xordoquy commented Mar 30, 2016

This shows what open source really is.

To get this fixed, the first step is to have someone take time and have a deep analysis of when / why and how the behavior change has been introduced. An initial analysis was performed on #3814 though it requires confirmation that the mentioned change did affect the behavior and then identify how the behavior change has been introduced.

Before that, I won't be able to consider the various workarounds as a solution.

@CaffeineFueled
Copy link
Author

CaffeineFueled commented Apr 1, 2016

Here's what I've managed to track down.

The change definitely occurred in commit 566812a as part of removing method and content overriding. Prior to that change, in request.py, the _load_data_and_files method called _load_method_and_content_type which called _perform_form_overloading. Within _perform_form_overloading was a bit of code that seeded the DRF Request._data attribute with Request._request.POST (which is the Django WSGIRequest POST).

        # At this point we're committed to parsing the request as form data.
        self._data = self._request.POST
        self._files = self._request.FILES
        self._full_data = self._data.copy()
        self._full_data.update(self._files)

This means that, back in _load_data_and_files, _parse was probably never called for multipart POSTs (at least not in the default configuration) and thus DRF did not attempt to re-read the request stream.

In more recent versions of DRF, _load_data_and_files no longer calls _load_method_and_content_type as that method no longer exists, so now execution is falling through to the call to _parse. Herein lies the problem.

For multipart POSTs, _parse is using a DRF MultiPartParser which is using a Django MultiPartParser which is attempting to re-read the request stream which has previously been exhausted.

So, another potential solution would be to just allow Django to handle the parsing of multipart forms. This could be achieved by replacing the MultiPartParser.parse method with something more along the lines of:

    def parse(self, stream, media_type=None, parser_context=None):
        """
        Parses the incoming bytestream as a multipart encoded form,
        and returns a DataAndFiles object.

        `.data` will be a `QueryDict` containing all the form parameters.
        `.files` will be a `QueryDict` containing all the form files.
        """
        parser_context = parser_context or {}
        request = parser_context['request']
        _request = request._request
        if _request.method == 'POST':
            return DataAndFiles(_request.POST, _request.FILES)
        encoding = parser_context.get('encoding', settings.DEFAULT_CHARSET)
        meta = request.META.copy()
        meta['CONTENT_TYPE'] = media_type
        upload_handlers = request.upload_handlers

        try:
            parser = DjangoMultiPartParser(meta, stream, upload_handlers, encoding)
            data, files = parser.parse()
            return DataAndFiles(data, files)
        except MultiPartParserError as exc:
            raise ParseError('Multipart form parse error - %s' % six.text_type(exc))

This essentially grabs the WSGIRequest that underlies the DRF Request and allows it to perform the request parsing. What I don't know is, are there any conditions under which Django would use another parser when DRF is expecting a MultiPartParser?

Edit: Updated "fix" code. Forgot that Django really only parses POSTs for multipart forms. Also wanted to note that I'm working on tests and eventually a pull request.

@totycro
Copy link

totycro commented Jun 27, 2016

I'm depending on this fix. Are there already plans on when to release it?

@vakorol
Copy link

vakorol commented Jun 29, 2016

I experience the same in 3.3.3. Had to roll back to 3.2.4 on Django 1.8.13. A patch for this issue is highly anticipated.

@bolgradov-modata
Copy link

bolgradov-modata commented Jun 29, 2016

I met the same issue today. Doing PUT instead of POST is really the solution for now.

@tomchristie tomchristie added this to the 3.4.0 Release milestone Jul 4, 2016
@tomchristie
Copy link
Member

tomchristie commented Jul 4, 2016

I'll put it on the list, but no promises if it'll make the 3.4.0 cut or not. We'll see.

@hvdklauw
Copy link

hvdklauw commented Jul 15, 2016

I did some digging on this too as I ran into it as well.

(This is django 1.8)
The DjangoMultiPartParser loops through the upload_handlers and called handle_raw_input which should return something, but none of them do.

The MemoryFileUploadHandler just set's activated to True and the TemporaryFileUploadHandler doesn't even implement it.

So maybe the handlers being passed are the wrong one, or rest framework is using the DjangoMultiPartParser wrong.

@vabada
Copy link

vabada commented Jul 26, 2016

+1 Answering @alukach, I have the same issue but only when updating to 3.3.0. With 3.2.0 works fine (I had to downgrade)

@tomchristie
Copy link
Member

tomchristie commented Oct 10, 2016

Something that demonstrates the issue against the tutorial project would be tremendous: https://github.com/tomchristie/rest-framework-tutorial

@tomchristie tomchristie removed this from the 3.4.8 Release milestone Oct 10, 2016
@alukach
Copy link
Contributor

alukach commented Oct 10, 2016

I was able to confirm the fix with this repo: https://github.com/alukach/drf-test (single commit here)

Tests failed with djangorestframework==3.4.6 but passes with djangorestframework==3.4.7.

@tomchristie tomchristie reopened this Oct 11, 2016
@tomchristie
Copy link
Member

tomchristie commented Oct 11, 2016

Thanks @alukach - will look into that!

@tomchristie tomchristie added this to the 3.5.0 Release milestone Oct 11, 2016
@tomchristie
Copy link
Member

tomchristie commented Oct 12, 2016

Sorry, read the previous comment incorrectly. Let's try that again...

I was able to confirm the fix

Wonderful, let's close this off then unless there's any further evidence that this is still an issue against master.

@rjpruitt16
Copy link

rjpruitt16 commented Jun 20, 2018

Django Rest API unable to Parse Form data using viewsets.ModelViewSet (MultiPartParser, FormParser)


class UserViewSet(viewsets.ModelViewSet):
    """
    API endpoint that allows users to be viewed or edited.
    """
    queryset = User.objects.all()
    parser_classes = (MultiPartParser, FormParser)
    serializer_class = UserSerializer

screen shot 2018-06-20 at 6 24 00 pm

screen shot 2018-06-20 at 6 24 55 pm

@hvdklauw
Copy link

hvdklauw commented Jun 21, 2018

Weird that it doesn't throw an error on email, can you add the UserSerializer code?

@xordoquy
Copy link
Collaborator

xordoquy commented Jun 21, 2018

We will need the simplest test case to work on this as it is usually known to be working.

@rjpruitt16
Copy link

rjpruitt16 commented Jun 21, 2018

class UserSerializer(serializers.HyperlinkedModelSerializer):
    snapcapsules = SnapCapsuleSerializer(
        many=True,
        read_only=True,
        allow_null=True,
    )

    class Meta:
        model = User
        fields = ('snapcapsules', 'url', 'username', 'email', 'password', )
        write_only_fields = ('password',)

    def create(self, validated_data):
        user = User.objects.create(
            username=validated_data['username'],
            email=validated_data['email'],
        )

        user.set_password(validated_data['password'])
        user.save()

        return user

    def update(self, instance, validated_data):
        capsules_data = validated_data.pop('snapcapsules')

        for capsule in capsules_data:
             SnapCapsule.objects.create(user=instance, **capsule)

        return instance

@rjpruitt16
Copy link

rjpruitt16 commented Jun 21, 2018

Maybe you could teach me to write the test

@rpkilby
Copy link
Member

rpkilby commented Jun 21, 2018

Hi @rjpruitt16. General advice is to reduce this to the minimum amount of code to reproduce the test case. For example, are all of the fields required to replicate the bug? Do you need to override both the create and update methods? Do you need the nested serializer?

From there, the minimal test case is just the necessary inputs to generate the current, unexpected result (inputs such as test data or an entire request object). Then, compare that against your expected result.

The test doesn't need to be perfect - it just needs to demonstrate the failure.

@SuperMasterBlasterLaser

I have this issue too:

class ImagesViewSet(ModelViewSet):
    serializer_class = ImageSerializer
    queryset = Image.objects.all()
    permission_classes = (IsAuthenticated,)

    parser_classes = (MultiPartParser, FormParser)

    @action(methods=['post'], detail=False,
        permission_classes=[IsAuthenticated])
    def upload_avatar(self, request):
        print(request.data)
        return Response({"image": "ok"})

After making this request:
image

My Query dict returns:

<QueryDict: {}>

Two years passed since this issue.

@rpkilby
Copy link
Member

rpkilby commented Jun 23, 2018

Two years passed since this issue.

All indications are that this was fixed in version 3.4.7. If you're still having an issue, the best thing you can do is provide an example test case. You might start by using the example tests provided by @alukach.

https://github.com/alukach/drf-test/commit/77f0e540da42f295197c888b0985c782439d6917

@rjpruitt16
Copy link

rjpruitt16 commented Jun 23, 2018

I figure out the problem. I am not sure if my headers were wrong, but in PostMan, If I do not provide any headers than it works perfectly.

You can see my Post on Stackoverflow here
https://stackoverflow.com/questions/50959137/django-rest-framework-unable-to-parse-multipart-form-data

joshuarli pushed a commit to getsentry/sentry that referenced this issue Oct 7, 2019
our middleware src/sentry/utils/linksign.py when request.POST is accessed,
django multipartparser is invoked and the underlying WSGIRequest buf is exhausted before it can get to DRF.
WSGIRequest._files is still accessible, but they aren't read back into the buf on the 2nd time.

according to encode/django-rest-framework#3951, the fix should be in drf 3.4.7.
@ydaniels
Copy link

ydaniels commented Oct 26, 2019

problem still persist even with latest DRF when using ModelViewSet

@xordoquy
Copy link
Collaborator

xordoquy commented Oct 27, 2019

@ydaniels please provide a minimal test case that demonstrate the issue. What I get from the few exchanges here is that it was not an issue with DRF.

@ydaniels
Copy link

ydaniels commented Oct 27, 2019

@xordoquy I am not sure this would do but i made quick one pure drf and django https://github.com/ydaniels/drf-upload-test . It works perfectly using the ViewSet class but not with ModelViewSet class so the issue should probably be from DRF.

@rpkilby
Copy link
Member

rpkilby commented Oct 28, 2019

Hi @ydaniels. The ImageModelViewSet is implemented incorrectly. I would recommend reading the below to understand the relationship between create and perform_create.

class CreateModelMixin:
"""
Create a model instance.
"""
def create(self, request, *args, **kwargs):
serializer = self.get_serializer(data=request.data)
serializer.is_valid(raise_exception=True)
self.perform_create(serializer)
headers = self.get_success_headers(serializer.data)
return Response(serializer.data, status=status.HTTP_201_CREATED, headers=headers)
def perform_create(self, serializer):
serializer.save()

You either need to rename ImageModelViewSet.perform_create to ImageModelViewSet.create or since the implementation is largely identical to the CreateModelMixin, you could delete the method entirely.

@ydaniels
Copy link

ydaniels commented Oct 29, 2019

@rpkilby noted Thanks

@Routhinator
Copy link

Routhinator commented Mar 18, 2020

@xordoquy I believe I have a case of this not working in DRF 3.11.0 - not sure if this is a regression or I'm missing something, but after a few days of reviewing my code I'm fairly positive I have everything configured correctly, and yet request.data is always empty.

I've got a stack overflow open here that has all the code snippets.. https://stackoverflow.com/questions/60741507/drf-serializer-update-method-always-has-empty-validated-data-object-when-using-p

I'm going to hold off opening a new issue until it's confirmed that this is a new bug.

I'm using Django 2.2 and DRF 3.11.0 - basic modelviewset and a serializer. This should be working.

The post body from angular is in the issue as well.

The one difference is in my case I am calling PATCH and not POST.

@Routhinator
Copy link

Routhinator commented Mar 19, 2020

Ok, the boundary= part missing from content-type is what was causing this. So I actually have an angular issue, not a django issue.

@encode encode deleted a comment from ABDURAZZOQ-ABDUSALOMOV Jul 5, 2022
@encode encode deleted a comment from ABDURAZZOQ-ABDUSALOMOV Jul 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests