content type header is null #1469

Closed
timmyschweer opened this issue Jul 19, 2013 · 10 comments

Comments

@timmyschweer

  1. What am I doing?
    Using Python requests to upload very large files to a Java Restlet that uses "org.apache.commons.fileupload" to receive the upload.
  2. Python source code:
import requests
url = "http://127.0.0.1:8888/service/rest/project/27/uploadfile/filename.zip"
filepath = "/home/user/3gb-file.zip"
session = requests.Session()
session.trust_env = False
session.auth = requests.auth.HTTPDigestAuth("user", "mypassword")
fpFile = open(filepath, 'rb')
session.post(url, data=fpFile, timeout=20)
fpFile.close()
  3. Java exception message:
[INFO] org.apache.commons.fileupload.FileUploadBase$InvalidContentTypeException: the request doesn't contain a multipart/form-data or multipart/mixed stream, content type header is null
[INFO] [ERROR] Jul 19, 2013 2:11:11 PM org.restlet.engine.log.LogFilter afterHandle
        at org.apache.commons.fileupload.FileUploadBase$FileItemIteratorImpl.<init>(FileUploadBase.java:908)
[INFO]  at org.apache.commons.fileupload.FileUploadBase.getItemIterator(FileUploadBase.java:331)

I checked the request headers via the Response object, and there really is no "Content-Type" header. I'm using the current requests release, 1.2.3, with Python 2.7.2.

@Lukasa
Member

Lukasa commented Jul 19, 2013

Hi @timmyschweer, thanks for opening this issue!

You should upload this as a file, rather than as a single giant data string. Try using:

import requests
url = "http://127.0.0.1:8888/service/rest/project/27/uploadfile/filename.zip"
filepath = "/home/user/3gb-file.zip"
session = requests.Session()
session.trust_env = False
session.auth = requests.auth.HTTPDigestAuth("user", "password")
files = {'file': open(filepath, 'rb')}
session.post(url, files=files, timeout=20)

@timmyschweer
Author

Hi @Lukasa, thank you for your reply =)
I already tried the approach you suggest. It works for small files, but for large ones Python crashes once memory usage hits 99%.
The idea was to use a streaming upload so the whole file isn't copied into memory:
http://docs.python-requests.org/en/latest/user/advanced/#streaming-uploads

@Lukasa
Member

Lukasa commented Jul 19, 2013

Ah, ok. In that case we don't provide a Content-Type header because we don't know what the body of the message actually is. You'll need to set the Content-Type header yourself. =)

@timmyschweer
Author

@Lukasa I already tried that! Then the boundary is missing =/

@Lukasa
Member

Lukasa commented Jul 19, 2013

What are you setting the Content-Type header to?

@timmyschweer
Author

  1. This raises a "missing boundary" exception on the Restlet side:
session.post(url, data=fpFile, headers={'Content-Type':'multipart/form-data' })
  2. Using a boundary stolen from another POST request, "org.apache.commons.fileupload" can't find any file item:
session.post(url, data=fpFile, headers={'Content-Type':'multipart/form-data; boundary=fe3ea15cf3314b448addc215130893a0' })

Seems like "org.apache.commons.fileupload" is not capable of receiving data this way, huh? :'(

@Lukasa
Member

Lukasa commented Jul 19, 2013

Nah, it's that you're lying to it! =D

You aren't sending multipart/form-data. Chunked encoding is sending the literal contents of the file, just in chunks. Your Content-Type header should be the actual MIME type of the data you're sending. For example, for a zip file, it should be application/zip.
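
As a rough illustration of that advice, here is a minimal sketch of a streaming upload that labels the body with the file's actual MIME type. The helper names `guess_content_type` and `stream_upload` are hypothetical, and `session` is assumed to be a `requests.Session` configured as in the original report; only the standard-library `mimetypes` lookup and the `session.post(url, data=..., headers=..., timeout=...)` call come from real APIs.

```python
import mimetypes


def guess_content_type(path, default="application/octet-stream"):
    """Guess a MIME type from the file extension (hypothetical helper)."""
    content_type, _encoding = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown.
    return content_type or default


def stream_upload(session, url, path, timeout=20):
    """POST the raw file contents, passing the open file object as the body."""
    headers = {"Content-Type": guess_content_type(path)}
    # The file object is handed to the session as-is, so the body is read
    # from disk while sending instead of being loaded into memory first.
    with open(path, "rb") as fp:
        return session.post(url, data=fp, headers=headers, timeout=timeout)
```

Called as `stream_upload(session, url, filepath)`, this sends the bytes of the file and nothing else, so the receiving side has to read the request body directly rather than run it through a multipart parser.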

@timmyschweer
Author

@Lukasa thank you for your brain muscle action!
I investigated the Apache Commons FileUpload class and it ONLY accepts multipart uploads with a boundary. Since I can't write a FileUpload class that handles streams without a boundary, I don't have a solution for my problem.

Is there another way of using

files = {'file': open(filepath, 'rb')}
session.post(url, files=files, timeout=20)

without copying the file to memory?

@Lukasa
Member

Lukasa commented Jul 19, 2013

Re my brain: I'm always happy to help! It's what I'm here for. =)

Unfortunately, in vanilla Requests the answer is simply no.

Making this work is a fairly major diversion from how Requests works. If you wanted this to work you'd need to submit an improvement to urllib3. This improvement would fundamentally change the way urllib3 handles multipart data.

It is not impossible that we'll consider this as part of our work on Requests 2.0, as it would certainly be an awesome feature. However, it's not in the immediate plan. Sorry. =(
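
For readers who still need a multipart body without holding the file in memory, one possible workaround is to build the multipart framing by hand and pass a generator as `data`. This is only a sketch and not something Requests provides: `multipart_file_stream` and its parameters are hypothetical names, it handles exactly one file part, and it does not escape quotes in the filename. It also assumes the server accepts a chunked request body, since Requests sends generator bodies with chunked transfer encoding.

```python
import os
import uuid


def multipart_file_stream(path, field_name="file",
                          content_type="application/octet-stream",
                          chunk_size=64 * 1024):
    """Return (Content-Type header value, body generator) for one file part."""
    boundary = uuid.uuid4().hex
    filename = os.path.basename(path)

    def body():
        # Opening delimiter followed by the headers of the single part.
        yield (
            "--{0}\r\n"
            'Content-Disposition: form-data; name="{1}"; filename="{2}"\r\n'
            "Content-Type: {3}\r\n"
            "\r\n"
        ).format(boundary, field_name, filename, content_type).encode("utf-8")
        # File contents, read one chunk at a time so memory use stays flat.
        with open(path, "rb") as fp:
            while True:
                chunk = fp.read(chunk_size)
                if not chunk:
                    break
                yield chunk
        # Closing delimiter that terminates the multipart body.
        yield "\r\n--{0}--\r\n".format(boundary).encode("utf-8")

    return "multipart/form-data; boundary={0}".format(boundary), body()


# Possible usage, assuming `session`, `url` and `filepath` from the report:
#   header, body = multipart_file_stream(filepath, content_type="application/zip")
#   session.post(url, data=body, headers={"Content-Type": header}, timeout=20)
```

A generator body can only be consumed once, so an auth scheme that resends the request after a 401 challenge, such as the digest auth used in the report, may not work with this approach.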

@timmyschweer
Author

Okay, I would never have thought it would be that simple... Depending on the media type, I now either use the file upload multipart parser or write the HTTP stream directly to disk.

@Lukasa thx for your help!

Here's my Java Restlet code:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.FileUploadException;
import org.apache.commons.fileupload.disk.DiskFileItem;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.util.Streams;
import org.restlet.data.MediaType;
import org.restlet.data.Status;
import org.restlet.ext.fileupload.RestletFileUpload;
import org.restlet.representation.Representation;
import org.restlet.representation.Variant;
import org.restlet.resource.ResourceException;
import org.restlet.resource.ServerResource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class FileUploadResource extends ServerResource {

    private static final Logger LOG = LoggerFactory.getLogger(FileUploadResource.class);

    private String fileName;

    private final DiskFileItemFactory diskFileItemFactory;

    public FileUploadResource() {
        super();
        this.diskFileItemFactory = new DiskFileItemFactory();
        // always write file to disk independent from the size
        this.diskFileItemFactory.setSizeThreshold(0);
    }

    @Override
    protected void doInit() {
        this.fileName = this.getAttribute("file_name");
    }

    private List<FileItem> parseMultiPartUpload(final Representation representation) {
        final List<FileItem> fileItemList = new ArrayList<FileItem>();
        // necessary to parse the Representation to FileItem objects
        final RestletFileUpload fileUpload = new RestletFileUpload(this.diskFileItemFactory);
        try {
            fileItemList.addAll(fileUpload.parseRepresentation(representation));
        } catch (final FileUploadException e) {
            LOG.error("parseMultiPartUpload(): error occurred when parsing representation", e);
            throw new ResourceException(Status.CLIENT_ERROR_BAD_REQUEST, e);
        }
        return fileItemList;
    }

    private List<FileItem> parseUploadStream(final Representation representation) {
        final List<FileItem> fileItemList = new ArrayList<FileItem>();
        String mediaType = "";
        if (representation.getMediaType() != null) {
            mediaType = representation.getMediaType().toString();
        }
        final FileItem fileItem = this.diskFileItemFactory
                .createItem("file", mediaType, false, this.fileName);
        try {
            Streams.copy(representation.getStream(), fileItem.getOutputStream(), true);
            fileItemList.add(fileItem);
        } catch (final IOException e) {
            LOG.error("parseUploadStream(): failed reading upload stream", e);
            throw new ResourceException(Status.CLIENT_ERROR_BAD_REQUEST, e);
        }
        return fileItemList;
    }

    @Override
    public Representation post(final Representation representation, final Variant variant) {
        LOG.debug("post(): adding file variant {}", variant);
        return this.post(representation);
    }

    @Override
    public Representation post(final Representation representation) {
        LOG.debug("post():");
        // all items that will be found
        final List<FileItem> allItems = new ArrayList<FileItem>();
        // list of items that are real files not form fields
        final List<FileItem> fileItems = new ArrayList<FileItem>();
        if (representation.getMediaType() != null
                && representation.getMediaType().isCompatible(MediaType.MULTIPART_ALL)) {
            allItems.addAll(this.parseMultiPartUpload(representation));
        } else {
            allItems.addAll(this.parseUploadStream(representation));
        }
        for (final FileItem fileItem : allItems) {
            if (!fileItem.isFormField()) {
                fileItems.add(fileItem);
            }
        }
        // VALIDATING 1 FILE IS UPLOADED
        if (fileItems.size() == 0) {
            LOG.error("post(): no file item was found.");
            throw new ResourceException(Status.CLIENT_ERROR_EXPECTATION_FAILED, "no file item was found.");
        } else if (fileItems.size() > 1) {
            LOG.error("post(): more than one file item was found.");
            throw new ResourceException(Status.CLIENT_ERROR_EXPECTATION_FAILED,
                    "more than one file item was found.");
        }
        final DiskFileItem diskFileItem = (DiskFileItem) fileItems.get(0);
        if (diskFileItem.getSize() == 0) {
            LOG.warn("post(): file '{}' is empty and won't be added.", this.fileName);
            throw new ResourceException(Status.CLIENT_ERROR_UNPROCESSABLE_ENTITY, "uploaded file was empty");
        } else {
            // DO WHAT YOU WANT WITH THE FILE
        }
        return representation;
    }
}

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 9, 2021