Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for multipart/form-data? #922

Open
omani opened this issue Jul 28, 2017 · 11 comments
Open

Support for multipart/form-data? #922

omani opened this issue Jul 28, 2017 · 11 comments
Labels
enhancement a feature, ready for implementation

Comments

@omani
Copy link

omani commented Jul 28, 2017

hi,

are there any plans to support multipart/form-data uploads? I need to upload raw binary. since json is encoding it to base64 I am not only suffering from 33% more space consumption but also more cpu usage for encoding/decoding.

it would be helpful to be able to upload raw binaries with eg. multipart/form-data.

@omani
Copy link
Author

omani commented Jul 28, 2017

or is there a way to post raw binary to postgrest so I can insert bytea in postgresql? this is what I actually want to do here. open a file, read bytes, make a post request with that bytes in the body (no json here, because json encodes to base64) and store it into a field type bytea.

@steve-chavez
Copy link
Member

Right now posting raw binary is not possible in postgREST, but I think you could encode base64 in the client, post the payload in an /rpc then decode it in the procedure to finally insert it in a table, if you don't want to encode/decode base64 you could do it directly in nginx with ngx_postgres, as suggested in this previous issue #894.

@omani
Copy link
Author

omani commented Jul 29, 2017

ok that works. thanks.

@steve-chavez
Copy link
Member

I'm interested in implementing multipart/form-data as it's really convenient, so I'll reopen.

My proposal for this would be to use PostgreSQL COPY BINARY, this would make the insertion off all the bytea fields in one go and it's also very efficient.

For this we would need to:

  1. Add COPY support in hasql.
  2. A library for pg binary formats for all the data types(required by COPY BINARY).

2 Would be a lot of work, I don' think a ready made haskell library exists.

For anyone interested in this feature, consider supporting PostgREST development.

@steve-chavez steve-chavez reopened this Jan 14, 2019
@steve-chavez steve-chavez added the enhancement a feature, ready for implementation label Jan 14, 2019
@alf-mindshift
Copy link

This is outside of my understanding, but I was wondering if https://github.com/nikita-volkov/postgresql-binary might be a library for pg binary formats?

@steve-chavez
Copy link
Member

@alf-mindshift True. That would only leave adding support for COPY on hasql.

@mxk
Copy link

mxk commented Mar 3, 2020

An alternative to an /rpc call is to define an INSERT/UPDATE trigger and use a custom header to control base64 decoding. An example is below. It would be great to have a way of uploading binary content directly.

CREATE OR REPLACE FUNCTION b64decode() RETURNS trigger AS $$
BEGIN
	IF current_setting('request.header.postgrest-encoding', TRUE) = 'base64' THEN
		new.val = decode(convert_from(new.val, 'SQL_ASCII'), 'base64');
	END IF;
	RETURN new;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER decode
	BEFORE INSERT OR UPDATE OF val
	ON my_table
	FOR EACH ROW
EXECUTE FUNCTION b64decode();

@aljungberg
Copy link
Contributor

aljungberg commented Feb 13, 2023

The data representations feature (#2523) will make it easy to receive base64 and save it to bytea, and to do the inverse.

  • Solves the extra storage cost since base64 is not used on disk, only on the wire
  • The extra bandwidth taken by base64 encoding is in my experience a non-issue assuming you have gzip compression enabled between your clients and your nginx (which you generally should with any JSON based API) -- base64 gzipped is only marginally larger than raw binary [1]
  • ...and you get the benefit of a more consistent API (that consistency of always using RESTful JSON style APIs without having too many exceptions that involve multipart forms)
  • Data reps don't solve the CPU overhead of encoding and decoding base64 so there is still reason for PostgREST to support raw binary APIs

[1]: Don't just take my experience, here's someone who ran the numbers: incompressible data is, as expected, 1.333x the size in base64, 1.0001x the size in gzipped binary, 1.009x the size in gzipped base64. An overhead of 0.9% is in my opinion too small to worry about unless the files are truly huge, in which case you're probably going to want to send them direct to your S3/GCS storage bucket anyhow.

@steve-chavez
Copy link
Member

steve-chavez commented Mar 27, 2023

One reason to not do this with COPY(as proposed above): pipeline mode doesn't support it

(We'll get pipeline mode haskellari/postgresql-libpq#42)


Also, I believe if we solve #2711, we'll be able to do multipart on SQL.

@wolfgangwalther
Copy link
Member

It's already possible to use a single unnamed BYTEA argument for an RPC and upload a raw Content-Type: application/octet-stream. No need for base64 encoding.

IMHO, the only downside to this is, that it's right now not easily possible to provide other arguments/input to the RPC. So e.g. when uploading a file, you often like to send a filename with it. Or some metadata where to reference this file. Right now, this can basically only be done by sending custom headers and then parsing current_setting('request.headers').

However.. what if we were to support mixing arguments for RPCs in the body and via query string? That could be really neat:

CREATE FUNCTION upload(BYTEA, filename TEXT) RETURNS void ...;

Right now this function would not be callable, I think. We expect a "single unnamed argument" to match the body with that.

But if we could call it like the following, that would be great:

POST /rpc/upload?filename=my_image.png HTTP/1.1
Content-Type: application/octet-stream

<raw file content as the body>

@steve-chavez
Copy link
Member

Ah nice, that seems it would work.

Though I guess the downsides are:

  • Non-standard, compared to OOTB support for multipart.
  • findProc becomes more complicated(it's a bit already)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement a feature, ready for implementation
Development

No branches or pull requests

6 participants