Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support file parameters (upload) #26

Closed
lucaswerkmeister opened this issue Apr 24, 2023 · 14 comments
Closed

Support file parameters (upload) #26

lucaswerkmeister opened this issue Apr 24, 2023 · 14 comments
Labels
enhancement New feature or request
Milestone

Comments

@lucaswerkmeister
Copy link
Owner

I don’t think we support this yet, but we should. (I’ll need to look into what the API should look like – probably Blob as parameter type? But let’s see what e.g. fetch() accepts.)

@lucaswerkmeister lucaswerkmeister added the enhancement New feature or request label Apr 24, 2023
@lucaswerkmeister lucaswerkmeister added this to the 1.0.0 release milestone Jun 10, 2023
@lucaswerkmeister
Copy link
Owner Author

Let’s make this part of the 1.0 milestone (and probably try to get it done before 0.8, which would otherwise just be the release that drops Node 12 compat with no new features). If it turns out to be more difficult or less doable than expected, we can always kick it out again.

@lucaswerkmeister
Copy link
Owner Author

If we bump the Node requirement from 16 to 18 (cf. #25), then we get a whole lot of nice things that are useful for this feature, I think:

  • FormData
  • Blob and File
  • fetch()

At that point, we could even drop axios, and just use http-cookie-agent with Node’s native fetch() directly. The parameters could be instances of Blob (including its subclass File); the m3api session backend would turn the params object into a FormData instance.

There is one issue with this, which is that adding a Blob to FormData is apparently considered experimental in Node 18:

$ node
Welcome to Node.js v18.13.0.
Type ".help" for more information.
> b = new Blob([''])
Blob { size: 0, type: '' }
> d = new FormData()
FormData { [Symbol(state)]: [] }
> d.append('', b)
undefined
> (node:65345) ExperimentalWarning: buffer.File is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)

(In Node 20, it’s stable.) But given that it works, and you can suppress the warning with --no-warnings=ExperimentalWarning, I’m quite tempted to say that we should go with this, use the browser APIs in Node as much as possible, and the small subset of Node 18 users who want to upload files (rather than do anything else with the API, or use Node 20 or later) will have to either live with these warnings or suppress them.

lucaswerkmeister added a commit that referenced this issue Jun 29, 2023
Try to use browser APIs in Node.js as much as possible, now that Node
supports fetch(). This requires bumping our Node.js requirement to
version 18.2.0 (fetch() is available since Node 18.0.0, but the
dispatcher option needed by http-cookie-agent only since 18.2.0), so we
can also drop the second GitHub Actions job.

fetch.js is now the base implementation of both node.js and browser.js;
fetch-node.js adds cookie support, while fetch-browser.js renames the
user-agent header to api-user-agent because Chrome doesn’t care about
web standards that clearly say user-agent is allowed to be set.

Unfortunately, http-cookie-agent still requires undici to be installed
even when it’s also built into Node, because Node doesn’t expose the
necessary internals to use the dispatcher option mentioned above.

Related to #26 (though we don’t actually support file uploads yet) as
well as #25.
@lucaswerkmeister
Copy link
Owner Author

lucaswerkmeister commented Jun 29, 2023

95e551f implements a replacement of axios with native fetch(). However, I think there’s one issue with it:

Unfortunately, http-cookie-agent still requires undici to be installed even when it’s also built into Node, because Node doesn’t expose the necessary internals to use the dispatcher option mentioned above.

My feeling is that this isn’t just unsatisfying due to undici slightly burdening the node_modules/ (though, on the whole, this still results in a smaller node_modules/ than the axios version), but it also points to a general problem with this approach: http-cookie-agent is using undici internals which, I assume, are not considered part of Node.js’ stable interface. For all I know, Node.js could decide to swap out their fetch() implementation for something different tomorrow, and m3api would at best break immediately (or worse, silently discard cookies). I’m not sure I’m willing to take that risk.

I think I’ll put together an implementation of file uploads based on axios, and see how that shakes out. The native fetch() approach could perhaps be left for a later release.

@lucaswerkmeister
Copy link
Owner Author

I’ll have to try it out (probably tomorrow) to be sure, but I think with axios we can even support file parameters without bumping the Node requirement to 18. Axios will accept Buffer (or Stream) in FormData on Node, and I don’t think we need to check instanceof on parameters for either of those types anywhere in core.js or combine.js, so it’s fine that those types don’t exist in the browser. So we can say that individual parameters can be File|Blob|Stream|Buffer, and clients can choose whichever of those types they have available, and we’ll just pass it through to axios. (For FormData itself, axios already depends on a polyfill, so we can use that too.)

lucaswerkmeister added a commit that referenced this issue Jul 1, 2023
I’ve had this vaguely planned for a while, and now seems like a good
time to do it, as we’re about to add more allowed param types for #26.
@lucaswerkmeister
Copy link
Owner Author

So I’m running into several problems.

Uploads on the Beta cluster have been broken for months (T340908), so I can’t integration-test this as well as I hoped. I think the available options are testing against the production cluster instead (see discussion with @legoktm), or still testing against the beta cluster and just treating lockmanager-fail-svr-acquire errors as “the request was still POSTed correctly, everything beyond that isn’t our problem” and letting the test succeed.

I also only managed to make Buffer uploads work so far, and even then, they require an ugly little hack in the FormData construction:

diff --git a/axios.js b/axios.js
index d93e4fbf34..68f99f1f21 100644
--- a/axios.js
+++ b/axios.js
@@ -39,6 +39,10 @@ class AxiosSession extends Session {
 	}
 
 	async internalPost( apiUrl, urlParams, bodyParams, headers ) {
+		const bodyFormData = new FormData();
+		for ( const [ paramName, paramValue ] of Object.entries( bodyParams ) ) {
+			bodyFormData.append( paramName, paramValue, typeof paramValue === 'string' ? {} : { filename: 'whatever.jpg' } );
+		}
 		const response = await this.session.request( {
 			method: 'POST',
 			baseURL: apiUrl,

Because MediaWiki will refuse to process a file as upload unless it has a file name in the Content-Disposition (even though it’s not going to use that file name), but at the same time, non-file parameters must not have a file name. So we need to tell FormData to please treat buffer params like files. I think it may do that automatically for streams created by fs.createReadStream(), but so far I haven’t managed to make streams work at all; and Blob and File might not be supported at all by the FormData polyfill, or at least they also give me errors.

I think I might try the native fetch + undici version again and see if Blob and File work with less hassle there.

@lucaswerkmeister
Copy link
Owner Author

I think it may do that automatically for streams created by fs.createReadStream(), but so far I haven’t managed to make streams work at all

Okay, an fs.createReadStream() stream does work, with or without the special FormData hack shown above. So two options we have with axios are:

  • Only fs.createReadStream() is supported on Node.js, and we keep the FormData interaction nice and simple.
  • Both Buffer and fs.createReadStream() (but not other types of streams) are supported on Node.js, and the FormData interaction is a bit hacky (but it’s not really a huge deal tbh, this is all internal to us).

But what’s still missing, and may or may not be doable at all, is integration with Blob and File on Node versions that support them. Hmph.

@tuukka
Copy link

tuukka commented Jul 1, 2023

Would a stream.Readable.from work in addition to fs.createReadStream? That way, it would be possible to upload file data without writing it to a temp file first.(Having to use a temp file wouldn't be a showstopper though.)

Also, we are already using Axios in Wikidocumentaries.

@lucaswerkmeister
Copy link
Owner Author

No, Readable.from( [ 'someString' ] ) is what I’ve had errors with so far, I’m afraid. (They’re really strange errors, too – the API returns “The "token" parameter must be set”, so it must be messing with the encoding of the entire form data somehow.)

@lucaswerkmeister
Copy link
Owner Author

Hahaaa, this is the entirety of the diff needed to make Blob and File uploads work on the no-axios branch:

diff --git a/fetch.js b/fetch.js
index ad31477586..f3e0033996 100644
--- a/fetch.js
+++ b/fetch.js
@@ -45,10 +45,14 @@ class FetchSession extends Session {
 	async internalPost( apiUrl, urlParams, bodyParams, headers ) {
 		const url = new URL( apiUrl );
 		url.search = new URLSearchParams( urlParams );
+		const body = new FormData();
+		for ( const [ paramName, paramValue ] of Object.entries( bodyParams ) ) {
+			body.append( paramName, paramValue );
+		}
 		const response = await fetch( url, {
 			...this.getFetchOptions( headers ),
 			method: 'POST',
-			body: new URLSearchParams( bodyParams ),
+			body,
 		} );
 		return transformResponse( response );
 	}

(If anyone’s curious, I’m creating the param values as new Blob( [ content ], { type: 'image/svg+xml' } ) and new File( [ content ], 'blank.svg', { type: 'image/svg.xml' } ) respectively.) The integration test still fails with the same lockmanager-fail-svr-acquire errors, but I’m taking that as a sign that the request is being posted correctly and the error is just due to Beta being broken.

That’s a pretty strong argument in favor of native fetch, I have to say.

@tuukka
Copy link

tuukka commented Jul 3, 2023

Hahaaa, this is the entirety of the diff needed to make Blob and File uploads work on the no-axios branch:

Awesome - thank you! Time spent to find the simplest approach is well spent.

Could @zexigong start testing this already by npm install lucaswerkmeister/m3api#no-axios in the backend, or will something else be needed before that?

@lucaswerkmeister
Copy link
Owner Author

I didn’t push any commit with that diff yet, but I can if it helps you testing.

lucaswerkmeister added a commit that referenced this issue Jul 3, 2023
When we’re using native fetch() everywhere, this is quite simple: we
just have to specify the body as FormData instead of URLSearchParams.

Implements #26.
@lucaswerkmeister
Copy link
Owner Author

lucaswerkmeister commented Jul 3, 2023

Alright, pushed to the no-axios branch, feel free to try it out.

lucaswerkmeister added a commit that referenced this issue Jul 8, 2023
Beta is not stable enough, and uploads in particular have been broken
for months (T340908 [1]), so if I want to integration-test file
parameters (#26), I either need to tolerate those errors or target
production. Based on a discussion with @legoktm [2], using Test
Wikipedia for this seems fine.

Also update the example command(s) in .env.example to avoid a race
condition: with cp+chmod, another user can open the file while it’s
still world-readable, then read secrets from the open file descriptor
later, since permissions are only checked on open(2) not read(2).

[1]: https://phabricator.wikimedia.org/T340908
[2]: https://wikis.world/@legoktm/110639639612274985
lucaswerkmeister added a commit that referenced this issue Jul 8, 2023
Try to use browser APIs in Node.js as much as possible, now that Node
supports fetch(). This requires bumping our Node.js requirement to
version 18.2.0 (fetch() is available since Node 18.0.0, but the
dispatcher option needed by http-cookie-agent only since 18.2.0), so we
can also drop the second GitHub Actions job.

fetch.js is now the base implementation of both node.js and browser.js;
fetch-node.js adds cookie support, while fetch-browser.js renames the
user-agent header to api-user-agent because Chrome doesn’t care about
web standards that clearly say user-agent is allowed to be set.

Unfortunately, http-cookie-agent still requires undici to be installed
even when it’s also built into Node, because Node doesn’t expose the
necessary internals to use the dispatcher option mentioned above.

Related to #26 (though we don’t actually support file uploads yet) as
well as #25.

This also shrinks our non-dev node_modules from 2913282 to 2073521
bytes, a 28% reduction, which is pretty sweet.
lucaswerkmeister added a commit that referenced this issue Jul 8, 2023
When we’re using native fetch() everywhere, this is quite simple: we
just have to specify the body as FormData instead of URLSearchParams.

Implements #26.
lucaswerkmeister added a commit that referenced this issue Jul 8, 2023
When we’re using native fetch() everywhere, this is quite simple: we
just have to specify the body as FormData instead of URLSearchParams.
(Although the tests need to import File from buffer in Node 18.)

Implements #26.
@lucaswerkmeister
Copy link
Owner Author

And now merged on main.

@lucaswerkmeister
Copy link
Owner Author

Released in v0.8.0 the other day. @tuukka I’d be interested to know if it works for you or not :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants