Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prefer duck-type checks over instanceof - Upload doesn't support polyfills/compat shims for Readable/ReadableStream #6153

Open
3 tasks done
jedwards1211 opened this issue Jun 1, 2024 · 6 comments
Assignees
Labels
feature-request New feature or enhancement. May require GitHub community feedback. p2 This is a standard priority issue queued This issues is on the AWS team's backlog

Comments

@jedwards1211
Copy link

jedwards1211 commented Jun 1, 2024

Checkboxes for prior research

Describe the bug

If I pass a polyfill for a Readable or ReadableStream in params in Upload, the request fails saying Body Data is unsupported format, expected data to be one of: string | Uint8Array | Buffer | Readable | ReadableStream | Blob.

This is very confusing when working with a library like archiver since it uses the readable-stream userland implementation of Node's Readable. Their docs don't make this clear at all (that's their bad), so I was baffled what was wrong until I dug into their code. And it would certainly be better if archiver used Node's own streams when possible.

But, @aws-sdk could tolerate cases like this instead of erroring out, and that would make things go more smoothly for users in a variety of cases.

SDK version number

@aws-sdk/lib-storage 3.588.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v20.10.0

Reproduction Steps

import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import archiver from "archiver";

(async () => {
  const archive = archiver("zip");

  archive.append("this is a test", { name: "test.txt" });
  archive.finalize();
  await new Upload({
    client: new S3Client(),
    params: {
      Bucket: "test",
      Key: "test",
      Body: archive,
    },
  }).done();
})();

Observed Behavior

Running the code outputs the following error:

Error: Body Data is unsupported format, expected data to be one of: string | Uint8Array | Buffer | Readable | ReadableStream | Blob;.
    at getChunk (/Users/andy/gh/aws-sdk-v3-upload-stream-issue/node_modules/.pnpm/@aws-sdk+lib-storage@3.588.0_@aws-sdk+client-s3@3.588.0/node_modules/@aws-sdk/lib-storage/dist-cjs/index.js:167:9)
    at _Upload.__doMultipartUpload (/Users/andy/gh/aws-sdk-v3-upload-stream-issue/node_modules/.pnpm/@aws-sdk+lib-storage@3.588.0_@aws-sdk+client-s3@3.588.0/node_modules/@aws-sdk/lib-storage/dist-cjs/index.js:383:24)
    at _Upload.done (/Users/andy/gh/aws-sdk-v3-upload-stream-issue/node_modules/.pnpm/@aws-sdk+lib-storage@3.588.0_@aws-sdk+client-s3@3.588.0/node_modules/@aws-sdk/lib-storage/dist-cjs/index.js:213:37)

Expected Behavior

Upload would work if I had permission on this bucket

Possible Solution

Upload could use duck typing checks to determine if an object looks like a Readable or ReadableStream instead of using instanceof checks.

Additional Information/Context

I'm able to work around this by doing archiver.pipe(new PassThrough()) since that returns a bonafide Node.js Readable instance.

@jedwards1211 jedwards1211 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 1, 2024
@RanVaknin RanVaknin added feature-request New feature or enhancement. May require GitHub community feedback. and removed bug This issue is a bug. labels Jun 4, 2024
@aBurmeseDev
Copy link
Member

Hi @jedwards1211 - thanks for reaching out.

This was brought up to team discussion and as we all understand, this's because archiver doesn't accept standard valid ReadableStream type that's expected. We'd suggest reaching out to archiver to accept the one Node defines as we won't make changes on our end unfortunately. Let us know if you have further questions.

Best,
John

@aBurmeseDev aBurmeseDev self-assigned this Jun 6, 2024
@aBurmeseDev aBurmeseDev added closing-soon This issue will automatically close in 4 days unless further comments are made. p3 This is a minor priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Jun 6, 2024
@jedwards1211
Copy link
Author

jedwards1211 commented Jun 6, 2024

I mainly wanted to help other library users avoid having an annoying debugging session with archiver or other libraries that use polyfills.

I've talked to archiver about, at the very least, explaining this clearly in their docs. But, people don't always read docs well, and if they wanted to fix this with code changes, it would probably break other use cases. On the other hand, @aws-sdk/lib-storage could broaden its compatibility with everything without any breaking change.

I am disappointed the team doesn't think this would be a worthwhile change.

@aBurmeseDev
Copy link
Member

aBurmeseDev commented Jun 6, 2024

Thanks for your response. Totally understand the frustration that you're trying to help other users. However, since the type returned by archiver isn't standard to NodeJS, we can't make an exception and make the change.

Also I understand that there may be workaround to use PassThrough, from browsing the issues there on https://github.com/archiverjs/node-archiver/issues and hopefully it will get documented as you requested. Closing the issue for now.

@jedwards1211
Copy link
Author

jedwards1211 commented Jun 6, 2024

@aBurmeseDev I thought I would share some more context FWIW. I'm not assuming y'all don't know about things like this, but sharing it just in case you don't.

Weird environments

Last year I briefly worked at an SDK codegen company on supporting a wider variety of JavaScript environments -- Node, Deno, Bun, Vercel Edge, CloudFlare Workers, Jest node, Jest jsdom, react-native etc.

We ended up being unable to support certain environments with body instanceof Blob etc and had to resort to checks like this:

function isBlob(x: any): x is Blob {
  return x instanceof Object && typeof x.size === 'number' &&
    typeof x.type === 'string' && typeof x.arrayBuffer === 'function' &&
    typeof x.slice === 'function'
}

Jest still has a particularly onerous problem with instanceof; even instanceof Array fails in some cases: jestjs/jest#2549.

And as expected, @aws-sdk has these exact compatibility problems with Jest: #4273

Cross-realm compatibility

Some objects, for example Uint8Array, can be transferred across realms in JavaScript. In the browser, this is between windows, and in Node, this is between vm contexts. And these objects, despite being legitimate standard APIs, don't work with instanceof.

For example, a Uint8Array from another realm will fail an instanceof check. And I do see an instanceof Uint8Array check in @aws-sdk/lib-storage. This is another issue waiting to get filed, though a rare case.

Functions like Array.isArray were created to help with cross-realm support:

Array.isArray() checks if the passed value is an Array. It does not check the value's prototype chain, nor does it rely on the Array constructor it is attached to. It returns true for any value that was created using the array literal syntax or the Array constructor. This makes it safe to use with cross-realm objects, where the identity of the Array constructor is different and would therefore cause instanceof Array to fail.

Has the AWS SDK team considered if you want the SDKs to generally be cross-realm compatible? If so, until the day that Uint8Array.isUint8Array functions are created, you would have to implement a userland check for it.

Unfortunately, userland code to test the identity of cross-realm objects can't prevent all false positives. But this is not necessarily so bad, because an object that happens to contain all of the expected properties of Uint8Array or some other standard API was most likely intended to work like one, so a userland isUint8Array check like my isBlob example above is arguably low-risk.

And even though the issue with archiver is not a cross-realm compatibility issue per se, if you do implement cross-realm support with userland functions, it would happen to solve the issue with archiver.

@jedwards1211
Copy link
Author

@aBurmeseDev I just noticed that there's already duck typing support for Blob in the code...
https://github.com/aws/aws-sdk-js-v3/blame/main/lib/lib-storage/src/chunker.ts#L26

  if (typeof (data as any).stream === "function") {
    // approximate support for Blobs.
    return getChunkStream<ReadableStream>((data as any).stream(), partSize, getDataReadableStream);
  }

@kuhe kuhe reopened this Jun 12, 2024
@kuhe kuhe added queued This issues is on the AWS team's backlog and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Jun 12, 2024
@kuhe
Copy link
Contributor

kuhe commented Jun 12, 2024

converting this to feature request to prefer duck checks globally

@kuhe kuhe changed the title Upload doesn't support polyfills/compat shims for Readable/ReadableStream prefer duck-type checks over instanceof - Upload doesn't support polyfills/compat shims for Readable/ReadableStream Jun 12, 2024
@kuhe kuhe added p2 This is a standard priority issue and removed p3 This is a minor priority issue labels Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request New feature or enhancement. May require GitHub community feedback. p2 This is a standard priority issue queued This issues is on the AWS team's backlog
Projects
None yet
Development

No branches or pull requests

4 participants