Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apache-arrow marked as external dependency #345

Closed
hamilton opened this issue Nov 3, 2021 · 22 comments
Closed

apache-arrow marked as external dependency #345

hamilton opened this issue Nov 3, 2021 · 22 comments
Labels

Comments

@hamilton
Copy link

hamilton commented Nov 3, 2021

Hello folks ~

So thrilled to experiment w/ duckdb-wasm! Thank you for all your hard work.

I'm having a bit of trouble bundling duckdb-wasm in a simple case. I'm using Rollup and opting for the simplest possible input file:

import * as duckdb from "@duckdb/duckdb-wasm/dist/duckdb-esm";
const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();
console.log("resolving bundles");
async function resolve() {
  // Select a bundle based on browser checks
  const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES);
  // Instantiate the asynchronus version of DuckDB-Wasm
  const worker = new Worker(bundle.mainWorker);
  const logger = new duckdb.ConsoleLogger();
  const db = new duckdb.AsyncDuckDB(logger, worker);
  return await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
}

(async () => {
  const db = await resolve();
  console.log(db);
})();

But rollup produces this error:

rollup v2.59.0
bundles src/main.js → public/poc.js...
(!) Unresolved dependencies
https://rollupjs.org/guide/en/#warning-treating-module-as-external-dependency
apache-arrow (imported by node_modules/@duckdb/duckdb-wasm/dist/duckdb-esm.js)
(!) Missing global variable name
Use output.globals to specify browser global variable names corresponding to external modules
apache-arrow (guessing 'apacheArrow')

Here is a repository that hopefully reproduces this (including my rollup file, includes instructions): https://github.com/hamilton/duckdb-wasm-issues

Is there something obvious that I'm missing here? (There usually is!)

@ankoh
Copy link
Member

ankoh commented Nov 3, 2021

Thanks for trying it out!

Bundling duckdb-wasm is less straight-forward than I anticipated due to currently broken bundles of apache-arrow 6.0.0.
A fix is already on the arrow master but we'll have to wait for 6.0.1 until this gets easier.
There are workarounds for webpack and esbuild but I haven't checked rollup yet.
You have to override the dependencies to resolve apache-arrow as @duckdb/duckdb-wasm/dist/duckdb-esm right now.

I'll add examples for the different bundlers as soon as we get arrow 6.0.1.

@hamilton
Copy link
Author

hamilton commented Nov 3, 2021

Thanks for the quick response @ankoh! Let me know if you'd like any assistance with documentation or examples when 6.0.1 lands.

@ankoh ankoh added the examples label Nov 3, 2021
@llimllib
Copy link
Contributor

llimllib commented Nov 4, 2021

@hamilton here's the esbuild build script I'm using to build with a working version of apache-arrow, not sure exactly how to do it in rollup though

@peder1001
Copy link

Also looking forward to this.

Looks like 6.0.1 just released? https://github.com/apache/arrow/releases/tag/apache-arrow-6.0.1
Hopefully won't take them too long to update npm.

@peder1001
Copy link

Apache Arrow is now at 6.0.1 on NPM.
https://www.npmjs.com/package/apache-arrow

@llimllib
Copy link
Contributor

I still am not able to build against it, I tried removing my hack and it didn’t work

@ankoh
Copy link
Member

ankoh commented Nov 19, 2021

So you installed the pre-released version 0.1.13-dev8.0 with

npm install @duckdb/duckdb-wasm@0.1.13-dev8.0

and imported DuckDB-Wasm using only:

import * as duckdb from "@duckdb/duckdb-wasm";

?
I tested this with esbuild and webpack, i'll check rollup later
we definitely need test the major bundlers in the ci

@llimllib
Copy link
Contributor

llimllib commented Nov 19, 2021

Yes, here's a repro with esbuild. From a fresh directory:

$ npm install @duckdb/duckdb-wasm@0.1.13-dev8.0 esbuild &&   \
    echo 'import * as duckdb from "@duckdb/duckdb-wasm";' >> test.js && \
    node_modules/.bin/esbuild test.js --bundle --outfile=out.js

added 35 packages, and audited 36 packages in 842ms

found 0 vulnerabilities
 > node_modules/apache-arrow/io/node/builder.js:20:25: error: Could not resolve "stream" (use "--platform=node" when building for node)
    20 │ const stream_1 = require("stream");
       ╵                          ~~~~~~~~

 > node_modules/apache-arrow/io/node/reader.js:21:25: error: Could not resolve "stream" (use "--platform=node" when building for node)
    21 │ const stream_1 = require("stream");
       ╵                          ~~~~~~~~

 > node_modules/apache-arrow/io/node/iterable.js:21:25: error: Could not resolve "stream" (use "--platform=node" when building for node)
    21 │ const stream_1 = require("stream");
       ╵                          ~~~~~~~~

 > node_modules/apache-arrow/io/node/writer.js:21:25: error: Could not resolve "stream" (use "--platform=node" when building for node)
    21 │ const stream_1 = require("stream");
       ╵                          ~~~~~~~~

4 errors

how did you get that to build with esbuild?

@llimllib
Copy link
Contributor

I did manage to get rid of Could not resolve "apache-arrow" though, with all upgrades applied, and if I npm install -D stream it builds

@llimllib
Copy link
Contributor

but then actually using the build fails, the browser console gives:

Uncaught TypeError: class heritage stream_1.Readable is not an object or null
    js iterable.ts:37
    __require viewer_duckdb.js:9
    <anonymous> Arrow.node.ts:22

@domoritz
Copy link
Collaborator

With es build, you may need apache/arrow#11742. I'll try to get it released soon.

@peder1001
Copy link

peder1001 commented Nov 19, 2021

Not sure if its useful. But I tried to use it in an Angular project now, and get build errors:

(using @duckdb/duckdb-wasm@0.1.13-dev8.0, and import * as duckdb from '@duckdb/duckdb-wasm';

`Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/duckdb_module.d.ts:1:23 - error TS2688: Cannot find type definition file for 'emscripten'.

1 ///
~~~~~~~~~~

Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/duckdb_module.d.ts:6:39 - error TS2304: Cannot find name 'EmscriptenModule'.

6 export interface DuckDBModule extends EmscriptenModule {
~~~~~~~~~~~~~~~~

Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/duckdb_module.d.ts:7:23 - error TS2304: Cannot find name 'stackSave'.

7 stackSave: typeof stackSave;
~~~~~~~~~

Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/duckdb_module.d.ts:8:24 - error TS2304: Cannot find name 'stackAlloc'.

8 stackAlloc: typeof stackAlloc;
~~~~~~~~~~

Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/duckdb_module.d.ts:9:26 - error TS2304: Cannot find name 'stackRestore'.

9 stackRestore: typeof stackRestore;
~~~~~~~~~~~~

Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/duckdb_module.d.ts:10:19 - error TS2304: Cannot find name 'ccall'.

10 ccall: typeof ccall;
~~~~~

Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/runtime.d.ts:1:23 - error TS2688: Cannot find type definition file for 'emscripten'.

1 ///
~~~~~~~~~~

Error: node_modules/@duckdb/duckdb-wasm/dist/types/src/bindings/runtime.d.ts:27:87 - error TS2503: Cannot find namespace 'Emscripten'.

27 export declare function callSRet(mod: DuckDBModule, funcName: string, argTypes: Array<Emscripten.JSType>, args: Array): [number, number, number];

etc.

@ankoh
Copy link
Member

ankoh commented Nov 19, 2021

Our benchmarks are built against the esm using esbuild but they are on node, so it works there.
Regarding the typing issue, you'll need @types/emscripten. I'll move it to the dependencies.

@ankoh
Copy link
Member

ankoh commented Nov 19, 2021

@peder1001 in the meantime, you could use @duckdb/duckdb-wasm/dist/duckdb-browser.js directly. It's an iife.

@ankoh
Copy link
Member

ankoh commented Nov 19, 2021

Also, we're using this tiny hack with esbuild to fix the issue with the arrow exports:
https://github.com/duckdb/duckdb-wasm/blob/master/packages/duckdb-wasm/bundle.mjs#L21

@peder1001
Copy link

Thanks ankoh, and now its more back to apache-arrow, so I will have to try to figure it out..

Error: node_modules/apache-arrow/Arrow.d.ts:54:466 - error TS2304: Cannot find name 'ReadableStreamReadValueResult'.
Error: node_modules/apache-arrow/Arrow.d.ts:54:641 - error TS2304: Cannot find name 'ReadableStreamReadDoneResult'.
Error: node_modules/apache-arrow/interfaces.d.ts:180:5 - error TS2502: '[Type.FixedSizeList]' is referenced directly or indirectly in its own type annotation.
Error: node_modules/apache-arrow/io/interfaces.d.ts:57:51 - error TS2304: Cannot find name 'PipeOptions'.
Error: node_modules/apache-arrow/recordbatch.d.ts:17:18 - error TS2430: Interface 'RecordBatch' incorrectly extends interface 'StructVector'.

I guess your tiny hack will solve it, just need to figure out what it does and implement it in the angular compiler

@ankoh
Copy link
Member

ankoh commented Nov 21, 2021

@peder1001 @hamilton @llimllib
I reworked the bundling this weekend.

You can read more about the current bundles here:

// Current bundling strategy
//
// We actually aim to be an ESM-only package but thats not possible for several reasons today.
//
// A) Karma does not support esm tests which has the following consequences:
// A.1) tests-browser.js needs to be an iife
// A.2) The worker scripts need to stay a iife since Karma can't import them otherwise (import.meta.url)
// B) Users that bundle our main modules might not want to also bundle our workers themselves, therefore:
// B.1) The workers remain self-contained iife and don't need to be bundled.
// B.2) That also allows us to host the iife workers on jsdelivr/unpkg.
// C) On node, we dynamically require "stream" (via apache-arrow) so node bundles have to stay commonjs for now.
//
// Bundles:
// duckdb-browser.mjs - ESM Default Browser Bundle
// duckdb-browser-blocking.mjs - ESM Blocking Browser Bundle (synchronous API, unstable)
// duckdb-browser.worker.js - IIFE Web Worker for Wasm MVP
// duckdb-browser-next.worker.js - IIFE Web Worker with Wasm EH
// duckdb-browser-next-coi.worker.js - IIFE Web Worker with Wasm EH + COI
// duckdb-browser-next-coi.pthread.worker.js - IIFE PThread Worker with Wasm EH + COI
// duckdb-node.cjs - CommonJS Default Node Bundle
// duckdb-node-blocking.cjs - CommonJS Blocking Node Bundle (synchronous API, unstable)
// duckdb-node.worker.cjs - CommonJS Worker for Wasm MVP
// duckdb-node-next.worker.cjs - CommonJS Worker with Wasm EH
// tests-browser.js - IIFE Jasmine Karma tests
// tests-node.cjs - CommonJS Jasmine Node tests
//
// The lack of alternatives for Karma won't allow us to bundle workers and tests as ESM.
// We should upgrade all CommonJS bundles to ESM as soon as the dynamic requires are resolved.

The problems with apache-arrow are defused for now by bundling apache-arrow into the browser (esm) bundles.
(You don't need to patch the export mapping of apache-arrow yourself)

Here are very basic examples for:
bare + browser
bare + node
esbuild + browser
esbuild + node

I'll extend the list soon.

@ankoh
Copy link
Member

ankoh commented Nov 21, 2021

DuckDB-Wasm now ships with a more fine granular export map.

"main:": "dist/duckdb-browser.mjs",
"types": "dist/duckdb-browser.d.ts",
"jsdelivr": "dist/duckdb-browser.mjs",
"unpkg": "dist/duckdb-browser.mjs",
"browser": {
"fs": false,
"path": false,
"perf_hooks": false,
"os": false,
"worker_threads": false
},
"exports": {
"./dist/duckdb.wasm": "./dist/duckdb.wasm",
"./dist/duckdb-next.wasm": "./dist/duckdb-next.wasm",
"./dist/duckdb-next-coi.wasm": "./dist/duckdb-next-coi.wasm",
"./dist/duckdb-browser": "./dist/duckdb-browser.mjs",
"./dist/duckdb-browser-blocking": "./dist/duckdb-browser-blocking.mjs",
"./dist/duckdb-browser-next-coi.pthread.worker.js": "./dist/duckdb-browser-next-coi.pthread.worker.js",
"./dist/duckdb-browser-next-coi.worker.js": "./dist/duckdb-browser-next-coi.worker.js",
"./dist/duckdb-browser-next.worker.js": "./dist/duckdb-browser-next.worker.js",
"./dist/duckdb-browser.worker.js": "./dist/duckdb-browser.worker.js",
"./dist/duckdb-node": "./dist/duckdb-node.cjs",
"./dist/duckdb-node-blocking": "./dist/duckdb-node-blocking.cjs",
"./dist/duckdb-node-next.worker.cjs": "./dist/duckdb-node-next.worker.cjs",
"./dist/duckdb-node.worker.cjs": "./dist/duckdb-node.worker.cjs",
"./blocking": {
"browser": {
"types": "./dist/duckdb-browser-blocking.d.ts",
"import": "./dist/duckdb-browser-blocking.mjs"
},
"node": {
"types": "./dist/duckdb-node-blocking.d.ts",
"require": "./dist/duckdb-node-blocking.cjs",
"import": "./dist/duckdb-node-blocking.cjs"
}
},
".": {
"browser": {
"types": "./dist/duckdb-browser.d.ts",
"import": "./dist/duckdb-browser.mjs"
},
"node": {
"types": "./dist/duckdb-node.d.ts",
"require": "./dist/duckdb-node.cjs",
"import": "./dist/duckdb-node.cjs"
}
}
}
}

In most scenarios, a simple import * as duckdb from "@duckdb/duckdb-wasm" should be enough.
For browser builds, this should automatically resolve to import * as duckdb from "@duckdb/duckdb-wasm/dist/duckdb-browser.mjs"

Our main browser bundles are esm, our worker scripts are iife, our node bundles are commonjs.

@llimllib
Copy link
Contributor

This is great @ankoh ! Will be very helpful to people getting started

@llimllib
Copy link
Contributor

tested it in my app and I was able to remove the name resolution hack, this is great

@ankoh
Copy link
Member

ankoh commented Nov 22, 2021

Closing this for now. @hamilton let me know if the problem persists.

@ankoh ankoh closed this as completed Nov 22, 2021
@hamilton
Copy link
Author

hamilton commented Dec 3, 2021

Thanks @ankoh! Will be trying now, and probably time to turn back on my GH notifications :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants