Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: change arrow from a direct dependency to a peer dependency #984

Merged
merged 4 commits into from
Feb 23, 2024

Conversation

westonpace
Copy link
Contributor

BREAKING CHANGE: users will now need to npm install apache-arrow and @apache-arrow/ts themselves.

@westonpace
Copy link
Contributor Author

Unfortunately, arrow-js makes a lot of instanceof comparisons. This is very brittle in node. As an example, the following will fail:

const arrow = require("apache-arrow")
const lancedb = require("vectordb")

async function main() {
    const vectorDb = await lancedb.connect("/tmp/my_lance");
    const schema = new arrow.Schema([
        new arrow.Field("my_field", new arrow.Int32(), false),
    ]);

    await vectorDb.createTable({
        name: "my_table",
        schema
    });
}

main().then(() => console.log("Done")).catch(err => console.log(err))

This is because arrow-js has a check somewhere that looks like type instanceof Int32 and that check fails because the user is passing in APP/arrow-js/Int32 and not APP/vectordb/arrow-js/Int32. The user's dependency tree will look like:

apache-arrow
vectordb
  |---apache-arrow

It will sometimes work, because node's dependency management can sometimes coalesce two dependencies together, but most of the time it will fail. For example, if the user just installs apache-arrow today they will get version 15. vectordb is currently using version 14. Npm cannot coalesce those two.

Making this a peer dependency is a bit of a breaking change. By doing so we are saying that vectordb is going to use the users version of apache-arrow. This means there is a single apache-arrow package, which is a good thing, but that package has to satisfy everyone's versioning requirements, which can lead to conflicts.

It means that the following will fail, but the error message is hopefully clear.

> npm install apache-arrow # will install version 15
> npm install vectordb
npm ERR! code ERESOLVE
npm ERR! ERESOLVE unable to resolve dependency tree
npm ERR! 
npm ERR! While resolving: undefined@undefined
npm ERR! Found: apache-arrow@15.0.0
npm ERR! node_modules/apache-arrow
npm ERR!   apache-arrow@"^15.0.0" from the root project
npm ERR! 
npm ERR! Could not resolve dependency:
npm ERR! peer apache-arrow@"^14.0.2" from vectordb@0.4.10
npm ERR! node_modules/vectordb
npm ERR!   vectordb@"file:../../lancedb/node/vectordb-0.4.10.tgz" from the root project
npm ERR! 
npm ERR! Fix the upstream dependency conflict, or retry
npm ERR! this command with --force, or --legacy-peer-deps
npm ERR! to accept an incorrect (and potentially broken) dependency resolution.
npm ERR! 
npm ERR! See /home/pace/.npm/eresolve-report.txt for a full report.

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/pace/.npm/_logs/2024-02-17T14_47_24_459Z-debug-0.log

This is a complicated way of saying "your project is currently configured to use arrow 15. vectordb expects you to have arrow 14 somewhere. We can't fix this".

The solution would be to downgrade the arrow to 14.

Alternatively (or concurrently) we can work to fix the underlying problem in the upstream codebase.

@westonpace westonpace merged commit 8a52619 into lancedb:main Feb 23, 2024
9 checks passed
raghavdixit99 pushed a commit to raghavdixit99/lancedb that referenced this pull request Apr 5, 2024
…ancedb#984)

BREAKING CHANGE: users will now need to npm install `apache-arrow` and
`@apache-arrow/ts` themselves.
westonpace added a commit that referenced this pull request Apr 5, 2024
…984)

BREAKING CHANGE: users will now need to npm install `apache-arrow` and
`@apache-arrow/ts` themselves.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants