NodeJS Local Files Only - Headers Not Defined & Incorrect Path Splitters #520

@axrati

System Info

Windows 10 - 10.0.19045 Build 19045
Alienware m17 R3
CPU - Intel i7-10750H

Node version:
v16.14.2

main.mjs:

import { pipeline } from '@xenova/transformers';

let pipe = await pipeline('feature-extraction','gte-small',{local_files_only:true});
let out = await pipe('Hey model! Respond to me!');

package.json:

{
  "name": "js-hf",
  "version": "1.0.0",
  "description": "",
  "main": "main.mjs",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@xenova/transformers": "^2.14.0"
  }
}

Clone this repository directly into the root of the project:
https://huggingface.co/Supabase/gte-small

Your final project layout will look like this:

--${YOUR_PROJ_NAME}
----- gte-small
----- node_modules
----- main.mjs
----- package-lock.json
----- package.json

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

When trying to load models locally, it looks like HTTP requests are still being attempted. The expected behavior is that when local_files_only is true, only local files are used.

Secondly, the paths used to load assets look incorrect on Windows: / is used instead of \ for transformer assets. The loader also doesn't seem to respect relative vs. absolute paths... perhaps that needs to be changed as well?
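
One possible way to address the lookup location, sketched under assumptions: transformers.js v2 exposes an env object whose localModelPath and allowRemoteModels settings control where local models are resolved and whether the Hub is contacted. The sketch below points localModelPath at the project root (where gte-small was cloned). It has not been verified on this setup, and it would not by itself address the Headers error shown in the output below.

```javascript
import { env, pipeline } from '@xenova/transformers';

// Assumption: the script is started from the project root, so process.cwd()
// is the directory that contains the cloned gte-small folder.
env.localModelPath = process.cwd();

// Do not fall back to the Hugging Face Hub if a file is missing locally.
env.allowRemoteModels = false;

// 'gte-small' is then looked up relative to env.localModelPath.
const pipe = await pipeline('feature-extraction', 'gte-small');
const out = await pipe('Hey model! Respond to me!');
console.log(out.dims);
```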

Error output:

Axrati@DESKTOP-H8KG7FT MINGW64 ~/Desktop/Code/js-hf
$ node main.mjs
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.json": "ReferenceError: Headers is not defined"
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer_config.json": "ReferenceError: Headers is not defined"
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/config.json": "ReferenceError: Headers is not defined"
file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.j
    at getModelFile (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462:27)
    at async getModelJSON (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:575:18)
    at async Promise.all (index 0)
    at async loadTokenizer (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:61:18)
    at async Function.from_pretrained (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:4296:50)
    at async Promise.all (index 0)
    at async loadItems (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3115:5)
    at async pipeline (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3055:21)
    at async file:///C:/Users/Axrati/Desktop/Code/js-hf/main.mjs:5:12

As you can see, the "Unable to load from local path" message references node_modules\@xenova\transformers\models\/gte-small/tokenizer.json, which mixes \ and / and wouldn't be a well-formed Windows path... It also doesn't appear to respect the relative path: as the System Info section shows, the model is a directory in the root of the project, but the lookup goes through the library's own folder in node_modules.
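
To illustrate the separator issue, here is a minimal sketch of normalizing a mixed-separator path. toWindowsPath is a hypothetical helper, not part of transformers.js, and the sample path is shortened for illustration. Node's file APIs on Windows generally accept both separators, so the mixed form may turn out to be cosmetic rather than the cause of the failure.

```javascript
// Hypothetical helper, not part of transformers.js.
// Collapses every run of "/" or "\" into a single backslash.
// Caveat: this would also collapse the leading "\\" of a UNC path,
// so it is only meant for drive-letter paths like the one in the log.
function toWindowsPath(p) {
  return p.replace(/[\\/]+/g, '\\');
}

// Shortened, illustrative version of the path from the error output.
const mixed = 'C:\\proj\\node_modules\\@xenova\\transformers\\models\\/gte-small/tokenizer.json';

console.log(toWindowsPath(mixed));
// prints: C:\proj\node_modules\@xenova\transformers\models\gte-small\tokenizer.json
```

Node's built-in path module (for example path.win32.normalize) performs a similar normalization and would be the more idiomatic choice inside the library.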

If you look at the code in https://github.com/xenova/transformers.js/blob/main/src/utils/hub.js, you can see on lines 55-56 that the constructor for FileResponse instantiates Headers. This leads me to believe that even when the first two conditions in getFile are met (env.useFS && !isValidHttpUrl(urlOrPath)), it's still executing code that the file protocol doesn't need.
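
For context, Node.js only exposes Headers as a global from v18 onward, so it is undefined on v16.14.2. A minimal sketch of guarding against that follows. SimpleHeaders and HeadersImpl are hypothetical names for illustration, not the library's implementation, and the stand-in covers only case-insensitive get/has.

```javascript
// Hypothetical stand-in for the Fetch API Headers class.
// It only supports construction from a plain object plus get() and has().
class SimpleHeaders {
  constructor(init = {}) {
    this.map = new Map();
    for (const [name, value] of Object.entries(init)) {
      // Header names are case-insensitive, so store them lowercased.
      this.map.set(name.toLowerCase(), String(value));
    }
  }

  get(name) {
    const key = name.toLowerCase();
    return this.map.has(key) ? this.map.get(key) : null;
  }

  has(name) {
    return this.map.has(name.toLowerCase());
  }
}

// Use the built-in class when the runtime provides it, else the stand-in.
const HeadersImpl =
  typeof globalThis.Headers !== 'undefined' ? globalThis.Headers : SimpleHeaders;

const headers = new HeadersImpl({ 'Content-Type': 'application/json' });
console.log(headers.get('content-type')); // prints: application/json
```

Upgrading to Node 18 or later, where Headers is a global, would be the simpler workaround if that is an option.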

I am happy to help create a PR for this! Please reach out and let me know. Would be helpful to catch up with someone on the team for repo direction/etc.

Reproduction

Follow the setup described in System Info / Description, then run:

npm install
node main.mjs

Labels

bug (Something isn't working)
