
[Web] WebGPU and WASM Backends Unavailable within Service Worker #20876

Open
ggaabe opened this issue May 30, 2024 · 29 comments · Fixed by #20898
Assignees
fs-eire
Labels
ep:WebGPU ort-web webgpu provider platform:web issues related to ONNX Runtime web; typically submitted using template

Comments

@ggaabe

ggaabe commented May 30, 2024

Describe the issue

I'm running into issues trying to use the WebGPU or WASM backends inside a service worker (in a Chrome extension). More specifically, I'm attempting to run Phi-3 with transformers.js v3.

Every time I attempt this, I get the following error:

Uncaught (in promise) Error: no available backend found. ERR: [webgpu] 
TypeError: import() is disallowed on ServiceWorkerGlobalScope by the HTML specification. 
See https://github.com/w3c/ServiceWorker/issues/1356.

This is originating in the InferenceSession class in js/common/lib/inference-session-impl.ts.

More specifically, it's happening at this call:
const [backend, optionsWithValidatedEPs] = await resolveBackendAndExecutionProviders(options);
whose implementation is in js/common/lib/backend-impl.ts, where tryResolveAndInitializeBackend fails to initialize any of the execution providers.
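
For reference, here is a minimal sketch of the failing call path; the message handling and model buffer are placeholder assumptions, and only the ORT API usage mirrors the stack above:

// background.js (module service worker) - minimal sketch, not the full repro below
import * as ort from "onnxruntime-web/webgpu";

self.addEventListener("message", async (event) => {
  // InferenceSession.create() calls resolveBackendAndExecutionProviders(),
  // which tries a dynamic import() of the wasm JS glue and fails with the error above.
  const session = await ort.InferenceSession.create(event.data.modelBuffer, {
    executionProviders: ["webgpu"],
  });
  console.log("session inputs:", session.inputNames);
});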

WebGPU is now supported in service workers, though; it is a recent change, so this should be feasible. See the Chrome release notes.

Additionally, here is an example browser extension from the mlc-ai/web-llm framework that implements WebGPU usage in service workers successfully:
https://github.com/mlc-ai/web-llm/tree/main/examples/chrome-extension-webgpu-service-worker

Here is some further discussion on this new support from Google itself:
https://groups.google.com/a/chromium.org/g/chromium-extensions/c/ZEcSLsjCw84/m/WkQa5LAHAQAJ

So technically I think this should now be possible to support, unless I'm doing something else glaringly wrong. Is it possible to add support for this?
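
As a sanity check on that claim (a sketch, not part of the repro below), the adapter can be requested from the extension's service worker console directly:

// Run in the extension's service worker console (Chrome 124+).
// A non-null adapter means WebGPU itself is exposed in the worker,
// so the failure above is about how onnxruntime-web loads its backend, not the platform.
const adapter = await navigator.gpu?.requestAdapter();
console.log("WebGPU adapter in service worker:", adapter);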

To reproduce

Download and set up the transformers.js extension example and put this into the background.js file:

// background.js - Handles requests from the UI, runs the model, then sends back a response

import {
  pipeline,
  env,
  AutoModelForCausalLM,
  AutoTokenizer,
  TextStreamer,
  StoppingCriteria,
} from "@xenova/transformers";

// Skip initial check for local models, since we are not loading any local models.
env.allowLocalModels = false;

// Due to a bug in onnxruntime-web, we must disable multithreading for now.
// See https://github.com/microsoft/onnxruntime/issues/14445 for more information.
env.backends.onnx.wasm.numThreads = 1;

class CallbackTextStreamer extends TextStreamer {
  constructor(tokenizer, cb) {
    super(tokenizer, {
      skip_prompt: true,
      skip_special_tokens: true,
    });
    this.cb = cb;
  }

  on_finalized_text(text) {
    this.cb(text);
  }
}

class InterruptableStoppingCriteria extends StoppingCriteria {
  constructor() {
    super();
    this.interrupted = false;
  }

  interrupt() {
    this.interrupted = true;
  }

  reset() {
    this.interrupted = false;
  }

  _call(input_ids, scores) {
    return new Array(input_ids.length).fill(this.interrupted);
  }
}

const stopping_criteria = new InterruptableStoppingCriteria();

async function hasFp16() {
  try {
    const adapter = await navigator.gpu.requestAdapter();
    return adapter.features.has("shader-f16");
  } catch (e) {
    return false;
  }
}

class PipelineSingleton {
  static task = "feature-extraction";
  static model_id = "Xenova/Phi-3-mini-4k-instruct_fp16";
  static model = null;
  static instance = null;

  static async getInstance(progress_callback = null) {
    this.model_id ??= (await hasFp16())
      ? "Xenova/Phi-3-mini-4k-instruct_fp16"
      : "Xenova/Phi-3-mini-4k-instruct";

    this.tokenizer ??= AutoTokenizer.from_pretrained(this.model_id, {
      legacy: true,
      progress_callback,
    });

    this.model ??= AutoModelForCausalLM.from_pretrained(this.model_id, {
      dtype: "q4",
      device: "webgpu",
      use_external_data_format: true,
      progress_callback,
    });

    return Promise.all([this.tokenizer, this.model]);
  }
}

// Create generic classify function, which will be reused for the different types of events.
const classify = async (text) => {
  // Get the pipeline instance. This will load and build the model when run for the first time.
  const [tokenizer, model] = await PipelineSingleton.getInstance((data) => {
    // You can track the progress of the pipeline creation here.
    // e.g., you can send `data` back to the UI to indicate a progress bar
    console.log("progress", data);
    // `data` logs as:
    // {
    //   "status": "progress",
    //   "name": "Xenova/Phi-3-mini-4k-instruct_fp16",
    //   "file": "onnx/model_q4.onnx",
    //   "progress": 99.80381792394503,
    //   "loaded": 836435968,
    //   "total": 838080131
    // }
    // When complete, the last status will be 'done'.
  });
  /////////////
  const inputs = tokenizer.apply_chat_template(text, {
    add_generation_prompt: true,
    return_dict: true,
  });

  let startTime;
  let numTokens = 0;
  const cb = (output) => {
    startTime ??= performance.now();

    let tps;
    if (numTokens++ > 0) {
      tps = (numTokens / (performance.now() - startTime)) * 1000;
    }
    self.postMessage({
      status: "update",
      output,
      tps,
      numTokens,
    });
  };

  const streamer = new CallbackTextStreamer(tokenizer, cb);

  // Tell the main thread we are starting
  self.postMessage({ status: "start" });

  const outputs = await model.generate({
    ...inputs,
    max_new_tokens: 512,
    streamer,
    stopping_criteria,
  });
  const outputText = tokenizer.batch_decode(outputs, {
    skip_special_tokens: false,
  });

  // Send the output back to the main thread
  self.postMessage({
    status: "complete",
    output: outputText,
  });
  ///////////////

  // Actually run the model on the input text
  // let result = await model(text);
  // return result;
};

////////////////////// 1. Context Menus //////////////////////
//
// Add a listener to create the initial context menu items,
// context menu items only need to be created at runtime.onInstalled
chrome.runtime.onInstalled.addListener(function () {
  // Register a context menu item that will only show up for selection text.
  chrome.contextMenus.create({
    id: "classify-selection",
    title: 'Classify "%s"',
    contexts: ["selection"],
  });
});

// Perform inference when the user clicks a context menu
chrome.contextMenus.onClicked.addListener(async (info, tab) => {
  // Ignore context menu clicks that are not for classifications (or when there is no input)
  if (info.menuItemId !== "classify-selection" || !info.selectionText) return;

  // Perform classification on the selected text
  let result = await classify(info.selectionText);

  // Do something with the result
  chrome.scripting.executeScript({
    target: { tabId: tab.id }, // Run in the tab that the user clicked in
    args: [result], // The arguments to pass to the function
    function: (result) => {
      // The function to run
      // NOTE: This function is run in the context of the web page, meaning that `document` is available.
      console.log("result", result);
      console.log("document", document);
    },
  });
});
//////////////////////////////////////////////////////////////

////////////////////// 2. Message Events /////////////////////
//
// Listen for messages from the UI, process it, and send the result back.
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
  console.log("sender", sender);
  if (message.action !== "classify") return; // Ignore messages that are not meant for classification.

  // Run model prediction asynchronously
  (async function () {
    // Perform classification
    let result = await classify(message.text);

    // Send response back to UI
    sendResponse(result);
  })();

  // return true to indicate we will send a response asynchronously
  // see https://stackoverflow.com/a/46628145 for more information
  return true;
});

Urgency

This would help enable a new ecosystem to build up around locally intelligent browser extensions and tooling.

it's urgent for me because it would be fun to build and I want to build it and it would be fun to be building it rather than not be building it.

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.19.0-dev.20240509-69cfcba38a

Execution Provider

'webgpu' (WebGPU)

@ggaabe ggaabe added the platform:web issues related to ONNX Runtime web; typically submitted using template label May 30, 2024
@fs-eire
Contributor

fs-eire commented May 31, 2024

Thank you for reporting this issue. I will try to figure out how to fix this problem.

@fs-eire fs-eire self-assigned this May 31, 2024
@fs-eire
Contributor

fs-eire commented Jun 2, 2024

So it turns out that dynamic import (i.e. import()) and top-level await are not supported in the current service worker environment. I was not expecting import() to be banned in service workers.

Currently, the WebAssembly factory (wasm-factory.ts) uses dynamic import to load the JS glue. This does not work in a service worker. A few potential alternatives are also unavailable:

  • Changing it to a static import statement: won't work, because the JS glue includes top-level await.
  • Using importScripts: won't work, because the JS glue is ESM.
  • Using eval: won't work; same reason as importScripts.

I am now trying to make a JS bundle that does not use dynamic import, specifically for use in service workers. Still working on it.
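
To illustrate the constraint (a sketch; the bundle file name is a placeholder):

// sw.js, registered with { type: "module" }
// A static import is evaluated when the worker script loads and is allowed,
// provided the imported file contains no dynamic import() and no top-level await:
import * as ort from "./ort-static-bundle.mjs"; // placeholder for such a bundle

self.addEventListener("install", () => {
  // A dynamic import, by contrast, is rejected by the HTML spec in this scope:
  //   await import("./ort-glue.mjs");
  //   // TypeError: import() is disallowed on ServiceWorkerGlobalScope ...
  console.log("ORT available via static import:", typeof ort.InferenceSession);
});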

@ggaabe
Author

ggaabe commented Jun 2, 2024

Thanks, I appreciate your efforts around this. It does seem like some special-case bundle will need to be built after all; you might need iife or umd for the bundler output format

@fs-eire
Contributor

fs-eire commented Jun 2, 2024

Thanks, I appreciate your efforts around this. It does seem like some special-case bundle will need to be built after all; you might need iife or umd for the bundler output format

I have considered this option. However, Emscripten does not offer an option to output both UMD (IIFE+CJS) and ESM for the JS glue (emscripten-core/emscripten#21899); I have to choose one. I chose the ES6 output format for the JS glue because of a couple of problems when importing UMD from ESM, and because import() is the standard way to import ESM from both ESM and UMD (until this issue showed me that it does not work in service workers).

I found a way to make ORT Web work; yes, this needs the build script to do some special handling. And it will only work for ESM, because the JS glue is ESM and there seems to be no way to import ESM from UMD in a service worker.

fs-eire added a commit that referenced this issue Jun 3, 2024
### Description
This PR allows building ORT Web as `ort{.all|.webgpu}.bundle.min.mjs`, which does not have any dynamic import. This makes it possible to use ORT Web via static import in a service worker.

Fixes #20876
@fs-eire fs-eire reopened this Jun 3, 2024
@fs-eire
Contributor

fs-eire commented Jun 5, 2024

@ggaabe Could you please help to try import * as ort from "./ort.webgpu.bundle.min.js" from version 1.19.0-dev.20240604-3dd6fcc089?

@ggaabe
Author

ggaabe commented Jun 6, 2024

@fs-eire my project depends on transformers.js, which imports the onnxruntime WebGPU backend like this:

https://github.com/xenova/transformers.js/blob/v3/src/backends/onnx.js#L24

Is this the right usage? In my project I've added this to my package.json to resolve onnxruntime-web to this new version, though the issue is still occurring:

  "overrides": {
    "onnxruntime-web": "1.19.0-dev.20240604-3dd6fcc089"
  }

@ggaabe
Author

ggaabe commented Jun 6, 2024

Maybe also important: the same error is still occurring in the same spot in the inference session in the ONNX package, not in transformers.js. Do I need to add a resolver for onnxruntime-common as well?

@fs-eire
Contributor

fs-eire commented Jun 10, 2024

#20991 makes the default ESM import use non-dynamic import, and I hope this change may fix the problem. The PR is still in progress.

@ggaabe
Author

ggaabe commented Jun 12, 2024

Hi @fs-eire, is the newly-merged fix in a released build I can try?

@fs-eire
Contributor

fs-eire commented Jun 13, 2024

Please try 1.19.0-dev.20240612-94aa21c3dd

@sophies927 sophies927 added the ep:WebGPU ort-web webgpu provider label Jun 13, 2024
@ggaabe
Author

ggaabe commented Jun 13, 2024

@fs-eire EDIT: Nvm the comment I just deleted, that error was because I didn't set the webpack target to webworker.

However, I'm getting a new error now (progress!):

Error: no available backend found. ERR: [webgpu] RuntimeError: null function or function signature mismatch

@ggaabe
Author

ggaabe commented Jun 13, 2024

Update: I found the error is happening here:

if (!isInitializing) {
  backendInfo.initPromise = backendInfo.backend.init(backendName);
}
await backendInfo.initPromise;

For some reason the webgpu backend.init promise is rejecting due to the null function or function signature mismatch error. This is much further along than we were before though.

@fs-eire
Contributor

fs-eire commented Jun 14, 2024

Update: I found the error is happening here:

if (!isInitializing) {
  backendInfo.initPromise = backendInfo.backend.init(backendName);
}
await backendInfo.initPromise;

For some reason the webgpu backend.init promise is rejecting due to the null function or function signature mismatch error. This is much further along than we were before though.

Could you share the steps to reproduce?

@ggaabe
Author

ggaabe commented Jun 14, 2024

@fs-eire You'll need to run the WebGPU setup in a Chrome extension.

  1. Use the code I just published here: https://github.com/ggaabe/extension
  2. Run npm install
  3. Run npm run build
  4. Open Chrome's "Manage Extensions" page
  5. Click "Load unpacked"
  6. Select the build folder from the repo
  7. Open the "AI WebGPU Extension" extension
  8. Type some text into the text input. It will load Phi-3 mini, and after it finishes loading this error will occur
  9. If you view the extension in the extension manager and select the "Inspect views: service worker" link before opening the extension, it will bring up an inspection window to view the errors as they occur. A little "errors" bubble link also shows up there after they occur
  10. You will need to click the "Refresh" button on the extension in the extension manager to rerun the error, because it does not attempt to reload the model after the first attempt until another refresh

@fs-eire
Contributor

fs-eire commented Jun 18, 2024

@ggaabe I did some debugging on my box and made some fixes -

  1. Changes to ONNX Runtime Web:

    [js/web] skip default locateFile() when dynamic import is disabled #21073 was created to make sure the WebAssembly file can be loaded correctly when env.wasm.wasmPaths is not specified (a workaround sketch for setting env.wasm.wasmPaths explicitly follows this list).

  2. Changes to https://github.com/ggaabe/extension

    fix ORT wasm loading ggaabe/extension#1 needs to be made to the extension example to make it load the model correctly. Please note:

    • The onnxruntime-web version needs to be updated to consume the changes from (1) (after it gets merged and published to the dev channel)
    • There are still errors in background.js, which look like incorrect params being passed to tokenizer.apply_chat_template(). However, the WebAssembly is initialized and the model loads successfully.
  3. Other issues:
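
Regarding (1): a possible interim workaround sketch for the extension (my assumption for illustration, not something the PRs above require) is to point ORT at the .wasm files shipped inside the extension so no path has to be derived at runtime:

// background.js - assumes the build step copies onnxruntime-web's *.wasm files into dist/ort/
import * as ort from "onnxruntime-web/webgpu";

// Tell ORT where to find its WebAssembly artifacts inside the packaged extension.
ort.env.wasm.wasmPaths = chrome.runtime.getURL("ort/");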

@ggaabe
Author

ggaabe commented Jun 18, 2024

Awesome, thank you for your thoroughness in explaining this and tackling this head on. Is there a dev channel version I can test out?

@fs-eire
Contributor

fs-eire commented Jun 18, 2024

Not yet. Will update here once it is ready.

@ggaabe
Author

ggaabe commented Jun 23, 2024

sorry to bug; is there any dev build number? wasn't sure how often a release runs

@fs-eire
Contributor

fs-eire commented Jun 23, 2024

sorry to bug; is there any dev build number? wasn't sure how often a release runs

Please try 1.19.0-dev.20240621-69d522f4e9

@ggaabe
Author

ggaabe commented Jun 23, 2024

@fs-eire I'm getting one new error:

ort.webgpu.bundle.min.mjs:6 Uncaught (in promise) Error: The data is not on CPU. Use `getData()` to download GPU data to CPU, or use `texture` or `gpuBuffer` property to access the GPU data directly.
    at get data (ort.webgpu.bundle.min.mjs:6:13062)
    at get data (tensor.js:62:1)

I pushed the code changes to my repo and fixed the call to the tokenizer. To reproduce, just type 1 letter in the chrome extension’s text input and wait

@nickl1234567

Hey, I also need this. I am struggling with importing this version. So far I have been importing ONNX using
import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/esm/ort.webgpu.min.js".
However, when I change to import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240621-69d522f4e9/dist/esm/ort.webgpu.min.js" it seems not to have an .../esm/ folder. Do you know why that is and how to import it then?

@fs-eire
Contributor

fs-eire commented Jun 24, 2024

Hey, I also need this. I am struggling with importing this version. So far I have been importing ONNX using import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/esm/ort.webgpu.min.js". However, when I change to import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240621-69d522f4e9/dist/esm/ort.webgpu.min.js" it seems not to have an .../esm/ folder. Do you know why that is and how to import it then?

Just replacing .../esm/ort.webgpu.min.js with .../ort.webgpu.min.mjs should work. If you are also using a service worker, use ort.webgpu.bundle.min.mjs instead of ort.webgpu.min.mjs.
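
Spelled out (URLs assembled from the comment above; the dist layout itself is not re-verified here):

// Regular page or dedicated worker:
import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240621-69d522f4e9/dist/ort.webgpu.min.mjs";

// Service worker (no dynamic import allowed there), per the note above:
// import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240621-69d522f4e9/dist/ort.webgpu.bundle.min.mjs";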

@fs-eire
Contributor

fs-eire commented Jun 24, 2024

@fs-eire I'm getting one new error:

ort.webgpu.bundle.min.mjs:6 Uncaught (in promise) Error: The data is not on CPU. Use `getData()` to download GPU data to CPU, or use `texture` or `gpuBuffer` property to access the GPU data directly.
    at get data (ort.webgpu.bundle.min.mjs:6:13062)
    at get data (tensor.js:62:1)

I pushed the code changes to my repo and fixed the call to the tokenizer. To reproduce, just type 1 letter in the chrome extension’s text input and wait

This may be a problem in transformers.js. Could you check whether this problem happens in a normal page? If so, please report the issue to transformers.js. If it's only happening in the service worker, I can take a closer look.

@kyr0

kyr0 commented Jul 6, 2024

@fs-eire I can verify that using 1.19.0-dev.20240621-69d522f4e9, loading a model using webgpu in a service worker works, even in a web extension. The necessary code is:

import * as ONNX_WEBGPU from "onnxruntime-web/webgpu";

// any Blob that contains a valid ORT model would work
// I'm using Xenova/multilingual-e5-small/onnx/model_quantized.with_runtime_opt.ort
const buffer = await mlModel.blob.arrayBuffer();  
  
const sessionwebGpu = await ONNX_WEBGPU.InferenceSession.create(buffer, {
  executionProviders: ["webgpu"],
});
console.log("Loading embedding model using sessionwebGpu", sessionwebGpu);

Results in a successful execution, yay! 💯 :)


I think we can ignore the warning, printed as an error, as the session loads.

WebAssembly would work in a Service Worker. Just because Service Workers are limited in their ability to load external resources such as WASM runtime files as Blob or ArrayBuffer doesn't mean you can't get such data transferred into the Service Worker context. In fact, you can transfer Gigabytes instantly using MessageChannel and the concept of Transferable objects.

Passing down a Blob/ArrayBuffer from a content script to a background worker/service worker even works, standard-compliant, with Web Extensions, as I demonstrate here: w3c/webextensions#293 (comment)

It's even much simpler for non-Web-Extension use cases, as you simply use the self.onmessage API in a service worker to receive a MessageChannel object and, via one of its ports, receive the Blob or ArrayBuffer.
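
A sketch of that transfer pattern (file names, message shape, and the fetch URL are assumptions; the point is only postMessage with a transfer list plus self.onmessage):

// --- page / content script side (module script) ---
const registration = await navigator.serviceWorker.ready;
const modelBuffer = await (await fetch("/models/model.onnx")).arrayBuffer(); // placeholder path

const channel = new MessageChannel();
channel.port1.onmessage = (e) => console.log("worker replied:", e.data);

// Transfer (not clone) the buffer and one port into the service worker.
registration.active.postMessage(
  { type: "load-model", modelBuffer, port: channel.port2 },
  [modelBuffer, channel.port2]
);

// --- service worker side (sw.js) ---
self.addEventListener("message", (event) => {
  if (event.data?.type !== "load-model") return;
  const { modelBuffer, port } = event.data;
  // modelBuffer is now owned by this context; hand it to InferenceSession.create(...)
  port.postMessage({ ok: true, byteLength: modelBuffer.byteLength });
});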

I'm aware that the current implementation hard-codes a few things: importWasmModule() tries to import the Emscripten runtime JS, and by default Emscripten tries to import the WASM binary. But this isn't something that needs to be set in stone...

  1. The Emscripten runtime code can be imported in userland code like this:
import ortWasmRuntime from "onnxruntime-web/dist/ort-wasm-simd-threaded"

As node_modules shows, the runtime exports a default runtime function.

  2. You can easily override the default Emscripten WASM binary module loader with a custom loader that allows an ArrayBuffer to be passed by reference:
Module['instantiateWasm'] = async (imports, onSuccess) => {
  let result;
  if (WebAssembly.instantiateStreaming) {
    result = await WebAssembly.instantiateStreaming(Module["wasmModule"], imports);
  } else {
    result = await WebAssembly.instantiate(Module["wasmModule"], imports);
  }
  return onSuccess(result.instance, result.module);
};

Of course, we don't want it that way, but I mention it as this is the "documented way".

  3. But as you know, you can also set Module["$option"] by passing these options as an object to the runtime factory function; in this case, to the runtime function passed down from userland code, exactly as you already do here:
{
    numThreads,
    // just conditionally merge in:
    instantiateWasm: ONNX_WASM.env.wasm.instantiateWasm
}
  4. A proposal for how WASM could work as a PoC in service workers (and web extensions):
import * as ONNX_WASM from "onnxruntime-web/wasm";

// the difference is that this will be bundled in by the user-land bundler,
// while the conditional dynamic import that happens in the ONNX runtime would not,
// as the ternary operator here: https://github.com/microsoft/onnxruntime/blob/83e0c6b96e77634dd648e890cead598b6e065cde/js/web/lib/wasm/wasm-utils-import.ts#L157
// and all its following code cannot be statically analyzed by bundlers; tree-shaking and inlining cannot happen,
// so the bundler will be forced to generate dynamic import() code.
// This could also lead to downstream issues with the transformersjs package and other package/bundler combinations,
// while this is explicit and inlined.
import ortWasmRuntime from "onnxruntime-web/dist/ort-wasm-simd-threaded"

// could maybe be passed a Blob via https://emscripten.org/docs/api_reference/module.html#Module.mainScriptUrlOrBlob
ONNX_WASM.env.wasm.proxy = false;

// instead of always calling importWasmModule() in wasm-factory.ts, allow to pass down the callback of the Emscripten JS runtime
ONNX_WASM.env.wasm.wasmRuntime = ortWasmRuntime;

// allow to also set a custom Emscripten loader
ONNX_WASM.env.wasm.instantiateWasm = async(imports, onSuccess) => {
  let result;
  if (WebAssembly.instantiateStreaming) {
    // please note that wasmRuntimeBlob comes from user-land code. It may be passed via a MessageChannel
    result = WebAssembly.instantiateStreaming(await wasmRuntimeBlob.arrayBuffer(), imports);
  } else {
    // please note that wasmRuntimeBlob comes from user-land code. It may be passed via a MessageChannel
    result = await WebAssembly.instantiate(await wasmRuntimeBlob.arrayBuffer(), imports)
  }
  return onSuccess(result.instance, result.module)
}

// then continuing as usual
// please note that mlModel comes from user-land code. It may have been passed via a MessageChannel
const modelBuffer = await mlModel.blob.arrayBuffer();
const sessionWasm = await ONNX_WASM.InferenceSession.create(modelBuffer, {
  executionProviders: ["wasm"],
});
console.log("Loading embedding model using sessionWasm", sessionWasm);

So with a one-line change (using the passed-down runtime callback) here, and a one-line change here (adding the instantiateWasm callback reference), the WebAssembly backend should work in service workers as well, if I'm not mistaken in this 4D-chess pseudo-code reverse-engineering game.

Currently, when I call the WASM implementation:

import * as ONNX_WASM from "onnxruntime-web/wasm";

const sessionWasm = await ONNX_WASM.InferenceSession.create(buffer, {
  executionProviders: ["wasm"],
});
console.log("Loading embedding model using sessionWasm", sessionWasm);

Result: (screenshot)

Thank you for your help!

@ChTiSh

ChTiSh commented Jul 7, 2024

I can confirm Web GPU is working for my little chrome extension app as well, but I'm having a problem disabling the warning.

@kyr0

kyr0 commented Jul 8, 2024

@ChTiSh

I can confirm Web GPU is working for my little chrome extension app as well, but I'm having a problem disabling the warning.

You can silence it with a brittle monkey patch...

// store original reference
const originalConsole = self.console;

// override function reference with a new arrow function that does nothing
self.console.error = () => {}

// code will internally call the function that does nothing...
const sessionwebGpu = await ONNX_WEBGPU.InferenceSession.create(buffer, {
  executionProviders: ["webgpu"],
});

// still works, we did only replace the reference for the .error() function
console.log("Loading embedding model using sessionwebGpu", sessionwebGpu);

// restore the original function reference, so that console.error() works just as before
self.console.error = originalConsole.error;

But I agree, it should probably be a console.warn() call if it is intended to be a warning.
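
A possibly less invasive alternative to the monkey patch, assuming the message goes through ORT's own logger and respects the configured level (not verified here):

import * as ONNX_WEBGPU from "onnxruntime-web/webgpu";

// Raise the log threshold so lower-severity messages are not emitted.
ONNX_WEBGPU.env.logLevel = "fatal";

// `buffer` as in the earlier snippet above
const sessionwebGpu = await ONNX_WEBGPU.InferenceSession.create(buffer, {
  executionProviders: ["webgpu"],
});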

@ChTiSh

ChTiSh commented Jul 8, 2024 via email

@kyr0

kyr0 commented Jul 8, 2024

@ChTiSh You're welcome 🫶 Always happy to help :)

@kyr0

kyr0 commented Jul 9, 2024

@fs-eire I'm getting one new error:

ort.webgpu.bundle.min.mjs:6 Uncaught (in promise) Error: The data is not on CPU. Use `getData()` to download GPU data to CPU, or use `texture` or `gpuBuffer` property to access the GPU data directly.
    at get data (ort.webgpu.bundle.min.mjs:6:13062)
    at get data (tensor.js:62:1)

I pushed the code changes to my repo and fixed the call to the tokenizer. To reproduce, just type 1 letter in the chrome extension’s text input and wait

This may be a problem in transformers.js. Could you check whether this problem happens in a normal page? If so, please report the issue to transformers.js. If it's only happening in the service worker, I can take a closer look.

Did the data structures change for the Tensor class? Specifically, dataLocation vs. location? And if so, did it change consistently? I'm facing issues with data being undefined but cpuData being set (tokenizer result). But when I pass the data down to a BERT model, onnxruntime-web seems to expect a different data structure and checks location and data. Am I missing something, or has something changed? Could this lead to downstream issues where code checking for the location and data properties mistakenly believes the data isn't there or isn't in the right place? I linked a downstream issue.
