
Integrate TFJS kernels into TVM #4

Open · gyagp opened this issue Mar 17, 2023 · 15 comments

gyagp commented Mar 17, 2023

Some kernels in TFJS may further improve the performance of TVM for the time being, and Intel may provide them.

tqchen commented Mar 17, 2023

One item that would be helpful is to provide a nodejs program

tfjs_shader.js --create-shader conv2d_mm

that outputs the shader; we can likely take over from there and start some initial integrations.

tqchen commented Mar 21, 2023

Follow-up comment: one thing to note is that the kernel can be shape-dependent. So something like the following, where we pass in the input shape spec as well, could be more helpful; that way the tfjs side will be able to pick out the related kernels:

tfjs_shader.js --create-shader conv2d_mm --shapes "[224,224,3], [32, 32, 4, 4]"
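As a concrete (hypothetical) sketch, such an entry point might parse its arguments like this; tfjs_shader.js and both flags are the proposed interface above, not an existing tool:

// Hypothetical sketch of the proposed tfjs_shader.js argument handling.
const args = process.argv.slice(2);
const kernel = args[args.indexOf('--create-shader') + 1];
// The --shapes value above is a comma-separated list of arrays, so wrap
// it in brackets to parse it as one JSON array of shapes.
const shapes = JSON.parse('[' + args[args.indexOf('--shapes') + 1] + ']');
// A shader lookup keyed by kernel name and shapes would go here.
console.log(kernel, shapes);  // e.g. conv2d_mm [ [224,224,3], [32,32,4,4] ]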

tqchen commented Mar 23, 2023

@qjia7 please let me know if the new additional shape config would be sufficient for shader dumping.

qjia7 commented Mar 27, 2023

@tqchen We are preparing PRs to dump shaders. The plan is to add a string flag (the dumped kernel name) to tell the backend which kernel to dump. The first PR is here.

tqchen commented Mar 31, 2023

Thank you @qjia7! This seems to be a great step.

It would be super nice to avoid creating the tfjs tensor and directly pass in the shape spec; that would enable quite natural integration, as the command above shows.

axinging commented

@tqchen, for the webgpu backend, printing the shader is now behind a flag, WEBGPU_PRINT_SHADER (tensorflow/tfjs#7523). Here are examples.

Print shader in non-model mode

Open the page below with URLs like index.html?WEBGPU_PRINT_SHADER=all, index.html?WEBGPU_PRINT_SHADER=binary, or index.html?WEBGPU_PRINT_SHADER=binary,depth:

async function testWebGPUPrintShader() {
  tf.env().set('WEBGPU_CPU_FORWARD', false);
  await tf.setBackend('webgpu');
  await tf.ready();
  const re = getURLState(location.search);
  tf.env().set('WEBGPU_PRINT_SHADER', re);
  console.log(tf.env().get('WEBGPU_PRINT_SHADER'));
  // depthwise, matches 'depth'.
  {
    const fSize = 2;
    const pad = 'valid';
    const stride = 1;
    const chMul = 1;
    const inDepth = 1;

    const x = tf.tensor4d(
        [
          0.230664, 0.987388, 0.0685208, 0.419224, 0.887861, 0.731641,
          0.0741907, 0.409265, 0.351377
        ],
        [1, 3, 3, inDepth]);
    const w = tf.tensor4d(
        [0.303873, 0.229223, 0.144333, 0.803373],
        [fSize, fSize, inDepth, chMul],
    );

    const result = tf.depthwiseConv2d(x, w, stride, pad);
  }

  // add (also sub, mul) matches 'binary'. Full binary list: https://github.com/tensorflow/tfjs/blob/master/tfjs-backend-webgpu/src/binary_op_util.ts
  {
    const a = tf.tensor2d([1, 2], [1, 2]);
    const b = tf.tensor2d([1, 2], [1, 2]);
    const c = tf.add(a, b);
  }

  // maxPool, matches 'pool'.
  {
    const x = tf.tensor3d([1, 2, 3, 4, 5, 6, 7, 9, 8], [3, 3, 1]);

    const result = tf.maxPool(x, 2, 1, 0);
  }
}

function getURLState(url) {
  let params = new URLSearchParams(url);
  const keys = [...params.keys()];
  if (keys.length === 0) return '';
  let printShaderString = '';
  if (params.has('WEBGPU_PRINT_SHADER')) {
    printShaderString = params.get('WEBGPU_PRINT_SHADER');
  }
  return printShaderString;
}

Print shader in model mode

If you want to try this on a model, you can put this and this under tfjs\e2e\benchmarks\local-benchmark. Set up a web server, then use a URL like:

https://127.0.0.1:8080/tfjs//e2e/benchmarks/local-benchmark/index_model.html?WEBGPU_PRINT_SHADER=binary

tqchen commented Apr 10, 2023

Thank you! Is it possible to install tfjs as a nodejs dependency and print using nodejs? That would allow some native integration with python packages that leverage this.

axinging commented

@tqchen I will look into how to make this work on node, and will update when there is any progress.

tqchen commented Apr 16, 2023

cc @Hzfengsy

axinging commented

@tqchen, if you want to quickly try printing shaders with webgpu on nodejs, I drafted a document here:
https://github.com/axinging/webgpu-node/tree/main/tfjsmodel-on-external-node

Please note: currently some webgpu APIs are not fully supported in dawn, so this can only be used to dump shaders; the predict results are invalid. BTW, I will see if there is any opportunity to upstream these changes and make the usage simpler.
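As a rough illustration of the flow in that document, a Node script to dump a 'binary' shader might look like the sketch below; it assumes navigator.gpu has already been wired up through a dawn binding as described there, which is an assumption rather than a supported setup:

// Minimal sketch, assuming a WebGPU implementation (e.g. a dawn binding)
// is already exposed to tfjs under Node; only useful for shader dumping.
const tf = require('@tensorflow/tfjs');
require('@tensorflow/tfjs-backend-webgpu');

async function dumpBinaryShader() {
  tf.env().set('WEBGPU_PRINT_SHADER', 'binary');
  await tf.setBackend('webgpu');
  await tf.ready();
  // Run a binary op so its WGSL shader is built and printed.
  const c = tf.add(tf.tensor1d([1, 2]), tf.tensor1d([3, 4]));
  await c.data();  // forces execution; the numeric result may be invalid here
}

dumpBinaryShader();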

tqchen commented Apr 17, 2023

Actually, we don't need to see the prediction result; instead, it would be great to simply get the shaders without running the prediction, or even the webgpu API, since we are on the compilation and packaging side.

axinging commented Apr 17, 2023

Hi @tqchen @Hzfengsy, I drafted a design doc about shader dumping here:
https://github.com/webatintel/tvm-web/blob/main/TFJS%20WebGPU%20dump%20shader%20design.md
Could you please help review it and clarify your detailed requirements?

tqchen commented Apr 17, 2023

Thank you @axinging. What we want is the ability to get the WGSL shader code without executing it; effectively, a lookup feature.

My understanding is that most of the execution contains two parts of logic (which may be coupled together):

  • S0: generate the shader code based on the workload and return the shader string
  • S1: compile/cache and run the shader code to get the final result

Let me use the following code to show the intent of the logic:

interface InputSpec {
  shapes: Array<Array<number>>;
}

// Get the shader string based on the key and input shapes (in spec).
function getShader(key: string, spec: InputSpec): string {
  // Dispatch on the shapes, e.g. return a specialized variant when
  // spec.shapes[0] matches some pattern.
  if (matchesSomePattern(spec.shapes[0])) {
    return shader0;
  } else {
    // ...
  }
}

function matmul(input: Tensor, w: Tensor) {
  const shader = getShader("matmul", {shapes: [input.shape, w.shape]});
  const output = allocOutput(/* ... */);
  // Abstract code for compiling the shader.
  const pipeline = compile(shader);
  // ...
  submit(pipeline, [input, w], output);
}

What we need is the ability to directly call getShader(key: string, spec: InputSpec): string by passing in the spec. Note that the definition of the input spec can change depending on the implementation.

Being able to call this function programmatically from nodejs, like node tfjsmodel --get-shader conv2d "[[3, 224, 224], [3,3]]", will enable us to pattern-match conv2d and automatically replace our impl with the kernel. A rough sketch of such an entry point follows below.
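This is purely a sketch, under the assumption that the getShader lookup from the pseudocode above were exported; both the script and getShader are hypothetical:

// Hypothetical: node tfjsmodel --get-shader conv2d "[[3, 224, 224], [3,3]]"
// getShader is the lookup sketched above, not an existing tfjs API.
const args = process.argv.slice(2);
const idx = args.indexOf('--get-shader');
const key = args[idx + 1];
const shapes = JSON.parse(args[idx + 2]);
const wgsl = getShader(key, {shapes: shapes});
process.stdout.write(wgsl);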

pyu10055 commented

@tqchen is there a way to incorporate the TFJS backend into the TVM runtime instead of relying on the AOT shader copy?

tqchen commented Apr 27, 2023

Thanks @pyu10055. The main goal is that we would like to be able to do development through the python env and recompose the solutions. This would be an orthogonal path from the tvmjs runtime integration that uses tfjs as a graph exec provider, which I think would also be valuable.
