
Support PluggableDevices #8040

@cromefire

Description


System information

  • TensorFlow.js version (you are using): 4.6.0
  • Are you willing to contribute it (Yes/No): Probably not, I definitely lack the expertise for that if it's not a really simple thing

Describe the feature and the current behavior/state.

Both the TensorFlow Python API and libtensorflow provide the option to load a PluggableDevice driver, such as Intel's extension for TensorFlow. TensorFlow.js on Node.js, by contrast, currently seems to be vendor-locked to NVIDIA GPUs, making it less of a tfjs-node-gpu than a tfjs-node-nvidia.

Will this change the current api? How?

There'd probably be an additional API to load the driver .so, and there might be more kinds of devices available.

Who will benefit with this feature?

Basically anyone who uses tfjs-node with non-NVIDIA hardware that supports PluggableDevices, notably Intel GPUs (Intel also ships optimized code for its CPUs) and Apple devices. AMD, as far as I know, still doesn't have a PluggableDevice driver and instead maintains a custom fork of TensorFlow.

Any Other info.

Although I can't find explicit confirmation (other than the log format suggesting it), tfjs-node seems to be built on libtensorflow, so it looks like you'd "just" need to call the libtensorflow C API (TF_LoadPluggableDeviceLibrary(<lib_path>, status);) to load the driver, and there you go.
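A minimal sketch of what that call could look like, assuming libtensorflow is linked and tensorflow/c/c_api_experimental.h is on the include path; the helper name and the error-handling shape are my own, only TF_LoadPluggableDeviceLibrary and the TF_Status functions come from the C API:

```cpp
#include <cstdio>
#include "tensorflow/c/c_api.h"               // TF_Status helpers
#include "tensorflow/c/c_api_experimental.h"  // TF_LoadPluggableDeviceLibrary

// Hypothetical helper: load a PluggableDevice plugin .so before the
// backend creates its session, so the plugin's device type shows up
// in the TF_DeviceList* enumeration later.
bool LoadPluggableDevice(const char* lib_path) {
  TF_Status* status = TF_NewStatus();
  TF_Library* lib = TF_LoadPluggableDeviceLibrary(lib_path, status);
  const bool ok = (TF_GetCode(status) == TF_OK) && lib != nullptr;
  if (!ok) {
    fprintf(stderr, "Failed to load %s: %s\n", lib_path, TF_Message(status));
  }
  TF_DeleteStatus(status);
  return ok;
}
```

This is only a sketch of the call sequence, not a tested binding change; hooking it into the addon would still require exposing it through N-API.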

Activity


cromefire commented on Oct 28, 2023


You'd probably do it somewhere in here:

static napi_value InitTFNodeJSBinding(napi_env env, napi_value exports) {
  napi_status nstatus;

  TFJSBackend *const backend = TFJSBackend::Create(env);
  ENSURE_VALUE_IS_NOT_NULL_RETVAL(env, backend, nullptr);

  // Store the backend in node's instance data for this addon
  nstatus = napi_set_instance_data(env, backend, &FinalizeTFNodeJSBinding,
                                   nullptr);
  ENSURE_NAPI_OK_RETVAL(env, nstatus, exports);

  // TF version
  napi_value tf_version;
  nstatus = napi_create_string_latin1(env, TF_Version(), -1, &tf_version);
  ENSURE_NAPI_OK_RETVAL(env, nstatus, exports);
But I lack the knowledge of how to pass an argument into that function and how to do error handling in it properly. Alternatively, you could load the driver via a new function on the binding. Either way, I don't know how to add new headers to the build (tensorflow/c/c_api_experimental.h is needed) or how to properly write all that C code (I'm not a C dev, especially not with node-gyp...).

You'd also probably want to handle XPU devices here, since Intel calls their devices XPUs for some reason, and add another flag like isXpuDevice:

// TODO(kreeger): Add better support for this in the future through the JS
// API. https://github.com/tensorflow/tfjs/issues/320
std::string cpu_device_name;
const int num_devices = TF_DeviceListCount(device_list);
for (int i = 0; i < num_devices; i++) {
  const char *device_type =
      TF_DeviceListType(device_list, i, tf_status.status);
  ENSURE_TF_OK(env, tf_status);

  // Keep a reference to the host CPU device:
  if (strcmp(device_type, "CPU") == 0) {
    cpu_device_name =
        std::string(TF_DeviceListName(device_list, i, tf_status.status));
    ENSURE_TF_OK(env, tf_status);
  } else if (strcmp(device_type, "GPU") == 0) {
    device_name =
        std::string(TF_DeviceListName(device_list, i, tf_status.status));
    ENSURE_TF_OK(env, tf_status);
  }
}

// If no GPU devices found, fallback to host CPU:
if (device_name.empty()) {
  device_name = cpu_device_name;
  is_gpu_device = false;
} else {
  is_gpu_device = true;
}
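As a standalone illustration of how that selection loop could be extended, here is a self-contained sketch with a hypothetical XPU branch. The `Selection` struct, `SelectDevice` function, and the (type, name) pairs standing in for `TF_DeviceListType`/`TF_DeviceListName` results are all made up for this example; only the CPU/GPU branches mirror the actual tfjs-node code above:

```cpp
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result of device selection; is_xpu_device is the
// proposed new flag alongside the existing is_gpu_device.
struct Selection {
  std::string device_name;
  bool is_gpu_device = false;
  bool is_xpu_device = false;
};

// Pick an accelerator if one is present, otherwise fall back to the
// host CPU, mirroring the loop in the binding with an added XPU case.
Selection SelectDevice(
    const std::vector<std::pair<const char*, const char*>>& devices) {
  Selection sel;
  std::string cpu_device_name;
  for (const auto& d : devices) {
    if (strcmp(d.first, "CPU") == 0) {
      cpu_device_name = d.second;   // keep host CPU as fallback
    } else if (strcmp(d.first, "GPU") == 0) {
      sel.device_name = d.second;
      sel.is_gpu_device = true;
    } else if (strcmp(d.first, "XPU") == 0) {
      sel.device_name = d.second;   // Intel PluggableDevice type
      sel.is_xpu_device = true;
    }
  }
  if (sel.device_name.empty()) {
    sel.device_name = cpu_device_name;  // no accelerator found
  }
  return sel;
}
```

The real change would of course operate on the `TF_DeviceList` directly; this just shows the branching logic.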

linked a pull request that will close this issue on Feb 28, 2025