[remoting] Rebase on top of b6945 #15

kpouget · 2025-11-04T14:26:20Z

Summary by CodeRabbit

New Features
- Added remote GPU compute capabilities enabling distributed inference through remoting backend and frontend integration with Vulkan/virtgpu infrastructure.
Chores
- Added new build scripts and CMake configurations for remote compute targets.
- Consolidated project ownership structure via OWNERS file.

…timize

coderabbitai · 2025-11-04T14:26:37Z

Caution

Review failed

Failed to post review comments

Walkthrough

This PR introduces a comprehensive GGML remoting backend and frontend infrastructure, enabling distributed tensor computation across virtgpu. Adds dispatched backend handling, RPC serialization protocols, frontend buffer/device management, DRM kernel interfaces, and build/run scripts. Disables verbose logging in llama.cpp and adds timing instrumentation.

Changes

Cohort / File(s)	Summary
Build system & configuration `.gitignore`, `CMakePresets.json`, `OWNERS`	Updated .gitignore with build path patterns; added remoting frontend/backend CMake presets; created OWNERS file with approvers/reviewers
Build & preparation scripts `build.sh`, `build.backend.sh`, `build.remoting.sh`, `build.vulkan.sh`, `prepare.sh`, `prepare.backend.sh`, `prepare.remoting.sh`, `prepare.vulkan.sh`, `podman_compile.sh`	Added parallelized build scripts for remoting backend/frontend, Vulkan; preparation scripts configure CMake for different backends with feature flags; podman script orchestrates container-based builds
Run/execution scripts `run.ramalama.sh`, `run.remoting.sh`, `run.vulkan.sh`	Added execution scripts with environment setup for Vulkan ICD, remoting backend selection (bench/perf/normal modes), and debugging tool prefixes
GGML backend registration & dispatch `ggml/CMakeLists.txt`, `ggml/src/CMakeLists.txt`, `ggml/src/ggml-backend-reg.cpp`, `ggml/include/ggml-remoting-frontend.h`	Added GGML_REMOTING_FRONTEND/BACKEND options, registered RemotingFrontend/RemotingBackend; exposed remoting frontend public header with registration function
Metal backend remoting `ggml/src/ggml-metal/CMakeLists.txt`, `ggml/src/ggml-metal/ggml-metal-context.m`, `ggml/src/ggml-metal/ggml-metal-device.m`, `ggml/src/ggml-metal/ggml-metal-remoting.cpp`	Added Metal remoting support file to backend; enabled graph optimization timing; disabled debug logging in pipeline compilation; exposed Metal device context retrieval API
Remoting backend (server-side) `ggml/src/ggml-remotingbackend/CMakeLists.txt`, `ggml/src/ggml-remotingbackend/backend-*.{cpp,h}`, `ggml/src/ggml-remotingbackend/shared/{api_remoting.h,apir_backend.h,venus_cs.h,venus_cs_ggml.{h,cpp}}`	Comprehensive backend dispatcher with device/buffer-type/buffer/Metal command handlers; RPC protocol definitions; binary encoding/decoding for tensor serialization; graph construction from RPC payloads
Remoting frontend (client-side) `ggml/src/ggml-remotingfrontend/CMakeLists.txt`, `ggml/src/ggml-remotingfrontend/ggml-*.{cpp,h}`, `ggml/src/ggml-remotingfrontend/virtgpu-*.{cpp,h}`	Frontend buffer/device/Metal operations; virtgpu integration with DRM IOCTLs; shared memory management; remote call lifecycle (prepare/dispatch/finish)
DRM kernel UAPI headers `ggml/src/ggml-remotingfrontend/include/drm-uapi/{drm.h,virtgpu_drm.h}`, `include/venus_hw.h`	Complete DRM and Virtio-GPU userspace API definitions; capability structures; IOCTL codes
Utility infrastructure `ggml/src/ggml-remotingfrontend/virtgpu-utils.{cpp,h}`	Sparse hierarchical array implementation, logging/debug helpers, alignment/atomic utilities, linked-list structures
Logging suppression in llama.cpp `src/llama-context.cpp`, `src/llama-kv-cache.cpp`, `src/llama-model-loader.cpp`, `src/llama-model.cpp`, `src/llama-vocab.cpp`	Commented out verbose debug/info logs for metadata, tensor loading, and context info; added early returns in print_info functions
Performance instrumentation `tools/run/run.cpp`	Added timing instrumentation for token generation with throughput reporting (tokens/sec)

Sequence Diagram(s)

sequenceDiagram
    participant Frontend as GGML Frontend
    participant Dispatcher as Backend Dispatcher
    participant GPU as Remote GPU
    participant Host as Host Virtgpu

    Frontend->>Host: create_virtgpu() - handshake & load backend library
    Host-->>Frontend: virtgpu handle, capset, shmem regions
    
    Note over Frontend: Graph Compute Flow
    Frontend->>Frontend: serialize_graph(cgraph)
    Frontend->>Host: remote_call_prepare(GRAPH_COMPUTE)
    Host-->>Frontend: encoder + decoder
    
    Frontend->>Frontend: encode cgraph to shmem
    Frontend->>Host: remote_call() - send command
    Host->>Dispatcher: apir_backend_dispatcher(cmd_type, encoded_data)
    Dispatcher->>GPU: backend_graph_compute(cgraph)
    GPU-->>Dispatcher: ggml_status result
    Dispatcher->>Host: encode status
    Host-->>Frontend: remote_call result
    
    Frontend->>Frontend: deserialize result
    Frontend->>Host: remote_call_finish()
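
The prepare/dispatch/finish lifecycle in the diagram above can be sketched as a small in-process mock. This is an illustration under stated assumptions, not the PR's code: `sketch::RemoteCall`, `HostDispatch`, `fake_host`, and `run_graph_compute` are hypothetical names, a byte vector stands in for the shared-memory region, and the real `remote_call_prepare`/`remote_call`/`remote_call_finish` functions may take different arguments.

```cpp
// Illustrative sketch only: every name here is hypothetical and a byte vector
// stands in for the shmem region used by the real frontend.
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

namespace sketch {

enum class Status : int32_t { Success = 0, Failed = 1 };

// Stand-in for the host side: takes a command id plus encoded request bytes
// and returns encoded reply bytes.
using HostDispatch =
    std::function<std::vector<uint8_t>(uint32_t, const std::vector<uint8_t> &)>;

class RemoteCall {
public:
    RemoteCall(HostDispatch host, uint32_t cmd) : host_(std::move(host)), cmd_(cmd) {}

    // "prepare": start from empty request and reply buffers.
    void prepare() { request_.clear(); reply_.clear(); }

    // Append a trivially copyable value to the request.
    template <typename T> void encode(const T &value) {
        const auto *p = reinterpret_cast<const uint8_t *>(&value);
        request_.insert(request_.end(), p, p + sizeof(T));
    }

    // "dispatch": hand the request to the host and keep its reply.
    void dispatch() {
        assert(host_ && "a host dispatch function must be set");
        reply_ = host_(cmd_, request_);
    }

    // Decode the status from the reply; a short reply is reported as a
    // failure instead of being read out of bounds.
    Status decode_status() const {
        if (reply_.size() < sizeof(int32_t)) return Status::Failed;
        int32_t raw = 0;
        std::memcpy(&raw, reply_.data(), sizeof(raw));
        return raw == 0 ? Status::Success : Status::Failed;
    }

    // "finish": release per-call state.
    void finish() { request_.clear(); reply_.clear(); }

private:
    HostDispatch host_;
    uint32_t cmd_;
    std::vector<uint8_t> request_;
    std::vector<uint8_t> reply_;
};

// Fake host: succeeds only for command 1 carrying an 8-byte payload.
inline std::vector<uint8_t> fake_host(uint32_t cmd, const std::vector<uint8_t> &req) {
    const int32_t status = (cmd == 1 && req.size() == sizeof(uint64_t)) ? 0 : 1;
    std::vector<uint8_t> out(sizeof(status));
    std::memcpy(out.data(), &status, sizeof(status));
    return out;
}

// One full round trip: prepare, encode, dispatch, decode, finish.
inline Status run_graph_compute(uint64_t graph_size) {
    RemoteCall call(fake_host, /*cmd=*/1);
    call.prepare();
    call.encode(graph_size);
    call.dispatch();
    const Status status = call.decode_status();
    call.finish();
    return status;
}

} // namespace sketch
```

Keeping decode before finish mirrors the diagram: the reply is only valid between the dispatch and the release of the per-call state.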

sequenceDiagram
    participant App as Application
    participant FrontBuf as Frontend Buffer
    participant RemoteCall as Remote Backend
    participant GPU as GPU Device
    participant SharedMem as Shared Memory

    App->>FrontBuf: set_tensor(tensor, data)
    FrontBuf->>SharedMem: allocate/use shmem region
    FrontBuf->>FrontBuf: encode tensor + shmem_id
    FrontBuf->>RemoteCall: apir_buffer_set_tensor(encoded)
    
    RemoteCall->>SharedMem: resolve shmem pointer
    RemoteCall->>GPU: dma/copy data to GPU buffer
    GPU-->>RemoteCall: done
    RemoteCall-->>FrontBuf: status
    
    FrontBuf-->>App: set_tensor complete

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Justification: This is a substantial architectural addition with high complexity and heterogeneity:

Scope: ~80 new/modified files across three major components (backend, frontend, build system)
Density: Dense logic in dispatchers, RPC serialization, buffer management, and virtgpu integration
Heterogeneity: Diverse patterns—DRM IOCTLs, VN encoding/decoding protocol, GGML tensor serialization, CMake target configuration
Critical infrastructure: Remote procedure call mechanism, shared memory management, and cross-process tensor graph execution require careful review
Language variety: Mix of C++, C, Objective-C, shell scripts, and CMake

Areas requiring extra attention:

RPC serialization protocol (venus_cs.h, venus_cs_ggml-rpc*.cpp): Custom binary encoding/decoding with bounds checks and overflow validation
Shared memory lifecycle (virtgpu-shm.cpp/h): Allocation, mapping, deallocation patterns; sparse array indexing
Buffer tracking and synchronization (backend-dispatched-buffer.cpp, ggml-backend-buffer.cpp): Proper lifetime management across frontend/backend boundary
Graph serialization/deserialization (venus_cs_ggml-rpc*.cpp): Tensor dependency reconstruction with buffer validation
Remote call dispatch table (backend-dispatched.h): Command routing correctness and bounds checking
Metal device context handling (ggml-metal-remoting.cpp, virtgpu-forward-metal.cpp): Capability query correctness
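
As a reference point for the bounds-check and overflow-validation items above, here is a minimal sketch of a decoder that refuses to read past its buffer. It is not the `venus_cs.h` API: `sketch::Decoder`, `decoder_read`, and `decode_u32_array` are hypothetical names, and the count-prefixed array layout is an assumption made for the example.

```cpp
// Illustrative sketch only: names and wire layout are hypothetical.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch {

struct Decoder {
    const uint8_t *cur;
    const uint8_t *end;
    bool fatal; // sticky: once set, every later read fails

    Decoder(const uint8_t *data, size_t size) : cur(data), end(data + size), fatal(false) {
        assert(data != nullptr);
    }
};

// Copy `size` bytes out of the stream. On underflow the sticky error flag is
// set and the output is zeroed so callers never see stale data.
inline bool decoder_read(Decoder &dec, void *out, size_t size) {
    if (size == 0) return !dec.fatal;
    const size_t remaining = static_cast<size_t>(dec.end - dec.cur);
    if (dec.fatal || size > remaining) {
        dec.fatal = true;
        std::memset(out, 0, size);
        return false;
    }
    std::memcpy(out, dec.cur, size);
    dec.cur += size;
    return true;
}

// Decode a uint64_t element count followed by that many uint32_t values.
// The count is validated against the caller's capacity and against
// multiplication overflow before any payload byte is touched.
inline bool decode_u32_array(Decoder &dec, uint32_t *out, size_t capacity, size_t *count_out) {
    uint64_t count = 0;
    if (!decoder_read(dec, &count, sizeof(count))) return false;
    if (count > capacity || count > SIZE_MAX / sizeof(uint32_t)) {
        dec.fatal = true;
        return false;
    }
    *count_out = static_cast<size_t>(count);
    return decoder_read(dec, out, static_cast<size_t>(count) * sizeof(uint32_t));
}

} // namespace sketch
```

The sticky flag lets a handler issue several reads in a row and check for failure once at the end, which is a common pattern in command-stream decoders.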

Possibly related PRs

remoting: improve the frontend<>backend error handling #7: Overlapping changes to remoting API definitions, backend initialization flow, and remote call protocol (apir_backend_initialize, api_remoting.h enums)
remoting: improve the failure when no virtgpu is available #6: Related edits to remoting build scripts and virtgpu error handling pathways
OWNERS: add file for openshift CI #3: Adds the same OWNERS file with identical maintainer list

Suggested reviewers

cfergeau
praveenkumar

Poem

🐰 A rabbit's remote dream, so grand and sublime,
Dispatchers and buffers, all coded with time,
Through virtgpu's tunnel, the tensors now flow,
From frontend to backend, a virtuoso show! ✨
With serialized graphs and shared memory deep,
The compute remotes forth—innovations to reap!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The pull request description is entirely empty; no description was provided by the author despite a template being available in the repository.	Add a comprehensive description explaining the purpose of the rebase, what changes are included, and why this rebase is necessary. Reference the template in CONTRIBUTING.md.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[remoting] Rebase on top of b6945' is specific and clearly indicates a rebase operation on a commit identifier, directly related to the changeset.

openshift-ci · 2025-11-04T14:26:38Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kpouget · 2025-11-04T14:27:45Z

/test topsail
/cluster mac5

coderabbitai

Actionable comments posted: 51

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tools/run/run.cpp (1)

1175-1199: Call stop_timer() on every exit path.

We start the timer at the top of each loop iteration, but exit paths (decode failure, EOG break, token-to-string failure) return or break before the trailing stop_timer(). Those iterations never increment timer_total/timer_count, so the stats under-report work and may leave the last start_timer() unmatched. Stop the timer before each early exit.

-        if (llama_decode(llama_data.context.get(), batch)) {
+        if (llama_decode(llama_data.context.get(), batch)) {
+            stop_timer();
             printe("failed to decode\n");
             return 1;
         }
 
         // sample the next token, check is it an end of generation?
         new_token_id = llama_sampler_sample(llama_data.sampler.get(), llama_data.context.get(), -1);
-        if (llama_vocab_is_eog(vocab, new_token_id)) {
+        if (llama_vocab_is_eog(vocab, new_token_id)) {
+            stop_timer();
             break;
         }
 
         std::string piece;
-        if (convert_token_to_string(vocab, new_token_id, piece)) {
+        if (convert_token_to_string(vocab, new_token_id, piece)) {
+            stop_timer();
             return 1;
         }
 
         print_word_and_concatenate_to_response(piece, response);
 
         // prepare the next batch with the sampled token
         batch = llama_batch_get_one(&new_token_id, 1);
         stop_timer();
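
An alternative to adding `stop_timer()` before each early exit is an RAII scope guard, which stops the timer whenever the loop body is left. The sketch below is a swapped-in technique, not the PR's timer: `sketch::TimerStats`, `ScopedTimer`, and the toy `generate` loop are hypothetical, and the real `start_timer`/`stop_timer` helpers may keep their counters differently.

```cpp
// Illustrative sketch only: a scope guard replacing manual stop_timer() calls.
#include <cassert>
#include <chrono>
#include <cstdint>

namespace sketch {

struct TimerStats {
    int64_t total_ns = 0;
    int64_t count = 0;
};

// Starts timing on construction and records the elapsed time on destruction,
// so return, break, and normal fall-through are all accounted for.
class ScopedTimer {
public:
    explicit ScopedTimer(TimerStats &stats)
        : stats_(stats), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        stats_.total_ns +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        stats_.count += 1;
    }

    ScopedTimer(const ScopedTimer &) = delete;
    ScopedTimer &operator=(const ScopedTimer &) = delete;

private:
    TimerStats &stats_;
    std::chrono::steady_clock::time_point start_;
};

// Toy generation loop: a negative token simulates a decode failure and a
// zero token simulates end-of-generation.
inline int generate(const int *tokens, int n, TimerStats &stats) {
    for (int i = 0; i < n; ++i) {
        ScopedTimer timer(stats); // one measurement per iteration
        if (tokens[i] < 0) return 1; // failure path is still timed
        if (tokens[i] == 0) break;   // end-of-generation path is still timed
    }
    return 0;
}

} // namespace sketch
```

With this shape, adding a new exit path later cannot silently skip the measurement.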

🧹 Nitpick comments (19)

ggml/src/ggml-remotingfrontend/include/venus_hw.h (1)
30-33: Consider documenting the version fields.

The first four fields lack inline comments explaining their purpose and valid value ranges. While the names are descriptive, brief documentation would improve maintainability, especially for protocol version compatibility checks.

Example:
 struct virgl_renderer_capset_venus {
+   /* Wire protocol format version for Venus commands */
    uint32_t wire_format_version;
+   /* Vulkan XML specification version */
    uint32_t vk_xml_version;
+   /* VK_EXT_command_serialization specification version */
    uint32_t vk_ext_command_serialization_spec_version;
+   /* VK_MESA_venus_protocol specification version */
    uint32_t vk_mesa_venus_protocol_spec_version;
ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)
83-94: Consider adding [[noreturn]] attribute to FATAL.

Since FATAL always calls abort(), it would be beneficial to mark it with the [[noreturn]] attribute to help the compiler with optimization and dead code analysis.

Apply this diff:
+[[noreturn]]
 inline void
 FATAL(const char *format, ...) {
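
To illustrate what the attribute buys, here is a minimal sketch with a hypothetical `sketch::fatal` standing in for `FATAL`; the real helper's body may differ. Because the function is marked `[[noreturn]]`, the compiler accepts a value-returning function whose last statement is the fatal call, without a dummy return.

```cpp
// Illustrative sketch only: fatal() and level_name() are hypothetical.
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstdlib>

namespace sketch {

// Print a formatted message to stderr and abort; never returns.
[[noreturn]] inline void fatal(const char *format, ...) {
    va_list args;
    va_start(args, format);
    std::vfprintf(stderr, format, args);
    va_end(args);
    std::fputc('\n', stderr);
    std::abort();
}

// No return statement is needed after the switch: the compiler knows that
// control cannot flow past fatal().
inline const char *level_name(int level) {
    switch (level) {
        case 0: return "info";
        case 1: return "warning";
        case 2: return "error";
    }
    fatal("unknown log level %d", level);
}

} // namespace sketch
```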
96-100: Consider removing redundant wrapper function.

util_is_power_of_two_nonzero64 is a simple wrapper around the IS_POT_NONZERO macro with no additional logic. Consider using the macro directly or documenting why the function wrapper is necessary.
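
If the wrapper is kept, one possible justification is type safety: a `constexpr` function pins the argument to 64 bits and can be checked at compile time. The sketch below is an assumption about intent, with a hypothetical name, and does not reproduce the `IS_POT_NONZERO` macro.

```cpp
// Illustrative sketch only: a typed constexpr alternative to a macro.
#include <cassert>
#include <cstdint>

namespace sketch {

// True when exactly one bit of v is set, i.e. v is a nonzero power of two.
constexpr bool is_power_of_two_nonzero64(uint64_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Compile-time checks document the contract next to the definition.
static_assert(is_power_of_two_nonzero64(1), "1 is 2^0");
static_assert(is_power_of_two_nonzero64(uint64_t{1} << 63), "top bit alone is a power of two");
static_assert(!is_power_of_two_nonzero64(0), "zero is rejected");
static_assert(!is_power_of_two_nonzero64(12), "12 has two bits set");

} // namespace sketch
```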
src/llama-model.cpp (1)
2256-2256: Keep debug logs; they’re already gated by LLAMA_LOG_DEBUG.

Commenting them out removes useful diagnostics for device assignment/splits. Since these are debug-level, they won’t spam unless enabled. Recommend restoring them.
-            //LLAMA_LOG_DEBUG("load_tensors: layer %3d assigned to device %s, is_swa = %d\n", il, ggml_backend_dev_name(cpu_dev), is_swa);
+            LLAMA_LOG_DEBUG("load_tensors: layer %3d assigned to device %s, is_swa = %d\n", il, ggml_backend_dev_name(cpu_dev), is_swa);
@@
-        //LLAMA_LOG_DEBUG("load_tensors: layer %3d assigned to device %s, is_swa = %d\n", il, ggml_backend_dev_name(dev), is_swa);
+        LLAMA_LOG_DEBUG("load_tensors: layer %3d assigned to device %s, is_swa = %d\n", il, ggml_backend_dev_name(dev), is_swa);
Also applies to: 2262-2262
ggml/src/ggml-remotingfrontend/include/drm-uapi/drm.h (1)

1-64: Document the kernel source and version of this vendored UAPI header.

This appears to be a DRM UAPI header copied from the Linux kernel. It's important to document:

Which kernel version this header is from

The update policy for keeping it in sync with kernel changes

Why vendoring is preferred over using system headers

Consider adding a comment at the top indicating this is from kernel UAPI and the specific version/commit.

ggml/src/ggml-remotingfrontend/include/drm-uapi/virtgpu_drm.h (1)

1-38: Document kernel version and note dependency on drm.h.

Similar to drm.h, this vendored virtgpu UAPI header should document its kernel source version. Additionally, this header depends on drm.h which has a missing drm_mode.h include (see drm.h review comments).
run.vulkan.sh (2)
12-12: Remove commented-out code.

Dead code should be removed rather than left commented out. Use version control to recover it if needed.

Apply this diff:
-#rm -f /usr/lib64/libvulkan_virtio.so
-
16-16: Make MESA_FLAVOR configurable.

The MESA_FLAVOR variable is hardcoded to "good". Consider making this configurable via environment variable or command-line argument to support different testing scenarios.

Apply this diff:
-MESA_FLAVOR=good
+MESA_FLAVOR="${MESA_FLAVOR:-good}"
ggml/src/ggml-remotingfrontend/virtgpu-forward-impl.h (1)

7-8: Clarify or remove incomplete CACHED macro.

The CACHED macro is defined as empty with a commented-out placeholder. If this is for future use, consider adding a TODO comment explaining its purpose. If it's not needed, remove it.

src/llama-vocab.cpp (3)

2360-2361: Debug log suppression is fine; prefer a feature toggle.

Commenting out LLAMA_LOG_DEBUG reduces diagnosability. Consider guarding with a compile-time flag or env-driven verbosity check instead of hard-disabling.

3207-3207: Early-return disables print_info entirely.

This short-circuits all vocab diagnostics. Recommend gating via a runtime flag (e.g., LLAMA_QUIET=1) or compile-time option so developers can re-enable when needed.

3573-3573: Same as above: print_info() is now a no-op.

Apply the same gating approach to allow opt-in diagnostics.
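
A runtime gate of the kind suggested above could look like the following sketch. `LLAMA_QUIET` is only the name floated in this comment, not an existing llama.cpp option, and `sketch::flag_enabled`, `quiet_mode`, and the stand-in `print_info` are hypothetical.

```cpp
// Illustrative sketch only: an environment-driven quiet flag.
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

namespace sketch {

// Unset, empty, or "0" means the flag is off; anything else turns it on.
inline bool flag_enabled(const char *value) {
    return value != nullptr && value[0] != '\0' && std::strcmp(value, "0") != 0;
}

// Read the environment once and cache the answer for later calls.
inline bool quiet_mode() {
    static const bool quiet = flag_enabled(std::getenv("LLAMA_QUIET"));
    return quiet;
}

// Stand-in for a print_info()-style function. It returns the number of lines
// written so the gating is observable.
inline int print_info(bool quiet) {
    if (quiet) {
        return 0; // diagnostics suppressed on request, not unconditionally
    }
    std::fprintf(stderr, "print_info: vocab type = (example)\n");
    std::fprintf(stderr, "print_info: n_vocab    = (example)\n");
    return 2;
}

} // namespace sketch
```

A caller would write `sketch::print_info(sketch::quiet_mode())`, so the diagnostics stay available by default and disappear only when the variable is set.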
prepare.vulkan.sh (1)
1-7: Missing shebang and strict mode; add for robustness.

Shellcheck SC2148 applies. Also suggest set -euo pipefail.

Apply this diff:
+#!/usr/bin/env bash
+set -euo pipefail
 cmake -S . \
       -B ../build.vulkan \
       -DGGML_VULKAN=ON \
       -DGGML_NATIVE=OFF \
       -DGGML_METAL=OFF \
       -DLLAMA_CURL=OFF \
       -DCMAKE_BUILD_TYPE=Debug
src/llama-model-loader.cpp (2)

682-709: Log lines disabled; prefer a controllable verbosity switch.

Commenting out LLAMA_LOG_INFO/DEBUG reduces helpful diagnostics. Gate via env var (e.g., LLAMA_VERBOSE_META) or compile-time option instead of hard-disabling.

Also applies to: 793-794

1160-1161: print_info() early-return suppresses all loader metadata.

Consider honoring a quiet flag rather than unconditional return, to aid debugging when needed.
build.backend.sh (1)
13-13: Consider separating declaration and assignment to catch errors.

Combining export with command substitution can mask failures from the subcommand.

Apply this diff to separate the operations:
-export SDKROOT=$(xcrun --sdk macosx --show-sdk-path)
+SDKROOT=$(xcrun --sdk macosx --show-sdk-path)
+export SDKROOT
podman_compile.sh (1)
34-34: Consider mounting only the required directory instead of entire $HOME.

Mounting the entire home directory gives the container broad access to user data. For better security isolation, mount only the workspace directory needed for the build.

For example, if only the current project directory is needed:
 --env HOME="$HOME" \
 --env PERF_MODE="${PERF_MODE:-}" \
 --env BENCH_MODE="${BENCH_MODE:-}" \
--v "$HOME":"$HOME":Z \
+-v "$PWD":"$PWD":Z \
 -w "$PWD" \
ggml/src/ggml-remotingfrontend/CMakeLists.txt (1)
27-29: Use pkg-config or find_package instead of hardcoding system paths.

The hardcoded path /usr/include/libdrm/ and the Fedora-specific dnf install comment reduce portability. Different Linux distributions and macOS may install libdrm in different locations.

Apply this diff to use pkg-config for better portability:
 # dnf install -y libdrm-devel
-target_link_libraries(ggml-remotingfrontend PUBLIC drm)
-target_include_directories(ggml-remotingfrontend PUBLIC /usr/include/libdrm/)
+find_package(PkgConfig REQUIRED)
+pkg_check_modules(DRM REQUIRED libdrm)
+target_link_libraries(ggml-remotingfrontend PUBLIC ${DRM_LIBRARIES})
+target_include_directories(ggml-remotingfrontend PUBLIC ${DRM_INCLUDE_DIRS})
ggml/src/ggml-remotingbackend/backend-dispatched.h (1)
56-65: Fix command name strings to match handlers.

Several command-name strings drop or misplace the reg/device component of the handler name (e.g., line 56 returns "backend_get_device_count" while the handler is backend_reg_get_device_count, and "backend_get_device_name" stands in for backend_device_get_name). When this helper is used for tracing or diagnostics, the mismatches make it hard to map logs back to the actual dispatchers. Please align the returned strings with the real handler names.

Apply this diff:
-  case APIR_COMMAND_TYPE_DEVICE_GET_COUNT: return "backend_get_device_count";
-  case APIR_COMMAND_TYPE_DEVICE_GET_NAME: return "backend_get_device_name";
-  case APIR_COMMAND_TYPE_DEVICE_GET_DESCRIPTION: return "backend_get_device_description";
-  case APIR_COMMAND_TYPE_DEVICE_GET_MEMORY: return "backend_get_device_memory";
-  case APIR_COMMAND_TYPE_DEVICE_GET_BUFFER_TYPE: return "backend_get_buffer_type";
-  case APIR_COMMAND_TYPE_DEVICE_GET_PROPS: return "backend_get_props";
-  case APIR_COMMAND_TYPE_DEVICE_BUFFER_FROM_PTR: return "backend_buffer_from_ptr";
+  case APIR_COMMAND_TYPE_DEVICE_GET_COUNT: return "backend_reg_get_device_count";
+  case APIR_COMMAND_TYPE_DEVICE_GET_NAME: return "backend_device_get_name";
+  case APIR_COMMAND_TYPE_DEVICE_GET_DESCRIPTION: return "backend_device_get_description";
+  case APIR_COMMAND_TYPE_DEVICE_GET_MEMORY: return "backend_device_get_memory";
+  case APIR_COMMAND_TYPE_DEVICE_GET_BUFFER_TYPE: return "backend_device_get_buffer_type";
+  case APIR_COMMAND_TYPE_DEVICE_GET_PROPS: return "backend_device_get_props";
+  case APIR_COMMAND_TYPE_DEVICE_BUFFER_FROM_PTR: return "backend_device_buffer_from_ptr";
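
One way to keep names and handlers from drifting apart again is to generate both from a single list. The sketch below uses an X-macro for that; it is a suggestion under stated assumptions, not the PR's dispatch table. The handler signatures are simplified to `uint32_t()`, and `SKETCH_COMMANDS`, `kCommands`, `command_name`, `command_handler`, and `find_command` are hypothetical names.

```cpp
// Illustrative sketch only: handler bodies and signatures are placeholders.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch {

inline uint32_t backend_reg_get_device_count() { return 1; }
inline uint32_t backend_device_get_name() { return 2; }
inline uint32_t backend_device_get_description() { return 3; }

// Single source of truth: each handler is listed exactly once.
#define SKETCH_COMMANDS(X)              \
    X(backend_reg_get_device_count)     \
    X(backend_device_get_name)          \
    X(backend_device_get_description)

using Handler = uint32_t (*)();

struct CommandEntry {
    const char *name; // produced by stringifying the handler identifier
    Handler handler;
};

#define SKETCH_ENTRY(fn) {#fn, &fn},
static const CommandEntry kCommands[] = {SKETCH_COMMANDS(SKETCH_ENTRY)};
#undef SKETCH_ENTRY

constexpr size_t kCommandCount = sizeof(kCommands) / sizeof(kCommands[0]);
static_assert(kCommandCount == 3, "table must cover every listed command");

// Bounds-checked lookups: out-of-range command ids never index the table.
inline const char *command_name(size_t cmd) {
    return cmd < kCommandCount ? kCommands[cmd].name : "unknown";
}

inline Handler command_handler(size_t cmd) {
    return cmd < kCommandCount ? kCommands[cmd].handler : nullptr;
}

// Reverse lookup for log analysis; returns kCommandCount when not found.
inline size_t find_command(const char *name) {
    for (size_t i = 0; i < kCommandCount; ++i) {
        if (std::strcmp(kCommands[i].name, name) == 0) return i;
    }
    return kCommandCount;
}

} // namespace sketch
```

Because the string is derived from the identifier, renaming a handler updates its trace name automatically.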

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc98f8d and a15ef1d.

📒 Files selected for processing (76)

.gitignore (1 hunks)
CMakePresets.json (1 hunks)
OWNERS (1 hunks)
build.backend.sh (1 hunks)
build.remoting.sh (1 hunks)
build.sh (1 hunks)
build.vulkan.sh (1 hunks)
ggml/CMakeLists.txt (2 hunks)
ggml/include/ggml-remoting-frontend.h (1 hunks)
ggml/src/CMakeLists.txt (1 hunks)
ggml/src/ggml-backend-reg.cpp (3 hunks)
ggml/src/ggml-metal/CMakeLists.txt (1 hunks)
ggml/src/ggml-metal/ggml-metal-context.m (1 hunks)
ggml/src/ggml-metal/ggml-metal-device.m (2 hunks)
ggml/src/ggml-metal/ggml-metal-remoting.cpp (1 hunks)
ggml/src/ggml-remotingbackend/CMakeLists.txt (1 hunks)
ggml/src/ggml-remotingbackend/backend-convert.h (1 hunks)
ggml/src/ggml-remotingbackend/backend-dispatched-backend.cpp (1 hunks)
ggml/src/ggml-remotingbackend/backend-dispatched-buffer-type.cpp (1 hunks)
ggml/src/ggml-remotingbackend/backend-dispatched-buffer.cpp (1 hunks)
ggml/src/ggml-remotingbackend/backend-dispatched-device.cpp (1 hunks)
ggml/src/ggml-remotingbackend/backend-dispatched-metal.cpp (1 hunks)
ggml/src/ggml-remotingbackend/backend-dispatched.cpp (1 hunks)
ggml/src/ggml-remotingbackend/backend-dispatched.h (1 hunks)
ggml/src/ggml-remotingbackend/backend-internal.h (1 hunks)
ggml/src/ggml-remotingbackend/backend-utils.h (1 hunks)
ggml/src/ggml-remotingbackend/backend.cpp (1 hunks)
ggml/src/ggml-remotingbackend/shared/api_remoting.h (1 hunks)
ggml/src/ggml-remotingbackend/shared/apir_backend.h (1 hunks)
ggml/src/ggml-remotingbackend/shared/venus_cs.h (1 hunks)
ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.cpp (1 hunks)
ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.h (1 hunks)
ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (1 hunks)
ggml/src/ggml-remotingbackend/venus_cs_ggml-rpc-back.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/CMakeLists.txt (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-backend-buffer-type.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-backend-buffer.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-backend-device.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-backend-host-buffer-type.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-backend-reg.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-backend.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-metal-remoting.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-metal-remoting.h (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-remoting-frontend.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/ggml-remoting.h (1 hunks)
ggml/src/ggml-remotingfrontend/include/drm-uapi/drm.h (1 hunks)
ggml/src/ggml-remotingfrontend/include/drm-uapi/virtgpu_drm.h (1 hunks)
ggml/src/ggml-remotingfrontend/include/venus_hw.h (1 hunks)
ggml/src/ggml-remotingfrontend/venus_cs_ggml-rpc-front.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-forward-backend.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer-type.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-forward-device.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-forward-impl.h (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-forward-metal.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-forward.h (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-shm.h (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-utils.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu-utils.h (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu.cpp (1 hunks)
ggml/src/ggml-remotingfrontend/virtgpu.h (1 hunks)
podman_compile.sh (1 hunks)
prepare.backend.sh (1 hunks)
prepare.remoting.sh (1 hunks)
prepare.sh (1 hunks)
prepare.vulkan.sh (1 hunks)
run.ramalama.sh (1 hunks)
run.remoting.sh (1 hunks)
run.vulkan.sh (1 hunks)
src/llama-context.cpp (2 hunks)
src/llama-kv-cache.cpp (1 hunks)
src/llama-model-loader.cpp (5 hunks)
src/llama-model.cpp (2 hunks)
src/llama-vocab.cpp (3 hunks)
tools/run/run.cpp (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (42)

ggml/include/ggml-remoting-frontend.h (1)

ggml/src/ggml-remotingfrontend/ggml-backend-reg.cpp (2)

ggml_backend_remoting_frontend_reg (132-159)

ggml_backend_remoting_frontend_reg (132-132)

ggml/src/ggml-backend-reg.cpp (1)

ggml/src/ggml-remotingfrontend/ggml-backend-reg.cpp (2)

ggml_backend_remoting_frontend_reg (132-159)

ggml_backend_remoting_frontend_reg (132-132)

ggml/src/ggml-remotingfrontend/virtgpu-forward-metal.cpp (2)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (1)

vn_decode_bool_t (508-512)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (2)

remote_call_finish (551-567)

remote_call_finish (552-555)

ggml/src/ggml-remotingfrontend/ggml-metal-remoting.h (1)

ggml/src/ggml-remotingfrontend/ggml-metal-remoting.cpp (4)

get_metal_dev_context (4-18)

get_metal_dev_context (4-4)

ggml_metal_device_supports_op (20-254)

ggml_metal_device_supports_op (20-20)

ggml/src/ggml-remotingbackend/backend-dispatched-metal.cpp (2)

ggml/src/ggml-remotingbackend/backend-utils.h (1)

ERROR (52-55)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (1)

vn_encode_bool_t (502-506)

ggml/src/ggml-remotingbackend/backend-convert.h (1)

ggml/src/ggml-remotingfrontend/ggml-remoting.h (2)

ggml_buffer_to_apir_handle (137-139)

ggml_buffer_type_to_apir_handle (34-38)

ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer-type.cpp (4)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (1)

vn_encode_ggml_buffer_type (69-73)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (7)

vn_decode_array_size_unchecked (293-299)

vn_cs_decoder_alloc_array (493-498)

vn_decode_char_array (466-476)

vn_decode_size_t (394-400)

vn_decode_bool_t (508-512)

vn_encode_size_t (387-392)

vn_decode_apir_buffer_host_handle_t (536-540)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (3)

FATAL (83-94)

INFO (35-44)

INFO (46-47)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (2)

remote_call_finish (551-567)

remote_call_finish (552-555)

ggml/src/ggml-remotingfrontend/venus_cs_ggml-rpc-front.cpp (1)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (1)

FATAL (83-94)

ggml/src/ggml-remotingfrontend/ggml-backend-buffer.cpp (2)

ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer.cpp (12)

apir_buffer_get_base (3-21)

apir_buffer_get_base (4-4)

apir_buffer_set_tensor (23-65)

apir_buffer_set_tensor (24-25)

apir_buffer_get_tensor (68-76)

apir_buffer_get_tensor (69-70)

apir_buffer_get_tensor (78-114)

apir_buffer_get_tensor (79-80)

apir_buffer_clear (117-132)

apir_buffer_clear (118-119)

apir_buffer_free_buffer (135-148)

apir_buffer_free_buffer (136-136)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (2)

start_timer (91-95)

stop_timer (98-108)

ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (3)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (3)

align64 (102-107)

INFO (35-44)

INFO (46-47)

ggml/src/ggml-remotingfrontend/virtgpu.h (1)

virtgpu_ioctl (101-105)

ggml/src/ggml-remotingfrontend/virtgpu-utils.cpp (2)

util_sparse_array_get (125-186)

util_sparse_array_get (126-126)

ggml/src/ggml-remotingfrontend/virtgpu-shm.h (1)

ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (4)

virtgpu_shmem_create (79-111)

virtgpu_shmem_create (80-80)

virtgpu_shmem_destroy (71-77)

virtgpu_shmem_destroy (72-73)

ggml/src/ggml-remotingbackend/backend-dispatched.cpp (3)

ggml/src/ggml-backend-reg.cpp (4)

reg (241-254)

reg (241-241)

reg (309-332)

reg (309-309)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (4)

FATAL (83-94)

ERROR (72-81)

INFO (35-44)

INFO (46-47)

ggml/src/ggml-remotingbackend/backend-utils.h (2)

ERROR (52-55)

INFO (42-45)

ggml/src/ggml-remotingbackend/backend-utils.h (1)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (5)

INFO (35-44)

INFO (46-47)

WARNING (61-70)

ERROR (72-81)

FATAL (83-94)

tools/run/run.cpp (1)

src/llama-batch.cpp (2)

llama_batch_get_one (851-863)

llama_batch_get_one (851-853)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (6)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (7)

vn_encode (136-142)

vn_cs_decoder_use_inplace (81-92)

vn_cs_encoder_write (108-123)

vn_cs_decoder_read (98-106)

vn_encode_uint32_t (342-346)

vn_decode_uint32_t (348-352)

vn_decode_uint64_t_array_inplace (211-215)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.cpp (8)

serialize_tensor (17-46)

serialize_tensor (18-18)

deserialize_tensor (48-78)

deserialize_tensor (49-49)

serialize_graph (96-117)

serialize_graph (97-97)

deserialize_graph (144-167)

deserialize_graph (145-145)

ggml/src/ggml.c (2)

ggml_tensor_overhead (1356-1358)

ggml_init (1487-1527)

ggml/src/ggml-remotingbackend/venus_cs_ggml-rpc-back.cpp (4)

deserialize_tensor (33-68)

deserialize_tensor (34-34)

deserialize_graph (95-118)

deserialize_graph (96-96)

ggml/src/ggml-remotingbackend/backend-convert.h (2)

ggml_buffer_type_to_apir_handle (11-15)

ggml_buffer_to_apir_handle (5-9)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)

FATAL (83-94)

WARNING (61-70)

ggml/src/ggml-remotingfrontend/virtgpu.h (1)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (10)

vn_log (347-361)

vn_log (348-348)

create_virtgpu (185-237)

create_virtgpu (186-186)

remote_call_prepare (510-549)

remote_call_prepare (511-514)

remote_call (569-668)

remote_call (570-575)

remote_call_finish (551-567)

remote_call_finish (552-555)

ggml/src/ggml-remotingfrontend/ggml-backend.cpp (3)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (2)

start_timer (91-95)

stop_timer (98-108)

ggml/src/ggml-remotingfrontend/virtgpu-forward-backend.cpp (2)

apir_backend_graph_compute (9-54)

apir_backend_graph_compute (10-10)

ggml/src/ggml-remotingfrontend/ggml-backend-reg.cpp (2)

ggml_backend_remoting_frontend_reg (132-159)

ggml_backend_remoting_frontend_reg (132-132)

ggml/src/ggml-remotingbackend/venus_cs_ggml-rpc-back.cpp (2)

ggml/src/ggml.c (7)

ggml_new_tensor_4d (1740-1749)

ggml_nbytes (1203-1226)

ggml_set_name (1808-1815)

ggml_tensor_overhead (1356-1358)

ggml_graph_overhead_custom (6700-6702)

ggml_init (1487-1527)

ggml_new_graph_custom (6708-6750)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.cpp (4)

deserialize_tensor (48-78)

deserialize_tensor (49-49)

create_node (119-142)

create_node (120-123)

ggml/src/ggml-remotingfrontend/virtgpu-forward-backend.cpp (5)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (4)

vn_serialize_ggml_cgraph (140-145)

vn_encode_virtgpu_shmem_res_id (128-131)

vn_encode_cgraph_data (147-152)

vn_decode_ggml_status (121-124)

ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (4)

virtgpu_shmem_create (79-111)

virtgpu_shmem_create (80-80)

virtgpu_shmem_destroy (71-77)

virtgpu_shmem_destroy (72-73)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)

WARNING (61-70)

FATAL (83-94)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (2)

vn_encode_size_t (387-392)

vn_cs_new_encoder (37-46)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (2)

remote_call_finish (551-567)

remote_call_finish (552-555)

ggml/src/ggml-remotingfrontend/ggml-backend-device.cpp (3)

ggml/src/ggml-remotingfrontend/virtgpu-forward-device.cpp (16)

apir_device_get_name (27-53)

apir_device_get_name (28-28)

apir_device_get_description (55-77)

apir_device_get_description (56-56)

apir_device_get_type (79-102)

apir_device_get_type (80-80)

apir_device_get_memory (104-138)

apir_device_get_memory (105-105)

apir_device_supports_op (140-158)

apir_device_supports_op (141-141)

apir_device_get_props (178-201)

apir_device_get_props (179-183)

apir_device_get_buffer_type (160-176)

apir_device_get_buffer_type (161-161)

apir_device_buffer_from_ptr (203-237)

apir_device_buffer_from_ptr (204-206)

ggml/src/ggml-remotingfrontend/ggml-metal-remoting.cpp (2)

ggml_metal_device_supports_op (20-254)

ggml_metal_device_supports_op (20-20)

ggml/src/ggml-remotingfrontend/ggml-backend.cpp (2)

ggml_backend_remoting_device_init (73-87)

ggml_backend_remoting_device_init (73-73)

ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer.cpp (5)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (4)

vn_encode_apir_buffer_host_handle_t (530-534)

vn_decode_uintptr_t (550-554)

vn_encode_size_t (387-392)

vn_encode_uint8_t (150-154)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (2)

remote_call_finish (551-567)

remote_call_finish (552-555)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (4)

INFO (35-44)

INFO (46-47)

FATAL (83-94)

WARNING (61-70)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (2)

vn_encode_ggml_tensor (39-44)

vn_encode_virtgpu_shmem_res_id (128-131)

ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (4)

virtgpu_shmem_create (79-111)

virtgpu_shmem_create (80-80)

virtgpu_shmem_destroy (71-77)

virtgpu_shmem_destroy (72-73)

ggml/src/ggml-remotingbackend/backend-dispatched.h (6)

ggml/src/ggml-remotingbackend/backend-dispatched.cpp (2)

backend_dispatch_initialize (19-47)

backend_dispatch_initialize (19-19)

ggml/src/ggml-remotingbackend/backend-dispatched-device.cpp (18)

backend_reg_get_device_count (9-18)

backend_reg_get_device_count (9-9)

backend_device_get_name (20-31)

backend_device_get_name (20-20)

backend_device_get_description (33-45)

backend_device_get_description (34-34)

backend_device_get_type (47-56)

backend_device_get_type (48-48)

backend_device_get_memory (58-70)

backend_device_get_memory (59-59)

backend_device_supports_op (72-83)

backend_device_supports_op (73-73)

backend_device_get_buffer_type (85-95)

backend_device_get_buffer_type (86-86)

backend_device_get_props (97-111)

backend_device_get_props (98-98)

backend_device_buffer_from_ptr (113-142)

backend_device_buffer_from_ptr (114-114)

ggml/src/ggml-remotingbackend/backend-dispatched-buffer-type.cpp (10)

backend_buffer_type_get_name (9-22)

backend_buffer_type_get_name (10-10)

backend_buffer_type_get_alignment (24-34)

backend_buffer_type_get_alignment (25-25)

backend_buffer_type_get_max_size (36-46)

backend_buffer_type_get_max_size (37-37)

backend_buffer_type_is_host (48-58)

backend_buffer_type_is_host (49-49)

backend_buffer_type_alloc_buffer (60-81)

backend_buffer_type_alloc_buffer (61-61)

ggml/src/ggml-remotingbackend/backend-dispatched-buffer.cpp (10)

backend_buffer_get_base (12-22)

backend_buffer_get_base (13-13)

backend_buffer_set_tensor (24-71)

backend_buffer_set_tensor (25-25)

backend_buffer_get_tensor (73-109)

backend_buffer_get_tensor (74-74)

backend_buffer_clear (111-125)

backend_buffer_clear (112-112)

backend_buffer_free_buffer (127-143)

backend_buffer_free_buffer (128-128)

ggml/src/ggml-remotingbackend/backend-dispatched-backend.cpp (2)

backend_graph_compute (13-58)

backend_graph_compute (14-14)

ggml/src/ggml-remotingbackend/backend-dispatched-metal.cpp (2)

backend_metal_get_device_context (14-41)

backend_metal_get_device_context (15-15)

ggml/src/ggml-remotingfrontend/ggml-remoting.h (4)

ggml/src/ggml-remotingbackend/backend-convert.h (2)

ggml_buffer_type_to_apir_handle (11-15)

ggml_buffer_to_apir_handle (5-9)

ggml/src/ggml-remotingfrontend/ggml-backend-reg.cpp (2)

ggml_backend_remoting_get_device (49-52)

ggml_backend_remoting_get_device (49-49)

ggml/src/ggml-remotingfrontend/ggml-backend.cpp (2)

ggml_backend_remoting_device_init (73-87)

ggml_backend_remoting_device_init (73-73)

ggml/src/ggml-remotingfrontend/ggml-backend-device.cpp (2)

ggml_backend_remoting_device_get_buffer_type (129-144)

ggml_backend_remoting_device_get_buffer_type (130-130)

ggml/src/ggml-remotingfrontend/virtgpu-forward-device.cpp (4)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (10)

vn_decode_int32_t (235-239)

vn_decode_array_size_unchecked (293-299)

vn_cs_decoder_alloc_array (493-498)

vn_decode_char_array (466-476)

vn_decode_uint32_t (348-352)

vn_decode_size_t (394-400)

vn_decode_bool_t (508-512)

vn_decode_apir_buffer_type_host_handle_t (522-526)

vn_encode_size_t (387-392)

vn_decode_apir_buffer_host_handle_t (536-540)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (3)

INFO (35-44)

INFO (46-47)

FATAL (83-94)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (2)

remote_call_finish (551-567)

remote_call_finish (552-555)

ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (2)

virtgpu_shmem_create (79-111)

virtgpu_shmem_create (80-80)

ggml/src/ggml-remotingbackend/backend-dispatched-device.cpp (3)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (7)

vn_encode_int32_t (229-233)

vn_encode_array_size (274-278)

vn_encode_char_array (459-464)

vn_encode_uint32_t (342-346)

vn_encode_size_t (387-392)

vn_encode_bool_t (502-506)

vn_decode_size_t (394-400)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (4)

vn_decode_ggml_tensor_inplace (212-236)

vn_encode_ggml_buffer_type (69-73)

vn_decode_virtgpu_shmem_res_id (133-136)

vn_encode_ggml_buffer (98-102)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (1)

FATAL (83-94)

ggml/src/ggml-remotingfrontend/virtgpu-forward.h (5)

ggml/src/ggml-remotingfrontend/virtgpu-forward-device.cpp (18)

apir_device_get_count (3-25)

apir_device_get_count (4-4)

apir_device_get_name (27-53)

apir_device_get_name (28-28)

apir_device_get_description (55-77)

apir_device_get_description (56-56)

apir_device_get_type (79-102)

apir_device_get_type (80-80)

apir_device_get_memory (104-138)

apir_device_get_memory (105-105)

apir_device_supports_op (140-158)

apir_device_supports_op (141-141)

apir_device_get_buffer_type (160-176)

apir_device_get_buffer_type (161-161)

apir_device_get_props (178-201)

apir_device_get_props (179-183)

apir_device_buffer_from_ptr (203-237)

apir_device_buffer_from_ptr (204-206)

ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer-type.cpp (10)

apir_buffer_type_get_name (3-29)

apir_buffer_type_get_name (4-4)

apir_buffer_type_get_alignment (31-51)

apir_buffer_type_get_alignment (32-32)

apir_buffer_type_get_max_size (53-73)

apir_buffer_type_get_max_size (54-54)

apir_buffer_type_is_host (75-95)

apir_buffer_type_is_host (76-76)

apir_buffer_type_alloc_buffer (97-119)

apir_buffer_type_alloc_buffer (98-98)

ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer.cpp (12)

apir_buffer_get_base (3-21)

apir_buffer_get_base (4-4)

apir_buffer_set_tensor (23-65)

apir_buffer_set_tensor (24-25)

apir_buffer_get_tensor (68-76)

apir_buffer_get_tensor (69-70)

apir_buffer_get_tensor (78-114)

apir_buffer_get_tensor (79-80)

apir_buffer_clear (117-132)

apir_buffer_clear (118-119)

apir_buffer_free_buffer (135-148)

apir_buffer_free_buffer (136-136)

ggml/src/ggml-remotingfrontend/virtgpu-forward-backend.cpp (2)

apir_backend_graph_compute (9-54)

apir_backend_graph_compute (10-10)

ggml/src/ggml-remotingfrontend/virtgpu-forward-metal.cpp (2)

apir_metal_get_device_context (3-20)

apir_metal_get_device_context (4-4)

ggml/src/ggml-remotingfrontend/ggml-metal-remoting.cpp (3)

ggml/src/ggml-remotingfrontend/virtgpu-forward-metal.cpp (2)

apir_metal_get_device_context (3-20)

apir_metal_get_device_context (4-4)

ggml/src/ggml.c (5)

ggml_get_unary_op (1794-1797)

ggml_is_contiguous (1386-1388)

ggml_get_glu_op (1799-1802)

ggml_is_contiguous_1 (1394-1396)

ggml_is_contiguous_rows (1419-1423)

ggml/src/ggml-impl.h (1)

ggml_get_op_params_i32 (151-154)

ggml/src/ggml-remotingbackend/backend-dispatched-buffer.cpp (5)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (3)

vn_decode_ggml_buffer (104-112)

vn_decode_ggml_tensor (46-236)

vn_decode_virtgpu_shmem_res_id (133-136)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (3)

vn_encode_uintptr_t (544-548)

vn_decode_size_t (394-400)

vn_decode_uint8_t (156-160)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (2)

start_timer (91-95)

stop_timer (98-108)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)

FATAL (83-94)

WARNING (61-70)

ggml/src/ggml-remotingbackend/venus_cs_ggml-rpc-back.cpp (2)

untrack_backend_buffer (17-26)

untrack_backend_buffer (18-18)

ggml/src/ggml-remotingbackend/backend-internal.h (2)

ggml/src/ggml-remotingbackend/backend.cpp (6)

apir_backend_initialize (49-114)

apir_backend_initialize (49-49)

apir_backend_deinit (23-47)

apir_backend_deinit (23-23)

apir_backend_dispatcher (116-150)

apir_backend_dispatcher (116-119)

ggml/src/ggml-remotingbackend/backend-dispatched-metal.cpp (1)

ggml_backend_metal_get_device_context_fct (9-12)

ggml/src/ggml-remotingbackend/backend-dispatched-buffer-type.cpp (3)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (2)

vn_decode_ggml_buffer_type (75-82)

vn_encode_ggml_buffer (98-102)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (5)

vn_encode_array_size (274-278)

vn_encode_char_array (459-464)

vn_encode_size_t (387-392)

vn_encode_bool_t (502-506)

vn_decode_size_t (394-400)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.cpp (2)

track_backend_buffer (12-15)

track_backend_buffer (13-13)

ggml/src/ggml-remotingfrontend/ggml-backend-host-buffer-type.cpp (3)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (4)

WARNING (61-70)

FATAL (83-94)

INFO (35-44)

INFO (46-47)

ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (2)

virtgpu_shmem_destroy (71-77)

virtgpu_shmem_destroy (72-73)

ggml/src/ggml-remotingfrontend/virtgpu-forward-device.cpp (2)

apir_device_buffer_from_ptr (203-237)

apir_device_buffer_from_ptr (204-206)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.h (3)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.cpp (14)

serialize_tensor (17-46)

serialize_tensor (18-18)

serialize_graph (96-117)

serialize_graph (97-97)

track_backend_buffer (12-15)

track_backend_buffer (13-13)

add_tensor (80-94)

add_tensor (81-81)

deserialize_tensor (48-78)

deserialize_tensor (49-49)

create_node (119-142)

create_node (120-123)

deserialize_graph (144-167)

deserialize_graph (145-145)

ggml/src/ggml-remotingfrontend/venus_cs_ggml-rpc-front.cpp (6)

serialize_tensor (12-48)

serialize_tensor (13-13)

serialize_graph (66-87)

serialize_graph (67-67)

add_tensor (50-64)

add_tensor (51-51)

ggml/src/ggml-remotingbackend/venus_cs_ggml-rpc-back.cpp (12)

track_backend_buffer (12-15)

track_backend_buffer (13-13)

untrack_backend_buffer (17-26)

untrack_backend_buffer (18-18)

get_track_backend_buffers (28-31)

get_track_backend_buffers (29-29)

deserialize_tensor (33-68)

deserialize_tensor (34-34)

create_node (70-93)

create_node (71-74)

deserialize_graph (95-118)

deserialize_graph (96-96)

ggml/src/ggml-remotingbackend/backend-dispatched-backend.cpp (5)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (2)

start_timer (91-95)

stop_timer (98-108)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml.h (3)

vn_decode_virtgpu_shmem_res_id (133-136)

vn_decode_ggml_cgraph (154-167)

vn_encode_ggml_status (116-119)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)

FATAL (83-94)

ERROR (72-81)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (2)

vn_decode_size_t (394-400)

vn_cs_new_decoder (27-35)

ggml/src/ggml.c (2)

ggml_graph_node (6885-6893)

ggml_op_desc (1273-1283)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)

ggml/src/ggml-remotingfrontend/virtgpu-utils.cpp (8)

thks_bye (189-195)

thks_bye (189-189)

breakpoint (197-200)

breakpoint (197-197)

util_sparse_array_get (125-186)

util_sparse_array_get (126-126)

util_sparse_array_init (33-41)

util_sparse_array_init (34-35)

ggml/src/ggml-remotingbackend/backend-utils.h (3)

INFO (42-45)

WARNING (47-50)

ERROR (52-55)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (6)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (7)

FATAL (83-94)

INFO (35-44)

INFO (46-47)

ERROR (72-81)

WARNING (61-70)

MESSAGE (50-59)

os_time_sleep (126-133)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (3)

vn_encode_uint32_t (342-346)

vn_decode_uint32_t (348-352)

vn_encode_int32_t (229-233)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (3)

apir_backend_initialize_error (123-139)

start_timer (91-95)

stop_timer (98-108)

ggml/src/ggml-remotingfrontend/virtgpu-utils.cpp (2)

util_sparse_array_init (33-41)

util_sparse_array_init (34-35)

ggml/src/ggml-remotingfrontend/virtgpu-shm.cpp (2)

virtgpu_shmem_create (79-111)

virtgpu_shmem_create (80-80)

ggml/src/ggml-remotingfrontend/virtgpu.h (2)

vn_log (55-98)

virtgpu_ioctl (101-105)

ggml/src/ggml-remotingfrontend/ggml-backend-buffer-type.cpp (3)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (1)

FATAL (83-94)

ggml/src/ggml-remotingfrontend/virtgpu-forward-device.cpp (2)

apir_device_buffer_from_ptr (203-237)

apir_device_buffer_from_ptr (204-206)

ggml/src/ggml-remotingfrontend/virtgpu-forward-buffer-type.cpp (8)

apir_buffer_type_get_name (3-29)

apir_buffer_type_get_name (4-4)

apir_buffer_type_get_alignment (31-51)

apir_buffer_type_get_alignment (32-32)

apir_buffer_type_get_max_size (53-73)

apir_buffer_type_get_max_size (54-54)

apir_buffer_type_is_host (75-95)

apir_buffer_type_is_host (76-76)

ggml/src/ggml-remotingfrontend/ggml-backend-reg.cpp (4)

ggml/src/ggml-remotingfrontend/virtgpu.cpp (2)

create_virtgpu (185-237)

create_virtgpu (186-186)

ggml/src/ggml-remotingfrontend/virtgpu-forward-device.cpp (2)

apir_device_get_count (3-25)

apir_device_get_count (4-4)

ggml/src/ggml-remotingfrontend/ggml-metal-remoting.cpp (2)

get_metal_dev_context (4-18)

get_metal_dev_context (4-4)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (1)

show_timer (110-121)

ggml/src/ggml-remotingbackend/backend.cpp (6)

ggml/src/ggml-remotingbackend/venus_cs_ggml-rpc-back.cpp (4)

get_track_backend_buffers (28-31)

get_track_backend_buffers (29-29)

untrack_backend_buffer (17-26)

untrack_backend_buffer (18-18)

ggml/src/ggml-remotingbackend/backend-utils.h (2)

INFO (42-45)

ERROR (52-55)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (1)

show_timer (110-121)

ggml/src/ggml-remotingbackend/backend-dispatched.cpp (3)

ggml_backend_reg_fct (23-23)

backend_dispatch_initialize (19-47)

backend_dispatch_initialize (19-19)

ggml/src/ggml-remotingbackend/backend-dispatched-metal.cpp (1)

ggml_backend_metal_get_device_context_fct (9-12)

ggml/src/ggml-remotingbackend/backend-dispatched.h (1)

backend_dispatch_command_name (52-88)

ggml/src/ggml-remotingbackend/shared/venus_cs_ggml-rpc.cpp (3)

ggml/src/ggml.c (7)

ggml_new_tensor_4d (1740-1749)

ggml_nbytes (1203-1226)

ggml_set_name (1808-1815)

ggml_tensor_overhead (1356-1358)

ggml_graph_overhead_custom (6700-6702)

ggml_init (1487-1527)

ggml_new_graph_custom (6708-6750)

ggml/src/ggml-remotingfrontend/venus_cs_ggml-rpc-front.cpp (2)

serialize_tensor (12-48)

serialize_tensor (13-13)

ggml/src/ggml-remotingbackend/venus_cs_ggml-rpc-back.cpp (2)

deserialize_tensor (33-68)

deserialize_tensor (34-34)

ggml/src/ggml-remotingbackend/shared/venus_cs.h (1)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (1)

FATAL (83-94)

ggml/src/ggml-remotingbackend/shared/apir_backend.h (2)

ggml/src/ggml-remotingbackend/backend-utils.h (1)

INFO (42-45)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)

INFO (35-44)

INFO (46-47)

ggml/src/ggml-remotingfrontend/virtgpu-utils.cpp (1)

ggml/src/ggml-remotingfrontend/virtgpu-utils.h (2)

INFO (35-44)

INFO (46-47)

🪛 Shellcheck (0.11.0)

prepare.backend.sh