Running large language models on a single GPU for throughput-oriented scenarios.
Run Mixtral-8x7B models in Colab or consumer desktops
PyTorch native quantization and sparsity for training and inference
DPDK infrastructure for software acceleration; currently working on RX and ACL pre-filters.
DPU-Powered File System Virtualization over virtio-fs
A collection of tests for the Open vSwitch HW offload.
A Dynamic Programming Offloading Algorithm for Mobile Cloud Computing
A lightweight framework that enables serverless users to reduce their bills by harvesting non-serverless compute resources such as their VMs, on-premise servers, or personal computers.
LeapIO: Efficient and Portable Virtual NVMe Storage on ARM SoCs (ASPLOS'20)
A framework for IoT devices to offload tasks to the cloud, resulting in efficient computation and decreased cloud costs.
Monero hardware wallet protocol implementation for Trezor, with an agent
The container-based cloud platform for mobile code offloading
Code for paper "Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI" (MobiCom'22)
Monero wallet Trezor integration documentation
Backend.AI Client Library for Python
A flow offloading prototype based on DPDK and a Mellanox/Nvidia SmartNIC.
A Pandas-inspired data analysis project with lazy semantics and query-offloading to SQLite
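The lazy-semantics, query-offloading idea above can be sketched in a few lines: operations only record what to do, and the accumulated pipeline is pushed down to SQLite as a single SQL query on collection. The `LazyFrame` class and its method names here are hypothetical illustrations, not the listed project's actual API.

```python
import sqlite3

class LazyFrame:
    """Hypothetical sketch of lazy, query-offloading semantics:
    select()/filter() only build up a SQL query; nothing executes
    until collect(), when the whole pipeline runs inside SQLite."""

    def __init__(self, conn, table, columns="*", where=None):
        self.conn, self.table = conn, table
        self.columns, self.where = columns, where

    def select(self, *cols):
        # Record a projection; no query is executed yet.
        return LazyFrame(self.conn, self.table, ", ".join(cols), self.where)

    def filter(self, predicate):
        # Record a selection, AND-ing it with any earlier predicate.
        clause = f"({self.where}) AND ({predicate})" if self.where else predicate
        return LazyFrame(self.conn, self.table, self.columns, clause)

    def sql(self):
        # Render the accumulated pipeline as one SQL statement.
        query = f"SELECT {self.columns} FROM {self.table}"
        return query + (f" WHERE {self.where}" if self.where else "")

    def collect(self):
        # Offload the entire pipeline to SQLite in a single query.
        return self.conn.execute(self.sql()).fetchall()

# Usage: build the pipeline lazily, then run one offloaded query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

lf = LazyFrame(conn, "t").filter("x > 1").select("y")
print(lf.sql())      # SELECT y FROM t WHERE x > 1
print(lf.collect())  # [(20,), (30,)]
```

The design point is that SQLite, not Python, does the filtering, so only the final result crosses back into the process.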
Examples of using OpenMP offload with dgemm in the target region