HuaweiCloudDeveloper/llama.cpp-image

llama.cpp Inference Framework

English | 简体中文

Table of Contents

  • Repository Introduction
  • Prerequisites
  • Image Specifications
  • Getting Help
  • How to Contribute

Repository Introduction

llama.cpp is written in plain C/C++ with no external dependencies, which gives it strong performance and broad cross-platform portability: it runs on everything from embedded devices to high-performance servers.

Specially optimized for large language model inference, it supports quantization techniques and hardware acceleration, significantly reducing memory usage and computational costs while maintaining model quality.
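To make the memory savings from quantization concrete, here is a rough back-of-the-envelope sketch. The parameter count and bits-per-weight figures are illustrative assumptions (Q4_0 stores 4-bit weights plus per-block scales, about 4.5 bits per weight), not measurements of any specific model:

```python
def model_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-storage footprint in GiB."""
    return n_params * bits_per_weight / 8 / (1024 ** 3)

n = 7e9  # a hypothetical 7B-parameter model
fp16 = model_memory_gb(n, 16)   # full-precision baseline
q4 = model_memory_gb(n, 4.5)    # Q4_0: 4-bit weights + block scales
print(f"FP16: {fp16:.1f} GiB, Q4_0: {q4:.1f} GiB")
# → FP16: 13.0 GiB, Q4_0: 3.7 GiB
```

This is why a model that needs a GPU at FP16 can often run on a modest CPU-only server once quantized.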

It supports multiple hardware backends, including CPU and GPU (CUDA, OpenCL, Metal), and runs on Windows, macOS, and Linux to meet a variety of deployment needs.
This product is provided as a pre-installed image on Huawei Cloud's HCE2.0 system (Kunpeng architecture).
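For reference, selecting one of these backends when building llama.cpp from source typically looks like the sketch below (flag names match recent upstream llama.cpp releases; with the pre-installed image this build step is not needed):

```shell
# Clone and build llama.cpp with CMake; enable at most one GPU backend flag.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON     # NVIDIA GPUs; use -DGGML_METAL=ON on macOS
cmake --build build --config Release -j
```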

This project provides the open-source image product llama.cpp Inference Framework, which comes pre-installed with the llama.cpp inference framework, its runtime environment, and deployment templates. Follow the usage guide below for an efficient, out-of-the-box experience.
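Once the image is running, serving a model usually amounts to pointing llama-server at a GGUF file. The model path below is a placeholder, not a file shipped with the image:

```shell
# Start llama.cpp's OpenAI-compatible HTTP server on port 8080.
llama-server -m /path/to/model.gguf --host 0.0.0.0 --port 8080

# From another shell, query the chat completions endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'
```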

System Requirements:

  • CPU: 2 vCPUs or higher
  • RAM: 4 GB or more
  • Disk: at least 40 GB

Prerequisites

Register a Huawei Account and Activate Huawei Cloud

Image Specifications

| Image Specification     | Features                                                          | Notes |
| ----------------------- | ----------------------------------------------------------------- | ----- |
| llama.cpp-b5834-Kunpeng | Deployed on Kunpeng Cloud Server + Huawei Cloud EulerOS 2.0 64bit |       |

Getting Help

How to Contribute

  • Fork this repository and submit merge requests
  • Keep README.md in sync with the information for your open-source image
