Fine-tuning is the process of adapting a base language model to a specific task or domain by continuing its training on a smaller, task-specific dataset. This microservice provides OpenAI Fine-tuning API compatible interfaces so that users can easily submit fine-tuning jobs in a consistent way. Built on the unified Ray framework, it is a scalable solution for distributed LLM fine-tuning.
This RFC introduces the OPEA LLM Fine-tuning microservice design. The objective is to describe the overall architecture, workflow, and key design decisions.
Motivation
LLM serving is already an integral part of OPEA. Adding a fine-tuning microservice complements OPEA by letting users customize their own models for a specific task or domain using their own datasets.
Design Proposal
The Fine-tuning microservice provides the following OpenAI-compatible features:
Create fine-tuning job
List fine-tuning jobs
List fine-tuning events
List fine-tuning checkpoints
Retrieve fine-tuning job
Cancel fine-tuning
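As a sketch of how these OpenAI-compatible endpoints might be exercised, the snippet below builds a job-creation payload and optionally posts it to the service. The endpoint URL, port, model name, and training-file name are illustrative assumptions, not values specified by this RFC.

```python
import json
import urllib.request

# Assumed local endpoint; the actual host and port depend on your deployment.
BASE_URL = "http://localhost:8015/v1/fine_tuning/jobs"

def make_job_request(training_file: str, model: str) -> dict:
    """Build an OpenAI-style fine-tuning job creation payload."""
    return {"training_file": training_file, "model": model}

# Hypothetical dataset and model, for illustration only.
payload = make_job_request("alpaca_data.json", "meta-llama/Llama-2-7b-chat-hf")

# Uncomment to submit against a running fine-tuning service:
# req = urllib.request.Request(
#     BASE_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read())
```

Listing, retrieving, and cancelling jobs would follow the same pattern against the corresponding OpenAI-style routes.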
The following figure shows the architecture of the Fine-tuning microservice:
In the Kubernetes cluster, the KubeRay operator fully manages the lifecycle of the "RayCluster" custom resource definition (CRD), including cluster creation and deletion, autoscaling, and fault tolerance. When the fine-tuning service processes requests and submits Ray jobs to the RayCluster, the cluster can scale its worker nodes based on resource requirements.
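The hand-off from the fine-tuning service to the RayCluster can be sketched as follows. The entrypoint script, runtime environment, and cluster address are hypothetical placeholders; the actual job parameters depend on the service implementation.

```python
def build_ray_job(job_id: str, dataset: str, model: str) -> dict:
    """Assemble keyword arguments for Ray's JobSubmissionClient.submit_job().

    The entrypoint script name (finetune.py) and pip dependencies are
    assumptions for illustration, not part of this RFC.
    """
    return {
        "entrypoint": f"python finetune.py --model {model} --dataset {dataset}",
        "submission_id": job_id,
        "runtime_env": {"pip": ["transformers", "peft"]},
    }

job = build_ray_job("ft-job-001", "/data/train.jsonl", "meta-llama/Llama-2-7b-chat-hf")

# With a running RayCluster (the head address is deployment-specific):
# from ray.job_submission import JobSubmissionClient
# client = JobSubmissionClient("http://raycluster-head:8265")
# client.submit_job(**job)
```

Because each submitted job declares its resource requirements, the KubeRay autoscaler can add or remove worker nodes as jobs arrive and complete.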
For deployment, given an existing Kubernetes cluster, take the following steps:
Deploy a KubeRay operator using Helm charts.
Build a custom LLM Fine-tuning image and upload it to the cluster.
Create a Ray cluster with autoscaling enabled.
Start the LLM Fine-tuning service.
Alternatives Considered
N/A
Compatibility
This microservice provides OpenAI-compatible fine-tuning interfaces. See the following documents for details:
LLM Fine-tuning Microservice
RFC Content
Author
xwu99
Status
Drafting, will change to Under Review when ready
Miscs
Task List: