
Criss-Wang/dpai


Introduction

Deployable AI aims to enable quick inference serving in a local environment, in a variety of styles of your choice.

Getting Started

Installation

To install this package, the easiest way is to run pip install dpai. If you prefer to install directly from this repo's code, you can clone it and run the make command.

Basic Usage

  1. Save your model in .joblib format. Example:

    from joblib import dump
    
    your_model_artifact = {
        "model": your_model,
        # other metadata
        "tokenizer": ...,
        "quantization": ...,
        ...
    }
    
    dump(your_model_artifact, "MODEL_ARTIFACT_PATH.joblib")
  2. Create an inference script inference.py with two functions, input_fn and predict_fn (similar to how SageMaker inference works). Usually you'll create one inference file for each model you register. Example:

    def input_fn(data):
        processed_data_for_model_input = ...  # some transformation logic
        return processed_data_for_model_input
    
    def predict_fn(input, model):
        result = model(input)
        return result
  3. Register model: run deployaible register --name=your_model_name --model_path=your_model_path --inference_path=your_inference_path

  5. Serve your model: run deployaible serve --port=your_port. You will get a backend running on your_port (the default is 9000). A sample endpoint would be localhost:9000/your_model_name/predict.

  5. Format your data input in JSON style: {"data": your_input_data}. Make sure it aligns with the input_fn in your inference script.

  6. Test the endpoint. Example request:

    curl -X POST -H "Content-Type: application/json" -d '{"data": ["val"]}' http://localhost:9000/GPT4/predict
  7. You can also explore the APIs via the Swagger UI at http://localhost:your_port/docs
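Putting steps 1 and 2 together, here is a minimal runnable sketch. The MeanModel class, file name, and model name are illustrative stand-ins, not part of dpai itself; any object with a predict method (e.g. a scikit-learn estimator) works the same way.

```python
from joblib import dump, load

class MeanModel:
    """Toy stand-in for a real model: predicts the mean of each input row."""
    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]

# Step 1: save the model artifact in .joblib format
dump({"model": MeanModel()}, "mean_model.joblib")

# Step 2: the contents of inference.py
def input_fn(data):
    # "data" is the value of the "data" field in the JSON request body
    return data

def predict_fn(input, model):
    return model.predict(input)

# Steps 3-4 would then register and serve this pair, e.g.:
#   deployaible register --name=mean_model --model_path=mean_model.joblib --inference_path=inference.py
#   deployaible serve --port=9000

# Local sanity check, mimicking what the server does per request:
artifact = load("mean_model.joblib")
print(predict_fn(input_fn([[1.0, 2.0, 3.0]]), artifact["model"]))  # [2.0]
```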

Sample notebook

Highlights

  • Supports multiple types of model serving
  • Sample UI
  • Works on Linux/macOS/Windows

Limitations

  • Currently, the only supported request type is application/json.
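Since only application/json is accepted, requests must carry that content type explicitly. A small standard-library sketch of building such a request (the model name and default port are the illustrative values from the steps above):

```python
import json
import urllib.request

# Build an application/json POST request for the /predict endpoint
payload = json.dumps({"data": ["val"]}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:9000/your_model_name/predict",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, send it with:
# response = urllib.request.urlopen(req)
```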

Documentation

See the doc here

About

Deployable AI is a simple toolkit to serve your ML model inferences quickly
