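# Using a custom Python wrapper in MLflow

This notebook shows how to wrap the Phi-3 mini 4K Instruct model in a custom Python class so that it can be logged and served as an MLflow model.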

In [None]:
# Import the required libraries
import mlflow
from mlflow.models import infer_signature
import onnxruntime_genai as og

### Define a Python class for the Phi-3 Mini 4K model

In [2]:
# Custom Python model class
class Phi3Model(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Retrieve the model path from the logged artifacts
        model_path = context.artifacts["phi3-mini-onnx"]
        model_options = {
            "max_length": 300,
            "temperature": 0.2,
        }

        # Load the model and configure the generation parameters
        self.phi3_model = og.Model(model_path)
        self.params = og.GeneratorParams(self.phi3_model)
        self.params.set_search_options(**model_options)

        # Create the tokenizer
        self.tokenizer = og.Tokenizer(self.phi3_model)

    def predict(self, context, model_input):
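        # Retrieve the prompt from the input and tokenize it
        prompt = model_input["prompt"][0]
        input_ids = self.tokenizer.encode(prompt)
        self.params.input_ids = input_ids

        # Generate the model's response
        response = self.phi3_model.generate(self.params)

        # Decode only the tokens produced after the prompt
        return self.tokenizer.decode(response[0][len(input_ids):])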
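The slice at the end of `predict` assumes that the generated sequence begins with the prompt tokens, so dropping the first `len(input_ids)` entries leaves only the completion. A minimal sketch of that step with plain Python lists follows; the helper name `strip_prompt` and the token ids are made up for illustration and are not part of `onnxruntime_genai`.

```python
def strip_prompt(generated, prompt_ids):
    """Return only the tokens that follow the prompt in a generated sequence."""
    return generated[len(prompt_ids):]


# Hypothetical token ids: the full output is the prompt followed by the completion.
prompt_ids = [101, 7, 42]
generated = [101, 7, 42, 9, 13, 2]

completion = strip_prompt(generated, prompt_ids)
print(completion)
```

If the model produces nothing beyond the prompt, the slice is simply empty, so the decode step would return an empty string rather than fail.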

### Generate the MLflow artifacts

In [3]:
# Log an MLflow model that uses the custom Python model
input_example = {"prompt": "<|system|>You are a helpful AI assistant.<|end|><|user|>What is the capital of Spain?<|end|><|assistant|>"}
artifact_path = "phi3_mlflow_model"

with mlflow.start_run() as run:
    model_info = mlflow.pyfunc.log_model(
        artifact_path = artifact_path,
        python_model = Phi3Model(),
        artifacts = {
            "phi3-mini-onnx": "cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4",
        },
        input_example = input_example,
        signature = infer_signature(input_example, ["Run"]),
        extra_pip_requirements = ["torch", "onnxruntime_genai", "numpy"],
    )

2024/06/18 14:17:34 INFO mlflow.models.utils: We convert input dictionaries to pandas DataFrames such that each key represents a column, collectively constituting a single row of data. If you would like to save data as multiple rows, please convert your data to a pandas DataFrame before passing to input_example.


### Run Phi-3 as an MLflow model

In [4]:
# Load the Phi-3 MLflow model
loaded_model = mlflow.pyfunc.load_model(
    model_uri = model_info.model_uri
)

In [5]:
# Inspect the MLflow model's signature
loaded_model.metadata.signature

inputs: 
  ['prompt': string (required)]
outputs: 
  [string (required)]
params: 
  None

In [6]:
# Test the loaded model
response = loaded_model.predict(
    {"prompt": "<|system|>You are a stand-up comedian.<|end|><|user|>Tell me a joke about atom<|end|><|assistant|>",}
)
print(response)

 Alright, here's a little atom-related joke for you!

Why don't electrons ever play hide and seek with protons?

Because good luck finding them when they're always "sharing" their electrons!

Remember, this is all in good fun, and we're just having a little atomic-level humor!
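
Both prompts in this notebook follow the same pattern: a `<|system|>` turn and a `<|user|>` turn, each closed by `<|end|>`, followed by an opening `<|assistant|>` tag. A small helper can assemble such strings instead of writing them by hand. This is only a sketch: `build_phi3_prompt` is a hypothetical name, not part of MLflow or `onnxruntime_genai`, and it reproduces the newline-free layout used in this notebook rather than any official template.

```python
def build_phi3_prompt(system_message: str, user_message: str) -> str:
    """Assemble a single-turn prompt in the tag layout used in this notebook."""
    return (
        f"<|system|>{system_message}<|end|>"
        f"<|user|>{user_message}<|end|>"
        "<|assistant|>"
    )


# Rebuild the input example used when logging the model.
prompt = build_phi3_prompt(
    "You are a helpful AI assistant.",
    "What is the capital of Spain?",
)
print(prompt)
```

The resulting string can be passed to the loaded model as `{"prompt": prompt}`.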
