I'd like to deploy my custom model on a managed Azure Machine Learning real-time endpoint. I've registered my model, but it's not clear to me how to write a proper scoring script when the model is meant to receive OpenAI-API-compatible calls. The Azure docs provide this basic sample:
import os
import logging
import json
import numpy
import joblib


def init():
    """
    This function is called when the container is initialized/started, typically after create/update of the deployment.
    You can write the logic here to perform init operations like caching the model in memory
    """
    global model
    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)
    # Please provide your model's folder name if there is one
    model_path = os.path.join(
        os.getenv("AZUREML_MODEL_DIR"), "model/sklearn_regression_model.pkl"
    )
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)
    logging.info("Init complete")


def run(raw_data):
    """
    This function is called for every invocation of the endpoint to perform the actual scoring/prediction.
    In the example we extract the data from the json input and call the scikit-learn model's predict()
    method and return the result back
    """
    logging.info("model 1: request received")
    data = json.loads(raw_data)["data"]
    data = numpy.array(data)
    result = model.predict(data)
    logging.info("Request processed")
    return result.tolist()
Has anyone done something similar?
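What I have in mind is roughly the following: keep the init()/run() contract from the sample above, but load the model with vLLM's offline LLM engine and accept an OpenAI-style chat-completions payload. This is an untested sketch, not a working deployment; the sampling defaults and the response shape are my own guesses, and image inputs for Molmo would still need vLLM's multimodal input handling on top of this.

import json
import logging

from vllm import LLM, SamplingParams


def init():
    global llm
    # Load the model with vLLM's offline engine; Molmo's HF repo requires
    # trust_remote_code=True.
    llm = LLM(model="allenai/Molmo-7B-D-0924", trust_remote_code=True)
    logging.info("vLLM engine initialized")


def run(raw_data):
    # Expect an OpenAI-style body: {"messages": [...], "max_tokens": ..., "temperature": ...}
    body = json.loads(raw_data)
    params = SamplingParams(
        temperature=body.get("temperature", 0.7),
        max_tokens=body.get("max_tokens", 512),
    )
    # llm.chat() applies the model's chat template to the messages list
    # (available in recent vLLM versions).
    outputs = llm.chat(body["messages"], params)
    # Return a minimal OpenAI-like response envelope.
    return {
        "choices": [
            {"message": {"role": "assistant", "content": outputs[0].outputs[0].text}}
        ]
    }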
How would you like to use vllm
I want to run inference with Molmo-7B (https://huggingface.co/allenai/Molmo-7B-D-0924) on a managed Azure real-time endpoint, but I don't know how to integrate it with vLLM.
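If a run() like the sketch above worked, I imagine the endpoint could then be called like this (the endpoint URL, key, and payload are placeholders, not values from a real deployment):

import requests

payload = {
    "messages": [{"role": "user", "content": "Describe this image."}],
    "max_tokens": 256,
}
resp = requests.post(
    "https://<my-endpoint>.<region>.inference.ml.azure.com/score",
    headers={
        "Authorization": "Bearer <endpoint-key>",
        "Content-Type": "application/json",
    },
    json=payload,
)
print(resp.json()["choices"][0]["message"]["content"])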
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.