This repo shows a dead-simple example of serving a linear regression model with PyTorch.
For learning purposes, the "model" is split into three concepts:
- Initial Dataset
- Training
- Inference
The model is trained using an initial dataset generated by ./models/dataset/gen_data.py.
The generated data follows a noisy linear trend.
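The generation script itself isn't reproduced here, but a minimal sketch of what a script like gen_data.py might do is below; the slope (2.0), intercept (3.0), noise level, and CSV output format are illustrative assumptions, not the repo's actual values.

```python
# Hypothetical sketch of a data-generation script like gen_data.py.
# Slope, intercept, noise level, and the CSV format are assumptions.
import csv
import random


def gen_data(path: str, n: int = 100, w: float = 2.0, b: float = 3.0) -> None:
    """Write n (x, y) pairs where y = w*x + b plus Gaussian noise."""
    random.seed(0)  # reproducible dataset
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])
        for _ in range(n):
            x = random.uniform(0.0, 10.0)
            y = w * x + b + random.gauss(0.0, 0.5)
            writer.writerow([x, y])


if __name__ == "__main__":
    gen_data("data.csv")
```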
The model is trained using ./model/train.py. The output of the training is a pre-trained serialized model saved in a .pth file.
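train.py's contents aren't shown here, but a minimal sketch of such a training script might look like the following; the architecture (a single nn.Linear layer), optimizer, hyperparameters, and demo data are assumptions for illustration.

```python
# Minimal sketch of a training script like train.py: fit y = w*x + b with
# one nn.Linear layer and SGD, then save the weights as model.pth.
# Hyperparameters and training data here are illustrative assumptions.
import torch
import torch.nn as nn


class LinearModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)  # one input feature, one output

    def forward(self, x):
        return self.linear(x)


def train(xs, ys, epochs=2000, lr=0.01):
    model = LinearModel()
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    x = torch.tensor(xs).unsqueeze(1)  # shape (n, 1)
    y = torch.tensor(ys).unsqueeze(1)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    return model


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [2.0 * v + 3.0 for v in xs]  # noiseless y = 2x + 3 for the demo
    model = train(xs, ys)
    torch.save(model.state_dict(), "model.pth")  # serialized weights
```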
We can now load the model .pth file and use it to make predictions. This is done in ./model/inference.py.
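A sketch of what inference.py might contain is below; the LinearModel class is assumed to mirror the one defined in model/model.py, and the function name is hypothetical.

```python
# Hypothetical sketch of an inference script like inference.py: rebuild
# the architecture, load the saved weights, and run one prediction.
import torch
import torch.nn as nn


class LinearModel(nn.Module):  # assumed to mirror model/model.py
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)


def predict(weights_path: str, value: float) -> float:
    model = LinearModel()
    model.load_state_dict(torch.load(weights_path))
    model.eval()  # switch to inference mode
    with torch.no_grad():
        return model(torch.tensor([[value]])).item()
```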
Example:
$ python model/inference.py
enter value: 9.2
prediction: 24.539400100708008

First, package the model files and the custom TorchServe handler:
torch-model-archiver -f \
--model-name model \
--export-path serve/ \
--version 1.0 \
--serialized-file model/model.pth \
--extra-files "model/model.py" \
--handler serve/handler.py

Then run the server:
torchserve \
--start --ncs \
--model-store serve/ \
--models model.mar \
--log-config serve/log4j2.xml
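The custom handler packaged above (serve/handler.py) isn't reproduced here. A real TorchServe handler subclasses ts.torch_handler.base_handler.BaseHandler; the standalone class below only sketches the preprocess/inference/postprocess hooks, and the request/response shapes are assumptions.

```python
# Hypothetical sketch of the custom handler in serve/handler.py.
# A real handler subclasses ts.torch_handler.base_handler.BaseHandler;
# this standalone version just shows the three hook methods.
import torch


class Handler:
    def __init__(self, model):
        self.model = model  # the deserialized nn.Module

    def preprocess(self, requests):
        # Each request body is JSON like {"data": 5.0}; batch into (n, 1).
        values = [float(r["body"]["data"]) for r in requests]
        return torch.tensor(values).unsqueeze(1)

    def inference(self, batch):
        with torch.no_grad():
            return self.model(batch)

    def postprocess(self, outputs):
        # TorchServe expects one response item per request in the batch.
        return outputs.squeeze(1).tolist()

    def handle(self, requests):
        return self.postprocess(self.inference(self.preprocess(requests)))
```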
(To use a different logging setup, point --log-config at your own file: torchserve --start --log-config /path/to/custom/log4j2.xml.)

Finally, test it:
curl --location --request POST 'http://localhost:8080/predictions/model' \
--header 'Content-Type: application/json' \
--data '{
"data": 5.0
}'
[
16.075029373168945
]

