This is the final project in the Udacity Machine Learning DevOps Engineer Nanodegree. The code was forked from the course repository.
A random forest classifier was trained on the Census Income Data Set from the UCI Machine Learning Repository to predict whether a person makes more than $50,000/year based on their census response. Information on the model can be found in the model card.
The input data and exported model artifacts were tracked using DVC, with the artifact store on Amazon S3. A CI/CD pipeline was set up using GitHub Actions to lint the code using flake8 and run tests with pytest.
A FastAPI application was developed so that users can POST input data to the model and receive a prediction in response. The API documentation provides examples of how to use the API.
If the CI/CD pipeline passes, the model is then deployed to Heroku, where users can interact with it (after the dyno spins up, if it was idle).
Get a welcome message:
Using curl:
curl -X 'GET' \
'https://udacity-fastapi-model.herokuapp.com/' \
-H 'accept: application/json'
Response:
{
"greeting": "Welcome to the FastAPI Census Income Data Set Model API!"
}
Get a salary prediction from the model:
Using curl:
curl -X 'POST' \
'https://udacity-fastapi-model.herokuapp.com/predict' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"age": 25,
"workclass": "State-gov",
"fnlgt": 77516,
"education": "Bachelors",
"education-num": 13,
"marital-status": "Never-married",
"occupation": "Adm-clerical",
"relationship": "Not-in-family",
"race": "White",
"sex": "Male",
"capital-gain": 2174,
"capital-loss": 0,
"hours-per-week": 40,
"native-country": "United-States"
}'
Response:
{
"prediction": "<=50K"
}
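For callers who prefer Python over curl, the same POST request can be sketched with the standard library alone. The URL, headers, and payload mirror the curl example above; the function names and the 60-second timeout (chosen to leave room for an idle dyno to spin up) are assumptions made for illustration.

```python
import json
import urllib.request

BASE_URL = "https://udacity-fastapi-model.herokuapp.com"

# Same record as in the curl example above.
EXAMPLE_RECORD = {
    "age": 25,
    "workclass": "State-gov",
    "fnlgt": 77516,
    "education": "Bachelors",
    "education-num": 13,
    "marital-status": "Never-married",
    "occupation": "Adm-clerical",
    "relationship": "Not-in-family",
    "race": "White",
    "sex": "Male",
    "capital-gain": 2174,
    "capital-loss": 0,
    "hours-per-week": 40,
    "native-country": "United-States",
}


def build_predict_request(record: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build, but do not send, the POST request for the /predict endpoint."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def predict(record: dict, timeout: float = 60.0) -> str:
    """Send the record to the API and return the value of "prediction"."""
    request = build_predict_request(record)
    # The timeout is an assumed value, not one documented by the project.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["prediction"]
```

Calling `predict(EXAMPLE_RECORD)` sends the request; if the deployment is reachable, the returned string corresponds to the `"prediction"` field shown in the response above.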