Who is this for:
- 🙋 You know how to build and save `sklearn` models
- 🙋♀️ You have a Heroku account (or you have 1 minute of spare time to create one)
- 🙋♂️ You want to know how to deploy an sklearn model as a REST API and you're more of a doer than a reader
README Outline:
- 📝 Process overview: Step-by-step instructions for creating and deploying an sklearn model to Heroku.
- 📄 Files/descriptions: All files used in the project laid out with descriptions of how they contribute to the model build and deploy.
  - 🐍 Python files: Files for building the model, serving the model, and requesting a prediction from the deployed model.
  - 🌐 Other important files: 2 files needed for deploying to Heroku.
  - 💾 Data files: A csv for building the example model and the saved model itself.
(Assumes that you'll be working in a git repo and that you have a basic knowledge of git)
- Build and save a model to be deployed using `sklearn` and `pickle` (see `model_build.py`).
- Create a `flask` API endpoint for users to request predictions from your model saved on a server (see `server.py`).
- Deploy the model to Heroku
  - Create a Heroku account if you haven't already
  - Create a `Procfile` that specifies how Heroku should run your project
    - We specify to run `server:app`
    - The `server` represents the file name to run (`server.py`) & the `app` corresponds to how we named the flask app object in `server.py` (`app = Flask(__name__)`)
  - Install the Heroku CLI and run `heroku login` from the command line
  - Run `heroku create` from the command line to do the initial Heroku app setup (running the command like this will create a random initial app name like `battery-horse-staple`). Note that this will add a git remote named `heroku` to your repo; run `git remote -v` to confirm.
  - Run `git push heroku master` (or `git push heroku main`, depending on the name of your repo's main branch) to deploy the application to Heroku. This will take a minute as it installs all of the requirements found in the project's `requirements.txt` (or other Python requirements file type like `Pipfile` from `pipenv`).
  - Locate the URL to your app from the output of running `git push heroku master` (e.g. you'll see a line like `http://frozen-island-12625.herokuapp.com/ deployed to Heroku`).
- Request a prediction from your deployed model (see `request_prediction.py`). The URL for the request will be a combination of the app URL located in the last deploy sub-step and the name of the route specified in `server.py`. For example, the URL shown in `request_prediction.py` has the base URL `http://frozen-island-12625.herokuapp.com` and the route `predict`, so the resulting URL is `http://frozen-island-12625.herokuapp.com/predict`.
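To make the URL composition concrete, here is a minimal sketch. The helper name `build_prediction_url` is hypothetical (it is not part of the project); it only joins a base URL and a route while tolerating a stray slash on either side.

```python
def build_prediction_url(base_url, route):
    """Join the app's base URL and a route name with exactly one slash."""
    return base_url.rstrip("/") + "/" + route.lstrip("/")


# Base URL as it appears in the `git push` output (note the trailing slash).
url = build_prediction_url("http://frozen-island-12625.herokuapp.com/", "predict")
print(url)
```

Stripping slashes on both sides means it does not matter whether you copy the base URL with or without the trailing `/`.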
Each file is heavily commented to give further insight into the contents.
These are in order of how they should be viewed/how they should be created if redoing this process. The model needs to be built by `model_build.py` before serving the model with `server.py`, and the server needs to be live before requesting a prediction with `request_prediction.py`.
`model_build.py`: A script to build the model to be deployed. The model is built using the following from `sklearn`: `ColumnTransformer()`, `Pipeline()`, and `GridSearchCV()`. The model used is a `RandomForestClassifier()`. The final model is then saved to a file using `pickle`.
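As a rough illustration of how those pieces fit together, here is a minimal sketch. It is not the project's actual script. The synthetic data, the column names, the parameter grid, and the file name `example_model.pkl` are all assumptions made so the example runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (assumption): the real project reads a penguins CSV.
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "bill_length_mm": rng.normal(45, 5, n),
    "body_mass_g": rng.normal(4200, 600, n),
    "island": rng.choice(["Biscoe", "Dream"], n),
})
df["species"] = np.where(df["bill_length_mm"] > 45, "Gentoo", "Adelie")

numeric_cols = ["bill_length_mm", "body_mass_g"]
categorical_cols = ["island"]
features = numeric_cols + categorical_cols

# ColumnTransformer: different preprocessing per column type.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Pipeline: preprocessing and classifier travel together as one object.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(random_state=0)),
])

# GridSearchCV: cross-validated search over a small, illustrative grid.
search = GridSearchCV(
    pipeline,
    param_grid={"clf__n_estimators": [10, 25], "clf__max_depth": [2, None]},
    cv=3,
)
search.fit(df[features], df["species"])

# Save the best fitted pipeline with pickle, then load it back as a check.
model_path = Path(tempfile.mkdtemp()) / "example_model.pkl"  # hypothetical name
with open(model_path, "wb") as f:
    pickle.dump(search.best_estimator_, f)
with open(model_path, "rb") as f:
    restored = pickle.load(f)

sample_predictions = restored.predict(df[features].head(3))
```

Pickling the whole pipeline (rather than only the classifier) means the server can pass raw feature values to `predict` without repeating the preprocessing.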
`server.py`: A script to serve the model once on Heroku (or locally). Contents include: loading the pickled model file and creating a `flask` app with an API endpoint to make predictions with the loaded model.
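A minimal sketch of what such a server could look like follows. The `create_app` and `load_model` helpers, the `{"rows": [...]}` request format, and the `{"predictions": [...]}` response format are assumptions for illustration; the project's own `server.py` may structure things differently.

```python
import pickle

from flask import Flask, jsonify, request


def load_model(path):
    """Load a pickled model from disk (only unpickle files you trust)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def create_app(model):
    """Build a Flask app that exposes `model.predict` at POST /predict."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json(silent=True)
        rows = payload.get("rows") if isinstance(payload, dict) else None
        if not rows:
            return jsonify(error="expected a JSON body with a non-empty 'rows' list"), 400
        preds = model.predict(rows)
        # numpy arrays are not JSON serializable, so convert to a plain list.
        labels = preds.tolist() if hasattr(preds, "tolist") else list(preds)
        return jsonify(predictions=labels)

    return app


# In a real server.py, a module-level object named `app` is what
# `server:app` points at, for example (file name is hypothetical):
# app = create_app(load_model("model.pkl"))
```

Passing the model into `create_app` keeps the endpoint easy to exercise locally with Flask's built-in test client and a stand-in model before anything is deployed.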
`request_prediction.py`: A script to request a prediction from the deployed REST API model using the `requests` package.
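The request itself can be sketched as below. This assumes the endpoint accepts a JSON body of the form `{"rows": [...]}` and answers with `{"predictions": [...]}`; both shapes, the function name `request_prediction`, and the `post` parameter are illustrative assumptions rather than the project's actual interface.

```python
import requests


def request_prediction(url, rows, post=requests.post, timeout=10):
    """POST feature rows to a prediction endpoint and return the predictions.

    `post` defaults to `requests.post`; it is a parameter only so the
    function can be tried out without a live server.
    """
    response = post(url, json={"rows": rows}, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of bad JSON
    return response.json()["predictions"]


# Example call against a deployed app (requires the server to be live):
# request_prediction(
#     "http://frozen-island-12625.herokuapp.com/predict",
#     rows=[[39.1, 3750.0]],
# )
```

Setting a `timeout` is worthwhile here because a sleeping app on a hosted platform can take a while to answer the first request.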
`Procfile`: How Heroku will know how to serve your model. In this file you specify what app to run and what file that app can be found in. We use the `gunicorn` package to run our app. For a more complete discussion on the `Procfile` see Heroku's documentation.
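Given the `server:app` target and `gunicorn` described above, a `Procfile` for this setup would most likely be a single line like the following. This is a sketch of the usual form for a web process, not a copy of the project's file.

```
web: gunicorn server:app
```

Here `web` is the Heroku process type that receives HTTP traffic, `server` is the module (`server.py`), and `app` is the Flask object inside it.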
`requirements.txt`: Lists out the requirements for your project (package names and versions). This is a good resource for more.
Note, your project doesn't need to be a deployed model for this file to be useful. Whenever starting a new project, I like to have this item on my getting started TODO list. See another `requirements.txt` example in this `minimal_python_package` repo.
Pickled model file created by `model_build.py`. See `pickle`'s documentation for more.
The example data used to build a predictive model to predict a penguin's species. The data is sourced from the `seaborn` package.