minimal_sklearn_model_deploy 🤖🧠☁️

Who is this for:

  • 🙋 You know how to build and save sklearn models
  • 🙋‍♀️ You have a Heroku account (or you have 1 minute spare time to create one)
  • 🙋‍♂️ You want to know how to deploy an sklearn model as a REST API and you're more of a doer than a reader

README Outline:

  • 📝 Process overview: Step-by-step instructions for creating and deploying an sklearn model to Heroku.
  • 📄 Files/descriptions: All files used in the project laid out with descriptions of how they contribute to the model build and deploy.
    • 🐍 Python files: Files for building the model, serving the model, and requesting a prediction from the deployed model.
    • 🌐 Other important files: 2 files needed for deploying to Heroku.
    • 💾 Data files: A csv for building the example model and the saved model itself.

Process overview

(Assumes that you'll be working in a git repo and that you have a basic knowledge of git)

  1. Build and save a model to be deployed using sklearn and pickle (see the model-building script under Python files)
  2. Create a Flask API endpoint for users to request predictions from your model saved on a server (see the server script under Python files)
  3. Deploy the model to Heroku
    1. Create a Heroku account if you haven't already
    2. Create a Procfile that specifies how Heroku should run your project
      • The Procfile tells gunicorn to run server:app. Here server is the name of the Python file to run, and app is the name given to the Flask app object in that file (app = Flask(__name__))
    3. Install the Heroku CLI and run heroku login from the command line
    4. Run heroku create from the command line to do the initial Heroku app setup (running the command without arguments creates a random initial app name like battery-horse-staple). Note that this adds a git remote named heroku to your repo; run git remote -v to confirm.
    5. Run git push heroku master (or git push heroku main, depending on the name of your repo's main branch) to deploy the application to Heroku. This will take a minute as it installs all of the requirements found in the project's requirements.txt (or another Python requirements file such as a Pipfile from pipenv).
    6. Locate the URL of your app in the output of git push heroku master (you'll see a line ending in deployed to Heroku).
    7. Request a prediction from your deployed model (see the request script under Python files). The URL for the request is the base URL found in step 6 combined with the name of the route specified in the server script. For example, with the route predict, the resulting URL is the base URL followed by /predict.


Files/descriptions

Each file is heavily commented to give further insight into the contents.

Python files

These are listed in the order they should be read (or created, if you are redoing this process): the model must be built before it can be served, and the server must be live before a prediction can be requested.

A script to build the model to be deployed. The model is built using the following from sklearn: ColumnTransformer(), Pipeline(), and GridSearchCV(). The model used is a RandomForestClassifier(). The final model is then saved to a file using pickle.
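
The build step described above might look roughly like the sketch below. It is not the project's actual script: the tiny DataFrame is made-up stand-in data, and the column names, the parameter grid, and the example_model.pkl file name are assumptions chosen for illustration.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up stand-in for the penguin csv (illustrative values only).
data = pd.DataFrame({
    "bill_length_mm": [39.1, 39.5, 40.3, 36.7, 46.5, 50.0,
                       51.3, 45.4, 46.1, 50.0, 48.7, 47.6],
    "flipper_length_mm": [181, 186, 195, 193, 192, 196,
                          193, 188, 211, 230, 210, 215],
    "island": ["Torgersen", "Torgersen", "Biscoe", "Dream", "Dream", "Dream",
               "Dream", "Dream", "Biscoe", "Biscoe", "Biscoe", "Biscoe"],
    "species": ["Adelie"] * 4 + ["Chinstrap"] * 4 + ["Gentoo"] * 4,
})
X = data.drop(columns="species")
y = data["species"]

# Preprocess numeric and categorical columns separately.
preprocess = ColumnTransformer([
    ("numeric", SimpleImputer(strategy="median"),
     ["bill_length_mm", "flipper_length_mm"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["island"]),
])

# Chain preprocessing and the classifier so they are saved as one object.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestClassifier(random_state=0)),
])

# Small, assumed parameter grid; cv=2 only because the toy data is tiny.
search = GridSearchCV(
    pipeline, param_grid={"forest__n_estimators": [10, 50]}, cv=2
)
search.fit(X, y)

# Save the best pipeline with pickle, then load it back to check it works.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "example_model.pkl")  # hypothetical name
    with open(model_path, "wb") as f:
        pickle.dump(search.best_estimator_, f)
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

predictions = restored.predict(X)
```

Pickling the whole pipeline, rather than only the classifier, means the server can pass raw feature values straight to predict without repeating the preprocessing.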

A script to serve the model once on Heroku (or locally). Contents include: loading the pickled model file and creating a Flask app with an API endpoint to make predictions with the loaded model.
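
A serving script along these lines could be sketched as follows. This is an assumption-laden outline rather than the project's server file: the create_app and load_model helpers, the POST method, and the JSON shape (one object of feature names and values in, one prediction out) are all hypothetical choices; only the predict route name and the app = Flask(__name__) naming come from the text above.

```python
import pickle

import pandas as pd
from flask import Flask, jsonify, request


def load_model(path):
    """Load a pickled model from disk (path is supplied by the caller)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def create_app(model):
    """Build a Flask app that serves predictions from the given model."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Assumed request body: one JSON object mapping feature names to values.
        payload = request.get_json()
        features = pd.DataFrame([payload])
        prediction = model.predict(features)
        # str() keeps the response JSON-serializable for numpy string labels.
        return jsonify({"prediction": str(prediction[0])})

    return app


# In a deployed server file, a module-level object named app is what
# server:app in the Procfile points to, for example:
# app = create_app(load_model("path/to/your/pickled/model"))
```

The factory function is only there so the sketch can be exercised with any object that has a predict method; a single module-level app object works just as well.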

A script to request a prediction from the deployed REST API model using the requests package.
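
As a rough sketch of such a request (not the project's actual script), the snippet below joins a base URL and a route and posts the features as JSON. The build_url and request_prediction helpers, the example URL, and the payload shape are assumptions for illustration; the real payload must match whatever the server endpoint expects.

```python
import requests


def build_url(base_url, route):
    """Join the app's base URL and a route name with exactly one slash."""
    return f"{base_url.rstrip('/')}/{route.lstrip('/')}"


def request_prediction(base_url, features, route="predict"):
    """POST the features as JSON and return the decoded JSON response."""
    response = requests.post(build_url(base_url, route), json=features, timeout=30)
    response.raise_for_status()  # surface HTTP errors instead of hiding them
    return response.json()


# Example call (hypothetical app URL and feature names):
# request_prediction(
#     "https://your-app-name.herokuapp.com/",
#     {"bill_length_mm": 39.1, "flipper_length_mm": 181, "island": "Torgersen"},
# )
```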

Other important files

The Procfile tells Heroku how to serve your model. In this file you specify which app to run and which file that app can be found in. We use the gunicorn package to run our app. For a more complete discussion of the Procfile, see Heroku's documentation.

requirements.txt lists the requirements for your project (package names and versions). This is a good resource for more.

Note, your project doesn't need to be a deployed model for this file to be useful. Whenever starting a new project, I like to have this item on my getting started TODO list. See another requirements.txt example in this minimal_python_package repo.

Data files

The pickled model file created by the model-building script. See pickle's documentation for more.

The example data used to build a model that predicts a penguin's species. The data is sourced from the seaborn package.

