This tutorial covers the basics of how to work with data and build predictive models using the ClosedLoop platform. We will use a synthetic data set to create a risk predictor for Emergency Department (ED) utilization. Each section of the tutorial will walk through one step of the process, starting with initial exploration of the raw data and continuing through deployment of the model in an operational setting. After completing this tutorial you will be ready to build models on your own data.
Code, or Point-and-Click?
In this tutorial, you will create a machine learning model with Python using the ClosedLoop API (Application Programming Interface). ClosedLoop also has a full-featured web application that allows you to build models and analyze results using a point-and-click approach. For a non-coding introduction to the ClosedLoop platform, please log in at https://apps.closedloop.ai and follow the web-based tutorial.
Even for coders, following along in the UI may be useful. The API and the UI use the same platform and data, so changes made through the API are reflected in the UI and vice versa. Throughout the tutorial, whenever we introduce a new API component, we provide a link to where that same capability is available in the UI.
This tutorial assumes a basic familiarity with Python and requires that you have an API username and password from ClosedLoop. If you don't have one, please register here or contact firstname.lastname@example.org to create an account and we’ll send you a username and password within 48 hours.
The first notebook describes how to set up the system. The only prerequisites are Python 3.6 or later installed on your system and a way to run Jupyter notebooks.
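Before opening the first notebook, it can help to confirm both prerequisites from a Python prompt. The following is a minimal sketch using only the standard library; it is not part of the ClosedLoop package. The helper names `python_ok` and `jupyter_available` are hypothetical, and the check assumes that Jupyter is provided by either the `notebook` or the `jupyterlab` package, which may not match your setup.

```python
import importlib.util
import sys

# Minimum interpreter version stated in the prerequisites above.
MIN_PYTHON = (3, 6)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def jupyter_available():
    """Return True if a Jupyter front end appears to be installed.

    Assumption: Jupyter is provided by the 'notebook' or 'jupyterlab'
    package. Other ways of running notebooks are not detected.
    """
    candidates = ("notebook", "jupyterlab")
    return any(importlib.util.find_spec(name) is not None for name in candidates)


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Jupyter found:", jupyter_available())
```

If the second check reports `False` but you run notebooks through another tool (for example, an editor with built-in notebook support), you can likely ignore it.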
For privacy reasons, we do not use real data in this tutorial. Instead, we use a simulated set of healthcare administrative claims for a population of 124,000 people. The ClosedLoop platform has built-in support for a wide variety of healthcare data, including electronic medical records (EMR), claims, surveys and assessments, genomic data, location-based data, and patient-generated data. With custom data adapters, virtually any data that can be linked back to a patient can be used for data analysis and predictive modeling.
If you would like to try out the ClosedLoop platform using your own data, please contact email@example.com.
Each section of the tutorial covers a different step in the development of a predictive model. There is one Jupyter notebook for each section. Each notebook can be run individually, but it is best to do the tutorial in order since each section builds on the previous one.
- 1 - Setup - Covers how to install the ClosedLoop Python package and configure your login.
- 2 - Quickstart - Define a predictive model and get results in a few lines of code.
- 3 - Data Exploration & Quality Control - Covers the basics of querying data within the system and assessing data quality.
- 4 - Creating Features with CL Expressions - Introduces the CL Expression language to create features.
- 5 - Defining a Predictor - Defines the outcome for ED Utilization and creates the actual predictor.
- 6 - Selecting Features For Your Model - Explains ClosedLoop's standard and custom features and the feature catalog.
- 7 - Defining a Training Population - Introduces how to select and manage groups of people within the platform.
- 8 - Creating a Model & Reviewing Results - Train, test, and analyze the results for the model.