Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Added intro to MyAutoML to docs * Updated process flow diagrams * Updated process flow diagrams * Unhide ToC in user guide and reference * Add missing model.png * Updated README.md to include a link to the latest documentation. * Added labels to modelling diagram
1 parent
a4626a5
commit 9bbf160
Showing
15 changed files
with
320 additions
and
80 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
.. _environment: | ||
|
||
=========== | ||
Environment | ||
=========== | ||
|
||
To get the most out of MyAutoML, you need to install and set up several components in your environment for MyAutoML | ||
to work with. See the :ref:`ml_process` to find out where these components fit in. | ||
|
||
|
||
.. _model-registry-mlflow: | ||
|
||
Model Registry: MLflow | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
We use `MLflow <https://mlflow.org>`__ as our model registry. Two separate MLflow modules help us keep | ||
a good record of our models: | ||
|
||
- MLflow Tracking | ||
- MLflow Model Registry | ||
|
||
In the :ref:`ml_process`, the term Model Registry refers to both of these MLflow components: every | ||
trained model is tracked in the MLflow Tracking Server, and some models are additionally registered under a registered | ||
model name in the MLflow Model Registry. In the prediction process, a model is loaded from the MLflow Model Registry. | ||
|
||
Please refer to the `installation instructions <https://mlflow.org/docs/latest/quickstart.html#installing-mlflow>`__ and | ||
`MLflow Tracking Servers <https://mlflow.org/docs/latest/tracking.html#mlflow-tracking-servers>`__ to get started. | ||
To use the MLflow Model Registry, you will need to set up an MLflow Tracking Server with a database-backed backend | ||
store, such as SQLite or PostgreSQL. | ||
|
||
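As a rough illustration of how the two MLflow components fit together, the sketch below logs a model to tracking, registers it under a name, and loads it back from the registry. It is an assumption-laden example, not MyAutoML's own code: the helper names (``sqlite_backend_uri``, ``log_and_register``, ``load_registered``) are hypothetical, it assumes ``mlflow`` and ``scikit-learn`` are installed, and it points the client directly at a local SQLite file instead of a running Tracking Server.

```python
from pathlib import Path


def sqlite_backend_uri(db_path):
    """Build a SQLite URI for a database-backed store (hypothetical helper)."""
    return f"sqlite:///{Path(db_path).as_posix()}"


def log_and_register(model, tracking_uri, registered_name):
    """Track a fitted scikit-learn model and register it under a name."""
    # Imported lazily so the module loads even where MLflow is not installed.
    import mlflow
    import mlflow.sklearn

    mlflow.set_tracking_uri(tracking_uri)
    with mlflow.start_run() as run:
        mlflow.log_param("model_class", type(model).__name__)
        # Passing registered_model_name also creates a Model Registry version.
        mlflow.sklearn.log_model(model, "model", registered_model_name=registered_name)
    return run.info.run_id


def load_registered(tracking_uri, registered_name, version):
    """Load a specific registered model version for the prediction process."""
    import mlflow
    import mlflow.pyfunc

    mlflow.set_tracking_uri(tracking_uri)
    return mlflow.pyfunc.load_model(f"models:/{registered_name}/{version}")
```

A training script might call ``log_and_register(fitted_model, sqlite_backend_uri("mlflow.db"), "my-model")``, and a prediction script might later call ``load_registered(sqlite_backend_uri("mlflow.db"), "my-model", 1)``; the model name and version here are placeholders.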
.. toctree:: | ||
:maxdepth: 2 | ||
:hidden: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
.. _ml_process: | ||
|
||
======================== | ||
Machine Learning Process | ||
======================== | ||
|
||
MyAutoML aims to cover two main processes: training a model and making predictions with a trained model. They run | ||
separately, and each is executed by running a Python script, e.g. :code:`train.py` and :code:`predict.py`. The setup | ||
can be as simple or as complex as you like: you can run the scripts manually (even from a Jupyter notebook), or run | ||
them automatically in a Docker container on a Kubernetes platform, scheduled by Airflow. | ||
|
||
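The separation between the two scripts can be sketched with a toy example. This is a minimal illustration using only the standard library, not MyAutoML's API: the functions ``train`` and ``predict`` stand in for :code:`train.py` and :code:`predict.py`, the model is a one-variable least-squares line, and a JSON file stands in for the model registry.

```python
import json
import tempfile
from pathlib import Path


def train(rows, model_path):
    """Stand-in for train.py: fit y = slope * x + intercept and persist it."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in rows)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = cov_xy / var_x
    model = {"slope": slope, "intercept": mean_y - slope * mean_x}
    # Persisting the model is the only link between the two processes.
    Path(model_path).write_text(json.dumps(model))
    return model


def predict(new_xs, model_path):
    """Stand-in for predict.py: load the stored model and score a batch."""
    model = json.loads(Path(model_path).read_text())
    return [model["slope"] * x + model["intercept"] for x in new_xs]


with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "model.json"
    train([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)], model_file)
    predictions = predict([3.0, 4.0], model_file)

print(predictions)  # [7.0, 9.0]
```

Because the prediction step only needs the stored model, it can run on a different schedule or machine than the training step.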
|
||
Training | ||
-------- | ||
|
||
The purpose of the training process is to take some data, process it with a chosen algorithm and produce a | ||
model that captures the relevant patterns in the training data. | ||
|
||
.. figure:: ../images/training-process.png | ||
:width: 100% | ||
:align: center | ||
|
||
|
||
Predicting | ||
---------- | ||
|
||
The goal of the prediction process is to apply a trained model to new data to make predictions. | ||
A prediction script can make predictions for a batch of items, or it can serve an API for real-time, on-demand | ||
predictions. | ||
|
||
.. figure:: ../images/prediction-process.png | ||
:width: 100% | ||
:align: center | ||
|
||
|
||
Calibrating | ||
----------- | ||
|
||
In some classification use cases we need to `calibrate <https://scikit-learn.org/stable/modules/calibration.html>`_ | ||
the output of our models to actual probabilities, rather than generic scores. While sometimes this can be done | ||
directly in the training process, in other cases it is more pragmatic to train a model first, and perform the | ||
calibration separately using the following process: | ||
|
||
.. figure:: ../images/calibrating-process.png | ||
:width: 100% | ||
:align: center | ||
|
||
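To make the idea of a separate calibration step concrete, here is a minimal sketch of sigmoid (Platt-style) calibration in plain Python. It is an illustration under assumptions rather than the method MyAutoML uses: the scores and labels are made up, the fit is plain gradient descent on the log loss without Platt's target smoothing, and in practice a library routine such as scikit-learn's ``CalibratedClassifierCV`` would normally be used on held-out data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw model score to a calibrated probability."""
    return sigmoid(a * score + b)


# Made-up held-out scores from an already trained model, with true labels.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
probabilities = [calibrate(s, a, b) for s in scores]
```

The trained model itself is left untouched; only the mapping from score to probability is learned, which is why this step can run as its own process after training.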
|
||
Further reading | ||
--------------- | ||
|
||
.. raw:: html | ||
|
||
<div class="container"> | ||
<div id="accordion" class="shadow tutorial-accordion"> | ||
|
||
<div class="card tutorial-card"> | ||
<div class="card-header collapsed card-link" data-toggle="collapse" data-target="#collapse_1"> | ||
<div class="d-flex flex-row tutorial-card-header-1"> | ||
<div class="d-flex flex-row tutorial-card-header-2"> | ||
<button class="btn btn-dark btn-sm"></button> | ||
Training a model | ||
</div> | ||
</div> | ||
</div> | ||
<div id="collapse_1" class="collapse" data-parent="#accordion"> | ||
<div class="card-body"> | ||
|
||
`Wikipedia: Training, validation, and test sets | ||
<https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets/>`_ | ||
|
||
`Machine Learning Mastery: How to Use ROC Curves and Precision-Recall Curves for Classification in Python | ||
<https://machinelearningmastery.com/roc-curves-and-precision-recall-curves-for-classification-in-python/>`_ | ||
|
||
.. raw:: html | ||
|
||
</div> | ||
</div> | ||
</div> | ||
<div class="card tutorial-card"> | ||
<div class="card-header collapsed card-link" data-toggle="collapse" data-target="#collapse_2"> | ||
<div class="d-flex flex-row tutorial-card-header-1"> | ||
<div class="d-flex flex-row tutorial-card-header-2"> | ||
<button class="btn btn-dark btn-sm"></button> | ||
Making predictions | ||
</div> | ||
</div> | ||
</div> | ||
<div id="collapse_2" class="collapse" data-parent="#accordion"> | ||
<div class="card-body"> | ||
|
||
.. raw:: html | ||
|
||
</div> | ||
</div> | ||
</div> | ||
<div class="card tutorial-card"> | ||
<div class="card-header collapsed card-link" data-toggle="collapse" data-target="#collapse_3"> | ||
<div class="d-flex flex-row tutorial-card-header-1"> | ||
<div class="d-flex flex-row tutorial-card-header-2"> | ||
<button class="btn btn-dark btn-sm"></button> | ||
Model calibration | ||
</div> | ||
</div> | ||
</div> | ||
<div id="collapse_3" class="collapse" data-parent="#accordion"> | ||
<div class="card-body"> | ||
|
||
`Machine Learning Mastery: How and When to Use a Calibrated Classification Model with scikit-learn | ||
<https://machinelearningmastery.com/calibrated-classification-model-in-scikit-learn/>`_ | ||
|
||
.. raw:: html | ||
|
||
</div> | ||
</div> | ||
</div> | ||
|
||
</div> | ||
</div> | ||
|
||
|
||
.. toctree:: | ||
:maxdepth: 2 | ||
:hidden: |