Drift guide 1066 (#1447)
* added blank file

* draft

* small fixes to feature_importance.rst

* drift guide draft

* drift guide draft

* drift guide draft

* Apply suggestions from code review

Co-authored-by: Noam Bressler <noamzbr@gmail.com>

* Fixes

* Fixes

* measure instead of statistical test

* Fixes

* Fixes

* Fixes

* Bressler was right about grammar

* Apply suggestions from code review

Co-authored-by: Noam Bressler <noamzbr@gmail.com>

* PR Fixes

* Edited tabular docs

* Finished vision docs

* Added images

* Some small fixes

* Apply suggestions from code review

Co-authored-by: shir22 <33841818+shir22@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: shir22 <33841818+shir22@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: shir22 <33841818+shir22@users.noreply.github.com>

* small changes

* comments

* Apply suggestions from code review

Co-authored-by: shir22 <33841818+shir22@users.noreply.github.com>

* Fixed comments

* Additional changes

* Additional changes

* Fixed links

* Fixed links

* Fixed links

* Fixed links

* Fixed comments

* Apply suggestions from code review

Co-authored-by: Noam Bressler <noamzbr@gmail.com>

* Last Bressler comment

* Fixed comments
Added to index

* .

Co-authored-by: Noam Bressler <noamzbr@gmail.com>
Co-authored-by: shir22 <33841818+shir22@users.noreply.github.com>
3 people committed May 29, 2022
1 parent 40df64d commit e87b2cd
Showing 14 changed files with 485 additions and 244 deletions.
[3 changed files could not be displayed in this view; possibly the images added in this PR]
@@ -7,45 +7,31 @@
**Structure:**

* `What Is Prediction Drift? <#what-is-prediction-drift>`__
* `Generate Data <#generate-data>`__
* `Build Model <#build-model>`__
* `Run Check <#run-check>`__

What Is Prediction Drift?
=========================

Drift is simply a change in the distribution of data over time, and it is
also one of the top reasons why a machine learning model's performance
degrades over time.

Prediction drift is when drift occurs in the prediction itself.

Calculating prediction drift is especially useful in cases in which labels
are not available for the test dataset, so a drift in the predictions is
our only indication that a change has happened in the data that actually
affects model predictions. If labels are available, it's also recommended
to run the :doc:`Label Drift check </checks_gallery/tabular/train_test_validation/plot_train_test_label_drift>`.

For more information on drift, please visit our :doc:`drift guide </user-guide/general/drift_guide>`.

How Deepchecks Detects Prediction Drift
---------------------------------------

This check detects prediction drift by using :ref:`univariate measures <drift_detection_by_univariate_measure>`
on the prediction output.
"""

@@ -14,46 +14,22 @@
What is a feature drift?
========================

Drift is simply a change in the distribution of data over time, and it is
also one of the top reasons why a machine learning model's performance
degrades over time.

Feature drift is a data drift that occurs in a single feature in the dataset.

For more information on drift, please visit our :doc:`drift guide </user-guide/general/drift_guide>`.

How Deepchecks Detects Feature Drift
------------------------------------

This check detects feature drift by using :ref:`univariate measures <drift_detection_by_univariate_measure>`
on each feature column separately.

Another possible method for drift detection is :ref:`a domain classifier <drift_detection_by_domain_classifier>`,
which is used in the :doc:`Whole Dataset Drift check </checks_gallery/tabular/train_test_validation/plot_whole_dataset_drift>`.
"""

@@ -2,6 +2,32 @@
"""
Train Test Label Drift
**********************

This notebook provides an overview for using and understanding the label drift check.

**Structure:**

* `What Is Label Drift? <#what-is-label-drift>`__
* `Run Check on a Classification Label <#run-check-on-a-classification-label>`__
* `Run Check on a Regression Label <#run-check-on-a-regression-label>`__
* `Add a Condition <#add-a-condition>`__

What Is Label Drift?
====================

Drift is simply a change in the distribution of data over time, and it is
also one of the top reasons why a machine learning model's performance
degrades over time.

Label drift is when drift occurs in the label itself.

For more information on drift, please visit our :doc:`drift guide </user-guide/general/drift_guide>`.

How Deepchecks Detects Label Drift
----------------------------------

This check detects label drift by using :ref:`univariate measures <drift_detection_by_univariate_measure>`
on the label column.
"""

#%%
@@ -15,14 +41,17 @@
from deepchecks.tabular.checks import TrainTestLabelDrift

#%%
# Run Check on a Classification Label
# ====================================

# Generate data:
# --------------

np.random.seed(42)

train_data = np.concatenate([np.random.randn(1000,2), np.random.choice(a=[1,0], p=[0.5, 0.5], size=(1000, 1))], axis=1)
# Create test_data with drift in label:
test_data = np.concatenate([np.random.randn(1000,2), np.random.choice(a=[1,0], p=[0.35, 0.65], size=(1000, 1))], axis=1)

df_train = pd.DataFrame(train_data, columns=['col1', 'col2', 'target'])
df_test = pd.DataFrame(test_data, columns=['col1', 'col2', 'target'])
@@ -36,16 +65,19 @@

#%%
# Run Check
# ===============================

check = TrainTestLabelDrift()
result = check.run(train_dataset=train_dataset, test_dataset=test_dataset)
result

#%%
# Run Check on a Regression Label
# ================================

# Generate data:
# --------------

train_data = np.concatenate([np.random.randn(1000,2), np.random.randn(1000, 1)], axis=1)
test_data = np.concatenate([np.random.randn(1000,2), np.random.randn(1000, 1)], axis=1)

@@ -59,14 +91,15 @@

#%%
# Run Check
# ---------

check = TrainTestLabelDrift()
result = check.run(train_dataset=train_dataset, test_dataset=test_dataset)
result

#%%
# Add a Condition
# ===============

check_cond = TrainTestLabelDrift().add_condition_drift_score_not_greater_than()
check_cond.run(train_dataset=train_dataset, test_dataset=test_dataset)
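#%%
# The condition above uses default thresholds. A hedged sketch of passing
# explicit ones; the parameter names follow the deepchecks API of this
# period, so verify them against your installed version:

check_cond = TrainTestLabelDrift().add_condition_drift_score_not_greater_than(
    max_allowed_psi_score=0.2, max_allowed_earth_movers_score=0.1)
check_cond.run(train_dataset=train_dataset, test_dataset=test_dataset)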
@@ -8,53 +8,31 @@
**Structure:**

* `What Is Multivariate Drift? <#what-is-multivariate-drift>`__
* `Loading the Data <#loading-the-data>`__
* `Run the Check <#run-the-check>`__
* `Define a Condition <#define-a-condition>`__

What Is Multivariate Drift?
===========================

Drift is simply a change in the distribution of data over time, and it is
also one of the top reasons why a machine learning model's performance
degrades over time.

A multivariate drift is a drift that occurs in more than one feature at a
time, and may even affect the relationships between those features; such
changes cannot be detected by univariate drift methods.

The whole dataset drift check tries to detect multivariate drift between the
two input datasets.

For more information on drift, please visit our :doc:`drift guide </user-guide/general/drift_guide>`.

How Deepchecks Detects Dataset Drift
------------------------------------

This check detects multivariate drift by using :ref:`a domain classifier <drift_detection_by_domain_classifier>`.
Other methods to detect drift include :ref:`univariate measures <drift_detection_by_univariate_measure>`,
which are used in other checks, such as the :doc:`Train Test Feature Drift check </checks_gallery/tabular/train_test_validation/plot_train_test_feature_drift>`.
"""

@@ -89,7 +67,7 @@
train_ds.label_name

#%%
# Run the Check
# =============
from deepchecks.tabular.checks import WholeDatasetDrift

@@ -129,7 +107,7 @@
# contributed the most to that drift. This is reasonable since the sampling
# was biased based on that feature.
#
# Define a Condition
# ==================
# Now, we define a condition enforcing that the whole dataset drift score must be
# below 0.1. A condition is deepchecks' way to validate model and data quality,
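#%%
# A hedged sketch of such a condition; the method name follows the deepchecks
# API of this period, and ``test_ds`` is assumed to be defined alongside
# ``train_ds`` above, so verify both against your installed version:

check = WholeDatasetDrift().add_condition_overall_drift_value_not_greater_than(0.1)
check.run(train_dataset=train_ds, test_dataset=test_ds)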
