VectorInstitute · amrit110 · May 22, 2024 · May 22, 2024
diff --git a/docs/source/evaluation.rst b/docs/source/evaluation.rst
@@ -5,15 +5,16 @@ The Evaluation API equips you with a rich toolbox to assess your models across k
 dimensions. Dive into detailed performance metrics, unveil potential fairness
 concerns, and gain granular insights through data slicing.
 
-Key capabilities:
+Key capabilities
+****************
 
-    * Performance: Employ a robust selection of common metrics to evaluate your
+    * **Performance**: Employ a robust selection of common metrics to evaluate your
       model's effectiveness and identify areas for improvement.
-    * Fairness: Uncover and analyze potential biases within your model to ensure
-      responsible and equitable outcomes.
-    * Data slicing: Isolate the model's behavior on specific subsets of your
+    * **Data slicing**: Isolate the model's behavior on specific subsets of your
       data, revealing performance nuances across demographics, features, or other
       important characteristics.
+    * **Fairness**: Uncover and analyze potential biases within your model to ensure
+      responsible and equitable outcomes.
 
     .. image:: https://github.com/VectorInstitute/cyclops/assets/8986523/416170db-1265-42a3-a3c1-d34558b72b65
 

diff --git a/docs/source/examples/metrics.ipynb b/docs/source/examples/metrics.ipynb
@@ -6,7 +6,7 @@
    "source": [
     "# Breast Cancer Classification and Evaluation\n",
     "\n",
-    "The Breast Cancer dataset is a well-suited example for demonstrating Cyclops features due to its two distinct classes (binary classification) and complete absence of missing values. This clean and organized structure makes it an ideal starting point for exploring Cyclops Evaluator."
+    "The Breast Cancer dataset is a well-suited example for demonstrating CyclOps features due to its two distinct classes (binary classification) and complete absence of missing values. This clean and organized structure makes it an ideal starting point for exploring CyclOps Evaluator."
    ]
   },
   {
@@ -27,7 +27,8 @@
     "from cyclops.evaluate.fairness import evaluate_fairness\n",
     "from cyclops.evaluate.metrics import BinaryAccuracy, create_metric\n",
     "from cyclops.evaluate.metrics.experimental import BinaryAUROC, BinaryAveragePrecision\n",
-    "from cyclops.evaluate.metrics.experimental.metric_dict import MetricDict"
+    "from cyclops.evaluate.metrics.experimental.metric_dict import MetricDict\n",
+    "from cyclops.report.plot.classification import ClassificationPlotter"
    ]
   },
   {
@@ -86,7 +87,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Now we can use Cyclops evaluation metrics to evaluate our model's performance. You can either use each metric individually by calling them, or define a ``MetricDict`` object.\n",
+    "Now we can use CyclOps evaluation metrics to evaluate our model's performance. You can either call each metric individually or define a ``MetricDict`` object.\n",
     "Here, we show both methods."
    ]
   },
@@ -172,7 +173,7 @@
     "spec_list = [\n",
     "    {\n",
     "        \"worst radius\": {\n",
-    "            \"min_value\": 10.0,\n",
+    "            \"min_value\": 14.0,\n",
     "            \"max_value\": 15.0,\n",
     "            \"min_inclusive\": True,\n",
     "            \"max_inclusive\": False,\n",
@@ -181,7 +182,15 @@
     "    {\n",
     "        \"worst radius\": {\n",
     "            \"min_value\": 15.0,\n",
-    "            \"max_value\": 37.0,\n",
+    "            \"max_value\": 17.0,\n",
+    "            \"min_inclusive\": True,\n",
+    "            \"max_inclusive\": False,\n",
+    "        },\n",
+    "    },\n",
+    "    {\n",
+    "        \"worst texture\": {\n",
+    "            \"min_value\": 23.1,\n",
+    "            \"max_value\": 28.7,\n",
     "            \"min_inclusive\": True,\n",
     "            \"max_inclusive\": False,\n",
     "        },\n",
@@ -190,13 +199,39 @@
     "slice_spec = SliceSpec(spec_list)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Intersectional slicing"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "When subpopulation slices are specified using the ``SliceSpec``, sometimes we wish to create combinations of intersectional slices. We can use the ``intersections`` argument to specify this."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "slice_spec = SliceSpec(spec_list, intersections=2)\n",
+    "slice_spec"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Preparing Result\n",
     "\n",
-    "Cyclops Evaluator takes data as a HuggingFace Dataset object, so we combine predictions and features in a dataframe, and create a `Dataset` object:"
+    "CyclOps Evaluator takes data as a Hugging Face Dataset object, so we combine predictions and features in a dataframe, and create a `Dataset` object:"
    ]
   },
   {
@@ -219,7 +254,6 @@
    "source": [
     "# Create Dataset object\n",
     "breast_cancer_data = Dataset.from_pandas(df)\n",
-    "\n",
     "breast_cancer_sliced_result = evaluator.evaluate(\n",
     "    dataset=breast_cancer_data,\n",
     "    metrics=metric_collection,  # type: ignore[list-item]\n",
@@ -233,7 +267,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "And here's the evaluation result for the data slices we defined:"
+    "We can visualize the ``BinaryF1Score`` and ``BinaryPrecision`` for the different slices:"
    ]
   },
   {
@@ -242,7 +276,22 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "breast_cancer_sliced_result"
+    "# Extracting the metric values for all the slices.\n",
+    "slice_metrics = {\n",
+    "    slice_name: {\n",
+    "        metric_name: metric_value\n",
+    "        for metric_name, metric_value in slice_results.items()\n",
+    "        if metric_name in [\"BinaryF1Score\", \"BinaryPrecision\"]\n",
+    "    }\n",
+    "    for slice_name, slice_results in breast_cancer_sliced_result[\n",
+    "        \"model_for_preds_prob\"\n",
+    "    ].items()\n",
+    "}\n",
+    "# Plotting the metric values for all the slices.\n",
+    "plotter = ClassificationPlotter(task_type=\"binary\", class_names=[\"0\", \"1\"])\n",
+    "plotter.set_template(\"plotly_white\")\n",
+    "slice_metrics_plot = plotter.metrics_comparison_bar(slice_metrics)\n",
+    "slice_metrics_plot.show()"
    ]
   },
   {
@@ -280,7 +329,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "cyclops",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
@@ -294,9 +343,9 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.11"
+   "version": "3.10.12"
   }
  },
  "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
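Note on the ``intersections=2`` argument added in the metrics.ipynb hunk above: the following is a minimal sketch of what pairwise intersection of slice specs could look like. It does not call CyclOps. It assumes that ``intersections=2`` means "combine every pair of specs from ``spec_list``", and it additionally assumes (a simplification made here, not confirmed by the diff) that pairs constraining the same feature are skipped. ``pairwise_intersections`` is a hypothetical helper, and the actual ``SliceSpec`` behavior may differ.

```python
from itertools import combinations

# Slice specs copied from the notebook diff above.
spec_list = [
    {
        "worst radius": {
            "min_value": 14.0,
            "max_value": 15.0,
            "min_inclusive": True,
            "max_inclusive": False,
        },
    },
    {
        "worst radius": {
            "min_value": 15.0,
            "max_value": 17.0,
            "min_inclusive": True,
            "max_inclusive": False,
        },
    },
    {
        "worst texture": {
            "min_value": 23.1,
            "max_value": 28.7,
            "min_inclusive": True,
            "max_inclusive": False,
        },
    },
]


def pairwise_intersections(specs, size=2):
    """Hypothetical helper: merge every group of `size` specs into one spec.

    Assumption: groups that constrain the same feature more than once are
    skipped, because merging them would silently overwrite one constraint.
    """
    combined = []
    for group in combinations(specs, size):
        merged = {}
        conflict = False
        for spec in group:
            if any(feature in merged for feature in spec):
                conflict = True
                break
            merged.update(spec)
        if not conflict:
            combined.append(merged)
    return combined


intersectional_specs = pairwise_intersections(spec_list, size=2)
for spec in intersectional_specs:
    print(sorted(spec))
```

Under these assumptions, each combined spec selects rows that satisfy both a ``worst radius`` range and the ``worst texture`` range, which is the kind of subgroup the "Intersectional slicing" cell describes.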
diff --git a/docs/source/tutorials/kaggle/heart_failure_prediction.ipynb b/docs/source/tutorials/kaggle/heart_failure_prediction.ipynb
@@ -1374,7 +1374,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
@@ -1388,7 +1388,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.10"
+   "version": "3.10.12"
   }
  },
  "nbformat": 4,