HelmholtzAI-Consultants-Munich · neuronflow · Dec 4, 2023 · Dec 4, 2023
diff --git a/xai-for-tabular-data/Tutorial_SHAP.ipynb b/xai-for-tabular-data/Tutorial_SHAP.ipynb
@@ -17,7 +17,7 @@
    "source": [
     "# Model-Agnostic Interpretation with SHAP\n",
     "\n",
-    "In this Notebook we will demonstrate how to use the SHapley Additive exPlanations (SHAP) method and interpret its results.\n",
+    "In this notebook, we will demonstrate how to use the SHapley Additive exPlanations (SHAP) method and interpret its results.\n",
     "\n",
     "--------"
    ]
@@ -43,7 +43,7 @@
    "id": "1c99c0f2",
    "metadata": {},
    "source": [
-    "Now that you opened the notebook in Google Colab follow the next step:\n",
+    "Now that you have opened the notebook in Google Colab, follow these steps:\n",
     "\n",
     "1. Run this cell to connect your Google Drive to Colab and install packages\n",
     "2. Allow this notebook to access your Google Drive files. Click on 'Yes', and select your account.\n",
@@ -148,7 +148,7 @@
    "source": [
     "## The California Housing Dataset: Data and Model Loading\n",
     "\n",
-    "In this notebook, we will work with the **California Housing dataset**, containing 20,640 median house values for California districts (expressed in $100,000), which are described by eight numeric features. Each row in the dataset represents a block of houses, not a single household. The data pertains to the house prices found in a given California district and some summary statistics about them based on the 1990 census data. Our goal is to **predict price** of house blocks and find the most predictive features.\n",
+    "In this notebook, we will work with the **California Housing dataset**, containing 20,640 median house values for California districts (expressed in units of $100,000), described by eight numeric features. Each row in the dataset represents a block of houses, not a single household. The data pertains to the house prices found in a given California district and some summary statistics about them based on the 1990 census data. Our goal is to **predict the price** of house blocks and find the most predictive features.\n",
     "\n",
     "<center><img src=\"https://github.com/HelmholtzAI-Consultants-Munich/XAI-Tutorials/blob/main/docs/source/_figures/dataset_california_housing.jpg?raw=true\" width=\"900\" /></center>\n",
     "\n",
@@ -160,7 +160,7 @@
    "id": "641755c5",
    "metadata": {},
    "source": [
-    "In the notebook [*Dataset-Housing.ipynb*](../data_and_models/Dataset-Housing.ipynb), we explain how to do the exploratory data analysis, preprocess the data and in the notebook [*Model-RandomForest.ipynb*](../data_and_models/Model-RandomForest.ipynb) we train a Random Forest model with the given data. This notebook focuses on the interpretation of the trained model and not on the data pre-processing or model training part. Hence, here we load the data and the model that we saved in the previous notebook."
+    "In the notebook [*Dataset-Housing.ipynb*](../data_and_models/Dataset-Housing.ipynb), we explain how to perform the exploratory data analysis and preprocess the data, and in the notebook [*Model-RandomForest.ipynb*](../data_and_models/Model-RandomForest.ipynb), we train a Random Forest model on that data. This notebook focuses on interpreting the trained model, not on data preprocessing or model training. Hence, here we load the data and model saved in the previous notebooks."
    ]
   },
   {
@@ -667,9 +667,9 @@
     "cell_marker": "'''"
    },
    "source": [
-    "The average prediction for all houses in all the census blocks is labeled as the *base value* here, which is about 2.08. The predicted median house price in this census block is 2.21 and is labeled as the *f(x)*.\n",
+    "The average prediction for all houses in all the census blocks is labeled as the *base value* here, which is about 2.08. The predicted median house price in this census block is 2.21, labeled as *f(x)*.\n",
     "\n",
-    "Features that increase the predicted price from the *base value* are colored in red and are distinguished from each other by arrows pointing to the right. Features that decrease the predicted price are colored in blue with left-pointing arrows. Features with larger effects on the prediction, occupy more space in the row of arrows. The two sets of features point to the *output value*. The names of the features and their values are printed below the row of arrows.\n",
+    "Features that increase the predicted price from the *base value* are colored in red and are distinguished from each other by arrows pointing to the right. Features that decrease the predicted price are colored in blue with left-pointing arrows. Features with larger effects on the prediction occupy more space in the row of arrows. The two sets of features point to the *output value*. The features' names and values are printed below the row of arrows.\n",
     "\n",
     "You can find more advanced use cases for decision and force plots [here](https://shap.readthedocs.io/en/latest/example_notebooks/api_examples/plots/decision_plot.html)."
    ]
@@ -1413,9 +1413,9 @@
    "source": [
     "### Global Explanations\n",
     "\n",
-    "For the global explanations we can visualize a combined bar plot that shows the average absolute SHAP values stacked per class.\n",
+    "For the global explanations, we can visualize a combined bar plot that shows the average absolute SHAP values stacked per class.\n",
     "\n",
-    "*Note: the shap.plots.bar() fucntion of the new package does currently not work for multi-class classiciation problem. Instaed we have to use the old shap.summary_plot() function.*"
+    "*Note: the shap.plots.bar() function of the new package currently does not work for multi-class classification problems. Instead, we have to use the old shap.summary_plot() function.*"
    ]
   },
   {
@@ -1462,7 +1462,7 @@
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "No data for colormapping provided via 'c'. Parameters 'vmin', 'vmax' will be ignored\n"
+      "No data for color mapping was provided via 'c'. Parameters 'vmin', 'vmax' will be ignored\n"
      ]
     },
     {
@@ -1544,7 +1544,7 @@
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "No data for colormapping provided via 'c'. Parameters 'vmin', 'vmax' will be ignored\n"
+      "No data for color mapping was provided via 'c'. Parameters 'vmin', 'vmax' will be ignored\n"
      ]
     },
     {