language fixes #28

Merged
merged 1 commit into from
Dec 4, 2023
58 changes: 36 additions & 22 deletions xai-model-for-1d-data/Tutorial_attention_map_for_text.ipynb
@@ -14,11 +14,18 @@
"# XAI in Deep Learning-Based Signal Analysis: Attention Maps for Text"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this Notebook, we will show how to produce attention maps for textual data. \n",
"This Notebook shows how to produce attention maps for textual data. \n",
"\n",
"--------"
]
@@ -31,8 +38,8 @@
"\n",
"### Setup Colab environment\n",
"\n",
"If you installed the packages and requirements on your machine, you can skip this section and start from the import section.\n",
"Otherwise, you can follow and execute the tutorial on your browser. To start working on the notebook, click on the following button. This will open this page in the Colab environment and you will be able to execute the code on your own.\n",
"If you installed the packages and requirements on your machine, you can skip this section and start from the import section.\n",
"Otherwise, you can follow and execute the tutorial on your browser. To start working on the notebook, click on the following button. This will open this page in the Colab environment, and you will be able to execute the code on your own.\n",
"\n",
"<a href=\"https://colab.research.google.com/github/HelmholtzAI-Consultants-Munich/Zero2Hero---Introduction-to-XAI/blob/Juelich-2023/xai-model-for-1d-data/Tutorial_attention_map_for_text.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
@@ -41,13 +48,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that you opened the notebook in Google Colab, follow the next step:\n",
"Now that you have opened the notebook in Google Colab, follow these steps:\n",
"\n",
"1. Run this cell to connect your Google Drive to Colab and install packages\n",
"2. Allow this notebook to access your Google Drive files. Click on 'Yes', and select your account.\n",
"3. \"Google Drive for desktop wants to access your Google Account\". Click on 'Allow'.\n",
" \n",
"At this point, a folder has been created in your Drive, and you can navigate it through the lefthand panel in Colab. You might also have received an email that informs you about the access on your Google Drive."
"A folder has been created in your Drive, and you can navigate it through the left-hand panel in Colab. You might also receive an email informing you about the access to your Google Drive."
]
},
{
@@ -60,7 +67,7 @@
"drive.mount('/content/drive')\n",
"%cd /content/drive/MyDrive\n",
"!git clone --branch Juelich-2023 https://github.com/HelmholtzAI-Consultants-Munich/XAI-Tutorials.git\n",
"%cd XAI-Tutorials/xai-model-for-1d-data"
"%cd XAI-Tutorials/xai-model-for-1d-data\n"
]
},
{
@@ -81,7 +88,7 @@
"import matplotlib.ticker as ticker\n",
"import torch\n",
"\n",
"from transformers import AutoTokenizer, AutoModelForSeq2SeqLM"
"from transformers import AutoTokenizer, AutoModelForSeq2SeqLM\n"
]
},
{
@@ -90,7 +97,7 @@
"metadata": {},
"outputs": [],
"source": [
"weights_path = \"../data_and_models/t5_small_weights\""
"weights_path = \"../data_and_models/t5_small_weights\"\n"
]
},
{
@@ -113,7 +120,7 @@
"source": [
"!mkdir ../data_and_models/t5_small_weights/\n",
"!curl -L \"https://www.dropbox.com/scl/fi/r3zc8w4551l9nyq08cnso/t5.zip?rlkey=vcwmz0cuzx80irainvsfs8gsm&dl=0\" > ../data_and_models/t5_small_weights/t5_small.zip\n",
"!unzip /p/project/training2324/benassou1/XAI-Tutorials/data_and_models/t5_small_weights/t5_small.zip -d ../data_and_models/t5_small_weights/"
"!unzip /p/project/training2324/benassou1/XAI-Tutorials/data_and_models/t5_small_weights/t5_small.zip -d ../data_and_models/t5_small_weights/\n"
]
},
{
@@ -137,14 +144,21 @@
" ax.xaxis.set_major_locator(ticker.MultipleLocator(1))\n",
" ax.yaxis.set_major_locator(ticker.MultipleLocator(1))\n",
"\n",
" plt.show()"
" plt.show()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have fine-tuned a sequence-to-sequence model using the huggingface library for translation from English to French. Let's load our fine-tuned model as well as our tokenizer. "
"We have fine-tuned a sequence-to-sequence model using the Hugging Face library to translate English to French. Let's load our fine-tuned model as well as our tokenizer. "
]
},
{
@@ -163,7 +177,7 @@
],
"source": [
"model = AutoModelForSeq2SeqLM.from_pretrained(weights_path)\n",
"tokenizer = AutoTokenizer.from_pretrained(weights_path)"
"tokenizer = AutoTokenizer.from_pretrained(weights_path)\n"
]
},
{
@@ -179,7 +193,7 @@
"metadata": {},
"outputs": [],
"source": [
"text = \"translate from English to French: I want to go to the cinema.\""
"text = \"translate from English to French: I want to go to the cinema.\"\n"
]
},
{
@@ -208,7 +222,7 @@
],
"source": [
"inputs = tokenizer(text, return_tensors=\"pt\")\n",
"inputs"
"inputs\n"
]
},
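To make the structure of that output concrete, here is a hedged sketch: the real call returns a `BatchEncoding` holding `input_ids` and `attention_mask`, each shaped `(batch=1, seq_len)`. The toy whitespace tokenizer below is invented for illustration only; it is not the actual T5 SentencePiece tokenizer.

```python
# Hypothetical stand-in for the T5 tokenizer: a toy whitespace tokenizer that
# mimics only the *structure* of tokenizer(text, return_tensors="pt") -- a
# mapping with "input_ids" and "attention_mask", each of shape (1, seq_len).
def toy_tokenize(text, vocab):
    # ids start at 2, reserving 0 (pad) and 1 (</s>) as in T5's convention
    ids = [vocab.setdefault(tok, len(vocab) + 2) for tok in text.split()]
    ids.append(1)  # T5 appends the end-of-sequence token </s> (id 1)
    return {"input_ids": [ids], "attention_mask": [[1] * len(ids)]}

vocab = {}
inputs = toy_tokenize("translate from English to French: I want to go to the cinema.", vocab)
print(inputs["input_ids"][0])  # one id per whitespace token, plus the final </s>
```

The real tokenizer splits into subword pieces rather than whitespace tokens, so the actual `input_ids` will be longer and use T5's learned vocabulary.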
{
@@ -236,7 +250,7 @@
"\n",
"cross_attention = attention.cross_attentions\n",
"encoder_attention = attention.encoder_attentions\n",
"decoder_attention = attention.decoder_attentions"
"decoder_attention = attention.decoder_attentions\n"
]
},
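As a hedged sketch of what these three attributes contain (the shapes follow the usual Hugging Face convention, and the layer/head counts below are t5-small's; both are assumptions, not read from this notebook's output): each family is a tuple with one entry per layer, shaped `(batch, num_heads, query_len, key_len)`. Self-attention entries are square; cross-attention entries are `target_len × source_len`. Mocked with nested lists:

```python
# Mocked attention outputs (assumed shapes, not real model output): each family
# is a tuple over layers; every entry has shape (batch, num_heads, q_len, k_len).
# For t5-small we assume 6 layers and 8 heads.
num_layers, num_heads = 6, 8
src_len, tgt_len = 15, 10

def mock_layer(q_len, k_len):
    # one (batch=1, num_heads, q_len, k_len) "tensor" as nested lists,
    # with uniform weights so each query row sums to 1
    head = [[1.0 / k_len] * k_len for _ in range(q_len)]
    return [[[row[:] for row in head] for _ in range(num_heads)]]

encoder_attention = tuple(mock_layer(src_len, src_len) for _ in range(num_layers))
decoder_attention = tuple(mock_layer(tgt_len, tgt_len) for _ in range(num_layers))
cross_attention = tuple(mock_layer(tgt_len, src_len) for _ in range(num_layers))

# layers, heads, output positions, input positions
print(len(cross_attention), len(cross_attention[0][0]),
      len(cross_attention[0][0][0]), len(cross_attention[0][0][0][0]))
```

The key point for the plots below: cross-attention rows index output tokens and columns index input tokens, while the two self-attention families are square over a single sequence.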
{
@@ -278,7 +292,7 @@
],
"source": [
"decoded_input = tokenizer.convert_ids_to_tokens(inputs[\"input_ids\"][0])\n",
"decoded_input"
"decoded_input\n"
]
},
{
@@ -309,7 +323,7 @@
],
"source": [
"decoded_output = tokenizer.convert_ids_to_tokens(output[0])\n",
"decoded_output"
"decoded_output\n"
]
},
{
@@ -349,7 +363,7 @@
],
"source": [
"avg_encoder_attention = torch.stack(encoder_attention).mean(0).mean(1).squeeze(0)\n",
"showAttention(decoded_input, decoded_input, avg_encoder_attention)"
"showAttention(decoded_input, decoded_input, avg_encoder_attention)\n"
]
},
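In pure Python, the reduction performed by `torch.stack(encoder_attention).mean(0).mean(1).squeeze(0)` looks roughly like this (a sketch; the nested lists stand in for the real tensors, and the toy weights are invented):

```python
# Sketch of the reduction torch.stack(...).mean(0).mean(1).squeeze(0) performs:
# average the per-layer attention over layers and over heads, leaving a single
# (seq_len x seq_len) matrix. Input layout: list over layers of [batch=1][head][q][k].
def average_attention(per_layer):
    n_layers = len(per_layer)
    n_heads = len(per_layer[0][0])
    q_len = len(per_layer[0][0][0])
    k_len = len(per_layer[0][0][0][0])
    avg = [[0.0] * k_len for _ in range(q_len)]
    for layer in per_layer:
        for head in layer[0]:  # the batch dimension of size 1 is squeezed away
            for i in range(q_len):
                for j in range(k_len):
                    avg[i][j] += head[i][j] / (n_layers * n_heads)
    return avg

# Two layers, two heads, 2x2 attention; rows of the average still sum to 1.
toy = [
    [[[[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0], [0.5, 0.5]]]],
    [[[[0.5, 0.5], [1.0, 0.0]], [[0.5, 0.5], [0.0, 1.0]]]],
]
print(average_attention(toy))  # -> [[0.5, 0.5], [0.5, 0.5]]
```

Because every head's rows sum to 1, the averaged map keeps that property, which is why it can still be read as a (diffuse) attention distribution.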
{
@@ -358,7 +372,7 @@
"source": [
"- Since this is the last encoder layer, the self-attention mechanism highlights how each word (or subword) in the input sequence attends to every other word (or subword). This is important for encoding the context around each word.\n",
"- Lighter squares indicate higher average attention weights, suggesting that the model is considering those input positions more when encoding the word for translation.\n",
"- Darker squares indicate lower average attention weights, suggesting those input positions are less focused on by the model when encoding the word.\n",
"- Darker squares indicate lower average attention weights, suggesting that the model is less focused on those input positions when encoding the word.\n",
"- If you see a strong diagonal line of lighter squares, it means that words are mostly attending to themselves and their immediate neighbors, which is typical as words often have the most context with adjacent words.\n",
"- **Off-Diagonal Attention**: Lighter squares away from the diagonal could indicate that the model is capturing long-range dependencies within the input sequence, which is crucial for understanding the full context of the sentence.\n",
"- **Special Tokens**: Attention to special tokens like \"<s>\" and \"</s>\" can indicate how the model handles the beginning and end of the input sequence.\n",
@@ -400,7 +414,7 @@
],
"source": [
"avg_decoder_attention = torch.stack(decoder_attention).mean(0).mean(1).squeeze(0)\n",
"showAttention(decoded_output, decoded_output, avg_decoder_attention)"
"showAttention(decoded_output, decoded_output, avg_decoder_attention)\n"
]
},
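One pattern worth checking in the decoder self-attention map: generation is autoregressive, so position t can only attend to positions up to t, and the map should be (near-)zero above the diagonal. A minimal sketch of that causal mask:

```python
# Causal (lower-triangular) mask applied in decoder self-attention: token i may
# attend to token j only when j <= i, so the upper triangle of the attention
# map stays at zero.
def causal_mask(seq_len):
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]

for row in causal_mask(4):
    print(row)
```

If your decoder self-attention plot shows noticeable weight above the diagonal, something is off in how the attentions were stacked or indexed.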
{
@@ -440,7 +454,7 @@
"## **Cross-Attention**\n",
"\n",
"In cross-attention, the attention mechanism considers two different sequences. In our example, one sequence is the source sentence (the English sentence), and the other is the target sentence (the translated sentence).\n",
"A cross-attention map visualizes how elements of one sequence (English sentence) are attended to when processing each element of the other sequence (say, the target sentence). In other words how much each input token contributed to generating each output token.\n",
"A cross-attention map visualizes how elements of one sequence (English sentence) are attended to when processing each element of the other sequence (say, the target sentence). In other words, how much did each input token contribute to generating each output token?\n",
"\n",
"To plot the cross-attention, we average the attention weights over all layers and heads of the model and plot the result."
]
@@ -473,7 +487,7 @@
],
"source": [
"avg_attention = torch.stack(cross_attention).mean(0).mean(1).squeeze(0)\n",
"showAttention(decoded_input, decoded_output, avg_attention)"
"showAttention(decoded_input, decoded_output, avg_attention)\n"
]
},
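One simple way to read the resulting map: for each output token (row), take the input token (column) with the highest averaged weight as a rough word alignment. A sketch with an invented toy map (the tokens and weights below are illustrative, not the model's actual output):

```python
# Rough word alignment from an averaged cross-attention map: for each output
# token, pick the input token that received the most attention.
def align(avg_attention, decoded_input, decoded_output):
    pairs = []
    for out_tok, row in zip(decoded_output, avg_attention):
        j = max(range(len(row)), key=row.__getitem__)  # argmax over input positions
        pairs.append((out_tok, decoded_input[j]))
    return pairs

# Toy example (hypothetical tokens and weights, for illustration only):
decoded_input = ["▁I", "▁want", "▁the", "▁cinema", "</s>"]
decoded_output = ["▁Je", "▁veux", "▁le", "▁cinéma", "</s>"]
toy_map = [
    [0.7, 0.1, 0.1, 0.05, 0.05],
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.6, 0.1, 0.1],
    [0.05, 0.05, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6],
]
print(align(toy_map, decoded_input, decoded_output))
```

Attention is not a guaranteed alignment signal, so treat this argmax reading as a heuristic rather than ground truth.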
{