language fixes #28

Merged
merged 1 commit into from
Dec 4, 2023
58 changes: 36 additions & 22 deletions xai-model-for-1d-data/Tutorial_attention_map_for_text.ipynb
@@ -14,11 +14,18 @@
"# XAI in Deep Learning-Based Signal Analysis: Attention Maps for Text"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this Notebook, we will show how to produce attention maps for textual data. \n",
"This Notebook shows how to produce attention maps for textual data. \n",
"\n",
"--------"
]
@@ -31,8 +38,8 @@
"\n",
"### Setup Colab environment\n",
"\n",
"If you installed the packages and requirements on your machine, you can skip this section and start from the import section.\n",
"Otherwise, you can follow and execute the tutorial on your browser. To start working on the notebook, click on the following button. This will open this page in the Colab environment and you will be able to execute the code on your own.\n",
"If you installed the packages and requirements on your machine, you can skip this section and start from the import section.\n",
"Otherwise, you can follow and execute the tutorial on your browser. To start working on the notebook, click on the following button. This will open this page in the Colab environment, and you will be able to execute the code on your own.\n",
"\n",
"<a href=\"https://colab.research.google.com/github/HelmholtzAI-Consultants-Munich/Zero2Hero---Introduction-to-XAI/blob/Juelich-2023/xai-model-for-1d-data/Tutorial_attention_map_for_text.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
@@ -41,13 +48,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that you opened the notebook in Google Colab, follow the next step:\n",
"Now that you have opened the notebook in Google Colab, follow these steps:\n",
"\n",
"1. Run this cell to connect your Google Drive to Colab and install packages\n",
"2. Allow this notebook to access your Google Drive files. Click on 'Yes', and select your account.\n",
"3. \"Google Drive for desktop wants to access your Google Account\". Click on 'Allow'.\n",
" \n",
"At this point, a folder has been created in your Drive, and you can navigate it through the lefthand panel in Colab. You might also have received an email that informs you about the access on your Google Drive."
"A folder has been created in your Drive, and you can navigate it through the left-hand panel in Colab. You might also receive an email informing you about the access to your Google Drive."
]
},
{
@@ -60,7 +67,7 @@
"drive.mount('/content/drive')\n",
"%cd /content/drive/MyDrive\n",
"!git clone --branch Juelich-2023 https://github.com/HelmholtzAI-Consultants-Munich/XAI-Tutorials.git\n",
"%cd XAI-Tutorials/xai-model-for-1d-data"
"%cd XAI-Tutorials/xai-model-for-1d-data\n"
]
},
{
@@ -81,7 +88,7 @@
"import matplotlib.ticker as ticker\n",
"import torch\n",
"\n",
"from transformers import AutoTokenizer, AutoModelForSeq2SeqLM"
"from transformers import AutoTokenizer, AutoModelForSeq2SeqLM\n"
]
},
{
@@ -90,7 +97,7 @@
"metadata": {},
"outputs": [],
"source": [
"weights_path = \"../data_and_models/t5_small_weights\""
"weights_path = \"../data_and_models/t5_small_weights\"\n"
]
},
{
@@ -113,7 +120,7 @@
"source": [
"!mkdir ../data_and_models/t5_small_weights/\n",
"!curl -L \"https://www.dropbox.com/scl/fi/r3zc8w4551l9nyq08cnso/t5.zip?rlkey=vcwmz0cuzx80irainvsfs8gsm&dl=0\" > ../data_and_models/t5_small_weights/t5_small.zip\n",
"!unzip /p/project/training2324/benassou1/XAI-Tutorials/data_and_models/t5_small_weights/t5_small.zip -d ../data_and_models/t5_small_weights/"
"!unzip /p/project/training2324/benassou1/XAI-Tutorials/data_and_models/t5_small_weights/t5_small.zip -d ../data_and_models/t5_small_weights/\n"
]
},
{
@@ -137,14 +144,21 @@
" ax.xaxis.set_major_locator(ticker.MultipleLocator(1))\n",
" ax.yaxis.set_major_locator(ticker.MultipleLocator(1))\n",
"\n",
" plt.show()"
" plt.show()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have fine-tuned a sequence-to-sequence model using the huggingface library for translation from English to French. Let's load our fine-tuned model as well as our tokenizer. "
"We have fine-tuned a sequence-to-sequence model using the Hugging Face library to translate English to French. Let's load our fine-tuned model as well as our tokenizer. "
]
},
{
@@ -163,7 +177,7 @@
],
"source": [
"model = AutoModelForSeq2SeqLM.from_pretrained(weights_path)\n",
"tokenizer = AutoTokenizer.from_pretrained(weights_path)"
"tokenizer = AutoTokenizer.from_pretrained(weights_path)\n"
]
},
{
@@ -179,7 +193,7 @@
"metadata": {},
"outputs": [],
"source": [
"text = \"translate from English to French: I want to go to the cinema.\""
"text = \"translate from English to French: I want to go to the cinema.\"\n"
]
},
{
@@ -208,7 +222,7 @@
],
"source": [
"inputs = tokenizer(text, return_tensors=\"pt\")\n",
"inputs"
"inputs\n"
]
},
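To make the structure of that output concrete, here is a hedged sketch: the real call returns a `BatchEncoding` holding `input_ids` and `attention_mask`, each shaped `(batch=1, seq_len)`. The toy whitespace tokenizer below is invented for illustration only; it is not the actual T5 SentencePiece tokenizer.

```python
# Hypothetical stand-in for the T5 tokenizer: a toy whitespace tokenizer that
# mimics only the *structure* of tokenizer(text, return_tensors="pt") -- a
# mapping with "input_ids" and "attention_mask", each of shape (1, seq_len).
def toy_tokenize(text, vocab):
    # ids start at 2, reserving 0 (pad) and 1 (</s>) as in T5's convention
    ids = [vocab.setdefault(tok, len(vocab) + 2) for tok in text.split()]
    ids.append(1)  # T5 appends the end-of-sequence token </s> (id 1)
    return {"input_ids": [ids], "attention_mask": [[1] * len(ids)]}

vocab = {}
inputs = toy_tokenize("translate from English to French: I want to go to the cinema.", vocab)
print(inputs["input_ids"][0])  # one id per whitespace token, plus the final </s>
```

The real tokenizer splits into subword pieces rather than whitespace tokens, so the actual `input_ids` will be longer and use T5's learned vocabulary.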
{
@@ -236,7 +250,7 @@
"\n",
"cross_attention = attention.cross_attentions\n",
"encoder_attention = attention.encoder_attentions\n",
"decoder_attention = attention.decoder_attentions"
"decoder_attention = attention.decoder_attentions\n"
]
},
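As a hedged sketch of what these three attributes contain (the shapes follow the usual Hugging Face convention, and the layer/head counts below are t5-small's; both are assumptions, not read from this notebook's output): each family is a tuple with one entry per layer, shaped `(batch, num_heads, query_len, key_len)`. Self-attention entries are square; cross-attention entries are `target_len × source_len`. Mocked with nested lists:

```python
# Mocked attention outputs (assumed shapes, not real model output): each family
# is a tuple over layers; every entry has shape (batch, num_heads, q_len, k_len).
# For t5-small we assume 6 layers and 8 heads.
num_layers, num_heads = 6, 8
src_len, tgt_len = 15, 10

def mock_layer(q_len, k_len):
    # one (batch=1, num_heads, q_len, k_len) "tensor" as nested lists,
    # with uniform weights so each query row sums to 1
    head = [[1.0 / k_len] * k_len for _ in range(q_len)]
    return [[[row[:] for row in head] for _ in range(num_heads)]]

encoder_attention = tuple(mock_layer(src_len, src_len) for _ in range(num_layers))
decoder_attention = tuple(mock_layer(tgt_len, tgt_len) for _ in range(num_layers))
cross_attention = tuple(mock_layer(tgt_len, src_len) for _ in range(num_layers))

# layers, heads, output positions, input positions
print(len(cross_attention), len(cross_attention[0][0]),
      len(cross_attention[0][0][0]), len(cross_attention[0][0][0][0]))
```

The key point for the plots below: cross-attention rows index output tokens and columns index input tokens, while the two self-attention families are square over a single sequence.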
{
@@ -278,7 +292,7 @@
],
"source": [
"decoded_input = tokenizer.convert_ids_to_tokens(inputs[\"input_ids\"][0])\n",
"decoded_input"
"decoded_input\n"
]
},
{
@@ -309,7 +323,7 @@
],
"source": [
"decoded_output = tokenizer.convert_ids_to_tokens(output[0])\n",
"decoded_output"
"decoded_output\n"
]
},
{
@@ -349,7 +363,7 @@
],
"source": [
"avg_encoder_attention = torch.stack(encoder_attention).mean(0).mean(1).squeeze(0)\n",
"showAttention(decoded_input, decoded_input, avg_encoder_attention)"
"showAttention(decoded_input, decoded_input, avg_encoder_attention)\n"
]
},
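In pure Python, the reduction performed by `torch.stack(encoder_attention).mean(0).mean(1).squeeze(0)` looks roughly like this (a sketch; the nested lists stand in for the real tensors, and the toy weights are invented):

```python
# Sketch of the reduction torch.stack(...).mean(0).mean(1).squeeze(0) performs:
# average the per-layer attention over layers and over heads, leaving a single
# (seq_len x seq_len) matrix. Input layout: list over layers of [batch=1][head][q][k].
def average_attention(per_layer):
    n_layers = len(per_layer)
    n_heads = len(per_layer[0][0])
    q_len = len(per_layer[0][0][0])
    k_len = len(per_layer[0][0][0][0])
    avg = [[0.0] * k_len for _ in range(q_len)]
    for layer in per_layer:
        for head in layer[0]:  # the batch dimension of size 1 is squeezed away
            for i in range(q_len):
                for j in range(k_len):
                    avg[i][j] += head[i][j] / (n_layers * n_heads)
    return avg

# Two layers, two heads, 2x2 attention; rows of the average still sum to 1.
toy = [
    [[[[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0], [0.5, 0.5]]]],
    [[[[0.5, 0.5], [1.0, 0.0]], [[0.5, 0.5], [0.0, 1.0]]]],
]
print(average_attention(toy))  # -> [[0.5, 0.5], [0.5, 0.5]]
```

Because every head's rows sum to 1, the averaged map keeps that property, which is why it can still be read as a (diffuse) attention distribution.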
{
@@ -358,7 +372,7 @@
"source": [
"- Since this is the last encoder layer, the self-attention mechanism highlights how each word (or subword) in the input sequence attends to every other word (or subword). This is important for encoding the context around each word.\n",
"- Lighter squares indicate higher average attention weights, suggesting that the model is considering those input positions more when encoding the word for translation.\n",
"- Darker squares indicate lower average attention weights, suggesting those input positions are less focused on by the model when encoding the word.\n",
"- Darker squares indicate lower average attention weights, suggesting that the model is less focused on those input positions when encoding the word.\n",
"- If you see a strong diagonal line of lighter squares, it means that words are mostly attending to themselves and their immediate neighbors, which is typical as words often have the most context with adjacent words.\n",
"- **Off-Diagonal Attention**: Lighter squares away from the diagonal could indicate that the model is capturing long-range dependencies within the input sequence, which is crucial for understanding the full context of the sentence.\n",
"- **Special Tokens**: Attention to special tokens like \"<s>\" and \"</s>\" can indicate how the model handles the beginning and end of the input sequence.\n",
@@ -400,7 +414,7 @@
],
"source": [
"avg_decoder_attention = torch.stack(decoder_attention).mean(0).mean(1).squeeze(0)\n",
"showAttention(decoded_output, decoded_output, avg_decoder_attention)"
"showAttention(decoded_output, decoded_output, avg_decoder_attention)\n"
]
},
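One pattern worth checking in the decoder self-attention map: generation is autoregressive, so position t can only attend to positions up to t, and the map should be (near-)zero above the diagonal. A minimal sketch of that causal mask:

```python
# Causal (lower-triangular) mask applied in decoder self-attention: token i may
# attend to token j only when j <= i, so the upper triangle of the attention
# map stays at zero.
def causal_mask(seq_len):
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]

for row in causal_mask(4):
    print(row)
```

If your decoder self-attention plot shows noticeable weight above the diagonal, something is off in how the attentions were stacked or indexed.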
{
@@ -440,7 +454,7 @@
"## **Cross-Attention**\n",
"\n",
"In cross-attention, the attention mechanism considers two different sequences. In our example, one sequence is the source sentence (the English sentence), and the other is the target sentence (the translated sentence).\n",
"A cross-attention map visualizes how elements of one sequence (English sentence) are attended to when processing each element of the other sequence (say, the target sentence). In other words how much each input token contributed to generating each output token.\n",
"A cross-attention map visualizes how elements of one sequence (English sentence) are attended to when processing each element of the other sequence (say, the target sentence). In other words, how much did each input token contribute to generating each output token?\n",
"\n",
"To plot the cross-attention, we average the attention weights over all layers and heads of the model and plot the result."
]
@@ -473,7 +487,7 @@
],
"source": [
"avg_attention = torch.stack(cross_attention).mean(0).mean(1).squeeze(0)\n",
"showAttention(decoded_input, decoded_output, avg_attention)"
"showAttention(decoded_input, decoded_output, avg_attention)\n"
]
},
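One simple way to read the resulting map: for each output token (row), take the input token (column) with the highest averaged weight as a rough word alignment. A sketch with an invented toy map (the tokens and weights below are illustrative, not the model's actual output):

```python
# Rough word alignment from an averaged cross-attention map: for each output
# token, pick the input token that received the most attention.
def align(avg_attention, decoded_input, decoded_output):
    pairs = []
    for out_tok, row in zip(decoded_output, avg_attention):
        j = max(range(len(row)), key=row.__getitem__)  # argmax over input positions
        pairs.append((out_tok, decoded_input[j]))
    return pairs

# Toy example (hypothetical tokens and weights, for illustration only):
decoded_input = ["▁I", "▁want", "▁the", "▁cinema", "</s>"]
decoded_output = ["▁Je", "▁veux", "▁le", "▁cinéma", "</s>"]
toy_map = [
    [0.7, 0.1, 0.1, 0.05, 0.05],
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.6, 0.1, 0.1],
    [0.05, 0.05, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6],
]
print(align(toy_map, decoded_input, decoded_output))
```

Attention is not a guaranteed alignment signal, so treat this argmax reading as a heuristic rather than ground truth.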
{