Attention is a mechanism used in deep learning to improve performance on tasks that involve sequential data, such as natural language processing, and it has also been applied in computer vision. It lets the model focus on the most relevant parts of the input sequence for each prediction, rather than compressing the entire sequence into a single fixed-size representation. This helps the model capture long-range dependencies and handle variable-length input.

The attention mechanism works by assigning a weight to each part of the input sequence, indicating how important that part is for the current prediction. These weights are computed dynamically for each prediction from the context and the current state of the model; the parameters that produce them are learned during training.
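
As a rough illustration of this weighting, here is a minimal NumPy sketch of dot-product attention for a single query. The function names (`softmax`, `dot_product_attention`) and the toy keys, values, and query are assumptions chosen for illustration; they are not part of the Keras model in this notebook.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def dot_product_attention(query, keys, values):
    """query: (d,), keys: (T, d), values: (T, d_v)."""
    scores = keys @ query        # (T,) similarity of the query to each position
    weights = softmax(scores)    # (T,) non-negative weights that sum to 1
    context = weights @ values   # (d_v,) weighted average of the values
    return context, weights

# Toy data (hypothetical): four input positions with one-hot keys.
keys = np.eye(4)
values = np.arange(8.0).reshape(4, 2)
query = np.array([0.0, 0.0, 5.0, 0.0])  # aligned with position 2

context, weights = dot_product_attention(query, keys, values)
print(weights)  # most of the weight falls on position 2
print(context)  # close to values[2], pulled slightly toward the others
```

The softmax makes the weights a probability distribution over input positions, so the context vector is a weighted average of the values. In a trained model the query, keys, and values come from learned layers rather than being set by hand.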

In [None]:
import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Attention
from tensorflow.keras.models import Model

# Define the input sequence length and vocabulary size
sequence_length = 10
vocab_size = 1000

# Define the encoder
encoder_inputs = Input(shape=(sequence_length,))
encoder_embedding = Embedding(input_dim=vocab_size, output_dim=64)(encoder_inputs)
encoder_lstm = LSTM(128, return_sequences=True)(encoder_embedding)

# Define the decoder
decoder_inputs = Input(shape=(sequence_length,))
decoder_embedding = Embedding(input_dim=vocab_size, output_dim=64)(decoder_inputs)
decoder_lstm = LSTM(128, return_sequences=True)(decoder_embedding)

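# Apply attention layer: Keras Attention expects [query, value], so the decoder
# states (query) attend over the encoder states (value)
attention = Attention()([decoder_lstm, encoder_lstm])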

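# Concatenate attention output with decoder LSTM output
# (a Keras layer is used instead of a raw tf op so the graph stays a valid Keras model)
decoder_combined_context = tf.keras.layers.Concatenate(axis=-1)([decoder_lstm, attention])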

# Apply a dense layer to produce the output sequence
decoder_outputs = Dense(vocab_size, activation='softmax')(decoder_combined_context)

# Create the model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

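# Compile the model
# (categorical_crossentropy expects one-hot targets; use
# sparse_categorical_crossentropy if the targets are integer token ids)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])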

# Print the model summary
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
 input_6 (InputLayer)        [(None, 10)]                 0         []                            
                                                                                                  
 input_5 (InputLayer)        [(None, 10)]                 0         []                            
                                                                                                  
 embedding_5 (Embedding)     (None, 10, 64)               64000     ['input_6[0][0]']             
                                                                                                  
 embedding_4 (Embedding)     (None, 10, 64)               64000     ['input_5[0][0]']             
                                                                                              