# 2014 - RNNs in Encoder-Decoder Architectures

[2014 RNNs in Encoder-Decoder architectures](https://en.wikipedia.org/wiki/Recurrent_neural_network)  
RNNs were a significant advancement, capable of computing document embeddings and adding word context. They grew to include LSTM (1997) for long-term dependencies and Bidirectional RNN (1997) for context understanding. Encoder-Decoder RNNs (2014) improved on this method.

```NOTE TOSELF: Create an LSTM notebook and add the architecture for RNNs also add Vanishing gradients notebook or add to LSTM notebook.```

[Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network](https://arxiv.org/abs/1808.03314)

In this notebook, we will explore the concept of Recurrent Neural Networks (RNNs) and how they are used in Encoder-Decoder architectures, which significantly improved the ability to model sequences and context in text data. We'll implement a basic Encoder-Decoder model using RNNs for sequence-to-sequence tasks.

### Step-by-Step Explanation:

1. **Understanding RNNs**: RNNs are a type of neural network designed to handle sequential data by maintaining a hidden state that captures information about previous inputs in the sequence.
2. **Encoder-Decoder Architecture**: This architecture consists of two RNNs: an Encoder that processes the input sequence and compresses it into a fixed-length context vector, and a Decoder that generates the output sequence based on this context vector.
3. **Training an Encoder-Decoder Model**: We'll train a simple Encoder-Decoder model using a sample dataset.
4. **Generating Sequences**: Using the trained model, we will generate sequences to demonstrate how the Encoder-Decoder architecture works.

### Implementation

#### 1. Install Required Libraries

Before running the code, ensure you have the necessary libraries installed. You can use:

```bash
!pip install tensorflow numpy
```

#### 2. Import Libraries and Define the Dataset


In [2]:
#pip install tensorflow numpy

In [3]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dense

# Define a simple dataset
input_texts = [
    "hello",
    "world",
    "machine",
    "learning",
    "encoder",
    "decoder"
]

output_texts = [
    "hola",
    "mundo",
    "máquina",
    "aprendizaje",
    "codificador",
    "decodificador"
]

# Create character sets
input_characters = sorted(set(''.join(input_texts)))
output_characters = sorted(set(''.join(output_texts)))
num_encoder_tokens = len(input_characters)
num_decoder_tokens = len(output_characters)

# Create a mapping of characters to integers
input_token_index = dict([(char, i) for i, char in enumerate(input_characters)])
output_token_index = dict([(char, i) for i, char in enumerate(output_characters)])

#### 3. Preprocess the Data

In [4]:
# Define maximum sequence lengths
max_encoder_seq_length = max([len(txt) for txt in input_texts])
max_decoder_seq_length = max([len(txt) for txt in output_texts])

# Vectorize the input and output texts
encoder_input_data = np.zeros((len(input_texts), max_encoder_seq_length, num_encoder_tokens), dtype='float32')
decoder_input_data = np.zeros((len(output_texts), max_decoder_seq_length, num_decoder_tokens), dtype='float32')
decoder_target_data = np.zeros((len(output_texts), max_decoder_seq_length, num_decoder_tokens), dtype='float32')

for i, (input_text, target_text) in enumerate(zip(input_texts, output_texts)):
    for t, char in enumerate(input_text):
        encoder_input_data[i, t, input_token_index[char]] = 1.0
    for t, char in enumerate(target_text):
        decoder_input_data[i, t, output_token_index[char]] = 1.0
        if t > 0:
            decoder_target_data[i, t - 1, output_token_index[char]] = 1.0

#### 4. Build the Encoder-Decoder Model

In [6]:
# Define the Encoder
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder_lstm = LSTM(256, return_state=True)
encoder_outputs, state_h, state_c = encoder_lstm(encoder_inputs)
encoder_states = [state_h, state_c]

# Define the Decoder
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(256, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

# Define the model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

# Compile the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

#### 5. Train the Model

In [7]:
# Train the model
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64,
          epochs=100,
          validation_split=0.2)

Epoch 1/100
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 3s/step - accuracy: 0.1731 - loss: 1.2776 - val_accuracy: 0.1538 - val_loss: 2.4385
Epoch 2/100
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 110ms/step - accuracy: 0.4038 - loss: 1.2682 - val_accuracy: 0.1538 - val_loss: 2.4369
Epoch 3/100
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 115ms/step - accuracy: 0.5962 - loss: 1.2603 - val_accuracy: 0.1538 - val_loss: 2.4353
Epoch 4/100
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 89ms/step - accuracy: 0.6538 - loss: 1.2528 - val_accuracy: 0.2308 - val_loss: 2.4334
Epoch 5/100
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 82ms/step - accuracy: 0.7308 - loss: 1.2452 - val_accuracy: 0.2308 - val_loss: 2.4309
Epoch 6/100
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 82ms/step - accuracy: 0.7500 - loss: 1.2367 - val_accuracy: 0.2308 - val_loss: 2.4276
Epoch 7/100
[1m1/1[0m [32m━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x1275be6ce60>

#### 6. Define Inference Models for Prediction

In [8]:
# Define the encoder model for inference
encoder_model = Model(encoder_inputs, encoder_states)

# Define the decoder model for inference
decoder_state_input_h = Input(shape=(256,))
decoder_state_input_c = Input(shape=(256,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)

#### 7. Generate Sequences


In [9]:
# some decoding errors that I need to fix for this one
def decode_sequence(input_seq):
    # Encode the input as state vectors
    states_value = encoder_model.predict(input_seq)

    # Generate empty target sequence of length 1
    target_seq = np.zeros((1, 1, num_decoder_tokens))

    # Sampling loop for a batch of sequences
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)

        # Sample a token
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = output_characters[sampled_token_index]
        decoded_sentence += sampled_char

        # Exit condition: either hit max length or find a stop character
        if len(decoded_sentence) > max_decoder_seq_length:
            stop_condition = True

        # Update the target sequence (length 1)
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.0

        # Update states
        states_value = [h, c]

    return decoded_sentence


# Test the model
for seq_index in range(len(input_texts)):
    input_seq = encoder_input_data[seq_index: seq_index + 1]
    decoded_sentence = decode_sequence(input_seq)
    print('-')
    print('Input:', input_texts[seq_index])
    print('Decoded:', decoded_sentence)

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 132ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 136ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 21ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 23ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 23ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 23ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 22ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 28ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 40ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 28ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 23ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 24ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 

### Explanation of Encoder-Decoder Architectures

#### Recurrent Neural Networks (RNNs)

RNNs are a type of neural network designed to handle sequential data by maintaining a hidden state that captures information about previous inputs in the sequence. However, vanilla RNNs struggle with long-term dependencies due to issues like vanishing gradients, which LSTM and GRU units aim to mitigate. (LOOK UP LSTM, GRU, Vanishing Gradient)

#### Encoder-Decoder Architectures

Encoder-Decoder architectures consist of two main components:
- **Encoder**: Processes the input sequence and compresses the information into a fixed-length context vector (hidden state).
- **Decoder**: Takes the context vector and generates the output sequence.

These architectures are widely used in various sequence-to-sequence tasks such as machine translation, text summarization, and conversational modeling.

### Mathematical Notation

Given a sequence of words $ x_1, x_2, \ldots, x_T $:
- **Encoder**: The encoder processes each word $ x_t $ and updates its hidden state $ h_t $ using:
  $$
  h_t = f(h_{t-1}, x_t)
  $$
  Where $ f $ is the RNN function (e.g., LSTM or GRU).

- **Decoder**: The decoder generates each word $ y_t $ in the output sequence using the context vector and the previous hidden state:
  $$
  s_t = g(s_{t-1}, y_{t-1}, c)
  $$
  Where $ g $ is the RNN function, $ s_t $ is the hidden state of the decoder, and $ c $ is the context vector from the encoder.





The overall goal of this notebook is to demonstrate the application of Recurrent Neural Networks (RNNs) within an Encoder-Decoder architecture for sequence-to-sequence tasks. Specifically, it showcases how to build and train a basic Encoder-Decoder model using RNNs to translate or map input sequences to output sequences.

### Summary of the Notebook:
1. **Introduction to RNNs and Encoder-Decoder Architecture**: The notebook introduces RNNs and how they are used in Encoder-Decoder architectures for tasks that involve transforming an input sequence into an output sequence, such as translation.

2. **Creating a Simple Dataset**: A small dataset of simple word pairs is defined, where each input word (e.g., "hello") is mapped to an output word in another language (e.g., "hola"). This serves as a minimal example to train the Encoder-Decoder model.

3. **Building the Model**: The notebook constructs an Encoder-Decoder model using Long Short-Term Memory (LSTM) units:
   - **Encoder**: Encodes the input sequence into a fixed-length context vector.
   - **Decoder**: Decodes this context vector to generate the output sequence.

4. **Training the Model**: The model is trained on the dataset to learn how to map each input sequence to the corresponding output sequence.

5. **Generating Sequences**: The trained model is used to generate or predict output sequences for given input sequences, demonstrating how the Encoder-Decoder architecture processes and produces sequences.

### Key Takeaways:
- **RNNs for Sequence Modeling**: Showcased how RNNs are effective for handling sequential data by maintaining a hidden state that captures information from previous inputs.
- **Encoder-Decoder Architecture**: Illustrated how this architecture works for sequence-to-sequence tasks by first encoding an input sequence into a context vector and then decoding this vector to generate an output sequence.
- **Practical Implementation**: Provided a hands-on example using TensorFlow and Keras to build, train, and use an Encoder-Decoder model for simple word-to-word translation tasks.

This notebook serves as an introductory demonstration of how RNNs in Encoder-Decoder architectures can be applied to tasks such as translation, text summarization, and other sequence transformations. This notebook demonstrates how to provide a basic understanding of how these models process sequences and generate output. It builds upon the concepts of word embeddings and sequential modeling introduced by earlier models like Word2Vec.

## Alternative Examples you could create with this type of Encoder-Decoder architecture with RNNs

Here are some alternative examples that can be used to showcase the Encoder-Decoder architecture with RNNs. Each of these examples is designed to highlight different aspects of sequence-to-sequence learning, such as translation, transformation, or summarization.

### 1. **Date Format Conversion**
   - **Task**: Convert dates from one format to another.
   - **Input**: "01-01-2024"
   - **Output**: "January 1, 2024"
   - **Purpose**: This task demonstrates how the Encoder-Decoder model can learn to map a structured input sequence to a more natural language output format.
   - **Benefits**: 
     - Shows the model’s ability to understand and generate different sequence patterns.
     - Useful for practical applications like data preprocessing and natural language understanding.

### 2. **Math Equation to Verbal Description**
   - **Task**: Convert simple arithmetic equations into their word forms.
   - **Input**: "3 + 5"
   - **Output**: "three plus five equals eight"
   - **Purpose**: Demonstrates the model's ability to interpret and generate sequences based on arithmetic logic.
   - **Benefits**:
     - Highlights how sequence-to-sequence models can be used for educational tools.
     - Illustrates the model’s capability to understand numerical context.

### 3. **Reversing Sentences**
   - **Task**: Reverse the words in a sentence.
   - **Input**: "The quick brown fox"
   - **Output**: "fox brown quick The"
   - **Purpose**: A simple task to demonstrate the model’s ability to handle sequence manipulation.
   - **Benefits**:
     - Provides an easy-to-understand example of sequence transformation.
     - Can be a starting point for understanding more complex tasks like summarization.

### 4. **Translation of Phrases**
   - **Task**: Translate short phrases from one language to another.
   - **Input**: "Good morning"
   - **Output**: "Buenos días"
   - **Purpose**: Shows the Encoder-Decoder model’s strength in handling language translation.
   - **Benefits**:
     - Directly applicable to real-world use cases like language translation services.
     - Illustrates the model's understanding of context and semantics.

### 5. **Text Summarization**
   - **Task**: Summarize longer sentences into shorter phrases.
   - **Input**: "The quick brown fox jumps over the lazy dog because it was feeling very energetic and playful."
   - **Output**: "Energetic fox jumps."
   - **Purpose**: Demonstrates how the Encoder-Decoder architecture can be used for text summarization by learning to capture the essence of a longer text.
   - **Benefits**:
     - Useful for creating more advanced applications in text processing.
     - Showcases the ability to condense information while preserving meaning.

### 6. **Sequence Number Mapping**
   - **Task**: Map a sequence of numbers to a verbal description.
   - **Input**: "123"
   - **Output**: "one hundred twenty-three"
   - **Purpose**: Demonstrates how the model can handle digit-to-word conversion, useful in various applications such as voice assistants.
   - **Benefits**:
     - Highlights the model's ability to understand and verbalize numerical data.
     - Useful for building interactive voice-based systems.

### Selecting an Alternative Example:
- **Complexity**: Choose an example that matches the complexity level you're comfortable with and want to demonstrate. Simple tasks like reversing sentences are easier to implement, while summarization or translation can be more complex.
- **Application**: Consider the real-world applicability of the task. For instance, translation and date format conversion are common use cases that resonate well with practical applications.
- **Demonstration**: If the goal is to show the model's capability in understanding context and generating coherent outputs, translation and summarization are good choices.

### Recommended Example for Implementation:
Given the previous translation example in the notebook, **Date Format Conversion** can be a suitable next step. It's simple yet effective in demonstrating how an Encoder-Decoder model can handle structured data and convert it into a natural language format, which can be particularly illustrative for those new to sequence-to-sequence modeling.

TODO: Show architecture

In [10]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.utils import to_categorical

#### Step 2: Define the Dataset
Let's define a small dataset of date conversions. For simplicity, we'll use a few examples:

In [11]:
# Define a simple dataset with start and end tokens
input_dates = [
    "01-01-2024",
    "02-14-2024",
    "12-25-2023",
    "07-04-2024",
    "11-11-2023",
    "10-31-2024"
]

output_dates = [
    "<start> January 1, 2024 <end>",
    "<start> February 14, 2024 <end>",
    "<start> December 25, 2023 <end>",
    "<start> July 4, 2024 <end>",
    "<start> November 11, 2023 <end>",
    "<start> October 31, 2024 <end>"
]

# Create character sets
input_characters = sorted(set(''.join(input_dates)))
output_characters = sorted(set(''.join(output_dates)))

# Make sure to include '<start>' and '<end>' in the character set
output_characters.extend(['<start>', '<end>'])
output_characters = sorted(set(output_characters))

num_encoder_tokens = len(input_characters)
num_decoder_tokens = len(output_characters)

# Create a mapping of characters to integers
input_token_index = dict([(char, i) for i, char in enumerate(input_characters)])
output_token_index = dict([(char, i) for i, char in enumerate(output_characters)])

reverse_output_token_index = dict((i, char) for char, i in output_token_index.items())


#### Step 3: Preprocess the Data
Convert the dates into a format suitable for training the model.

In [12]:
# Define maximum sequence lengths
max_encoder_seq_length = max([len(txt) for txt in input_dates])
max_decoder_seq_length = max([len(txt) for txt in output_dates])

# Vectorize the input and output dates
encoder_input_data = np.zeros((len(input_dates), max_encoder_seq_length, num_encoder_tokens), dtype='float32')
decoder_input_data = np.zeros((len(output_dates), max_decoder_seq_length, num_decoder_tokens), dtype='float32')
decoder_target_data = np.zeros((len(output_dates), max_decoder_seq_length, num_decoder_tokens), dtype='float32')

for i, (input_text, target_text) in enumerate(zip(input_dates, output_dates)):
    for t, char in enumerate(input_text):
        encoder_input_data[i, t, input_token_index[char]] = 1.0
    for t, char in enumerate(target_text):
        decoder_input_data[i, t, output_token_index[char]] = 1.0
        if t > 0:
            decoder_target_data[i, t - 1, output_token_index[char]] = 1.0


#### Step 4: Build the Encoder-Decoder Model

In [13]:
from tensorflow.keras.layers import Dropout

# Define the Encoder
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder_lstm = LSTM(256, return_state=True, recurrent_dropout=0.2)  # Add dropout
encoder_outputs, state_h, state_c = encoder_lstm(encoder_inputs)
encoder_states = [state_h, state_c]

# Define the Decoder
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(256, return_sequences=True, return_state=True, recurrent_dropout=0.2)  # Add dropout
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

# Define the model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

# Compile the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])


#### Step 5: Train the Model

In [14]:
# Reset stdout to default
#sys.stdout = sys.__stdout__

In [15]:
from tensorflow.keras.callbacks import EarlyStopping

# Define early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Train the model with early stopping
#%%capture training_output
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64,
          epochs=500,  # Set a higher maximum but rely on early stopping
          validation_split=0.2,
          verbose=2,  # One line per epoch
          callbacks=[early_stopping])




Epoch 1/500
1/1 - 4s - 4s/step - accuracy: 0.0000e+00 - loss: 3.1584 - val_accuracy: 0.2097 - val_loss: 3.2701
Epoch 2/500
1/1 - 0s - 111ms/step - accuracy: 0.2581 - loss: 3.1276 - val_accuracy: 0.2419 - val_loss: 3.2455
Epoch 3/500
1/1 - 0s - 114ms/step - accuracy: 0.2823 - loss: 3.0970 - val_accuracy: 0.2419 - val_loss: 3.2145
Epoch 4/500
1/1 - 0s - 96ms/step - accuracy: 0.2661 - loss: 3.0617 - val_accuracy: 0.2258 - val_loss: 3.1612
Epoch 5/500
1/1 - 0s - 93ms/step - accuracy: 0.2581 - loss: 3.0023 - val_accuracy: 0.2097 - val_loss: 3.0359
Epoch 6/500
1/1 - 0s - 94ms/step - accuracy: 0.2500 - loss: 2.8654 - val_accuracy: 0.1774 - val_loss: 2.9446
Epoch 7/500
1/1 - 0s - 92ms/step - accuracy: 0.2177 - loss: 2.7456 - val_accuracy: 0.1290 - val_loss: 2.9588
Epoch 8/500
1/1 - 0s - 94ms/step - accuracy: 0.1532 - loss: 2.7655 - val_accuracy: 0.2097 - val_loss: 2.8779
Epoch 9/500
1/1 - 0s - 90ms/step - accuracy: 0.2500 - loss: 2.6418 - val_accuracy: 0.2097 - val_loss: 2.8617
Epoch 10/500
1/

<keras.src.callbacks.history.History at 0x1275a6c5dc0>

In [17]:
# print out results of training
#print(training_output.stdout)

#### Step 6: Define Inference Models for Prediction

In [18]:
# Define the encoder model for inference
encoder_model = Model(encoder_inputs, encoder_states)

# Define the decoder model for inference
decoder_state_input_h = Input(shape=(256,))
decoder_state_input_c = Input(shape=(256,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)


#### Step 7: Generate Sequences

In [19]:
def decode_sequence(input_seq):
    # Encode the input as state vectors
    states_value = encoder_model.predict(input_seq)

    # Generate empty target sequence of length 1 with the start token
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, output_token_index['<start>']] = 1.0

    # Sampling loop for a batch of sequences
    stop_condition = False
    decoded_sentence = ''
    max_decoder_seq_length = 50  # Set a maximum length for the decoded sequence
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)

        # Sample a token
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = reverse_output_token_index[sampled_token_index]

        # Check for end token
        if sampled_char == '<end>':
            stop_condition = True
        else:
            decoded_sentence += sampled_char

        # Break the loop if it reaches max length
        if len(decoded_sentence) > max_decoder_seq_length:
            stop_condition = True

        # Update the target sequence (length 1)
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.0

        # Update states
        states_value = [h, c]

    return decoded_sentence.strip()  # Strip any leading/trailing whitespace

# Test the model
for seq_index in range(len(input_dates)):
    input_seq = encoder_input_data[seq_index: seq_index + 1]
    decoded_sentence = decode_sequence(input_seq)
    print('-')
    print('Input:', input_dates[seq_index])
    print('Decoded:', decoded_sentence)


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 262ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 190ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 24ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 24ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 26ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 25ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 

### Explanation:
- Data Preparation: The dates are encoded into one-hot vectors for both input and output.
- Model Training: We train an LSTM-based Encoder-Decoder model to learn the mapping between the date formats.
- Inference: We use the trained model to decode and predict the natural language format of the dates.

### Conclusion:
This notebook demonstrates how an Encoder-Decoder architecture can be used for the task of converting dates from a numerical format to a more natural language format. It highlights the model's ability to learn and generate sequences, showcasing the power of RNN-based Encoder-Decoder models for sequence-to-sequence tasks.