# Imports

In [None]:
import torch
from torch import nn
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Computational Graphs

Let's demonstrate the difference between static and dynamic computational graphs using PyTorch (for dynamic graphs) and TensorFlow 1.x (for static graphs). We'll implement a simple example of a dynamic sentence encoder that adapts to the length of the input sentence.

Let's start with the dynamic graph approach using PyTorch:

In [None]:
# PyTorch implementation (Dynamic Graph)
class DynamicSentenceEncoder(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.rnn = nn.GRU(embedding_dim, hidden_dim, batch_first=True)
        
    def forward(self, sentence):
        embedded = self.embedding(sentence)
        _, hidden = self.rnn(embedded)
        return hidden.squeeze(0)

# Example usage of PyTorch model
vocab_size, embedding_dim, hidden_dim = 1000, 50, 100
pytorch_model = DynamicSentenceEncoder(vocab_size, embedding_dim, hidden_dim)

# Different length sentences
short_sentence = torch.LongTensor([[1, 2, 3, 4, 5]])
long_sentence = torch.LongTensor([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])

# Process sentences of different lengths
short_encoding = pytorch_model(short_sentence)
long_encoding = pytorch_model(long_sentence)

print("PyTorch (Dynamic) - Short sentence encoding shape:", short_encoding.shape)
print("PyTorch (Dynamic) - Long sentence encoding shape:", long_encoding.shape)

# TensorFlow 1.x implementation (Static Graph)
class StaticSentenceEncoder:
    def __init__(self, vocab_size, embedding_dim, hidden_dim, max_seq_length):
        self.inputs = tf.placeholder(tf.int32, shape=[None, max_seq_length])
        self.seq_lengths = tf.placeholder(tf.int32, shape=[None])
        
        embedding = tf.get_variable("embedding", [vocab_size, embedding_dim])
        embedded = tf.nn.embedding_lookup(embedding, self.inputs)
        
        cell = tf.nn.rnn_cell.GRUCell(hidden_dim)
        _, self.state = tf.nn.dynamic_rnn(cell, embedded, sequence_length=self.seq_lengths, dtype=tf.float32)

# Example usage of TensorFlow model
tf.reset_default_graph()
max_seq_length = 10
tf_model = StaticSentenceEncoder(vocab_size, embedding_dim, hidden_dim, max_seq_length)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Process sentences of different lengths
    short_sentence = [[1, 2, 3, 4, 5, 0, 0, 0, 0, 0]]  # Padded to max_seq_length
    long_sentence = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]
    
    short_encoding = sess.run(tf_model.state, 
                              feed_dict={tf_model.inputs: short_sentence, tf_model.seq_lengths: [5]})
    long_encoding = sess.run(tf_model.state, 
                             feed_dict={tf_model.inputs: long_sentence, tf_model.seq_lengths: [10]})
    
    print("TensorFlow (Static) - Short sentence encoding shape:", short_encoding.shape)
    print("TensorFlow (Static) - Long sentence encoding shape:", long_encoding.shape)

The key differences and advantages of each approach are:

1. PyTorch (Dynamic Graph):
   - The model definition is more straightforward and intuitive.
   - We can process sentences of different lengths one at a time without modifying the model or padding. Batching sentences of different lengths together still requires padding or packing.
   - The computational graph is built on-the-fly during the forward pass.
   - It's easier to debug as you can use standard Python debugging tools.
   - Changes to the model structure can be made easily without recompiling.

2. TensorFlow 1.x (Static Graph):
   - We need to define placeholders for inputs and sequence lengths.
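   - In this example, the maximum sequence length is fixed when the input placeholder is declared. (TensorFlow 1.x also accepts `None` for the time dimension, but the graph structure itself still cannot change between runs.)
   - Shorter sentences need to be padded to the maximum length.
   - The graph is defined first, then executed inside a session.
   - It can be more efficient for repeated executions, because the whole graph is known ahead of time and can be optimized.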
   - Debugging can be more challenging as the execution happens in a separate session.
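
To make the "defined first, then executed" versus "built on-the-fly" distinction concrete, here is a toy sketch in plain Python. It is not how either framework is implemented; `Placeholder`, `Add`, `Mul`, `run`, and `dynamic_forward` are made-up names used only for illustration.

```python
# Toy illustration only: hypothetical classes, not PyTorch or TensorFlow internals.

class Placeholder:
    """A named input slot whose value is supplied at run time."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        return feed[self.name]


class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)


class Mul:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) * self.right.evaluate(feed)


def run(node, feed):
    """Rough analogue of sess.run: execute a prebuilt graph with fed values."""
    return node.evaluate(feed)


# Static style: describe the computation once, before any data exists...
x = Placeholder("x")
y = Placeholder("y")
static_graph = Add(Mul(x, x), y)  # represents x * x + y

# ...then execute the same fixed graph repeatedly with different inputs.
static_results = [run(static_graph, {"x": a, "y": b}) for a, b in [(2, 1), (3, 4)]]


# Dynamic style: ordinary Python runs immediately, and the sequence of
# operations is recorded as a side effect of each call.
def dynamic_forward(values, tape):
    total = 0
    for v in values:  # the loop length depends on the input itself
        total = total + v * v
        tape.append(("square_add", v))
    return total


short_tape, long_tape = [], []
short_result = dynamic_forward([1, 2], short_tape)
long_result = dynamic_forward([1, 2, 3, 4], long_tape)

print(static_results)                  # [5, 13]
print(short_result, len(short_tape))   # 5 2
print(long_result, len(long_tape))     # 30 4
```

The static graph has the same shape for every input, while the recorded tape grows with the length of the input, which is the property the PyTorch encoder relies on.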

The key advantage of the dynamic graph (PyTorch) in this NLP context is its flexibility with variable-length inputs. In the static graph approach, we had to set a maximum sequence length and pad shorter sentences, which can be inefficient for datasets with widely varying sentence lengths.
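
As a rough way to quantify that inefficiency, the sketch below computes the share of input positions that hold padding for a given `max_seq_length`. The sentence lengths are hypothetical and `padding_overhead` is a helper written for this illustration, not part of either library.

```python
# Hypothetical sentence lengths (in tokens); not taken from any real dataset.
sentence_lengths = [5, 10, 3, 8, 4]


def padding_overhead(lengths, max_seq_length):
    """Return the fraction of positions in a padded batch that are padding."""
    if any(n > max_seq_length for n in lengths):
        raise ValueError("a sentence is longer than max_seq_length")
    total_slots = len(lengths) * max_seq_length
    real_tokens = sum(lengths)
    return (total_slots - real_tokens) / total_slots


overhead_tight = padding_overhead(sentence_lengths, max_seq_length=10)
overhead_loose = padding_overhead(sentence_lengths, max_seq_length=50)

print(f"max_seq_length=10: {overhead_tight:.0%} padding")  # 40% padding
print(f"max_seq_length=50: {overhead_loose:.0%} padding")  # 88% padding
```

The wider the gap between typical sentence length and the fixed maximum, the more of each input tensor is spent on padding.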

This example demonstrates why dynamic graphs are particularly useful in NLP tasks, where input structures can vary significantly between samples. They allow for more natural handling of variable-length sequences and enable easier experimentation with different model architectures during research and development.
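
When several sentences of different lengths are processed together in PyTorch, one common approach is to pad only up to the longest sentence in the batch and pack the result so the RNN skips the padding. The following is a minimal sketch under a few assumptions: token id 0 is reserved for padding, and the sentences are made-up token ids.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

torch.manual_seed(0)

# Same sizes as the example above; the sentences are made-up token ids.
vocab_size, embedding_dim, hidden_dim = 1000, 50, 100
embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
rnn = nn.GRU(embedding_dim, hidden_dim, batch_first=True)

sentences = [
    torch.tensor([1, 2, 3, 4, 5]),
    torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]),
    torch.tensor([7, 8, 9]),
]
lengths = torch.tensor([len(s) for s in sentences])

# Pad only up to the longest sentence in this batch (id 0 is assumed to be padding).
padded = pad_sequence(sentences, batch_first=True, padding_value=0)

# Packing tells the GRU where each sentence really ends, so the final hidden
# state is taken at the last real token rather than at a padding position.
packed = pack_padded_sequence(
    embedding(padded), lengths, batch_first=True, enforce_sorted=False
)
_, hidden = rnn(packed)
batch_encoding = hidden.squeeze(0)

print(padded.shape)          # torch.Size([3, 10])
print(batch_encoding.shape)  # torch.Size([3, 100])
```

The padded width here is decided per batch at run time, so a batch of short sentences never pays for a global maximum length.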

# Modifications

This section shows how changes to the model structure can be made in PyTorch (dynamic graph) compared to TensorFlow 1.x (static graph). We'll modify our sentence encoder to include an additional layer and change the RNN type.

Let's start with the PyTorch example:

In [None]:
# PyTorch implementation (Dynamic Graph)
class DynamicSentenceEncoder(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.rnn = nn.GRU(embedding_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, hidden_dim // 2)
        
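    def forward(self, sentence):
        embedded = self.embedding(sentence)
        _, hidden = self.rnn(embedded)
        # nn.LSTM returns a (h_n, c_n) tuple rather than a single tensor; keep
        # h_n so the RNN layer can be swapped without changing forward().
        if isinstance(hidden, tuple):
            hidden = hidden[0]
        output = self.fc(hidden.squeeze(0))
        return output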

# Example usage of PyTorch model
vocab_size, embedding_dim, hidden_dim = 1000, 50, 100
pytorch_model = DynamicSentenceEncoder(vocab_size, embedding_dim, hidden_dim)

# Process a sentence
sentence = torch.LongTensor([[1, 2, 3, 4, 5]])
encoding = pytorch_model(sentence)
print("PyTorch (Dynamic) - Original encoding shape:", encoding.shape)

# Modify the model structure
pytorch_model.rnn = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
pytorch_model.fc = nn.Sequential(
    nn.Linear(hidden_dim, hidden_dim // 2),
    nn.ReLU(),
    nn.Linear(hidden_dim // 2, hidden_dim // 4)
)

# Process the same sentence with the modified model
new_encoding = pytorch_model(sentence)
print("PyTorch (Dynamic) - Modified encoding shape:", new_encoding.shape)

# TensorFlow 1.x implementation (Static Graph)
class StaticSentenceEncoder:
    def __init__(self, vocab_size, embedding_dim, hidden_dim, max_seq_length):
        self.inputs = tf.placeholder(tf.int32, shape=[None, max_seq_length])
        self.seq_lengths = tf.placeholder(tf.int32, shape=[None])
        
        embedding = tf.get_variable("embedding", [vocab_size, embedding_dim])
        embedded = tf.nn.embedding_lookup(embedding, self.inputs)
        
        cell = tf.nn.rnn_cell.GRUCell(hidden_dim)
        _, state = tf.nn.dynamic_rnn(cell, embedded, sequence_length=self.seq_lengths, dtype=tf.float32)
        
        self.output = tf.layers.dense(state, hidden_dim // 2)

# Example usage of TensorFlow model
tf.reset_default_graph()
max_seq_length = 10
tf_model = StaticSentenceEncoder(vocab_size, embedding_dim, hidden_dim, max_seq_length)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Process a sentence
    sentence = [[1, 2, 3, 4, 5, 0, 0, 0, 0, 0]]  # Padded to max_seq_length
    encoding = sess.run(tf_model.output, 
                        feed_dict={tf_model.inputs: sentence, tf_model.seq_lengths: [5]})
    print("TensorFlow (Static) - Original encoding shape:", encoding.shape)

# To modify the TensorFlow model, we need to redefine the entire graph
tf.reset_default_graph()

class ModifiedStaticSentenceEncoder:
    def __init__(self, vocab_size, embedding_dim, hidden_dim, max_seq_length):
        self.inputs = tf.placeholder(tf.int32, shape=[None, max_seq_length])
        self.seq_lengths = tf.placeholder(tf.int32, shape=[None])
        
        embedding = tf.get_variable("embedding", [vocab_size, embedding_dim])
        embedded = tf.nn.embedding_lookup(embedding, self.inputs)
        
        cell = tf.nn.rnn_cell.LSTMCell(hidden_dim)
        _, state = tf.nn.dynamic_rnn(cell, embedded, sequence_length=self.seq_lengths, dtype=tf.float32)
        
        hidden = tf.layers.dense(state.h, hidden_dim // 2, activation=tf.nn.relu)
        self.output = tf.layers.dense(hidden, hidden_dim // 4)

# Use the modified TensorFlow model
tf_model = ModifiedStaticSentenceEncoder(vocab_size, embedding_dim, hidden_dim, max_seq_length)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Process the same sentence with the modified model
    new_encoding = sess.run(tf_model.output, 
                            feed_dict={tf_model.inputs: sentence, tf_model.seq_lengths: [5]})
    print("TensorFlow (Static) - Modified encoding shape:", new_encoding.shape)

Now, let's break down the key differences in modifying the model structure:

1. PyTorch (Dynamic Graph):
   - We can directly modify the existing model by changing its attributes:
     ```python
     pytorch_model.rnn = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
     pytorch_model.fc = nn.Sequential(
         nn.Linear(hidden_dim, hidden_dim // 2),
         nn.ReLU(),
         nn.Linear(hidden_dim // 2, hidden_dim // 4)
     )
     ```
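   - These changes take effect on the next call, provided `forward` can handle the new layer's outputs (`nn.LSTM` returns a `(h_n, c_n)` tuple where `nn.GRU` returns a single tensor). The replacement layers start with freshly initialized weights.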
   - No recompilation is needed; the new graph is built on the next forward pass.
   - We can even modify the model structure based on runtime conditions if needed.

2. TensorFlow 1.x (Static Graph):
   - To change the model structure, we need to redefine the entire graph:
     ```python
     tf.reset_default_graph()
     class ModifiedStaticSentenceEncoder:
         # ... (entire new class definition)
     ```
   - We create a new instance of the modified model.
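   - A new session must be created, and the variables initialized again, for the new graph.
   - Trained variable values from the previous graph are lost unless they are saved and restored explicitly, for example with `tf.train.Saver`.
   - The structure cannot be changed while a session is running; input-dependent behavior has to be built into the graph ahead of time with ops such as `tf.cond` and `tf.while_loop`.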

The PyTorch approach allows for more flexible and incremental changes to the model structure. This is particularly useful in research settings where you might want to experiment with different architectures quickly. You can even write code that modifies the model based on certain conditions or input characteristics.
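
As a sketch of what such condition-dependent structure might look like, the variant below sends longer sentences through an extra layer using a plain Python `if` inside `forward`. The class name `LengthAwareEncoder`, the `long_threshold` parameter, and the `used_extra` flag are hypothetical and exist only for this illustration.

```python
import torch
from torch import nn


class LengthAwareEncoder(nn.Module):
    """Hypothetical variant: sentences longer than `long_threshold` tokens
    pass through an extra projection layer."""

    def __init__(self, vocab_size, embedding_dim, hidden_dim, long_threshold=8):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.rnn = nn.GRU(embedding_dim, hidden_dim, batch_first=True)
        self.extra = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.Tanh())
        self.long_threshold = long_threshold
        self.used_extra = False  # records which branch the last call took

    def forward(self, sentence):
        _, hidden = self.rnn(self.embedding(sentence))
        encoding = hidden.squeeze(0)
        # Plain Python `if`: the branch is chosen on each call from the input shape.
        self.used_extra = sentence.size(1) > self.long_threshold
        if self.used_extra:
            encoding = self.extra(encoding)
        return encoding


model = LengthAwareEncoder(vocab_size=1000, embedding_dim=50, hidden_dim=100)

short_out = model(torch.LongTensor([[1, 2, 3, 4, 5]]))
short_used_extra = model.used_extra

long_out = model(torch.LongTensor([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]))
long_used_extra = model.used_extra

print(short_out.shape, short_used_extra)  # torch.Size([1, 100]) False
print(long_out.shape, long_used_extra)    # torch.Size([1, 100]) True
```

Because the graph is rebuilt on every forward pass, the two calls execute different sets of layers without any change to the model definition.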

In contrast, the TensorFlow 1.x static graph approach requires more upfront planning. Once the graph is defined, structural changes mean rebuilding it. This can be advantageous for production environments where the model structure is fixed, because a graph that is fully known in advance can be optimized as a whole.

It's worth noting that TensorFlow 2.x enables eager execution by default, which behaves much like PyTorch's dynamic graph approach and allows more flexible model modifications. Code can still be traced into a static graph with `tf.function` when that is preferred.