# Predicting Sentiment Using a Transformer

<div style="background-color: #f0f8ff; border: 2px solid #4682b4; padding: 10px;">
<a href="https://colab.research.google.com/github/DeepTrackAI/DeepLearningCrashCourse/blob/main/Ch08_Attention/ec08_B_transformer/transformer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
<strong>If using Colab/Kaggle:</strong> You need to uncomment the code in the cell below this one.
</div>

In [1]:
# Uncomment if using Colab/Kaggle.
# !pip install contractions datasets deeplay deeptrack spacy

This notebook provides you with a complete code example that predicts the sentiment of movie reviews using a transformer encoder network.

<div style="background-color: #f0f8ff; border: 2px solid #4682b4; padding: 10px;">
<strong>Note:</strong> This notebook contains Code Example 8-B from the book  

**Deep Learning Crash Course**  
Benjamin Midtvedt, Jesús Pineda, Henrik Klein Moberg, Harshith Bachimanchi, Joana B. Pereira, Carlo Manzo, Giovanni Volpe  
No Starch Press, San Francisco (CA), 2025  
ISBN-13: 9781718503922  

[https://nostarch.com/deep-learning-crash-course](https://nostarch.com/deep-learning-crash-course)

You can find the other notebooks on the [Deep Learning Crash Course GitHub page](https://github.com/DeepTrackAI/DeepLearningCrashCourse).
</div>

## Using the IMDB Dataset

Start by downloading the Large Movie Review Dataset (often referred to as the IMDB dataset, and available at https://huggingface.co/datasets/imdb). It contains 50,000 movie reviews, each labeled as positive or negative, divided into 25,000 reviews for training and 25,000 reviews for testing.

Download the IMDB dataset ...

In [2]:
from datasets import load_dataset

dataset = load_dataset("imdb")

... split the training data into training and validation datasets ...

In [3]:
split = dataset["train"].train_test_split(test_size=0.2,
                                          stratify_by_column="label", seed=42)
train_dataset, val_dataset = split["train"], split["test"]

... and print some example reviews.

In [4]:
import numpy as np
import pandas as pd

samples = train_dataset.select(np.random.randint(0, len(train_dataset), 3))
texts, labels = samples["text"], samples["label"]

df = pd.DataFrame({"Text": texts, "Label": labels})
styled_df = df.style.set_properties(**{"text-align": "left"}).set_table_styles(
    [{"selector": "th", "props": [("text-align", "center")]}]
)
with pd.option_context("display.max_colwidth", None):
    display(styled_df)

Unnamed: 0,Text,Label
0,"This is my favorite classic. It was filmed a little west of Philadelphia, PA when I was 13, in 1957, and released the next year. Then in 1970, I found myself working the very same county as a rookie PA state trooper. I have always enjoyed checking out the different places where scenes were filmed. I knew the owner of the Downingtown Diner well, and he had a road sign out front which told all passing motorists that this was the ""home of the blob"". The theater scene was in Phoenixville, near Valley Forge Park and it is still showing films today!",1
1,"When you typically watch a short film your always afraid that the person creating the film tries to throw too much into it. That's not the case with this one. A great story about a young girl who's had enough and other worldly forces trying to help make things right. Eric Etebari does a wonderful job of representing the spirit of twisted justice and helps to convey the complexities of the blurred line of right and wrong. Both the young girl and the father give great performances in this wonderful short film, but Eric's performance is definitely the show stealer in this story. I definitely recommend this film for it's complexity, performance, and great over all story.",1
2,"I'm not really sure how to even begin to describe how bad this movie is. I like bad films, as they are often the most entertaining. I love bad special effects, bad acting, bad music, and inept direction. With the exception of the music (which was better than I had expected), this movie had all of those qualities. The special effects were amazingly bad. The worst I've seen since my Nintendo 64. Some scenes to watch for include the Thunderchild, the woman being crushed by the mechanical foot, the Big Ben scene, the train wreck... Wow, there are so many bad effects! On the plus side, though, SOME scenes of the alien walkers are well done. The acting was about as bad as it could possibly have been, having been based directly on H.G. Wells' book. For having such good source material, it's almost as though the actors were trying to be so over-the-top as to make it funny. And then there's the mustache... the single most distracting piece of facial hair I've seen in a long time. Of course, only half the movie contains acting. The rest is characters walking around aimlessly and poorly rendered effects shots. To say that Timothy Hines is an inept director would be an injustice to inept directors. With the use of different colored filters between shots for no particular reason, the use of poorly rendered backgrounds for even inside scenes, the bad green screening, it's amazing to me how this man ever got approval to direct a movie. I wouldn't imagine it would be possible to turn a brilliant book into this bad a movie. Bravo, Mr. Hines. Bravo. My advice to anyone who plans to see this movie is to do what I did: have some friends who enjoy bad movies over, drink, play poker while watching it, keep drinking, and maybe you'll make it all the way through. It does make for an excellent bad movie, so have fun and laugh yourself silly with this disaster.",0
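To see what stratifying by label means in the split above, here is a minimal plain-Python sketch. It doesn't use the `datasets` library: the helper `stratified_split` and the toy labels are hypothetical and only illustrate that both parts keep the same proportion of positive and negative reviews.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size=0.2, seed=42):
    """Split indices so that each label keeps the same proportion (sketch)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_indices, val_indices = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        num_val = round(len(indices) * test_size)
        val_indices += indices[:num_val]
        train_indices += indices[num_val:]
    return sorted(train_indices), sorted(val_indices)

toy_labels = [0] * 10 + [1] * 10  # Hypothetical toy labels.
train_idx, val_idx = stratified_split(toy_labels)
print(len(train_idx), len(val_idx))         # 16 4
print(sum(toy_labels[i] for i in val_idx))  # 2
```

Applied to the 25,000 training reviews with `test_size=0.2`, the same proportions give 20,000 reviews for training and 5,000 for validation.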


### Preprocessing the Reviews

Implement a function to tokenize a sentence ...

In [5]:
import contractions, re, spacy, unicodedata

tokenizers = {"eng": spacy.blank("en"), "spa": spacy.blank("es")}

regular_expression = r"^[a-zA-Z0-9áéíóúüñÁÉÍÓÚÜÑ.,!?¡¿/:()]+$"
pattern = re.compile(unicodedata.normalize("NFC", regular_expression))

def tokenize(text, lang="eng"):
    """Tokenize text."""
    # Replace "´´" before "´" so the pair maps to a double quote.
    swaps = {"’": "'", "‘": "'", "“": '"', "”": '"', "´´": '"', "´": "'"}
    for old, new in swaps.items():
        text = text.replace(old, new)
    text = contractions.fix(text) if lang == "eng" else text
    tokens = tokenizers[lang](text)
    return [token.text for token in tokens if pattern.match(token.text)]
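The last line of `tokenize()` keeps only the tokens that match the regular expression. The following sketch applies the same pattern to a few hypothetical candidate tokens (they aren't produced by spaCy here) to show which kinds of tokens survive the filter.

```python
import re
import unicodedata

regular_expression = r"^[a-zA-Z0-9áéíóúüñÁÉÍÓÚÜÑ.,!?¡¿/:()]+$"
pattern = re.compile(unicodedata.normalize("NFC", regular_expression))

# Hypothetical candidate tokens, as a tokenizer might produce them.
candidates = ["movie", "10/10", "!", "over-the-top", "<br", "señor", "'s", " "]
kept = [token for token in candidates
        if pattern.match(unicodedata.normalize("NFC", token))]
print(kept)  # ['movie', '10/10', '!', 'señor']
```

Tokens containing characters outside the allowed set, such as hyphens, apostrophes, angle brackets, or whitespace, are dropped entirely rather than cleaned.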

### Building a Vocabulary

Implement a class to represent a vocabulary ...

In [6]:
class Vocab:
    """Vocabulary as callable dictionary."""

    def __init__(self, vocab_dict, unk_token="<unk>"):
        """Initialize vocabulary."""
        self.vocab_dict, self.unk_token = vocab_dict, unk_token
        self.default_index = vocab_dict.get(unk_token, -1)
        self.index_to_token = {idx: token for token, idx in vocab_dict.items()}

    def __call__(self, token_or_tokens):
        """Return the index(es) for given token or list of tokens."""
        if not isinstance(token_or_tokens, list):
            return self.vocab_dict.get(token_or_tokens, self.default_index)
        else:
            return [self.vocab_dict.get(token, self.default_index)
                    for token in token_or_tokens]

    def set_default_index(self, index):
        """Set default index for unknown tokens."""
        self.default_index = index

    def lookup_token(self, index_or_indices):
        """Retrieve token corresponding to given index or list of indices."""
        if not isinstance(index_or_indices, list):
            return self.index_to_token.get(int(index_or_indices),
                                           self.unk_token)
        else:
            return [self.index_to_token.get(int(index), self.unk_token)
                    for index in index_or_indices]

    def get_tokens(self):
        """Return a list of tokens ordered by their index."""
        tokens = [None] * len(self.index_to_token)
        for index, token in self.index_to_token.items():
            tokens[index] = token
        return tokens

    def __iter__(self):
        """Iterate over the tokens in the vocabulary."""
        return iter(self.vocab_dict)

    def __len__(self):
        """Return the number of tokens in the vocabulary."""
        return len(self.vocab_dict)

    def __contains__(self, token):
        """Check if a token is in the vocabulary."""
        return token in self.vocab_dict

... implement a function to build vocabulary from an iterator ...

In [7]:
from collections import Counter

def build_vocab_from_iterator(iterator, specials=None, min_freq=1):
    """Build vocabulary from an iterator over tokenized sentences."""
    token_freq = Counter(token for tokens in iterator for token in tokens)
    vocab, index = {}, 0
    if specials:
        for token in specials:
            vocab[token] = index
            index += 1
    for token, freq in token_freq.items():
        if freq >= min_freq:
            vocab[token] = index
            index += 1
    return vocab

... create a vocabulary ...

In [8]:
def imdb_iterator(dataset):
    """Iterate over the IMDB dataset."""
    for sample in dataset:
        yield tokenize(sample["text"])

vocab_dict = build_vocab_from_iterator(imdb_iterator(train_dataset),
                                       specials=["<unk>"], min_freq=10)
vocab = Vocab(vocab_dict, unk_token="<unk>")
vocab.set_default_index(vocab(vocab.unk_token))

... and preprocess the training, validation, and testing datasets.

In [9]:
def preprocessing(sample):
    """Preprocess a movie review."""
    sentence = sample["text"]
    tokens = tokenize(unicodedata.normalize("NFC", sentence))
    sequence_of_indices = vocab(tokens)
    sample.update({"sequences": sequence_of_indices})
    return sample

train_dataset = train_dataset.map(preprocessing)
val_dataset = val_dataset.map(preprocessing)
test_dataset = dataset["test"].map(preprocessing)
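To check how `specials` and `min_freq` shape the vocabulary, and how unknown tokens fall back to the `<unk>` index, you can try the idea on a toy corpus. This is a self-contained sketch: `build_toy_vocab` is a hypothetical condensed variant of the function above, and the sentences are made up.

```python
from collections import Counter

def build_toy_vocab(tokenized_sentences, specials=None, min_freq=1):
    """Map tokens to indices: specials first, then frequent-enough tokens."""
    token_freq = Counter(t for tokens in tokenized_sentences for t in tokens)
    vocab = {token: index for index, token in enumerate(specials or [])}
    for token, freq in token_freq.items():
        if freq >= min_freq:
            vocab[token] = len(vocab)
    return vocab

# Hypothetical tokenized reviews.
sentences = [["a", "great", "movie"], ["a", "bad", "movie"], ["a", "movie"]]
toy_vocab = build_toy_vocab(sentences, specials=["<unk>"], min_freq=2)
print(toy_vocab)  # {'<unk>': 0, 'a': 1, 'movie': 2}

# Tokens below the frequency threshold map to the index of "<unk>".
unk_index = toy_vocab["<unk>"]
indices = [toy_vocab.get(t, unk_index) for t in ["a", "great", "movie"]]
print(indices)  # [1, 0, 2]
```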

## Defining the Data Loaders

In [10]:
import torch
from torch.utils.data import DataLoader
from torch_geometric.data import Data

def collate(batch_of_sequences):
    """Prepare a batch of sequences for the model to process."""
    sequences, labels, batch_indices = [], [], []
    for batch_index, sample in enumerate(batch_of_sequences):
        sequence = torch.tensor(sample["sequences"])
        sequences.append(sequence)
        batch_indices.append(torch.ones_like(sequence, dtype=torch.long)
                             * batch_index)
        label = torch.tensor(sample["label"])
        labels.append(label)
    return Data(sequences=torch.cat(sequences),
                batch_indices=torch.cat(batch_indices),
                y=torch.tensor(labels).float())

train_dataloader = \
    DataLoader(train_dataset, batch_size=8, shuffle=True, collate_fn=collate)
val_dataloader = \
    DataLoader(val_dataset, batch_size=8, shuffle=False, collate_fn=collate)
test_dataloader = \
    DataLoader(test_dataset, batch_size=8, shuffle=False, collate_fn=collate)
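The `collate()` function doesn't pad the reviews to a common length. Instead, it concatenates them into one long sequence and records, for each token, which review it belongs to. The following plain-Python sketch (with the hypothetical helper `flatten_batch` and made-up token indices) shows the resulting layout without tensors.

```python
def flatten_batch(sequences):
    """Concatenate sequences and record which review each token came from."""
    flat_tokens, batch_indices = [], []
    for batch_index, sequence in enumerate(sequences):
        flat_tokens.extend(sequence)
        batch_indices.extend([batch_index] * len(sequence))
    return flat_tokens, batch_indices

# Hypothetical token indices for two reviews of different lengths.
flat_tokens, batch_indices = flatten_batch([[5, 2, 9], [7, 1]])
print(flat_tokens)    # [5, 2, 9, 7, 1]
print(batch_indices)  # [0, 0, 0, 1, 1]
```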

## Building a Transformer Encoder Layer

Prepare a class to implement a multi-head attention layer ...

In [11]:
import deeplay as dl

class MultiHeadAttentionLayer(dl.DeeplayModule):
    """Multi-head attention layer with masking."""

    def __init__(self, num_features, num_heads):
        """Initialize multi-head attention."""
        super().__init__()
        self.num_features, self.num_heads = num_features, num_heads
        self.head_dim = num_features // num_heads  # Must be integer

        self.Wq = dl.Layer(torch.nn.Linear, num_features, num_features)
        self.Wk = dl.Layer(torch.nn.Linear, num_features, num_features)
        self.Wv = dl.Layer(torch.nn.Linear, num_features, num_features)
        self.Wout = dl.Layer(torch.nn.Linear, num_features, num_features)

    def forward(self, in_sequence, batch_indices):
        """Apply the multi-head attention mechanism to the input sequence."""
        seq_len, embed_dim = in_sequence.shape
        Q = self.Wq(in_sequence)
        Q = Q.view(seq_len, self.num_heads, self.head_dim).permute(1, 0, 2)
        K = self.Wk(in_sequence)
        K = K.view(seq_len, self.num_heads, self.head_dim).permute(1, 0, 2)
        V = self.Wv(in_sequence)
        V = V.view(seq_len, self.num_heads, self.head_dim).permute(1, 0, 2)

        attn_scores = (torch.matmul(Q, K.transpose(-2, -1))
                       / (self.head_dim ** 0.5))

        attn_mask = torch.eq(batch_indices.unsqueeze(1),
                             batch_indices.unsqueeze(0))
        attn_mask = attn_mask.unsqueeze(0)
        attn_scores = attn_scores.masked_fill(~attn_mask, float("-inf"))

        attn_weights = torch.nn.functional.softmax(attn_scores, dim=-1)
        attn_output = torch.matmul(attn_weights, V)
        attn_output = attn_output.permute(1, 0, 2).contiguous()
        attn_output = attn_output.view(seq_len, self.num_features)
        return self.Wout(attn_output)
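Because all reviews in a batch share one flat sequence, the mask built from `batch_indices` is what prevents a token from attending to tokens of other reviews. Here is a minimal plain-Python sketch of that masking for a single row of scores. The helper `masked_softmax_row` and the numbers are hypothetical, and the sketch ignores heads and tensors.

```python
import math

def masked_softmax_row(scores, allowed):
    """Softmax over one row of scores, ignoring positions that are masked."""
    masked = [s if ok else float("-inf") for s, ok in zip(scores, allowed)]
    peak = max(masked)  # Finite, because a token can always attend to itself.
    exps = [math.exp(s - peak) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

batch_indices = [0, 0, 0, 1, 1]  # Hypothetical: two reviews, 3 and 2 tokens.
attn_mask = [[i == j for j in batch_indices] for i in batch_indices]

scores = [1.0, 2.0, 3.0, 4.0, 5.0]  # Hypothetical scores for the first token.
weights = masked_softmax_row(scores, attn_mask[0])
print([round(w, 3) for w in weights])  # [0.09, 0.245, 0.665, 0.0, 0.0]
```

The first token distributes all of its attention over the three tokens of its own review, even though the tokens of the second review have higher raw scores.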

... and a class to implement a transformer encoder layer ...

In [12]:
from torch_geometric.nn.norm import LayerNorm

class TransformerEncoderLayer(dl.DeeplayModule):
    """Transformer encoder layer."""

    def __init__(self, num_features, num_heads, feedforward_dim, dropout=0.0):
        """Initialize transformer encoder layer."""
        super().__init__()

        self.self_attn = MultiHeadAttentionLayer(num_features, num_heads)
        self.attn_dropout = dl.Layer(torch.nn.Dropout, dropout)
        self.attn_skip = dl.Add()
        self.attn_norm = dl.Layer(LayerNorm, num_features, eps=1e-6)

        self.feedforward = dl.Sequential(
            dl.Layer(torch.nn.Linear, num_features, feedforward_dim),
            dl.Layer(torch.nn.ReLU),
            dl.Layer(torch.nn.Linear, feedforward_dim, num_features),
        )
        self.feedforward_dropout = dl.Layer(torch.nn.Dropout, dropout)
        self.feedforward_skip = dl.Add()
        self.feedforward_norm = dl.Layer(LayerNorm, num_features, eps=1e-6)

    def forward(self, in_sequence, batch_indices):
        """Refine sequence via attention and feedforward layers."""
        attns = self.self_attn(in_sequence, batch_indices)
        attns = self.attn_dropout(attns)
        attns = self.attn_skip(in_sequence, attns)
        attns = self.attn_norm(attns, batch_indices)

        out_sequence = self.feedforward(attns)
        out_sequence = self.feedforward_dropout(out_sequence)
        out_sequence = self.feedforward_skip(attns, out_sequence)
        out_sequence = self.feedforward_norm(out_sequence, batch_indices)

        return out_sequence
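The normalization layers receive `batch_indices` so that each review is normalized on its own. The sketch below assumes PyTorch Geometric's graph mode, in which the mean and variance are computed over all tokens and features of each review. It omits the learnable scale and shift, and the helper `graph_layer_norm` and the feature values are hypothetical.

```python
import math

def graph_layer_norm(features, batch_indices, eps=1e-6):
    """Normalize each review over all its tokens and features (sketch)."""
    normalized = [None] * len(features)
    for batch_index in set(batch_indices):
        rows = [i for i, b in enumerate(batch_indices) if b == batch_index]
        values = [v for i in rows for v in features[i]]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        for i in rows:
            normalized[i] = [(v - mean) / math.sqrt(var + eps)
                             for v in features[i]]
    return normalized

# Hypothetical 2-feature embeddings for two reviews (2 tokens and 1 token).
features = [[1.0, 3.0], [5.0, 7.0], [10.0, 20.0]]
out = graph_layer_norm(features, batch_indices=[0, 0, 1])
print([round(v, 3) for v in out[2]])  # [-1.0, 1.0]
```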

## Building a Transformer Encoder Model

Build a class to implement a transformer encoder model ...

In [13]:
class TransformerEncoderModel(dl.DeeplayModule):
    """Transformer encoder model."""

    def __init__(self, vocab_size, num_features, num_heads, feedforward_dim,
                 num_layers, out_dim, dropout=0.0):
        """Initialize transformer encoder model."""
        super().__init__()
        self.num_features = num_features

        self.embedding = dl.Layer(torch.nn.Embedding, vocab_size, num_features)

        self.pos_encoder = dl.IndexedPositionalEmbedding(num_features)
        self.pos_encoder.dropout.configure(p=dropout)

        self.transformer_block = dl.LayerList()
        for _ in range(num_layers):
            self.transformer_block.append(TransformerEncoderLayer(
                    num_features, num_heads, feedforward_dim, dropout=dropout,
            ))

        self.out_block = dl.Sequential(
            dl.Layer(torch.nn.Dropout, dropout),
            dl.Layer(torch.nn.Linear, num_features, num_features // 2),
            dl.Layer(torch.nn.ReLU),
            dl.Layer(torch.nn.Linear, num_features // 2, out_dim),
            dl.Layer(torch.nn.Sigmoid),
        )

    def forward(self, batch):
        """Predict sentiment of movie reviews."""
        in_sequence, batch_indices = batch["sequences"], batch["batch_indices"]

        embeddings = self.embedding(in_sequence) * self.num_features ** 0.5
        pos_embeddings = self.pos_encoder(embeddings, batch_indices)

        out_sequence = pos_embeddings
        for transformer_layer in self.transformer_block:
            out_sequence = transformer_layer(out_sequence, batch_indices)

        batch_size = torch.max(batch_indices) + 1
        aggregates = torch.zeros(batch_size, self.num_features,
                                 device=out_sequence.device)
        for batch_index in torch.unique(batch_indices):
            mask = batch_indices == batch_index
            aggregates[batch_index] = out_sequence[mask].mean(dim=0)

        pred_sentiment = self.out_block(aggregates).squeeze()
        return pred_sentiment
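Before the output block, the model reduces each review to a single vector by averaging the features of its tokens. The following plain-Python sketch reproduces this mean pooling on made-up numbers. The helper `mean_pool` is hypothetical and assumes that every review has at least one token.

```python
def mean_pool(token_features, batch_indices):
    """Average token features per review."""
    num_reviews = max(batch_indices) + 1
    num_features = len(token_features[0])
    sums = [[0.0] * num_features for _ in range(num_reviews)]
    counts = [0] * num_reviews
    for features, batch_index in zip(token_features, batch_indices):
        counts[batch_index] += 1
        for k, value in enumerate(features):
            sums[batch_index][k] += value
    return [[s / count for s in row] for row, count in zip(sums, counts)]

# Hypothetical 2-feature outputs for two reviews (2 tokens and 1 token).
token_features = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
pooled = mean_pool(token_features, [0, 0, 1])
print(pooled)  # [[2.0, 3.0], [10.0, 20.0]]
```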

... instantiate the transformer encoder model ...

In [14]:
model = TransformerEncoderModel(
    vocab_size=len(vocab), num_features=300, num_heads=12, feedforward_dim=512,
    num_layers=4, out_dim=1, dropout=0.1,
).create()

... and print it out.

In [15]:
print(model)

TransformerEncoderModel(
  (embedding): Embedding(19566, 300)
  (pos_encoder): IndexedPositionalEmbedding(
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (transformer_block): LayerList(
    (0-3): 4 x TransformerEncoderLayer(
      (self_attn): MultiHeadAttentionLayer(
        (Wq): Linear(in_features=300, out_features=300, bias=True)
        (Wk): Linear(in_features=300, out_features=300, bias=True)
        (Wv): Linear(in_features=300, out_features=300, bias=True)
        (Wout): Linear(in_features=300, out_features=300, bias=True)
      )
      (attn_dropout): Dropout(p=0.1, inplace=False)
      (attn_skip): Add()
      (attn_norm): LayerNorm(300, affine=True, mode=graph)
      (feedforward): Sequential(
        (0): Linear(in_features=300, out_features=512, bias=True)
        (1): ReLU()
        (2): Linear(in_features=512, out_features=300, bias=True)
      )
      (feedforward_dropout): Dropout(p=0.1, inplace=False)
      (feedforward_skip): Add()
      (feedforward_norm): La

## Loading Pretrained Embeddings

Download the GloVe embeddings ...

In [16]:
import os
from torchvision.datasets.utils import download_url, extract_archive

glove_folder = ".glove_cache"
if not os.path.exists(glove_folder):
    os.makedirs(glove_folder, exist_ok=True)
    url = "https://nlp.stanford.edu/data/glove.42B.300d.zip"
    download_url(url, glove_folder)
    zip_filepath = os.path.join(glove_folder, "glove.42B.300d.zip")
    extract_archive(zip_filepath, glove_folder)
    os.remove(zip_filepath)

... implement a function to load the GloVe embeddings ...

In [17]:
def load_glove_embeddings(glove_file):
    """Load GloVe embeddings."""
    glove_embeddings = {}
    with open(glove_file, "r", encoding="utf-8") as file:
        for line in file:
            values = line.split()
            word = values[0]
            glove_embeddings[word] = np.round(
                np.asarray(values[1:], dtype="float32"), decimals=6,
            )
    return glove_embeddings

... implement a function to get GloVe embeddings for a vocabulary ...

In [18]:
def get_glove_embeddings(vocab, glove_embeddings, embed_dim):
    """Get GloVe embeddings for a vocabulary."""
    embeddings = torch.zeros((len(vocab), embed_dim), dtype=torch.float32)
    for i, token in enumerate(vocab):
        embedding = glove_embeddings.get(token)
        if embedding is None:
            embedding = glove_embeddings.get(token.lower())
        if embedding is not None:
            embeddings[i] = torch.tensor(embedding, dtype=torch.float32)
    return embeddings

... and add the GloVe pretrained embeddings.

In [19]:
glove_file = os.path.join(glove_folder, "glove.42B.300d.txt")
glove_embed, embed_dim = load_glove_embeddings(glove_file), 300

model.embedding.weight.data = \
    get_glove_embeddings(vocab.get_tokens(), glove_embed, embed_dim)
model.embedding.weight.requires_grad = False
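The two helper functions above rely on the GloVe text format (one word per line, followed by its vector components) and on a fallback that tries the lowercase token and otherwise leaves a zero vector. The following sketch reproduces both steps on a tiny in-memory file. The helpers `parse_glove` and `lookup` and the 3-dimensional vectors are hypothetical.

```python
import io

def parse_glove(file):
    """Parse lines of the form 'word v1 v2 ... vn' into a dictionary."""
    embeddings = {}
    for line in file:
        values = line.split()
        embeddings[values[0]] = [float(v) for v in values[1:]]
    return embeddings

def lookup(token, embeddings, embed_dim):
    """Return the vector for a token, its lowercase form, or zeros."""
    vector = embeddings.get(token)
    if vector is None:
        vector = embeddings.get(token.lower())
    return vector if vector is not None else [0.0] * embed_dim

# Hypothetical 3-dimensional embeddings; the real file has 300 dimensions.
toy_file = io.StringIO("movie 0.1 0.2 0.3\ngreat 0.4 0.5 0.6\n")
toy_glove = parse_glove(toy_file)
print(lookup("Great", toy_glove, 3))  # [0.4, 0.5, 0.6]
print(lookup("<unk>", toy_glove, 3))  # [0.0, 0.0, 0.0]
```

Tokens without a pretrained vector, including `<unk>`, therefore start from a zero embedding, and they stay there because the embedding weights are frozen.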

## Training the Model

Compile the model ...

In [20]:
classifier = dl.BinaryClassifier(
    model=model, optimizer=dl.AdamW(lr=1e-4),
).create()

... and train it.

In [21]:
trainer = dl.Trainer(max_epochs=5, accelerator="cpu")
trainer.fit(classifier, train_dataloader, val_dataloader)


  | Name          | Type                    | Params | Mode 
------------------------------------------------------------------
0 | loss          | BCELoss                 | 0      | trai


## Evaluating the Trained Model

Test the trained model ...

In [22]:
test_results = trainer.test(classifier, test_dataloader)


... and display the model’s predictions on some reviews.

In [23]:
import random

classifier.model.eval()

texts, labels, predictions = [], [], []
for idx in random.sample(range(len(test_dataset)), 3):
    sample = test_dataset[idx]
    input_sequence = torch.tensor(vocab(tokenize(sample["text"]))).long()
    test_input = {
        "sequences": input_sequence,
        "batch_indices": torch.zeros_like(input_sequence, dtype=torch.long),
    }
    probability = classifier.model(test_input)
    prediction = probability > 0.5

    texts.append(sample["text"])
    labels.append(sample["label"])
    predictions.append(prediction.item() * 1)

df = pd.DataFrame({"text": texts, "label": labels, "prediction": predictions})
styled_df = df.style.set_properties(**{"text-align": "left"}).set_table_styles(
    [{"selector": "th", "props": [("text-align", "center")]}]
)
with pd.option_context("display.max_colwidth", None):
    display(styled_df)

Unnamed: 0,text,label,prediction
0,"I was truly and wonderfully surprised at ""O' Brother, Where Art Thou?"" The video store was out of all the movies I was planning on renting, so then I came across this. I came home and as I watched I became engrossed and found myself laughing out loud. The Coen's have made a magnificiant film again. But I think the first time you watch this movie, you get to know the characters. The second time, now that you know them, you laugh sooo hard it could hurt you. I strongly would reccomend ANYONE seeing this because if you are not, you are truly missing a film gem for the ages. 10/10",1,1
1,"I tried watching this movie, but I didn't make it past the first 15 minutes. It's a terrible disappointment, considering the cast, but I can't look past the fact that the dialogue is in English and some of the actors pretending to be Indian are not even close (read: Kristin Kreuk). Considering that India alone has 1/6th of the world's population and one of the biggest movie industries, I don't think it would have been hard for the film-makers to have found an excellent Indian actress to play the part. And I don't say so because of some blind patriotism, but because it's absolutely and totally absurd for a non-Indian to play the role of an Indian/Pakistani. Now some people say that 'as long as she's convincing who cares?' but my point is exactly that she's NOT convincing and never can be - not due to her acting skills, but due to her ethnicity. For example, however good an actor Tom Hanks may be, he'll never be able to play an Australian Aborigine! But that is still minor to the biggest faux pas the film-makers made: having the dialogue in English. It totally destroys the mood, as well as any semblance of authenticity. Had the same movie been made in native languages (Hindi, Urdu, Punjabi) with English subtitles, this may have been an excellent movie. Unfortunately, as things stand, I would not recommend anyone seeing it, apart from film students who want to study ""What not to do"" in movies.",0,0
2,"Despite this production having received a number of poor reviews, it actually holds up quite well for its age. Note also that it is not a BBC programme, it was simply licensed to them by Granada Ventures when the Jane Austen collection was released on DVD. So how does it compare with other adaptations of the same novel? The most well-known version these days is the 1995 film with Amanda Root as Anne Elliott and Ciaran Hinds as Captain Frederick Wentworth. That film was of course shorter but a good snapshot of the story - the earlier version, with Ann Firbank and Bryan Marshall in the same roles, had four hours to tell the story and moved at a more leisurely pace. Firbank is a good ten years too old for her role, but she is very good - Marshall is excellent as Wentworth, a man disappointed in love, and bitter about interference. And hidden in the cast are people who also contribute - Michael Culver, later seen in Cadfael, as Harvill; Richard Vernon, later seen in the Hitchhikers Guide to the Galaxy, as Admiral Croft; Noel Dyson, earlier in Coronation Street, as Mrs Musgrove. One criticism I do have is that the hairstyles are a bit distracting, and that the costumes are awful! Still, this shouldn't detract from a hugely enjoyable Austen adaptation.",1,1
