# Basic Seq2Seq
This is a basic seq2seq implementation to show what can be done for conversational models.  The task we'll train it on is predicting company responses to consumers.

This notebook shows how to prepare the data and construct the Keras model, but will not train quickly!  Instead, it demonstrates how the network progresses toward natural responses, and allows replying to arbitrary text, as shown below.  Unfortunately, getting to interesting results takes longer than an hour on Kaggle's non-GPU notebooks, so you'll need to download the notebook and run on your own machine to get to interesting results.

This configuration tops out at a test loss of ~1.8, and provides nuanced responses to some of the more requests, like "[the I problem](http://www.refinery29.com/2017/11/179790/ios-11-1-bug-keyboard-problem)" for @AppleSupport, after around 6 hours of training on a CUDA 5.0 GPU.

![seq2seq model architecture](https://i.imgur.com/JmuryKu.png)

In [12]:
import re
import random
import time

# print('Library versions:')

import keras
# print(f'keras:{keras.__version__}')
#import pandas as pd
#print(f'pandas:{pd.__version__}')
import sklearn
# print(f'sklearn:{sklearn.__version__}')
import nltk
# print(f'nltk:{nltk.__version__}')
import numpy as np
# print(f'numpy:{np.__version__}')

from sklearn.feature_extraction.text import CountVectorizer
from nltk.tokenize import casual_tokenize

#from tqdm import tqdm_notebook as tqdm # Special jupyter notebook progress bar 💫

## Model Parameters

In [13]:
# 8192 - large enough for demonstration, larger values make network training slower
MAX_VOCAB_SIZE = 394
# seq2seq generally relies on fixed length message vectors - longer messages provide more info
# but result in slower training and larger networks
MAX_MESSAGE_LEN = 20 
# Embedding size for words - gives a trade off between expressivity of words and network size
EMBEDDING_SIZE = 100
# Embedding size for whole messages, same trade off as word embeddings
CONTEXT_SIZE = 100
# Larger batch sizes generally reach the average response faster, but small batch sizes are
# required for the model to learn nuanced responses.  Also, GPU memory limits max batch size.
BATCH_SIZE = 100
# Helps regularize network and prevent overfitting.
DROPOUT = 0.2
# High learning rate helps model reach average response faster, but can make it hard to 
# converge on nuanced responses
LEARNING_RATE=0.005

# Tokens needed for seq2seq
UNK = 1  # words that aren't found in the vocab
PAD = 0  # after message has finished, this fills all remaining vector positions
START = 2  # provided to the model at position 0 for every response predicted

# Implementaiton detail for allowing this to be run in Kaggle's notebook hardware
SUB_BATCH_SIZE = 100


## Data Prep
Here, we'll prepare the data for training our seq2seq model, including:

- Replace screen names with `@__sn__` token to show model the commonality between them
- Build a vocab to turn tokens into integers suitable for our seq2seq model
- Tokenize input and target text into fixed size vectors
- Partition our dataset into train and test sets

### Data Loading and Reshaping
Pulled from [this kernel](https://www.kaggle.com/soaxelbrooke/first-inbound-and-response-tweets).

### Tokenizing and Vocab Build

We'll use NLTK's `casual_tokenize`, which handles a lot of corner cases found in social media data ("casual" text data) along with scitkit learn's `CountVectorizer`.  We won't use the actual `CountVectorizer`, just use it as a convenient vocabulary builder, which we'll apply with functions that turn text into "word indexes" - integers that represent each word - and back.

In [14]:
import pickle

with open('./data/embeddings.pkl', 'rb') as fp:
    our_embedding , idx2word , word2idx = pickle.load(fp)

In [15]:
word2idx

{'$': 274,
 '%': 28,
 "'": 112,
 "''": 336,
 '(': 53,
 ')': 304,
 ',': 307,
 '-': 72,
 '.': 96,
 '1': 31,
 '10': 73,
 '11': 47,
 '1bn': 315,
 '2': 4,
 '2000': 285,
 '2003': 177,
 '2005': 157,
 '27': 281,
 '3': 358,
 '300m': 42,
 '4': 212,
 '42': 43,
 '464,000': 291,
 '5': 361,
 '500m': 127,
 '6': 60,
 '76': 115,
 '8': 256,
 '``': 16,
 'a': 344,
 'about': 152,
 'account': 189,
 'accounts': 297,
 'ad': 54,
 'adjust': 169,
 'advert': 18,
 'advertising': 235,
 'after': 260,
 'against': 278,
 'ahead': 91,
 'alan': 233,
 'alexander': 142,
 'all': 343,
 'almost': 206,
 'already': 249,
 'also': 167,
 'america': 331,
 'amount': 313,
 'an': 253,
 'analysts': 84,
 'and': 119,
 'announce': 20,
 'any': 82,
 'aol': 373,
 'around': 355,
 'as': 371,
 'aside': 26,
 'assets': 368,
 'at': 49,
 'attractive': 130,
 'back': 365,
 'bank': 6,
 'be': 29,
 'been': 237,
 'before': 302,
 'beijing': 387,
 'believe': 74,
 'benefited': 124,
 'bertelsmann': 80,
 'better': 254,
 'big': 81,
 'biggest': 200,
 'bonds': 5

In [16]:
idx2word

{3: 'thought',
 4: '2',
 5: 'higher',
 6: 'bank',
 7: 'newspaper',
 8: 'projecting',
 9: 'produce',
 10: 'timewarner',
 11: 'needed',
 12: 'division',
 13: 'recent',
 14: 'way',
 15: 'hit',
 16: '``',
 17: 'reserves',
 18: 'advert',
 19: 'despite',
 20: 'announce',
 21: 'meaningful',
 22: 'export',
 23: 'lost',
 24: 'settle',
 25: 'own',
 26: 'aside',
 27: 'offset',
 28: '%',
 29: 'be',
 30: 'this',
 31: '1',
 32: 'to',
 33: 'meeting',
 34: 'yawning',
 35: 'keep',
 36: 'subscribers',
 37: 'trillion',
 38: 'he',
 39: 'i',
 40: 'earnings',
 41: 'firm',
 42: '300m',
 43: '42',
 44: 'free',
 45: 'preceding',
 46: 'policy',
 47: '11',
 48: 'window',
 49: 'at',
 50: 'bonds',
 51: 'stake',
 52: 'can',
 53: '(',
 54: 'ad',
 55: 'sinche',
 56: 'increase',
 57: 'sixth',
 58: 'view',
 59: 'deal',
 60: '6',
 61: 'ripe',
 62: 'financial',
 63: 'move',
 64: 'thursday',
 65: 'falls',
 66: 'reached',
 67: 'highest',
 68: 'interest',
 69: 'level',
 70: 'profit',
 71: 'had',
 72: '-',
 73: '10',
 74: 'b

In [17]:
word2idx['__unk__'] = UNK
word2idx['__pad__'] = PAD
word2idx['__start__'] = START

idx2word[UNK] = '__unk__'
idx2word[PAD] = '__pad__'
idx2word[START] =  '__start__'

In [18]:
with open('data/xy.pkl', 'rb') as fp:
    x, y = pickle.load(fp)

In [19]:
y

[[54, 141, 239, 214, 345, 70, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [219, 83, 93, 148, 123, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

In [20]:
x, y = np.array(x), np.array(y)

In [21]:
y

array([[ 54, 141, 239, 214, 345,  70,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0],
       [219,  83,  93, 148, 123,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0]])

### Vocab Helper Functions
These helper functions take strings and turn them into word indexes used by the actual seq2seq models.  This turns something like "This is how we do it." into a padded array of integers, like [153, 4, 643, 48, 94, 54, 8, 0, 0, 0].  We'll apply the `to_word_idx` function to our text data to get our `N x MESSAGE_LEN` training/test data.

In [22]:
import re
def pretokenize(sentence):
    chars = r'([\.\'"])'
    return re.sub(chars, r' \1 ', sentence)

In [23]:
import nltk
def to_word_idx(sentence):
    full_length = [word2idx[word.lower()] if word.lower() in word2idx else 1 for word in nltk.word_tokenize(pretokenize(sentence))] + [0] * 20
    return full_length[:20]

def from_word_idx(word_idxs):
    return ' '.join(idx2word[idx] for idx in word_idxs if idx != PAD).strip()


In [24]:
from_word_idx(to_word_idx('Quarterly profits at US media giant TimeWarner jumped 76% to $1.13bn (ВЈ600m) for the three months to December, from $639m year-earlier.The firm, which is now one of the biggest investors in Google, benefited from sales of high-speed internet connections and higher advert sales. TimeWarner said fourth quarter sales rose 2% to $11.1bn from $10.9bn. Its profits were buoyed by one-off gains which offset a profit dip at Warner Bros, and less users for AOL.Time Warner said on Friday that it now owns 8% of search-engine Google. But its own internet business, AOL, had has mixed fortunes. It lost 464,000 subscribers in the fourth quarter profits were lower than in the preceding three quarters. However, the company said AOL\'s underlying profit before exceptional items rose 8% on the back of stronger internet advertising revenues. It hopes to increase subscribers by offering the online service free to TimeWarner internet customers and will try to sign up AOL\'s existing customers for high-speed broadband. TimeWarner also has to restate 2000 and 2003 results following a probe by the US Securities Exchange Commission (SEC), which is close to concluding.Time Warner\'s fourth quarter profits were slightly better than analysts\' expectations. But its film division saw profits slump 27% to $284m, helped by box-office flops Alexander and Catwoman, a sharp contrast to year-earlier, when the third and final film in the Lord of the Rings trilogy boosted results. For the full-year, TimeWarner posted a profit of $3.36bn, up 27% from its 2003 performance, while revenues grew 6.4% to $42.09bn. "Our financial performance was strong, meeting or exceeding all of our full-year objectives and greatly enhancing our flexibility," chairman and chief executive Richard Parsons said. For 2005, TimeWarner is projecting operating earnings growth of around 5%, and also expects higher revenue and wider profit margins.TimeWarner is to restate its accounts as part of efforts to resolve an inquiry into AOL by US market regulators. It has already offered to pay $300m to settle charges, in a deal that is under review by the SEC. The company said it was unable to estimate the amount it needed to set aside for legal reserves, which it previously set at $500m. It intends to adjust the way it accounts for a deal with German music publisher Bertelsmann\'s purchase of a stake in AOL Europe, which it had reported as advertising revenue. It will now book the sale of its stake in AOL Europe as a loss on the value of that stake.'))

'quarterly profits at us media giant timewarner jumped 76 % to $ 1 . __unk__ ( __unk__ ) for the'

### Train / Test Split
Here, we split our data into training and test sets.  For simplicity, we use a random split, which may result in different distributions between the training and test set, but we won't worry about that for this case.

In [25]:
all_idx = list(range(len(x)))
train_idx = set(random.sample(all_idx, int(0.8 * len(all_idx))))
test_idx = {idx for idx in all_idx if idx not in train_idx}

train_x = x[:]#[list(train_idx)]
test_x = x[:]#[list(test_idx)]
train_y = y[:]#[list(train_idx)]
test_y = y[:]#[list(test_idx)]

assert train_x.shape == train_y.shape
assert test_x.shape == test_y.shape

print('Training data of shape {%d} and test data of shape {%d}.'%train_x.shape)

Training data of shape {2} and test data of shape {20}.


In [26]:
train_x.shape

(2, 20)

## Model Creation
We'll create and compile the model here.  It will consist of the following components:

- Shared word embeddings
  - A shared embedding layer that turns word indexes (a sparse representation) into a dense/compressed representation.  This embeds both the request from the customer, and also the last words uttered by the model that are fed back into the model.
- Encoder RNN
  - In this case, a single LSTM layer.  This encodes the whole input sentence into a context vector (or thought vector) that represents completely what the customer is saying, and produces a single output.
- Decoder RNN
  - This RNN (also an LSTM in this case) decodes the context vector into a string of tokens/utterances.  For each time step, it takes the context vector and the embedded last utterance and produces the next utterance, which is fed back into the model.  More complex and effective models copy the encoder state into the decoder, add more layers of LSTMs, and apply attention mechanisms - but these are out of the scope of this simple example.
- Next Word Dense+Softmax
  - These two layers take the decoder output and turn it into the next word to be uttered.  The dense layer allows the decoder to not map directly to words uttered, and the softmax turns the dense layer output into a probability distribution, from which we pick the most likely next word.

![seq2seq model structure](https://i.imgur.com/JmuryKu.png)

In [27]:
# keras imports, because there are like... A million of them.
from keras.models import Model
from keras.optimizers import Adam
from keras.layers import Dense, Input, LSTM, Dropout, Embedding, RepeatVector, concatenate, \
    TimeDistributed
from keras.utils import np_utils

In [28]:
train_y[:, :-1]

array([[ 54, 141, 239, 214, 345,  70,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0],
       [219,  83,  93, 148, 123,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0]])

In [29]:
 def create_model():
    shared_embedding = Embedding(
        output_dim=EMBEDDING_SIZE,
        input_dim=MAX_VOCAB_SIZE,
        input_length=MAX_MESSAGE_LEN,
        name='embedding',   
    )
    # ENCODER
    encoder_input = Input(
        shape=(MAX_MESSAGE_LEN,),
        dtype='int32',
        name='encoder_input',
    )
    
    embedded_input = shared_embedding(encoder_input)
    
    # No return_sequences - since the encoder here only produces a single value for the
    # input sequence provided.
    encoder_rnn = LSTM(
        CONTEXT_SIZE,
        name='encoder',
        dropout=DROPOUT
    )
    
    context = RepeatVector(MAX_MESSAGE_LEN)(encoder_rnn(embedded_input))
    
    # DECODER
    
    last_word_input = Input(
        shape=(MAX_MESSAGE_LEN, ),
        dtype='int32',
        name='last_word_input',
    )
    
    embedded_last_word = shared_embedding(last_word_input)
    # Combines the context produced by the encoder and the last word uttered as inputs
    # to the decoder.
    decoder_input = concatenate([embedded_last_word, context], axis=2)
    
    # return_sequences causes LSTM to produce one output per timestep instead of one at the
    # end of the intput, which is important for sequence producing models.
    decoder_rnn = LSTM(
        CONTEXT_SIZE,
        name='decoder',
        return_sequences=True,
        dropout=DROPOUT
    )
    
    decoder_output = decoder_rnn(decoder_input)
    
    # TimeDistributed allows the dense layer to be applied to each decoder output per timestep
    next_word_dense = TimeDistributed(
        Dense(int(MAX_VOCAB_SIZE / 2), activation='relu'),
        name='next_word_dense',
    )(decoder_output)
    
    next_word = TimeDistributed(
        Dense(MAX_VOCAB_SIZE, activation='softmax'),
        name='next_word_softmax'
    )(next_word_dense)
    
    return Model(inputs=[encoder_input, last_word_input], outputs=[next_word])

s2s_model = create_model()
optimizer = Adam(lr=LEARNING_RATE, clipvalue=5.0)
s2s_model.compile(optimizer='adam', loss='categorical_crossentropy')

## Model Training
We'll train the model here.  After each sub-batch of the dataset, we'll test with static input strings to see how the model is progressing in human readable terms.  Its important to have these tests along with traditional model evaluation to provide a better understanding of how well the model is training.

It's important to pull test strings from the real distribution of the data, also.  It can be hard to really put yourself in customers' shoes when writing test messages, and you will get non-representative results when you provide test examples that don't fit the true distribution of the input data (when your input text doesn't sound like real customer requests).

In [30]:
def add_start_token(y_array):
    """ Adds the start token to vectors.  Used for training data. """
    return np.hstack([
        START * np.ones((len(y_array), 1)),
        y_array[:, :-1],
    ])

def binarize_labels(labels):
    """ Helper function that turns integer word indexes into sparse binary matrices for 
        the expected model output.
    """
    return np.array([np_utils.to_categorical(row, num_classes=MAX_VOCAB_SIZE)
                     for row in labels])

In [31]:
def respond_to(model, text):
    """ Helper function that takes a text input and provides a text output. """
    input_y = add_start_token(PAD * np.ones((1, MAX_MESSAGE_LEN)))
    idxs = np.array(to_word_idx(text)).reshape((1, MAX_MESSAGE_LEN))
    for position in range(MAX_MESSAGE_LEN - 1):
        prediction = model.predict([idxs, input_y]).argmax(axis=2)[0]
        input_y[:,position + 1] = prediction[position]
    return from_word_idx(model.predict([idxs, input_y]).argmax(axis=2)[0])

In [37]:
def train_mini_epoch(model, start_idx, end_idx):
    """ Batching seems necessary in Kaggle Jupyter Notebook environments, since
        `model.fit` seems to freeze on larger batches (somewhere 1k-10k).
    """
    b_train_y = binarize_labels(train_y[start_idx:end_idx])
    input_train_y = add_start_token(train_y[start_idx:end_idx])
    
    model.fit(
        [train_x[start_idx:end_idx], input_train_y], 
        b_train_y,
        epochs=1,
        batch_size=BATCH_SIZE,
    )
    
    rand_idx = 0 #random.sample(list(range(len(test_x))), SUB_BATCH_SIZE)
    #print('Test results:', model.evaluate(
    #    [train_x[rand_idx], add_start_token(train_x[rand_idx])],
    #    binarize_labels(test_y[rand_idx])
    #))
    
    input_strings = [
        'The dollar has hit its highest level against the euro in almost three months after the Federal Reserve head said the US trade deficit is set to stabilise.And Alan Greenspan highlighted the US government\'s willingness to curb spending and rising household savings as factors which may help to reduce it. In late trading in New York, the dollar reached $1.2871 against the euro, from $1.2974 on Thursday. Market concerns about the deficit has hit the greenback in recent months. On Friday, Federal Reserve chairman Mr Greenspan\'s speech in London ahead of the meeting of G7 finance ministers sent the dollar higher after it had earlier tumbled on the back of worse-than-expected US jobs data. "I think the chairman\'s taking a much more sanguine view on the current account deficit than he\'s taken for some time," said Robert Sinche, head of currency strategy at Bank of America in New York. "He\'s taking a longer-term view, laying out a set of conditions under which the current account deficit can improve this year and next."Worries about the deficit concerns about China do, however, remain. China\'s currency remains pegged to the dollar and the US currency\'s sharp falls in recent months have therefore made Chinese export prices highly competitive. But calls for a shift in Beijing\'s policy have fallen on deaf ears, despite recent comments in a major Chinese newspaper that the "time is ripe" for a loosening of the peg. The G7 meeting is thought unlikely to produce any meaningful movement in Chinese policy. In the meantime, the US Federal Reserve\'s decision on 2 February to boost interest rates by a quarter of a point - the sixth such move in as many months - has opened up a differential with European rates. The half-point window, some believe, could be enough to keep US assets looking more attractive, and could help prop up the dollar. The recent falls have partly been the result of big budget deficits, as well as the US\'s yawning current account gap, both of which need to be funded by the buying of US bonds and assets by foreign firms and governments. The White House will announce its budget on Monday, and many commentators believe the deficit will remain at close to half a trillion dollars.'    
    ]
    
    for input_string in input_strings:
        output_string = respond_to(model, input_string)
        print('< "{%s}"'%output_string)


### Train the model!

You can stop training by pressing the stop button - the training code is configured to watch for the `KeyboardInterrupt` exception triggered that way.  Also, it will run until the configured stopping point below.


Let's start the training! 🚀

In [38]:
training_time_limit = 360 * 60  # seconds (notebooks terminate after 1 hour)
start_time = time.time()
stop_after = start_time + training_time_limit

class TimesUpInterrupt(Exception):
    pass

try:
    for epoch in range(100):
        print('Training in epoch {epoch}...')
        for start_idx in range(0, len(train_x), SUB_BATCH_SIZE):
            train_mini_epoch(s2s_model, start_idx, start_idx + SUB_BATCH_SIZE)
            if time.time() > stop_after:
                raise TimesUpInterrupt
except KeyboardInterrupt:
    print("Halting training from keyboard interrupt.")
except TimesUpInterrupt:
    print("Halting after {time.time() - start_time} seconds spent training.")

Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greens

< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{ad sales boost time warner profit}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {epoch}...
Epoch 1/1
< "{dollar gains on greenspan speech}"
Training in epoch {e

In [39]:
respond_to(s2s_model,'The dollar has hit its highest level against the euro in almost three months after the Federal Reserve head said the US trade deficit is set to stabilise.And Alan Greenspan highlighted the US government\'s willingness to curb spending and rising household savings as factors which may help to reduce it. In late trading in New York, the dollar reached $1.2871 against the euro, from $1.2974 on Thursday. Market concerns about the deficit has hit the greenback in recent months. On Friday, Federal Reserve chairman Mr Greenspan\'s speech in London ahead of the meeting of G7 finance ministers sent the dollar higher after it had earlier tumbled on the back of worse-than-expected US jobs data. "I think the chairman\'s taking a much more sanguine view on the current account deficit than he\'s taken for some time," said Robert Sinche, head of currency strategy at Bank of America in New York. "He\'s taking a longer-term view, laying out a set of conditions under which the current account deficit can improve this year and next."Worries about the deficit concerns about China do, however, remain. China\'s currency remains pegged to the dollar and the US currency\'s sharp falls in recent months have therefore made Chinese export prices highly competitive. But calls for a shift in Beijing\'s policy have fallen on deaf ears, despite recent comments in a major Chinese newspaper that the "time is ripe" for a loosening of the peg. The G7 meeting is thought unlikely to produce any meaningful movement in Chinese policy. In the meantime, the US Federal Reserve\'s decision on 2 February to boost interest rates by a quarter of a point - the sixth such move in as many months - has opened up a differential with European rates. The half-point window, some believe, could be enough to keep US assets looking more attractive, and could help prop up the dollar. The recent falls have partly been the result of big budget deficits, as well as the US\'s yawning current account gap, both of which need to be funded by the buying of US bonds and assets by foreign firms and governments. The White House will announce its budget on Monday, and many commentators believe the deficit will remain at close to half a trillion dollars.')

'dollar gains on greenspan speech'

In [40]:
import h5py

In [41]:
s2s_model.save("./models/s2s.h5")