
[Lab 1 Part 2] - Missing softmax argument in Dense layer #147

@ksadura

Description


In the solution for this task, the final RNN model is created as follows:

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
        tf.keras.layers.LSTM(rnn_units),
        tf.keras.layers.Dense(vocab_size)
    ])

    return model

    return model

model = build_model(len(vocab), embedding_dim=256, rnn_units=1024, batch_size=32)
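For context, the sampling step mentioned in the task can operate on the raw logits that this Dense layer produces; `tf.random.categorical` accepts unnormalized log-probabilities directly. A minimal sketch (the logit values are made up for illustration, assuming TensorFlow 2.x):

```python
import tensorflow as tf

# Hypothetical logits for a batch of 1 over a 4-character vocabulary.
logits = tf.constant([[2.0, 0.5, -1.0, 0.1]])

# tf.random.categorical samples from unnormalized logits, so an
# explicit softmax inside the model is not required for sampling.
next_char = tf.random.categorical(logits, num_samples=1)  # shape [1, 1], int64
```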

Why is there no activation function (softmax) defined in the Dense layer? The task says:

The final output of the LSTM is then fed into a fully connected Dense layer where we'll output a softmax over each character in the vocabulary, and then sample from this distribution to predict the next character.

According to the docs, the activation defaults to None (i.e. linear) if not explicitly declared.
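That default means the layer outputs raw logits rather than probabilities. A small sketch of the two usual ways those logits are consumed, which is presumably why the activation is omitted here (assuming TensorFlow 2.x; layer sizes are arbitrary):

```python
import tensorflow as tf

# Dense with no activation is linear, so it returns raw logits.
layer = tf.keras.layers.Dense(5)
logits = layer(tf.ones([1, 8]))

# Option 1: apply softmax afterwards to get probabilities.
probs = tf.nn.softmax(logits)  # rows sum to 1

# Option 2: let the loss consume logits directly, which is more
# numerically stable than softmax followed by a log inside the loss.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
```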
