In [1]:
import tensorflow as tf

## Scalar


In [111]:
scalar = tf.zeros(shape=())
print(scalar)
print(scalar + scalar)
print(scalar + 1)

print(scalar.numpy())  # Get back the actual scalar value

print(tf.reshape(scalar,
                 shape=(1,)).numpy())  # Can reshape into a tensor of 1 element


tf.Tensor(0.0, shape=(), dtype=float32)
tf.Tensor(0.0, shape=(), dtype=float32)
tf.Tensor(1.0, shape=(), dtype=float32)
0.0
[0.]


## Vector


In [5]:
vector = tf.zeros(shape=(5,))  # (5) is just the int 5; (5,) is a 1-tuple
print(vector)

tf.Tensor([0. 0. 0. 0. 0.], shape=(5,), dtype=float32)


## Matrix


In [6]:
matrix = tf.zeros(shape=(5, 5))
print(matrix)

tf.Tensor(
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]], shape=(5, 5), dtype=float32)


## Slicing

An integer index removes that dimension from the result, while a slice keeps it (even when the slice selects a single row or column).


In [14]:
matrix = tf.zeros(shape=(5, 5))
print(f'Single Column Index: {matrix[:,1].shape}')
print(f'Multi-Column Index: {matrix[:,1:2].shape}')
print(f'Single Row Index: {matrix[1].shape}'
     )  # Notice it's the same shape as the single column index
print(f'Multi-Row Index: {matrix[1:2].shape}'
     )  # Notice it's the transpose of the multi-column index shape, (5, 1)


Single Column Index: (5,)
Multi-Column Index: (5, 1)
Single Row Index: (5,)
Multi-Row Index: (1, 5)
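
As a small sketch of the same rule (the variable names here are illustrative, not from the cell above): an integer index drops the axis, a length-1 slice keeps it, and `tf.newaxis` can put a dropped axis back by hand.

```python
import tensorflow as tf

# A 5x5 matrix with distinct values so the slices are easy to compare.
matrix = tf.reshape(tf.range(25), shape=(5, 5))

column_vector = matrix[:, 1]            # integer index: axis 1 is dropped
column_matrix = matrix[:, 1:2]          # length-1 slice: axis 1 is kept
restored = matrix[:, 1][:, tf.newaxis]  # re-add the dropped axis manually

print(column_vector.shape)  # (5,)
print(column_matrix.shape)  # (5, 1)
print(restored.shape)       # (5, 1)
```

The last two tensors hold the same values, so slicing with `1:2` is simply the shorter way to keep the axis.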


## Dictionary

Although shapes can be dictionaries in certain places in TF (e.g. datasets), this is not generally allowed.


In [19]:
try:
    dictionary = tf.zeros(shape={'field1': (5, 5), 'field2': ()})
except Exception:  # Avoid a bare except, which would also swallow KeyboardInterrupt
    print('nope')

nope


In the general case, the closest you can get is a plain Python dictionary whose values are tensors.


In [22]:
dictionary = {'field1': tf.zeros(shape=(5, 5)), 'field2': tf.zeros(shape=())}
print(dictionary)

{'field1': <tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]], dtype=float32)>, 'field2': <tf.Tensor: shape=(), dtype=float32, numpy=0.0>}
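
One way to work with such a dictionary, sketched here as an assumption rather than something the original cell does, is `tf.nest.map_structure`, which applies a function to every tensor and returns a dictionary with the same keys. The names `features`, `ranks`, and `shifted` are made up for illustration.

```python
import tensorflow as tf

features = {
    'field1': tf.zeros(shape=(5, 5)),
    'field2': tf.zeros(shape=()),
}

# Apply a function to every tensor; the dictionary structure is preserved.
ranks = tf.nest.map_structure(lambda t: t.shape.rank, features)
shifted = tf.nest.map_structure(lambda t: t + 1.0, features)

print(ranks)                      # {'field1': 2, 'field2': 0}
print(shifted['field2'].numpy())  # 1.0
```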


## Dictionary in Keras

Note that the model example below inspects **model.input_shape** rather than **model.input**.


In [24]:
try:
    tf.keras.layers.Input(shape={
        'field1': (5, 5),
        'field2': ()
    })  # This is also a no-go
except Exception:  # Avoid a bare except, which would also swallow KeyboardInterrupt
    print('nope')

nope


Instead, pass a dictionary of Input tensors to the tf.keras.Model constructor. The keys tell the model how to unwrap its input (a dictionary here).


In [45]:
input1 = tf.keras.layers.Input(shape=(5, 5))
input2 = tf.keras.layers.Input(shape=())
inputs = {
    'field1': input1,
    'field2': input2
}  # Only used for the final step (use input1 and input2 in graph)
outputs = tf.keras.layers.Dense(3)(
    input1
)  # Technically input2 is being ignored in this network but just example

model = tf.keras.Model(
    inputs=inputs, outputs=outputs
)  # The dictionary is not a tensor but is an outer structure for tensors
print(
    model.input_shape
)  # The shape looks like the one you specified but a batch dimension is added to all tensors

model({
    'field1': tf.zeros(shape=(2, 5, 5)),
    'field2': tf.zeros(shape=(2,))
})  # The model accepts a straight dictionary of tensors.


{'field1': (None, 5, 5), 'field2': (None,)}


<tf.Tensor: shape=(2, 5, 3), dtype=float32, numpy=
array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]], dtype=float32)>

## Multiple Tensor Inputs/Outputs in Keras

This is useful if you need to pass in tensors of **different shapes** without using ragged tensors.

List of tensors in, list of tensors out.


In [54]:
input1 = tf.keras.layers.Input(shape=(5,))
input2 = tf.keras.layers.Input(shape=(3,))
inputs = [input1, input2]
output1 = tf.keras.layers.Dense(2)(input1)
output2 = tf.keras.layers.Dense(4)(input2)
outputs = [output1, output2]
model = tf.keras.Model(inputs=inputs, outputs=outputs)
print(f'Input Shape: {model.input_shape}')

t = tf.zeros(shape=(10, 5))  # Batch size 10
u = tf.zeros(shape=(10, 3))

model([t, u])

Input Shape: [(None, 5), (None, 3)]


[<tf.Tensor: shape=(10, 2), dtype=float32, numpy=
 array([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]], dtype=float32)>,
 <tf.Tensor: shape=(10, 4), dtype=float32, numpy=
 array([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]], dtype=float32)>]

## Single Tensor Column Routing

The ChatGPT-approved version uses tf.keras.layers.Lambda, which is not recommended: a saved Lambda layer is serialized as Python bytecode, which is environment-specific and not portable.

A key thing to note here is that your layer can **return a tuple**, and each item in the tuple is a graph node.

NOTE: I did it this way to demonstrate layer classes, but you could just do tensor operations too (which I didn't know when I made this cell).


In [101]:
class SplitLayer(tf.keras.layers.Layer):

    def __init__(self):
        super().__init__()

    def call(self, inputs):
        return (
            inputs[:, :5], inputs[:, 5:]
        )  # You could skip the whole class and just do left,right = (inputs....


inputs = tf.keras.layers.Input(shape=(8,))  # Take all 8 columns in input tensor
left, right = SplitLayer()(inputs)  # Split the columns
dense1 = tf.keras.layers.Dense(2, name='dense_2')(left)
dense2 = tf.keras.layers.Dense(4, name='dense_4')(right)

output = tf.keras.layers.Concatenate(axis=-1, name='concat')([dense1, dense2])
model = tf.keras.Model(inputs=inputs, outputs=output)
print(f'Input Shape: {model.input_shape}')

t = tf.zeros(shape=(10, 8))  # Batch size 10

model(t)

Input Shape: (None, 8)


<tf.Tensor: shape=(10, 6), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]], dtype=float32)>
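
The "just do tensor operations" alternative mentioned in the note can be sketched on a concrete tensor. This is only an illustration with made-up data; it compares plain slicing against `tf.split`, which is one other way (not used in the original cell) to cut the same columns.

```python
import tensorflow as tf

# Batch of 2 rows with 8 columns, filled with distinct values.
t = tf.reshape(tf.range(16, dtype=tf.float32), shape=(2, 8))

# Plain slicing, as in SplitLayer.call().
left_a, right_a = t[:, :5], t[:, 5:]

# The same split expressed with tf.split (5 columns, then 3 columns).
left_b, right_b = tf.split(t, num_or_size_splits=[5, 3], axis=-1)

print(left_a.shape, right_a.shape)  # (2, 5) (2, 3)
```

Both approaches give identical tensors, so which one to use is mostly a matter of readability.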

## Dictionary to Tensor Routing

In this example, we take an input dictionary with a one-hot feature and a scalar feature, and return a more typical matrix tensor compatible with dense layers, etc.


In [80]:
input1 = tf.keras.layers.Input(
    shape=(5,),
    name='input1')  # Each item in batch is a 5 item array (eg. one-hot)
input2 = tf.keras.layers.Input(shape=(),
                               name='input2')  # Each item in batch is a scalar
inputs = {
    'oh': input1,
    'scal': input2
}  # Input is a dictionary of batched tensors

output = tf.keras.layers.Concatenate(
    axis=-1, name='concat')([input1, tf.expand_dims(input2, axis=-1)])
model = tf.keras.Model(inputs=inputs, outputs=output, name='bla')

model({
    'oh': tf.constant([[1, 0, 0, 0, 0]]),
    'scal': tf.constant([6])
})  # Note that passing naked arrays fails here


<tf.Tensor: shape=(1, 6), dtype=float32, numpy=array([[1., 0., 0., 0., 0., 6.]], dtype=float32)>
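
To see why the `tf.expand_dims` call is needed, here is a sketch of the same routing on concrete tensors, outside of any model. The variable names are illustrative. The scalar feature has shape `(batch,)`, so it needs a trailing axis before it can be concatenated with the `(batch, 5)` one-hot feature.

```python
import tensorflow as tf

one_hot = tf.constant([[1.0, 0.0, 0.0, 0.0, 0.0]])  # shape (1, 5)
scalar = tf.constant([6.0])                          # shape (1,)

scalar_column = tf.expand_dims(scalar, axis=-1)      # shape (1, 1)
combined = tf.concat([one_hot, scalar_column], axis=-1)

print(scalar_column.shape)  # (1, 1)
print(combined.numpy())     # [[1. 0. 0. 0. 0. 6.]]
```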

## Batch Size


Input layers are constructed without the batch dimension in the shape, but **after construction** the resulting tensor has it.


In [83]:
input1 = tf.keras.layers.Input(
    shape=(5,))  # Constructed without batch dimension
print(f'Input Shape: {input1.shape}')  # Batch dimension added in front

Input Shape: (None, 5)


For a **dictionary**, the batch dimension is added to **each field separately**.
Also note that you have to **call the model with a batch**.


In [84]:
input1 = tf.keras.layers.Input(shape=(5, 5))  # No batch size in here
input2 = tf.keras.layers.Input(shape=())
inputs = {'field1': input1, 'field2': input2}
outputs = tf.keras.layers.Dense(3)(input1)

model = tf.keras.Model(inputs=inputs, outputs=outputs)

model({
    'field1': tf.zeros(shape=(2, 5, 5)),
    'field2': tf.zeros(shape=(2,))
})  # Batch added to each field


<tf.Tensor: shape=(2, 5, 3), dtype=float32, numpy=
array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]], dtype=float32)>

For a **list of tensors**, the batch size is added to **each entry separately**.


In [85]:
input1 = tf.keras.layers.Input(shape=(5,))
input2 = tf.keras.layers.Input(shape=(3,))
inputs = [input1, input2]
output1 = tf.keras.layers.Dense(2)(input1)
output2 = tf.keras.layers.Dense(4)(input2)
outputs = [output1, output2]
model = tf.keras.Model(inputs=inputs, outputs=outputs)

t = tf.zeros(shape=(10, 5))  # Batch size 10
u = tf.zeros(shape=(10, 3))

model([t, u])

[<tf.Tensor: shape=(10, 2), dtype=float32, numpy=
 array([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]], dtype=float32)>,
 <tf.Tensor: shape=(10, 4), dtype=float32, numpy=
 array([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]], dtype=float32)>]

Also note that if you use a **layer class**, the **call()** method receives batched tensor(s) as well. Basically, once you instantiate a layer, everything has a batch dimension from that point on.
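
A minimal sketch of that last point, using a hypothetical `RowSum` layer that is not part of the original notebook: the tensor arriving in `call()` already carries the batch dimension, so the layer reduces over the last axis and leaves the batch axis alone.

```python
import tensorflow as tf


class RowSum(tf.keras.layers.Layer):
    """Hypothetical layer that sums the features of each item in the batch."""

    def call(self, inputs):
        # inputs has shape (batch, features); reduce over features only.
        return tf.reduce_sum(inputs, axis=-1, keepdims=True)


batch = tf.ones(shape=(10, 5))  # Batch size 10, 5 features per item
result = RowSum()(batch)

print(result.shape)  # (10, 1)
```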


## Graph Mode


A layer can be called on a concrete tensor for **eager mode** evaluation.


In [89]:
tf.keras.layers.Dense(2)(tf.constant([[1, 2]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[-1.9751551,  0.2966845]], dtype=float32)>

The same kind of layer can also be called on a graph node for **graph mode** evaluation. What you get back is still a tensor, but of a different type (**KerasTensor**), and you can examine properties such as **shape**. Any dimension that is not known at the time the graph is being built can be **None** (for instance, the batch size dimension). You can also **pass None yourself** for dimensions when applicable.


In [96]:
input_layer = tf.keras.layers.Input(shape=(5,))  # Graph mode tensor
t = tf.keras.layers.Dense(2)(input_layer)
print(t)  # Another tensor
print(t.shape)

KerasTensor(type_spec=TensorSpec(shape=(None, 2), dtype=tf.float32, name=None), name='dense_63/BiasAdd:0', description="created by layer 'dense_63'")
(None, 2)
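
As a sketch of passing None yourself (the shapes and names here are made up for illustration), an input can leave a non-batch dimension such as a sequence length unknown. The resulting model then accepts batches of different lengths along that axis.

```python
import tensorflow as tf

# Second dimension (e.g. sequence length) is left unknown on purpose.
inputs = tf.keras.layers.Input(shape=(None, 3))
outputs = tf.keras.layers.Dense(2)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

print(tuple(inputs.shape))   # (None, None, 3)
print(tuple(outputs.shape))  # (None, None, 2)

short = model(tf.zeros(shape=(4, 7, 3)))   # length 7
longer = model(tf.zeros(shape=(4, 11, 3)))  # length 11
print(short.shape, longer.shape)  # (4, 7, 2) (4, 11, 2)
```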


You can perform **arbitrary tensor operations** on graph mode tensors too, and they get included in the graph. This won't disrupt anything such as GPU execution.


In [100]:
input_layer = tf.keras.layers.Input(shape=(5,))  # Graph mode tensor
output_layer = input_layer**2  # Could also be something like tf.reshape
model = tf.keras.Model(inputs=input_layer, outputs=output_layer)

result = model(tf.constant([[1, 2, 3, 4, 5]]))
print(result)
print(result.device)

tf.Tensor([[ 1.        4.        8.999998 16.       24.999998]], shape=(1, 5), dtype=float32)
/job:localhost/replica:0/task:0/device:GPU:0


You can **compile a function** to a graph (e.g. custom helper code inside a training loop) using @tf.function.

The example here doesn't work as I expected, but in Coursera courses tf.function is usually applied to a custom **training loop step** so that each training step runs faster (as an optimized graph instead of Python).
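
A minimal sketch of such a training step, under loud assumptions: a single hypothetical weight `w`, made-up data following y = 2x, and a hand-written gradient descent update instead of a Keras optimizer. None of this comes from a specific course; it only shows where @tf.function would sit.

```python
import tensorflow as tf

w = tf.Variable(0.0)  # Hypothetical single trainable weight


@tf.function  # Traced into a graph on the first call, then reused
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean((w * x - y) ** 2)  # Mean squared error
    grad = tape.gradient(loss, w)
    w.assign_sub(0.05 * grad)  # Hand-written gradient descent update
    return loss


x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([2.0, 4.0, 6.0])  # Made-up data following y = 2x

losses = [float(train_step(x, y)) for _ in range(20)]
print(losses[0], losses[-1])  # The loss shrinks as w approaches 2
```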

The tracing/compilation happens on the **first call**, so it's not worth doing for a function that is only called once. It also tends to pay off mainly when there are a lot of small operations; it is unlikely to help much if all you're doing is a few large operations such as convolutions.

NOTE: this would not be needed for a function that creates your model because operations on graph mode tensors already result in a graph, and it's only happening once.


In [104]:
@tf.function
def make_graph_tensor():
    return tf.constant([1, 2, 3])


print(make_graph_tensor())

input_layer = tf.keras.layers.Input(shape=(5))  # Graph mode tensor


tf.Tensor([1 2 3], shape=(3,), dtype=int32)
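
To make the "tracing happens on the first call" point visible, here is a small sketch with a hypothetical function `scaled_sum` and a `trace_log` list, both invented for this illustration. A Python side effect inside a @tf.function runs only while the function is being traced, so the list grows once even though the function is called twice with inputs of the same shape and dtype.

```python
import tensorflow as tf

trace_log = []  # Records each time the Python body actually runs


@tf.function
def scaled_sum(x):
    trace_log.append('traced')  # Python side effect: runs only during tracing
    return tf.reduce_sum(x * 2.0)


first = scaled_sum(tf.constant([1.0, 2.0, 3.0]))
second = scaled_sum(tf.constant([4.0, 5.0, 6.0]))  # Same shape/dtype: graph reused

print(first.numpy(), second.numpy())  # 12.0 30.0
print(len(trace_log))                 # 1
```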
