-
Hi,

On https://github.com/ml-explore/mlx/blob/main/docs/src/python/nn.rst the `__call__` is defined as:

```python
def __call__(self, x):
    for i, l in enumerate(self.layers):
        x = mx.maximum(x, 0) if i > 0 else x
        x = l(x)
    return x
```

which, as far as I can tell, never uses the input layer (I assume this is a bug?). In contrast, on https://ml-explore.github.io/mlx/build/html/examples/mlp.html the `__call__` is:

```python
def __call__(self, x):
    for l in self.layers[:-1]:
        x = mx.maximum(l(x), 0.0)
    return self.layers[-1](x)
```

which uses all of the layers. Both of these are example MLPs, hence my bug assumption. Can anyone confirm, please?

Thanks in advance, Howard.
Replies: 1 comment
-
It looks correct to me. They are both just different ways of doing the same thing, which is to avoid applying a nonlinearity after the last layer.
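A quick sketch can confirm the two loop styles compute the same function. Here is a minimal NumPy version (with hypothetical random linear layers standing in for the MLX `nn.Linear` layers, and `np.maximum` standing in for `mx.maximum`):

```python
import numpy as np

# Hypothetical stand-ins for the model's linear layers: (weight, bias) pairs.
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((4, 8)), rng.standard_normal(8)),  # input layer
    (rng.standard_normal((8, 8)), rng.standard_normal(8)),  # hidden layer
    (rng.standard_normal((8, 3)), rng.standard_normal(3)),  # output layer
]

def linear(params, x):
    w, b = params
    return x @ w + b

def forward_docs_style(x):
    # Style from docs/src/python/nn.rst: apply ReLU *before* every
    # layer except the first, so the input layer does see the raw input.
    for i, l in enumerate(layers):
        x = np.maximum(x, 0) if i > 0 else x
        x = linear(l, x)
    return x

def forward_example_style(x):
    # Style from the mlp.html example: apply ReLU *after* every
    # layer except the last.
    for l in layers[:-1]:
        x = np.maximum(linear(l, x), 0.0)
    return linear(layers[-1], x)

x = rng.standard_normal((2, 4))
# Both styles give the same composition: l2(relu(l1(relu(l0(x)))))
assert np.allclose(forward_docs_style(x), forward_example_style(x))
```

In both cases the first layer consumes the raw input and the last layer's output is returned with no nonlinearity applied; only the point in the loop where the ReLU is written differs.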