
What is the equivalent to a Flatten layer in MLX? #1308

Closed

s4m13337 opened this issue Aug 4, 2024 · 6 comments

s4m13337 commented Aug 4, 2024

I am trying to implement a simple LeNet:

import mlx.nn as nn


class MLP(nn.Module):

    def __init__(self, out_dims):
        super().__init__()
        self.layers = [
            nn.Conv2d(1, 20, 5),    # input channels, output channels, kernel size
            nn.ReLU(),
            nn.MaxPool2d(2, 2),    # kernel size, stride length
            nn.Conv2d(20, 50, 5),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            ### Need a flatten layer here ###
            nn.Linear(800, 500),
            nn.ReLU(),
            nn.Linear(500, 10),
        ]

    def __call__(self, x):
        for l in self.layers:
            x = l(x)
        return x

Torch offers nn.Flatten for this, but I could not find an equivalent in MLX. Could someone point me in the right direction?

awni (Member) commented Aug 4, 2024

There is no Flatten layer yet. You would have to redo the computation like so:

class MLP(nn.Module):

    def __init__(self, out_dims):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 20, 5),    # input channels, output channels, kernel size
            nn.ReLU(),
            nn.MaxPool2d(2, 2),    # kernel size, stride length
            nn.Conv2d(20, 50, 5),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
        )
        self.mlp = nn.Sequential(
            nn.Linear(800, 500),
            nn.ReLU(),
            nn.Linear(500, 10),
        )

    def __call__(self, x):
        x = self.conv(x)
        x = x.flatten(-3, -1)
        x = self.mlp(x)
        return x

At some point we considered adding Flatten, but we decided we'd prefer not to mirror every op with an nn equivalent, and the workaround above is not so onerous.
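
If you really want a drop-in layer, a custom Flatten module is only a few lines. This is just a sketch built on array.flatten, not something that ships with MLX:

import mlx.nn as nn


class Flatten(nn.Module):
    """Flatten the last `num_dims` axes into one (not an MLX built-in)."""

    def __init__(self, num_dims=3):
        super().__init__()
        self.num_dims = num_dims

    def __call__(self, x):
        # e.g. collapse (B, H, W, C) -> (B, H * W * C) when num_dims == 3
        return x.flatten(-self.num_dims, -1)

With that, Flatten() could be dropped into the original self.layers list between the last MaxPool2d and the first Linear.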

awni closed this as completed Aug 4, 2024
s4m13337 (Author) commented Aug 4, 2024

@awni Thanks, I have tried this, but I've run into another bug.

Here is a minimal example:

i = 0
for X, y in batch_iterate(64, train_images, train_labels):
    i += 1
    loss, grads = loss_and_grad_fn(model, X, y)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)
    if i == 100:
        break
print(model.parameters())

The parameters become NaN at some point between the 100th and 150th batch. It varies on every run but is usually within this range. Is this related to #1277 or #319?

awni (Member) commented Aug 4, 2024

That I have no idea about. You’d need to share more code to fully reproduce this so we can help debug.

Also, make sure you are using the latest MLX.

s4m13337 (Author) commented Aug 4, 2024

LeNet MLX.ipynb.zip

@awni I'm attaching my code here. I am using version 0.16.1.

awni (Member) commented Aug 5, 2024

You are converting train_labels to one-hot and then using its size to determine the size of the dataset. That is a bug because the size will be a factor of 10 too large, so your batch_iterate function will be reading lots of uninitialized memory.

My recommendation would be to not convert the labels to one-hot; just use them as is, which works with cross_entropy and is more efficient.
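
For example, a loss function that takes the integer labels directly might look like this (a sketch; loss_fn is just an illustrative name, and the model is assumed to output logits):

import mlx.nn as nn


def loss_fn(model, X, y):
    # y holds integer class indices, not one-hot vectors;
    # nn.losses.cross_entropy accepts them directly.
    return nn.losses.cross_entropy(model(X), y, reduction="mean")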

Alternatively, you could change your batch iteration to use the right dataset size:

import numpy as np
import mlx.core as mx


def batch_iterate(batch_size, X, y):
    # Use the label array's length, i.e. the true dataset size
    perm = mx.array(np.random.permutation(y.shape[0]))
    for s in range(0, y.shape[0], batch_size):
        ids = perm[s : s + batch_size]
        yield X[ids], y[ids]
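
Either way, the training loop from your minimal example would then be wired up roughly like this (a sketch; model, optimizer, and the data arrays are assumed to come from your notebook):

loss_and_grad_fn = nn.value_and_grad(model, loss_fn)

for X, y in batch_iterate(64, train_images, train_labels):
    loss, grads = loss_and_grad_fn(model, X, y)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)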

s4m13337 (Author) commented Aug 6, 2024

Great catch. I overlooked y.size. Thank you for the help.

This issue was closed.