RuntimeError: Placeholder storage has not been allocated on MPS device! #90440

Open
collindbell opened this issue Dec 8, 2022 · 5 comments
collindbell commented Dec 8, 2022

🐛 Describe the bug

I get an error every time I attempt to use MPS to train a model on my M1 Mac. The error occurs at the first training step (i.e., the first call of model(x)). MRE:

import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset

import pandas as pd
import numpy as np

device = torch.device('mps')

class MyLSTM(nn.Module):
    def __init__(self, hidden_size, num_layers, output_size, input_dim):
        super().__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.input_dim = input_dim

        self.lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_size, num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)

        out, _ = self.lstm(x, (h0, c0))

        out = self.fc(out[:, -1, :])
        return out

def train_step(model, criterion, optimizer, x, y):
    model.train()
    optimizer.zero_grad()
    y_pred = model(x)
    loss = criterion(y_pred, y)
    loss.backward()
    optimizer.step()
    return loss.item()

def train_model(model, criterion, optimizer, train_loader, val_loader, epochs=100):
    train_losses = []
    for epoch in range(epochs):
        print("Epoch", epoch)
        train_loss = 0
        for x, y in train_loader:
            train_loss += train_step(model, criterion, optimizer, x, y)
        train_loss /= len(train_loader)
        train_losses.append(train_loss)
        print("Train loss:", train_loss)
    return train_losses

class MyDataset(Dataset):
    def __init__(self, df, window_size):
        self.df = df
        self.window_size = window_size
        self.data = []
        self.labels = []
        for i in range(len(df) - window_size):
            x = torch.tensor(df.iloc[i:i+window_size].values, dtype=torch.float, device=device)
            y = torch.tensor(df.iloc[i+window_size].values, dtype=torch.float, device=device)
            self.data.append(x)
            self.labels.append(y)
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

class MyDataLoader(DataLoader):
    def __init__(self, dataset, window_size, batch_size, shuffle=True):
        self.dataset = dataset
        super().__init__(self.dataset, batch_size=batch_size, shuffle=shuffle)

df = pd.DataFrame(np.random.randint(0,100,size=(100, 1)))

model = MyLSTM(1, 1, 1, 1)
model.to(device)

train_data = MyDataset(df, 5)

train_loader = MyDataLoader(train_data, 5, 16)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

train_losses = train_model(model, criterion, optimizer, train_loader, None, epochs=10)

I receive the following traceback:

Traceback (most recent call last):
  File "min_mps.py", line 83, in <module>
    train_losses = train_model(model, criterion, optimizer, train_loader, None, epochs=10)
  File "min_mps.py", line 44, in train_model
    train_loss += train_step(model, criterion, optimizer, x, y)
  File "min_mps.py", line 32, in train_step
    y_pred = model(x)
  File "~/miniconda3/envs/jaxenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "min_mps.py", line 24, in forward
    out, _ = self.lstm(x, (h0, c0))
  File "~/miniconda3/envs/jaxenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "~/miniconda3/envs/jaxenv/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 776, in forward
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
RuntimeError: Placeholder storage has not been allocated on MPS device!

Versions

Python version: 3.10.8 (main, Nov 24 2022, 08:08:27) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-13.0-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.4
[pip3] torch==1.14.0.dev20221207
[pip3] torchaudio==0.14.0.dev20221207
[pip3] torchvision==0.15.0.dev20221207
[conda] numpy                     1.22.4                   pypi_0    pypi
[conda] torch                     1.14.0.dev20221207          pypi_0    pypi
[conda] torchaudio                0.14.0.dev20221207          pypi_0    pypi
[conda] torchvision               0.15.0.dev20221207          pypi_0    pypi

Also note, if relevant, that I'm running macOS 13.0. I have also tried this on the 1.13 stable release with the same issue.

cc @kulinseth @albanD @malfet @DenisVieriu97 @razarmehr @abhudev

@malfet added the module: mps and triaged labels on Dec 8, 2022
pudepiedj commented Apr 12, 2023

I don't think this is a bug in PyTorch! You haven't allocated your torch.zeros to the device in the forward pass. If you do that, it runs, at least for me.
def forward(self, x):
    h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=device)
    c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=device)
    out, _ = self.lstm(x, (h0, c0))
    out = self.fc(out[:, -1, :])
    return out
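
A variant of the same fix that avoids relying on the module-level device variable is to take the device from the input itself; a minimal sketch, assuming x has already been moved to the target device:

def forward(self, x):
    # create the initial hidden and cell states on the same device as the input,
    # so the module works unchanged on cpu, cuda, or mps
    h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
    c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
    out, _ = self.lstm(x, (h0, c0))
    out = self.fc(out[:, -1, :])
    return out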

kulinseth (Collaborator) commented

That's indeed correct: we see these errors when not all of the tensors are mapped to the device. There were also some bugs in the LSTM layer which were fixed in the 2.0 release. I would recommend @collindbell try the latest release on macOS 13.3.
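
For reference, a minimal sketch of the device mapping described above, assuming PyTorch 1.13 or later; falling back to CPU keeps the script runnable on machines without Metal support:

import torch

# prefer the MPS backend when it is built and available, otherwise fall back to CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# every tensor that takes part in the forward pass has to live on this device,
# including any hidden-state tensors created inside forward()
x = torch.randn(16, 5, 1, device=device)
print(x.device)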

pudepiedj commented Apr 13, 2023 via email

LTsommer commented

I fixed it by specifying the device when creating each layer of the network:
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
self.lstm = nn.LSTM(..., device=device)
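
On Apple Silicon the same pattern applies with the mps backend rather than cuda; a hedged sketch with made-up layer sizes:

import torch
from torch import nn

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# constructing the layer with device= places its weights on the target device up front,
# so they match inputs created there
lstm = nn.LSTM(input_size=1, hidden_size=8, num_layers=1, batch_first=True, device=device)
x = torch.randn(16, 5, 1, device=device)
out, (h_n, c_n) = lstm(x)
print(out.shape)  # torch.Size([16, 5, 8])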

edvard-bjarnason commented

I ran into a similar issue and got the same error message "RuntimeError: Placeholder storage has not been allocated on MPS device!" when using an LSTM model. All tensors and the model were correctly mapped to the device in the code.

However, my code worked fine when I updated torch to version 2.2.1 (I was using version 2.0.1).

I have an M1 Mac and was using the "mps" device. Before I updated torch, I tried running on the "cpu" device, and the output tensor from the forward pass contained NaN. I didn't look into it further since the issue is fixed in the latest versions of torch :)
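
To confirm which build is active and that the MPS backend is usable before retrying, a quick check along these lines is enough (a generic sketch, not specific to this model):

import torch

print(torch.__version__)                  # installed torch build
print(torch.backends.mps.is_built())      # this build was compiled with MPS support
print(torch.backends.mps.is_available())  # a Metal-capable device is actually usable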
