
AttributeError: '_FakeLoader' object has no attribute 'pin_memory_device' #3655

Closed
geg00 opened this issue May 18, 2022 · 11 comments · Fixed by #3659

Comments


geg00 commented May 18, 2022


Describe the bug
`AttributeError: '_FakeLoader' object has no attribute 'pin_memory_device'` is raised when I execute `dls.show_batch(max_n=6)`.

To Reproduce
Steps to reproduce the behavior (with torch-1.12.0.dev20220511 installed):

Right after executing the DataBlock section:

```python
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path)
```

execute:

```python
dls.show_batch(max_n=6)
```

which raises:

```
AttributeError                            Traceback (most recent call last)
Untitled-1.ipynb Cell 9' in <cell line: 1>()
----> 1 dls.show_batch(max_n=6)

File ~/opt/anaconda3/lib/python3.9/site-packages/fastai/data/core.py:102, in TfmdDL.show_batch(self, b, max_n, ctxs, show, unique, **kwargs)
    100 old_get_idxs = self.get_idxs
    101 self.get_idxs = lambda: Inf.zeros
--> 102 if b is None: b = self.one_batch()
    103 if not show: return self._pre_show_batch(b, max_n=max_n)
    104 show_batch(*self._pre_show_batch(b, max_n=max_n), ctxs=ctxs, max_n=max_n, **kwargs)

File ~/opt/anaconda3/lib/python3.9/site-packages/fastai/data/load.py:170, in DataLoader.one_batch(self)
    168 def one_batch(self):
    169     if self.n is not None and len(self)==0: raise ValueError(f'This DataLoader does not contain any batches')
--> 170     with self.fake_l.no_multiproc(): res = first(self)
    171     if hasattr(self, 'it'): delattr(self, 'it')
    172     return res

File ~/opt/anaconda3/lib/python3.9/site-packages/fastcore/basics.py:621, in first(x, f, negate, **kwargs)
    619 x = iter(x)
    620 if f: x = filter_ex(x, f=f, negate=negate, gen=True, **kwargs)
--> 621 return next(x, None)

File ~/opt/anaconda3/lib/python3.9/site-packages/fastai/data/load.py:125, in DataLoader.__iter__(self)
    123 self.before_iter()
    124 self.__idxs=self.get_idxs() # called in context of main process (not workers/subprocesses)
--> 125 for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
    126     # pin_memory causes tuples to be converted to lists, so convert them back to tuples
    127     if self.pin_memory and type(b) == list: b = tuple(b)
    128     if self.device is not None: b = to_device(b, self.device)

File ~/opt/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py:590, in _SingleProcessDataLoaderIter.__init__(self, loader)
    589 def __init__(self, loader):
--> 590     super(_SingleProcessDataLoaderIter, self).__init__(loader)
    591     assert self._timeout == 0
    592     assert self._num_workers == 0

File ~/opt/anaconda3/lib/python3.9/site-packages/torch/utils/data/dataloader.py:521, in _BaseDataLoaderIter.__init__(self, loader)
    517 self._prefetch_factor = loader.prefetch_factor
    518 # for other backends, pin_memory_device need to set. if not set
    519 # default behaviour is CUDA device. if pin_memory_device is selected
    520 # and pin_memory is not set, the default behaviour false.
--> 521 if (len(loader.pin_memory_device) == 0):
    522     self._pin_memory = loader.pin_memory and torch.cuda.is_available()
    523     self._pin_memory_device = None

AttributeError: '_FakeLoader' object has no attribute 'pin_memory_device'
```

Expected behavior
The selected images should be shown.


Additional context
I'm using the new torch-1.12.0.dev20220511 build to take advantage of the M1 Metal backend.
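Until a release containing the fix is installed, one possible stop-gap (my assumption based on the torch 1.12 check shown in the trace, not a fix from this thread's PR) is to give fastai's `_FakeLoader` class the attribute torch now reads:

```python
# Hypothetical workaround sketch: set the attribute that torch 1.12's
# _BaseDataLoaderIter.__init__ reads on fastai's _FakeLoader class.
# '' is torch's own default for pin_memory_device.
try:
    from fastai.data.load import _FakeLoader
    _FakeLoader.pin_memory_device = ''
    patched = True
except ImportError:
    patched = False  # fastai not installed; nothing to patch
```

This only papers over the missing attribute; upgrading to a fastai version containing the actual fix is the proper solution.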

@josiahls (Contributor)

@geg00 I'm making a pr tonight that fixes this along with some user warnings.

@josiahls (Contributor)

```python
def __init__(self, dataset=None, bs=None, num_workers=0, pin_memory=False, timeout=0, batch_size=None,
             shuffle=False, drop_last=False, indexed=None, n=None, device=None, persistent_workers=False,
             pin_memory_device='', **kwargs):
    ...
    self.fake_l = _FakeLoader(self, pin_memory, num_workers, timeout, persistent_workers=persistent_workers,
                              pin_memory_device=pin_memory_device)
```

Torch's default for pin_memory_device is a blank string.

I also updated DistributedDL, since it also initializes a _FakeLoader.
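The failure mode is generic: any wrapper that duck-types torch's DataLoader has to mirror every attribute newer torch versions read from the loader. A minimal, fastai/torch-free sketch (class and function names here are hypothetical, not fastai's):

```python
class FakeLoaderOld:
    """Mirrors an older DataLoader surface (pre torch 1.12)."""
    def __init__(self, pin_memory=False):
        self.pin_memory = pin_memory

class FakeLoaderNew(FakeLoaderOld):
    """Also mirrors torch 1.12's new attribute; '' matches torch's default."""
    def __init__(self, pin_memory=False, pin_memory_device=''):
        super().__init__(pin_memory)
        self.pin_memory_device = pin_memory_device

def iter_init(loader):
    # Emulates the new check in torch 1.12's _BaseDataLoaderIter.__init__
    if len(loader.pin_memory_device) == 0:
        return 'default pinning behaviour'
    return f'pin to {loader.pin_memory_device}'

print(iter_init(FakeLoaderNew()))   # default pinning behaviour
# iter_init(FakeLoaderOld())        # AttributeError, as reported above
```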


jim-king-2000 commented Jun 30, 2022

I still see the same error on fastai@2.7.4 with pytorch@1.12.0+cu116. Any idea?

@josiahls (Contributor)

@jim-king-2000 What error message/trace are you getting? Can you provide a minimal example? It passed the unit tests, so I need to know which scenario doesn't work.

@jim-king-2000

This is the sample:

```python
from fastai.vision.all import *
path = untar_data(URLs.PETS)/'images'

def is_cat(x): return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
        path, get_image_files(path), valid_pct=0.2, seed=42,
        label_func=is_cat, item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
```

And this is the error output:

```
jim@ML0-CACHE-HIT:~/test$ python3 test.py 
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=ResNet34_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet34_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
epoch     train_loss  valid_loss  error_rate  time    
Traceback (most recent call last):
  File "/home/jim/test/test.py", line 10, in <module>
    learn.fine_tune(1)
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/callback/schedule.py", line 168, in fine_tune
    self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/callback/schedule.py", line 122, in fit_one_cycle
    self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd, start_epoch=start_epoch)
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 241, in fit
    self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 179, in _with_events
    try: self(f'before_{event_type}');  f()
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 230, in _do_fit
    self._with_events(self._do_epoch, 'epoch', CancelEpochException)
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 179, in _with_events
    try: self(f'before_{event_type}');  f()
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 224, in _do_epoch
    self._do_epoch_train()
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 216, in _do_epoch_train
    self._with_events(self.all_batches, 'train', CancelTrainException)
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 179, in _with_events
    try: self(f'before_{event_type}');  f()
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/learner.py", line 185, in all_batches
    for o in enumerate(self.dl): self.one_batch(*o)
  File "/home/jim/.local/lib/python3.10/site-packages/fastai/data/load.py", line 132, in __iter__
    for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1009, in __init__
    super(_MultiProcessingDataLoaderIter, self).__init__(loader)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 594, in __init__
    self._shared_seed = loader._get_shared_seed()
AttributeError: '_FakeLoader' object has no attribute '_get_shared_seed'
```

@jim-king-2000

And the env:

```
jim@ML0-CACHE-HIT:~/test$ pip3 show fastai
Name: fastai
Version: 2.7.4
Summary: fastai simplifies training fast and accurate neural nets using modern best practices
Home-page: https://github.com/fastai/fastai/tree/master/
Author: Jeremy Howard, Sylvain Gugger, and contributors
Author-email: info@fast.ai
License: Apache Software License 2.0
Location: /home/jim/.local/lib/python3.10/site-packages
Requires: fastcore, fastdownload, fastprogress, matplotlib, packaging, pandas, pillow, pip, pyyaml, requests, scikit-learn, scipy, spacy, torch, torchvision
Required-by: 
jim@ML0-CACHE-HIT:~/test$ pip3 list | grep -i torch
torch                   1.12.0+cu116
torchaudio              0.12.0+cu116
torchvision             0.13.0+cu116
```

@jim-king-2000

And CUDA env:

```python
import torch
print(torch.__version__)
print(torch.version.cuda)
print(torch.backends.cudnn.version())
print(torch.cuda.is_available())
```

```
jim@ML0-CACHE-HIT:~/test$ python3 showcoda.py 
1.12.0+cu116
11.6
8302
True
```

@jim-king-2000

And the OS is Ubuntu 22.04.

@jim-king-2000

It works when I switch to PyTorch@1.11.0+cu113.

Then a new error occurs:

```
RuntimeError: CUDA out of memory. Tried to allocate 26.00 MiB (GPU 0; 1.95 GiB total capacity; 1.24 GiB already allocated; 4.38 MiB free; 1.26 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

It is resolved when I change the batch size to 3. However, I think fastai should make the batch size adaptive.
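The adaptive-batch-size idea can be sketched as a retry loop that halves the batch size whenever an OOM is raised. This is a hypothetical illustration (fastai does not do this), with the CUDA OOM simulated by a plain `RuntimeError` so the sketch runs without a GPU:

```python
def train_step(bs, budget=8):
    """Stand-in for one training pass; raises like CUDA OOM when bs exceeds budget."""
    if bs > budget:
        raise RuntimeError('CUDA out of memory. Tried to allocate ...')
    return bs

def fit_with_adaptive_bs(bs, min_bs=1):
    # Halve the batch size on OOM until the step fits or min_bs is passed
    while bs >= min_bs:
        try:
            return train_step(bs)
        except RuntimeError as e:
            if 'out of memory' not in str(e):
                raise  # unrelated errors propagate unchanged
            bs //= 2
    raise RuntimeError('could not fit even the minimum batch size')

print(fit_with_adaptive_bs(64))  # 8, the first size within the simulated budget
```

A real implementation would also need to rebuild the DataLoaders and free cached GPU memory between attempts, which is why libraries tend to leave the choice to the user.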

@josiahls (Contributor)

josiahls commented Jul 3, 2022

@jim-king-2000 This is not the same error. #3704 would be the correct place to post this.

Please post your hardware info there as well (use the original post in #3704 as an example of a hardware overview). Ref: #3704 (comment)

@jim-king-2000

Sure.
