OSError: [WinError 1455] paging file too small #7657

Closed
GiaZ90 opened this issue May 1, 2022 · 6 comments
Labels
question (Further information is requested)

Comments


GiaZ90 commented May 1, 2022

Search before asking

Question

I'm training a YOLOv5 network on the coco128 dataset with these specs:
i7 4770K
16 GB RAM
GTX 1080 8 GB

python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt

I'm hitting the common "paging file too small" issue, so I searched around and made this change to the source code, from

nw = max(round(hyp['warmup_epochs'] * nb), 100)  # number of warmup iterations, max(3 epochs, 100 iterations)

to

nw = 1

to lower the number of workers, but I'm still getting the same error.
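(Side note: from what I can tell, the dataloader worker count can also be lowered from the command line instead of editing train.py, via the --workers option that appears in the settings dump below, for example:

python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt --workers 2

The flag name is inferred from the workers=8 entry printed in the run configuration, so treat this as a sketch rather than a guaranteed fix.)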
Then I also tried changing the page allocation setting to "system managed size".
Below is my last attempt to run it.
P.S. I'm hitting this issue after setting everything up to run on the GPU to speed things up (with only the CPU it worked fine, but it's very slow...).

PS C:\Users\AdminZ\Desktop\Tirocinio\Yolo\yolov5> python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt
wandb: Currently logged in as: giateam (use `wandb login --relogin` to force relogin)
train: weights=yolov5s.pt, cfg=, data=coco128.yaml, hyp=data\hyps\hyp.scratch-low.yaml, epochs=3, batch_size=16, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs\train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
"git" non è riconosciuto come comando interno o esterno,
un programma eseguibile o un file batch.
Command 'git fetch && git config --get remote.origin.url' returned non-zero exit status 1.
"git" non è riconosciuto come comando interno o esterno,
un programma eseguibile o un file batch.
YOLOv5 2022-4-22 torch 1.11.0+cu113 CUDA:0 (NVIDIA GeForce GTX 1080, 8192MiB)

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
TensorBoard: Start with 'tensorboard --logdir runs\train', view at http://localhost:6006/
wandb: Tracking run with wandb version 0.12.15
wandb: Run data is saved locally in C:\Users\AdminZ\Desktop\Tirocinio\Yolo\yolov5\wandb\run-20220501_113706-11msfiny
wandb: Run wandb offline to turn off syncing.
wandb: Syncing run earthy-bird-19
wandb: View project at https://wandb.ai/giateam/train
wandb: View run at https://wandb.ai/giateam/train/runs/11msfiny
YOLOv5 temporarily requires wandb version 0.12.10 or below. Some features may not work as expected.

             from  n    params  module                                  arguments

0 -1 1 3520 models.common.Conv [3, 32, 6, 2, 2]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 18816 models.common.C3 [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 2 115712 models.common.C3 [128, 128, 2]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 3 625152 models.common.C3 [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 1182720 models.common.C3 [512, 512, 1]
9 -1 1 656896 models.common.SPPF [512, 512, 5]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 361984 models.common.C3 [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 90880 models.common.C3 [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 296448 models.common.C3 [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1182720 models.common.C3 [512, 512, 1, False]
24 [17, 20, 23] 1 229245 models.yolo.Detect [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model summary: 270 layers, 7235389 parameters, 7235389 gradients

Transferred 349/349 items from yolov5s.pt
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 57 weight (no decay), 60 weight, 60 bias
train: Scanning 'C:\Users\AdminZ\Desktop\Tirocinio\Yolo\datasets\coco128\labels\train2017.cache' images and labels... 1
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
val: Scanning 'C:\Users\AdminZ\Desktop\Tirocinio\Yolo\datasets\coco128\labels\train2017.cache' images and labels... 128
wandb: Currently logged in as: giateam (use wandb login --relogin to force relogin)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 269, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\AdminZ\Desktop\Tirocinio\Yolo\yolov5\train.py", line 26, in <module>
    import torch
  File "C:\Users\AdminZ\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\torch\__init__.py", line 126, in <module>
    raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\AdminZ\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.

Additional

No response

GiaZ90 added the question label May 1, 2022

GiaZ90 commented May 1, 2022

Looks like it's working a bit better now... I changed the memory allocation to ALL disks and it has started to compute.

GiaZ90 closed this as completed May 1, 2022
@wmcnally

@GiaZ90 Can you explain your solution in more detail? I'm facing the same issue. Thanks.


ashmalvayani commented Aug 16, 2022

Pass --workers 1 as an argument when you're running train.py, and also try a batch size of 8 instead of the default 16.
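With the coco128 command from earlier in this thread, that would look roughly like this (the flag names are taken from the settings dump above):

python train.py --img 640 --batch 8 --epochs 3 --data coco128.yaml --weights yolov5s.pt --workers 1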


yuenherny commented Nov 19, 2022

For me, it started to compute after I reduced the batch size. I encountered two different error messages:

ImportError: DLL load failed: The paging file is too small for this operation to complete.

and

OSError: [WinError 1455] The paging file is too small for this operation to complete.

but they are essentially the same in nature.


anay-p commented Jan 18, 2023

@GiaZ90 or anyone else, please explain what is meant by 'changing memory allocation to all disk'. How does one do that?


wrxhhh commented Feb 11, 2023

@anay-p The guide is here: https://www.thewindowsclub.com/increase-page-file-size-virtual-memory-windows
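For anyone who prefers the command line, the page-file change that guide walks through can, as far as I know, also be made from an elevated Command Prompt. A rough sketch only: the 16384 MB initial/maximum sizes are an illustrative assumption (pick values that fit your disk), and wmic is deprecated on newer Windows builds:

wmic computersystem where name="%computername%" set AutomaticManagedPagefile=False
wmic pagefileset where name="C:\\pagefile.sys" set InitialSize=16384,MaximumSize=16384

A reboot is usually required before the new page-file size takes effect.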
