
Correct usage of lmdb #37

Closed · N0manDemo opened this issue Feb 11, 2021 · 11 comments
Labels: bug (Something isn't working)

@N0manDemo

I used create_lmdb.py to create both my LR and HR datasets, and I was wondering how I should configure my options file.
Do the settings differ from using HR/LR image folders?

@victorca25
Owner

victorca25 commented Feb 12, 2021

Hello! Technically you only need to point the dataroot_HR and dataroot_LR to the correct directories ending in '.lmdb' and they should be loaded correctly. I haven't used lmdb in a while, so let me know how it goes!
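
For reference, the dataset section of the options file would then simply point at the .lmdb directories. A minimal sketch, assuming a YAML-style options file (only dataroot_HR, dataroot_LR and the '.lmdb' suffix come from this thread; the section names, keys and the LR paths are illustrative and depend on your own options file):

datasets:
  train:
    name: DIV2K
    dataroot_HR: '../../datasets/main/hr.lmdb'      # directory created by create_lmdb.py
    dataroot_LR: '../../datasets/main/lr.lmdb'      # hypothetical LR counterpart
  val:
    name: val_set14_part
    dataroot_HR: '../../datasets/main/val/hr.lmdb'
    dataroot_LR: '../../datasets/main/val/lr.lmdb'  # hypothetical

If your options file is JSON instead, the same keys apply.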

@victorca25
Owner

victorca25 commented Feb 12, 2021

The directories should look like:
train_HR.lmdb
├── data.mdb
├── lock.mdb
└── meta_info.txt
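
If you want to sanity-check a database before training, it can be opened directly with the standalone Python lmdb package. A quick illustrative sketch (the path is just an example, point it at your own *.lmdb directory):

import lmdb

# Example path; adjust to your own database directory
db_path = '../../datasets/main/hr.lmdb'

# Typical read-only settings for the consumer side of an lmdb database
env = lmdb.open(db_path, readonly=True, lock=False, readahead=False, meminit=False)
with env.begin(write=False) as txn:
    # Each entry maps an image key to its encoded bytes
    for i, (key, value) in enumerate(txn.cursor()):
        print(key.decode('utf-8'), len(value), 'bytes')
        if i >= 4:  # only show the first few entries
            break
env.close()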

@N0manDemo
Author

Hi victorca25,

I receive this error when loading images from my lmdb directory.

My log file: error_lmdb.log
My config file: train_esrgan.txt

21-02-12 17:31:47.474 - INFO: Random seed: 0
21-02-12 17:31:47.479 - INFO: Read lmdb keys from cache: ../../datasets/main/hr.lmdb/_keys_cache.p
21-02-12 17:31:47.479 - INFO: Dataset [LRHRDataset - DIV2K] is created.
21-02-12 17:31:47.479 - INFO: Number of train images: 44, iters: 6
21-02-12 17:31:47.479 - INFO: Total epochs needed: 83334 for iters 500,000
21-02-12 17:31:47.479 - INFO: Read lmdb keys from cache: ../../datasets/main/val/hr.lmdb/_keys_cache.p
21-02-12 17:31:47.479 - INFO: Dataset [LRHRDataset - val_set14_part] is created.
21-02-12 17:31:47.479 - INFO: Number of val images in [val_set14_part]: 44
21-02-12 17:31:47.631 - INFO: AMP library available
21-02-12 17:31:48.803 - INFO: Initialization method [kaiming]
21-02-12 17:31:49.020 - INFO: Initialization method [kaiming]
21-02-12 17:31:49.931 - INFO: AMP enabled
21-02-12 17:31:49.939 - INFO: Network G structure: DataParallel - RRDBNet, with parameters: 16,697,987
21-02-12 17:31:49.939 - INFO: Network D structure: DataParallel - Discriminator_VGG, with parameters: 14,502,281
21-02-12 17:31:49.939 - INFO: Model [SRRaGANModel] is created.
21-02-12 17:31:49.939 - INFO: Start training from epoch: 0, iter: 0
Traceback (most recent call last):
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 416, in
main()
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 412, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 215, in fit
for n, train_data in enumerate(dataloaders['train'], start=1):
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/dataloader.py", line 435, in next
data = self._next_data()
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/mnt/ext4-storage/Training/BasicSR/codes/data/LRHRC_dataset.py", line 224, in getitem
img_HR = util.read_img(self.HR_env, HR_path, out_nc=image_channels)
File "/mnt/ext4-storage/Training/BasicSR/codes/dataops/common.py", line 129, in read_img
img = fix_img_channels(img, out_nc)
File "/mnt/ext4-storage/Training/BasicSR/codes/dataops/common.py", line 139, in fix_img_channels
if img.ndim == 2:
AttributeError: 'NoneType' object has no attribute 'ndim'

@victorca25
Owner

Can you try adding the following at line 100 here: https://github.com/victorca25/BasicSR/blob/master/codes/dataops/common.py

print("env: ", env)

And let me know what it prints in the console?

@N0manDemo
Author

21-02-13 13:08:32.573 - INFO: Random seed: 0
21-02-13 13:08:32.594 - INFO: Read lmdb keys from cache: ../../datasets/main/hr.lmdb/_keys_cache.p
21-02-13 13:08:32.595 - INFO: Dataset [LRHRDataset - DIV2K] is created.
21-02-13 13:08:32.595 - INFO: Number of train images: 44, iters: 6
21-02-13 13:08:32.595 - INFO: Total epochs needed: 83334 for iters 500,000
21-02-13 13:08:32.596 - INFO: Read lmdb keys from cache: ../../datasets/main/val/hr.lmdb/_keys_cache.p
21-02-13 13:08:32.597 - INFO: Dataset [LRHRDataset - val_set14_part] is created.
21-02-13 13:08:32.597 - INFO: Number of val images in [val_set14_part]: 44
21-02-13 13:08:33.009 - INFO: AMP library available
21-02-13 13:08:36.369 - INFO: Initialization method [kaiming]
21-02-13 13:08:36.587 - INFO: Initialization method [kaiming]
21-02-13 13:08:38.641 - INFO: AMP enabled
21-02-13 13:08:38.648 - INFO: Network G structure: DataParallel - RRDBNet, with parameters: 16,697,987
21-02-13 13:08:38.649 - INFO: Network D structure: DataParallel - Discriminator_VGG, with parameters: 14,502,281
21-02-13 13:08:38.649 - INFO: Model [SRRaGANModel] is created.
21-02-13 13:08:38.649 - INFO: Start training from epoch: 0, iter: 0
env: None
env: None
env: None
env: None
env: None
Traceback (most recent call last):
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 416, in
main()
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 412, in main
fit(model, opt, dataloaders, steps_states, data_params, loggers)
File "/mnt/ext4-storage/Training/BasicSR/codes/train.py", line 215, in fit
for n, train_data in enumerate(dataloaders['train'], start=1):
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/dataloader.py", line 435, in next
data = self._next_data()
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/n0man/Envs/main/lib64/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/mnt/ext4-storage/Training/BasicSR/codes/data/LRHRC_dataset.py", line 224, in getitem
img_HR = util.read_img(self.HR_env, HR_path, out_nc=image_channels)
File "/mnt/ext4-storage/Training/BasicSR/codes/dataops/common.py", line 129, in read_img
img = fix_img_channels(img, out_nc)
File "/mnt/ext4-storage/Training/BasicSR/codes/dataops/common.py", line 139, in fix_img_channels
if img.ndim == 2:
AttributeError: 'NoneType' object has no attribute 'ndim'

(main) [n0man@fedora-desktop-n0man codes]$

@victorca25
Owner

Ok, so the problem is that for some reason the lmdb environment (env) is not being correctly passed to the read function:

env: None
env: None
env: None
env: None

I'm working on something else at the moment, but I'll try to take a look and see if I can find where the issue is.
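
For context, read functions in BasicSR-style code usually branch on the lmdb environment, which is why a missing env ends in the NoneType error: with env=None the lmdb key is treated as a file path, cv2.imread silently returns None, and fix_img_channels then fails on img.ndim. A simplified sketch of that pattern (illustrative only, not the exact code in dataops/common.py):

import cv2
import numpy as np

def read_img(env, path, out_nc=3):
    """Illustrative sketch of the usual read_img branching (not the repo's exact code)."""
    if env is None:
        # File-system branch: 'path' is expected to be an image file on disk.
        # If it is actually an lmdb key, cv2.imread returns None and the
        # caller later crashes on img.ndim.
        img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    else:
        # lmdb branch: 'path' is a key into the open lmdb environment
        # (the real code also handles the stored image shape/dtype).
        with env.begin(write=False) as txn:
            buf = txn.get(path.encode('utf-8'))
        img = cv2.imdecode(np.frombuffer(buf, dtype=np.uint8), cv2.IMREAD_UNCHANGED)
    return img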

@victorca25 victorca25 self-assigned this Feb 14, 2021
@victorca25 victorca25 added the bug Something isn't working label Feb 14, 2021
@victorca25
Owner

I may have found a solution, but it will take a while to commit, since I have been modifying the dataloaders and they are not yet in a state I can commit.

@N0manDemo
Author

N0manDemo commented Feb 15, 2021 via email

@victorca25
Owner

victorca25 commented Feb 15, 2021

Awesome! It won't be on hold; I just need one or two more days to finish testing and to get the dataloaders into a state that can be committed. I'll let you know when it's up.

@victorca25
Owner

@N0manDemo the updated datasets and lmdb codes have now been committed. Please refer to the wiki for more details about the updated lmdb. You will have to recreate the database with the script, but it should work much better now.

@N0manDemo
Author

Thank you.

lmdb is working now.
