
test new labels on this network #14

Closed
ranch-hands opened this issue Aug 6, 2021 · 8 comments

@ranch-hands

Hi @SebastianSchildt @SushkoVadim,
I used another network to create label maps from images (this one: https://github.com/CSAILVision/semantic-segmentation-pytorch) and then fed them to your OASIS network in test mode, but it doesn't work. It says:

[RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED]

Would you please help? Thank you so much.

@SebastianSchildt (Member)

Hi, I do not really do too much with NNs; on a lucky day I might run a TensorFlow example without fully knowing what it is doing 😹 Random unhelpful tech-support guess: update your NVIDIA drivers 🔧

-> I assume you might have me mixed up with another colleague who actually knows about this 😁

@SushkoVadim (Contributor)

Hi,

[RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED] is surely an error on your GPU/driver side, and not an error in the implementation of the neural networks. Could you please check that you have installed PyTorch correctly, that your NVIDIA drivers are compatible with the PyTorch version you are using, and that the NVIDIA drivers themselves are installed correctly?

As a first check, you may run

import torch
print(torch.cuda.is_available())  # should print True if the GPU is set up correctly

in your Python shell to verify that your GPU is configured properly.

@ranch-hands (Author) commented Aug 6, 2021

@SushkoVadim @edgarschnfld
Yes, the GPU and CUDA are available, and the network runs fine with the normal labels, so the error comes from my labels. Could you please tell me what I should do about the size and the number of classes of my label maps, which apparently do not match what the OASIS network expects?
Let me show you part of the complete error:

/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [164,0,0], thread: [7,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [164,0,0], thread: [8,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [164,0,0], thread: [9,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [164,0,0], thread: [10,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [164,0,0], thread: [11,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [223,0,0], thread: [2,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [223,0,0], thread: [3,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [223,0,0], thread: [4,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [222,0,0], thread: [52,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [222,0,0], thread: [53,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [222,0,0], thread: [54,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [222,0,0], thread: [55,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [222,0,0], thread: [56,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [222,0,0], thread: [57,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [222,0,0], thread: [63,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
Traceback (most recent call last):
File "test.py", line 36, in
generated = model(data_i, mode='inference')
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/SPADE/models/pix2pix_model.py", line 57, in forward
fake_image, _ = self.generate_fake(input_semantics, real_image)
File "/content/SPADE/models/pix2pix_model.py", line 195, in generate_fake
fake_image = self.netG(input_semantics, z=z)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/SPADE/models/networks/generator.py", line 89, in forward
x = self.fc(x)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 443, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 440, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

@SushkoVadim (Contributor)

Thanks for posting the error message.
It most likely indicates that the label maps are stored in a wrong format, so the pre-processing did not succeed in transforming them into a one-hot encoded tensor with the correct number of classes.
Could you verify that you store your label maps in the same format as our model expects (see https://github.com/NVlabs/SPADE#dataset-preparation for reference)?
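
For reference, the pre-processing builds the one-hot tensor roughly as in the sketch below (a simplified illustration rather than the repository's exact code; num_classes and the tensor shapes are placeholder values). If any label index falls outside [0, num_classes), the scatter_ call fails on the GPU with exactly the kind of "index out of bounds" assertions from ScatterGatherKernel.cu shown above:

import torch

num_classes = 151                                    # placeholder: ADE20K-style label count
label = torch.randint(0, num_classes, (1, 1, 4, 4))  # (batch, 1, H, W) integer class indices

one_hot = torch.zeros(1, num_classes, 4, 4)
one_hot.scatter_(1, label, 1.0)  # writes a 1 at each class index along dim 1; triggers the
                                 # "index out of bounds" assertion if any label is < 0
                                 # or >= num_classes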

@ranch-hands (Author)

Thanks for your reply.
Yes, I followed the link https://github.com/NVlabs/SPADE#dataset-preparation, but it doesn't say anything special, does it? It just says that we should put the dataset into folders such as annotations and images. I made the folders exactly as described for ADE20K, but as you know, the labels are obtained from https://github.com/CSAILVision/semantic-segmentation-pytorch.

@edgarschnfld (Contributor)

Hi,
Label maps exist in two formats: a map of integer indices of size (batch_size, height, width), or a one-hot version of size (batch_size, number_of_channels, height, width).

Your error indicates that something is wrong with the indices of the label maps or with number_of_channels. In particular, it means that the highest index (which equals the number of channels in the one-hot version) is out of range.

Use your favourite debugger or print statements to compare the original data with the new data you are trying to use (see the short sketch at the end of this comment). The new label maps should have the same size, the same minimum and maximum values (when looking at many label maps) or the same number of channels (in the one-hot format), as well as the same data type (long or float) as the data in the original code.

You might find that the lowest label index is sometimes -1 and sometimes 0 (all label indices shifted by 1), depending on the dataset and where you get it from.
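
A minimal comparison sketch along those lines (the file paths are placeholders; point them at one of the original ADE20K label maps and one of your newly generated ones):

import numpy as np
from PIL import Image

# placeholder paths: one original label map and one newly generated one
orig = np.array(Image.open('datasets/ade20k/annotations/ADE_train_00000001.png'))
new = np.array(Image.open('my_labels/example.png'))

for name, arr in (('original', orig), ('new', new)):
    print(name, 'shape:', arr.shape, 'dtype:', arr.dtype,
          'min:', arr.min(), 'max:', arr.max(),
          'unique labels:', len(np.unique(arr)))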

@ranch-hands (Author)

@edgarschnfld @taesungp
Thanks.
Is there a snippet that converts labels into a form suitable for the OASIS network?
I'm really confused about how to compare the two datasets and then convert my labels to the proper form.

@ranch-hands (Author)

It finally worked: I just remapped the values of each of my label maps based on the original label maps, which contain only 151 different labels.
Here is the code, for example for one test image:

import cv2

# ff is the label map loaded earlier as a numpy array (e.g. with cv2.imread);
# the original label maps use only 151 classes (indices 0-150), so any value
# above 150 is remapped to a valid class index
for i in range(len(ff)):
    for j in range(len(ff[i])):
        for k in range(len(ff[i][j])):
            if ff[i][j][k] > 150:
                ff[i][j][k] = 12

cv2.imwrite('/content/1.png', ff)
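
The same remapping can also be done in one vectorized step instead of the nested loops (the input path below is a placeholder):

import cv2

ff = cv2.imread('/content/label.png', cv2.IMREAD_UNCHANGED)  # placeholder input path
ff[ff > 150] = 12                                            # remap every out-of-range index at once
cv2.imwrite('/content/1.png', ff)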
