
Issues w/ running on Windows #1

Open
ad48hp opened this issue Mar 17, 2020 · 26 comments

@ad48hp

ad48hp commented Mar 17, 2020

File "D:\Cancer\neural-dream\neural_dream\models\googlenet\bvlc_googlenet.py", line 352, in forward pool5_7x7_s1 = F.avg_pool2d(inception_5b_output, kernel_size=(7, 7), stride=(1, 1), padding=(0,), ceil_mode=False, count_include_pad=False) RuntimeError: Given input size: (1024x4x4). Calculated output size: (1024x-2x-2). Output size is too small

Tried with multiple images, image sizes, and models.

Would be nice if you could get it working, as it's currently quite challenging to get Caffe running on Windows.

@ProGamerGov
Owner

@ad48hp That issue occurs when the smallest octave image size is too small for the model. A quick test with the GoogleNet Caffe models shows that they require both height and width to be higher than 223px.

This will result in an error because one of Octave 1's dimensions is less than 224px:

Performing 2 octaves with the following image sizes:
  Octave 1 image size: 208x278
  Octave 2 image size: 348x464

For the example images, I used -num_octaves 2 -octave_scale 0.6 -image_size 512, which resulted in Octave 1's smallest dimension being greater than 223:

Performing 2 octaves with the following image sizes:
  Octave 1 image size: 230x307
  Octave 2 image size: 384x512

Octaves, as I understand them, are pretty much the same thing as the multiscale resolution technique used with style transfer. The -octave_scale value tells the code how much to reduce the image in size, and -num_octaves tells the code how many times to do that.
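As a rough sketch of what that means (illustrative only, not the exact code in neural_dream.py; it assumes each smaller octave is the next one scaled by -octave_scale, with dimensions truncated to integers), the octave sizes from the examples above can be reproduced like this:

    # Illustrative sketch only (not the actual neural_dream.py code):
    # each smaller octave is the next larger one scaled by octave_scale,
    # and every octave must stay above the model's minimum input size
    # (> 223px per side for the GoogleNet models).
    def octave_sizes(height, width, num_octaves, octave_scale):
        sizes = [(height, width)]
        for _ in range(num_octaves - 1):
            height, width = int(height * octave_scale), int(width * octave_scale)
            sizes.append((height, width))
        return list(reversed(sizes))  # smallest octave first

    print(octave_sizes(384, 512, 2, 0.6))  # [(230, 307), (384, 512)] -> OK
    print(octave_sizes(348, 464, 2, 0.6))  # [(208, 278), (348, 464)] -> 208 is too small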

@ad48hp
Author

ad48hp commented Mar 22, 2020

Seems to work now, though I'm not sure why, but the DeepDreaming is incredibly slow with this one.
On the Caffe Linux version, an image took a few seconds to process; here one octave takes minutes (both on CPU).

Could you look into that?

This is the script I tried:
Dreamify

@ProGamerGov
Owner

@ad48hp I'll look into it. Have you tried using -backend mkl when using the CPU at all? That should make it faster.

@ad48hp
Author

ad48hp commented Mar 22, 2020

It doesn't seem to go any faster.
Also, what does the learning_rate parameter do here? Isn't that supposed to be used only during the training phase?

I can't find that one in the Dreamify script.

@ProGamerGov
Owner

ProGamerGov commented Mar 22, 2020

The -learning_rate parameter is the same thing as the 'step size' on other DeepDream projects. On Dreamify, it's called step_size. It's the size of the 'jump' or 'step' towards the network's goal. If it's too high you can overshoot, and if it's too low then it will take forever to reach the goal.

If the learning rate is set too high, the image will look like this:

out_1

If the learning rate is set too low, then very little if any change will occur on the output image.

Different models can require different learning rates, and you may have to play around with the values to find one that works. PyTorch VGG models for instance will require a really low learning rate, like 0.0055 or something like that. Caffe models tend to work better with learning rates of around 1-5.
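To make that concrete, here is a minimal, generic gradient-ascent sketch (illustrative only, not the actual neural-dream update code): the learning rate simply scales how far the image is pushed toward maximizing the chosen activations on each step.

    # Minimal, generic DeepDream-style step (illustrative only, not the
    # actual neural-dream optimizer). learning_rate scales each step of
    # gradient ascent on the input image.
    import torch

    def dream_step(img, net, learning_rate):
        img = img.detach().requires_grad_(True)
        loss = net(img).mean()   # the "goal": amplify the layer's activations
        loss.backward()
        with torch.no_grad():
            grad = img.grad
            # Normalize the gradient so learning_rate directly sets the step size.
            img = img + learning_rate * grad / (grad.abs().mean() + 1e-8)
        return img.detach()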

@ad48hp
Author

ad48hp commented Mar 22, 2020

Okay, it seems to be running fine now; maybe the speed improves after it's been run a few times.
I have a request, if you don't mind. It's quite challenging for me to find my way around other people's code, so it might take some time for me to implement it myself.

The code I wrote for myself back then was able to pick up the zooming where it last ended. So let's say I wanted 1000 pictures to be generated, but I stopped it at 437; when I ran it again, it continued with 438.

Could you implement that here?

@ProGamerGov
Owner

ProGamerGov commented Mar 23, 2020

@ad48hp For the moment you can try this: https://github.com/ProGamerGov/neural-dream/tree/output-image-name. The -output_start_num parameter will let you start the output number at a number of your choosing. I'll probably delete the branch and merge it into the master branch when I have more time, but for the moment all you need is the neural_dream.py file from the branch I linked.

I made the -output_start_num parameter start at the number that you give it, as I wasn't sure if it was better to have it be -output_start_num + 1. It's possible that I may change how it works.
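For example, continuing a run that stopped after writing image 437 might look something like this (the placeholder is whatever frame your run last saved, and the remaining parameters are just whatever you were using before):

    python neural_dream.py -content_image <last saved frame> -output_image output2/t.png -output_start_num 438 ...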

@ad48hp
Author

ad48hp commented Mar 23, 2020

The way that script worked is that it used path.isfile to search for the last image, and then used it.
Only if no images were present in the output directory did it use the input image.
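A minimal sketch of that kind of resume logic (a hypothetical helper, not part of neural-dream; it assumes outputs are numbered like t_1.png, t_2.png, ..., which may not match the script's actual naming):

    # Hypothetical resume helper: scan the output directory for numbered
    # frames and return the image to resume from plus the next frame
    # number. Falls back to the input image if nothing has been rendered.
    import os.path

    def find_start_image(output_dir, input_image, pattern="t_%d.png"):
        n = 1
        while os.path.isfile(os.path.join(output_dir, pattern % n)):
            n += 1
        if n == 1:
            return input_image, 1   # no frames yet: start from the input image
        return os.path.join(output_dir, pattern % (n - 1)), n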

Check this script.

@ad48hp
Author

ad48hp commented Mar 25, 2020

On Windows, the CUDA backend fails with "RuntimeError: error in LoadLibraryA".

I added the following to neural_dream.py after 'import torch':

    import ctypes
    ctypes.cdll.LoadLibrary('caffe2_nvrtc.dll')  # work around "error in LoadLibraryA"

and now it seems to work.

Src

@ad48hp
Author

ad48hp commented Mar 25, 2020

Also, I was able to reproduce the error I wrote you about! ^^
https://github.com/ad48hp/TemporaryFiles/tree/master/NeuralDream

python neural_dream.py -content_image steve-halama-paxhSiyjndU.png -model_file models/googlenet_places205.pth -init image -gpu 0 -num_iterations 600 -octave_iter 50 -octave_scale 0.4 -image_size 576 -learning_rate 6 -zoom 98 -backend cudnn -output_image output2/t.png

Can you reproduce the results?

@ProGamerGov
Owner

@ad48hp The parameters don't seem to raise that PyTorch error that you posted above. The content image you are using seems extremely bright, and using -adjust_contrast 99.999 seems to change that, but it changes the output look a bit.

@ad48hp
Author

ad48hp commented Mar 25, 2020

I meant that I was able to reproduce the issue I wrote about here, oddly enough with the Places205 model as well.

Original

Iteration 98

Iteration 201

What's weird is that at the beginning I wasn't getting those shapes at all, and now I get them consistently. Isn't there some sort of cache that keeps ruining it? (I tried removing the __pycache__ folders, which didn't change anything so far.)

@ProGamerGov
Owner

The learning rate that you are using seems like it might be a bit too high. Lowering it to 1.6 resulted in this output at 29 iterations:

out_29_lp0

Using -lap_scale 4, and a learning rate of 1.6, I got this at 11 iterations:

out_11_lp4

I've found that the best results sometimes require taking things slow initially, because the network has more time to build the details without overshooting.

@ad48hp
Author

ad48hp commented Mar 25, 2020

That successfully fixed the problem with Places205!
Sadly, Places365 seems to have the same problem.

python neural_dream.py -content_image steve-halama-paxhSiyjndU.png -model_file models/googlenet_places365.pth -init image -gpu 0 -num_iterations 600 -octave_iter 24 -octave_scale 0.4 -image_size 576 -learning_rate 0.66 -zoom 98 -backend cudnn -output_image output22/t.png

Iteration 18

Iteration 36

@ProGamerGov
Owner

@ad48hp I've found that results like those are normally caused by the inputs and parameters that you have chosen. The content image you are using may be affecting the results. You can try adding -channel_mode avg -channels 10 or -channel_mode weak -channels 10 -lap_scale 4 and potentially use a learning rate of around 1.5 if you want.
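For example, applied to the command above (this just combines the suggested flags with the earlier command; the exact values are starting points to experiment with):

    python neural_dream.py -content_image steve-halama-paxhSiyjndU.png -model_file models/googlenet_places365.pth -init image -gpu 0 -num_iterations 600 -octave_iter 24 -octave_scale 0.4 -image_size 576 -learning_rate 1.5 -zoom 98 -backend cudnn -channel_mode weak -channels 10 -lap_scale 4 -output_image output22/t.png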

@ad48hp
Author

ad48hp commented Mar 26, 2020

Ha, you seem to be way better at this than me!
Can you look at implementing this model?
https://github.com/vpulab/Semantic-Aware-Scene-Recognition

Under Places365 "Ours" it ends with "tar", but from what I can see there's no archive in it, so you can just remove the 'tar' extension from the filename completely.

@ad48hp
Author

ad48hp commented Mar 26, 2020

Also, the NSFW network doesn't seem to work:
AttributeError: 'ResNet_50_1by2_nsfw' object has no attribute 'inception_4d_3x3_reduce'

@ProGamerGov
Owner

ProGamerGov commented Mar 26, 2020

@ad48hp That's because it doesn't have the same layer names as the GoogleNet models. Use the -print_layers command when using the model, or check out the list of layer names here, to see the possible values.

@ProGamerGov
Owner

I've added the -output_start_num parameter to the master branch, in addition to some other new features and fixes: 9bea997

@ad48hp
Author

ad48hp commented Mar 27, 2020

Shouldn't the "If channel_mode is set to a value other than all, only the first value in the list will be used" part of the README also mention 'ignore'?

Also, can you now add one tiny thing: when output_start_num is set to a value greater than 1, it would automatically load the t(output_start_num - 1).png file from the output directory, or something like that?

@ProGamerGov
Owner

@ad48hp Good catch!

I'll have to think about how I'd implement such a feature. For the moment though, it's simple enough to manually find the highest image number.

@ProGamerGov
Owner

ProGamerGov commented Mar 28, 2020

I should mention that if you add before_ to the layer name of a GoogleNet, Inception, or ResNet model, like adding before_ to inception_4d_3x3_reduce to make before_inception_4d_3x3_reduce, the DeepDream layer will be placed before the specified layer and not right after it.
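For example, assuming the layer-selection parameter is -dream_layers (that flag name isn't shown elsewhere in this thread, so treat it as an assumption), the difference is just the before_ prefix on the layer name:

    python neural_dream.py -model_file models/googlenet_places205.pth -dream_layers before_inception_4d_3x3_reduce ...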

@ad48hp ad48hp closed this as completed Mar 28, 2020
@ad48hp ad48hp reopened this Mar 28, 2020
@ad48hp
Author

ad48hp commented Mar 28, 2020

Also, this is something I believe is not directly relevant to this repository, but for some reason DeepDream scripts run fast at the beginning and then, after about 20 minutes of runtime, slow down tremendously. Sometimes rerunning the app helps temporarily, but after a while the speed gets terrible again (first about 6 iterations per second, then dropping to about 1 iteration per 3 seconds). Reported GPU usage is consistently low (even when setting NVIDIA to the highest performance setting), and it stays slow even with MSI Afterburner.

CPU usage goes from 45% to about 24% after a while.
Changing the process priority of Python didn't seem to help either.

@ProGamerGov
Owner

I would expect that if there was a memory leak, usage would increase. I changed a line of code in the update branch that could fix the issue if it was caused by a bug in my code: https://github.com/ProGamerGov/neural-dream/tree/update. If that doesn't resolve it, then there could be something else going on.

@ad48hp
Author

ad48hp commented Mar 28, 2020

I'll try the update soon, hopefully.
The 'old' version now gave me this after a few hours of running:
Traceback (most recent call last):
  File "neural_dream.py", line 761, in <module>
    main()
  File "neural_dream.py", line 233, in main
    net(img)
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Cancer\neural-dream_2\neural_dream\dream_utils.py", line 296, in forward
    return self.net(self.input_net(input))
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Cancer\neural-dream_2\neural_dream\models\googlenet\googlenetplaces.py", line 275, in forward
    inception_4d_3x3_reduce = self.inception_4d_3x3_reduce(inception_4c_output)
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 534, in __call__
    hook_result = hook(self, input, result)
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Cancer\neural-dream_2\neural_dream\loss_layers.py", line 138, in forward
    self.loss = self.dream(output.clone()) * self.strength
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Cancer\neural-dream_2\neural_dream\loss_layers.py", line 104, in forward
    input = self.lap_pyramid(input)
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Cancer\neural-dream_2\neural_dream\dream_utils.py", line 134, in forward
    return self.lap_merge(self.pyramid_list(input))
  File "D:\Cancer\neural-dream_2\neural_dream\dream_utils.py", line 122, in pyramid_list
    input, hi = self.split_lap(input)
  File "D:\Cancer\neural-dream_2\neural_dream\dream_utils.py", line 116, in split_lap
    gt = self.gauss_blur_hi(input)
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Cancer\neural-dream_2\neural_dream\dream_utils.py", line 79, in forward
    return self.guass_blur(input)
  File "C:\Python38\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Cancer\neural-dream_2\neural_dream\dream_utils.py", line 62, in forward
    input = self.conv(input, weight=self.weight, groups=self.groups)
RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at C:/w/1/s/windows/pytorch/aten/src\THCUNN/generic/SpatialDepthwiseConvolution.cu:89

@ProGamerGov
Owner

That looks like it may be a PyTorch bug:

NVIDIA/apex#319
pytorch/pytorch#21819
https://discuss.pytorch.org/t/cuda-runtime-error-700-illegal-memory-access/71100/9

It doesn't look like anyone has been able to reliably reproduce the error though.
