Questions for run train.py #41

Closed
jiyoungAn opened this issue May 23, 2018 · 13 comments

@jiyoungAn

jiyoungAn commented May 23, 2018

Hi, I am a beginner in TensorFlow and I'm very interested in your paper.
I would be very grateful if you could answer my questions.

  1. Running on CPU only
    I am trying to run train.py using the CPU only,
    so I changed NUM_GPUS to 0 in inpaint.yml.
    However, an error related to reading GPU information still occurs, and the run stops after the line below.
    [2018-05-23 19:02:54 @weights_viewer.py:60] Total size of trainable weights: 0G 10M 184K 136B (Assuming32-bit data type.)
    Are there any settings that need to be changed, or is it impossible to run on the CPU only?

  2. Adding training images to an existing model
    I'd like to add some training images (800 images of shape (255, 255, 3)) to your Places2 model.
    My plan is:
      1. download the Places2 model you've built,
      2. place the model in the model_logs folder,
      3. train.
    Done this way, will my images be added to the existing model?

Regards, jiyoungAn

@JiahuiYu
Owner

  1. Please remove the lines here. It is highly recommended to use GPUs to train the model; otherwise, progress will be very slow.

  2. Yes, your images, with randomly sampled masks, will be used to fine-tune the existing model. Make sure you have validation images to track training progress; otherwise, the model may easily overfit your 800 images.
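
As an illustration of the validation advice above, here is a minimal sketch that splits a folder of images into training and validation file lists. It assumes the file lists are plain text with one image path per line; the function names, the default 10% split, and the extension filter are hypothetical and not taken from the repository.

```python
import os
import random


def split_file_list(image_dir, val_fraction=0.1, seed=0,
                    extensions=(".jpg", ".jpeg", ".png")):
    """Return (train_paths, val_paths) for the images found in image_dir."""
    # Collect image paths in a deterministic order before shuffling.
    paths = sorted(
        os.path.join(image_dir, name)
        for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(paths)
    # Hold out at least one image whenever any images exist.
    n_val = max(1, int(len(paths) * val_fraction)) if paths else 0
    return paths[n_val:], paths[:n_val]


def write_file_list(paths, out_path):
    """Write one path per line (assumed file list format)."""
    with open(out_path, "w") as f:
        for path in paths:
            f.write(path + "\n")
```

With 800 images and the default fraction, this holds out 80 images for validation and leaves 720 for fine-tuning. The two lists can then be written with write_file_list and referenced from the training configuration.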

@chenzhaiyu

Hi, I was able to run train.py on the CPU only after removing the lines you pointed out.

But when I run test.py, it raises the error "Error reading GPU information, set no GPU."

What other changes should I make to run test.py with your pretrained model, still on the CPU only?

Best regards.

@JiahuiYu
Owner

For testing, the corresponding lines should be removed from test.py.

@jiyoungAn
Author

thank you!!

@KangSH9776

KangSH9776 commented May 26, 2018

Thank you very much!!

I have an additional question.

The first step in the top question is "downloading the place2 model you've built".

How and where can I download the Places2 data?

@jiyoungAn
Author

You can download it from the "pretrained models" section here.
Click the dataset name. :)

@KangSH9776

KangSH9776 commented May 26, 2018

Thank you for your answers.

I deleted the line below from test.py and then ran it.
ng.get_gpus(1)

The following error occurred:
2018-05-26 17:21:06.800877: E tensorflow/stream_executor/cuda/cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2018-05-26 17:21:06.800926: E tensorflow/stream_executor/cuda/cuda_dnn.cc:427] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-05-26 17:21:06.800939: F tensorflow/core/kernels/conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted (core dumped)
What did I miss? ㅠㅠ

Can I ask one more question?
I'd like to train on the Places2 data myself rather than use the pretrained models.
How do I download the Places2 data for training?
Thank you very much for your time.

@jiyoungAn
Author

  1. I don't have much experience with TensorFlow, but
    if you write code like ng.get_gpus(1), it means that you will use GPU #1:
    os.environ['CUDA_VISIBLE_DEVICES']='1'
    Maybe you can check your GPU number again.

  2. In inpaint.yml, there is location (path) information for each dataset.

@JiahuiYu
Owner

If you only have one GPU, the numbering starts from 0: os.environ['CUDA_VISIBLE_DEVICES']='0'
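
A minimal sketch of how the device index maps onto the environment variable, assuming it is set before TensorFlow initializes CUDA (later changes are generally not picked up). The helper name set_visible_gpus is hypothetical. An empty value hides all GPUs from CUDA, which is one common way to force CPU-only execution.

```python
import os


def set_visible_gpus(gpu_ids):
    """Restrict CUDA to the given GPU indices; an empty list hides all GPUs."""
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Single-GPU machine: the only device has index 0.
# Run this before importing tensorflow so the setting takes effect.
set_visible_gpus([0])
```

Calling set_visible_gpus([]) instead sets the variable to an empty string, so no GPU is visible to the process.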

@JiahuiYu JiahuiYu reopened this May 27, 2018
@KangSH9776

Hello, I'm trying to train on the Places2 dataset, but I have run into a problem, so I am leaving a question.

I downloaded the "High-resolution images" listed under "Data of Places365-Standard", as instructed.

The flist file was downloaded from "CelebA-HQ".

I get the following error when executing "python train.py": "cuda has no input image".

I want to know how to prepare the data when training on Places2.

Thanks for reading.

@JiahuiYu
Owner

JiahuiYu commented Jun 3, 2018

The file paths in the file list may be wrong; please check them. Consider printing the paths inside the data_from_fnames.py file.
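
One way to run that check outside the training code is sketched below. It assumes the file list is plain text with one image path per line; the function name find_missing_paths is hypothetical and not part of the repository.

```python
import os


def find_missing_paths(flist_path):
    """Return (line_number, path) pairs for entries that are not existing files."""
    missing = []
    with open(flist_path) as f:
        for line_number, line in enumerate(f, start=1):
            path = line.strip()
            if not path:
                continue  # ignore blank lines
            if not os.path.isfile(path):
                missing.append((line_number, path))
    return missing
```

An empty result means every listed path resolves from the current working directory. Relative paths in the list are resolved against the directory the script is run from, so it is worth running the check from the same directory as train.py.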

@KangSH9776

Where is the data_from_fnames.py file?

@hexia11

hexia11 commented Apr 18, 2019

  1. Please remove the lines here. It is highly recommended to use GPUs to train the model; otherwise, progress will be very slow.
  2. Yes, your images, with randomly sampled masks, will be used to fine-tune the existing model. Make sure you have validation images to track training progress; otherwise, the model may easily overfit your 800 images.

It still stops after showing this line:
[2019-04-18 12:32:44 @weights_viewer.py:60] Total size of trainable weights: 0G 10M 184K 136B (Assuming32-bit data type.)
I have already removed the lines you mentioned. How can I fix this?
