feed_dict={K.learning_phase(): 1} while executing train_mitoses.py #3
Comments
and these are other errors that I've got during compiling:
I then tried something different: I changed my Anaconda version from 3.5 to 3.6 and my cuDNN version from v5.1 to v6.0.
After that, I reran the code with exactly the same configuration, but I still get the same error.
Hi @Narthyard, thanks for reaching out, and thanks for trying out the code! Interesting issue -- I have not run into this issue on my current setup, so this will be a good exercise to find out what is going on in the case of different setups. Looking at the error message, it looks like it is an issue either with the
I'll investigate this a bit further with the TensorFlow debugger and see if I can determine the exact op and get back to you.
Hi @dusenberrymw, I'd like to say a big thank you for your kind, quick response and your amazing effort. Do you mind if I ask what the version numbers of your current setup are? I looked around but couldn't find any information about it.
@dusenberrymw By the way, I didn't download the testing_image_data because I just wanted to train the model without validation. Is the error related to the missing test image data? I'm not sure about it; that may be a silly question.
@Narthyard I debugged this a bit, and it was indeed an issue with the output of the
As for the testing images, your original intuition was correct: for training the model, the testing images are not necessary. For clarity, the preprocess script will still split the training images into a training and validation set, but those still come from the original training folder, and not the testing folder.
Here is my current setup:
Please let me know if the latest change fixes the issue!
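For readers following along, the idea that the training and validation sets both come from the original training folder can be illustrated with a minimal sketch. This is not the code from preprocess_mitoses.py: the function name `split_train_val`, the `val_fraction` and `seed` parameters, and the example paths are all hypothetical, and the real script may split differently.

```python
import random


def split_train_val(image_paths, val_fraction=0.2, seed=0):
    """Split paths from ONE folder into (train, val) lists.

    Hypothetical helper for illustration only; it is not taken from
    preprocess_mitoses.py.
    """
    if not 0.0 <= val_fraction <= 1.0:
        raise ValueError("val_fraction must be in [0, 1]")
    # Sort first so the split does not depend on the input order.
    paths = sorted(image_paths)
    # A private Random instance keeps the shuffle reproducible.
    random.Random(seed).shuffle(paths)
    n_val = int(round(len(paths) * val_fraction))
    # The first n_val shuffled paths become validation, the rest training.
    return paths[n_val:], paths[:n_val]


# Example with made-up file names standing in for the training folder.
example_paths = ["training_image_data/%02d.tif" % i for i in range(10)]
train_paths, val_paths = split_train_val(example_paths, val_fraction=0.2, seed=1)
```

Both returned lists are drawn from the same input list, so no testing images are needed for this step.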
@dusenberrymw I grabbed your latest update hoping it would solve the problem (your solution did sound plausible), but unfortunately I am still facing the same problem.
Hi again @dusenberrymw, I'd like to mention that I tried your work on a Linux platform (Ubuntu 16.04 LTS). I reduced the training batch size because of limited GPU RAM, and it is working flawlessly.
Hi @Narthyard, that's great that you were able to successfully run it on Linux. Although I didn't post an update here yesterday, we continued to look into it, and @gweidner was able to reproduce the exact issue on a Windows setup. Thus, it appears to be a core TensorFlow issue. We'll create a minimal example that causes the same issue, and then look into options. For your particular workflow, is your Linux setup sufficient for you to be able to move forward for now, or is this issue on Windows still a blocker? Thanks again for trying out the code!
Hi @dusenberrymw, first of all, thank you for taking the time to work with me. You really went out of your way to make sure my case was resolved. So far so good: I'll carry on with the Linux platform for now. I'll limit the number of patches so that I can examine the training model efficiently and quickly. The current setup produces 782,490 patches, and my GPU and its RAM are far too limited. I tried to use them all, but feeding the model takes ages, so I interrupted it after a day and a half. For now, I will use a limited number of patches to make sure the system produces the desired outputs. Then I may try it on a supercomputer to examine the environment in a different way, so that I can also log the training time on a high-end setup. I'm looking forward to trying your freshly added and updated "predict_mitosis" code. I'll report any errors that come up. You are doing an amazing job; thanks again to both of you, @gweidner.
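As an illustration of training on a limited number of patches, here is a minimal sketch that draws a reproducible random subset from a list of patch paths. The helper `subsample_patches`, its parameters, and the example file names are hypothetical and are not part of the project's scripts; the sketch assumes the patches are addressable as a flat list of file paths.

```python
import random


def subsample_patches(patch_paths, max_patches, seed=0):
    """Return at most max_patches paths, chosen uniformly without replacement.

    Hypothetical helper for illustration only.
    """
    if max_patches < 0:
        raise ValueError("max_patches must be non-negative")
    paths = list(patch_paths)
    # Nothing to trim if the cap is not smaller than the dataset.
    if max_patches >= len(paths):
        return paths
    # A seeded Random instance makes the subset reproducible across runs.
    return random.Random(seed).sample(paths, max_patches)


# Example with made-up patch names; a real run would list files on disk.
all_patches = ["patch_%d.png" % i for i in range(1000)]
small_set = subsample_patches(all_patches, max_patches=50, seed=0)
```

Fixing the seed keeps the same subset between runs, which makes timing comparisons across machines easier to interpret.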
@Narthyard You're welcome, and thank you for taking the time to try out the codebase and report issues! I'm going to close this issue now and keep track of the dependency bug as a separate task. Thanks for reporting it and trying out the code!
I extracted train and validation patches using "python preprocess_mitoses.py". When I then tried to train the default VGG model with those patches, I got the error shown above.
I do not know how to deal with the problem, because I have no clue what the error itself means. Would you mind helping me? Thank you.
OS: Windows
tensorflow == 1.3.0 (cpu loaded)
Keras == 2.0.8
anaconda == 3.5.2
conda == 4.3.27