Error when initializing training #14

Closed
michjens opened this issue Aug 12, 2020 · 9 comments

Comments

@michjens

Whenever I train the model, I can prepare the data and it is located in the correct place, but every time I start the training I get an exception:

Exception in thread Thread-35:
Traceback (most recent call last):
  File "C:\Users\Michael\Anaconda3\envs\invoicetest\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
    return fn(*args)
  File "C:\Users\Michael\Anaconda3\envs\invoicetest\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\Michael\Anaconda3\envs\invoicetest\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
  [[{{node IteratorGetNext_1}}]]

I've tried everything I could think of, but I'm unable to get it to work.

Any tips would be much appreciated.

Thanks in advance

@naiveHobo
Owner

naiveHobo commented Aug 12, 2020

It looks like it's not able to find any prepared data for training. What is the size of your training dataset? Did you prepare the data before training?

The most likely reason I can think of is that the size of your training dataset is smaller than the batch size you're using.

@michjens
Author

Thanks for the quick reply

I just wanted to test it real quick, so I only made a couple of test files for training:

image

Each JSON only contains 4 different values as well

And yes, I did prepare the data before testing:
image

@naiveHobo
Owner

naiveHobo commented Aug 12, 2020

Ah right, the TensorFlow data loader throws that error because it expects some validation data. This is definitely an error that the user should at least be warned about, so thank you for pointing it out. For now, you need at least one sample for validation to satisfy the data loader, so a dataset of at least (batch_size + 1) samples should work.
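
The two constraints mentioned in this thread (a training set no smaller than the batch size, and at least one validation sample) could be checked before training starts. The following is only a minimal sketch, not code from the repository: the function name `check_dataset_sizes` and its parameters are hypothetical, and it encodes nothing beyond the two rules stated above.

```python
from typing import List


def check_dataset_sizes(n_train: int, n_val: int, batch_size: int) -> List[str]:
    """Return a list of problems with the split; an empty list means OK.

    Hypothetical helper: it only encodes the two rules from this thread.
    """
    problems = []
    if n_train < batch_size:
        problems.append(
            f"training set has {n_train} samples, fewer than batch_size={batch_size}"
        )
    if n_val < 1:
        problems.append("validation set is empty; at least one sample is needed")
    return problems


if __name__ == "__main__":
    # batch_size + 1 samples in total: 4 for training, 1 for validation
    print(check_dataset_sizes(n_train=4, n_val=1, batch_size=4))
    # a couple of test files and no validation sample
    print(check_dataset_sizes(n_train=2, n_val=0, batch_size=4))
```

Running a check like this before the training loop would turn the opaque `OutOfRangeError` into a readable message.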

@michjens
Author

That did the job, thanks a lot.

One last question though: is it normal for it to take this long with such a small batch? I've had it running for ~1 hour now and it's finished:
image

And I definitely don't have a bad PC.

Just curious if this is to be expected, so I don't end up waiting a day for it to complete only to find out that's not intended.

@naiveHobo
Owner

With a batch size of 4, I usually get about 0.9-1.0 batches/s on an Nvidia GTX 1050 Ti for an amount-based field. For a general field, I get about 1.0-1.1 batches/s. With a batch size of 8, it drops to about 0.65 and 0.45 batches/s respectively.
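
To compare these rates across batch sizes, it can help to convert batches/s into samples/s and into a rough wall-clock estimate. This is a small arithmetic sketch; the helper names are made up for illustration, and the step count passed to `seconds_for_steps` is whatever your own training run uses (none is given in this thread).

```python
def samples_per_second(batches_per_second: float, batch_size: int) -> float:
    """Throughput in samples/s for a given batch rate and batch size."""
    return batches_per_second * batch_size


def seconds_for_steps(n_steps: int, batches_per_second: float) -> float:
    """Rough wall-clock time to run n_steps batches at a constant rate."""
    return n_steps / batches_per_second


if __name__ == "__main__":
    # Rates quoted above for an amount-based field
    print(samples_per_second(0.9, 4))   # batch size 4, lower end of the range
    print(samples_per_second(0.65, 8))  # batch size 8
```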

Do you have a GPU? Is CUDA 10.1 installed? I've written the setup.py script so that it automatically detects whether CUDA is available and installs tensorflow-gpu if it is. Otherwise, it defaults to the TensorFlow CPU bindings. Can you check the TensorFlow logs to see whether you're actually using the GPU, if you have one?
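
Besides reading the logs, one way to check this is to ask TensorFlow which GPUs it can see. The snippet below is a sketch under assumptions: `visible_gpus` is a hypothetical helper, and it assumes a TensorFlow version that provides `tf.config.list_physical_devices` or the older `tf.config.experimental.list_physical_devices`.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TF is missing.

    Hypothetical helper; assumes a TensorFlow version that exposes
    list_physical_devices under tf.config or tf.config.experimental.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        # Older releases only expose the experimental variant
        lister = tf.config.experimental.list_physical_devices
    return [device.name for device in lister("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif not gpus:
        print("No GPU visible to TensorFlow; training will run on the CPU")
    else:
        print("GPUs visible to TensorFlow:", gpus)
```

An empty list means training runs on the CPU, which would explain much lower batches/s than the figures quoted above.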

@michjens
Author

That's my issue: I don't have a dedicated GPU on my work laptop. I just didn't expect it would be this rough on it given the limited amount of training data I threw at it, but if it's normal then I just need to be a bit patient.

@siddas27

siddas27 commented Sep 7, 2020

I am getting the same error, although I have more than 70 processed samples in train and 20 in val. @naiveHobo

@umairDms

umairDms commented Oct 7, 2020

Can you please explain how I can solve this? I'm still facing the same TensorFlow error on macOS using Python 3.7. @naiveHobo

@cobramostar

Training data alone is not enough.
Sorry, but I still don't understand what data to provide as validation data.

5 participants