Increasing the number of training images. Error Too many index array #28

Closed
alexdominguez09 opened this issue Jul 22, 2018 · 7 comments

Comments

@alexdominguez09

Hi,

I tried the scripts to train the autopilot with your dataset, and they work fine. I have now created my own dataset by installing a webcam and an angle encoder in my car, capturing around 30 images per second. So far I have two datasets: one with 62,000 images and their angles, and another with 22,000 images and angles.

When I train on each of the new datasets individually, they train and test fine. But when I merge both datasets, for a total of around 85,000 images and angles, the script driving_data.py crashes after step 100 of the first epoch, complaining about too many indices for an array at line 42:

x_out.append(scipy.misc.imresize(scipy.misc.imread(train_xs[(train_batch_pointer + i) % num_train_images])[-150:], [66, 200]) / 255.0)

I changed the batch size from 100 to 200, and the only difference is that it crashed at step 110.

What are the correct number of epochs, batch size, etc. for a larger number of images in the dataset?

Thanks

Alex

@alexdominguez09
Author

This is the exact error message:

Epoch: 0, Step: 60, Loss: 6.16457
Traceback (most recent call last):
  File "train.py", line 40, in <module>
    xs, ys = driving_data.LoadTrainBatch(batch_size)
  File "/media/alex/61170219-f8e4-49b1-bb7b-32dd8ba2b6f6/cnn/steering_wheel/Autopilot-TensorFlow/driving_data.py", line 44, in LoadTrainBatch
    x_out.append(scipy.misc.imresize(scipy.misc.imread(train_xs[(train_batch_pointer + i) % num_train_images])[-150:], [66, 200]) / 255.0)
IndexError: too many indices for array
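
For readers hitting the same traceback: NumPy raises this message when an array is indexed with more dimensions than it has. The sketch below is only an illustration, not a confirmed diagnosis of this case. It assumes, hypothetically, that a failed read produced a 0-dimensional array, and shows that the `[-150:]` crop then fails in the same way. `crop_bottom_rows` and the image shape are made up for the example.

```python
import numpy as np

def crop_bottom_rows(image, rows=150):
    """Mimic the `[-150:]` crop from the failing line on an arbitrary array."""
    return image[-rows:]

# A normal height x width x channels image (arbitrary example size) crops fine.
good = np.zeros((256, 455, 3), dtype=np.uint8)
cropped = crop_bottom_rows(good)

# Assumption: a bad sample yields a 0-dimensional array (hypothetical stand-in).
bad = np.array(0)
try:
    crop_bottom_rows(bad)
    error_message = None
except IndexError as exc:
    # The slice asks for one dimension, but the array has none.
    error_message = str(exc)
```

Printing `error_message` shows text beginning with "too many indices for array"; the exact wording after that depends on the NumPy version.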

@SullyChen
Owner

How are you appending the data? Are you making sure to update the .txt file with the new images you added?

@alexdominguez09
Author

Yes. I updated the data.txt file that driving_data.py references.

In fact, training and testing work fine when the dataset is small (22,000 images and angles).

Is there anything to take into account if I want to train with many more images, let's say 500,000 or one million? Do I need to modify the epochs, batch size, or any other constant in any script?

Thanks

@SullyChen
Owner

There shouldn't be anything you need to do. For debugging, you can try the following: print(driving_data.num_images) and see what it outputs.
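
The indexing expression in the failing line, `(train_batch_pointer + i) % num_train_images`, wraps around the end of the dataset, which is consistent with the dataset size itself not needing any special handling. A minimal sketch of that wrap-around follows; `batch_indices` and `steps_per_epoch` are hypothetical helpers written for illustration, not functions from the repository.

```python
def batch_indices(pointer, batch_size, num_images):
    """Return the dataset indices for one batch, wrapping at the end."""
    return [(pointer + i) % num_images for i in range(batch_size)]

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to visit every image at least once."""
    return -(-num_images // batch_size)  # ceiling division

# Wrap-around near the end of a tiny 10-image dataset.
wrapped = batch_indices(pointer=8, batch_size=4, num_images=10)

# With roughly 85,000 images and a batch size of 100.
merged_steps = steps_per_epoch(85000, 100)
```

Under these assumptions a larger dataset only changes how many steps make up one pass over the data, not the indexing logic.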

@alexdominguez09
Author

It is solved now. It turns out that some data within my dataset was corrupt, probably a special character. I could not find the exact failing line, as there are tens of thousands of lines, so I removed the chunks of data containing the corrupted lines and it worked.

So I take it that no changes to the model are needed to train on thousands or even a million lines.

thanks

@coral-agc

coral-agc commented Nov 18, 2019

Hey Alex,

I am having the same problem and wanted to ask for some clarification: are you saying that some of your image files were corrupt, or that some lines within your .txt file were corrupt?

I've run train.py four separate times and it has failed with this error after steps 120, 220, 180, and then 40 in epoch 0, so I'm trying to work out what would cause the inconsistency.

Thanks!
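
One possible explanation for the varying step, offered as an assumption rather than something this thread confirms: if the loader randomizes the sample order on each run, a handful of bad entries will land in different batches every time. The sketch below simulates that with made-up numbers; `first_failing_step`, the bad positions, and the seeds are all hypothetical.

```python
import random

def first_failing_step(num_samples, bad_positions, batch_size, seed):
    """Return the first step whose batch contains a bad sample after shuffling."""
    order = list(range(num_samples))
    # Assumption: the sample order is randomized once per run.
    random.Random(seed).shuffle(order)
    bad = set(bad_positions)
    for start in range(0, num_samples, batch_size):
        if bad.intersection(order[start:start + batch_size]):
            return start // batch_size
    return None  # no bad sample was encountered

# Hypothetical run: 85,000 samples, 3 corrupt entries, batch size 100.
steps = [first_failing_step(85000, [10, 40000, 84999], 100, seed)
         for seed in range(4)]
```

Each seed stands in for a separate training run, so `steps` holds one failure step per simulated run.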

@alexdominguez09
Author

Dear coral-agc,

I meant that the corruption was in my .txt files, where I discovered lines with no values, or with strange values and characters (a result of how I created those .txt files, over a serial-to-USB connection via an Arduino). My image files were fine.

I hope that helps.

Regards.
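
Rather than deleting chunks of data by hand, the kinds of corruption described above (empty lines, strange values, stray characters) can be located with a small scan of the .txt file. This is a minimal sketch, assuming each line has the form `<image filename> <angle>`; that format, the function name `find_bad_lines`, and the optional `image_dir` check are assumptions for illustration, not part of the repository.

```python
import math
import os

def find_bad_lines(txt_path, image_dir=None):
    """Return (line_number, reason) pairs for lines that do not look like
    '<image filename> <angle>' (assumed format)."""
    problems = []
    # errors="replace" keeps the scan going when bytes are not valid UTF-8.
    with open(txt_path, encoding="utf-8", errors="replace") as f:
        for number, line in enumerate(f, start=1):
            fields = line.split()
            if len(fields) != 2:
                problems.append((number, "expected 2 fields, found %d" % len(fields)))
                continue
            name, angle_text = fields
            if "\ufffd" in line or not (name + angle_text).isprintable():
                problems.append((number, "undecodable or non-printable characters"))
                continue
            try:
                angle = float(angle_text)
            except ValueError:
                problems.append((number, "angle is not a number: %r" % angle_text))
                continue
            if not math.isfinite(angle):
                problems.append((number, "angle is not finite: %r" % angle_text))
                continue
            # Optional check that the referenced image exists on disk.
            if image_dir is not None and not os.path.isfile(os.path.join(image_dir, name)):
                problems.append((number, "image file not found: %r" % name))
    return problems
```

Calling `find_bad_lines` on the merged data file and printing each `(line_number, reason)` pair gives the exact lines to fix or drop, so the rest of the dataset can be kept.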
