The objective of this project is to train a CNN model on images of different vehicles and to use a sliding-window approach to detect different types of vehicles in an image.
Download and install the following packages:
- tensorflow
- tflearn
- h5py
- hdf5
- SciPy
- numpy
- cv2
We captured a video near a local traffic signal with a camera and used MATLAB to extract images from it.
After extracting the images from the video with the crop_data.m file, we saved them in a separate folder and labeled them as follows:
- 0 for rickshaws.
- 1 for cars.
- 2 for bikes.
- 4 for trucks/buses.
- 3 for non-vehicles.
In our case the dataset contains 951 images: 127 images were labeled as 0, 409 as 1, 195 as 2, 219 as 3, and the remaining images as 4.
See dataset.txt for the file format.
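As a quick sanity check on the labels, the class distribution can be counted straight from the dataset file. The sketch below assumes each line of dataset.txt has the form `<image_path> <class_id>`, which is the layout tflearn's image preloader reads in file mode; the sample paths and the `count_labels` helper are hypothetical and only serve the illustration.

```python
from collections import Counter

# Label ids as listed above.
CLASS_NAMES = {0: "rickshaw", 1: "car", 2: "bike", 3: "non-vehicle", 4: "truck/bus"}

def count_labels(lines):
    """Count images per class from lines of the form '<image_path> <class_id>'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last whitespace so paths containing spaces still parse.
        _path, label = line.rsplit(None, 1)
        counts[int(label)] += 1
    return counts

# Hypothetical sample entries; the real paths in dataset.txt will differ.
sample = [
    "images/img_0001.jpg 1",
    "images/img_0002.jpg 0",
    "images/img_0003.jpg 1",
    "images/img_0004.jpg 3",
]
for class_id, n in sorted(count_labels(sample).items()):
    print(CLASS_NAMES[class_id], n)
```

To check the real file, pass `open("dataset.txt")` instead of `sample` and compare the counts with the numbers reported above.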
Some positive and negative images are as follows:
Images containing multiple vehicles, extracted from another video that was not used for training:
The following files are needed to run the model:
- train_project.py: trains the model on the dataset.
- test.py: loads an image for testing and detects vehicles using the sliding-window approach.
- dataset.txt: list of the images in a formatted order.
- crop_data.m: MATLAB code for cropping images from the video.
First run the MATLAB code on the captured video to crop the images and create a dataset file named dataset.txt, then run the following commands in a terminal:
python train_project.py
python test.py
We use a CNN because CNNs are well suited to image datasets. The architecture is as follows:
- Input data shape: 100x100 pixels with 3 color channels
- Conv: 64 filters of size 3x3 with ReLU activation
- Pooling: with filter size 2x2
- Conv: 32 filters of size 3x3 with ReLU activation
- Pooling: with filter size 2x2
- Conv: 32 filters of size 3x3 with ReLU activation
- Pooling: with filter size 2x2
- Fully Connected: with 256 neurons and ReLU activation and dropout with probability 0.75
- Fully Connected: with 256 neurons and ReLU activation and dropout with probability 0.75
- Fully Connected output layer: 5 neurons (equal to the number of classes) with softmax activation.
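To see how the layers above transform a 100x100x3 input, the following sketch traces the output shape and trainable parameter count of each layer. It is plain Python arithmetic, not the training code. It assumes 'same' padding with stride 1 for the convolutions and stride 2 for the 2x2 pooling layers; the source does not state these settings, so the numbers change if train_project.py uses different ones. The `trace_network` helper is hypothetical.

```python
import math

def same_pad_out(size, stride):
    """Spatial output size under 'same' padding: ceil(size / stride)."""
    return math.ceil(size / stride)

def trace_network(height=100, width=100, channels=3):
    """Return (layer name, output shape, trainable parameter count) per layer."""
    layers = []
    h, w, c = height, width, channels
    for i, filters in enumerate([64, 32, 32], start=1):
        # 3x3 convolution, stride 1, 'same' padding (assumed): spatial size unchanged.
        params = 3 * 3 * c * filters + filters  # weights + biases
        c = filters
        layers.append((f"conv{i}", (h, w, c), params))
        # 2x2 max pooling with stride 2 (assumed): spatial size halves, rounded up.
        h, w = same_pad_out(h, 2), same_pad_out(w, 2)
        layers.append((f"pool{i}", (h, w, c), 0))
    features = h * w * c  # flattened input to the first fully connected layer
    for name, units in [("fc1", 256), ("fc2", 256), ("output", 5)]:
        layers.append((name, (units,), features * units + units))
        features = units  # dropout adds no trainable parameters
    return layers

for name, shape, params in trace_network():
    print(f"{name:7s} {str(shape):15s} {params:>9,d}")
```

Under these assumptions almost all of the parameters sit in the first fully connected layer, because it is fed the flattened output of the last pooling layer.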
The tflearn image preloader was used to load the training dataset from a file.
We ran the CNN model for 5 epochs and got an accuracy of 0.8398 with a validation accuracy of 0.8796, using a learning rate of 0.001 with the Adam optimizer, with 80% of the data used for training and 20% used for validation.
Running the CNN model for 20 epochs gives an accuracy of 0.9454 with a validation accuracy of 0.8586, using a learning rate of 0.001 with the Adam optimizer.
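For reference, an 80/20 split of 951 images can be sketched as below. This only illustrates the arithmetic of the split; how train_project.py actually divides the data is not shown here, and the `split_indices` helper and its fixed seed are hypothetical.

```python
import random

def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)  # fixed seed keeps the split reproducible
    n_train = int(n_samples * train_fraction)
    return order[:n_train], order[n_train:]

train_idx, val_idx = split_indices(951)
print(len(train_idx), len(val_idx))  # 760 training and 191 validation images
```

With a validation set this small, differences of one or two percentage points between runs correspond to only a handful of images.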
This is the original image on which we perform testing.
A sliding-window approach is used with a window of width and height 100x100.
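A minimal sketch of how 100x100 windows can be enumerated over an image is shown below. The step size of 50 pixels and the `sliding_windows` helper are assumptions made for illustration; test.py may use a different stride or scan order.

```python
def sliding_windows(img_h, img_w, win=100, step=50):
    """Yield (x, y, w, h) for each win x win window that fits inside the image."""
    for y in range(0, img_h - win + 1, step):
        for x in range(0, img_w - win + 1, step):
            yield x, y, win, win

# Example on a hypothetical 400x300 (width x height) image.
boxes = list(sliding_windows(img_h=300, img_w=400))
print(len(boxes))   # number of windows to classify
print(boxes[0])     # top-left window
print(boxes[-1])    # bottom-right window
```

With an image loaded as a NumPy array, each window is cropped as `image[y:y + h, x:x + w]` and passed to the trained model for classification. A smaller step gives denser coverage at the cost of more windows to classify.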
We then apply the same procedure to the original image, perform testing, and get the following result.
Using all_rec gives many rectangles. To avoid this, we use the group_rec function instead and get the following results.
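The implementation of group_rec is not shown here, so the sketch below substitutes a different, commonly used technique for reducing overlapping detections: greedy non-maximum suppression based on intersection over union (IoU). The function names, the 0.3 threshold, and the example boxes and scores are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Keep the highest-scoring box in each cluster of overlapping boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return [boxes[i] for i in kept]

# Two overlapping detections of one vehicle plus one separate detection.
boxes = [(0, 0, 100, 100), (50, 0, 100, 100), (300, 200, 100, 100)]
scores = [0.90, 0.80, 0.95]
print(non_max_suppression(boxes, scores))
```

Here the second box overlaps the first with an IoU of one third, which is above the threshold, so only the higher-scoring of the two survives.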
This project aims to detect vehicles in unseen data using a CNN. The main difficulty so far was capturing a video of at least 25 minutes so that at least 1000 images could be cropped from it. Cropping is done with MATLAB code, but it requires selecting points manually in each frame, which made it a laborious task. In addition, training the CNN for 5 and for 20 epochs gives different accuracies, which was a challenge as well. To get better results (higher accuracy), we can increase the number of epochs and try different learning rates.