A project focused on autonomous vehicle perception. This repository serves as a starting point for implementing object detection and depth estimation in autonomous systems using the YOLO architecture and Vision Transformers.
AutoPercept uses YOLO (You Only Look Once) for real-time object detection and tracking, and a Vision Transformer, MiDaS, for monocular depth estimation. The detection model is trained on the KITTI dataset, enabling accurate detection and depth estimation of objects in various driving scenarios. The open-source YOLOv8 weights from Ultralytics were used as the starting point for object detection training. Depth estimation is zero-shot: MiDaS has already been trained on the KITTI dataset, so no further training is required (you can learn more about MiDaS at https://github.com/isl-org/MiDaS).
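As a rough illustration of how the two models fit together, the sketch below loads YOLOv8 weights through the `ultralytics` package and the MiDaS model through `torch.hub`, then runs both on a single frame. It is a minimal sketch, not the repository's actual pipeline, and file names such as `Model_Weights/best.pt` and `kitti_frame.png` are placeholders.

```python
# Minimal sketch: YOLOv8 detection + MiDaS zero-shot depth on one frame.
# Assumes `pip install ultralytics torch opencv-python timm`; paths are placeholders.
import cv2
import torch
from ultralytics import YOLO

device = "cuda" if torch.cuda.is_available() else "cpu"

# Object detection with custom-trained YOLOv8 weights (placeholder path).
detector = YOLO("Model_Weights/best.pt")

# MiDaS depth model and its matching input transform from torch.hub.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").to(device).eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

frame = cv2.imread("kitti_frame.png")            # BGR image, placeholder file
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

# 1) Detections: boxes, classes, confidences for the frame.
detections = detector(frame)[0]

# 2) Relative (inverse) depth map for the same frame.
with torch.no_grad():
    pred = midas(transform(rgb).to(device))
    depth = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=rgb.shape[:2], mode="bicubic", align_corners=False
    ).squeeze().cpu().numpy()

print(detections.boxes.xyxy, depth.shape)
```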
- YOLO Object Detection: Real-time detection of objects using the YOLO architecture.
- Real-Time Counting: Counts and displays the number of detections for each class (see the counting sketch after this list).
- Pythonic UI: AutoPercept can also be used as a wrapper or interface to run inference on videos with custom model weights.
- Depth Estimation: Monocular depth estimation using MiDaS, a Vision Transformer.
- Saveable Results: Saves the video with the detections to a given directory under a given name.
- KITTI Dataset: Trained on the KITTI dataset, which includes the object categories commonly encountered in autonomous driving scenarios.
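The per-class counting can be reproduced directly from YOLO's results object. A minimal sketch, assuming an `ultralytics` Results instance named `detections` like the one in the snippet above:

```python
# Sketch: count detections per class for a single frame (assumes an
# ultralytics Results object called `detections`, as in the snippet above).
from collections import Counter

class_ids = detections.boxes.cls.int().tolist()           # class index per box
counts = Counter(detections.names[i] for i in class_ids)  # e.g. {"Car": 4, "Pedestrian": 2}

for name, n in counts.items():
    print(f"{name}: {n}")
```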
- Clone the Repository:

      git clone https://github.com/yourusername/AutoPercept.git
      cd AutoPercept

- Setup Environment:

      # Install required dependencies
      pip install -r requirements.txt

- Running the Gooey App:

      python AutoPercept.py

  or

      python ViT.py
- Specifying Pre-inference Parameters
  - NOTE: Trained weights can be found in the "Model_Weights" directory
- After specifying the pre-inference parameters, hit the "Start" button to begin inference
- A new window will open showing the detections being made for each frame of the video
- Once inference is complete, the video is saved in the 'output_path' directory (a rough sketch of this workflow follows below)
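The workflow above can be approximated with a short sketch: a Gooey-decorated entry point that collects the pre-inference parameters, and a loop that runs detection on each frame, displays it, and writes the annotated video. This is illustrative only, not the exact code in `AutoPercept.py`; the parameter names (`weights`, `video_path`, `output_path`, `output_name`) only loosely mirror the UI fields.

```python
# Illustrative sketch of the Gooey app's flow; not the exact AutoPercept.py code.
import os
import cv2
from gooey import Gooey, GooeyParser
from ultralytics import YOLO

def run_inference(weights, video_path, output_path, output_name):
    """Run YOLO on every frame, display it, and save the annotated video."""
    model = YOLO(weights)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(os.path.join(output_path, output_name),
                          cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        annotated = model(frame)[0].plot()      # frame with boxes and labels drawn
        cv2.imshow("AutoPercept", annotated)
        out.write(annotated)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    out.release()
    cv2.destroyAllWindows()

@Gooey(program_name="AutoPercept")
def main():
    parser = GooeyParser(description="Run object detection on a video")
    parser.add_argument("weights", widget="FileChooser", help="Path to model weights (.pt)")
    parser.add_argument("video_path", widget="FileChooser", help="Input video")
    parser.add_argument("output_path", widget="DirChooser", help="Directory for the saved video")
    parser.add_argument("output_name", help="File name for the saved video, e.g. result.mp4")
    args = parser.parse_args()
    run_inference(args.weights, args.video_path, args.output_path, args.output_name)

if __name__ == "__main__":
    main()
```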
- Precision, Recall, Mean Average Precision@IoU=0.5 and Mean Average Precision@IoU=0.5-0.95 for each class over the validation dataset
- Model Training Performance
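If you want to reproduce these metrics yourself, the `ultralytics` validation API exposes them directly. A short sketch, assuming a KITTI-format dataset config; "kitti.yaml" is a placeholder name, not a file shipped with this repository:

```python
# Sketch: compute Precision, Recall, mAP@0.5 and mAP@0.5-0.95 over a validation split.
from ultralytics import YOLO

model = YOLO("Model_Weights/best.pt")
metrics = model.val(data="kitti.yaml", split="val")

print("Precision:    ", metrics.box.mp)     # mean precision over classes
print("Recall:       ", metrics.box.mr)     # mean recall over classes
print("mAP@0.5:      ", metrics.box.map50)
print("mAP@0.5-0.95: ", metrics.box.map)
```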
- Monocular Depth Estimation: I aim to add simultaneous depth estimation and object detection very soon. (This functionality has been added.)
- FPS Optimization: The model was running at a mediocre FPS. (This was caused by a bug that prevented the program from utilizing the GPU; it has now been fixed.)
- Quantization: Currently, the MiDaS depth estimation model runs at 1-10 FPS depending on the image resolution. To make inference faster, I plan to add 4-bit quantization, converting the model weights from their 32-bit floating-point representation to a 4-bit one.
- Simultaneous Localization and Mapping (SLAM): Since the KITTI dataset also contains LiDAR and 3D point cloud data, it will be possible to add functionality for visualizing SLAM processes in real time.
- Object Tracking and Projection: Using a Kalman filter, I am currently working on methods to track detected objects' movements and visualize them in real time (a minimal sketch follows this list).
- YOLOv10: Since this project began, YOLOv10 has been released. Ultralytics state that this new model beats all SOTA object detection benchmarks. I will soon add support for YOLOv10 inference.
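As a preview of the tracking direction mentioned above, here is a minimal constant-velocity Kalman filter over a bounding-box centre, written with plain NumPy. It illustrates the idea only and is not the project's tracker; all matrices and the example measurements are illustrative.

```python
# Sketch: constant-velocity Kalman filter for a bounding-box centre (cx, cy).
# State is [cx, cy, vx, vy]; measurements are [cx, cy]. Illustrative only.
import numpy as np

dt = 1.0                                          # one frame per step
F = np.array([[1, 0, dt, 0],                      # state transition
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],                       # we only observe position
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 1e-2                              # process noise
R = np.eye(2) * 1.0                               # measurement noise

x = np.zeros((4, 1))                              # initial state
P = np.eye(4) * 10.0                              # initial covariance

def kf_step(x, P, z):
    """Predict, then correct with measurement z = [[cx], [cy]]."""
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                                 # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Example: feed the centre of one tracked object's detection per frame.
for cx, cy in [(100, 200), (103, 198), (107, 197)]:
    x, P = kf_step(x, P, np.array([[cx], [cy]], dtype=float))
print("Estimated position/velocity:", x.ravel())
```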