ESP32-S3-CAM Object Detection using CNN

Real-time object detection system using ESP32-S3-CAM with COCO-SSD model running in the browser.

Features

Live video streaming from ESP32-S3-CAM
Real-time object detection using TensorFlow.js COCO-SSD model
Detects 80 common object classes
Bounding boxes with confidence scores
Browser-based AI processing (no server required)
Configurable detection interval (default: 2 seconds)

Hardware Requirements

ESP32-S3-CAM board
USB cable for programming
WiFi network

Software Requirements

Arduino IDE (1.8.x or later)
ESP32 board support for Arduino
Web browser (Chrome recommended for best performance)

Installation

1. Install ESP32 Board Support

Open Arduino IDE
Go to File > Preferences

Add this URL to "Additional Board Manager URLs":

https://dl.espressif.com/dl/package_esp32_index.json

Go to Tools > Board > Boards Manager
Search for "esp32" and install "esp32 by Espressif Systems"

2. Upload Code

Clone or download this repository
Open esp32_modular.ino in Arduino IDE

Configure your WiFi credentials in the code:

const char* ssid = "YOUR_WIFI_SSID";
const char* password = "YOUR_WIFI_PASSWORD";

Select board: Tools > Board > ESP32 Arduino > ESP32S3 Dev Module
Select the correct COM port
Click Upload

Usage

After uploading, open the Serial Monitor (115200 baud)
Wait for the ESP32 to connect to WiFi
Note the IP address displayed in Serial Monitor
Open your web browser and navigate to: http://[ESP32_IP_ADDRESS]
Wait for the COCO-SSD model to load (~25MB, may take 30-60 seconds)
Point the camera at objects to detect them

Detectable Objects

The COCO-SSD model can detect 80 object classes including:

People & Animals:

person, dog, cat, bird, horse, cow, sheep, bear, zebra, giraffe

Vehicles:

bicycle, car, motorcycle, airplane, bus, train, truck, boat

Indoor Objects:

chair, couch, bed, dining table, toilet, tv, laptop, mouse, keyboard, cell phone, book

Kitchen Items:

bottle, wine glass, cup, fork, knife, spoon, bowl, banana, apple, sandwich, orange

And many more...

Configuration

Change Detection Interval

Edit the interval in html_page.h:

setInterval(detectObjects, 2000); // Change 2000 to desired milliseconds

Examples:

1 second: 1000
5 seconds: 5000
500ms: 500

Change Camera Resolution

Edit in esp32_modular.ino:

config.frame_size = FRAMESIZE_VGA; // Options: QVGA, VGA, SVGA, XGA, UXGA

Project Structure

esp32_modular/
├── esp32_modular.ino    # Main ESP32 code (camera, WiFi, web server)
├── html_page.h          # Web interface (HTML, CSS, JavaScript)
└── README.md            # This file

How It Works

ESP32-S3-CAM captures video frames and streams them via HTTP
Web browser receives the MJPEG stream
TensorFlow.js loads the COCO-SSD model in the browser
JavaScript periodically captures frames and runs object detection
Canvas overlay draws bounding boxes and labels on detected objects

Troubleshooting

Model stuck on "Initializing..."

Check browser console (F12) for errors
Ensure stable internet connection (model downloads from CDN)
Try Chrome browser for best compatibility
Wait longer (model is ~25MB and can take time to download)

Camera not working

Check Serial Monitor for error messages
Verify camera pin configuration matches your board
Try lowering frame size or JPEG quality

WiFi connection fails

Double-check SSID and password
Ensure WiFi network is 2.4GHz (ESP32 doesn't support 5GHz)
Check Serial Monitor for connection status

Low frame rate

Reduce camera resolution
Increase detection interval
Close other browser tabs

Technical Details

ESP32 Side:

Camera: OV2640/OV5640 sensor
Resolution: VGA (640x480)
JPEG compression: Quality 22
Web server: ESP HTTP Server
Stream format: MJPEG

Browser Side:

AI Framework: TensorFlow.js 4.11.0
Model: COCO-SSD 2.2.3
Detection: Client-side processing
No cloud/server required

Performance

Model load time: 30-60 seconds (one-time)
Detection interval: Configurable (default 2 seconds)
Inference time: ~100-300ms per frame (browser-dependent)
Objects per frame: Up to 100 (practical limit ~10-20)

Credits

TensorFlow.js: https://www.tensorflow.org/js
COCO-SSD Model: https://github.com/tensorflow/tfjs-models/tree/master/coco-ssd
ESP32 Arduino Core: https://github.com/espressif/arduino-esp32

License

This project is open source and available under the MIT License.

Future Improvements

Add support for custom trained models
Implement person tracking
Add snapshot/recording functionality
Support for multiple camera streams
Mobile-optimized interface
Offline model hosting on SD card

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md
esp32_cnn.ino		esp32_cnn.ino
html_page.h		html_page.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

ESP32-S3-CAM Object Detection using CNN

Features

Hardware Requirements

Software Requirements

Installation

1. Install ESP32 Board Support

2. Upload Code

Usage

Detectable Objects

Configuration

Change Detection Interval

Change Camera Resolution

Project Structure

How It Works

Troubleshooting

Model stuck on "Initializing..."

Camera not working

WiFi connection fails

Low frame rate

Technical Details

Performance

Credits

License

Future Improvements

About

Uh oh!

Releases

Packages

Languages

SillyGooos/ESP32-Object-Detection-using-CNN

Folders and files

Latest commit

History

Repository files navigation

ESP32-S3-CAM Object Detection using CNN

Features

Hardware Requirements

Software Requirements

Installation

1. Install ESP32 Board Support

2. Upload Code

Usage

Detectable Objects

Configuration

Change Detection Interval

Change Camera Resolution

Project Structure

How It Works

Troubleshooting

Model stuck on "Initializing..."

Camera not working

WiFi connection fails

Low frame rate

Technical Details

Performance

Credits

License

Future Improvements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages