A Flask-based face recognition system that receives images from ESP32, detects faces using OpenCV, identifies persons using a TensorFlow model, and logs results to CSV.
- 🎥 ESP32 Integration: Dedicated endpoint for receiving images from ESP32 camera
- 👤 Face Detection: Fast face detection using OpenCV Haar Cascade
- 🤖 AI Recognition: Person identification using TensorFlow/Teachable Machine model
- 🖼️ Visual Output: Automatically draws bounding boxes and labels on detected faces
- 📊 CSV Logging: Saves all detections with timestamps and confidence scores
- 🌐 Web Interface: Real-time testing and monitoring dashboard
ESP32 Camera → POST /esp32 → Face Detection (OpenCV) →
Classification (TensorFlow) → Draw Boxes → Save Image + CSV Log
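To make that flow concrete, here is a minimal, illustrative sketch of what the /esp32 handler does. It is not the full app.py (model inference, timestamped filenames, and the original-image path are elided), and the helper names simply mirror the ones referenced later in this README:

```python
# Illustrative sketch of the /esp32 pipeline; not the full app.py.
import csv
from datetime import datetime

import cv2
import numpy as np
from flask import Flask, request, jsonify

app = Flask(__name__)
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

@app.route("/esp32", methods=["POST"])
def esp32():
    # 1. Decode the uploaded JPEG into an OpenCV image
    raw = np.frombuffer(request.files["image"].read(), np.uint8)
    frame = cv2.imdecode(raw, cv2.IMREAD_COLOR)

    # 2. Detect faces with the Haar cascade
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    detections = []
    for (x, y, w, h) in faces:
        # 3. Classify the face crop with the TensorFlow model (inference omitted here)
        name, confidence = "Unknown", 0.0

        # 4. Draw the bounding box and label on the frame
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, f"{name} {confidence:.1%}", (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        detections.append({"name": name, "confidence": f"{confidence:.1%}"})

    # 5. Save the annotated image and append rows to the CSV log
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    cv2.imwrite("images/processed/processed.jpg", frame)
    with open("detections.csv", "a", newline="") as f:
        writer = csv.writer(f)
        for d in detections:
            writer.writerow([timestamp, d["name"], d["confidence"]])

    return jsonify({"success": True, "timestamp": timestamp,
                    "detections": detections, "num_detections": len(detections)})
```

The real application also saves the original upload under images/received/ and records both image paths in the CSV log.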
Install the Python dependencies using pip:
pip install -r requirements.txt

Using uv (recommended):

uv pip install -r requirements.txt

Place your TensorFlow/Teachable Machine model in the my_model/ directory:
my_model/
├── model.json (or saved_model.pb)
├── metadata.json
└── weights.bin (or variables/)
Note: If using Teachable Machine, export as "TensorFlow" format and extract here.
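Before starting the server, it can be worth sanity-checking that the exported model and labels load. A small sketch, assuming a SavedModel export (saved_model.pb plus variables/) and a metadata.json containing a labels array; adjust the loading call if your export differs (for example a Keras .h5 file):

```python
# Quick check that the exported model and its labels can be loaded.
import json
import tensorflow as tf

model = tf.keras.models.load_model("my_model")   # SavedModel directory
with open("my_model/metadata.json") as f:
    labels = json.load(f).get("labels", [])      # class names

print("Input shape:", model.input_shape)          # typically (None, 224, 224, 3)
print("Classes:", labels)
```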
Start the server:

python app.py

The server will start on http://0.0.0.0:5000
- Open http://localhost:5000 in your browser
- Upload test images via drag-and-drop or file selector
- View processed images with bounding boxes
- Monitor recent detections in the log
POST http://your-server-ip:5000/esp32
Content-Type: multipart/form-data
Field name: image
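Before wiring up the ESP32, the endpoint can be exercised from any machine on the same network. A quick sketch using Python's requests library (the server IP and test image name below are placeholders):

```python
# Post a local JPEG to the /esp32 endpoint and print the JSON result.
import requests

with open("test_face.jpg", "rb") as f:                    # any local test image
    resp = requests.post(
        "http://192.168.1.100:5000/esp32",                # replace with your server IP
        files={"image": ("esp32.jpg", f, "image/jpeg")},  # field name must be "image"
    )
print(resp.status_code, resp.json())
```

A successful request returns JSON like the following: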
{
"success": true,
"timestamp": "2025-10-17 14:30:45",
"detections": [
{
"name": "John Doe",
"confidence": "87.5%"
}
],
"num_detections": 1,
"original_image": "images/received/esp32_20251017_143045_123456.jpg",
"processed_image": "images/processed/processed_20251017_143045_123456.jpg"
}

Example ESP32-CAM Arduino sketch (the Y2_GPIO_NUM, XCLK_GPIO_NUM, etc. pin macros are board-specific; define them for your camera module, for example via the camera_pins.h used in the ESP32 camera examples):

#include <WiFi.h>
#include <HTTPClient.h>
#include "esp_camera.h"
const char* ssid = "YOUR_WIFI_SSID";
const char* password = "YOUR_WIFI_PASSWORD";
const char* serverUrl = "http://192.168.1.100:5000/esp32";
void setup() {
Serial.begin(115200);
// Connect to WiFi
WiFi.begin(ssid, password);
while (WiFi.status() != WL_CONNECTED) {
delay(500);
Serial.print(".");
}
Serial.println("\nWiFi connected");
// Initialize camera
camera_config_t config;
config.ledc_channel = LEDC_CHANNEL_0;
config.ledc_timer = LEDC_TIMER_0;
config.pin_d0 = Y2_GPIO_NUM;
config.pin_d1 = Y3_GPIO_NUM;
config.pin_d2 = Y4_GPIO_NUM;
config.pin_d3 = Y5_GPIO_NUM;
config.pin_d4 = Y6_GPIO_NUM;
config.pin_d5 = Y7_GPIO_NUM;
config.pin_d6 = Y8_GPIO_NUM;
config.pin_d7 = Y9_GPIO_NUM;
config.pin_xclk = XCLK_GPIO_NUM;
config.pin_pclk = PCLK_GPIO_NUM;
config.pin_vsync = VSYNC_GPIO_NUM;
config.pin_href = HREF_GPIO_NUM;
config.pin_sscb_sda = SIOD_GPIO_NUM;
config.pin_sscb_scl = SIOC_GPIO_NUM;
config.pin_pwdn = PWDN_GPIO_NUM;
config.pin_reset = RESET_GPIO_NUM;
config.xclk_freq_hz = 20000000;
config.pixel_format = PIXFORMAT_JPEG;
config.frame_size = FRAMESIZE_VGA;
config.jpeg_quality = 10;
config.fb_count = 1;
esp_camera_init(&config);
}
void loop() {
// Capture photo
camera_fb_t * fb = esp_camera_fb_get();
if (!fb) {
Serial.println("Camera capture failed");
return;
}
// Send to server
HTTPClient http;
http.begin(serverUrl);
String boundary = "----WebKitFormBoundary7MA4YWxkTrZu0gW";
http.addHeader("Content-Type", "multipart/form-data; boundary=" + boundary);
String head = "--" + boundary + "\r\n";
head += "Content-Disposition: form-data; name=\"image\"; filename=\"esp32.jpg\"\r\n";
head += "Content-Type: image/jpeg\r\n\r\n";
String tail = "\r\n--" + boundary + "--\r\n";
uint32_t totalLen = head.length() + fb->len + tail.length();
uint8_t *payload = (uint8_t*)malloc(totalLen);
if (!payload) {  // not enough free heap for frame + headers; skip this frame
Serial.println("malloc failed");
esp_camera_fb_return(fb);
http.end();
delay(5000);
return;
}
memcpy(payload, head.c_str(), head.length());
memcpy(payload + head.length(), fb->buf, fb->len);
memcpy(payload + head.length() + fb->len, tail.c_str(), tail.length());
int httpResponseCode = http.POST(payload, totalLen);
if (httpResponseCode > 0) {
String response = http.getString();
Serial.println("Response: " + response);
} else {
Serial.println("Error: " + String(httpResponseCode));
}
free(payload);
esp_camera_fb_return(fb);
http.end();
delay(5000); // Wait 5 seconds before next capture
}

Project structure:

infineon-flask/
├── app.py # Main Flask application
├── requirements.txt # Python dependencies
├── detections.csv # Detection logs (auto-created)
├── my_model/ # TensorFlow model directory
│ ├── model.json
│ ├── metadata.json
│ └── weights.bin
├── images/
│ ├── received/ # Original images from ESP32
│ └── processed/ # Annotated output images
└── templates/
└── index.html # Web interface
Available endpoints:

POST /esp32: Receive and process an image from the ESP32
- Body: multipart/form-data with an image field
- Returns: Detection results and image paths

Test endpoint for the web interface (same behavior as /esp32)

Get recent detection logs
- Returns: JSON array of recent detections
The detections.csv file contains:
- timestamp: Detection time
- person_name: Identified person
- confidence: Model confidence score (0-1)
- image_path: Original image location
- processed_image_path: Annotated image location
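To analyze the log outside the web interface, the file can be read with Python's csv module. This sketch assumes rows follow the column order above; add header handling if your CSV file includes a header row:

```python
# Print the last few detections from the CSV log.
import csv

with open("detections.csv", newline="") as f:
    rows = [r for r in csv.reader(f) if r]   # skip any blank lines

# Columns: timestamp, person_name, confidence, image_path, processed_image_path
for timestamp, person, confidence, *paths in rows[-10:]:
    print(timestamp, person, confidence)
```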
Edit these variables in app.py:
UPLOAD_FOLDER = 'images/received' # Received images directory
PROCESSED_FOLDER = 'images/processed' # Processed images directory
MODEL_PATH = 'my_model' # TensorFlow model path
CSV_FILE = 'detections.csv'            # Detection log file

Confidence threshold (line 109):

if confidence > 0.5:  # Adjust threshold (0.0 - 1.0)

To train your own model with Teachable Machine:

- Visit Teachable Machine
- Create image classes for each person
- Upload training images (20+ per person recommended)
- Train the model
- Export as "TensorFlow" format
- Extract and place in the my_model/ directory
Ensure your model:
- Accepts input shape (224, 224, 3), or modify preprocess_face_for_model()
- Outputs class probabilities
- Includes metadata.json with a labels array
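As a reference for that input shape, the preprocessing step typically looks like the sketch below; the actual preprocess_face_for_model() in app.py may normalize differently (Teachable Machine Keras models usually expect values scaled to [-1, 1]):

```python
# Sketch: prepare a face crop for a 224x224 RGB classifier.
import cv2
import numpy as np

def preprocess_face_for_model(face_bgr):
    face = cv2.resize(face_bgr, (224, 224))        # match the model's input size
    face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)   # OpenCV uses BGR; most models expect RGB
    face = face.astype(np.float32) / 127.5 - 1.0   # scale pixels to [-1, 1]
    return np.expand_dims(face, axis=0)            # add batch dim -> (1, 224, 224, 3)
```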
Model not loading:
- Check that the my_model/ directory contains all required files
- Verify TensorFlow version compatibility
- Check console for error messages
Faces not detected:
- Ensure good lighting in images
- Faces should be frontal and clear
- Adjust Haar Cascade parameters in the detect_faces() function (see the sketch after this troubleshooting list)
Wrong person identified:
- Retrain model with more/better images
- Adjust confidence threshold
- Ensure consistent lighting conditions
ESP32 cannot reach the server:
- Verify server IP address is correct
- Check firewall settings
- Ensure both devices on same network
- Test endpoint with web interface first
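Regarding the "Adjust Haar Cascade parameters" tip above, the relevant knobs are the arguments to OpenCV's detectMultiScale. A hedged example of the kind of tuning meant (the real detect_faces() in app.py may use different values):

```python
# Example: tuning Haar cascade detection sensitivity.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def detect_faces(gray_image):
    return face_cascade.detectMultiScale(
        gray_image,
        scaleFactor=1.05,   # smaller step = more thorough (and slower) scan
        minNeighbors=4,     # lower = more detections but more false positives
        minSize=(60, 60),   # ignore regions too small to be a usable face
    )
```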
- Face Detection: Haar Cascade is used for speed. For better accuracy, switch to DNN-based detection (see the sketch after this list)
- Model Size: Smaller models = faster inference. Consider MobileNet architecture
- Image Resolution: ESP32 should send 640x480 or smaller for best performance
- Batch Processing: Current implementation processes one face at a time
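As an illustration of the DNN-based alternative mentioned in the first point, OpenCV (4.5.4 and later) includes a YuNet face detector API. This sketch assumes you have separately downloaded the face_detection_yunet ONNX model from the opencv_zoo repository; it is not part of this project:

```python
# Sketch: DNN-based face detection with OpenCV's YuNet detector (OpenCV >= 4.5.4).
import cv2

img = cv2.imread("images/received/example.jpg")   # any received frame
h, w = img.shape[:2]

detector = cv2.FaceDetectorYN.create(
    "face_detection_yunet_2023mar.onnx",           # downloaded separately
    "",                                            # no extra config file needed
    (w, h),                                        # input size must match the image
    0.6,                                           # score threshold
)

_, faces = detector.detect(img)                    # faces is an N x 15 array, or None
for face in (faces if faces is not None else []):
    x, y, fw, fh = face[:4].astype(int)
    cv2.rectangle(img, (x, y), (x + fw, y + fh), (0, 255, 0), 2)
```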
This project is free to use and modify for personal and commercial purposes.
- Face detection: OpenCV Haar Cascade
- Face recognition: TensorFlow
- Web framework: Flask