G5: Baby Monitoring System
| Name | GitHub |
|---|---|
| Aly Elaswad | alyelaswad |
| Mazin Bersy | mazinbersy |
| Omar Ganna | omarganna |
GitHub Repo: https://github.com/mazinbersy/Baby-Monitoring-System
Caregivers cannot stay near a baby at all times, and existing monitors are either too simple or too expensive. They can signal that something is wrong, but they cannot tell you why or what triggered the concern.
Smart Baby Monitoring System is a self-contained embedded device that monitors a baby across three dimensions simultaneously: sound, motion, and environment. It detects infant crying using FFT-based audio analysis, monitors ambient temperature, and detects prolonged inactivity. When distress is detected, the system first attempts to soothe the baby by playing a lullaby automatically. If crying persists, it escalates to a caregiver push notification with a live video stream.
All of this runs on dedicated embedded hardware, streaming wirelessly over WiFi using HTTP POST requests, with no physical connection required from the caregiver.
- Detect infant crying using FFT-based audio analysis on the MAX9814 microphone
- Play lullaby automatically via DFPlayer Mini within 30 seconds of cry detection
- Monitor ambient temperature via LM35 and alert if outside safe range (< 18°C or > 30°C)
- Detect prolonged inactivity via HC-SR501 PIR and alert if no motion for > 5 minutes
- Deliver mobile push notifications via WiFi using HTTP POST on any alert
- Activate event-triggered live video stream on any alert
- Remote camera toggle — turn camera on/off from the app
- [❌] Two-way audio — speak through app, baby hears through speaker

The system uses two connected boards working together:
- The STM32L432KC (Nucleo) handles sensing, sound analysis, and playing lullabies.
- The ESP32-CAM handles Wi-Fi communication and live video streaming.
The STM32 Nucleo runs several FreeRTOS tasks. It continuously records audio from the microphone and analyzes it using an FFT to detect baby crying. If crying is detected for a long enough period, it plays a lullaby using the DFPlayer Mini and sends a CRY alert to the ESP32-CAM.
The Nucleo also:
- Monitors temperature using the LM35DZ sensor and sends alerts if it becomes too high or too low.
- Detects movement using the PIR sensor. If no movement is detected for 5 minutes, it sends a `NOMOV` alert.
- Includes a sleep-watch mode that tracks motion over time to detect when a sleeping baby wakes up and sends an `AWAKE` alert.
The ESP32-CAM receives these alerts from the Nucleo and sends them to a Railway-hosted server, which then pushes notifications to the mobile app. It also receives commands from the app, such as enabling or disabling sleep-watch mode. In addition, the ESP32-CAM captures and uploads camera frames to support live video streaming.
| Component | Role | Interface |
|---|---|---|
| STM32L432KC (Nucleo-32) | Central MCU, FFT processing, sensor fusion, DFPlayer control | ADC, UART, GPIO, FreeRTOS |
| ESP32-CAM (AI-Thinker) | WiFi communication, live video streaming | UART, WiFi, Camera |
| Microphone (MAX9814) | Audio capture for cry detection | Analog out to ADC |
| LM35 | Temperature sensing | Analog out to ADC |
| HC-SR501 | PIR motion detection | Digital GPIO |
| DFPlayer Mini (MP3-TF-16P) | MP3 lullaby playback | UART |
| 3W Speaker | Audio output for lullabies | Direct to DFPlayer Mini |
| STM32 Pin | Function | Connect To |
|---|---|---|
| PA6 | ADC1_IN11 | Mic analog OUT |
| 3.3V | Power | Mic VCC |
| GND | Ground | Mic GND |
| STM32 Pin | Function | Connect To |
|---|---|---|
| PA0 | ADC1_IN5 | LM35DZ Vout |
| 3.3V | Power | LM35DZ +Vs |
| GND | Ground | LM35DZ GND |
| STM32 Pin | Function | Connect To |
|---|---|---|
| PA9 | USART1_TX | DFPlayer RX |
| PA10 | USART1_RX | DFPlayer TX |
| 5V | Power | DFPlayer VCC |
| GND | Ground | DFPlayer GND |
| — | — | DFPlayer SPK1/SPK2 → Speaker |
| STM32 Pin | Function | Connect To |
|---|---|---|
| PA4 | GPIO_INPUT (PULLDOWN) | Sensor OUT |
| 5V | Power | HC-SR501 VCC |
| GND | Ground | Sensor GND |
| STM32 Pin | Function | Connect To |
|---|---|---|
| PA2 | USART2_TX | ESP32-CAM GPIO 13 (RX) |
| PA3 | USART2_RX | ESP32-CAM GPIO 14 (TX) |
| GND | Common ground | ESP32-CAM GND |
| ESP32-CAM Pin | Function | Connect To |
|---|---|---|
| GPIO 13 | Serial2 RX | Nucleo PA2 (USART2_TX) |
| GPIO 14 | Serial2 TX | Nucleo PA3 (USART2_RX) |
| GND | Common ground | Nucleo GND |
No pins. Connects to access point and communicates with the HTTPS server (Railway).
```
STM32 Nucleo
PA0 ──── LM35DZ Vout (temperature)
PA4 ──── HC-SR501 OUT (PIR motion)
PA6 ──── Microphone OUT (audio)
PA9 ──→ DFPlayer RX (USART1 TX)
PA10 ──← DFPlayer TX (USART1 RX)
PA2 ──→ ESP32-CAM GPIO 13 (USART2 TX)
PA3 ──← ESP32-CAM GPIO 14 (USART2 RX)

ESP32-CAM
GPIO 13 ──← Nucleo PA2 (receives CRY / NOMOV / AWAKE / TEMP_HIGH / TEMP_LOW)
GPIO 14 ──→ Nucleo PA3 (sends SLEEP_ON / SLEEP_OFF)
GND ──── Nucleo GND (common ground)
[Camera] ──→ JPEG frames → HTTPS server (Railway)
[WiFi] ──→ alerts / mode / status → HTTPS server
```
| Component | Model | Cost (EGP) |
|---|---|---|
| Microcontroller Board | STM32L432KC (Nucleo-32) | 750 |
| WiFi + Camera Module | ESP32-CAM (AI-Thinker) | 350 |
| Microphone Amplifier | MAX9814 | 185 |
| Audio Player | DFPlayer Mini | 150 |
| Speaker | 3W Speaker | 160 |
| Temperature Sensor | LM35DZ | 60 |
| PIR Motion Sensor | HC-SR501 | 70 |
| Total | | 1725 |
Each power figure below is computed as P = V × I.
| Component | Voltage | Typical Current | Power |
|---|---|---|---|
| STM32L432KC (80 MHz, ADC DMA + 2× UART) | 3.3 V | 15 mA | 49.5 mW |
| MAX9814 microphone amplifier | 3.3 V | 3.5 mA | 11.6 mW |
| LM35DZ temperature sensor | 3.3 V | 0.1 mA | 0.3 mW |
| Onboard LED LD3 (cry-alert blink, average) | 3.3 V | 1 mA | 3.3 mW |
| 3.3 V Rail Total | | 19.6 mA | 64.7 mW |
| Component | Voltage | Typical Current | Power |
|---|---|---|---|
| Nucleo board (ST-LINK + LDO overhead) | 5 V | 50 mA | 250 mW |
| HC-SR501 PIR motion sensor | 5 V | 0.1 mA | 0.5 mW |
| DFPlayer Mini + speaker (moderate volume) | 5 V | 120 mA | 600 mW |
| 5 V Rail Total | | 170 mA | 850 mW |
| Component | Voltage | Typical Current | Power |
|---|---|---|---|
| ESP32-CAM (WiFi active + camera streaming) | 3.3 V | 200 mA | 660 mW |
| ESP32-CAM Rail Total | | 200 mA | 660 mW |
- Cry detection via FFT sampling at 8 kHz; alert triggered after a 30-second majority-vote window confirms sustained crying
- Lullaby playback initiated automatically via DFPlayer Mini within 30 seconds of cry onset
- Temperature sampled every 5 seconds via LM35 ADC; alert triggered if temp > 30 °C or < 18 °C
- PIR motion sampled continuously; alert triggered if no motion detected for > 5 minutes
- Event-triggered video stream activated within 5 seconds of any alert
- Mobile push notification delivered via WiFi using HTTP POST within 5 seconds of any alert
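The temperature requirement above can be sketched as a conversion followed by a range check. This is a minimal illustration, not the project's firmware: it assumes a 12-bit ADC reading referenced to 3.3 V and the LM35's nominal 10 mV/°C scale factor, and the names `lm35_raw_to_decidegrees` and `temp_classify` are hypothetical.

```c
#include <stdint.h>

/* Assumptions (not taken from the project source): 12-bit ADC, 3.3 V reference. */
#define ADC_FULL_SCALE   4095
#define VREF_MILLIVOLTS  3300
#define TEMP_LOW_C       18   /* lower alert limit from the requirements */
#define TEMP_HIGH_C      30   /* upper alert limit from the requirements */

typedef enum { TEMP_OK, TEMP_ALERT_LOW, TEMP_ALERT_HIGH } temp_status_t;

/* Convert a raw ADC count to tenths of a degree Celsius.
 * The LM35 outputs 10 mV per degree C, so 1 mV corresponds to 0.1 degree C
 * and the millivolt reading is already the temperature in tenths. */
static int32_t lm35_raw_to_decidegrees(uint16_t raw)
{
    return ((int32_t)raw * VREF_MILLIVOLTS) / ADC_FULL_SCALE;
}

/* Classify a temperature (in tenths of a degree C) against the safe range. */
static temp_status_t temp_classify(int32_t decidegrees)
{
    if (decidegrees > TEMP_HIGH_C * 10) return TEMP_ALERT_HIGH;
    if (decidegrees < TEMP_LOW_C * 10)  return TEMP_ALERT_LOW;
    return TEMP_OK;
}
```

Working in tenths of a degree keeps the check in integer arithmetic, which avoids pulling floating-point formatting into a low-priority task.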
The firmware runs on two microcontrollers communicating over USART2/Serial2 at 115 200 baud.
The STM32L432KC (Nucleo) runs six FreeRTOS tasks under CMSIS-RTOS V2:
| Task | Priority | Stack | Responsibility |
|---|---|---|---|
| `defaultTask` | Idle | 512 B | Idle placeholder |
| `AudioCapture` | Realtime | 1 KB | Starts ADC DMA + TIM1 trigger at 8 kHz |
| `FFT` | Normal | 2 KB | 1024-point Hann-windowed FFT, 30-second majority vote → CRY/QUIET |
| `Alert` | Above Normal | 1 KB | Plays DFPlayer lullaby and blinks LD3 on CRY; sends `CRY\n` over USART2 |
| `PIR` | Low | 512 B | HC-SR501 polling; NOMOV after 5 min no motion; baby wakeup detection in sleep mode |
| `Temp` | Low | 512 B | LM35DZ injected ADC read every 5 s; sends `TEMP_HIGH` / `TEMP_LOW` over USART2 |
AudioQueue passes 1024-sample ADC buffer pointers from AudioCapture to FFT. ResultQueue passes CRY/QUIET results from FFT to Alert. USART2 RX is interrupt-driven, assembling incoming bytes into a command string for the PIR task to consume.
The ESP32-CAM runs a single Arduino loop that reads alert strings from Serial2, dispatches HTTP POST alerts to the Railway server, polls /api/mode every 3 s to forward parent commands to Nucleo, polls /api/status every 2 s to toggle streaming, and uploads JPEG frames when streaming is active.
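A single loop that polls two endpoints at different rates is usually built on non-blocking interval timers rather than delays. The sketch below shows one way to do that, assuming a millisecond timebase such as Arduino's `millis()`; the `poll_timer_t` type and `poll_due` helper are hypothetical names, not taken from the project code.

```c
#include <stdint.h>
#include <stdbool.h>

/* One periodic job: how often it runs and when it last ran. */
typedef struct {
    uint32_t interval_ms;
    uint32_t last_run_ms;
} poll_timer_t;

/* Returns true (and re-arms the timer) once interval_ms has elapsed.
 * The unsigned subtraction stays correct when the millisecond counter
 * wraps around after roughly 49 days. */
static bool poll_due(poll_timer_t *t, uint32_t now_ms)
{
    if ((uint32_t)(now_ms - t->last_run_ms) >= t->interval_ms) {
        t->last_run_ms = now_ms;
        return true;
    }
    return false;
}
```

In the loop, one timer with a 3000 ms interval would gate the `/api/mode` poll and another with 2000 ms would gate `/api/status`, leaving the rest of each iteration free to read Serial2 and upload frames.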

Sensor fusion logic:
| Condition | Action |
|---|---|
| Cry detected (30-s window, ≥ 60/234 frames score as CRY) | Play lullaby via DFPlayer Mini; send CRY\n to ESP32-CAM |
| No PIR motion > 5 min | Send NOMOV\n to ESP32-CAM; enter sleep-watch mode |
| Motion detected in sleep-watch window (> 8/30 samples over 75 s) | Send AWAKE\n to ESP32-CAM; return to awake mode |
| Temperature out of range | Send TEMP_HIGH\n or TEMP_LOW\n to ESP32-CAM |
| SLEEP_ON / SLEEP_OFF received from ESP32-CAM | Switch PIR task between sleep-watch and awake-watch mode |
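The sleep-watch row in the table above can be illustrated with a small counting window. This sketch assumes a fixed (non-sliding) window of 30 PIR samples taken every 2.5 s, which is one reading of "30 samples over 75 s"; the actual firmware may window differently, and all names here are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define WAKE_WINDOW_SAMPLES   30  /* 30 samples over 75 s: one every 2.5 s */
#define WAKE_MOTION_THRESHOLD 8   /* AWAKE when more than 8 samples show motion */

typedef struct {
    uint8_t samples_seen;
    uint8_t motion_count;
} wake_window_t;

typedef enum { WAKE_PENDING, WAKE_STILL_ASLEEP, WAKE_AWAKE } wake_result_t;

/* Record one PIR sample. Returns WAKE_PENDING until the window is full,
 * then reports the verdict and resets the window for the next 75 s. */
static wake_result_t wake_window_add(wake_window_t *w, bool motion)
{
    if (motion) {
        w->motion_count++;
    }
    w->samples_seen++;
    if (w->samples_seen < WAKE_WINDOW_SAMPLES) {
        return WAKE_PENDING;
    }
    wake_result_t result = (w->motion_count > WAKE_MOTION_THRESHOLD)
                               ? WAKE_AWAKE
                               : WAKE_STILL_ASLEEP;
    w->samples_seen = 0;
    w->motion_count = 0;
    return result;
}
```

Requiring several motion samples rather than a single one means a brief twitch during sleep does not raise an `AWAKE` alert.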
FFT Cry Detection
The MAX9814 analog output is sampled at 8 kHz via ADC DMA triggered by TIM1. Every 1024 samples (~128 ms), a window is applied and an FFT is computed. Five frequency-band energy percentages, spectral centroid, peak frequency, and spectral prominence are extracted and combined into a 100-point score. Frames scoring ≥ 75 vote as CRY. Over a 30-second window (234 frames), if ≥ 60 frames vote CRY and the audio was not continuously silent, a cry event is raised.
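The voting stage described above can be sketched independently of the FFT itself. The following is an illustrative version only: it takes the per-frame score as an input, assumes the caller supplies a per-frame silence flag (the source does not say how silence is measured), and uses hypothetical names throughout.

```c
#include <stdint.h>
#include <stdbool.h>

#define FRAME_SCORE_THRESHOLD 75   /* frames scoring >= 75 vote CRY */
#define FRAMES_PER_WINDOW     234  /* 30 s / 128 ms per frame */
#define CRY_VOTES_REQUIRED    60   /* CRY votes needed within one window */

typedef struct {
    uint16_t frames_seen;
    uint16_t cry_votes;
    uint16_t silent_frames;
} cry_window_t;

typedef enum { CRY_PENDING, CRY_QUIET, CRY_DETECTED } cry_result_t;

/* Add one scored frame. Returns CRY_PENDING until 234 frames have been
 * collected, then the window verdict; the window is reset afterwards. */
static cry_result_t cry_window_add(cry_window_t *w, uint8_t score, bool frame_silent)
{
    if (score >= FRAME_SCORE_THRESHOLD) {
        w->cry_votes++;
    }
    if (frame_silent) {
        w->silent_frames++;
    }
    w->frames_seen++;
    if (w->frames_seen < FRAMES_PER_WINDOW) {
        return CRY_PENDING;
    }
    /* A window that was silent from start to finish never raises a cry. */
    bool all_silent = (w->silent_frames == FRAMES_PER_WINDOW);
    cry_result_t result = (w->cry_votes >= CRY_VOTES_REQUIRED && !all_silent)
                              ? CRY_DETECTED
                              : CRY_QUIET;
    w->frames_seen = 0;
    w->cry_votes = 0;
    w->silent_frames = 0;
    return result;
}
```

With these thresholds a cry event needs roughly 7.7 s of cry-like frames (60 × 128 ms) inside a 30-second window, which filters out short bursts of speech or music.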
- STM32CubeIDE was used for Nucleo peripheral configuration (ADC with DMA, TIM1 at 8 kHz, USART1/2, GPIO) and firmware development using HAL drivers and CMSIS-RTOS V2 (FreeRTOS).
- Railway was used to host the server.
- GitHub was used for version control and to host the wiki.
Cry detection pipeline
- ADC DMA sampling verified at 8 kHz by checking buffer fill rate in the FFT task.
- FFT output verified against known audio inputs to confirm frequency-bin mapping.
- Score threshold and vote threshold tuned using infant cry recordings to eliminate false triggers from speech and music.
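For the frequency-bin mapping check, the relationship is bin k ↔ k × fs / N, which gives 7.8125 Hz per bin at 8 kHz with a 1024-point FFT. A small integer-only sketch of that mapping follows; the function names are hypothetical and the helpers are intended for bins 0 to 512 (the range up to the Nyquist frequency).

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ 8000u
#define FFT_SIZE       1024u

/* Centre frequency of an FFT bin in millihertz: bin * fs / N.
 * Valid for bins 0..512; larger inputs would overflow 32 bits. */
static uint32_t fft_bin_to_millihertz(uint32_t bin)
{
    return (bin * SAMPLE_RATE_HZ * 1000u) / FFT_SIZE;
}

/* Nearest FFT bin for a frequency in Hz (rounded to the closest bin). */
static uint32_t fft_hz_to_bin(uint32_t freq_hz)
{
    return (freq_hz * FFT_SIZE + SAMPLE_RATE_HZ / 2u) / SAMPLE_RATE_HZ;
}
```

For example, a 500 Hz test tone should peak in bin 64 and a 1 kHz tone in bin 128, which is the kind of check a known-input test can confirm.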
Motion subsystem
- HC-SR501 output on PA4 verified to go HIGH on movement and LOW after holdoff.
- No-motion alert confirmed to fire after the configured timeout.
- Sleep-mode wakeup window confirmed to send `AWAKE\n` when motion count exceeds threshold.
DFPlayer Mini
- Binary command frame with checksum verified to start playback on track 1 within 1 s of a cry event.
- Stop command confirmed to silence output when the FFT window returns QUIET.
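The binary command frame mentioned above can be built as follows. This is a sketch based on the 10-byte serial frame layout commonly documented for the DFPlayer Mini (start byte, version, length, command, feedback flag, two parameter bytes, two checksum bytes, end byte), with feedback disabled as an assumption; `dfplayer_build_frame` is a hypothetical helper, not the project's function.

```c
#include <stdint.h>
#include <stddef.h>

#define DFP_FRAME_LEN 10
#define DFP_CMD_PLAY_TRACK 0x03  /* play the track number given in param */
#define DFP_CMD_STOP       0x16  /* stop playback */

/* Fill out[] with a complete DFPlayer Mini command frame. */
static void dfplayer_build_frame(uint8_t cmd, uint16_t param, uint8_t out[DFP_FRAME_LEN])
{
    out[0] = 0x7E;                     /* start byte */
    out[1] = 0xFF;                     /* version */
    out[2] = 0x06;                     /* payload length */
    out[3] = cmd;
    out[4] = 0x00;                     /* assumption: no feedback requested */
    out[5] = (uint8_t)(param >> 8);    /* parameter high byte */
    out[6] = (uint8_t)(param & 0xFF);  /* parameter low byte */

    /* Checksum is the 16-bit two's complement of the sum of bytes 1..6. */
    uint16_t sum = 0;
    for (size_t i = 1; i <= 6; i++) {
        sum = (uint16_t)(sum + out[i]);
    }
    uint16_t checksum = (uint16_t)(0u - sum);
    out[7] = (uint8_t)(checksum >> 8);
    out[8] = (uint8_t)(checksum & 0xFF);
    out[9] = 0xEF;                     /* end byte */
}
```

Calling `dfplayer_build_frame(DFP_CMD_PLAY_TRACK, 1, frame)` yields `7E FF 06 03 00 00 01 FE F7 EF`, which would then be written to USART1.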
USART2 / Serial2 link
- Interrupt-driven ISR on Nucleo verified to correctly assemble `CRY`, `NOMOV`, `TEMP_HIGH`, and `TEMP_LOW` strings.
- ESP32-CAM Serial2 confirmed to receive and dispatch each alert string to the server.
- `SLEEP_ON` and `SLEEP_OFF` commands confirmed to reach Nucleo and flip the `baby_sleeping` flag.
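The newline-terminated framing used on this link can be sketched as a byte-at-a-time line assembler, the kind of routine an RX interrupt handler might call. The buffer size, the overflow policy (drop the partial line), and the names `line_rx_t` and `line_rx_feed` are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define LINE_BUF_SIZE 32  /* assumption: longest command fits in 31 chars */

typedef struct {
    char   buf[LINE_BUF_SIZE];
    size_t len;
} line_rx_t;

/* Feed one received byte. Returns true when a complete line is available
 * in rx->buf as a NUL-terminated string without the trailing newline. */
static bool line_rx_feed(line_rx_t *rx, char byte)
{
    if (byte == '\n') {
        rx->buf[rx->len] = '\0';
        rx->len = 0;                 /* reset the index for the next message */
        return true;
    }
    if (byte == '\r') {
        return false;                /* tolerate CRLF line endings */
    }
    if (rx->len < LINE_BUF_SIZE - 1) {
        rx->buf[rx->len++] = byte;
    } else {
        rx->len = 0;                 /* overflow: discard the partial line */
    }
    return false;
}
```

When `line_rx_feed` returns true, the consumer compares `rx->buf` against the known strings (for example `SLEEP_ON` or `SLEEP_OFF`) before the next byte arrives or after copying the line out.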
WiFi and server
- HTTP POST verified to reach the Railway server and trigger a mobile push notification.
- Mode poll verified to pick up app changes within one 3-second poll cycle.
Ran the complete system with all sensing pipelines active simultaneously. Played infant cry audio near the microphone and confirmed that lullaby playback started when the 30-second window closed and that the CRY alert appeared on the mobile app. Applied heat near the LM35DZ and confirmed a TEMP_HIGH alert within one 5-second sampling cycle. Blocked the PIR sensor for 5 minutes and confirmed that the NOMOV alert was sent and the system transitioned to sleep-watch mode. Set sleep mode ON from the app and confirmed that SLEEP_ON propagated through the server, ESP32-CAM, and Nucleo within one poll cycle. Simulated movement in sleep mode and confirmed that the 75-second wakeup window sent AWAKE and returned the system to awake mode. Toggled the camera stream from the app and confirmed that JPEG frames appeared in the web viewer.
| Challenge | Detail | Solution |
|---|---|---|
| Architecture pivot | ESP32-CAM ADC is unusable while WiFi is active, making concurrent audio sampling and streaming impossible on a single chip | Split responsibilities: Nucleo handles all sensing, FFT, and DFPlayer; ESP32-CAM handles camera and WiFi only |
| Microphone ADC pin conflict | PA3 was originally planned for the microphone ADC but is shared with USART2 RX needed for the ESP32-CAM link | Moved microphone input to PA6 (ADC1_IN11), freeing PA3 for USART2 |
| FFT threshold calibration | Initial score thresholds produced false cry triggers on background speech and TV audio | Tuned score weights and vote threshold against infant cry recordings at multiple distances |
| PIR warm-up false triggers | HC-SR501 generates spurious detections for ~60 s after power-on | Added a 60-second startup delay in the PIR task before monitoring begins |
| UART message loss | Missing newline terminators caused the ESP32-CAM to buffer incomplete alert strings | Added explicit `\n` to every Nucleo UART transmission; ISR resets the buffer index on each `\n` |
| Two-way audio | DFPlayer Mini has no audio input path; routing a server audio stream to the speaker had no viable hardware path | Dropped from scope |
| Metric | Target | Achieved |
|---|---|---|
| Cry detection window | 30 s | 30 s (234 frames × 128 ms) |
| Lullaby start after cry event | < 30 s | < 30 s |
| Push notification delivery | < 5 s from event | ~3–4 s over home WiFi |
| Camera stream activation | < 5 s from alert | ~4–5 s |
| Temperature measurement accuracy | ±1 °C | ±1 °C vs. reference thermometer |
| No-motion alert trigger | 5 min no movement | Confirmed |
| SLEEP_ON propagation (app → Nucleo) | < 3 s | ~3 s (one poll cycle) |
- Aly: worked on the FFT baby cry detection and the Railway/mobile app hosting
- Omar: worked on the camera integration and the temperature and motion sensors
- Mazin: worked on the sound player and the alert signaling
| Date | Milestone | Status | Date of Completion |
|---|---|---|---|
| Apr 14, 2026 | Team formation finalized and submitted | ✅ Completed | Apr 14, 2026 |
| Apr 15, 2026 | Proposal presentation | ✅ Completed | Apr 15, 2026 |
| Apr 20, 2026 | Wiki deployment with proposal and architecture | ✅ Completed | Apr 20, 2026 |
| Apr 22–25, 2026 | Phase 1: Sensor validation — MAX9814 ADC, LM35 ADC, PIR GPIO | ✅ Completed | Apr 25, 2026 |
| Apr 26–29, 2026 | Phase 2: Core processing — FFT pipeline, DFPlayer playback, ESP32-CAM stream | ✅ Completed | Apr 29, 2026 |
| Apr 29, 2026 | Milestone 3: Progress demo — at least one working subsystem | ✅ Completed | Apr 29, 2026 |
| May 1–5, 2026 | Phase 3: Full integration — sensor fusion, WiFi alerts, Nucleo–ESP32 link | ✅ Completed | May 5, 2026 |
| May 6, 2026 | Checkpoint B: Integration update on wiki | ✅ Completed | May 6, 2026 |
| May 8–12, 2026 | Phase 4: Stretch goals — remote camera toggle ✅ / two-way audio ❌ | ✅ Completed | May 12, 2026 |
| May 13, 2026 | Final demo and presentation | ✅ Completed | May 13, 2026 |
GitHub Repo: https://github.com/mazinbersy/Baby-Monitoring-System
- ESP32-CAM AI-Thinker datasheet
- MAX9814 datasheet — Maxim Integrated
- DFRobotDFPlayerMini Arduino library
- HC-SR501 PIR sensor datasheet
- LM35 datasheet — Texas Instruments
- STM32L432KC datasheet — STMicroelectronics
- https://github.com/Wendy-Nam/IoT-BabyCryDetection





