DrowsinessNet is a deep learning project that detects driver drowsiness from video feeds (e.g., a webcam). It uses a hybrid architecture that combines Convolutional Neural Networks (CNN) for spatial feature extraction, Long Short-Term Memory (LSTM) networks for temporal pattern learning, and a soft attention mechanism to focus on the most critical frames.
- Hybrid Architecture: CNN + LSTM + Attention Mechanism for high-accuracy temporal modeling.
- Accurate Predictions: Achieves ~93.6% accuracy and ~93.8% F1-Score on the test set.
- Real-time Processing Capability: Uses a sliding-window approach over fixed-length (90-frame) sequences for smooth, continuous predictions.
- Web Demo (FastAPI): Includes a complete FastAPI-based web inference backend (`web_app/app.py`), supporting both pre-loaded demo videos and user uploads.
The model processes sequences of image frames (90 frames, roughly 3 seconds at 30 fps) to classify the driver's state as either Alert (0) or Drowsy (1).
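For illustration, here is a minimal sketch of how such a 90-frame grayscale sequence could be assembled from a video with OpenCV. The helper name, padding strategy, and tensor layout are assumptions for this sketch, not the repository's actual preprocessing code:

```python
import cv2
import numpy as np

def extract_sequence(video_path, seq_len=90, size=(64, 64)):
    """Hypothetical helper: read up to `seq_len` frames, grayscaled and resized.

    Returns an array of shape (seq_len, 1, 64, 64) for a CNN expecting
    1-channel 64x64 inputs; the exact layout is an assumption.
    """
    cap = cv2.VideoCapture(video_path)
    frames = []
    while len(frames) < seq_len:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.resize(gray, size)
        frames.append(gray.astype(np.float32) / 255.0)
    cap.release()
    if not frames:
        raise ValueError(f"No frames could be read from {video_path}")
    # Pad by repeating the last frame if the clip is shorter than seq_len.
    while len(frames) < seq_len:
        frames.append(frames[-1])
    return np.stack(frames)[:, None, :, :]  # (seq_len, 1, H, W)
```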
- CNN Layer (Spatial Features):
- Grayscale images (1 channel, 64×64) are passed through two convolutional blocks.
- Each block uses `Conv2d`, `BatchNorm2d`, `ReLU`, `MaxPool2d`, and `Dropout2d`.
- LSTM Layer (Temporal Patterns):
- The extracted spatial features are flattened and passed to a 2-layer LSTM with 128 hidden units.
- Soft Attention Layer:
- A linear attention layer computes weights for each frame in the sequence, producing a weighted context vector.
- Classification:
- A Fully Connected (FC) layer outputs the final logit for the Alert vs. Drowsy prediction.
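The following is a minimal PyTorch sketch of this CNN + LSTM + soft-attention pipeline. Everything not stated above (kernel sizes, channel counts, dropout rates, the flattened feature dimension) is an illustrative assumption, not the repository's exact definition:

```python
import torch
import torch.nn as nn

class DrowsinessNet(nn.Module):
    """Sketch of the described architecture; hyperparameters are assumed."""

    def __init__(self, hidden_size=128, num_layers=2):
        super().__init__()
        # Two conv blocks: Conv2d -> BatchNorm2d -> ReLU -> MaxPool2d -> Dropout2d.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2), nn.Dropout2d(0.25),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2), nn.Dropout2d(0.25),
        )
        feat_dim = 32 * 16 * 16  # 64x64 input halved twice -> 16x16 feature maps
        self.lstm = nn.LSTM(feat_dim, hidden_size, num_layers, batch_first=True)
        self.attention = nn.Linear(hidden_size, 1)  # per-frame attention scores
        self.fc = nn.Linear(hidden_size, 1)         # single logit output

    def forward(self, x):  # x: (batch, seq_len, 1, 64, 64)
        b, t = x.shape[:2]
        feats = self.cnn(x.view(b * t, 1, 64, 64)).view(b, t, -1)
        out, _ = self.lstm(feats)                            # (batch, seq_len, hidden)
        weights = torch.softmax(self.attention(out), dim=1)  # frame weights
        context = (weights * out).sum(dim=1)                 # weighted context vector
        return self.fc(context).squeeze(-1)                  # one logit per sequence
```

A batch of shape `(batch, 90, 1, 64, 64)` yields one logit per sequence, matching the `BCEWithLogitsLoss` setup noted under the training details below.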
```
DrowsinessNet/
├── notebooks/              # Notebooks for data preparation and training
│   ├── DATA PREPROCESSING .ipynb
│   └── Train.ipynb
├── models/                 # Saved trained PyTorch model weights
│   └── best_model.pth
├── web_app/                # Web application backend (FastAPI)
│   ├── app.py              # FastAPI server and routing
│   ├── model.py            # Singleton model loading & inference functions
│   ├── static/             # Static assets (CSS, JS)
│   ├── templates/          # HTML templates (index.html)
│   └── uploads/            # Temporary directory for user-uploaded videos
├── assets/                 # Sample videos and evaluation plots
│   ├── test_drowsy.mp4
│   ├── test_nondrowsy.mp4
│   ├── confusion_matrix.png
│   ├── model_flow_diagram.png
│   └── training_history.png
├── requirements.txt        # Python dependencies
└── README.md
```
- Input Size: 64×64 grayscale
- Sequence Length: 90 frames
- Optimizer: Adam (LR: 5e-5, weight decay: 1e-4) with a `ReduceLROnPlateau` scheduler
- Loss Function: `BCEWithLogitsLoss`
- Evaluation Metrics (on Test Set):
- Accuracy: 93.62%
- Precision: 92.00%
- Recall: 95.83%
- F1-Score: 93.88%
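As a reference, a minimal training setup matching these hyperparameters might look like the sketch below. The scheduler's monitored quantity and patience are assumptions (the notebook may differ), and `DrowsinessNet` refers to the architecture sketch earlier in this README:

```python
import torch
from torch import nn, optim

model = DrowsinessNet()  # assumed class from the architecture sketch above

criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=5e-5, weight_decay=1e-4)
# mode/patience are assumptions; the notebook's settings may differ.
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", patience=3)

def train_step(batch, labels):
    """One optimization step on a batch of frame sequences."""
    optimizer.zero_grad()
    logits = model(batch)                    # (batch,) raw logits
    loss = criterion(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()

# After each validation pass, step the scheduler on the validation loss:
# scheduler.step(val_loss)
```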
Install the required dependencies from `requirements.txt` using pip:

```bash
pip install -r requirements.txt
```

Navigate into the `web_app` directory and run the FastAPI application via Uvicorn:

```bash
cd web_app
python app.py
```

Or run directly with uvicorn:

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

Open your browser and navigate to http://localhost:8000 to access the interactive user interface. You can upload your own MP4 videos or test with the available demo recordings.
If you wish to retrain or modify the model:
- Place your preprocessed data inside the `../Data/train_data` folder (or adjust the path in the notebook).
- Run all the cells in the `Train.ipynb` notebook to initiate training, evaluate the results, and save the updated `best_model.pth`.
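For quick sanity checks outside the notebook, the saved weights can be loaded as shown below. This assumes `best_model.pth` stores a plain `state_dict` and reuses the assumed `DrowsinessNet` class from the sketch above; adapt both to the repo's actual definitions:

```python
import torch

model = DrowsinessNet()  # assumed class from the architecture sketch above
# Assumes the checkpoint is a plain state_dict, not a wrapped training checkpoint.
state = torch.load("models/best_model.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()

# Classify one 90-frame clip: sigmoid(logit) > 0.5 -> Drowsy (1), else Alert (0).
with torch.no_grad():
    clip = torch.randn(1, 90, 1, 64, 64)  # stand-in for a real preprocessed clip
    prob = torch.sigmoid(model(clip)).item()
    print("Drowsy" if prob > 0.5 else "Alert", f"(p={prob:.2f})")
```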