EdgeAttend

Real-time attentiveness monitoring for online meetings, powered by edge AI.

EdgeAttend detects whether each participant in a video call is attentive or not, entirely on the client device. A MobileNetV2-based binary classifier runs locally on every client machine. A lightweight server aggregates the results, assembles a shared grid view, and streams it back to all participants and to a browser-based monitor.

This project was developed as part of the Edge AI course at the Indian Institute of Science, Bengaluru.

Project Structure

EdgeAttend/
├── client.py                        # Client app — webcam capture, local inference, server streaming
├── server.py                        # Server app — multi-client aggregator, grid composer
├── requirements.txt                 # Python dependencies
├── report.md                        # Full project report
├── plots/                           # Training and compression graphs
│   ├── accuracy.png
│   ├── auc.png
│   ├── loss.png
│   ├── struct_pruning_tradeoff_graph.png
│   ├── unstructured_pruning_tradeoff_graph.png
│   ├── all_attentive.png
│   ├── all_non_attentive.png
│   └── one_non_attentive.png
│
├── Training/                        # Data preparation and model training
│   ├── README.md                    # Training-specific instructions
│   ├── prepare_dataset.py           # Extracts face crops from DAiSEE videos
│   └── train.ipynb                  # Two-stage training notebook
│
└── Edge_Optimization/               # Model compression pipeline
    ├── README.md                    # Compression-specific instructions
    ├── labels.json                  # Class index → label mapping
    ├── quantize_model.py            # Post-training static INT8 quantization 
    ├── prune_model.py               # Unstructured (L1) pruning with optional fine-tuning
    ├── struct_prune_model.py        # Structured (channel) pruning with optional fine-tuning
    └── evaluate_model.py            # Unified benchmark — accuracy, speed, size for all variants

System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                          CLIENT MACHINE                         │
│                                                                 │
│  Webcam → [Frame capture] → [Face detection (Haar cascade)]     │
│               ↓                                                 │
│         [AttentiveMobileNetV2]  ← attentive_model.pth           │
│         (local inference, batch of 5 frames)                    │
│               ↓                                                 │
│         Label + Score  ──MSG_ATTN──►  SERVER (port 9999)        │
│         JPEG frames    ──MSG_FRAME──► SERVER (port 9999)        │
│                                                                 │
│         ◄──MSG_GRID── JPEG Grid                                 │
│         [Draw own overlay on top-left] → cv2.imshow             │
└─────────────────────────────────────────────────────────────────┘
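
The client box above runs inference on a batch of 5 frames and sends a single label and score. How the per-frame outputs are combined is not stated in this README; the sketch below assumes the scores are averaged and thresholded at 0.5. The function name `aggregate_batch` and the threshold are illustrative, not taken from client.py.

```python
from statistics import mean


def aggregate_batch(scores, threshold=0.5):
    """Combine per-frame "attentive" probabilities into one (label, score) pair.

    Assumption: each score is the probability of the "attentive" class in [0, 1].
    """
    if not scores:
        raise ValueError("need at least one frame score")
    score = mean(scores)
    label = "attentive" if score >= threshold else "not_attentive"
    return label, score
```

Averaging over a short batch smooths out single-frame misclassifications, for example during a blink, at the cost of a slightly delayed reaction.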

┌─────────────────────────────────────────────────────────────────┐
│                          SERVER MACHINE                         │
│                                                                 │
│  TCP socket (port 9999) ← accepts multiple clients              │
│  One thread per client (ClientHandler)                          │
│               ↓                                                 │
│  Aggregates frames + attentiveness labels                       │
│               ↓                                                 │
│  [Grid encoder loop]                                            │
│   ├─ Annotated grid  → HTTP                                     │
│   └─ Clean grid      → MSG_GRID pushed to all clients           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
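
The diagrams name three message types (MSG_ATTN, MSG_FRAME, MSG_GRID) that share one TCP connection on port 9999. The wire format is not documented here, so the following is only a sketch of one common approach: a fixed header with a 1-byte type and a 4-byte big-endian payload length. The numeric type codes and the header layout are assumptions, not the values used in client.py or server.py.

```python
import struct

# Hypothetical type codes; the real values are defined in client.py / server.py.
MSG_FRAME, MSG_ATTN, MSG_GRID = 1, 2, 3
HEADER = struct.Struct("!BI")  # 1-byte type, 4-byte big-endian payload length


def pack_message(msg_type, payload):
    """Prefix a payload with its type and length so the receiver can split the stream."""
    return HEADER.pack(msg_type, len(payload)) + payload


def unpack_messages(buffer):
    """Split a byte buffer into complete (type, payload) messages.

    Returns the parsed messages and the trailing bytes of any incomplete message,
    which the caller should keep and prepend to the next chunk read from the socket.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        msg_type, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        messages.append((msg_type, buffer[offset + HEADER.size:end]))
        offset = end
    return messages, buffer[offset:]
```

Length-prefixed framing matters because TCP delivers a byte stream, not discrete messages: a single read may contain half a JPEG frame or several small label messages back to back.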

Prerequisites

Python 3.9 or later
A webcam on each client machine and on the server machine
The server and all clients must be on the same network
A GPU is optional but recommended for training; inference runs on CPU

Step-by-Step: Reproduce the Project

1 - Install dependencies

pip install -r requirements.txt

torch-pruning requires PyTorch ≥ 2.0.
Install CUDA-enabled PyTorch first if you want GPU training:
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
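
To confirm that the installed PyTorch satisfies the ≥ 2.0 requirement, a small check along these lines can help. The helper name `meets_min_version` is hypothetical; it only parses the leading major.minor part of a version string such as the one reported by `torch.__version__`.

```python
import re


def meets_min_version(version, minimum=(2, 0)):
    """Return True if a version string such as '2.1.0+cu118' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)
```

With PyTorch installed, calling `meets_min_version(torch.__version__)` after `import torch` reports whether torch-pruning's requirement is met.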

2 - Download the dataset

Go to https://people.iith.ac.in/vineethnb/resources/daisee/index.html and request access to the DAiSEE dataset.
Download and extract it so the directory layout matches:

Training/
└── DAiSEE/
    ├── DataSet/
    │   └── Train/
    │       └── {person_id}/
    │           └── {clip_id}/
    │               └── {video_file}.avi
    └── Labels/
        └── TrainLabels.csv
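
Before running the preparation script, it can be worth checking that the extracted files match this layout. The sketch below is not part of the repository; `find_training_videos` is a hypothetical helper that looks for the label file and lists the videos three levels below DataSet/Train.

```python
from pathlib import Path


def find_training_videos(root="Training/DAiSEE"):
    """Return the sorted .avi paths under DataSet/Train, or raise if labels are missing."""
    root = Path(root)
    labels = root / "Labels" / "TrainLabels.csv"
    if not labels.is_file():
        raise FileNotFoundError(f"missing label file: {labels}")
    # Expected layout: Train/{person_id}/{clip_id}/{video_file}.avi
    return sorted((root / "DataSet" / "Train").glob("*/*/*.avi"))
```

An empty result usually means the archive was extracted one directory too deep or too shallow.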

3 - Prepare the dataset

cd Training
python prepare_dataset.py

Output:

Training/dataset/
├── attentive/      # 5000 images  (A_0.jpg … A_N.jpg + aug_*.jpg)
└── not_attentive/  # 5000 images  (N_0.jpg … N_N.jpg + aug_*.jpg)
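
A quick way to confirm that both classes ended up balanced is to count the images per folder. This is a minimal sketch with a hypothetical helper name, `count_images`; it assumes every image is a .jpg file directly inside its class folder, as in the listing above.

```python
from pathlib import Path


def count_images(dataset_dir="Training/dataset"):
    """Return {class_name: number_of_jpg_files} for each class sub-folder."""
    dataset_dir = Path(dataset_dir)
    return {
        sub.name: sum(1 for _ in sub.glob("*.jpg"))
        for sub in sorted(dataset_dir.iterdir())
        if sub.is_dir()
    }
```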

4 - Train the model

Open Training/train.ipynb and run its cells in order. Outputs are saved to the Training/ folder:

attentive_model.pth - best checkpoint
dataset_splits.json - train/val/test file-path lists

Copy both files to the project root and to Edge_Optimization/ before the next steps:

cp Training/attentive_model.pth .
cp Training/dataset_splits.json .
cp Training/attentive_model.pth Edge_Optimization/
cp Training/dataset_splits.json Edge_Optimization/
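
On systems without `cp` (for example a plain Windows shell), the same four copies can be made from Python. This is a sketch, not a script shipped with the repository; `copy_artifacts` is a hypothetical helper and assumes the Edge_Optimization/ folder already exists.

```python
import shutil
from pathlib import Path

ARTIFACTS = ("attentive_model.pth", "dataset_splits.json")


def copy_artifacts(project_root="."):
    """Copy the training outputs to the project root and to Edge_Optimization/."""
    root = Path(project_root)
    copied = []
    for name in ARTIFACTS:
        source = root / "Training" / name
        for target_dir in (root, root / "Edge_Optimization"):
            copied.append(Path(shutil.copy2(source, target_dir / name)))
    return copied
```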

5 - Compress the model

All scripts in Edge_Optimization/ expect attentive_model.pth and dataset_splits.json in the same directory as the script.

5a - Post-training static quantization

cd Edge_Optimization
python quantize_model.py

Converts the FP32 model to INT8 using PyTorch FX graph mode with the qnnpack backend.
Output: attentive_model_quantized.pth
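
For intuition, the arithmetic behind 8-bit affine quantization can be written out in a few lines of plain Python. This illustrates the general scheme (a scale and a zero point mapping floats to integers) and is not a copy of what quantize_model.py or PyTorch does internally; the function names and the signed range -128..127 are choices made for this example.

```python
def quant_params(x_min, x_max, qmin=-128, qmax=127):
    """Derive scale and zero point from an observed float range."""
    # The range is widened to include zero so that 0.0 maps to an exact integer.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to the nearest representable integer, clamping out-of-range values."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer back to its approximate float value."""
    return (q - zero_point) * scale
```

In static quantization, the observed range comes from running calibration data through the model before conversion, which is why the script needs access to the dataset splits.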

5b - Unstructured (L1) pruning

python prune_model.py

Tests pruning ratios from 10 % to 90 % and evaluates accuracy with and without 3-epoch fine-tuning.
Saves the two best trade-off models, e.g.:

best_unstructured_pruned_no_ft_90.pth
best_unstructured_pruned_ft_90.pth

Also saves plots/unstructured_pruning_tradeoff_graph.png.
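
The idea behind unstructured L1 pruning is to zero the individual weights with the smallest absolute value. The plain-Python sketch below shows that rule on a flat list of weights; it is an illustration only and not the code in prune_model.py. PyTorch offers this operation as `torch.nn.utils.prune.l1_unstructured`, though whether the script uses that function is not stated here.

```python
def magnitude_prune(weights, ratio):
    """Zero out the `ratio` fraction of weights with the smallest |w|."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    n_prune = int(round(ratio * len(weights)))
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)
```

Because the tensor shapes stay the same, unstructured pruning by itself does not shrink a dense checkpoint or speed up dense kernels; the benefit depends on sparse storage or sparse-aware runtimes.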

5c - Structured (channel) pruning

python struct_prune_model.py

Removes channels using torch-pruning (MagnitudePruner) at ratios 10 %–90 %, with and without fine-tuning.
Saves the two best trade-off models, e.g.:

best_struct_pruned_no_ft_90.pth
best_struct_pruned_ft_90.pth

Also saves plots/struct_pruning_tradeoff_graph.png.
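
Channel pruning removes whole output channels rather than individual weights. A magnitude criterion ranks channels by the norm of their weights and drops the weakest ones. The sketch below shows that ranking on nested lists using the L1 norm; it is a simplified illustration with a hypothetical function name, not the torch-pruning implementation.

```python
def prune_channels(filters, ratio):
    """Drop the `ratio` fraction of output channels with the smallest L1 norm.

    `filters` is a list of channels, each a flat list of weights.
    Returns (kept_filters, kept_indices) with the original order preserved.
    """
    n_drop = int(ratio * len(filters))
    norms = [sum(abs(w) for w in channel) for channel in filters]
    ranked = sorted(range(len(filters)), key=lambda i: norms[i])  # weakest first
    keep = sorted(ranked[n_drop:])
    return [filters[i] for i in keep], keep
```

In a real network, removing an output channel also requires removing the matching input channel of every layer that consumes it. torch-pruning tracks these dependencies between layers, which this sketch omits.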

6 - Evaluate compressed models

cd Edge_Optimization
python evaluate_model.py

Loads all six model variants, evaluates accuracy on the held-out validation split, measures single-image inference latency (CPU), and records model size.

Outputs:

evaluation_results.json — machine-readable metrics table
evaluation_results.log — timestamped log
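
Single-image latency figures are sensitive to how they are measured. A common pattern, sketched here under the assumption that the first few calls should be discarded as warm-up, is to time repeated calls and report the median. The helper `measure_latency_ms` is hypothetical and does not reproduce the measurement logic of evaluate_model.py.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=5, runs=30):
    """Median wall-clock time of fn() in milliseconds after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm caches and lazy initialisation; not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

Here `fn` would be a zero-argument callable that runs one forward pass on a fixed input. The median is less affected than the mean by occasional scheduling delays.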

7 - Run the live system

7a - Start the server

Run on the central machine (can also be one of the participant machines):

python server.py

7b - Start each client

Run on each participant's machine. The model file must be present:

# Copy attentive_model_quantized.pth to the project root on each client machine, then:
python client.py

When prompted, enter the server's IP address (press Enter to use the default).

Enter server IP address [10.24.48.12]: 192.168.1.42
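
The prompt behaviour described above (an empty answer selects the default) could be implemented roughly as follows. This is a sketch and not the code in client.py; the function name `resolve_server_ip` and the validation step are assumptions, and the default is simply the address shown in the example prompt.

```python
import ipaddress

DEFAULT_SERVER_IP = "10.24.48.12"  # default shown in the example prompt


def resolve_server_ip(user_input, default=DEFAULT_SERVER_IP):
    """Return the default for empty input, otherwise a validated IP address string."""
    text = user_input.strip()
    if not text:
        return default
    # ip_address raises ValueError for anything that is not a valid IPv4/IPv6 literal.
    return str(ipaddress.ip_address(text))
```

Validating the address before connecting turns a typo into an immediate error instead of a connection timeout.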

Press Q in the client window to disconnect.
