Interview System

A real-time multi-modal interview analysis system that combines pose detection, speech-to-text, and emotion recognition to provide comprehensive behavioral insights during interviews.

🚀 Quick Start

Performance Optimized Versions (NEW!)

For CPU-only systems:

python interview_system_v6_optimized.py

Delivers 18-22 FPS (3-4x faster than baseline) with frame skipping and lightweight models

For GPU systems:

python interview_system_onnx_full.py

Delivers 35-60 FPS with ONNX Runtime and hardware acceleration

📖 Full Performance Guide

⚡ Performance Improvements

NEW Performance Optimized Versions Available!

Version	FPS (CPU)	FPS (GPU)	Memory	Best For
v5 (baseline)	6-10	30-35	900 MB	Reference
v6 optimized	18-22	30-35	550 MB	CPU users
ONNX full	12-15	35-60	280 MB	GPU users

Key Improvements:

✅ 3-6x FPS improvement (6-10 → 18-60 FPS)
✅ 40% memory reduction (900 → 280-550 MB)
✅ 60% faster startup (10s → 3-5s)
✅ Configurable performance vs accuracy
✅ GPU acceleration (CUDA, DirectML)

📊 Performance Results & Benchmarks

Features

Core Capabilities

Real-time Pose Detection: Detects 10+ body language actions
- Arms crossed, hands clasped, chin rest
- Lean forward/backward, head down
- Touch face/nose, fix hair, fidget hands
Speech-to-Text: Live transcription using Whisper
- Multi-language support
- Accurate transcription with timestamps
Emotion Analysis:
- Facial emotion detection (optional, configurable)
- Voice emotion analysis (energy-based)
Multi-threading: Concurrent video and audio processing
Comprehensive Logging: JSON exports with timestamps
- Individual logs per feature
- Combined log for integrated analysis

Available Versions

Production Versions

Version	Description	FPS	Use Case
v6_optimized.py	Performance optimized with frame skipping	18-22	CPU-only systems
onnx_full.py	ONNX Runtime with GPU acceleration	35-60	GPU systems

Legacy Versions

Version	Description	FPS	Status
interview_system.py (v1)	Basic pose + STT	6-10	Stable
interview_system_v2.py	+ Facial emotion	3-4	Laggy
interview_system_v3.py	Enhanced emotion	3-4	Laggy
interview_system_v4.py	+ Voice emotion	6-10	Stable
interview_system_v5.py	Separate logs	6-10	Stable

Installation

Basic Dependencies

pip install opencv-python numpy ultralytics
pip install faster-whisper sounddevice
pip install deepface  # Optional, for facial emotion

For ONNX Acceleration (Recommended)

CPU or Any GPU (Windows):

pip install onnxruntime-directml  # DirectML support

NVIDIA GPU (Best Performance):

pip install onnxruntime-gpu  # CUDA support

CPU Only:

pip install onnxruntime  # Fallback

Model Files

yolov8n-pose.pt - Lightweight pose model (6.6 MB)
yolo11m-pose.pt - Standard pose model (41 MB)
yolo11m-pose.onnx - ONNX pose model (81 MB)

Models are downloaded automatically on first run.

Usage

v6 Optimized (CPU-Friendly)

python interview_system_v6_optimized.py

Configuration Options:

USE_LIGHTWEIGHT_MODEL = True   # YOLOv8n-pose (3x faster)
SKIP_FRAMES = 2                # Process every 3rd frame
EMOTION_CHECK_INTERVAL = 30    # Check emotion every 30 frames
ENABLE_FACIAL_EMOTION = True   # Set False to disable

Performance: 18-22 FPS on CPU

ONNX Full (GPU-Accelerated)

python interview_system_onnx_full.py

Auto-Detects:

CUDA (NVIDIA GPU) - Fastest
DirectML (Any GPU on Windows) - Fast
CPU - Fallback

Performance: 35-60 FPS with GPU

Legacy Version (v5)

python interview_system_v5.py

Performance: 6-10 FPS on CPU (baseline)

Configuration Guide

For Different Hardware

Budget Laptop (CPU-only)

# v6_optimized.py
USE_LIGHTWEIGHT_MODEL = True
SKIP_FRAMES = 2
ENABLE_FACIAL_EMOTION = False

Expected: 15-18 FPS, 250 MB RAM

Mid-Range Desktop

# v6_optimized.py or onnx_full.py
SKIP_FRAMES = 1
ENABLE_FACIAL_EMOTION = True

Expected: 20-30 FPS, 280-350 MB RAM

Gaming Laptop/Desktop (GPU)

# onnx_full.py
SKIP_FRAMES = 0  # No skipping needed
ENABLE_FACIAL_EMOTION = True

Expected: 40-60 FPS, 250-300 MB RAM

Output Files

Generated Logs

All versions produce JSON logs with timestamps:

action_log.json - Detected body language actions
transcription_log.json - Speech-to-text results
voice_emotion_log.json - Voice emotion analysis
facial_emotion_log.json - Facial emotion analysis
combined_log.json - Merged data per second

Example:

{
  "time": "00:05",
  "timestamp_seconds": 5.0,
  "actions": ["arms_crossed", "lean_back"],
  "texts": ["I'm ready for the interview"],
  "facial_emotions": ["neutral"],
  "voice_emotions": ["calm"]
}

Performance Benchmarking

Test your system's performance:

python benchmark_performance.py

Output:

YOLOv8n-pose vs YOLO11m-pose comparison
PyTorch vs ONNX comparison
CPU vs GPU performance
Improvement percentages

Documentation

Performance Optimization

Video Processing

Keyboard Controls

Key	Action
`q`	Quit and save logs
ESC	Alternative quit

System Requirements

Minimum (CPU-only)

CPU: Intel i5-8th gen or AMD Ryzen 5
RAM: 8 GB
Storage: 500 MB for models
OS: Windows 10+, Linux, macOS

Configuration: v6 with YOLOv8n + skip=2
Performance: 15-18 FPS

Recommended (GPU)

CPU: Intel i7-10th gen or AMD Ryzen 7
GPU: NVIDIA GTX 1650+ or AMD RX 6600+
RAM: 8 GB
VRAM: 2 GB

Configuration: ONNX full with DirectML/CUDA
Performance: 35-40 FPS

Optimal (High-end)

CPU: Intel i7-12th gen or AMD Ryzen 7 5000+
GPU: NVIDIA RTX 3060+ or AMD RX 6700+
RAM: 16 GB
VRAM: 4 GB

Configuration: ONNX full with CUDA
Performance: 50-60 FPS

Troubleshooting

Low FPS (< 10 FPS)

Switch to v6 optimized
Enable lightweight model: USE_LIGHTWEIGHT_MODEL = True
Increase frame skipping: SKIP_FRAMES = 3
Disable facial emotion: ENABLE_FACIAL_EMOTION = False

GPU Not Being Used (ONNX)

Install correct ONNX Runtime:
- NVIDIA: pip install onnxruntime-gpu
- Windows (any GPU): pip install onnxruntime-directml
Update GPU drivers
Check providers at startup (printed in console)

High Memory Usage

Use lightweight model: USE_LIGHTWEIGHT_MODEL = True
Disable facial emotion: ENABLE_FACIAL_EMOTION = False
Use ONNX version (more efficient)

Known Limitations

YOLOv8n-pose (Lightweight Model)

3-5% lower accuracy vs YOLO11m-pose
Sufficient for interview body language
Not suitable for fine-grained hand gestures

Frame Skipping

33-66ms delay in action updates
Acceptable for sitting subjects
Not suitable for fast-paced activities

Facial Emotion

Requires good lighting
Can be slow on CPU (120ms per check)
Consider disabling on low-end hardware

Future Enhancements

Planned Optimizations

Adaptive frame skipping (motion-based)
Async emotion detection (non-blocking)
FP16 quantized models (2x speedup)
TensorRT optimization (NVIDIA)
Batch processing support

Additional Features

Multi-person interview support
Action heatmap visualization
Real-time feedback dashboard
Cloud integration options

Performance Summary

Optimization	Speedup	Memory Saved	Effort
Frame Skipping	3x	-	Low
Lightweight Model	3x	200 MB	Minimal
ONNX Runtime	1.5-6x	100 MB	Low
Emotion Frequency	Fixes v2-v3	-	Low
Combined (CPU)	3-4x	300-350 MB	Low
Combined (GPU)	4-6x	400-600 MB	Low

Contributing

Adding New Actions

See: Adding New Actions Guide

Improving Performance

Profile with benchmark_performance.py
Implement optimization
Measure improvement
Document in performance docs

Documentation

Update relevant markdown files
Include performance metrics
Provide code examples
Note any trade-offs

License

This project is available for educational and research purposes.

Citation

If you use this system in your research, please cite:

Interview System - Real-time Multi-modal Interview Analysis
https://github.com/yb235/Interview_System

Changelog

v6 Optimized (2025-11-23)

✨ Frame skipping implementation (3x effective speedup)
✨ Lightweight model option (YOLOv8n-pose)
✨ Configurable emotion detection frequency
✨ Lazy model loading (50% faster startup)
📊 18-22 FPS on CPU (vs 6-10 baseline)

ONNX Full (2025-11-23)

✨ ONNX Runtime integration
✨ Hardware acceleration (CUDA, DirectML)
✨ Auto-detect best provider
✨ Optimized preprocessing
📊 35-60 FPS with GPU (vs 30-35 PyTorch)

Performance Documentation (2025-11-23)

📝 Comprehensive performance analysis
📝 Improvement plan with metrics
📝 Code examples and templates
📝 Performance results with benchmarks
📝 Usage guide with troubleshooting

Contact

For questions, issues, or contributions:

Open an issue on GitHub
Include system specs and benchmark results
Reference relevant documentation

Status: ✅ Production Ready
Latest Version: v6 Optimized + ONNX Full
Last Updated: 2025-11-23

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
docs		docs
.gitattributes		.gitattributes
.gitignore		.gitignore
Deepface_test.py		Deepface_test.py
README.md		README.md
benchmark_performance.py		benchmark_performance.py
export_pose.py		export_pose.py
interview_system.py		interview_system.py
interview_system_onnx_full.py		interview_system_onnx_full.py
interview_system_v2.py		interview_system_v2.py
interview_system_v3.py		interview_system_v3.py
interview_system_v4.py		interview_system_v4.py
interview_system_v5.py		interview_system_v5.py
interview_system_v6_optimized.py		interview_system_v6_optimized.py
mic_test.py		mic_test.py
run_pose_onnx_dml.py		run_pose_onnx_dml.py
yolo11m-pose.onnx		yolo11m-pose.onnx
yolo11m-pose.pt		yolo11m-pose.pt
yolov8n-pose.pt		yolov8n-pose.pt

Folders and files

Latest commit

History

Repository files navigation

Interview System

🚀 Quick Start

Performance Optimized Versions (NEW!)

⚡ Performance Improvements

Features

Core Capabilities

Available Versions

Production Versions

Legacy Versions

Installation

Basic Dependencies

For ONNX Acceleration (Recommended)

Model Files

Usage

v6 Optimized (CPU-Friendly)

ONNX Full (GPU-Accelerated)

Legacy Version (v5)

Configuration Guide

For Different Hardware

Budget Laptop (CPU-only)

Mid-Range Desktop

Gaming Laptop/Desktop (GPU)

Output Files

Generated Logs

Performance Benchmarking

Documentation

Performance Optimization

Video Processing

Keyboard Controls

System Requirements

Minimum (CPU-only)

Recommended (GPU)

Optimal (High-end)

Troubleshooting

Low FPS (< 10 FPS)

GPU Not Being Used (ONNX)

High Memory Usage

Known Limitations

YOLOv8n-pose (Lightweight Model)

Frame Skipping

Facial Emotion

Future Enhancements

Planned Optimizations

Additional Features

Performance Summary

Contributing

Adding New Actions

Improving Performance

Documentation

License

Citation

Changelog

v6 Optimized (2025-11-23)

ONNX Full (2025-11-23)

Performance Documentation (2025-11-23)

Contact

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages