dorszewski/cci


Seeing Conversations: Communication Context Identification in Egocentric Video

Code for training and evaluating the Communication Context Network (CoCoNet) for Communication Context Identification (CCI): Given an egocentric video stream with detected face bounding boxes, determine whether each individual is part of the camera wearer’s conversation group.

CCI is predicted for a set of face features extracted from egocentric video, as described in the Seeing Conversations paper.

Usage

Environment setup

The code was developed and tested with Python 3.12.1.

python3 -m venv .venv
source .venv/bin/activate   # Linux
# .\.venv\Scripts\activate    # Windows

pip install -e .
pip install -r requirements.txt
# install torch appropriately for your GPU
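
After installing torch, it can help to confirm which device it will use before starting a long training run. The following is a minimal sketch, not part of this repository: the helper name `describe_torch_device` is hypothetical, and it only relies on the standard `torch.cuda.is_available()` and `torch.cuda.get_device_name()` calls.

```python
import importlib.util


def describe_torch_device() -> str:
    """Return a short description of the device torch would use.

    Hypothetical helper for illustration; it is not part of the cci package.
    """
    # Avoid an ImportError if torch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch

    if torch.cuda.is_available():
        # Report the name of the first visible CUDA device.
        return f"cuda: {torch.cuda.get_device_name(0)}"
    return "cpu"


if __name__ == "__main__":
    print(describe_torch_device())
```

If this reports `cpu` on a machine with a GPU, the installed torch build probably does not match the local CUDA setup.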

Download data

Download the data from https://doi.org/10.11583/DTU.31545667, unzip it, and place it in ./data.
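
The unzip step can also be scripted with the Python standard library. This is a minimal sketch under stated assumptions: the archive file name in the usage comment is hypothetical, and the internal layout of the archive is not documented here, so check that the extracted files end up directly under ./data as the training code expects.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive: Path, target: Path = Path("data")) -> list[str]:
    """Extract a downloaded zip archive into the target directory.

    Returns the list of member names so the resulting layout can be inspected.
    """
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return zf.namelist()


# Usage (the archive name is an assumption; use the name of the file you downloaded):
# members = extract_dataset(Path("cci_data.zip"))
# print(members[:10])
```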

Training and Inference

The run configuration can be changed in ./config/run_config/coconet.yaml. For example, on a smaller GPU or on CPU, reduce batch_size and segment_length_load. With the default parameters used in the paper, CoCoNet converges within 2 h on one 32 GB GPU.
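
To keep the paper defaults intact, one option is to write a reduced copy of the config and pass that copy to `--config`. The sketch below is an illustration only. It assumes that `batch_size` and `segment_length_load` appear in the YAML file as simple `key: value` lines. The helper name `override_yaml_scalars`, the output file name, and all numeric values are hypothetical and are not the defaults from the repository.

```python
import re


def override_yaml_scalars(text: str, overrides: dict) -> str:
    """Replace the value on simple `key: value` lines.

    Indentation and trailing comments are preserved. Raises KeyError if a key
    is not found as a plain `key: value` line (assumed layout, not verified
    against the actual coconet.yaml).
    """
    for key, value in overrides.items():
        pattern = re.compile(
            rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*)([^#\n]*?)([ \t]*(?:#.*)?)$",
            re.MULTILINE,
        )
        # Keep the key prefix and any trailing comment, swap only the value.
        text, count = pattern.subn(
            lambda m, v=value: f"{m.group(1)}{v}{m.group(3)}", text
        )
        if count == 0:
            raise KeyError(key)
    return text


# Usage (values and output file name are placeholders chosen for illustration):
# from pathlib import Path
# src = Path("config/run_config/coconet.yaml")
# dst = Path("config/run_config/coconet_small.yaml")
# dst.write_text(override_yaml_scalars(
#     src.read_text(), {"batch_size": 8, "segment_length_load": 50}))
```

A line-based edit is used here so that comments in the original file survive. Loading and re-dumping the file with a YAML library would also work but would typically drop them.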

To run training and inference:

python -m cci.temp_cci.main --config "config/run_config/coconet.yaml"

The model checkpoint and evaluation results will be placed in ./results/runs/coconet.

Citation

@inproceedings{dorszewski2026seeing,
  author={Tobias Dorszewski and Jens Hjortkj\ae r},
  title={Seeing Conversations: Communication Context Identification in Egocentric Video},
  year={2026},
  booktitle={CVPR}
}
