GitHub - Mrshenshen/CoHa

Introduction

This is the source code for "Collaborated with Hallucination: Enhancing Egocentric Grounded Question Answering via Error Demonstrations".

The CoHa model is structured with three integral components:

Uncertainty Modeling (UM): Uses Subjective Logic to model how unreliable a generated response is, which lets the model determine how strongly to penalize improper answers.
Diffusion-based Contrastive Constraint (DCC): Applies a constraint whose strength depends on the estimated uncertainty, in order to mitigate hallucinations.
Interactive Refinement Module (IRM): Adds finer-grained cues observed from the first-person view.
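
To make the idea behind UM concrete, here is a minimal sketch of the standard Subjective Logic formulation used in evidential learning, where non-negative per-class evidence is turned into belief masses plus one explicit uncertainty mass. This is an assumption about the general technique, not the repository's actual implementation; the function name `subjective_opinion` and the example evidence values are hypothetical.

```python
def subjective_opinion(evidence):
    """Map non-negative per-class evidence to (belief masses, uncertainty).

    Standard Subjective Logic / Dirichlet formulation (an assumption here,
    not taken from the CoHa code):
        alpha_k = e_k + 1
        S       = sum_k alpha_k
        b_k     = e_k / S
        u       = K / S
    so that sum_k b_k + u == 1.
    """
    if not evidence:
        raise ValueError("evidence must contain at least one class")
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")

    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]      # Dirichlet parameters
    strength = sum(alpha)                    # Dirichlet strength S
    belief = [e / strength for e in evidence]
    uncertainty = num_classes / strength     # mass not assigned to any class
    return belief, uncertainty


# Hypothetical evidence values, for illustration only.
confident_belief, confident_u = subjective_opinion([8.0, 0.0])
unsure_belief, unsure_u = subjective_opinion([0.0, 0.0])
```

With no evidence at all, the whole mass goes to uncertainty; as evidence for one class grows, uncertainty shrinks. A quantity of this kind could then serve as the signal that scales a penalty, as the DCC description suggests.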

Proposed Model (CoHa)

Uncertainty Modeling (UM)
Diffusion-based Contrastive Constraint (DCC)
Interactive Refinement Module (IRM)

Motivation

Unlike the third-person-view datasets widely available on the Internet, egocentric datasets are recorded from the first-person view. However, as demonstrated by Egothink, existing Ego-GQA methods often treat first-person and third-person video understanding as equivalent, neglecting the critical role of human-centric reasoning in egocentric contexts. This oversight makes the model more susceptible to hallucinations when understanding egocentric videos: it attends inaccurately to the objects or actions the subject is interacting with.

Results

Ablation Study on QAEgo4D and NLQv2

Grounding and QA Examples

Usage

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python run.py \
    model=MILU \
    'dataset.qa_train_splits=[QaEgo4D_train]' \
    dataset.batch_size=4 \
    trainer.gpus=4

Testing

CUDA_VISIBLE_DEVICES=0 HYDRA_FULL_ERROR=1 python run.py \
    model=MILU \
    'dataset.qa_train_splits=[QaEgo4D_train]' \
    'dataset.test_splits=[QaEgo4D_test]' \
    dataset.batch_size=8 \
    +trainer.test_only=True \
    '+trainer.checkpoint_path="./"'
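
For scripted evaluation, the testing command above can also be assembled from Python. The following is a minimal sketch that reuses only the overrides shown in this README; the helper names `build_test_command` and `run_test` are hypothetical and not part of the repository, and the checkpoint path remains a placeholder that you must point at your own checkpoint.

```python
import os
import subprocess
import sys


def build_test_command(checkpoint_path, batch_size=8):
    """Return the argv list for the test run shown in this README.

    Hypothetical helper: it only mirrors the overrides listed above.
    """
    return [
        sys.executable, "run.py",
        "model=MILU",
        "dataset.qa_train_splits=[QaEgo4D_train]",
        "dataset.test_splits=[QaEgo4D_test]",
        f"dataset.batch_size={batch_size}",
        "+trainer.test_only=True",
        # The inner double quotes are passed through, as in the shell command.
        f'+trainer.checkpoint_path="{checkpoint_path}"',
    ]


def run_test(checkpoint_path, gpu="0"):
    """Launch the test run with the same environment variables as above."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu, HYDRA_FULL_ERROR="1")
    return subprocess.run(build_test_command(checkpoint_path), env=env, check=True)


# Build (but do not launch) the command with the README's placeholder path.
command = build_test_command("./")
```

Passing the arguments as a list avoids a second layer of shell quoting, so the bracketed split names do not need the single quotes used on the command line.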

