Human-Centered Deferred Inference

This is the repository for the IUI 2023 paper "Human-Centered Deferred Inference: Measuring User Interactions and Setting Deferral Criteria for Human-AI Teams". Citation information will be added soon.

A cartoon example of deferred inference


Although deep learning holds the promise of novel and impactful interfaces, realizing such promise in practice remains a challenge: since dataset-driven deep-learned models assume a one-time human input, there is no recourse when they do not understand the input provided by the user. Works that address this via deferred inference—soliciting additional human input when uncertain—show meaningful improvement, but ignore key aspects of how users and models interact. In this work, we focus on the role of users in deferred inference and argue that the deferral criteria should be a function of the user and model as a team, not simply the model itself. In support of this, we introduce a novel mathematical formulation, validate it via an experiment analyzing the interactions of 25 individuals with a deep learning-based visiolinguistic model, and identify user-specific dependencies that are under-explored in prior work. We conclude by demonstrating two human-centered procedures for setting deferral criteria that are simple to implement, applicable to a wide variety of tasks, and perform equal to or better than equivalent procedures that use much larger datasets.
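To make the idea of deferred inference concrete, here is a minimal sketch of a deferral loop. It is an illustration only, not code from this repository: `deferred_inference`, `model`, `ask_user`, `threshold`, and `max_deferrals` are hypothetical names, and the sketch assumes that a higher deferral score means the model is less certain about the input.

```python
from typing import Callable, Tuple


def deferred_inference(
    model: Callable[[str], Tuple[str, float]],
    ask_user: Callable[[], str],
    threshold: float,
    max_deferrals: int = 1,
) -> Tuple[str, int]:
    """Run inference, asking the user again while the model is uncertain.

    `model` maps a user query to (prediction, deferral_score).
    Assumption: a higher score means the model is less certain.
    Returns (final_prediction, number_of_deferrals).
    """
    deferrals = 0
    prediction, score = model(ask_user())
    # Defer (solicit another input) while the score exceeds the criterion
    # and the deferral budget has not been used up.
    while score > threshold and deferrals < max_deferrals:
        deferrals += 1
        prediction, score = model(ask_user())
    return prediction, deferrals
```

In this framing, a human-centered criterion would choose `threshold` per user or per user-model team rather than fixing it from the model alone.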


The conda environment I used at all steps (collection and analysis) is included in environment.yml. It contains some extraneous packages from previous experiments, and the apex library is unlikely to install correctly. apex is not needed for the analysis steps, but it is required for running the data collection.


All analysis steps assume that you have downloaded the experimental data to the folder user_trackers. Outlier users (removed in our analysis) can be downloaded here. Release of anonymized data was approved by the University of Michigan IRB.

Removing outliers

If you download the outlier users, you can see our script identify them via the command python analysis/ from the repository root directory. Note that this does not alter the filesystem; the outlier files must be deleted manually before performing any other analysis.

Demographic data

python analysis/

RQ1: Is user satisfaction related to error and deferral rate?

python analysis/

RQ2: What are the time dependencies of error, e, and deferral score, s?

python analysis/ produces Fig. 5 and the points where the conditions are met.

python analysis/ produces Fig. 6.

RQ3: Do deferral scores vary meaningfully between users?

python analysis/ produces the statistical results; python analysis/ produces Fig. 7 and compares different users with the same error.

RQ4: How do users respond when inference is deferred?

python analysis/ outputs the results, as well as the TSV containing correlated initial queries and deferral responses used for Table 1 (also uploaded here).

python analysis/ performs tests to determine if deferral improves accuracy.

RQ5: Does knowing the user provide additional information about the mapping between probability of error and deferral score?

python analysis/

Targeting a deferral rate

python analysis/
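As a rough illustration of what targeting a deferral rate can mean, the sketch below picks a score threshold so that approximately a chosen fraction of previously observed deferral scores would trigger a deferral. This is not necessarily the procedure implemented in the analysis script: `threshold_for_deferral_rate` is a hypothetical helper, and the sketch assumes that inference is deferred when the score is strictly greater than the threshold.

```python
def threshold_for_deferral_rate(scores, target_rate):
    """Pick a threshold so that about `target_rate` of `scores` exceed it.

    Assumption: inference is deferred when score > threshold.
    With tied scores the achieved rate can differ from the target.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0.0 <= target_rate <= 1.0:
        raise ValueError("target_rate must be in [0, 1]")

    ordered = sorted(scores)
    n_defer = round(target_rate * len(ordered))  # how many inputs to defer

    if n_defer == 0:
        return ordered[-1]       # nothing is strictly above the maximum
    if n_defer == len(ordered):
        return float("-inf")     # every score is above -inf
    # The largest score that should NOT trigger a deferral.
    return ordered[len(ordered) - n_defer - 1]
```

For example, calling the helper on one user's observed scores with `target_rate=0.2` yields a threshold that would have deferred about one in five of that user's inputs.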

Experimental Apparatus

Downloading weights

Model weights for UNITER can be downloaded here. They should be downloaded to the directory net_weights/ckpt/.

Download images and features

Download the images and features, which should be unzipped to scenarios/iui_2023_scenario. The features correspond to the tasks used in our study (scenarios/iui_2023_scenario.csv).
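Before running anything, it can help to confirm that the downloads landed in the locations this README names. The following is a small convenience sketch, not part of the repository: `EXPECTED_PATHS` and `missing_paths` are hypothetical names, and the list simply collects the paths mentioned in this README (user_trackers is only needed for the analysis steps).

```python
from pathlib import Path

# Paths named in this README, relative to the repository root.
EXPECTED_PATHS = [
    "user_trackers",                    # experimental data (analysis steps)
    "net_weights/ckpt",                 # UNITER model weights
    "scenarios/iui_2023_scenario",      # unzipped images and features
    "scenarios/iui_2023_scenario.csv",  # tasks used in the study
]


def missing_paths(repo_root="."):
    """Return the expected paths that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths():
        print(f"missing: {path}")
```

Run it from the repository root; no output means every listed path was found.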

Running the webapp

Add the current directory (human-centered-deferred-inference) to the Python path; this is necessary for the project's imports to resolve:

export PYTHONPATH=$(pwd)

Update the ssl_context in of (last line).

python --scenario_category iui_2023_scenario --consent_form regular --rqd_constraint 1

NB: terminology in code

In the codebase, you will occasionally encounter the term requery. The terminology of our work changed while the paper was being written, and requery can be read interchangeably with deferral.

