Naming consistency: only 80-90 matching trials between the human and modeling experiments? We need to debug the stimulus-matching issue; there are known discrepancies between the stimulus naming conventions used for the model and human evaluations (see the sketch below).
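A minimal sketch for debugging the name mismatch, assuming hypothetical dataframes `human_df` / `model_df` loaded from hypothetical CSV paths, and a hypothetical `stimulus_name` column on the human side ("Canon Stimulus Name" is the column mentioned in the comment below):

```python
import pandas as pd

human_df = pd.read_csv("human_responses.csv")    # hypothetical path
model_df = pd.read_csv("model_predictions.csv")  # hypothetical path

human_names = set(human_df["stimulus_name"])        # hypothetical column name
model_names = set(model_df["Canon Stimulus Name"])  # column named in this thread

print(f"human-only stimuli: {len(human_names - model_names)}")
print(f"model-only stimuli: {len(model_names - human_names)}")
print(f"matched stimuli:    {len(human_names & model_names)}")

# Eyeball a few unmatched names to spot the convention mismatch
for name in sorted(human_names - model_names)[:10]:
    print(name)
```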
Replicate the RMSE analysis for all trials (not just the adversarial subset) to verify that we recover the same pattern of model-human consistency across models (particle-based models doing best, convnets worse, etc.).
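A minimal sketch of that per-trial RMSE, assuming the dataframes above, that human stimulus names have already been mapped onto `Canon Stimulus Name`, and hypothetical `human_resp` / `model_resp` / `model` columns:

```python
import numpy as np
import pandas as pd

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Mean human response per stimulus, over ALL trials (not just adversarial)
human_means = human_df.groupby("Canon Stimulus Name")["human_resp"].mean()

results = {}
for model_name, grp in model_df.groupby("model"):
    preds = grp.set_index("Canon Stimulus Name")["model_resp"]
    merged = pd.concat([preds, human_means], axis=1, join="inner")
    results[model_name] = rmse(merged["model_resp"], merged["human_resp"])

# If the paper's pattern holds, particle models should sort lowest, convnets higher
print(pd.Series(results).sort_values())
```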
Visualize the raw correlation between human and model predictions on the adversarial trials, contextualized among all trials.
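A minimal sketch of that visualization, reusing `merged` from the RMSE sketch (one model at a time) and assuming a hypothetical set `adversarial_ids` of adversarial stimulus names:

```python
import matplotlib.pyplot as plt
from scipy.stats import pearsonr

is_adv = merged.index.isin(adversarial_ids)  # hypothetical adversarial-trial IDs

# Adversarial trials highlighted against the backdrop of all trials
plt.scatter(merged.loc[~is_adv, "human_resp"], merged.loc[~is_adv, "model_resp"],
            alpha=0.3, label="all other trials")
plt.scatter(merged.loc[is_adv, "human_resp"], merged.loc[is_adv, "model_resp"],
            color="red", label="adversarial trials")

r_all, _ = pearsonr(merged["human_resp"], merged["model_resp"])
r_adv, _ = pearsonr(merged.loc[is_adv, "human_resp"], merged.loc[is_adv, "model_resp"])
plt.xlabel("mean human response")
plt.ylabel("model prediction")
plt.title(f"r(all) = {r_all:.2f}, r(adversarial) = {r_adv:.2f}")
plt.legend()
plt.show()
```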
Compute the noise ceiling on the adversarial trials only (same as in the paper), then normalize these metrics by the noise-ceiling estimates.
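A minimal sketch using a split-half, Spearman-Brown-corrected estimator; whether this matches the paper's exact noise-ceiling procedure is an assumption, and `participant_id` is a hypothetical column:

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

# Restrict to the adversarial trials, per the paper's analysis
adv = human_df[human_df["Canon Stimulus Name"].isin(adversarial_ids)]

rng = np.random.default_rng(0)
corrs = []
for _ in range(1000):  # bootstrap over random participant splits
    subs = adv["participant_id"].unique()
    rng.shuffle(subs)
    half1, half2 = np.array_split(subs, 2)
    m1 = adv[adv["participant_id"].isin(half1)].groupby("Canon Stimulus Name")["human_resp"].mean()
    m2 = adv[adv["participant_id"].isin(half2)].groupby("Canon Stimulus Name")["human_resp"].mean()
    pair = pd.concat([m1.rename("h1"), m2.rename("h2")], axis=1).dropna()
    r, _ = pearsonr(pair["h1"], pair["h2"])
    corrs.append(2 * r / (1 + r))  # Spearman-Brown correction for halving the sample

noise_ceiling = np.mean(corrs)
print(f"estimated noise ceiling: {noise_ceiling:.3f}")
# Then divide each model's human-consistency metric by `noise_ceiling`, e.g.:
# normalized = {m: r_m / noise_ceiling for m, r_m in model_correlations.items()}
```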
Re the naming consistency: as I understand it, the 80-90 matching trials figure is fine. The dataframes contain information for all 150 (×8) stem IDs (with the model stimuli already properly renamed under Canon Stimulus Name). The 80-90 refers to the number of human observations per stimulus ID. We started out with about 100 participants per scenario and excluded some for various reasons (see the OSF file), so those numbers make sense.
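A quick sanity check along those lines, assuming the same hypothetical `human_df` and column names as above:

```python
# Human observations per stimulus ID should land around 80-90 after exclusions
counts = human_df.groupby("Canon Stimulus Name")["participant_id"].nunique()
print(counts.describe())      # expect counts in the 80-90 range
assert len(counts) == 150 * 8 # all stem IDs should be present
```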