User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction

This repository contains the code and data for user modelling to avoid user overfitting in interactive machine learning. Many algorithms and user interfaces expose the user to the training data or its statistics, which may lead to double use of the data and to overfitting if the user reinforces noisy patterns in the data. We propose a user modelling methodology that corrects this problem by assuming simple rational behaviour (see the paper [1] and the slides).
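One way to read "simple rational behaviour" is as a Bayesian update in log-odds. The notation below is an illustrative assumption; it is not taken from the repository and may not match the exact formulation in [1]. Here $p_j$ is the data-based relevance probability the system shows for feature $j$, $u_j$ is the user's own latent belief (under an assumed uniform baseline), and $\tilde{f}_j$ is the feedback the user reports after seeing $p_j$.

```latex
\begin{align}
\operatorname{logit}(\tilde{f}_j) &= \operatorname{logit}(u_j) + \operatorname{logit}(p_j),
  \qquad \operatorname{logit}(x) = \log\frac{x}{1 - x}, \\
\operatorname{logit}(u_j) &= \operatorname{logit}(\tilde{f}_j) - \operatorname{logit}(p_j).
\end{align}
```

Under this assumption, feeding $\tilde{f}_j$ directly back into the model would count the training data twice, whereas the corrected value $u_j$ carries only the information the user added. For example, if the system shows $p_j = 0.8$ and the user also reports $\tilde{f}_j = 0.8$, then $u_j = 0.5$: the user contributed no evidence of their own.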

We apply our approach to infer user knowledge on feature relevance (the probability that a word in Amazon reviews is relevant) in sparse linear regression. We use the probabilistic sparse linear regression model described in Daee, P., Peltola, T., Soare, M. et al. Mach Learn (2017) 106: 1599. https://doi.org/10.1007/s10994-017-5651-7.
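For orientation, a spike-and-slab formulation of sparse linear regression with relevance feedback can be written as below. This is a sketch only: the symbols ($\mathbf{w}$, $\gamma$, $\rho$, $\psi$, $\sigma$, $\pi$, $f$) and the exact form of the feedback likelihood are assumptions made for illustration. The cited paper and linreg_sns_ep.m remain the authoritative definitions.

```latex
\begin{align}
y_i &\sim \mathcal{N}\!\left(\mathbf{x}_i^\top \mathbf{w},\ \sigma^2\right), \\
w_j &\sim \gamma_j \, \mathcal{N}(0, \psi^2) + (1 - \gamma_j)\, \delta_0, \\
\gamma_j &\sim \operatorname{Bernoulli}(\rho), \\
f_j &\sim \gamma_j \operatorname{Bernoulli}(\pi) + (1 - \gamma_j) \operatorname{Bernoulli}(1 - \pi).
\end{align}
```

In this reading, $\gamma_j$ indicates whether word $j$ is relevant to the prediction, and the feedback $f_j$ is a noisy observation of $\gamma_j$ that is correct with probability $\pi$. The posterior over $\mathbf{w}$ and $\gamma$ has no closed form, which is why an approximation such as Expectation Propagation is needed.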

Data and user study

Data-Exp1 contains a description of each experiment, the data, and the user responses.

main.m runs the user study and generates the results. The user model is implemented in this script (in Method "User FB after correction").

word_analysis.m compares the user feedback in the two systems.

linreg_sns_ep.m performs the posterior approximation (using Expectation Propagation).

Citation

If you use this source code in your research, please consider citing us:

[1] Pedram Daee, Tomi Peltola, Aki Vehtari and Samuel Kaski. User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction, In Proceedings of the 23rd ACM International Conference on Intelligent User Interfaces (IUI). ACM, 2018. DOI: 10.1145/3172944.3172989. [Link] [preprint].

Team


Pedram Daee, Tomi Peltola

License

This project is licensed under the MIT License; see the LICENSE.md file for details.

