Confidence estimation for t‑SNE embeddings using random forest

B. Ozgode Yigin and Gorkem Saygili

Cognitive Sciences and Artificial Intelligence, Tilburg School of Humanities and Digital Sciences, Tilburg University, The Netherlands.

Abstract: Dimensionality reduction algorithms are commonly used for reducing the dimension of multi-dimensional data to visualize them on a standard display. Although many dimensionality reduction algorithms such as the t-distributed Stochastic Neighborhood Embedding aim to preserve close neighborhoods in low-dimensional space, they might not accomplish that for every sample of the data and eventually produce erroneous representations. In this study, we developed a supervised confidence estimation algorithm for detecting erroneous samples in embeddings. Our algorithm generates a confidence score for each sample in an embedding based on a distance-oriented score and a random forest regressor. We evaluate its performance on both intra- and inter-domain data and compare it with the neighborhood preservation ratio as our baseline. Our results showed that the resulting confidence score provides distinctive information about the correctness of any sample in an embedding compared to the baseline.
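
The neighborhood preservation ratio mentioned above as the baseline can be illustrated with a small example. This is a minimal sketch that assumes one common definition: for each sample, the fraction of its k nearest neighbors in the original space that are also among its k nearest neighbors in the embedding. The exact definition used in the paper may differ. The function names and the toy data are hypothetical and are not part of this repository.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, excluding itself.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighborhood_preservation(high_dim, low_dim, k):
    """Per-sample fraction of the k high-dimensional neighbors kept in the embedding."""
    high_nn = knn_indices(high_dim, k)
    low_nn = knn_indices(low_dim, k)
    return [len(h & l) / k for h, l in zip(high_nn, low_nn)]


# Toy data (hypothetical): two well-separated clusters in 3-D.
high = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (10, 10, 10), (10, 11, 10), (11, 10, 10)]
# A hand-made 2-D "embedding" in which sample 2 is placed next to the wrong cluster.
low = [(0, 0), (0, 1), (10, 8.5), (10, 10), (10, 11), (11, 10)]

scores = neighborhood_preservation(high, low, k=2)
for i, s in enumerate(scores):
    print(f"sample {i}: preservation ratio = {s:.2f}")
```

In this toy case only sample 2 receives a ratio of 0, while every other sample keeps a ratio of 1, because the misplaced sample is still among the two nearest neighbors of samples 0 and 1.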

This repository contains the code for our journal publication:

[1] B. Ozgode Yigin and G. Saygili, "Confidence estimation for t‑SNE embeddings using random forest", International Journal of Machine Learning and Cybernetics, 2022.

Available online at: https://link.springer.com/epdf/10.1007/s13042-022-01635-2?sharing_token=WHF414GgNmjoADmQasLa7ve4RwlQNchNByi7wbcMAY6tDVkBbSh45DjuKj43hFV3qga3b1UQE3Pb40D4zTiNcmW-0XY48mK9eedXGpzQbnRQ2y9SzJ9XZy8ZR0Z1JFgVtRhfTcs2HrmxHLausl2NjiPB9Y-igogtNeoT0-xTmV8%3D

Please cite our paper [1] if you use this code.

Created by Busra Ozgode Yigin and Gorkem Saygili on 11-09-22.

Datasets:

MNIST
AMB18: https://zenodo.org/record/4557712#.YUbplLgzZPY (AMB_integrated.zip)

Important note: This code is released under the MIT License:

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

How to use:

Run the conf_pred_with_existing_model function to apply the existing pre-trained models (trained on the AMB18 and MNIST datasets) to your test set.
Run the conf_pred_with_training function to train your own model on your own training set and make confidence predictions on your test set.
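
The general train, save, load and predict workflow behind these two options can be sketched as follows. This is a minimal sketch under stated assumptions and does not call the repository's functions, whose signatures are not described here. It assumes that the .sav files are pickled scikit-learn models and that a confidence score can be obtained as one minus a predicted error in [0, 1]. The features, targets and file name are hypothetical placeholders.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical per-sample features (for example, distance statistics)
# and hypothetical error targets in [0, 1].
X_train = rng.random((200, 5))
y_train = np.clip(X_train[:, 0] + 0.05 * rng.standard_normal(200), 0.0, 1.0)
X_test = rng.random((20, 5))

# Train a random forest regressor to predict the per-sample error score.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_rf_model.sav")  # hypothetical file name
    # Save the fitted model (assumption: .sav files are pickle files).
    with open(path, "wb") as f:
        pickle.dump(model, f)
    # Load it back, as one would do with a pre-trained model.
    with open(path, "rb") as f:
        loaded = pickle.load(f)

predicted_error = loaded.predict(X_test)
# Assumption: confidence is defined here as 1 - predicted error.
confidence = 1.0 - predicted_error
print(confidence[:5])
```

Pickle files can execute arbitrary code when loaded, so only load model files from sources you trust.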

Repository files:

Name	Last commit message	Last commit date
License	Create License	Sep 11, 2022
README.md	Update README.md	Sep 11, 2022
apply_tsne.py	final code uploaded	Sep 11, 2022
best_rf_model_for_AMB18.sav	final code uploaded	Sep 11, 2022
best_rf_model_for_mnist.sav	final code uploaded	Sep 11, 2022
calc_error.py	final code uploaded	Sep 11, 2022
conf_pred_with_existing_model.py	final code uploaded	Sep 11, 2022
conf_pred_with_training.py	final code uploaded	Sep 11, 2022
evaluate_model.py	final code uploaded	Sep 11, 2022
extract_features.py	final code uploaded	Sep 11, 2022

Repository: gsaygili/dimred
