dumb theoretical questions #27

Closed
hlsfin opened this issue Jan 18, 2022 · 2 comments

hlsfin commented Jan 18, 2022

I have not read the paper entirely so forgive me; but is there a way that information seeps from the unlabeled dataset to the label dataset? And if it doesn't, can we just have any dataset that could take the place of the unlabeled dataset, or does it have to look 'similar' to the label dataset? Thank you.


dagleaves commented Jan 18, 2022

tl;dr: 1. yes... 2. maybe, but it is hard to say for certain.

I am not the creator of this repository, but I have read the original paper and used the idea. I will answer with the assumption that Meta Pseudo Labels works properly, as discussed in the original paper. With that in mind, here is what I would say about your question as I understand it.

There is no way information can be transferred ("seep") between the datasets themselves without outright moving data from one set to the other. However, from your question, I believe you are referring more to information leakage through the model itself, which is actually a goal of MPL.

With semi-supervised, dual-network architectures (including, but not limited to, MPL), the goal is to approximate the labels of unknown samples using a model trained on labeled samples. This works best if the unlabeled dataset looks "similar" to the labeled dataset, as you said.

MPL introduces a feedback signal between the student network and the teacher network that acts as a metric for how beneficial the teacher's pseudo-labels are for the student's performance on the labeled data. The idea is that if the student performs worse on the labeled data after being trained on a batch of pseudo-labels, then those pseudo-labels must have been wrong, and the teacher is adjusted accordingly. Assuming MPL works the way the paper describes, information is certainly transferred from the unlabeled data to the teacher model (which was trained only on the labeled data). This theoretically generalizes both the teacher and the student models to the unlabeled data, as you are asking, though for some reason the paper only considers the student model.

Whether the unlabeled set has to look similar to the labeled set comes down to whether MPL works the way we/they think it does. From what I can tell: would it help for it to look similar? Absolutely. Is it necessary? Not entirely. The point at which it becomes too dissimilar is unknown and likely varies from dataset to dataset.

To give a concrete example: if you were training a model to classify images containing cats, and the labeled dataset contained only house cats while the unlabeled set contained wild cats, you would probably be fine. It might even perform close to how it would if the labeled set contained both. However, if your unlabeled dataset contained only dogs, I do not know how you would fare. In my testing, with my specific application, I found it performed better when I seeded the unlabeled set with a portion of the labeled data (i.e., moved some of the labeled data into the unlabeled dataset). Was that because the unlabeled set was too dissimilar? Maybe. Was it because the unlabeled set was unbalanced (too few ground-truth positive samples)? Also maybe. It is hard to say.

I hope this helps.


hlsfin commented Jan 20, 2022

So, if my understanding is correct: yes, information does leak from the unlabeled validation set to the labeled training set, but not in the form of a direct "here are the labels for the validation set"; rather, it is in a similarity-versus-dissimilarity sense, per image. Correct?

(By the way, I really appreciate that you took the time to write all this.)

hlsfin closed this as completed Feb 22, 2022