Fix bug default marking of unlabeled data #20494

Open
wants to merge 1 commit into
base: main
from

Conversation

trantrikien239

Reference Issues/PRs

Fixes #17951

What does this implement/fix? Explain your changes.

Implement unlabeled_mark as a parameter of BaseLabelPropagation.
It lets users specify how they mark unlabeled data,
which also fixes the problem described in issue #17951.

Any other comments?

Thank you for reviewing this PR. If anything arises, please email me at
trantrikien239@gmail.com

Implement `unlabeled_mark`, a way for users to specify how they
mark unlabeled data, which also fixes the bug described in issue scikit-learn#17951

Close: scikit-learn#17951
Member

@glemaitre glemaitre left a comment

We need several tests to check that the new parameter behaves as expected.

@@ -97,6 +97,15 @@ class BaseLabelPropagation(ClassifierMixin, BaseEstimator, metaclass=ABCMeta):
tol : float, default=1e-3
Convergence tolerance: threshold to consider the system at steady
state.

unlabeled_mark: {-1, 'unk'} or custom value, default=-1
Member

I assume that it would be better to accept a list of the potential values to be considered as unlabeled.
This looks similar to what we do for the detection of missing values.

We can also accept a single item that we can wrap in a list.
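
A minimal sketch of that idea, in plain Python so it stays self-contained. The helper names `normalize_unlabeled_marks` and `unlabeled_mask` are hypothetical and are not part of this PR or of scikit-learn; the sketch only illustrates wrapping a single marker in a list and then flagging which labels count as unlabeled.

```python
from collections.abc import Iterable


def normalize_unlabeled_marks(unlabeled_mark=-1):
    """Return the unlabeled markers as a list (hypothetical helper)."""
    # Strings are iterable, but 'unk' should count as one marker, not three.
    if isinstance(unlabeled_mark, (str, bytes)):
        return [unlabeled_mark]
    if isinstance(unlabeled_mark, Iterable):
        return list(unlabeled_mark)
    # A single scalar marker is wrapped in a list.
    return [unlabeled_mark]


def unlabeled_mask(y, unlabeled_mark=-1):
    """Return one boolean per label: True where the label is a marker."""
    marks = normalize_unlabeled_marks(unlabeled_mark)
    return [label in marks for label in y]


y = [0, 1, -1, "unk", 1]
print(unlabeled_mask(y))               # [False, False, True, False, False]
print(unlabeled_mask(y, [-1, "unk"]))  # [False, False, True, True, False]
```

Treating a string as a single marker is a design choice made here; an actual implementation would likely operate on NumPy arrays and go through the estimator's parameter validation.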

@trantrikien239
Author

Hi @glemaitre, sorry for the late reply. I can certainly add the list for unlabeled_mark. But I'm not sure how to resolve the unsuccessful test. What should I do?

Development

Successfully merging this pull request may close these issues.

Provide a sentinel to detect unknown label in LabelPropagation
3 participants