Using HoloClean for creating labels on tabular numerical datasets #45

Closed
asstergi opened this issue Jan 12, 2019 · 6 comments

Comments

@asstergi

@thodrek Following up on this issue from snorkel (snorkel-team/snorkel#803), I was wondering whether there are any examples of how to use HoloClean to create labels for tabular numerical datasets with the help of labelling functions.

Any guidance would be really appreciated.

@asstergi
Author

asstergi commented Feb 5, 2019

@thodrek could you please provide any guidance on the above question?

@jondoering

Any update on that? Would be really interested, too.

@DataDoctorNG

DataDoctorNG commented Jul 4, 2019

@thodrek Any update on using HoloClean for tabular data, ideally with examples? I have a project I would like to do with HoloClean and would be very interested in some examples of how to use it.

@thodrek

thodrek commented Jul 5, 2019

We are preparing a release that handles mixed categorical and numerical data. It is currently in dev and will soon be pushed to master.

@thodrek

thodrek commented Sep 21, 2019

The latest version on master handles both continuous and discrete values.

@thodrek closed this as completed Sep 21, 2019

@asstergi
Author

@thodrek Thank you for the reply.

One more question though. How should we approach a data labeling problem? If my understanding is correct, the initial value of a sample's label is kind of a 'prior' (in a loose sense) to its final value (after error correction). Is that correct?

If so, how should I initially set the values of the labels? Could I just set them to the same value and let HoloClean sort this out utilizing the constraints?
