
Documentation unclear about LabelModel strategy #1462

Closed
cdeepakroy opened this issue Sep 13, 2019 · 9 comments
@cdeepakroy commented Sep 13, 2019

Issue description

It is not clear from the documentation what strategy is used in snorkel.labeling.LabelModel().

The description in the docstring says it uses "A conditionally independent LabelModel to learn LF weights and assign training labels." I am guessing that in this strategy the labeling functions are assumed to be independent when conditioned on the true label, as described in the section named Independent Labeling Functions on page 4 of the paper Data Programming: Creating Large Training Sets, Quickly.

(screenshot of the relevant section of the paper)
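To spell out the assumption I mean (this is my paraphrase, not the paper's exact parameterization): with LF outputs $\lambda_1, \dots, \lambda_m$ and the unobserved true label $Y$, the generative model factorizes as

$$P(\lambda_1, \dots, \lambda_m, Y) \;=\; P(Y) \prod_{i=1}^{m} P(\lambda_i \mid Y),$$

i.e. the LFs are independent of one another once $Y$ is given.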

However, the blog post Introducing the New Snorkel says that in snorkel v0.9 a robust PCA / low-rank + sparse approach is used to automatically learn the dependency / correlation structure between the labeling functions, as described in the paper Learning Dependency Structures for Weak Supervision Models. This approach seems much more promising to me than the aforementioned conditionally independent version. I want to use it for my use case.

Can anyone confirm which of the above two strategies is implemented by snorkel.labeling.LabelModel()?

@glf1030 commented Sep 17, 2019

From the code, I guess they have implemented the approach in "Learning the Structure of Generative Models without Labeled Data".

@cdeepakroy (Author)

@glf1030, thanks for the reply. Can you please point me to where in the code the ideas of using the inverse covariance matrix and robust PCA to infer the graph structure are implemented? I tried to dig into the code but it was not clear. Also, I wish there were a way to visualize the learned graph structure for debugging purposes.

@glf1030 commented Sep 19, 2019

> @glf1030, thanks for the reply. Can you please point me to where in the code the ideas of using the inverse covariance matrix and robust PCA to infer the graph structure are implemented? I tried to dig into the code but it was not clear. Also, I wish there were a way to visualize the learned graph structure for debugging purposes.

Hi, I am also struggling to understand the code. I read the paper "Training Complex Models with Multi-Task Weak Supervision" and I guess Algorithm 1 from that paper is implemented in the current code:

```python
def _loss_mu(self, l2: float = 0) -> torch.Tensor:
    """Overall mu loss.

    Parameters
    ----------
    l2
        A float or np.array representing the per-source regularization
        strengths to use, by default 0

    Returns
    -------
    torch.Tensor
        Overall mu loss between learned mu and initial mu
    """
    loss_1 = torch.norm((self.O - self.mu @ self.P @ self.mu.t())[self.mask]) ** 2
    loss_2 = torch.norm(torch.sum(self.mu @ self.P, 1) - torch.diag(self.O)) ** 2
    return loss_1 + loss_2 + self._loss_l2(l2=l2)
```
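For anyone else reading along, my rough understanding of the tensors involved (treat the exact construction as my guess from reading the class, not as documented behavior) is something like this:

```python
import numpy as np

# Toy example: n = 4 data points, m = 2 LFs, k = 2 classes.
# L_aug is the "augmented" label matrix: each LF's vote is one-hot
# encoded over the k classes (an abstain would be an all-zero block),
# so L_aug has shape (n, m * k).
L_aug = np.array(
    [
        [1, 0, 1, 0],  # both LFs vote class 0
        [1, 0, 0, 1],  # LF 1 votes 0, LF 2 votes 1
        [0, 1, 0, 1],  # both LFs vote class 1
        [0, 1, 1, 0],  # LF 1 votes 1, LF 2 votes 0
    ],
    dtype=float,
)
n = L_aug.shape[0]

# O is the empirical overlap matrix of the one-hot LF indicators,
# i.e. how often each pair of (LF, value) events co-occurs.
O = L_aug.T @ L_aug / n

# P would be a diagonal matrix of the assumed class balance P(Y = y),
# and mu the (m*k) x k matrix of parameters P(lf_i = v | Y = y) being
# learned; _loss_mu compares O against mu @ P @ mu.T on the masked
# entries (ignoring the within-LF blocks).
P = np.diag([0.5, 0.5])
print(O)
```

If that reading is wrong, corrections welcome.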

I am not quite sure...

OK, I think I am getting a little closer.
(screenshot: Algorithm 1 from the paper)

`_loss_mu` is there to update z in Algorithm 1 above.

@ajratner (Contributor)

Hi @cdeepakroy @glf1030 thanks for the deep dive here! The current implemented model is based on the algorithm in the paper @glf1030 is pointing to, Training Complex Models with Multi-Task Weak Supervision, published in AAAI'19. We will add a line or two similar to the above clarifying this in the docstring.

In this approach, given a set of dependencies between the labeling functions (LFs), we compute the statistics of how different cliques of labeling functions agree and disagree with each other, and then use a matrix completion-style approach to recover the LabelModel parameters from this observed matrix (more precisely: we compute the inverse generalized covariance matrix of the junction tree of the LF dependency graph, and perform a matrix completion-style approach wrt this).

Regarding the model being learned: currently we learn a model in which we assume the LFs are conditionally independent given the unobserved true label Y, a common assumption in weak supervision / crowd modeling approaches. And, going beyond the data programming paper we published in NeurIPS'16 (referenced above), we actually estimate different LF accuracies for each label class, i.e. we estimate $P(\lambda \mid Y)$ for all $\lambda, Y$. In future releases coming soon, we will also add support for (a) modeling LF dependencies and (b) estimating the LF dependency structure, as we have supported in previous versions of the code and published on (e.g. see our ICML'17 and ICML'19 papers). Hope this helps!
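To make the matrix-completion description concrete, here is a small self-contained toy sketch of that style of recovery under the conditionally independent assumption. It is an illustration only, not the library's actual code; the masking, clamping, and symmetry-breaking in the real LabelModel are more involved, but the objective has the same shape as `_loss_mu` above:

```python
import torch

torch.manual_seed(0)

# Toy setup: m = 3 LFs, k = 2 classes, assumed class balance p.
m, k = 3, 2
p = torch.tensor([0.6, 0.4])
P = torch.diag(p)

# "True" per-class accuracies, used only to generate a consistent O:
# mu_true[i * k + v, y] = P(lf_i = v | Y = y); each LF's k-row block
# has columns that sum to 1.
mu_true = torch.tensor(
    [[0.8, 0.3], [0.2, 0.7],   # LF 1
     [0.7, 0.2], [0.3, 0.8],   # LF 2
     [0.9, 0.4], [0.1, 0.6]]   # LF 3
)

# Under conditional independence, the between-LF blocks of the overlap
# matrix are mu_true @ P @ mu_true.T ...
O = mu_true @ P @ mu_true.t()
# ... while the within-LF blocks differ: an LF emits exactly one value
# per example, so cross terms vanish and the diagonal equals the
# marginal P(lf_i = v) = sum_y P(lf_i = v | Y = y) P(Y = y).
marginals = mu_true @ p
for i in range(m):
    blk = slice(i * k, (i + 1) * k)
    O[blk, blk] = torch.diag(marginals[blk])

# Mask selecting only the between-LF entries (the "observed" part).
mask = torch.ones(m * k, m * k, dtype=torch.bool)
for i in range(m):
    blk = slice(i * k, (i + 1) * k)
    mask[blk, blk] = False

# Recover mu by gradient descent on the same style of objective as
# _loss_mu: match the observed between-LF overlaps and the marginals.
mu = torch.rand(m * k, k, requires_grad=True)
opt = torch.optim.Adam([mu], lr=0.05)
for _ in range(3000):
    opt.zero_grad()
    loss_1 = torch.norm((O - mu @ P @ mu.t())[mask]) ** 2
    loss_2 = torch.norm(torch.sum(mu @ P, 1) - torch.diag(O)) ** 2
    (loss_1 + loss_2).backward()
    opt.step()
    with torch.no_grad():
        mu.clamp_(0.0, 1.0)  # keep the estimates in [0, 1]

# The recovered mu is only identifiable up to symmetries (e.g. swapping
# the two label classes), which the real implementation breaks via its
# initialization and constraints.
print(mu.detach())
```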

@glf1030 commented Sep 23, 2019

Thanks for your reply. Regarding "we will also add support for (a) modeling LF dependencies and (b) estimating the LF dependency structure, as we have supported in previous versions of the code and published on (e.g. see our ICML'17 and ICML'19 papers)": I wonder which version of the code implemented such a strategy? @ajratner
From page 17 of the paper "Training Complex Models with Multi-Task Weak Supervision", the loss function is:
(screenshot: the loss function from page 17 of the paper)

The corresponding code, I think, is:
(screenshot of the corresponding code)

I wonder:

1. Why doesn't the script calculate the inverse matrix?
2. Does loss_2 equal $Z Z^\top$?

@cdeepakroy (Author)

@ajratner Thank you for the clarification. I will be looking forward to the future release that provides implementations for (a) modeling LF dependencies and (b) estimating the LF dependency structure.

@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@chaturv3di commented Nov 25, 2019

FWIW, I'm looking forward to a LabelModel implementation that handles LF dependencies. If I remember correctly, this was there in v0.7 and was immensely helpful.

I'm not very hands-on with PyTorch, but it seems to me that it's a matter of updating the loss function _loss_mu() mentioned above. Or am I missing some subtle complexity, @ajratner, @cdeepakroy, @glf1030? I'm trying to understand which of these categories this PR would fall under:

  1. Try it only if you're comfortable with PyTorch (and DL algebra in general).
  2. It's simple; read a PyTorch tutorial and try it yourself. The current implementation has all the variables you'll need.

I can try raising a PR if it's 2.

@ajratner (Contributor)

Hi @chaturv3di first of all, thanks so much for the offer of help! Unfortunately, the integration with the current v0.9 label model in a robust form (i.e. non-research code) is something that we think has some non-trivial aspects and design decisions involved, and so it's something the core team plans to handle. I don't have a timeline for this as yet, but will keep you updated!
