Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MuE distributions for Pyro. #2728

Merged
merged 93 commits into from
Mar 23, 2021
Merged

MuE distributions for Pyro. #2728

merged 93 commits into from
Mar 23, 2021

Conversation

EWeinstein
Copy link
Contributor

@EWeinstein EWeinstein commented Dec 31, 2020

An implementation of MuE models for pyro. The code is organized as:

  1. contrib/mue/statearrangers.py Modules for converting standard MuE parameters into standard HMM parameters. Currently, this is just implemented for the profile HMM architecture, but in the future other MuE models will be added.
  2. contrib/mue/variablelengthhmm.py An HMM with discrete states and discrete emissions which can handle variable length sequences, assuming that they are one hot encoded and padded at the end with zeros to some fixed length. Based on DiscreteHMM. In the future this will be extended with functions for sampling, filtering, estimating the latent state and computing the first moment of the distribution.
  3. contrib/mue/biosequenceloaders.py Currently empty; in the future, this will include torch dataloaders for fasta files and for a2m (MSA) files.

Example models (profile HMM, FactorMuE) can be found in examples/contrib/mue. Unit tests can be found in tests/contrib/mue.

One more note: Pyro's DiscreteHMM computes the probability of the initial latent state as initial_probs @ transition_probs rather than just using initial_probs (the more common convention I think). I used the second convention, but if Pyro is committed to the first I will change my HMM appropriately; it might help to clarify this in the documentation. (I found the same behavior in Edward2, tensorflow/probability#958).

Tasks

  • Add a file docs/source/contrib.mue.rst similar to other contrib.*.rst files. Run make docs and view docs/build/html/contrib.mue.html to ensure this works.
  • Add files tutorial/source/*.rst for each example, following other .rst files by including code. Run make tutorial and examine docs to ensure this works.
  • Add shape tests for VariableLengthDiscreteHMM
  • Add tests comparing VariableLengthDiscreteHMM vs DiscreteHMM
  • Add unit tests for Profile
  • Register both examples in CPU_EXAMPLES of test_examples.py. We try to pass args so that each such test is quick to run. Run make test-examples to ensure this works.

@eb8680
Copy link
Member

eb8680 commented Jan 1, 2021

cc @samuelklee

Copy link
Member

@fritzo fritzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. Let's chat later this week. It's probably worth adding a tutorial notebook in a subsequent PR.

tests/contrib/mue/test_variablelengthhmm.py Outdated Show resolved Hide resolved
examples/contrib/mue/phmm.py Outdated Show resolved Hide resolved
@fritzo
Copy link
Member

fritzo commented Jan 6, 2021

Another thing @eb8680 and @martinjankowiak and I have discussed is creating a new biopyro package that moves all biology-related stuff out of Pyro. That would be a good thing to do in the long term, say before Pyro 2.0; in the short term it seems good to put MuE in pyro.contrib. We might consider making a pyro.contrib.dataloaders for fasta files etc, since they are useful for other genetics models.

Copy link
Member

@fritzo fritzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

examples/contrib/mue/FactorMuE.py Outdated Show resolved Hide resolved
pyro/contrib/mue/statearrangers.py Outdated Show resolved Hide resolved
pyro/contrib/mue/statearrangers.py Show resolved Hide resolved
pyro/contrib/mue/__init__.py Outdated Show resolved Hide resolved
pyro/contrib/mue/variablelengthhmm.py Outdated Show resolved Hide resolved
pyro/contrib/mue/variablelengthhmm.py Outdated Show resolved Hide resolved
pyro/contrib/mue/variablelengthhmm.py Outdated Show resolved Hide resolved
tests/contrib/mue/test_statearrangers.py Show resolved Hide resolved
tests/contrib/mue/test_statearrangers.py Outdated Show resolved Hide resolved
pyro/contrib/mue/biosequenceloaders.py Outdated Show resolved Hide resolved
Copy link
Member

@fritzo fritzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @EWeinstein, this looks great. I have only a few minor nits remaining.

Could we set up a time to quick walk over the code before merging, just so I can get an idea of future plans?

pyro/contrib/mue/missingdatahmm.py Outdated Show resolved Hide resolved
cuda=False, pin_memory=False):
super().__init__()
assert isinstance(cuda, bool)
self.cuda = cuda
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rename this to .is_cuda or something to avoid conflict with the nn.Module.cuda() method

pyro/contrib/mue/models.py Outdated Show resolved Hide resolved
pyro/contrib/mue/models.py Outdated Show resolved Hide resolved
tests/contrib/mue/test_statearrangers.py Outdated Show resolved Hide resolved
docs/source/contrib.mue.rst Outdated Show resolved Hide resolved
Copy link
Member

@fritzo fritzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I'll merge now and let's discuss your plans tomorrow.

@fritzo fritzo merged commit 6c8d887 into pyro-ppl:dev Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants