Pytorch refactor #168

ebolyen · 2022-04-19T18:02:18Z

No description provided.

ebolyen

@mortonjt, some updates:

We've pored over the model a few times, but cannot get it to converge to the solution that the multimodal unit test expects. We think the starting conditions or the Adam optimizer may have an outsized effect here, but we also can't rule out that we missed something.

The general structure of our tensors is [batch, sample, whatever], and when we trace each operation it seems to do what we expect.

So our X is drawn from a multinomial to do a bunch of categorical draws at once, giving us OTU indices in the form: [batch, sample]

That goes into the embedding giving us: [batch, sample, latent]
Then we slice the bias and add it to latent after reshaping it to match (maybe something went wrong here, but we've inspected that line a few different ways and it seems to do what we want).

Then we use the decoder: we run a linear model with one fewer output dimension, giving us [batch, sample, ALR_metabolites]. We then prepend zeros to that last dimension and run softmax over it, which should give us [batch, sample, P_metabolites].

We then parameterize the multinomial and calculate likelihoods.

As far as we can tell, this is what should be happening, so we don't really know why we get such unstable correlations from the unit-tests, ranging from -0.29 to accidentally passing for U, and always failing by the point we check V.
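The last decoder step described above (prepend a zero, then softmax) can be sketched for a single sample as follows. This is a plain-Python illustration, not the code in this PR; the helper name `alr_to_probs` is hypothetical, and it assumes the reference part is the first one.

```python
import math

def alr_to_probs(alr):
    """Inverse ALR: prepend a zero for the reference part, then softmax."""
    logits = [0.0] + list(alr)
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Two ALR coordinates of zero give three equal proportions.
uniform = alr_to_probs([0.0, 0.0])

# One ALR coordinate of log(2) makes the second part twice the reference.
skewed = alr_to_probs([math.log(2.0)])
```

The output always has one more entry than the input and sums to 1, which matches the [batch, sample, ALR_metabolites] to [batch, sample, P_metabolites] change in the last dimension.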

ebolyen · 2022-05-05T19:55:13Z

mmvec/train.py

+def mmvec_training_loop(model, learning_rate, batch_size, epochs):
+    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, 
+                                 betas=(0.8, 0.9), maximize=True)
+    for epoch in range(epochs):


We need to add better logic here, so that a single epoch represents the correct number of batch draws for the data.

@mortonjt, the paper seems to imply that an epoch represents a random draw (in batches of course) for each read in the feature-table, but the original code seems to use nnz which I interpret to mean "n-non-zero". So this would be the number of different types of sample:microbe pairs, rather than the number of observations. What was the goal there, and should we replicate that?

mortonjt · 2022-05-13T18:16:00Z

This line was largely to make the concept of epoch more interpretable. And yes, nnz is the number of non-zeros.

One epoch is complete once you have read through the entire dataset, meaning all of the reads have been processed. Since the batch size is measured in reads, it is used to compute the number of "iterations" within each epoch.

So it should read like this: iterations per epoch = (total number of reads [aka nnz]) / (number of reads per batch)

We're basically calculating how many batches are within an epoch, in order to read through the entire dataset.

That being said -- I don't think you really need this. I think the current implementation is fine -- we just need a way to make the term epochs interpretable to the user.

I think nnz in the older implementation was actually the number of non-zero cells, not the sum of those cells.

It sounds like the goal was to make it the number of reads outright though (sum of the entire table). I think it's probably worth making sure that epoch fits that, if only for the sake of explanation. (It hasn't seemed to matter too much in practice while we've been testing.)
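To make the difference between the two definitions concrete, here is a small sketch on a made-up count table (the table and batch size are illustrative only, not taken from the test data):

```python
import math

# Hypothetical feature table: rows are samples, columns are microbes.
table = [
    [5, 0, 2],
    [0, 0, 9],
    [1, 3, 0],
]
batch_size = 4  # reads drawn per batch (assumed value)

# "nnz" in the sense of non-zero cells: distinct sample:microbe pairs.
nnz = sum(1 for row in table for count in row if count > 0)

# Total number of reads: the sum of the entire table.
total_reads = sum(count for row in table for count in row)

# Batches needed per epoch under each definition.
iters_per_epoch_nnz = math.ceil(nnz / batch_size)
iters_per_epoch_reads = math.ceil(total_reads / batch_size)
```

Here the table has 5 non-zero cells but 20 reads, so an epoch is 2 batches under the first definition and 5 under the second. On real tables with deep sequencing the gap would be much larger.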

ebolyen · 2022-05-05T19:58:22Z

mmvec/tests/test_multimodal.py

+        v_r, v_p = spearmanr(pdist(model.V.T), pdist(self.V.T))
+
+        self.assertGreater(u_r, 0.5)
+        self.assertGreater(v_r, 0.5)


We always fail by this point, but often fail the u_r test above as well. @mortonjt, we're kind of at a loss here.

ebolyen · 2022-05-05T20:01:10Z

mmvec/ALR.py

+
+        forward_dist = forward_dist.log_prob(self.metabolites)
+
+        l_y = forward_dist.sum(0).sum()


Missing the norm that is multiplied against the data likelihood. @mortonjt we aren't 100% sure what its purpose is, but it kind of looks like a weird mean if you squint.

What is the interpretation of this line: https://github.com/biocore/mmvec/blob/master/mmvec/multimodal.py#L137?

mortonjt · 2022-05-13T18:08:58Z

ok, so there are two ways you can deal with the data:

1. You try to use the mini-batches to approximate the loss on the entire dataset
2. You just compute the per-sample loss for each mini-batch

For all intents and purposes, I think it is ok to just compute the per-sample loss -- this appears to be an emerging standard in deep learning.

I think taking a mean is very ok. It'll basically be just l_y = forward_dist.sum(0).mean(). I'm able to get the tests to pass when I run this model locally.
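As a rough illustration of what changes between `.sum(0).sum()` and `.sum(0).mean()`, here is the same reduction on a small nested list standing in for the log-probability tensor (values are made up, and the two axes are assumed to be [batch, sample]):

```python
# Stand-in for the log_prob output: 2 rows along dim 0, 3 entries along dim 1.
log_probs = [
    [-1.0, -2.0, -3.0],
    [-0.5, -1.5, -2.5],
]

# Equivalent of .sum(0): collapse dim 0, leaving one value per column.
col_sums = [sum(col) for col in zip(*log_probs)]

# Equivalent of .sum(0).sum(): total log-likelihood.
l_y_sum = sum(col_sums)

# Equivalent of .sum(0).mean(): total divided by the size of dim 1.
l_y_mean = sum(col_sums) / len(col_sums)
```

The mean is the sum divided by a constant, so both have the same maximizer; what changes is the scale of the data term relative to the other likelihood terms and to the learning rate.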

mortonjt · 2022-05-13T18:18:06Z

How about this, let me try to reproduce the findings. Sometimes it may require tweaking learning rates and batch sizes.

U and V do have an identifiability issue, so that is something to consider. The one metric that should always pass is U @ V.
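One way to see the identifiability issue: rotating the latent space changes U and V individually but leaves their product untouched. The sketch below uses small made-up matrices and plain Python lists; it is an illustration of the general point, not of the model's actual parameters.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(row) for row in zip(*a)]

# An arbitrary rotation of the 2-dimensional latent space.
theta = 0.7
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

U = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]  # [microbes, latent]
V = [[2.0, -1.0, 0.5], [1.0, 4.0, -2.0]]   # [latent, metabolites]

U_rot = matmul(U, R)             # rotated microbe embedding
V_rot = matmul(transpose(R), V)  # counter-rotated metabolite loadings

original = matmul(U, V)
rotated = matmul(U_rot, V_rot)
max_diff = max(abs(a - b)
               for row_a, row_b in zip(original, rotated)
               for a, b in zip(row_a, row_b))
```

`U_rot` differs from `U`, yet `max_diff` is zero up to floating-point error, so a fit can land on any rotation of the simulated U and V while still recovering U @ V.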

mortonjt

I think the implementation in this pull request is actually correct. We don't expect U and V tests to always pass (this is why we are running SVD after fitting the model). It's the U @ V test that needs to pass.

I'm able to get the tests passing on my side (r>0.5, p<0.05). The only thing that you may want to drop is the total_count argument in the multinomial.

mortonjt · 2022-04-19T18:04:29Z

mmvec/multimodal.py

+        self.encoder = nn.Embedding(num_microbes, latent_dim)
+        self.decoder = nn.Sequential(
+                nn.Linear(latent_dim, num_metabolites),
+                nn.Softmax(dim=2)


self.input_bias = nn.Parameter(torch.randn(num_microbes))

I think you might have looked at an older commit. We should have that in the current model.

mortonjt · 2022-04-19T18:07:24Z

mmvec/multimodal.py

+        # Three likelihoods, the likelihood of each weight and the likelihood
+        # of the data fitting in the way that we thought
+        # LY
+        z = self.encoder(X)


bias = self.input_bias[X]
z = z + bias.view(-1, 1)

Same as above, although the .view(-1, 1) looks nicer
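For reference, the lookup-plus-bias step can be traced by hand on tiny made-up values. This sketch assumes X has shape [batch, sample] and that each microbe's scalar bias is added to every latent coordinate of its embedding; the names and numbers are illustrative only.

```python
# Hypothetical embedding table: [num_microbes, latent]
embedding = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
# Hypothetical per-microbe bias: [num_microbes]
bias = [10.0, 20.0, 30.0]
# Microbe indices drawn for each sample: [batch, sample]
X = [[0, 2], [1, 1]]

# For every index, look up its embedding row and add that microbe's bias
# to each latent coordinate. Result shape: [batch, sample, latent].
z = [[[w + bias[idx] for w in embedding[idx]] for idx in row] for row in X]
```

In tensor terms, the indexed bias has shape [batch, sample] and needs a trailing axis of length 1 before it can broadcast against [batch, sample, latent].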

mortonjt · 2022-05-17T17:20:06Z

mmvec/ALR.py

+        z = z + self.encoder_bias[X].reshape((*X.shape, 1))
+        y_pred = self.decoder(z)
+
+        forward_dist = Multinomial(total_count=0,


I'd suggest getting rid of the total_count=0 parameter -- we don't actually need it for log_prob.
And it may introduce a bug downstream (since the total_count isn't actually zero).

This was actually a result of doing things in batch. We would run into an issue where the log_prob would indicate our calculation was out of the support of the distribution, because it had different counts sample to sample, so we solved it via this suggestion:

pytorch/pytorch#42407 (comment)

That said, looking at the documentation again, I wonder if we should be using logits instead of probs?
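On both points, it may help to write the multinomial log-probability out by hand. The sketch below is plain Python and is not PyTorch's implementation; the function names are hypothetical. It shows that the formula takes the total from the observed counts, so samples with different totals are fine, and that logits and probs give the same answer once the logits are normalized.

```python
import math

def log_softmax(logits):
    """Normalize logits into log-probabilities."""
    shift = max(logits)
    lse = shift + math.log(sum(math.exp(v - shift) for v in logits))
    return [v - lse for v in logits]

def multinomial_log_prob(counts, log_p):
    """log n!/(x_1!...x_k!) + sum_i x_i log p_i, with n = sum(counts)."""
    n = sum(counts)
    coeff = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return coeff + sum(c * lp for c, lp in zip(counts, log_p))

probs = [0.2, 0.3, 0.5]
log_probs = [math.log(p) for p in probs]
# Logits are only defined up to an additive constant.
logits = [lp + 1.7 for lp in log_probs]

sample_a = [1, 2, 3]  # total of 6 reads
sample_b = [0, 4, 0]  # total of 4 reads

lp_a_probs = multinomial_log_prob(sample_a, log_probs)
lp_a_logits = multinomial_log_prob(sample_a, log_softmax(logits))
lp_b = multinomial_log_prob(sample_b, log_probs)
```

`lp_a_probs` and `lp_a_logits` agree, and `lp_b` is well defined even though its total differs from `sample_a`. If the decoder already produces unnormalized scores, passing them as logits would avoid an explicit softmax followed by a log.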

ebolyen · 2022-05-17T18:00:47Z

Thanks for the review @mortonjt!

Keegan-Evans added 18 commits April 12, 2022 15:34

DRAFT: first draft of model

853b65d

REFACTOR: beginning of refactor

c0a07e6

archived old code, new files, basic model rough-in

FEAT: forward likelihood now sum(lu, lv, ly)

d57f4c0

DEBUG: first pass package refactoring

4a29a12

IMP: split ILR and ALR, ALR done to ranks

14e2b13

IMP: ALR outputs working

b6778c1

FEAT: function for ordination created

fcc066b

/getting ready to refactor model init and data

FEAT: Produces OrdinationResults

dc3eaa9

IMP: cleanup before working on tests.

449f4a2

TEST: test_multimodal runs but fails.

5946e33

checkpoint before cleanup for pr

b32d5f1

IMP: cleanup refactor examples directory

11e65ff

IMP: remove sneaky notebook checkpoints

b45f04e

IMP: restore original mmvec/multimodal.py

26377ff

IMP: restore q2 goodies

5cf2dcc

IMP: q2 goodies

15bfa63

IMP: add min pytorch version

e7c7f89

DEBUG: conda-> pip pytorch install name

edc554a

ebolyen commented May 5, 2022

Keegan-Evans added 2 commits May 5, 2022 16:11

IMP: name tweaks

5308aa1

IMP: var renaming and observation based epochs

6d75cd1

mortonjt marked this pull request as ready for review May 17, 2022 16:09

mortonjt self-requested a review May 17, 2022 16:09

mortonjt reviewed May 17, 2022

Keegan-Evans added 4 commits May 19, 2022 12:04

FEAT: plugin-standup

b38a1d2

Merge branch 'pytorch-refactor' of github.com:Keegan-Evans/mmvec into…

55d1c66

… pytorch-refactor

BUG: getting test interface to work after plugin

ae8c948

IMP: q2 paired-omics method wired up

88d5dbc

Keegan-Evans added 4 commits May 25, 2022 15:21

BUG: fixing index filtering on biom-tables

aa967e4

BUG: index dfs based on intersection of indexes

a26f56e

BUG: paired-omics working now.

9d4d26e

TEST: adding tests for ranks_bare

da1bdb8
