
Convert to N x K storage for U_kln matrix #3
Closed
mrshirts opened this issue Jul 5, 2013 · 32 comments

@mrshirts
Collaborator

mrshirts commented Jul 5, 2013

For data sets where we have different numbers of samples at different states, rather than having the energies stored in a K x K x N_each matrix, it would be more efficient to store the data as a K x N_tot matrix, where N_tot is the total number of samples collected and K is the number of states at which we evaluate the energies. This will take a bit of extra work, however.

@ghost ghost assigned mrshirts Jul 5, 2013
@kyleabeauchamp
Collaborator

So interestingly, we recently ran into a possibly similar issue in MSMBuilder. We previously stored the state assignments as a (num_trajectories, max_length) numpy array. This led to over-allocation of huge blocks of memory when we had data sets where some trajectories were very long, but others were very short (e.g. combining FAH and ANTON data).

Our proposed solution (hasn't been merged yet, but probably will be eventually) was to use Pandas to store the data in a different format (msmbuilder/msmbuilder-legacy#206 and msmbuilder/msmbuilder-legacy#205). Instead of a 2D array, we use a Pandas DataFrame that has fields for "trajectory ID", "timestep", and "conformational state".

I suspect a similar thing could be done in pymbar--we just create a DataFrame object that has entries for "sample ID", "sampled state", etc.
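As a rough illustration of that layout (a hypothetical sketch, not code from MSMBuilder or pymbar; the column names are made up):

    import pandas as pd

    # One row per (sample, evaluated state) pair; no padding is needed when
    # different states contribute different numbers of samples.
    u_df = pd.DataFrame({
        "sample_id": [0, 0, 1, 1, 2, 2],
        "sampled_state": [0, 0, 0, 0, 1, 1],
        "evaluated_state": [0, 1, 0, 1, 0, 1],
        "reduced_potential": [1.2, 3.4, 0.9, 2.8, 4.1, 1.7],
    })

    # Recover a dense K x N_tot matrix when needed:
    u_kn = u_df.pivot(index="evaluated_state", columns="sample_id",
                      values="reduced_potential").values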

@kyleabeauchamp
Collaborator

Obviously that requires an extra prerequisite, but it could be an elegant solution if this problem is a real efficiency breaker.

@jchodera
Member

I like this idea---it's effectively a sparse representation.

Optimally, I think we would implement a simple linear storage system that doesn't require pandas (all N samples are concatenated and evaluated at all K states) and a pandas-interoperable interface, so that either form could be passed in.

The big question is whether we can deprecate the old interface where a K x K x N_max matrix u_kln is passed in. I imagine deprecating it would be a bad idea, and we can simply support both as input (but use N x K internally) and determine which one is being passed by its dimensionality.
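A minimal sketch of what that conversion and dimension check could look like (a hypothetical helper, not the eventual pymbar code; assumes u_kln[k, l, n] is the n-th sample from state k evaluated at state l):

    import numpy as np

    def kln_to_kn(u_kln, N_k):
        # Flatten a (K, K, N_max) u_kln array into a (K, N_tot) u_kn array,
        # using N_k to know how many samples each sampled state contributed.
        K = len(N_k)
        u_kn = np.zeros((K, int(np.sum(N_k))))
        start = 0
        for k in range(K):
            n_k = int(N_k[k])
            u_kn[:, start:start + n_k] = u_kln[k, :, :n_k]
            start += n_k
        return u_kn

    # Accept either form and convert internally:
    # if u.ndim == 3: u = kln_to_kn(u, N_k)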

@mrshirts
Collaborator Author

I think the first step is to convert to an N x K representation, using the N_k data to identify where we transition between samples from state 1, state 2, etc. I have some ideas on this, but not sure when I'll have time to do too much.


@jchodera
Member

This should be straightforward, but will require some time investment to complete and test. We should only tackle this once we have a sufficient number of tests in place to ensure we haven't screwed anything major up.

@mrshirts
Collaborator Author

Agreed on all counts mentioned.


@kyleabeauchamp
Collaborator

I agree. Once tests are in place, it would be easy to hammer this out.

@kyleabeauchamp
Collaborator

Below, $q_{kj}$ refers to the unnormalized probability of conformation j in state k (i.e. assuming N x K storage).

So I've been playing with this a bit and have some initial thoughts regarding acceleration:

  1. If we switch from log-space to exponential space (e.g. from $f_i$ to $c_i$) (Eqns. C1, C2), the self-consistent equations of MBAR become essentially just two matrix multiplications.
  2. We can still handle overflow by dividing $q_{kj}$ by the largest element in each row (e.g. the most probable state for each sample).
  3. We can't handle underflow and still use fast BLAS / matrix multiplication--underflow requires Kahan summation, which is generally not a BLAS accelerated operation. Note that I don't think the current code actually handles underflow--just overflow. Correct me if I'm wrong here.
  4. My implementation of these ideas (with self-consistent iteration) gives a 100X speedup over the current C++ Newton code. Also, this requires only like 10 lines of code.
  5. If 100X is fast enough, do we still need to have the Newton code and c++ code? Do we want to run calculations for n_states > 10000? In such cases, we can't use NR because of the memory footprint of the Hessian--right? One possible solution is to transform the self-consistent equations into a nonlinear optimization problem (e.g. possibly using L2 error) and apply LBFGS. I'm not sure if this is worth the extra code required, though.

I can give more quantitative results / PR later.
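For concreteness, a minimal sketch of the kind of update described above (a rough illustration assuming u_kn stored as K x N_tot; not the benchmark code itself):

    import numpy as np

    def self_consistent_mbar(u_kn, N_k, tol=1e-10, max_iter=100000):
        # u_kn: (K, N_tot) reduced potentials of all samples at all states
        # N_k:  (K,) number of samples drawn from each state
        K, N = u_kn.shape
        # Handle overflow: shift each sample's energies by its minimum,
        # which cancels between numerator and denominator below.
        Q = np.exp(-(u_kn - u_kn.min(axis=0)))   # q_{kn}, shape (K, N)
        c = np.ones(K)                           # c_k = exp(-f_k)
        for _ in range(max_iter):
            denom = (N_k / c) @ Q                # (N,) mixture denominators
            c_new = Q @ (1.0 / denom)            # (K,) updated c_k
            c_new /= c_new[0]                    # fix f_0 = 0
            if np.max(np.abs(np.log(c_new / c))) < tol:
                c = c_new
                break
            c = c_new
        return -np.log(c)                        # f_k relative to state 0

Each iteration is essentially one matrix-vector product with Q and one with its transpose, which is what makes the BLAS acceleration possible.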

@mrshirts
Collaborator Author

mrshirts commented Sep 4, 2013

Hi, all-

Perhaps you can check the code in as a branch so we can look at it directly?

  So I've been playing with this a bit and have some initial thoughts regarding acceleration:

Great!

  1. If we switch from log-space to exponential space (e.g. from $f_i$ to $c_i$) (Eqns. C1, C2), the self-consistent equations of MBAR become essentially just two matrix multiplications.

There are some places where the log-space representation really is necessary for precision. Most of the code was originally in exponential space, and more and more got shifted over time to log space. I'd have to dig in a bit more. Is it possible to shift to exponential space just before the time-consuming calculations?

  2. We can still handle overflow by dividing $q_{kj}$ by the largest element in each row (e.g. the most probable state for each sample).

I don't THINK this is sufficient to avoid the error -- but we'd have to check.

  3. We can't handle underflow and still use fast BLAS / matrix multiplication--underflow requires Kahan summation, which is generally not a BLAS-accelerated operation. Note that I don't think the current code actually handles underflow--just overflow. Correct me if I'm wrong here.

I would have to look a little more carefully. I think we don't have to worry about underflow.

  4. My implementation of these ideas (with self-consistent iteration) gives a 100X speedup over the current C++ Newton code. Also, this requires only like 10 lines of code.

Great!

  5. If 100X is fast enough, do we still need to have the Newton code and C++ code?

NR is MUCH faster than the self-consistent solution near the answer. It might take 8 steps where self-consistent iteration still hasn't converged after 8000. Of particular worry are cases where the delta F is below the specified limit -- but each dF is correlated, so you could still be 0.01 away from the solution even though the increment has dropped below 0.00001, for example. So NR is still good for now. Don't remove NR even for a factor of 100; it's worth more than that in many cases. If you can replace the old self-consistent version without removing NR, then more power to you, of course.

  Do we want to run calculations for n_states > 10000? In such cases, we can't use NR because of the memory footprint of the Hessian--right?

Longer term, we do want to handle more states. However, we really only need this for states for which there are samples; states for which there are no samples do not require a Hessian matrix to be generated. The maximum size is K_sampled x K_sampled.

The likely cases where we want to handle many states are when we are exploring a highly multidimensional space and n_states >> n_samples, in which case we would be limited by the number of samples instead. And I think we want to think about these a bit differently.

  One possible solution is to transform the self-consistent equations into a nonlinear optimization problem (e.g. possibly using L2 error) and apply LBFGS. I'm not sure if this is worth the extra code required, though.

I think this is a great idea, but I think it can wait until we've stabilized the current changes a bit more.

@mrshirts
Collaborator Author

mrshirts commented Sep 4, 2013

  One possible solution is to transform the self-consistent equations into a nonlinear optimization problem (e.g. possibly using L2 error) and apply LBFGS. I'm not sure if this is worth the extra code required, though.

Note that there is an objective function that, when minimized, gives the free energies. See the function _objectiveF. It could certainly be written more efficiently.
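For reference, a minimal sketch of that kind of formulation (this uses the standard convex objective whose gradient reduces to Eq. C6; it is not the actual _objectiveF implementation, and the helper names are made up):

    import numpy as np
    from scipy.special import logsumexp
    from scipy.optimize import minimize

    def objective(f_k, u_kn, N_k):
        # sum_n log sum_k N_k exp(f_k - u_kn[k, n]) - sum_k N_k f_k
        log_denom_n = logsumexp(f_k[:, None] - u_kn, b=N_k[:, None], axis=0)
        return np.sum(log_denom_n) - np.dot(N_k, f_k)

    def gradient(f_k, u_kn, N_k):
        # d/df_i = sum_n N_i W_{ni} - N_i  (the negative of Eq. C6)
        log_denom_n = logsumexp(f_k[:, None] - u_kn, b=N_k[:, None], axis=0)
        log_W_kn = f_k[:, None] - u_kn - log_denom_n
        return N_k * np.exp(logsumexp(log_W_kn, axis=1)) - N_k

    def solve_f_k(u_kn, N_k):
        K = len(N_k)
        result = minimize(objective, np.zeros(K),
                          args=(u_kn, np.asarray(N_k, dtype=float)),
                          method="L-BFGS-B", jac=gradient)
        return result.x - result.x[0]   # free energies defined up to a constant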


@jchodera
Member

jchodera commented Sep 5, 2013

  1. If we switch from log-space to exponential space (e.g. from $f_i$ to $c_i$) (Eqns. C1, C2), the self-consistent equations of MBAR become essentially just two matrix multiplications.
  2. We can still handle overflow by dividing $q_{kj}$ by the largest element in each row (e.g. the most probable state for each sample).

  There are some places where the log space representation really is necessary for precision. Most was originally in exponential space, and more and more got shifted over time to log space. I'd have to dig in a bit more. Is it possible to shift to exponential space just before the time consuming calculations?

Note that Kyle says he can handle overflow by scaling each row. That could fix nearly all of the issues that caused us to go to log space. I think this might be worth exploring to see if underflow ends up being unproblematic.

To test this, I think we'd need to have the harmonic oscillator test be automated with a variety of different overlap conditions (good, medium, poor, very bad).

@jchodera
Member

jchodera commented Sep 5, 2013

My implementation of these ideas (with self-consistent iteration) gives a 100X speedup over the current C++ Newton code. Also, this requires only like 10 lines of code.
If 100X is fast enough, do we still need to have the Newton code and c++ code? Do we want to run calculations for n_states > 10000? In such cases, we can't use NR because of the memory footprint of the Hessian--right? One possible solution is to transform the self-consistent equations into a nonlinear optimization problem (e.g. possibly using L2 error) and apply LBFGS. I'm not sure if this is worth the extra code required, though.

I'd fully support this, since it would make for code that is more easily understood, still fast, and easier to maintain.

For the really big problems, maybe we could use a scheme that doesn't require explicit construction of the full matrix, or computes the elements on the fly as needed, possibly in a pyopencl/pycuda implementation (which would be compact and fast).

@mrshirts
Collaborator Author

mrshirts commented Sep 5, 2013

Note that Kyle says he can handle overflow by scaling each row. That could fix nearly all of the issues that caused us to go to log space. I think this might be worth exploring to see if underflow ends up being unproblematic.

Sure, just remember that almost everything in there is there for a reason. Before we make a major switch like this, I'll need to go through the check-in dates for these changes, check the emails from that time, and make sure there isn't some nasty test case that we're forgetting something important for. Limiting ourselves to the exponential representation limits us to a dynamic range of roughly 700 kT (about ln 10^300, which is where double-precision exp() overflows). I believe the specific problem was looking at heat capacities for protein aggregation.

To test this, I think we'd need to have the harmonic oscillator test be automated with a variety of different overlap conditions (good, medium, poor, very bad).

Unfortunately, harmonic oscillators are "nice" in a way that a lot of other problems are not. So just because something works with harmonic oscillators doesn't mean it will work with other problems. If it breaks harmonic oscillators, it will certainly break other problems.

The solution, of course, is to dig those problems out of the email so they can be tested as well. But given my schedule in Sept/Oct, I'm not going to be able to do that.

In conclusion: The todos as I see them:

  1. Fix the K x K x N representation (i.e. convert it to N x K).

This will make the formulas simpler, because we won't need to do nearly as much index juggling, and it will speed everything up as well as saving memory -- in most cases where memory fails, it fails because of K x K x N memory issues, not because of K x K issues.

  2. Make our test set include all the problems that failed in the past (this will require some data archeology in the emails and the subversion history), so that when we have good ideas, we can tell if they will break things.
  3. Then experiment with things like exponential-only representations, highly sparse spaces, and removing methods that are working pretty well. If it ain't broke, be very careful about fixing it.

@mrshirts
Collaborator Author

mrshirts commented Sep 5, 2013

OK, I found the example that caused me to harden the code. Basically, free energy differences greater than ~710 kBT cannot be handled in the exponential regime alone. Note that this has nothing to do with overlap -- even if the free energy differences come purely from the energy offsets, it will still fail.

I'll try to figure out where to upload this.

@kyleabeauchamp
Collaborator

In principle, I agree that we don't want to change things that aren't broken. However, if it leads to major code simplification and maintainability, such changes might be worthwhile.

@jchodera
Member

jchodera commented Sep 5, 2013

In principle, I agree that we don't want to change things that aren't broken. However, if it leads to major code simplification and maintainability, such changes might be worthwhile.

I agree completely.

OK, I found the example that caused me to harden the code. Basically, free energy differences greater than ~710 kBT cannot be handled in the exponential regime alone. Note that this has nothing to do with overlap -- even if the free energy differences come purely from the energy offsets, it will still fail.

We should create a special tests/ directory for difficult cases, provided they are small enough to include with the standard package. Otherwise, they should go into pymbar-examples or pymbar-datasets. Or maybe we can create a pymbar-tests?

@kyleabeauchamp
Collaborator

So here's my proposed organization:

  1. Create a directory of test systems, each of which is a class that generates the necessary data
  2. Have wrappers in our nosetests that run all the tests (some might be disabled for speed reasons, if necessary); see the sketch below.
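Something along these lines, as a rough sketch (the class and function names are made up, and the assertion against an actual estimator is left out):

    import numpy as np

    class HarmonicOscillatorTestSystem:
        # K harmonic oscillators (beta = 1) with known analytical free energies.
        def __init__(self, K_k, O_k, N_k, seed=0):
            self.K_k = np.asarray(K_k, dtype=float)   # spring constants
            self.O_k = np.asarray(O_k, dtype=float)   # offsets of the minima
            self.N_k = np.asarray(N_k, dtype=int)     # samples per state
            self.rng = np.random.RandomState(seed)

        def analytical_f_k(self):
            # f_k = -ln sqrt(2 pi / K_k), reported relative to state 0
            f_k = 0.5 * np.log(self.K_k / (2.0 * np.pi))
            return f_k - f_k[0]

        def sample(self):
            # Draw from each state, then evaluate every sample at every state.
            sigma_k = 1.0 / np.sqrt(self.K_k)
            x_n = np.concatenate([
                self.rng.normal(self.O_k[k], sigma_k[k], self.N_k[k])
                for k in range(len(self.K_k))])
            u_kn = 0.5 * self.K_k[:, None] * (x_n - self.O_k[:, None]) ** 2
            return u_kn, self.N_k

    def test_harmonic_oscillators():
        system = HarmonicOscillatorTestSystem(K_k=[1.0, 2.0, 4.0],
                                              O_k=[0.0, 1.0, 2.0],
                                              N_k=[500, 500, 500])
        u_kn, N_k = system.sample()
        f_k_true = system.analytical_f_k()
        # ...run the estimator under test on (u_kn, N_k) and compare to f_k_true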

@kyleabeauchamp
Collaborator

Regarding the ~700 kT underflow limit: while working in log space is one solution, another solution would be to switch to float128 in situations where more precision is needed.

@mrshirts
Collaborator Author

mrshirts commented Sep 5, 2013

That only doubles the range to 1400 kBT. There are things that are fundamentally better to do logarithmically, and it's surprisingly easy to bump up against them. Two harmonic oscillators with equal spring constants but energy offsets of 2000 kBT can still break parts of pymbar if done solely in exponential space.
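A quick illustration of the limit in question (just numpy/scipy, nothing pymbar-specific):

    import numpy as np
    from scipy.special import logsumexp

    np.exp(710.0)     # overflows to inf in double precision
    np.exp(-750.0)    # underflows to 0.0

    # An energy offset of ~2000 kT between two states:
    u = np.array([2000.0, 2001.0])
    np.log(np.sum(np.exp(-u)))   # log(0) = -inf; exponential space fails
    logsumexp(-u)                # about -1999.69; log space handles it fine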

@mrshirts
Collaborator Author

mrshirts commented Sep 5, 2013

So here's my proposed organization:

Create a directory of test systems, each of which is a class that generates the necessary data

Can you clarify this? Sometimes, there will be data sets where the data cannot be generated on the fly.

Have wrappers in our nosetests that runs all the tests (some might be disabled for speed reasons, if necessary)

Yes, there should be standard tests that always run.

I'm thinking that there are really three types of data sets and examples:

  1. tests (currently existing):
    This is supposed to provide the automated tests. For example, tests/harmonic-oscillators/harmonic-oscillators.py runs through all the functionality in pymbar and checks the analytical error estimates by making sure biases are within 1-2 std. It could probably be integrated much better. The tests/timeseries/timeseries/test-statistical-efficiency code checks timeseries.py. The tests/harmonic-oscillators/harmonic-oscillators.py is supposed to test the distribution of error, which it basically does, but it doesn't test all functionality.
  2. examples (currently existing): Pedagogical examples; should be part of the main download. These are for users and may involve small test sets, but the main goal is to help people write code using these examples. Data sets should be small enough to fit in the main download.
  3. large-datasets (kind of already existing): This should be for larger-scale testing that is 1) too expensive to automate and 2) a database of all the problems we've encountered, so we can make sure that new code doesn't break weird but important cases. Stress testing (like the energy range problem) should go here.

@kyleabeauchamp
Collaborator

Thanks. Here's another question: do we need everything in log space, or just $f_i$? For example, can we pre-compute $Q_{ij} = \exp(-u_{ij})$ rather than computing it over and over during the self-consistent iteration? That's probably a large chunk of the slowdown.

@mrshirts
Collaborator Author

mrshirts commented Sep 5, 2013

In principle, I agree that we don't want to change things that aren't broken. However, if it leads to major code simplification and maintainability, such changes might be worthwhile.

Certainly! I'm just pointing out that it is currently easy to break the code because we don't have all of the cases explicitly laid out -- they are all in emails between John and me that need to be added as test cases in pymbar. So one should not assume that just because harmonic-oscillators.py is unchanged, a new change will not break the code right now. The situation does need to change longer term (which for me, might mean a month or two).

@mrshirts
Collaborator Author

mrshirts commented Sep 5, 2013

Thanks. Here's another question: do we need everything in log space, or just $f_i$? For example, can we pre-compute $Q_{ij} = \exp(-u_{ij})$ rather than computing it over and over during the self-consistent iteration? That's probably a large chunk of the slowdown.

That's a really good question that I would need to play around with. I THINK the answer is no, but I can't remember.

My first reaction is caution, so I'd still advocate taking the current code, changing everything to N x K, and then all subsequent proposed modifications (like this one) will be much easier to make. I'd like to see how much of the problem is all the annoying is-sampled-state index dancing, and then work from there. Getting rid of that will make the code much more readable without changing any convergence properties, and then we can separately address optimizations that may result in different convergence.

I'm sorry I'm being a pain here. Clearly, the specific problems in numerical stability that resulted in the switch to log form need to be documented better. I'm just trying to make sure we minimize the pain of switching back, as I'm almost certain will be needed :) Though I could be wrong.

One of the reasons we haven't been worrying as much about the self-consistent speed is that NR is so much more robust as an optimization method than self-consistent iteration when it can be used (which is when the number of sampled states is less than ~10,000 or so).

@kyleabeauchamp
Collaborator

So note that in the benchmark, I think my proposed self-consistent code is actually 100X faster than the NR code--it's better than apples to apples...

@kyleabeauchamp
Collaborator

I agree that speed is lower priority than accuracy and readability, though.

@kyleabeauchamp
Collaborator

git clone https://github.com/kyleabeauchamp/pymbar.git
cd pymbar
git checkout refactor

@kyleabeauchamp
Collaborator

Oops, wrong thread.

@mrshirts
Collaborator Author

mrshirts commented Sep 5, 2013

So note that in the benchmark, I think my proposed self-consistent code is actually 100X faster than the NR code--it's better than apples to apples...

The problem with self-consistent iteration is that relative_tolerance means something very different there. For NR, given quadratic convergence, df_{i+1} - df_i is essentially df_{i+1} - df_converged. For self-consistent iteration, df_{i+1} - df_i != df_{i+1} - df_converged, since convergence is sub-linear. These are issues that only appear with nasty (but still common) cases.

@jchodera
Member

jchodera commented Sep 6, 2013

The problem with self-consistent iteration is that relative_tolerance means something very different there. For NR, given quadratic convergence, df_{i+1} - df_i is essentially df_{i+1} - df_converged. For self-consistent iteration, df_{i+1} - df_i != df_{i+1} - df_converged, since convergence is sub-linear. These are issues that only appear with nasty (but still common) cases.

Zhiqiang Tan had a great idea for looking at the relative tolerance for both of these schemes in a consistent fashion: Compute the norm of the gradient of the related minimization problem. We compute this gradient in our adaptive scheme to decide whether self-consistent iteration or Newton-Raphson is getting "closer" to the converged result:

      # gradient (defined by Eq. C6 of [1])
      # g_i(theta) = N_i - \sum_n N_i W_ni
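In code, that criterion could look something like this (a rough sketch assuming u_kn of shape (K, N_tot) and relative free energies f_k; not the adaptive-scheme implementation itself):

    import numpy as np
    from scipy.special import logsumexp

    def mbar_gradient_norm(u_kn, N_k, f_k):
        # log of the mixture denominator for each sample n:
        # log sum_k N_k exp(f_k - u_kn[k, n])
        log_denom_n = logsumexp(f_k[:, None] - u_kn, b=N_k[:, None], axis=0)
        # log W_{ni} for every state i and sample n
        log_W_kn = f_k[:, None] - u_kn - log_denom_n
        # g_i = N_i - sum_n N_i W_ni  (Eq. C6 of [1])
        g = N_k * (1.0 - np.exp(logsumexp(log_W_kn, axis=1)))
        return np.linalg.norm(g)

    # Iterate (by either method) until mbar_gradient_norm(...) < tol.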

@mrshirts
Collaborator Author

mrshirts commented Sep 6, 2013

This does lose some of the advantages of self-consistent iteration if you have to do the gradient calculation at each step. Though perhaps we can play around with the variables so that not much extra work is required at each step.

Please don't get me wrong in all of this. I think it's great Kyle is finding these tricks. We just need to get the robustness tests in so that we don't have to rely on me remembering what things broke the code in the past, leading to the somewhat more convoluted than ideal version that we have now.


@mrshirts
Collaborator Author

mrshirts commented Sep 6, 2013

Note also that even if the self-consistent iteration and NR are using the same test for convergence, that does not change the self-consistent iteration's performance. It's just that it becomes clearer that it takes more steps.


@mrshirts mrshirts modified the milestones: 3.0 release, 2.0 release Apr 22, 2014
@kyleabeauchamp kyleabeauchamp modified the milestones: 2.0 release, 3.0 release, 2.1 release Apr 22, 2014
@kyleabeauchamp
Collaborator

So NxK is done. There are some useful discussions here regarding precision and performance, but I think I've refreshed my memory of the key points.
