Runtime of the cost computation #37

Closed · dkobak opened this issue Sep 20, 2018 · 30 comments

@dkobak (Collaborator) commented Sep 20, 2018

While running the code on the n=1.3 mln dataset, I noticed that the CPU usage looks like this:

[screenshot of CPU usage from 2018-09-20 15:09:04]

There are exactly 50 short valleys (each taking ~1 s), alternating with short periods of 100% CPU on all cores (also ~1 s) -- I assume these correspond to the attractive force computation (parallel on all cores) and the repulsive force computation (currently not parallel). Together they take ~2 s per iteration, i.e. ~100 s per 50 iterations. But after every 50 iterations there is a long ~30 s period when something happens on a single core. WHAT IS IT?!

The only thing that is done every 50 iterations is the cost computation. Does the cost computation really take so long? Can it be accelerated? If not, and if it really takes 25% of the computation time, then I think it should be optional and switched off by default (at least for large datasets).

@dkobak (Collaborator, Author) commented Sep 20, 2018

By the way, see Sections 3.1 and 3.2 in @pavlin-policar's write-up here: https://github.com/pavlin-policar/fastTSNE/blob/master/notes/notes.pdf. If I understood 3.1 correctly, it should speed up the cost computation by a factor of two (even though I would argue that spending 12% of the runtime on the cost computation would still be quite a lot). Section 3.2 is about something else, but it would be a cool thing to implement as well.

@dkobak (Collaborator, Author) commented Sep 20, 2018

Also, the cost is computed once before the iterations start (iter==0) but never displayed! See https://github.com/KlugerLab/FIt-SNE/blob/master/src/tsne.cpp#L497

@linqiaozhi (Member) commented:

Wow, this is a great catch. I had forgotten to parallelize the cost computation. It is now parallelized in my fork; I will PR it hopefully tomorrow (I just want to test on Linux and fix the Windows problems from the other issue).

I agree it would be nice to accelerate it further using a faster computation as in Pavlin's write-up... but for now, this should make it negligible relative to the rest of the code. If you don't see that, then we can work on accelerating it or making it optional, as you suggested.

Thanks for catching this!

@dkobak (Collaborator, Author) commented Sep 21, 2018

Cool, looking forward to all the changes!

@pavlin-policar (Contributor) commented:

The trick in Section 3.1 is pretty useful, since it reduces the number of passes over the data from 2 to 1 and can be done entirely during the gradient computation. Evaluating the error itself also turns out to take a long time -- I guess computing the log in the KL divergence so many times can be quite slow.
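
In pseudocode, the idea is roughly the following (a simplified Python sketch, not the actual fastTSNE code; it assumes the sparse `P` stores both the (i, j) and (j, i) entries of the symmetrized affinities):

```python
# Sketch: accumulate the data-dependent part of the KL divergence while looping
# over the nonzero p_ij for the attractive forces, so no extra pass is needed.
# KL(p || q) = sum_ij p_ij log(p_ij / w_ij) + log Z,   with q_ij = w_ij / Z.
import numpy as np
from scipy.sparse import coo_matrix

def attractive_pass(P: coo_matrix, Y: np.ndarray):
    """Attractive forces plus the sum_ij p_ij log(p_ij / w_ij) term, in one pass."""
    grad_attr = np.zeros_like(Y)
    kl_partial = 0.0
    for i, j, p in zip(P.row, P.col, P.data):
        diff = Y[i] - Y[j]
        w = 1.0 / (1.0 + diff @ diff)      # unnormalized q_ij
        grad_attr[i] += p * w * diff       # attractive force on point i
        kl_partial += p * np.log(p / w)    # everything except the log Z term
    return grad_attr, kl_partial

def repulsive_pass(Y: np.ndarray):
    """Exact O(n^2) repulsion (FIt-SNE approximates this step); also returns Z."""
    grad_rep = np.zeros_like(Y)
    Z = 0.0
    for i in range(Y.shape[0]):
        diff = Y[i] - Y
        w = 1.0 / (1.0 + np.sum(diff**2, axis=1))
        w[i] = 0.0
        Z += w.sum()
        grad_rep[i] = (w**2) @ diff
    return grad_rep, Z

# Gradient: 4 * (grad_attr - grad_rep / Z);  cost: kl_partial + np.log(Z).
```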

I like the correction in Section 3.2 because with exaggeration, p_{ij}s are not actual probabilities and the KL divergence is just plain wrong. However, if we apply the correction the true KL divergence can go up during the early exaggeration phase which could be unsettling if you don't know what's going on (I imagine most people expect optimization to reduce the error, not increase it!).

@dkobak (Collaborator, Author) commented Sep 22, 2018

@pavlin-policar Let me see if I understand eq (29) in your PDF correctly: it seems it can be further simplified as

$$\frac{1}{\alpha} \text{KL}_\text{raw} - \frac{1}{\alpha}\sum_{ij}\alpha p_{ij} \log(\alpha) 
= \frac{1}{\alpha} \text{KL}_\text{raw} - \log(\alpha) \sum_{ij} p_{ij} 
= \frac{1}{\alpha} \text{KL}_\text{raw} - \log(\alpha),$$

right?

So the adjustment boils down to: take the "raw" KL divergence (computed with exaggerated p's), divide by alpha and subtract log(alpha).

@pavlin-policar (Contributor) commented:

@dkobak so I'll just convert your expression to an image so it's easier to see:

[rendered image of the expression above]

And yes, you're exactly right! Now I feel very silly for not seeing this before. This is also slightly more efficient, since we get rid of the sum (although the cost is evaluated so rarely that it won't really matter), and the expression is much nicer.

I've also tested this numerically in my code and, indeed, it is the same -- just as the maths says.
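
Something along these lines, I mean (a toy numpy sketch with random distributions, not the actual fastTSNE code):

```python
# Check: the "raw" KL computed with exaggerated p's, divided by alpha and
# minus log(alpha), equals the true KL(p || q).
import numpy as np

rng = np.random.default_rng(0)
p = rng.random(1000); p /= p.sum()   # proper probabilities
q = rng.random(1000); q /= q.sum()
alpha = 12.0

kl_true = np.sum(p * np.log(p / q))
kl_raw = np.sum(alpha * p * np.log(alpha * p / q))  # uses exaggerated p's
kl_adjusted = kl_raw / alpha - np.log(alpha)

print(np.isclose(kl_true, kl_adjusted))  # True
```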

@dkobak (Collaborator, Author) commented Sep 22, 2018

@pavlin-policar I agree it's a nice expression! Feel free to add it to your Section 3.2.

I've just implemented it and added it to the pending #39.

> However, if we apply the correction the true KL divergence can go up during the early exaggeration phase which could be unsettling if you don't know what's going on (I imagine most people expect optimization to reduce the error, not increase it!).

Not sure what you mean here exactly: if KL_raw decreases, then KL_raw/alpha - log(alpha) will also decrease. Do you mean that the adjusted KL can jump up after the exaggeration is switched off?

dkobak mentioned this issue Sep 22, 2018

@pavlin-policar (Contributor) commented:

Yes, sorry for the confusion. This happens when we use late exaggeration and not in general. When switching from the normal regime to the late exaggeration regime, the corrected KL divergence does indeed go up.

I've copied below a run on the MNIST data set (ignore the timings and the exact output format -- I've used my implementation -- but the effect should be the same). The horizontal lines indicate the switch of regime: from early exaggeration to no exaggeration to late exaggeration. During the early exaggeration and no-exaggeration phases the KL divergence drops down to 3.3663, and it rises to 3.4769 during late exaggeration.

------------------------------------------------------------------
Iteration   50, KL divergence  7.5186, 50 iterations in 2.1725 sec
Iteration  100, KL divergence  7.4820, 50 iterations in 2.1296 sec
Iteration  150, KL divergence  6.4166, 50 iterations in 2.1392 sec
Iteration  200, KL divergence  6.0529, 50 iterations in 2.1661 sec
Iteration  250, KL divergence  5.8980, 50 iterations in 2.1653 sec
------------------------------------------------------------------
Iteration   50, KL divergence  5.5385, 50 iterations in 2.1775 sec
Iteration  100, KL divergence  5.0582, 50 iterations in 2.2110 sec
Iteration  150, KL divergence  4.6849, 50 iterations in 2.2959 sec
Iteration  200, KL divergence  4.4088, 50 iterations in 2.3607 sec
Iteration  250, KL divergence  4.2014, 50 iterations in 2.5555 sec
Iteration  300, KL divergence  4.0415, 50 iterations in 2.7112 sec
Iteration  350, KL divergence  3.9136, 50 iterations in 2.9802 sec
Iteration  400, KL divergence  3.8079, 50 iterations in 3.3894 sec
Iteration  450, KL divergence  3.7185, 50 iterations in 3.3327 sec
Iteration  500, KL divergence  3.6412, 50 iterations in 3.7742 sec
Iteration  550, KL divergence  3.5734, 50 iterations in 4.9178 sec
Iteration  600, KL divergence  3.5135, 50 iterations in 4.6910 sec
Iteration  650, KL divergence  3.4595, 50 iterations in 5.3508 sec
Iteration  700, KL divergence  3.4106, 50 iterations in 6.0794 sec
Iteration  750, KL divergence  3.3663, 50 iterations in 6.1707 sec
------------------------------------------------------------------
Iteration   50, KL divergence  3.3697, 50 iterations in 6.6408 sec
Iteration  100, KL divergence  3.3889, 50 iterations in 4.5863 sec
Iteration  150, KL divergence  3.4169, 50 iterations in 9.6097 sec
Iteration  200, KL divergence  3.4477, 50 iterations in 5.0035 sec
Iteration  250, KL divergence  3.4769, 50 iterations in 5.4145 sec

This is what I had in mind and this doesn't apply at all if we don't use late exaggeration, which I believe is the default.

@dkobak (Collaborator, Author) commented Sep 22, 2018

@pavlin-policar Hmm. Looking at your output, I don't quite understand why the KL divergence keeps growing during the last 250 iterations. The "raw" KL divergence should be decreasing because that's what gradient descent is doing, and the adjusted KL divergence is a linear function of it, as we discussed above. What am I missing?

@dkobak (Collaborator, Author) commented Sep 22, 2018

@pavlin-policar Actually by now I got myself completely confused. We seem to have concluded that the loss (KL divergence) after exaggeration is given by a linear function of the loss without exaggeration. If so, then how would any exaggeration have any effect?.. This does not make any sense; I should probably go to bed and get some sleep.

@pavlin-policar (Contributor) commented:

@dkobak You're right, this makes no sense -- I have no idea what's going on. I'll have to look into this a bit more. It seems like it could be a bug in my code, but that seems unlikely since I use the same code for all the optimization phases.

I've tried to run a simple example with this implementation via the R wrapper, but I don't think the costs are being transferred into R properly and I just get an array of zeros (or I may be using it wrong). I'll look at this again tomorrow; hopefully a bit of sleep will do us both some good.

@pavlin-policar (Contributor) commented:

Of course, I had forgotten to compile the pulled C++ code... But at least now I know I'm not completely crazy. Running the R wrapper on iris with the following commands:

source('fast_tsne.R')
tmp = as.matrix(iris[, 1:4])
fftRtsne(tmp, start_late_exag_iter=750, perplexity=12, late_exag_coeff=5, get_costs=T)

I get the following output (using the latest master)

Symmetrizing...
Randomly initializing the solution.
Exaggerating Ps by 12.000000
Input similarities computed (sparsity = 0.291244)!
Learning embedding...
Using FIt-SNE approximation.
Iteration 50 (50 iterations in 0.35 seconds), cost 54.519008
Iteration 100 (50 iterations in 0.35 seconds), cost 55.227940
Iteration 150 (50 iterations in 0.33 seconds), cost 54.600685
Iteration 200 (50 iterations in 0.33 seconds), cost 52.697806
Unexaggerating Ps by 12.000000
Iteration 250 (50 iterations in 0.33 seconds), cost 1.957103
Iteration 300 (50 iterations in 0.60 seconds), cost 1.399361
Iteration 350 (50 iterations in 0.44 seconds), cost 0.474211
Iteration 400 (50 iterations in 0.35 seconds), cost 0.303934
Iteration 450 (50 iterations in 0.33 seconds), cost 0.262422
Iteration 500 (50 iterations in 0.33 seconds), cost 0.258651
Iteration 550 (50 iterations in 0.34 seconds), cost 0.255248
Iteration 600 (50 iterations in 0.33 seconds), cost 0.257843
Iteration 650 (50 iterations in 0.33 seconds), cost 0.258827
Iteration 700 (50 iterations in 0.33 seconds), cost 0.253146
Exaggerating Ps by 5.000000
Iteration 750 (50 iterations in 0.33 seconds), cost 9.311141
Iteration 800 (50 iterations in 0.35 seconds), cost 13.777067
Iteration 850 (50 iterations in 0.33 seconds), cost 13.705468
Iteration 900 (50 iterations in 0.33 seconds), cost 14.461525
Iteration 950 (50 iterations in 0.33 seconds), cost 14.457056
Iteration 999 (50 iterations in 0.33 seconds), cost 14.456085
Wrote the 150 x 2 data matrix successfully.
Done.

So in this case, it's not going up consistently - probably because the data set is different, but it does go up. I still have no idea what's going on, but it seems to be the "correct" behaviour. Maybe @linqiaozhi can shed some light on this?

@linqiaozhi (Member) commented:

@dkobak @pavlin-policar I like the adjusted KL divergence, if only because people don't get freaked out by the fact that it changes so wildly when exaggeration turns on/off.

@pavlin-policar I think the reason that the cost is bouncing around in the last 250 iterations is that the gradient descent is overshooting. This dataset has only 150 points, so after a few iterations it has essentially converged. So you are sitting at the minimum and bouncing around with every iteration -- sometimes it goes up, sometimes it goes down.

This behavior is much more apparent with exaggeration, because exaggeration effectively increases the step size. That's not strictly true, because exaggeration not only increases the magnitude of each step but also changes its direction (whereas increasing the step size only increases the magnitude). So, when you are already at the minimum, increasing the magnitude of each step via late exaggeration means you will overshoot more easily.

Note that you see the same behavior without late exaggeration (Iteration 550 to 600 it increases), but it is more obvious in the late exaggeration phase for the above reason. That is my interpretation, at least.

@dkobak (Collaborator, Author) commented Sep 23, 2018

@linqiaozhi @pavlin-policar George, I don't think you have fully appreciated Pavlin's and my confusion! Not sure if it's because you did not notice the apparent "paradox" or because its resolution was obvious to you :)

Here is the paradox:

  1. The t-SNE loss is KL(p || q).
  2. Exaggeration means $p$ is replaced by $ap$.
  3. Meaning that after the exaggeration the loss is KL(ap || q).
  4. Elementary calculation shows that KL(ap || q) = a * KL(p || q) + a * log(a).
  5. Which is a linear function of KL(p || q), meaning that it has the same minimum.
  6. Meaning that the exaggeration should have zero effect.
  7. But we know that it has a very strong effect.
  8. PARADOX.

To be honest, yesterday I was stupefied by this. Points 1-2 are true by definition, and 4-6 are really straightforward. Where is the mistake??!!

Well, I think I got it now. The mistake is in 3. The t-SNE loss after exaggeration is not KL(ap || q). When one uses the standard loss KL(p || q) to arrive at the expression for the gradient, which is a sum of attractive and repulsive forces, along the way one uses the fact that \sum p_ij = 1; this is used for the repulsive term. When doing exaggeration, the coefficient \alpha is inserted into the attractive term only, meaning that the resulting gradient is not the gradient of KL(ap || q), because \sum (ap_ij) is not 1. Does this make sense? It's a bit cumbersome to tell without writing all the math out.
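
Sketching it anyway, with the usual notation $w_{ij} = (1 + \|y_i - y_j\|^2)^{-1}$, $Z = \sum_{k \neq l} w_{kl}$ and $q_{ij} = w_{ij}/Z$: the standard gradient is

$$\frac{\partial\, \text{KL}(p \,\|\, q)}{\partial y_i} = 4\sum_j (p_{ij} - q_{ij})\, w_{ij}\, (y_i - y_j) = 4\Big(\sum_j p_{ij} w_{ij} (y_i - y_j) - \frac{1}{Z}\sum_j w_{ij}^2 (y_i - y_j)\Big),$$

where the first term is the attractive force and the second the repulsive force. The repulsive term comes from differentiating $\log Z$, and getting it with coefficient 1 uses exactly $\sum_{ij} p_{ij} = 1$. Exaggeration multiplies only the attractive term by $\alpha$, so the result is not the gradient of KL(ap || q).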

This resolves the paradox. As far as the computation of KL(p||q) is concerned, I think Pavlin's adjustment still works fine, because the KL computation is not influenced by the gradient and remains correct. It also isn't surprising anymore that KL(p||q) can go up during late exaggeration.

What I am wondering now is: what actually is the form of the loss that is (implicitly) optimized when using exaggeration?

@linqiaozhi (Member) commented:

@dkobak Oh, I see what you are saying! Yes I totally missed the point with my first response!

Yes, you are absolutely right. I think that part of the confusion was in the fact that in the code, we implement early/late exaggeration by multiplying the P_ijs by a constant alpha to give \tilde{P_ij}. This suggests that we are minimizing the KL divergence between \tilde{P_ij} and Q_ij. However, this is not the case.

As you know, when you work out the gradient of the KL divergence between P_ij and Q_ij, you end up with two terms, corresponding to the attractive force and the repulsive force. We multiply this attractive force by alpha to make it stronger. This changes everything... the result is that we are no longer optimizing the KL divergence between P_ij and Q_ij, but something totally different. To figure out what we are optimizing (as you asked at the end of your message), one would essentially have to integrate this new "gradient" and see what it looks like. That actually might be quite interesting; not sure how hard it would be.

But anyways, it will not look like KL(\tilde{P_ij}, Q_ij). As you pointed out, if you go to Appendix 1 of the original t-SNE paper, where van der Maaten and Hinton derive the derivative of KL(P_ij, Q_ij), you'll see that they use the fact that P_ij sums to 1. If you plug in \tilde{P_ij} instead of P_ij and follow that derivation, I expect that the resulting gradient would not be alpha*(Attractive Term) + (Repulsive Term).

In other words, I do not believe the "adjusted KL divergence" that we are now reporting is actually what is being minimized...but probably something close to it.

@dkobak (Collaborator, Author) commented Sep 23, 2018

@linqiaozhi Yes.

I think this might be relevant to the stuff we've been discussing via email about exaggeration vs. fatter tails.

> To figure out what we are optimizing (as you asked at the end of your message), one would essentially have to integrate this new "gradient" and see what it looks like. That actually might be quite interesting; not sure how hard it would be.

One actually does not need to integrate. I looked at the derivation of the gradient, and the whole thing can be followed backwards very straightforwardly. The standard loss is

$$\text{KL}(p \,\|\, q) = \text{const} - \sum_{ij} p_{ij} \log q_{ij} = \text{const} - \sum_{ij} p_{ij} \log w_{ij} + \sum_{ij} p_{ij} \log Z = \text{const} - \sum_{ij} p_{ij} \log w_{ij} + \log Z,$$

where $q_{ij} = w_{ij}/Z$.

Introducing $\alpha$ in the gradient is equivalent to introducing $\alpha$ into the first term of this loss. So this gives

$$L = -\alpha \sum_{ij} p_{ij} \log w_{ij} + \log Z = -\sum_{ij} (\alpha p_{ij}) \log w_{ij} + \sum_{ij} (\alpha p_{ij}) \frac{\log Z}{\alpha} = -\sum_{ij} (\alpha p_{ij}) \log \frac{w_{ij}}{Z^{1/\alpha}},$$

meaning that the new loss is (up to an additive constant) $\text{KL}(\alpha p \,\|\, w/Z^{1/\alpha})$.

However, using $\alpha p$ here is equivalent to using $p$ (it only transforms the loss linearly), so one can equivalently say that the new loss is $\text{KL}(p \,\|\, w/Z^{1/\alpha})$.

I like it! The input probabilities stay the same, but the output probabilities (not really probabilities anymore, as they do not sum to 1) are changed by effectively decreasing Z. In the context of our email discussion, the question now becomes: is there a different output kernel $w$ (e.g. a fatter-tailed one) that could mimic this behaviour to some extent?
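
As a sanity check, this is easy to verify with finite differences on a tiny random example (a throwaway numpy sketch, nothing to do with the actual FIt-SNE code):

```python
# Check: 4 * (alpha * F_attr - F_rep) is the exact gradient of
# L(Y) = -alpha * sum_ij p_ij log w_ij + log Z,
# i.e. of KL(p || w / Z^(1/alpha)) up to an additive constant.
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 7, 12.0
P = rng.random((n, n)); P = P + P.T
np.fill_diagonal(P, 0.0); P /= P.sum()     # symmetric, sums to 1
Y = rng.normal(size=(n, 2))

def W(Y):
    D2 = np.sum((Y[:, None, :] - Y[None, :, :])**2, axis=-1)
    w = 1.0 / (1.0 + D2)
    np.fill_diagonal(w, 0.0)
    return w

def loss(Y):
    w = W(Y)
    return -alpha * np.sum(P[P > 0] * np.log(w[P > 0])) + np.log(w.sum())

def exaggerated_gradient(Y):
    w = W(Y)
    Z = w.sum()
    diff = Y[:, None, :] - Y[None, :, :]                   # diff[i, j] = y_i - y_j
    F_attr = np.sum((P * w)[:, :, None] * diff, axis=1)    # attractive forces
    F_rep = np.sum((w**2 / Z)[:, :, None] * diff, axis=1)  # repulsive forces
    return 4 * (alpha * F_attr - F_rep)

num = np.zeros_like(Y)                                     # central finite differences
eps = 1e-6
for i in range(n):
    for d in range(2):
        Yp, Ym = Y.copy(), Y.copy()
        Yp[i, d] += eps; Ym[i, d] -= eps
        num[i, d] = (loss(Yp) - loss(Ym)) / (2 * eps)

print(np.allclose(num, exaggerated_gradient(Y), atol=1e-5))  # True
```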

@linqiaozhi (Member) commented:

@dkobak Wow, I like that very much...this seems to be a really fruitful perspective on the early/late exaggeration phases.

I'll have to think more about what this means in terms of the output kernel and late exaggeration--I agree it could lead to a more principled approach. Very cool.

@pavlin-policar (Contributor) commented:

@linqiaozhi, @dkobak Awesome that you've figured it out. I too was wrong at the third point in @dkobak's list, but looking at the derivation, it makes sense. Intuitively, it seemed to me that if we want to pull the points closer together, this should change the objective, but I was really confused how the correction was linear for the exaggeration phase if the cost was changed. Anyways, really cool.

One implication of this expression, I guess, is that piling on more exaggeration won't really help much, because of the 1/a relation: switching from no exaggeration to 12 will have a much larger effect than switching from 12 to 24. Does that sound about right?

@dkobak (Collaborator, Author) commented Sep 24, 2018

@pavlin-policar Well, going from 1 to 12 is a 12x increase, so I'd guess a comparable increase would be from 12 to 144, rather than from 12 to 24. But it's hard to say; I think our understanding of exaggeration is still very poor.

In general, one should really distinguish "early" exaggeration from "late" exaggeration (by the way, I haven't seen any benefit in using a phase with exaggeration=1 between early and late; I prefer to switch from early to late directly). Early exaggeration is an optimization trick. Late exaggeration is a meaningful change of the loss function. For MNIST with n=70k, early alpha=12 and late alpha=4 work quite well, see https://github.com/KlugerLab/FIt-SNE/blob/master/examples/test.ipynb. For the 10x dataset with n=1.3 mln, Linderman & Steinerberger seem to suggest something like alpha=1.3e+6/10/200 = 650 (given learning rate h=200) for the early phase (right, @linqiaozhi?), whereas I find that alpha=8 works well for the late phase. I am doing some additional experiments with it right now...

@dkobak (Collaborator, Author) commented Sep 24, 2018

@linqiaozhi Running the current master on the same data as in the first post of this issue (n=1.3 mln), I see that the cost computation still happens on a single thread; I think this is due to #40.

More worryingly, however, I now get 160s runtime per 50 iterations, instead of 130s previously (before the big PR over the weekend) :-(

@linqiaozhi (Member) commented Sep 24, 2018

@dkobak Wow, that's not the right direction!

I compared the current master with 61e4a2b, for N=1 million, and the times are about the same on my machine. Until we address #40, this is what I would have expected, since as I explained with the "negative result" in that PR over the weekend, we didn't really get much out of multithreading the repulsive forces.

However, it should definitely not be slower! I wonder if the multithreading in "Step 3" of that function is causing overhead on your architecture. Would you mind comparing the times with and without multithreading the n_body_fft_2d() function? That is, you could add the line nthreads=1 to here and compare to times without that line?

Anyways, this is hugely helpful. I am finding that small improvements from multithreading don't generalize well, and can even backfire in very bad ways on different machines (which is hard to know unless other people compare the times on their machines). If we confirm that the problem is overhead from multithreading, then we can just remove it for that function. The real boost that you get from multithreading is for the attractive forces, anyways.

@dkobak (Collaborator, Author) commented Sep 24, 2018

I recompiled the source and restarted my Python scripts. I get 168s with the current master, 174s if n_body_fft_2d() is called with 1 instead of nthreads in https://github.com/KlugerLab/FIt-SNE/blob/master/src/tsne.cpp#L786, and 132s if I do git checkout 61e4a2b...

The machine is an Intel Xeon CPU E3-1230 v5 @ 3.40GHz, 4 cores with 2 threads each.

@linqiaozhi (Member) commented:

@dkobak I also tried comparing on my Linux server, and still can't replicate the difference in your timings.

In my fork, I added timings to 61e4a2b in a branch called profiled_legacy. Would you mind compiling that branch with -DTIME_CODE, and also compiling the master of that fork with -DTIME_CODE, and posting the results? You could just copy the resulting binaries into whatever test directory structure you have already been using, if that is easier. Hopefully this will tell us what is slowing things down. Sorry to bother you with this...

For example, on my server, a typical iteration has the following timings (I added in the time of the error computation too):

61e4a2b:
Step 1: 140 ms						
FFT: 9 ms                                               
Step 3: 77 ms                                           
Total Interpolation: 261 ms                             
Attractive Forces: 306 ms                               
Computing Error: 12734 ms                               

Current:
Step 1: 138 ms
FFT: 8 ms
Step 3: 20 ms
Total Interpolation: 201 ms
Attractive Forces: 332 ms
Computing Error: 14900 ms                               

Everything should be the same except for Step 3, which is a bit faster due to the parallelization.

@dkobak (Collaborator, Author) commented Sep 24, 2018

Sure. The timings jump around a bit (it would be good to see averages over some iterations...), especially Attractive Forces (which is also by far the longest here), but from quickly eyeballing the output for a couple of minutes it seems that maybe Step 1 got consistently slower? However, the difference is tiny, and I don't quite understand why the difference per 50 iterations is 160s vs 130s.

I have to leave now, let me double check it tomorrow.

profiled_legacy:

Step 1: 119 ms
FFT: 7 ms
Step 3: 77 ms
Total Interpolation: 222 ms
Attractive Forces: 2298 ms

Master:

Step 1: 136 ms
FFT: 7 ms
Step 3: 38 ms
Total Interpolation: 201 ms
Attractive Forces: 2362 ms

@linqiaozhi (Member) commented Sep 24, 2018

Okay, thanks. When you get the chance, could you also report the times for the error computation? That's essentially all that's left... In the times you just sent, the interpolation in master is faster in total (because of the speedup of Step 3).

@dkobak (Collaborator, Author) commented Sep 24, 2018

Yes. Something does not add up here. I'm pretty sure that the error took the same time (around 30s). I'll try to analyse the output for 50 consecutive iterations.

@dkobak (Collaborator, Author) commented Sep 25, 2018

Good morning! I spent a fair amount of time profiling the two versions of code and preparing a plot of the timing difference for each step over 50 iterations...

... before realizing that in profiled_legacy the error evaluation is excluded from the reported time of 50 iterations
https://github.com/linqiaozhi/FIt-SNE/blob/profiled_legacy/src/tsne.cpp#L501
whereas in the master it is included:
https://github.com/linqiaozhi/FIt-SNE/blob/master/src/tsne.cpp#L527

That's exactly the 30s difference I was observing 😱 Apart from that, the interpolation did get a little faster and the rest is unchanged.

We should both be embarrassed now! :)

@linqiaozhi (Member) commented:

@dkobak Alas...oh man, that is rough. So sorry you had to spend all that time on it.

Anyways, thanks for figuring it out. Of course, that explains it perfectly. That should have been the first thing I thought of, but I guess I did not realize that I made that change when I switched timing libraries (which was just so that it would run on Windows). Ouch, man.

@dkobak (Collaborator, Author) commented Sep 25, 2018

No problem, that's how these things are... I am relieved now.

Closing this issue.
