Improve performance #12

gregr · 2016-08-13T01:25:21Z

Fix improper walks by list-split-ground

We were only getting the benefit of set-var-val! vars by not properly
passing the state's substitution to walk. This fix reduces latency by
between 1/3 and 2/3 on the harder benchmarks (the harder, the more
savings, as usual).

We were only getting the benefit of set-var-val! vars by not properly passing the state's substitution to walk. This fix reduces latency by between 1/3 and 2/3 on the harder benchmarks (the harder, the more savings, as usual).

gregr · 2016-08-13T01:27:58Z

So the former 3s benchmark (which was 30s before today) now takes about 2s, and the new former 19s benchmark now takes about 7s.

gregr · 2016-08-13T02:58:03Z

FYI, I'm also experimenting with optionally limiting search depth in faster-mk. I have a depth-limiting version which about doubles performance again on top of this pull request (2s -> 1s, 7s -> 4s).

Unfortunately I've still yet to see the newest/hardest queries finish without running out of memory, even then.

gregr · 2016-08-13T03:22:58Z

I finally found a way to synthesize virtually all of append (in under 10 seconds, even):

(define append
  (lambda (l s)
    (if ,A ,B ,C))

But the change needed is disappointing to have to make. I changed the definition of if-primo to expect the condition to always return a boolean. This means that instead of (=/= #f t) for the true case we have (== #t t), which is not the usual Scheme semantics.

The next hardest query (having to guess that if is needed in the first place) still doesn't finish.

webyrd · 2016-08-13T06:18:21Z

How are you keeping the synthesis from cheating in these examples? With
careful tests?

The next hardest query (having to guess that if is needed in the first
place) still doesn't finish.

I think it would be reasonable, when running on a machine with many cores,
to have at least one core guess that the definition is recursive, and
therefore has an if as its body. Of course, speeding up synthesis so
that this isn't needed would be nice, but I think for more complex examples
we'll need these sorts of heuristics/speculative evaluations.

But the change needed is disappointing to have to make. I changed the
definition of if-primo to expect the condition to always return a
boolean. This means that instead of (=/= #f t) for the true case we have (== #t t), which is not the usual Scheme semantics.

Once again, given enough parallelism at the hardware level, it seems
reasonable to have at least one thread guess that the test of an if
returns a Boolean. Also, we should probably have at least one thread guess
that the entire definition is well-typed (for some type system closer to
that of ML or whatever), and try to perform type inference. If this
happens to be the case, synthesis could be far faster.

On Fri, Aug 12, 2016 at 9:22 PM, Greg Rosenblatt notifications@github.com
wrote:

I finally found a way to synthesize virtually all of append (in under 10
seconds, even):

(define append
(lambda (l s)
(if ,A ,B ,C))

But the change needed is disappointing to have to make. I changed the
definition of if-primo to expect the condition to always return a
boolean. This means that instead of (=/= #f t) for the true case we have (==
#t t), which is not the usual Scheme semantics.

The next hardest query (having to guess that if is needed in the first
place) still doesn't finish.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABHThJR-4OjRnyA_Q_oEvzxQkEmC8QF7ks5qfTiSgaJpZM4Jjkzt
.

gregr · 2016-08-13T09:31:38Z

Great! Running multiple versions in parallel should also allow us to make use depth limiting for faster queries in many cases without jeopardizing completeness.

To get this new synthesis result, I haven't had to change the nature of test cases at all, believe it or not. I'm using the same test cases as seen in test-old-interp.scm for the other benchmarks. I just changed the interpreter and it worked everything out by itself!

gregr · 2016-08-13T09:45:57Z

I just managed to synthesize all of append from test cases alone, with Barliman's starting skeleton! That means it guesses the name append, the parameter structure of the lambda, and that an if statement will work.

(define ,A
  (lambda ,B ,C))

Guess:

(define append
  (lambda (_.0 _.1)
    (if (null? _.0)
      _.1
      (cons (car _.0) (append (cdr _.0) _.1)))))

Side conditions: (sym _.0 _.1)

Still using the same test cases as before, with the boolean-only if semantics, but now I've moved if-primo out of prim-expo and placed it just ahead of variadic lambda application. Synthesis takes about 39 seconds.

gregr · 2016-08-13T09:52:16Z

I should note that the 39 second timing is with a depth limit. Without the limit, I haven't seen a result even after a few minutes of waiting.

gregr · 2016-08-13T09:57:33Z

By the way, it turns out you don't even need Barliman's starting skeleton for the full synthesis result. Without the skeleton, you get the same result in the same amount of time.

gregr · 2016-08-13T10:04:52Z

By providing just a little more than the starting skeleton ...

(define append
  (lambda (l s) ,A))

... performance improves dramatically. This one completes correctly in less than 5 seconds.

gregr · 2016-08-13T14:55:01Z

Oh, if you're confused as to how this is possible without additional tests to prevent cheating, I forgot to remind you that the depth limit itself also helps to protect against cheating.

To avoid confusion, note that this effect is not caused by cheats being pruned by the depth limit directly: the cheats we've seen are found at shallower depths than real answers, so that cannot explain this effect.

The real explanation seems to be that interesting lines of search retain more search resources as their siblings die off from hitting the limit. Interesting lines of search are disproportionately rewarded by this sibling pruning, compared with cheating lines of search.

This is the same reason that non-gensym test cases with a depth limit are enough to solve some of the problems that require gensym test cases without a limit. We figured this out while trying to understand why run-da-dls can solve hard 9 without gensyms in its test cases a while back: https://github.com/gregr/racket-misc/blob/8ee15173e25521b57bf14507c6bf61422fd6a57e/interp-tests.rkt#L740-L754

gregr · 2016-08-13T18:16:11Z

I've pushed an experimental branch (don't merge it!) for depth-limited search and tweaks to if semantics for you to play with: https://github.com/webyrd/Barliman/commits/do-not-merge-experimental-dls

You can manually adjust max-search-depth in mk.scm to adjust the depth limit. Let me know if you have any questions.

webyrd · 2016-08-13T18:17:36Z

If we had ~100 hardware threads, could we just increase the
max-search-depth exponentially on several threads? Would that well, do you
think?

On Sat, Aug 13, 2016 at 12:16 PM, Greg Rosenblatt notifications@github.com
wrote:

I've pushed an experimental branch (don't merge it!) for depth-limited
search and tweaks to if semantics for you to play with:
https://github.com/webyrd/Barliman/commits/do-not-merge-experimental-dls

You can manually adjust max-search-depth in mk.scm to adjust the depth
limit. Let me know if you have any questions.

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABHThBQ9Fr39lEmd1MYEPmmLV0KAwt4Qks5qfgnsgaJpZM4Jjkzt
.

gregr · 2016-08-13T18:18:30Z

Absolutely, that's the exact arrangement I was thinking we should try!

webyrd · 2016-08-13T18:21:38Z

Okay! It's probably time to think about getting access to a beefy machine,
or renting an Amazon X1 with spot pricing. :)

On Sat, Aug 13, 2016 at 12:18 PM, Greg Rosenblatt notifications@github.com
wrote:

Absolutely, that's the exact arrangement I was thinking we should try!

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABHThAjoQJHQxnnFUdKZSkffJhcAvjORks5qfgp3gaJpZM4Jjkzt
.

gregr · 2016-08-13T18:22:19Z

Here's a screenshot of the complete synthesis:

gregr · 2016-08-13T18:24:59Z

For the simple examples we've been trying, a small depth is always going to work and parallel processes aren't going to help much.

Maybe we should first find more involved examples that can benefit from searching across multiple depth limits simultaneously.

webyrd · 2016-08-16T05:45:22Z

I was able to use your branch to synthesize all of 'zip' in ~30 seconds,
using the three obvious tests with gensyms. I had no luck synthesizing
map, perhaps partly because I wasn't sure which tests might make sense.

On Sat, Aug 13, 2016 at 12:24 PM, Greg Rosenblatt notifications@github.com
wrote:

For the simple examples we've been trying, a small depth is always going
to work and parallel processes aren't going to help much.

Maybe we should first find more involved examples that can benefit from
searching across multiple depth limits simultaneously.

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABHThCrkJ5PLqwG4dwyrkyrvUuRXWTaAks5qfgv7gaJpZM4Jjkzt
.

gregr · 2016-08-16T15:32:28Z

It seems to have a lot of trouble figuring out to pass the procedure as an argument. This doesn't seem to come back:

(define map
  (lambda (f xs)
    (if (null? xs)
     '() (cons (f (car xs)) (map ,A (cdr xs))))))

But it doesn't take too long for it to guess everything else once you help it with passing the procedure:

(define map
  (lambda (f xs)
    (,A ,B
     ,C (,D ,E (,F f ,G)))))

I wonder why this is. Maybe backwards information flow in this case is more difficult?

webyrd · 2016-08-16T16:45:26Z

Seems like unfolding might be very helpful here. We are no where close to
using the full information of mostly-ground code. I'm not sure which
approach would work best in general, though.

Do you think Mercury would be able to do something clever with

`
(define map
(lambda (f xs)
(if (null? xs)
'() (cons (f (car xs)) (map ,A (cdr xs))))))

`

On Tue, Aug 16, 2016 at 9:32 AM, Greg Rosenblatt notifications@github.com
wrote:

It seems to have a lot of trouble figuring out to pass the procedure as an
argument. This doesn't seem to come back:

(define map
(lambda (f xs)
(if (null? xs)
'() (cons (f (car xs)) (map ,A (cdr xs))))))

But it doesn't take too long for it to guess everything else once you help
it with passing the procedure:

(define map
(lambda (f xs)
(,A ,B
,C (,D ,E (,F f ,G)))))

I wonder why this is. Maybe backwards information flow in this case is
more difficult?

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABHThJrLWUyzmzVYkdhRbHTPi_dRI_wJks5qgdgNgaJpZM4Jjkzt
.

gregr · 2016-08-16T17:08:19Z

Not really thinking in terms of Mercury anymore, but I have an idea for using mostly-instantiated code to get to the point faster. I'm going to try adding a conde1 to faster-mk, representing conde with cases that are all mutually exclusive w.r.t. some goal prefix (index), and defer interleaving for as long as we have enough information to make a deterministic choice (pruning all but one branch based on the prefixes).

webyrd · 2016-08-16T19:02:40Z

That might work!

There has to be a way to take advantage of the determinism, when we have
it. I also wonder if this might be a reasonable case for abstract
interpretation or partial evaluation.

On Tue, Aug 16, 2016 at 11:08 AM, Greg Rosenblatt notifications@github.com
wrote:

Not really thinking in terms of Mercury anymore, but I have an idea for
using mostly-instantiated code to get to the point faster. I'm going to try
adding a conde1 to faster-mk, representing conde with cases that are all
mutually exclusive w.r.t. some goal prefix (index), and defer interleaving
for as long as we have enough information to make a deterministic choice
(pruning all but one branch based on the prefixes).

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABHThHLJf1OrFBOLsk65_OO70Fc6AL4rks5qge6GgaJpZM4Jjkzt
.

gregr · 2016-08-16T20:05:57Z

Unfortunately my first attempt with conde1 hasn't improved performance, and may have actually reduced it by activating the garbage collector more often. I'll have to figure this out later.

But I did find a set of tests that makes it easier to find a definition for map by passing primitives as the procedure, probably because they are simpler to evaluate than lambdas.

This query completes in about 20 seconds:

(define map
  (lambda (f xs)
    ,A))

Tests:
(map ',g1 '())
'()

(map car '((,g2 . ,g3)))
`(,g2)

(map cdr '((,g4 . ,g5) (,g6 . ,g7)))
`(,g5 ,g7)

Best Guess:
(define map (lambda (f xs) (if (null? xs) xs (cons (f (car xs)) (map f (cdr xs))))))

gregr · 2016-08-16T20:14:59Z

The full query takes just under 3 minutes:

,A

Best Guess:
(define map (lambda (_.0 _.1) (if (null? _.1) _.1 (cons (_.0 (car _.1)) (map _.0 (cdr _.1))))))

Side conditions:
(sym _.0 _.1)

webyrd · 2016-08-18T01:47:56Z

Nice! I particularly like this test:

(map ',g1 '())
'()

Parallelization/templated starting guesses should help a lot, I'd think.

On Tue, Aug 16, 2016 at 2:15 PM, Greg Rosenblatt notifications@github.com
wrote:

The full query takes just under 3 minutes:

,A

Best Guess:
(define map (lambda (_.0 _.1) (if (null? _.1) .1 (cons (.0 (car _.1)) (map _.0 (cdr _.1))))))

Side conditions:
(sym _.0 _.1)

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABHThMTVIEd3NTOH-eV12tRjrVXEXceAks5qghpEgaJpZM4Jjkzt
.

gregr added 2 commits August 12, 2016 21:16

Revert accidental removal of handle-matcho

0325fd0

Fix improper walks by list-split-ground

01255cb

We were only getting the benefit of set-var-val! vars by not properly passing the state's substitution to walk. This fix reduces latency by between 1/3 and 2/3 on the harder benchmarks (the harder, the more savings, as usual).

webyrd merged commit 2358146 into webyrd:master Aug 13, 2016

gregr deleted the improve-performance branch August 13, 2016 18:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance #12

Improve performance #12

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 16, 2016

gregr commented Aug 16, 2016

webyrd commented Aug 16, 2016

gregr commented Aug 16, 2016

webyrd commented Aug 16, 2016

gregr commented Aug 16, 2016

gregr commented Aug 16, 2016

webyrd commented Aug 18, 2016

Improve performance #12

Improve performance #12

Conversation

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 13, 2016

gregr commented Aug 13, 2016

gregr commented Aug 13, 2016

webyrd commented Aug 16, 2016

gregr commented Aug 16, 2016

webyrd commented Aug 16, 2016

gregr commented Aug 16, 2016

webyrd commented Aug 16, 2016

gregr commented Aug 16, 2016

gregr commented Aug 16, 2016

webyrd commented Aug 18, 2016