use of in-set makes thing slower #17

Open
samth opened this issue Jan 5, 2016 · 16 comments


samth commented Jan 5, 2016

Right now, the in-neighbors sequence unnecessarily features a call to in-set, which wraps the set in an opaque sequence. Sets are sequences, so this isn't necessary, and makes everything slower.

Owner

stchang commented Jan 5, 2016

Thanks, I'll fix it but I'd say this is a bug with Racket. Using in-set with sets seems like a natural thing for programmers to do.

Owner

stchang commented Jan 5, 2016

@samth Actually, I'm not seeing any slowdown (with my existing tests). Are you using it with Typed Racket? Do you have an example?

Author

samth commented Jan 5, 2016

This did come from Typed Racket, where the sequence/c contract is slower than set/c. But you should be able to put in-set in a for loop over the neighbors and get a performance win compared to the sequence version.

Owner

stchang commented Jan 5, 2016

Strangely, I get slower times (by ~15%) without in-set (Racket 6.3.0.7).

Without in-set:

$ racket timing-test-in-neighbors.rkt
cpu time: 10221 real time: 10247 gc time: 4
cpu time: 10089 real time: 10113 gc time: 4
cpu time: 10088 real time: 10116 gc time: 0
cpu time: 10177 real time: 10203 gc time: 0
cpu time: 10093 real time: 10122 gc time: 4
cpu time: 10028 real time: 10054 gc time: 4
cpu time: 10005 real time: 10029 gc time: 4
cpu time: 10616 real time: 10643 gc time: 584
cpu time: 10037 real time: 10066 gc time: 0
cpu time: 10045 real time: 10067 gc time: 4

With in-set:

$ racket timing-test-in-neighbors.rkt
cpu time: 8797 real time: 8819 gc time: 8
cpu time: 8724 real time: 8743 gc time: 4
cpu time: 8681 real time: 8705 gc time: 16
cpu time: 8685 real time: 8705 gc time: 0
cpu time: 8680 real time: 8705 gc time: 8
cpu time: 8793 real time: 8814 gc time: 0
cpu time: 8688 real time: 8709 gc time: 0
cpu time: 9369 real time: 9394 gc time: 584
cpu time: 8928 real time: 8953 gc time: 0
cpu time: 8829 real time: 8851 gc time: 8

The test is:

(for ([i 10])
  (time
   (for* ([v (in-vertices g/scc)]
          [u (in-neighbors g/scc v)]
          [w (in-neighbors g/scc u)])
     (void))))

where g/scc has 875714 vertices and 5105043 edges.

Owner

stchang commented Jan 5, 2016

I pushed the test if you want to try it.

Author

samth commented Jan 5, 2016

Somehow I'm unable to run it properly, but what if you take out the in-set in in-weighted-graph-neighbors, and put it in the for loop?

Owner

stchang commented Jan 5, 2016

I changed the test to

(for ([i 10])
  (time
   (for* ([v (in-vertices g/scc)]
          [u (in-set (in-weighted-graph-neighbors g/scc v))]
          [w (in-set (in-weighted-graph-neighbors g/scc u))])
     (void))))

and I get the same (faster) times as above.

Owner

stchang commented Jan 5, 2016

This behavior is consistent with my past experience. Relying on the implicit conversion is generally slower.

Author

samth commented Jan 5, 2016

Right, but the call to in-set ought to be much faster when it can be specialized by the for loop (unless in-set doesn't do that?).

Owner

stchang commented Jan 5, 2016

> Right, but the call to in-set ought to be much faster when it can be specialized by the for loop (unless in-set doesn't do that?).

I don't understand this. It is faster with the call to in-set.

Author

samth commented Jan 5, 2016

Right, it is, but not much. Consider these three loops:

(define l (build-list 1000 add1))
(define lseq (in-list l))

(for/sum ([i l]) i)
(for/sum ([i lseq]) i)
(for/sum ([i (in-list l)]) i)

The third one will be much faster (about 5x by my count).

Then do the same for sets. All three set loops are about 10x slower than the slow list loops, and all three run at the same speed as each other. So in-set inside a for loop doesn't win, but it should.
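
For reference, a minimal sketch of the set version of that comparison. The names `s`, `sseq`, `sum-direct`, `sum-prebuilt`, and `sum-in-clause` and the repetition count are assumptions made for illustration, not code from this thread. The timings it prints will depend on the Racket version and on whether `in-set` is specialized by the `for` forms.

```racket
#lang racket

;; Hypothetical set analogue of the three list loops above.
(define s (list->set (build-list 1000 add1)))
(define sseq (in-set s))

;; 1. The set used directly as a sequence (generic sequence dispatch).
(define (sum-direct) (for/sum ([i s]) i))
;; 2. A sequence built ahead of time, opaque to the for clause.
(define (sum-prebuilt) (for/sum ([i sseq]) i))
;; 3. in-set written syntactically inside the for clause, the only
;;    form the for macros could in principle specialize.
(define (sum-in-clause) (for/sum ([i (in-set s)]) i))

;; Time each variant over many repetitions so differences are visible.
(for ([thunk (in-list (list sum-direct sum-prebuilt sum-in-clause))])
  (time (for ([n (in-range 1000)]) (thunk))))
```

All three loops compute the same sum, so any difference in the reported times comes only from how the sequence is iterated.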

Owner

stchang commented Jan 5, 2016

Oh I understand now. You're talking about expand-clause? You're right it looks like it doesn't specialize sets.

Owner

stchang commented Jan 5, 2016

I guess it's because "sets" are generic?

Author

samth commented Jan 5, 2016

Yes, but either that could work better, or we could have in-hash-set.

Owner

stchang commented Jan 5, 2016

Agreed. I'll look into it.

Owner

stchang commented Jan 7, 2016

Started a pull request: racket/racket#1199
