
add set->hash #2766

Open · wants to merge 6 commits into base: master
Conversation

@stchang (Contributor) commented Jul 26, 2019

Adds a set->hash function that returns the underlying hash table of a hash set.

Currently, it's impossible to efficiently iterate over a hash set manually; i.e., in-mutable-set and friends cannot be used when defining a custom iteration via define-sequence-syntax for a data structure that uses sets.

There are set-first and set-rest, but those are an order of magnitude slower than the hash iteration functions. We could manually add unsafe-immutable-set-first, etc., analogous to unsafe-immutable-hash-iterate-first, but this is easier.
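For context, here is a minimal sketch of the intended use, assuming the set->hash proposed in this PR (it does not exist in released Racket, so this only runs with the patch applied):

```racket
#lang racket
(require racket/set)

;; set->hash is the function proposed by this PR (hypothetical today):
;; for a hash set, it returns the backing hash table, whose keys are
;; the set's elements, enabling the fast hash iteration forms.
(define s (set 'a 'b 'c))          ; immutable hash set
(define ht (set->hash s))          ; proposed: backing hash, keys = elements
(for/list ([k (in-immutable-hash-keys ht)])
  k)                               ; iterate elements at hash speed
```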

@@ -128,6 +128,13 @@ respectively.

}

@defproc[(set->hash [st (or/c set? set-mutable? set-weak?)]) hash?]{

Converts a set to a hash table, if it is @tech{hash set}.
Contributor


Wording: "... if it is a hash set"?

Also, it might be worth mentioning that, if the set is mutable, changes to it will be reflected in the returned hash and vice versa.

@rmculpepper (Collaborator)

This seems to allow a user to make the set library violate its contracts and behave strangely. For example:

(define s (mutable-set 'apple))
(define h (set->hash s))
(hash-set! h 'orange 'orange)
(set-member? s 'orange) ;; => 'orange, not #t -- no contract error
(hash-set! h 'banana #f) 
s ;; => (mutable-set 'apple 'orange 'banana)
(set-member? s 'banana) ;; => #f 

Would the performance still be okay if the hash is returned with a chaperone that forbids mutation?
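A sketch of the chaperone idea, using the real chaperone-hash API; read-only-hash is an illustrative name, not part of racket/set:

```racket
#lang racket

;; Wrap a hash so reads pass through but mutation raises an error.
;; chaperone-hash takes a ref proc, a set proc, a remove proc, and a
;; key proc; raising in the set/remove procs forbids mutation.
(define (read-only-hash h)
  (chaperone-hash
   h
   (λ (h k) (values k (λ (h k v) v)))               ; ref: pass through
   (λ (h k v) (error 'hash-set! "read-only hash"))  ; set: forbidden
   (λ (h k) (error 'hash-remove! "read-only hash")) ; remove: forbidden
   (λ (h k) k)))                                    ; key: pass through

(define h (make-hash (list (cons 'apple #t))))
(define ro (read-only-hash h))
(hash-ref ro 'apple)      ; => #t
;; (hash-set! ro 'orange #t) would raise "read-only hash"
```

The open question in the thread is whether the chaperone indirection on every lookup would eat the speedup that motivated the PR.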

@97jaz (Contributor) commented Jul 30, 2019

Or can we define the hash set's membership predicate in terms of hash-has-key? instead of hash-ref?
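The difference this suggestion addresses, sketched with illustrative helpers (not the actual racket/set internals):

```racket
#lang racket

;; With a #f value in the backing table, a hash-ref based membership
;; test reports a false negative; hash-has-key? does not.
(define table (hash 'banana #f))

(define (member?/ref t e) (and (hash-ref t e #f) #t)) ; hash-ref based
(define (member?/key t e) (hash-has-key? t e))        ; 97jaz's suggestion

(member?/ref table 'banana)  ; => #f (wrong: 'banana is a key)
(member?/key table 'banana)  ; => #t
```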

@gus-massa (Contributor)

Doesn't this leak too much information about the internal implementation of set? Perhaps in the future someone will make an implementation of set that uses a list for sets with a small number of elements and switches to a hash when it gets big.

@stchang (Contributor, Author) commented Aug 1, 2019

Or can we define the hash set's membership predicate in terms of hash-has-key? instead of hash-ref?

@rmculpepper Is this acceptable?

@stchang (Contributor, Author) commented Aug 1, 2019

Doesn't this leak too much information about the internal implementation of set?

I feel it's ok because:

  1. It doesn't actually reveal anything about the implementation. It's just a conversion function, like the many other X->Y functions in Racket.
  2. It's explicitly a function for hash sets; giving any other kind of argument will error. It's not part of the generic set interface.

What do others think?

@97jaz (Contributor) commented Aug 1, 2019

@stchang There are a bunch of other places in the implementation that would need to be similarly changed, e.g., custom-set=?, where hash-ref is used to check for the existence of a key in the other table.[*] Same thing for custom-[proper]-subset?, etc.

[*] Aside: the implementation of custom-set=? looks rather inefficient.

@stchang (Contributor, Author) commented Aug 1, 2019

@stchang There are a bunch of other places in the implementation that would need to be similarly changed

Good catch, thanks. I think I got them all now.

[*] Aside: the implementation of custom-set=? looks rather inefficient.

What did you have in mind? Is something like this any faster?

  (and (= (hash-count table1) (hash-count table2))
       (for/and ([k1 (in-hash-keys table1)]
                 [k2 (in-hash-keys table2)])
         (and (hash-has-key? table1 k2)
              (hash-has-key? table2 k1))))

@97jaz (Contributor) commented Aug 1, 2019

What did you have in mind?

I was thinking we'd do what the expander does.

For immutable hashes, this will typically be a good deal faster, especially in the negative case.
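"What the expander does" here refers to comparing key sets with hash-keys-subset?, a real primitive for hashes that share a key-comparison function. A sketch, with an illustrative name:

```racket
#lang racket

;; Two hashes have equal key sets iff their counts match and one key
;; set is a subset of the other. For immutable hashes this can stop
;; early on shared subtrees, which is why it is also fast in the
;; negative case.
(define (key-sets-equal? t1 t2)   ; illustrative name
  (and (= (hash-count t1) (hash-count t2))
       (hash-keys-subset? t1 t2)))

(key-sets-equal? (hash 'a 1 'b 2) (hash 'a 3 'b 4)) ; => #t
(key-sets-equal? (hash 'a 1) (hash 'b 1))           ; => #f
```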

@stchang (Contributor, Author) commented Aug 1, 2019

Ah, I didn't know about hash-keys-subset?. Several other set functions should probably also use that. Would hash-union offer similar speedups?

@97jaz (Contributor) commented Aug 1, 2019

Would hash-union offer similar speedups?

Not right now, but it could, in principle.

@rmculpepper (Collaborator)

Here's another example that violates the invariants of a custom set type:

> (define-custom-set-types mumble
    ;; these details aren't important
    #:elem? exact-integer?
    (lambda (a b) (= (modulo a 7) (modulo b 7)))
    (lambda (a) (modulo a 7))
    (lambda (b) 1))
> (define s (make-mutable-mumble))
> (set-add! s 1)
> (set-add! s 2)
> (define h (set->hash s))
> h
(hash (wrapped-elem 1 ...) #t (wrapped-elem 2 ...) #t)
> (hash-set! h 8 #t)
> (set->list s)
; custom-elem-contents: contract violation
;   expected: custom-elem?
;   given: 8

Also, anyone iterating over the hash would need some way of extracting the actual key from the wrapped-elem structure.

@stchang (Contributor, Author) commented Aug 1, 2019

OK, I understand now. What if it only worked for immutable hash sets? Or could it just be added as an unsafe function?

@97jaz (Contributor) commented Aug 1, 2019

I guess I understand, too, but it's not clear to me that this library respects its own abstractions:

#lang racket/base
(require racket/set)

(define-custom-set-types mumble
    #:elem? exact-integer?
    (lambda (a b) #f))

(define s1 (make-mutable-mumble))
(define s2 (make-mutable-mumble))

;; According to the comparison proc,
;; these 5s should be distinct.
(set-add! s1 5)
(set-add! s1 5)

(set-add! s2 5)
(set-add! s2 5)

(set-count s1) ;; => 1
(set=? s1 s2) ;; => #t

@gus-massa (Contributor)

I think that set->hash should be like list->vector or any of the other x->y functions that create a "new" object that is not connected to the old object. (I think the x->stream functions are an exception.) I like the idea of initially restricting the function to immutable sets.

Perhaps it's worth noting in the docs that it may not return a fresh hash. I prefer to avoid promising to return the same hash, even though that is the obvious implementation. (Unless someone adds in the future a special case for small sets that secretly uses lists instead of hashes.)

In the future, set->hash can be extended to other sets, where the function creates a "new" hash on the spot. Perhaps an immutable hash. Perhaps it may be cached, because I guess a very common use case is to add some stuff to the set and later use it without adding or removing elements. If the immutable hash is cached, the function may return the same hash, or for some reason it may decide to return a new hash. So it's better to avoid any promise.

(I'm a strong proponent of "Make an RFC for immutability and freshness in the standard library" racket/rhombus-prototype#22 and I think it's better to future-proof this function. Restricting it to immutable sets is a good compromise if it is enough to get the speedup.)

@stchang (Contributor, Author) commented Aug 1, 2019

I think that set->hash should be like list->vector or any of the other x->y functions that create a "new" object that is not connected to the old object.

Good point. I guess my first point in #2766 (comment) is not really true then, since it doesn't create a new object. I worry that making a copy would negate any performance gain, though.

@gus-massa (Contributor)

What about set->immutable-hash? When the result is immutable, it is not so important to get a fresh one.

@rmculpepper (Collaborator)

Re set->immutable-hash: for a custom set, if the hash is exposed directly then it still has the key wrappers. If a new hash is created with unwrapped keys, then what kind? An equal?-based hash might lose elements, if equal? equates more keys than the custom set's equality, and an eq? or eqv? hash would be useless for doing lookups. Anyway, if allocation isn't a problem, then set->list seems preferable for iteration.

It seems to me that set->hash (including variants proposed above) doesn't solve the efficient iteration problem in general. I think that adding a partial solution might cause more confusion than good.

Going back to the original issue, what's an example of something that can't currently be done efficiently?

@stchang (Contributor, Author) commented Aug 8, 2019

what's an example of something that can't currently be done efficiently

A concrete example is implementing sequences for my graph library, where graphs are implemented with hashes and sets.

I'm trying to improve iteration speed by implementing sequences that take advantage of the "clause transform" case in define-sequence-syntax.

With this example:

  1. If the sequences default to the expression case (current implementation), the test runs in ~7.2sec.
  2. If I switch to define-sequence-syntax and :do-in and I use set-first and set-rest in the clause-transform case, the test run time balloons to ~31sec.
  3. If instead I extract the hash and then iterate on that, the test runs in ~1.7sec.

An alternative to extracting the hash would be to add unsafe-immutable-set-first, etc., analogous to unsafe-immutable-hash-iterate-first and friends?

@gus-massa (Contributor)

I'll try to read the code more carefully later. Just two ideas:

  • Rename to set->immutable-dict? Can the result be a hash?

  • Does the runtime change if you use for*/and instead of for*, and change (void) to (and u v w)? I'm always worried that the optimizer may notice that the variables are unused and eliminate the last hash-ref or something. Most of the time the optimizer is not smart enough to notice that for*/and is doing nothing.

@stchang (Contributor, Author) commented Aug 9, 2019

Forgot some details.

Here is the code for the new sequences I defined. To repeat my experiments, change the example to use the following in-weighted-graph-neighbors instead of in-neighbors.

;; slow: with set-first/set-rest
(define-sequence-syntax in-weighted-graph-neighbors
  (λ () #'in-weighted-graph-neighbors*)
  (syntax-parser
    [[(id) (_ g-expr v-expr)]
     ;; with set-first/rest
     (for-clause-syntax-protect
      #'[(id)
         (:do-in
          ;; outer bindings
          ([(ht) (hash-ref (get-adjlist g-expr) v-expr)])
          ;; outer check
          #t ;; TODO: fix
          ;; loop bindings
          ([s ht])
          ;; pos check
          (not (set-empty? s))
          ;; inner bindings
          ([(id) (set-first s)])
          ;; preguard
          #t
          ;; post guard
          #t
          ;; loop args
          ((set-rest s)))])]
    [_ #f]))
;; fast: with set->hash
(define-sequence-syntax in-weighted-graph-neighbors
  (λ () #'in-weighted-graph-neighbors*)
  (syntax-parser
    [[(id) (_ g-expr v-expr)]
     ;; with set->hash
     (for-clause-syntax-protect
      #'[(id)
         (:do-in
          ;; outer bindings
          ([(ht) (set->hash (hash-ref (get-adjlist g-expr) v-expr))])
          ;; outer check
          #t ;; TODO: fix
          ;; loop bindings
          ([i (unsafe-immutable-hash-iterate-first ht)])
          ;; pos check
          i
          ;; inner bindings
          ([(id) (unsafe-immutable-hash-iterate-key ht i)])
          ;; preguard
          #t
          ;; post guard
          #t
          ;; loop args
          ((unsafe-immutable-hash-iterate-next ht i)))])]
    [_ #f]))

The timings do not change when using for*/sum instead of for*.

@rmculpepper (Collaborator)

I haven't run this, but couldn't you write it as the following instead?

(define-sequence-syntax in-weighted-graph-neighbors
  (λ () #'in-weighted-graph-neighbors*)
  (syntax-parser
    [[(id) (_ g-expr v-expr)]
     (for-clause-syntax-protect
      #'[(id) (in-set (hash-ref (get-adjlist g-expr) v-expr))])]))

@stchang (Contributor, Author) commented Aug 9, 2019

couldn't you write it as the following instead?

I'll try it but I didn't think def-sequence-syntaxes were composable like that.

@stchang (Contributor, Author) commented Aug 9, 2019

I'll try it but I didn't think def-sequence-syntaxes were composable like that.

I tried the in-set suggestion. The timing is unimproved (it's the same as the old implementation), which suggests that it's using the "expression" clause and not the "fast path". My understanding is that the "fast path" is only used when the sequence constructor directly appears syntactically in the for loop.

@rmculpepper (Collaborator)

Ah, it appears that in-set always calls the generic method. If in-mutable-set or in-immutable-set is applicable, try that instead.

The expansion of the following example suggests that sequence syntaxes can expand into other sequence syntaxes and get the fast path behavior:

(define-sequence-syntax in-range-from-1
  (lambda () (error 'unimplemented))
  (lambda (stx)
    (syntax-case stx ()
      [[(id) (_ n)]
       (for-clause-syntax-protect
        #'[(id) (in-range 1 n)])])))
(syntax->datum (expand '(for/list ([i (in-range-from-1 10)]) i)))

@97jaz (Contributor) commented Aug 10, 2019

I was just playing around with this, and @rmculpepper's version works and cuts the time on my machine from about 11.1s to about 6.3s. So, a big improvement.

On the other hand, it is still about 3x what you get if you replace racket/set with the implementation in expander/common/set. The overhead in racket/set for the common case is pretty onerous.

[Update:]
In Racket 7.4 CS, the difference between the two is smaller. in-immutable-set is right around 4s (as opposed to 6.3s in Racket 7.3), and in-set (using expander/common/set) is a bit slower than in 7.3, running at about 2.8s (as opposed to around 2s flat).

@stchang (Contributor, Author) commented Aug 10, 2019

OK, I tried it too, and I'm seeing a similar almost-2x speedup. Thanks for the suggestion, Ryan.

It's still 2-3x slower than just iterating on the underlying hash for some reason, which is strange because the code is similar. The only difference is the (identity) wrapper functions. Could that make such a big difference?

I can't think of another way to extract the underlying hash that addresses the issues Ryan pointed out though, other than maybe putting it in racket/unsafe? Otherwise I guess I'll close this ticket.

@rmculpepper (Collaborator)

In define-in-set-sequence-syntax in collects/racket/private/set-types.rkt, manually defunctionalizing fn (that is, binding a flag instead and doing an if test in the right-hand side of the id binding) reduces the times from about 3.5s to 3s on my machine. Eliminating the test entirely cuts it to 1.5s (at the expense of being wrong on custom sets).

Bizarrely, starting with the defunctionalized version and changing (custom-elem-contents _) to (unsafe-struct-ref _ 0) also cuts the time down to about 1.5s, even though IIUC that's the path not taken. So if you see the same speedup, and the tests pass, and you know that no bad keys can sneak into the hash, then maybe that's the way to go.

@97jaz (Contributor) commented Aug 10, 2019

Bizarrely, starting with the defunctionalized version and changing (custom-elem-contents _) to (unsafe-struct-ref _ 0) also cuts the time down to about 1.5s

You might get the same effect by declaring the custom-elem struct type as #:authentic.
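The #:authentic suggestion, sketched on an illustrative struct (the real custom-elem lives in collects/racket/private/set-types.rkt):

```racket
#lang racket

;; #:authentic declares that instances can never be impersonated or
;; chaperoned, which lets the compiler use direct field access, much
;; like unsafe-struct-ref but without giving up the struct check.
(struct custom-elem (contents) #:authentic)

(custom-elem-contents (custom-elem 42)) ; => 42
```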

@97jaz (Contributor) commented Aug 10, 2019

A bit off-topic: it's not a big surprise that racket (classic) has better iteration performance than racketcs on immutable hashes. And racketcs does do better on constructing the graph. But when I tested a version of racketcs that uses a HAMT instead of the current Patricia trie implementation for immutable hashes, it did better on both construction and iteration. That was surprising. (The microbenchmark I've used in the past has consistently shown that the Patricia tries are so much faster at writes than HAMTs that it doesn't make sense to use the latter.) But this seems like a better benchmark:

Patricia version:

building the graph
cpu time: 6510 real time: 6512 gc time: 2431
cpu time: 6398 real time: 6399 gc time: 2329
cpu time: 6231 real time: 6233 gc time: 2126
cpu time: 6247 real time: 6249 gc time: 2161
cpu time: 6697 real time: 6698 gc time: 2622
cpu time: 6508 real time: 6511 gc time: 2418
cpu time: 5821 real time: 5823 gc time: 1737
cpu time: 6151 real time: 6152 gc time: 2067
cpu time: 6603 real time: 6604 gc time: 2556
cpu time: 6390 real time: 6391 gc time: 2328

traversing the graph
cpu time: 2684 real time: 2685 gc time: 13
cpu time: 2738 real time: 2739 gc time: 13
cpu time: 2843 real time: 2845 gc time: 13
cpu time: 2877 real time: 2877 gc time: 14
cpu time: 2794 real time: 2797 gc time: 13
cpu time: 2758 real time: 2759 gc time: 13
cpu time: 2736 real time: 2737 gc time: 13
cpu time: 2741 real time: 2741 gc time: 13
cpu time: 2719 real time: 2719 gc time: 12
cpu time: 2719 real time: 2720 gc time: 13

HAMT version:

building the graph
cpu time: 5673 real time: 5676 gc time: 1385
cpu time: 5817 real time: 5819 gc time: 1555
cpu time: 5752 real time: 5755 gc time: 1458
cpu time: 5785 real time: 5793 gc time: 1463
cpu time: 5763 real time: 5768 gc time: 1459
cpu time: 5727 real time: 5730 gc time: 1485
cpu time: 5782 real time: 5788 gc time: 1501
cpu time: 5700 real time: 5704 gc time: 1450
cpu time: 5880 real time: 5884 gc time: 1492
cpu time: 5889 real time: 5892 gc time: 1484

traversing the graph
cpu time: 2531 real time: 2532 gc time: 18
cpu time: 2550 real time: 2552 gc time: 18
cpu time: 2550 real time: 2551 gc time: 18
cpu time: 2586 real time: 2588 gc time: 18
cpu time: 2587 real time: 2588 gc time: 18
cpu time: 2564 real time: 2565 gc time: 18
cpu time: 2541 real time: 2542 gc time: 18
cpu time: 2531 real time: 2532 gc time: 18
cpu time: 2563 real time: 2563 gc time: 17
cpu time: 2589 real time: 2590 gc time: 18

I think I need to collect some more benchmarks.

@rmculpepper (Collaborator)

I just tried #:authentic, and it doesn't seem to make a difference.

@stchang (Contributor, Author) commented Aug 12, 2019

Bizarrely, starting with the defunctionalized version and changing (custom-elem-contents _) to (unsafe-struct-ref _ 0) also cuts the time down to about 1.5s, even though IIUC that's the path not taken.

I see similar speedup but I'm confused why. Is there some kind of speculative execution going on? Is it normal for speculative execution to affect performance so much?

@samth (Sponsor Member) commented Aug 12, 2019

It's unlikely that it's speculative execution; instead it's probably changing either inlining decisions or whether the compiler can prove that the function never errors/always returns 1 value/something else useful.

@samth (Sponsor Member) commented May 7, 2020

I think this is unlikely to be merged in the current form, but I'm interested in (a) what this looks like on 7.7 Racket CS and (b) whether the improvements to the graph library actually got used. @stchang ?

@97jaz (Contributor) commented May 9, 2020

Here are some benchmarks comparing 7.5cs vs 7.7cs (so, a patricia-based immutable hash with stencil vectors). Stencil vectors appear to be a marked improvement for the graph benchmarks.

results

@samth (Sponsor Member) commented May 9, 2020

The k-cfa slowdown is a little worrying there.

@97jaz (Contributor) commented May 9, 2020

@samth This big difference is consistent with my earlier patricia vs. hamt benchmarks (where the hamts in question were not using stencil vectors).

@97jaz (Contributor) commented May 9, 2020

Here's an older measurement, pre-stencil vectors:
results

"7.4" here is the CS version. "7.4/hamt" is the same version using a non-stencil vector hamt implementation. "7.3/C" is the BC build.
