Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better hash table implementation #1

Open
wants to merge 11 commits into
base: master
from

Conversation

Projects
None yet
2 participants
@jessymilare
Copy link
Owner

commented Jan 17, 2019

No description provided.

jessymilare added some commits Jan 8, 2019

Fix: SRFI-69 don't use handles with weak tables
 anymore.

Many procedures used hashx-get-handle and hashx-create-handle! without
checking whether real-hash-table was weak. Now that isn't the case
anymore (Bug 33827). A bug was fixed in hash-table-merge! and a test case
was added. A few other optimizations were made.

* module/srfi/srfi-69.scm (alist->hash-table):
(hash-table-delete!):
(hash-table-exists?):
(hash-table-ref): Don't use hashx-get-handle.
(hash-table-set!): If weakness is set, don't use hashx-create-handle!
and don't update size.
(hash-table-update!): If weakness is set, don't use hashx-get-handle
and don't update size.
(hash-table-update!/default): Added an implementation that doesn't call
hash-table-update!, avoiding allocating a procedure.
(hash-table-size): Set ht-size for weak hash-tables.
(hash-table-walk):
(hash-table-copy): Use native hash-for-each instead of hash-table-fold.
(hash-table->alist): Use native hash-map->list instead of hash-table-fold.
(hash-table-merge!): Use native hash-for-each instead of hash-table-fold.
Walks over other-ht rather than walking ht (and doing nothing).
* test-suite/tests/srfi-69.test: all appropriate test are replicated for
all possible #:weak arguments. Added a test for hash-table-merge!.
Implemented GENERIC-HASH-TABLES module.
This module implements an interface to intermediate hash tables and can
be used to implement SRFI-69, SRFI-125, SRFI-126 and R6RS hash table
libraries. That way, we avoid duplication of code, missing features and
incompatibilities. It reuses current SRFI-69 code and its procedures are
mostly based on SRFI-125 specifications with some changes. It does not
depend on SRFI-128: instead of using comparators, the procedures accept
the same arguments that are accepted by MAKE-HASH-TABLE. The weakness
argument is as specified by SRFI 126.
Fix bugs in GENERIC-HASH-TABLES
HASH-TABLE-PRUNE! didn't update size after removing keys.
HASH-TABLE-DELETE! accessed hash function and associator
  once per key, instead of accessing only once per
  procedure call.

* module/ice-9/generic-hash-tables.scm (hash-table-prune!):
  Now updates size after removing keys (bug).
* (hash-table-delete!): use WITH-HASHX-VALUES outside of loop,
  so that hash function and associator are accessed only once.
Created a procedure that returns the size of a hash table
The module (ICE-9 GENERIC-HASH-TABLES) used to keep track of hash table
size by itself. Now, a procedure HASH-N-ITEMS was implemented in
'libguile/hashtab.c' to access the n_items field of Guile hash table
structure.

* libguile/hashtab.c (scm_hash_n_items): created, it returns the number
  of items that the given hash table has. It works for normal and weak
  hash tables.
* module/ice-9/generic-hash-tables.scm: removed 'size' field of
  <generic-hash-table> record type. No procedures need to update it
  anymore.
(hash-table-size): now accesses the size using HASH-N-ITEMS. That
  guarantees O(1) procedure time.
(let ((len (length args)))
(unless (even? len)
(error "Odd number of key-value pairs" args))
(let* ((capacity (quotient len 2))

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

let* is not required here.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 22, 2019

Author Owner

I removed all useless let* in that file.

`(let ((,(third bindings) (ht-real-table ,ht-var))
(,(first bindings) (ht-hash-function ,ht-var))
(,(second bindings) (ht-associator ,ht-var)))
. ,body-forms))

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

I would have done this with a procedure and call-with-values.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 20, 2019

Author Owner

I would have done this with a procedure and call-with-values.

;; I am a macro only for efficiency, to avoid varargs/apply.

This comment was in old SRFI-69 code. If the compiler is able to inline and optimize away the construction of the argument list and call to apply, I think change to a procedure would be a good idea, otherwise I think this should be kept as it is.

(define (get-hash-functions equiv-function hash-function)
"Returns an internal and an external hash function."
(cond
(hash-function (cond

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

I am not a great fan of this style:

(cond hash-function (cond ...) ...)

What I prefer is something like:

(let ((foo (compute-hash-function hash-function)))
  (cond foo ...)

Basically, less nested code.

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

Also cond doesn't take a predicate so I am not sure what this code does.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 22, 2019

Author Owner

I split the code in two procedures, it should be clearer now.

;; SRFI-69 should give HASH as default hash-function.
((or (eq? (@ (guile) hash) hash-function)
(eq? hash hash-function))
(values (@ (guile) hash) hash-function))

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

Here and below the code returns multiple values but there is not receive or call-with-values or let-values

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 21, 2019

Author Owner

Actually, there is. %make-hash-table uses receive.

(cond
(hash-function (cond
;; SRFI-69 should give HASH as default hash-function.
((or (eq? (@ (guile) hash) hash-function)

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

(@ (guile) hash) is used several time through the code, maybe it is worth defining it as (define guile-hash (@ (guile) hash)) somehere to the top of the file.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 22, 2019

Author Owner

Imported guile:hash, guile:assoc and guile:make-hash-table now.

(define (print-hash-table ht port)
(let ((equiv-name (procedure-name (hash-table-equivalence-function ht))))
(format port "#<generic-hash-table ~@[~a ~]~@[~a ~]size: ~a mutable? ~a>"
equiv-name (ht-weakness ht)

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

here it's not clear to me what ht-weakness returns, also you prefix with size: and mutable?: but not but there is no prefix for weakness

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 22, 2019

Author Owner

Yes, I agree hash-table-weakness should have a clear documentation, I can change that tomorrow. About printing, weakness can be #f (in which case it's not printed) or it is one of weak-key, weak-value or weak-key-and-value, so I think prefixing it with weakness: seems redundant to me IMHO.

(define (hash-table-key-weakness ht)
"Returns WEAK-KEYS if HT has weak keys, or #f otherwise."
;; If Guile ever supports ephemeral keys, this procedure should
;; return EPHEMERAL-KEYS if the HT keys are ephemeral.

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

It should return EPHEMERAL-KEYS or symbol ephemeral-keys. As far as I now upper case words refer to parameters.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 22, 2019

Author Owner

Done.

(define (hash-table-value-weakness ht)
"Returns WEAK-VALUES if HT has weak values, or #f otherwise."
;; If Guile ever supports ephemeral values, this procedure should
;; return EPHEMERAL-VALUES if the HT values are ephemeral.

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

same as before.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 22, 2019

Author Owner

Done.

(car handle) (car kvs)))
(set-cdr! handle (cadr kvs)))
(loop (cddr kvs)))))))
ht)))

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

Too big procedure it is.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 22, 2019

Author Owner

I split the procedure in two.

(else
(let ((new (updater (failure))))
(hashx-set! h a real-table key new)))))))
*unspecified*)

This comment has been minimized.

Copy link
@amirouche

amirouche Jan 20, 2019

I think I remember Mark saying *unspecified* should not be used.

This comment has been minimized.

Copy link
@jessymilare

jessymilare Jan 20, 2019

Author Owner

It's a global exported value, if it shouldn't be used, it should be internal. Plus, old SRFI-69 code already used that and 'tests/r6rs-hashtables.test' fails if these procedures don't return *unspecified*.

Refactor code in (ICE-9 GENERIC-HASH-TABLES)
* module/ice-9/generic-hash-tables.scm: rename variables to more
  clear names: HT changed to HASH-TABLE, GUILE-HT-CTOR changed to
  GUILE-HASH-TABLE-CONSTRUCTOR, the fields of generic-hash-table
  record HASH-FUNCTION changed to INTERNAL-HASH-FUNCTION and
  ORIGINAL-HASH-FUNCTION to HASH-FUNCTION, etc.
(get-hash-functions): simplify the code callying a new auxiliary
  procedure named MAYBE-WRAP-HASH-FUNCTION.
(hash-table): simplify the code callying a new auxiliary procedure
  named %MAKE-HASH-TABLE-WITH-ARGS.
(hash-table-key-weakness):
(hash-table-value-weakness): symbols in docstrings are now lowercase.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.