New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Port Remacs Lisp_Hash_Table to use FnvHashMap. #286

Closed
wants to merge 92 commits into
base: master
from

Conversation

Projects
None yet
3 participants
@DavidDeSimone
Collaborator

DavidDeSimone commented Aug 7, 2017

This is the WIP commit for #276.

TODO List:

  • Manually derive PartialEq and Eq for HashableLispObject
    • Implement user defined hash function and cmp function.
  • Add "purecopy" functionality.
  • Remove def. of Lisp_Hash_Table from C/Rust
  • Rename all "LispHashMap" functions to "LispHashTable" equiv.
    • Make "LispHashTable" be properly tagged as a HashTable pseudovector, with proper sizing.
    • Convert "get_hash_table_or_error" to return a LispHashTableRef that references a LispHashTable, instead of a Lisp_Hash_Table.
  • Have "LispHashTable" be allocated by the Lisp GC, and managed by the Lisp GC
    • Add a "hashtable_finalize" method for the cleanup at the rust layer.
  • Add weak table support.
  • Handle profiler.c's custom hash/cmp function for it's hash_table_test
  • Fix temacs build stage segfaults
  • Final test/integration pass
  • Final performance/benchmarking pass

DavidDeSimone added some commits Jul 28, 2017

Adding more functionality for LispHashMaps. This will test the abilit…
…y for the GC to find and cleanup these new objects
Updating the 'sweep' method for the Rust LispGC to have better perfor…
…mance by saving an iteration over the data. Adding more tests
WIP refactor of the LispGarbageCollector for better performance. We w…
…ill now have the ability to easily have per object type free lists.
Adding a tuple struct for LispVectorlikeHeader. Adding ability to tag…
… a pseudovector with a LispVectorlikeHeader.
Changing the LispGarbageCollector from a static class over to a struc…
…t with methods. This makes testing much easier, as we can create an instance of the GC per test. We create a helper macro to use the default static GC.
Fixing issue where mod hashmap couldn't find the 'garbage_collect' ma…
…cro, due to mod alloc being imported AFTER mod hashtable
Switching over hash table hasher to fnv hashing, to improve performan…
…ce. Updating the garbage collector to reduce the amount of hash table allocations in a block from 4096 Hash Tables to 4096 bytes worth of hash tables.
Removing level of indirection from the BlockAllocator, removing Vec<B…
…ox<...>> to just Vec<...>. Adding some inline directives for small functions. Adding some more GC related tests.
Removing alloc.rs file. Attempting to migrate the GC along side movin…
…g the hashtable over proved to be too much for a single PR. Instead we will work with the current GC, adding a minimal amount of C code needed to support this functionality.
Adding more functionality to the new LispHashTable, including ability…
… to clear the table, and the ability to get it's count.
Adding mark_hashtable and hashtable_finalize methods. This is the sta…
…rt of integrating our new hash table type into the lisp gc.
Changing LispHashTable::weak from a boolean to a LispObject, as it ca…
…n hold different symbol values based on weak gc behavior
Initial commit for handling the 'purespace' allocation problem for th…
…e new LispHashTable. This function has undergone basic testing, and seems to be working.
Stage one of converting the exsisting C code for a Lisp_Hash_Table ov…
…er to using a new Rust struct. We first update the Rust to assume that it is the definitive Hash Table struct.
@DavidDeSimone

This comment has been minimized.

Collaborator

DavidDeSimone commented Sep 11, 2017

So this is making good progress, and all but one testing is passing! This ended up being slightly more difficult then I anticipated, due to the fact that most of the C-layer makes a lot of assumptions about the internals of HashTables and how they work.

I need to spend some time benchmarking and optimizing this guy to make sure that we are still getting the advertised gains, but I hope to have this PR ready for wide-review by the end of the month.

DavidDeSimone and others added some commits Sep 28, 2017

Merge remote-tracking branch 'origin/master' into hash-table-v4
Conflicts:
	rust_src/Cargo.toml
	rust_src/alloc_unexecmacosx/src/lib.rs
	rust_src/remacs-sys/lib.rs
	rust_src/src/hashtable.rs
	rust_src/src/lib.rs
	rust_src/src/lisp.rs
	src/lisp.h
Updating our hash insertion function for a performance increase. Upda…
…ting hash key update to properly update our hashkey for both the map and the key and value array.
Merge branch 'master' into hash-table-v4
Conflicts:
	rust_src/src/lib.rs
Merge branch 'master' into hash-table-v4
 Conflicts:
	rust_src/src/lisp.rs
David DeSimone
Updating set_hash_value_slot to properly update the value of the inte…
…rnal hashmap. Adding some asserts to track down an OSX Segfault issue.
David DeSimone
David DeSimone
Refactor of our hashtable purecopy function, to make our purecopies 1…
…00% consistent with their previous incarnations, including the free list.
@DavidDeSimone

This comment has been minimized.

Collaborator

DavidDeSimone commented Oct 6, 2017

After a long journey, and a lot of code, I am going to advise that we do not use Rust's HashMap at this time, and that we make a straight port of existing HashTable code.

Reasons:

  • While the lisp layer has a very clean API for interacting with a HashTable, the C layer makes a lot of assumptions on the structure of the HashTable.
  • There are certain operations that update the key of the hash table in place. While I have a possible way of doing this, I am very leery of it, because we are making Rust's HashMap do something it explicitly doesn't support.
  • In order to support APIs that the C code bases uses to interact with hash tables (including an API to update a key/value in constant time by directly writing into it's private data vector), we have to maintain a lot of boilerplate.
  • After testing and playing around with the current code writing, I believe there is a strong chance that this refactor will introduce divergence and incompatibility between upstream in the future.
  • unexec does not support allocations from mmap(2). alloc.c has special case code that will disable mmap(2) in certain cases, and we will have to emulate that to support unexec'ing larger hashtables.

If no one objects, I am going to close this PR, and will open on a PR that focuses on converting the existing HashTable from C to Rust piecemeal.

@brotzeit

This comment has been minimized.

Collaborator

brotzeit commented Oct 6, 2017

If you think that's the best solution do it. Maybe we should avoid making things too complicated for now.

@Wilfred

This comment has been minimized.

Owner

Wilfred commented Oct 9, 2017

Whilst this hasn't resulted in us landing code that uses FnvHashMap, it's a really useful conclusion. Thanks for exploring this so thoroughly!

Could you comment on where the code assumes that it can mutate keys in hash tables? I would have thought we will eventually want to refactor that out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment