Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sparse Set #24

Open
ctreffs opened this issue Apr 14, 2021 · 2 comments
Open

Sparse Set #24

ctreffs opened this issue Apr 14, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@ctreffs
Copy link

ctreffs commented Apr 14, 2021

A Sparse Set is a specialized data structure for storing elements from a large amount of possible values (e.g. natural numbers) that are used very sparingly, optimized for iteration, addition and removal to the set.

Use cases are for example are components attached to an entity in an Entity-Component System (ECS), if you are dealing with arrays of indices, but it can also be used comparably to a binary search tree or hash map, ideally outperforming both of them.

The main advantages of a Sparse Set are its performance which can be
O(1) for it's primary actions (source spareset/lib.rs) and its low memory footprint.

Example Implementations:

Further Reading:

@ctreffs ctreffs added the enhancement New feature or request label Apr 14, 2021
@lorentey
Copy link
Member

Yep, this is something we'd be interested in adding. I think one major additional inspiration for the implementation could be LLVM's SparseSet implementation which @milseman says is an implementation of the 1993 paper by Preston Biggs and Linda Torczon. (Incidentally, we had a discussion on SparseSet on the forum's topic on TicketMap.)

If I understand correctly, Fireblade's implementation is effectively the same as what this package calls OrderedSet. (Although SparseSet has different semantics for remove, those semantics can be implemented on top of OrderedSet's swap and remove operations.)

Rather than wrapping OrderedSet, I'd expect SparseSet to implement its own storage (or build it from Array), and not rely on a conventional hash table at all. This is somewhat up for debate, of course -- but I suspect something like RawRepresentable might be a better constraint on the element type than Hashable.

I have objections to characterizing SparseSet as having a low memory footprint: the "sparse" table it uses needs to have as many elements as the size of the key universe, which can be very large, even for tiny sets. (Note: This doesn't negate its usefulness, it just puts an asterisk on it.)

@ctreffs
Copy link
Author

ctreffs commented Apr 16, 2021

Thanks for the differentiated support for the topic and I'm very glad I did not put up a request for something nobody else finds useful 😅 .

Fireblade implements UnorderedSpareSet solely for its current context, so it's not a very good template for a SparseSet implementation in regards to this package.
LLVM's implementation is a much better fit for that!
Furthermore I agree that RawRepresentable would be a better fit as a general constraint on the element type.
But this might be even part of the generic constraints on the SparseSet itself rather than being an intrinsic requirement?

In regards to memory footprint you are right as well considering the typical implementation of a SparseSet. Fireblade's implementation (for better or worse) backs its sparse storage with a Dictionary mitigating pre-allocation of the whole universe.
This has performance drawbacks in favour to its memory footprint.
An implementation in this package could however provide a selection for SparseSet's storage implementation allowing the user to choose between an implementation favouring performance over memory footprint or the other way round. Im thinking of something like SparseSet(storage: .memoryOptimized). A balanced approach to its storage implementation would however be much more user friendly.

I am however no expert in the field and barely managed to cobble together my own SparseSet implementation so I do not have particular expertise in the field of performance optimization of generalized collections, so please do not read too much into my opinions 😄

Thanks again for considering the proposal and I'm looking forward to seeing this become a part of this awesome package sometime in the future 👍

@ejmarchant ejmarchant mentioned this issue Aug 13, 2021
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants