Storage and Data structures #14

Open
stranger-danger-zamu opened this issue Oct 25, 2021 · 2 comments

@stranger-danger-zamu

Data Structs

Your code comments question the use of a Vec (std::vec::Vec). From what I know, it's a perfectly serviceable stack. You could spring for std::collections::VecDeque instead so you can properly cycle (i.e. move the topmost item to the bottom) instead of only swapping the first and second items.
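To illustrate the cycle-versus-swap distinction, here is a minimal sketch. It assumes the back of the deque is the top of the stack; the function names `cycle` and `swap_top` are made up for this example and are not taken from Sigi's code.

```rust
use std::collections::VecDeque;

/// Move the top item to the bottom of the stack.
/// Assumption: the back of the deque is the "top".
fn cycle<T>(stack: &mut VecDeque<T>) {
    if let Some(top) = stack.pop_back() {
        stack.push_front(top);
    }
}

/// Swap the top two items; does nothing if there are fewer than two.
fn swap_top<T>(stack: &mut VecDeque<T>) {
    let len = stack.len();
    if len >= 2 {
        stack.swap(len - 1, len - 2);
    }
}
```

Both operations avoid shifting the other elements: `pop_back`, `push_front`, and `swap` on a VecDeque are constant time (amortized for the push). A plain Vec would have to shift every element to insert at the bottom.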

You could also spring for a std::collections::BTreeMap where the key is the item's position in the stack, so the BTreeMap naturally keeps the items sorted for you. Moving items within the stack becomes more expensive unless it is a limited operation (i.e. swapping the first two items, or cycling the top to the bottom, where you can just take the bottommost key and "decrement" from there). But everything is a trade-off.
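A rough sketch of the position-keyed idea follows. It assumes the largest key is the top of the stack and uses signed keys so the bottom can be "decremented" past zero. The type `PositionedStack` and its methods are hypothetical, and `BTreeMap::pop_last` needs Rust 1.66 or newer.

```rust
use std::collections::BTreeMap;

/// Stack keyed by position. Assumption: the largest key is the top.
struct PositionedStack<T> {
    items: BTreeMap<i64, T>,
}

impl<T> PositionedStack<T> {
    fn new() -> Self {
        Self { items: BTreeMap::new() }
    }

    /// Push onto the top by using one more than the current largest key.
    fn push(&mut self, item: T) {
        let next = self.items.keys().next_back().map_or(0, |k| k + 1);
        self.items.insert(next, item);
    }

    /// Pop the item with the largest key.
    fn pop(&mut self) -> Option<T> {
        self.items.pop_last().map(|(_, item)| item)
    }

    /// Cycle the top to the bottom: take the bottommost key and decrement it.
    fn cycle(&mut self) {
        if self.items.len() < 2 {
            return;
        }
        if let Some((_, top)) = self.items.pop_last() {
            let bottom = self.items.keys().next().map_or(0, |k| k - 1);
            self.items.insert(bottom, top);
        }
    }

    /// Items listed from top to bottom.
    fn top_to_bottom(&self) -> Vec<&T> {
        self.items.values().rev().collect()
    }
}
```

Cycling only touches two keys, which is the cheap case described above. Moving an item to an arbitrary position in the middle would require renumbering its neighbours, and that is where the extra cost shows up.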


Storage

Currently you store every item as a JSON object in a JSON array and write that to disk. There was some comment about scaling beyond 10k items (or stacks, I can't remember), but that limit might just come down to disk access. You could glue everything into one big JSON object and read and/or write only once per CLI invocation. Or you could glue just the related stacks together, since you don't need to deal with cross-stack transfers outside of related stacks.

For anything more robust, you are probably best off using SQLite. Note that you don't have to do anything fancy for the schema: you could use a single key-value table where the key is "equivalent" to the file path and the value is the content. You also get transactions, so concurrent access is viable if you set journal_mode to WAL.

@booniepepper
Collaborator

Thanks again for the interest in my little project!

For data structures and storage, I've been playing around with ways to represent a stack-based database over in my https://github.com/hiljusti/kamajii project (proofs of concept so far, although several of them already do the core push/pop/list functionality). The idea I'm arriving at over there is to have a daemon running that can serve as both a persistence layer and a memory cache. Using SQLite is something I've definitely considered, but I have a broader vision for what a stack-based database paradigm can do. (Although Sigi is definitely my first candidate for a client.)

Once I make more progress over there, I think that instead of loading entire stacks into memory and manipulating them, I'll use simple transactions for actions like create, fire-and-forget for something like next that needs to traverse deeply, and IO streams for anything large like list.

This should also open up the possibility of decentralized data. (E.g. I could run a server for my lists, use Sigi from my laptop, smartphone, other devices, etc., and have them all sync.)

I'm open to feedback here though; I'm not sure I want to close this just yet.

@stranger-danger-zamu
Author

I think that having a data persistence interface would be great for Sigi (the organization tool). It'd allow you to separate the data persistence mechanics from the Sigi UI code. The interface for the backends could either expose a repository's stacks and have Sigi operate on those stacks (e.g. item = pop_from($stack), push_onto($target_stack, item)), or expose higher-level APIs that operate on the repository and hide the internals of the backend (e.g. pop_push(from=$stack, to=$target_stack)).
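As a sketch of how both styles could live in one interface, here is a hypothetical trait with an in-memory backend. `StackBackend`, `MemoryBackend`, and the method names are invented for illustration and do not come from Sigi or kamajii.

```rust
use std::collections::HashMap;

/// Hypothetical persistence interface; not Sigi's actual API.
trait StackBackend {
    /// Low-level primitives that operate on a named stack.
    fn push(&mut self, stack: &str, item: String);
    fn pop(&mut self, stack: &str) -> Option<String>;

    /// Higher-level operation. The default composes the primitives;
    /// a backend such as SQLite could override it with one transaction.
    /// Returns false when the source stack has nothing to move.
    fn pop_push(&mut self, from: &str, to: &str) -> bool {
        match self.pop(from) {
            Some(item) => {
                self.push(to, item);
                true
            }
            None => false,
        }
    }
}

/// Simplest possible backend: everything lives in memory.
#[derive(Default)]
struct MemoryBackend {
    stacks: HashMap<String, Vec<String>>,
}

impl StackBackend for MemoryBackend {
    fn push(&mut self, stack: &str, item: String) {
        self.stacks.entry(stack.to_string()).or_default().push(item);
    }

    fn pop(&mut self, stack: &str) -> Option<String> {
        self.stacks.get_mut(stack)?.pop()
    }
}
```

With this shape, the UI code depends only on the trait, so a file-system, SQLite, or daemon-backed implementation could be swapped in without touching it.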

Supporting other backends such as the local file system or SQLite would be super helpful, since sometimes a user doesn't want to run a server or daemon. Or they are already running other servers and would like to reduce overhead in constrained environments (e.g. reusing a Redis instance on a Raspberry Pi rather than starting another process).

On the other hand, I totally get just sticking to your stack-based database idea. Have you looked at Redis for the server? I'm pretty sure it has most of the functionality you want, and if you want custom operations you can implement them via Lua scripts.
