Minimal / simple implementation of SQLite-backed history #376

phiresky · 2022-04-01T21:21:03Z

This PR implements a SQLite-based history. I kept it minimal for easy merging and future expansion.

It should reproduce all features of the txt-backed history (arrow navigation, prefix filtering, history menu etc) and passes all the tests from file_backed. The History trait stayed almost the same so far.

It doesn't really offer any additional features currently except storing an arbitrary context together with each history entry. In the future there should be functionality to filter by session, PWD, hostname, etc, but those will require deeper changes.

when run with --features sqlite, the demo binary will store the history into history.sqlite3 and also add the run timestamp as metadata.

some notes:

most of the history methods I think should probably be fallible. i didn't change this since i don't know what kind of error propagation you want - so right now there's more unwrap than should be.
the tests for file and sqlite backend should probably be merged (same for every "History" impl). right now i copy-pasted all the tests to the sqlite backed history, but barely modified anything.
i think the "cursor" functionality (back, forward, etc) should be separated out into a separate trait, but it works as is
i (intentionally) didn't implement the capacity limiting feature
the iterator currently loads everything in memory (like the file-based one does), to fix that some of the ownership / lifetimes would need changing
right now i json-serialize the context so the library user can decide what to store. this is inefficient in regards to storage, but it's going to be hard to store structured data in a generic manner otherwise. One solution might be to use a binary serialization method that's more efficient, but due to the sqlite json functions it's easy to create efficient indices on the schema as it is right now (e.g. create index on history(context->>'cwd', context->>'timestamp') for directory-based history

fdncred · 2022-04-02T00:28:07Z

@phiresky, First of all, let me say, wow! wow! wow! I was excited to see this PR. Thank you very much.

As you may have heard, if you participate on Discord, that we've been planning this feature for over 7 months now. Your PR kind of justifies our ideas that there would be a demand for such a feature. We have many plans on what I've been calling historydb for some time now. In fact, you'll laugh at this, I had a dream about it last night.

Having thought about this so much, I personally have written two demos/prototypes for such a feature. The most recent, prototype, is here. Without going into a ton of detail, my gut reaction after the wows was to look at my early prototype and your code here and compare and contrast. To be clear, I'm confident that what you have done here is more elegant and coded better than what I wrote - there is no doubt in my mind. I was just hacking things together. But there are a few things we may want to consider updating.

One thing that I'd like to work with you on is storing more information. I don't want to get into a position where we have to start a db with 2 columns, update to 3, update to 4, etc. I'd really like to bypass db migration as much as possible, even though I'm confident we'll have to do it at some point. I'd rather just start out with kind of what we're planning on storing and run with that. If you look at the HistoryItem struct in rsq you can see what we're planning. Some of the columns are debatable, others probably won't be, and still others may be used for secret plans - haha.

Another thing we want to ensure is that we are able to have a Database trait where other datasources can be used like sqlite, redis, others. We've also discussed a connection pool using something like r2d2 although we don't have final decisions.

There are also minor things like sqlite feature flags, initialization things like WAL, transactions, naming the file, etc that are so in the weeds right now that it doesn't really matter, but 'the devil is in the details' so we'll need to get there eventually.

Sorry for the wall of text but like I hope you hear my appreciation for what you've done, and I'm also hoping that you'll be willing to work with us. But I'm not the only voice either. I'm tagging @jntrnr @sholderbach @elferherrera @stormasm @kubouch to jump into the discussion.

elferherrera · 2022-04-02T02:53:05Z

Thank a lot for helping with this one. As @fdncred pointed out it is a much desired feature.

To complement the previous comment, would you mind moving your implementation to nushell?
The idea for reedline repo is to stay slim, meaning that it has to contain enough for people to use it to build their own projects with it. The DB history doesn't fit the description. The plan for the history is to be in a new create inside nushell crates and to be an implementation for nushell.

If you are willing to continue working with this feature (which I really hope you do), do you mind moving this to nushell? We can work together to make it work like we want it there.

stormasm · 2022-04-02T03:59:57Z

@phiresky we really look forward to working with you on getting this concept going... And based on your experience clearly you have some expertise in this area that will complement our skills as a team...

My 2 cents following up to @fdncred and @elferherrera is to possibly (prior to moving the code to nushell / crate) to work with us on a nice design....

Moving the code to a crate is important but we don't want you to do more work than is needed...

One of the reasons for doing this is we are not clear yet on exactly where all of the functionality will lie in terms of the crate / nushell / reedline facets...

As a team we have been hashing out / thinking about / discussing this idea for awhile, with no solid design / or code in place yet except for a nice demo that @fdncred wrote which you should review just to get a sense of some of the concepts we are thinking about....

https://github.com/fdncred/rsq
@fdncred mentions this above in his comments...

I could go on with some more things / but for now I will close...

I have your branch installed locally on my machine and will probably have some more questions about your code. I am in the process of reviewing your code now, and understanding more of what you have written...

Again, many thanks ! for all of your hard work so far... As a team, we clearly look forward to working with you...

stormasm · 2022-04-02T04:15:23Z

@phiresky I have your code installed and I am trying to see the history land in the sqlite database...

Here are the steps I am running:

cargo build --features=sqlite
cargo run

Then I type in some commands / however they are landing in the history.txt file...

What other steps do I need to run to see the history land in the db ?

phiresky · 2022-04-02T10:32:36Z

I was just hacking things together

Me too ;)

storing more information

The way I wrote it is there's a single "context" column that can store whatever information wanted - which in the case of nushell would be e.g. timestamp, hostname, pwd, exit status, runtime.
The advantage here is that migrations shouldn't be needed, since you can just add more (optional) items to your HistoryContext struct. The disadvantage is mainly the increased storage.

The DB history doesn't fit the description

The reason I did it like this was that I imagined that reedline could be used as a library for other REPLs, where you still want a history but probably want a different kind of historical context (e.g. for a PostgreSQL history it might be the database the command ran in instead of the PWD).

If you think it makes more sense to integrate it more tightly with nushell (i.e. move the code @elferherrera) then the considerations change and it might make more sense to use separate DB columns instead (since it doesn't need to be as generic) - but it'll probably need some capability of migrations.

@stormasm you need to use cargo run --features=sqlite, run ignores the feature flags given to build. I can recommend SqliteBrowser and sqlite3 history.sqlite3 .dump to look at the db

I did look at the rsq repo a bit before writing my prototype, but then I kind of ignored it - I have a few comments about that as well, I'll join the discord.

sholderbach · 2022-04-02T10:39:17Z

I disagree with @elferherrera on the point about where the development of this should take place. And given that there will be changes necessary to the history interface, I favour this to take place in reedline (with the dependency behind the feature flag)

fdncred · 2022-04-06T12:54:52Z

@phiresky would you be willing to update this based on the conversations we've had here and in discord? The biggest changes are just to go away from the json context and to individual columns. Your other proposals sounds reasonable.

phiresky · 2022-04-06T13:34:42Z

sure, i'll update it / propose a new version

sholderbach · 2022-04-06T13:40:49Z

What is your idea for the API to provide the Session/Execution info before or after executing the command?

phiresky · 2022-04-06T13:46:40Z

In this PR I implemented it with a update function that takes the historical context and returns a new one:

           history
               .update_last_command_context(|mut c| {
                    c.timestamp = SystemTime::now().duration_since(SystemTime::UNIX_EPOCH).unwrap().as_millis() as u64;
                    c
                })

That function is called once immediately before running the command, and then should be run again after the command is done (with exit status etc)

fdncred · 2022-04-15T12:41:44Z

@phiresky how's it going on the new implementation based on our discussions?

phiresky · 2022-04-15T23:32:25Z

Hey! It's going okay. I'll try to get it to a PRable state this weekend. If I can't manage, I'll the current state into a draft pr. I'm not really sure what to do with the existing History trait yet. Maybe I'll remove it and add a HistoryCursor thing that handles the arrow navigation stuff (seems like it could be the same thing for all history databases)

fdncred · 2022-04-16T15:42:52Z

Thanks for the update. Not rushing you, just hadn't heard from you in a while. Good luck!

sholderbach · 2022-06-06T16:29:37Z

Closing as #401 successfully landed

phiresky added 4 commits April 1, 2022 23:00

basic sqlite history

a03d5e9

fix test compilation

d360457

final touches for MVP

b57135d

better documentation

4fabe54

fix for empty history

8bd459a

phiresky mentioned this pull request Apr 18, 2022

SQLite History (third version) #401

Merged

sholderbach marked this pull request as draft May 1, 2022 22:10

sholderbach closed this Jun 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minimal / simple implementation of SQLite-backed history #376

Minimal / simple implementation of SQLite-backed history #376

phiresky commented Apr 1, 2022

fdncred commented Apr 2, 2022

elferherrera commented Apr 2, 2022

stormasm commented Apr 2, 2022

stormasm commented Apr 2, 2022 •

edited

phiresky commented Apr 2, 2022 •

edited

sholderbach commented Apr 2, 2022 •

edited

fdncred commented Apr 6, 2022

phiresky commented Apr 6, 2022

sholderbach commented Apr 6, 2022

phiresky commented Apr 6, 2022

fdncred commented Apr 15, 2022

phiresky commented Apr 15, 2022

fdncred commented Apr 16, 2022

sholderbach commented Jun 6, 2022

Minimal / simple implementation of SQLite-backed history #376

Minimal / simple implementation of SQLite-backed history #376

Conversation

phiresky commented Apr 1, 2022

fdncred commented Apr 2, 2022

elferherrera commented Apr 2, 2022

stormasm commented Apr 2, 2022

stormasm commented Apr 2, 2022 • edited

phiresky commented Apr 2, 2022 • edited

sholderbach commented Apr 2, 2022 • edited

fdncred commented Apr 6, 2022

phiresky commented Apr 6, 2022

sholderbach commented Apr 6, 2022

phiresky commented Apr 6, 2022

fdncred commented Apr 15, 2022

phiresky commented Apr 15, 2022

fdncred commented Apr 16, 2022

sholderbach commented Jun 6, 2022

stormasm commented Apr 2, 2022 •

edited

phiresky commented Apr 2, 2022 •

edited

sholderbach commented Apr 2, 2022 •

edited