Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] Command to retro-actively apply history_filter on history #1742

Closed
xvello opened this issue Feb 20, 2024 · 1 comment · Fixed by #1743
Closed

[FR] Command to retro-actively apply history_filter on history #1742

xvello opened this issue Feb 20, 2024 · 1 comment · Fixed by #1743

Comments

@xvello
Copy link
Contributor

xvello commented Feb 20, 2024

As discussed offline, i have a lot of kubectl delete lines in my history after operations on the dev cluster. I really want to avoid executing these on a prod cluster, so I'd like to make sure they are never shown in the search.

I set history_filter to avoid capturing more of these, but I'd really like to remove them from the DB. I am aware that one can remove commands with ad-hoc calls to atuin search --delete, but I feel like a command to apply the filters would give me the incentive to keep adding to history_filter and avoid a repeat.

I'm happy to implement this once we agree on the design for it. My suggestion:

  • a new atuin history prune subcommand will scan the whole history, match history_filter on it, and issue the appropriate deletions
  • the subcommand can be extended later to pass arbitrary patterns, but by default will use the values from the config
  • as a destructive operation, we'll want to allow to preview the changes before they are executed (to make sure a buggy patterns matches everything). We could either:
    • add a unix-style -n / --dry-run argument and default to delete (easier)
    • list the commands and ask for confirmation before deleting (and add a -f / --force argument to skip the confirmation)
  • the most naive implementation would reuse History::should_save, so that all filtering config is applied without increasing maintenance cost
    • as a once-a-quarter operation, I think it's fine to have it be CPU-intensive and slow. We could try delegating the filtering to the Database but it would add more complexity, and possibly bugs if the regexp behaviours don't align perfectly
    • we need to inspect every single history record. Calling Database::list with unique = true would give us this, at the cost of unbounded memory usage. It might be good enough for a first iteration
    • to keep the memory usage bounded, we'd need a new access method from Database that would allow to iterate or paginate. Alternatively, we can use Database::range on a user-provided timestamp range (default on last 7 days maybe?)

What are your thoughts @ellie?

@ellie
Copy link
Member

ellie commented Feb 20, 2024

Hey! Thanks for writing this one up 😊

a new atuin history prune subcommand will scan the whole history, match history_filter on it, and issue the appropriate deletions

I think that sounds great to me, both in description + name

the subcommand can be extended later to pass arbitrary patterns, but by default will use the values from the config

I'm not sure that would be the best idea, I'd rather keep deleting stuff elsewhere. Prune makes sense for enforcing config, and it would be great to keep the "edit your config" incentive

as a destructive operation, we'll want to allow to preview the changes before they are executed...

I think the unix style dry run makes sense, and would be familiar to users. Otherwise, I think stating how many records will be deleted + asking for confirmation would work, without requiring confirmation + listing them.

Kinda like zsh with an rm *

the most naive implementation...

Reusing should_save would be best for now. Unbounded memory use is generally not a good thing, though in the case of Atuin it's a premature optimisation to be concerned with at this point. Even the largest databases are only a few 10s of mb on disk, and less when loaded into memory. It's definitely something to sort in the future, but not something to be overly concerned with right now.

The only times when we actively need to consider this are for UI/search cases, but for maintenance operations I think this is totally reasonable.

Otherwise - pagination is something I could do with implementing anyway, for some stats/UI use cases. So don't worry about it here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants