Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --tag filtering to every command, where applicable #866

Merged
merged 10 commits into from
Mar 9, 2017
Merged

Add --tag filtering to every command, where applicable #866

merged 10 commits into from
Mar 9, 2017

Conversation

middelink
Copy link
Member

Add helper function FindFilteredSnapshots() to handle the bulk of iterating
over the snapshots and perform filtering.

Started using contexts (which means minimal requirement is now Go 1.7)

Fixed long silent runs of find by outputting results as we find them.
(We lost the ability to print how many matches we found however.)

Made mount more interesting, it now applies filters when filling the
snapshots directory.

Closes #863

}
}
return
} else {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if block ends with a return statement, so drop this else and outdent its block

@fd0
Copy link
Member

fd0 commented Mar 7, 2017

Hm. I appreciate the switch to context.Context, but doesn't that cause problems with the fuse library (which still imports the old context library)?

Copy link
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of comments, gotta go now, I'll finish this review later!

err := findInSnapshot(repo, pat, snapshotID)

if err != nil {
ctx, cancel := context.WithCancel(context.Background())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use context.TODO() instead of context.Background(), the latter should only be used in the main() (or equivalent). The former is made specifically to allow refactoring a code base piece by piece and easily find the places where a new context has been introduced that'd need to be derived from a parent context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm. This is main() or equivalent. Unless cobra is going to supply us with a context, this is as high in the tree as we can get. So context.Background() felt like the right choice there.

For me, context.TODO() would appear in say refactoring the backends, when the restic library itself does not plumb the context through, so you would have to make one up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'd rather pass down a context from main(), so we can have the cleanup handlers cancel that context. And I think in main() is the right place for this to do (as we only need to do it once and not in each function). So I still think context.TODO() is the better choice here. But that should not be a merge blocker for now ;)

Copy link
Member Author

@middelink middelink Mar 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I had to add it to the GlobalOptions, but at least that makes it accessable to all cmd's.
...
Oh, I see what you did there. You want to pass a context.WithCancel context down. Nice. Not super useful right now, but nice. Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool :)

if err != nil {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
for sn := range FindFilteredSnapshots(ctx, repo, opts.Host, opts.Tags, opts.Paths, opts.Snapshots) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe bundle the filter criteria for snapshots into a new struct, e.g.:

type SnapshotFilter struct {
    Host string
    Tags []string
    Paths []string
    IDs []restic.ID
}

@@ -55,8 +55,9 @@ func init() {
f.IntVarP(&forgetOptions.Yearly, "keep-yearly", "y", 0, "keep the last `n` yearly snapshots")

f.StringSliceVar(&forgetOptions.KeepTags, "keep-tag", []string{}, "always keep snapshots with this `tag` (can be specified multiple times)")
f.StringVar(&forgetOptions.Hostname, "hostname", "", "only forget snapshots for the given hostname")
f.StringSliceVar(&forgetOptions.Tags, "tag", []string{}, "only forget snapshots with the `tag` (can be specified multiple times)")
f.StringVar(&forgetOptions.Host, "host", "", "only consider snapshots for this `host`")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as in the other PR: we're changing from --hostname to --host, we need to find a workaround for people using --hostname in their scripts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, but... I made the change in the other PR, so if that gets submitted, I can resolve the conflicts (it's going to be more than just this one.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually rebased this tree on fix-841. So it should be fine now.

} else {
Verbosef("would remove snapshot %v\n", id.Str())
// Not cool, we lose the benefit of yielding, e.g. only one snapshot in memory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this comment about? What should it tell future readers of this source code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That using a channel to hand over a Snapshot one by one is completely defeated by putting them all in a list. But if you don't like the remainder, I can certainly make it go the way of the dodo's ^^

@codecov-io
Copy link

codecov-io commented Mar 7, 2017

Codecov Report

Merging #866 into master will increase coverage by 0.12%.
The diff coverage is 57.34%.

@@            Coverage Diff             @@
##           master     #866      +/-   ##
==========================================
+ Coverage   53.36%   53.48%   +0.12%     
==========================================
  Files          92       92              
  Lines        7603     7565      -38     
==========================================
- Hits         4057     4046      -11     
+ Misses       2931     2909      -22     
+ Partials      615      610       -5
Impacted Files Coverage Δ
src/cmds/restic/cmd_check.go 46.51% <ø> (ø)
src/restic/snapshot.go 34.65% <0%> (ø)
src/restic/fuse/snapshot.go 0% <0%> (ø)
src/cmds/restic/cmd_restore.go 60.31% <100%> (+0.64%)
src/cmds/restic/cmd_key.go 60.41% <100%> (ø)
src/cmds/restic/cmd_backup.go 28.17% <100%> (ø)
src/cmds/restic/format.go 20% <20%> (ø)
src/cmds/restic/global.go 23.35% <33.33%> (+0.31%)
src/cmds/restic/cmd_forget.go 36.08% <44.44%> (+9.47%)
src/cmds/restic/find.go 54.9% <54.9%> (ø)
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33c8dd4...8a05de5. Read the comment docs.

@middelink
Copy link
Member Author

Regarding the use of the old context in fuse. I don't think it matters much, they call us with a context and none of the restics fuse code uses it.

Then I found this:
"However, I'm not sure how important it is to transition right away. The two context types interoperate (the interfaces are equivalent). It's only an issue if you've got it in a func/interface signature that you're trying to use as a type, and that's not very common."

So while not having a definite answer, it seems not a super big issue out there.

@fd0
Copy link
Member

fd0 commented Mar 8, 2017

Ah, thanks for looking it up, I forgot that context.Context is an interface. We need to also update the docs (readme and manual).

Copy link
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like what you did here, good work!

Some things still need to be done, from my point of view these are:

  • Add the new requirement for Go 1.7 to the readme and the manual
  • Add --tag to the snapshots command, that seems to be missing
  • Allow passing in a snapshot ID prefix (add a call to restic.FindSnapshot() for each ID

)

// FindFilteredSnapshots yields Snapshots, either given explicitly by `args` or filtered from the list of all snapshots.
func FindFilteredSnapshots(ctx context.Context, repo *repository.Repository, host string, tags []string, paths []string, args []string) <-chan *restic.Snapshot {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest renaming args to snapshotIDs so it becomes clear what should be in there from the function signature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I named then ids as it is not sure yet they are actual snapshotIDs

continue
}
} else {
id, err = restic.ParseID(s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here now requires passing a complete (non abbreviated) snapshot ID, that's not a good user experience:

$ bin/restic snapshots       
ID        Date                 Host        Tags        Directory
----------------------------------------------------------------------
4b122706  2017-03-07 11:11:52  mopped                  /home/fd0/shared/work/restic/restic

$ bin/restic tag --add foo 4b122706
Create exclusive lock for repository
Ignoring "4b122706", it is not a snapshot id
No snapshots were modified

$ bin/restic list snapshots
4b12270680254642a8fe140dba8e7ede140e40af855899768133f4e01c0c875e

$ bin/restic tag --add foo 4b12270680254642a8fe140dba8e7ede140e40af855899768133f4e01c0c875e
Create exclusive lock for repository
Modified tags on 1 snapshots

There's restic.FindSnapshot() which finds the snapshot ID for from a prefix, it was removed in 63db3045c2bb8661a41ebd357984df6794e62d0d.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Args.... You are so right.
I think some rebase mucked up my code. Both here and cmd_snapshot are weird. In fact, cmd_snapshot looks rather pristine. :(
Fixed.

@middelink
Copy link
Member Author

Addressed your latest comments (i think; it getting hard to see...)
-> documentation
-> refresh of cmd_snapshot
-> FindSnapshot in find.go

Copy link
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, looks great!

Still open things:

  • Add proper error handling for json.Marshal()
  • What about my idea of bundling together filter criteria in a separate struct? Future work?
  • The option --group-by-tags is on by default (-> bug in the code, see comment there)
  • The find command does not print the snapshot the results are found in any more, was that intentional? Sample usage for find on a repo with several snapshots:
$ bin/restic find Dockerfile  
/restic/Dockerfile
/restic/Dockerfile
  • The ls command used to return an error when no snapshot IDs are given, now it lists ALL snapshots. Bug or feature?

}
sort.StringSlice(sn.Paths).Sort()
k, _ := json.Marshal(key{Hostname: sn.Hostname, Tags: sn.Tags, Paths: sn.Paths})
Copy link
Member

@fd0 fd0 Mar 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, you're throwing away an error here, that's not a good idea. I agree that there shouldn't be an error most of the time, but if there is one, json.Marshal() may return an empty buffer, so string(buf) will be the empty string, which means that all snapshots are sorted into the same group. That's not a good idea, especially not for the forget command...

In addition, there's a bug in this line: use tags instead of sn.Tags.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, both fixed. Thanks!

@@ -49,6 +51,13 @@ func init() {
globalOptions.password = pw
}

var cancel context.CancelFunc
globalOptions.ctx, cancel = context.WithCancel(context.Background())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@fd0
Copy link
Member

fd0 commented Mar 8, 2017

So, I went through the code and listed all open issues in the comment above. I know that it's hard to do, but it would be great if you could split such huge changes into sensible smaller commits. Reviewing one huge commit is really hard if so much is changed at once... Maybe have a look at how to rework and merge commits with git rebase --interactive?

@middelink
Copy link
Member Author

middelink commented Mar 8, 2017

The find command does not print the snapshot the results are found in any more, was that intentional?
Now I print a header when it finds a match:

$ restic -r rest:http://qnap.**.********.**:8000 find -l bin
Found matching entries in snapshot 0a4ce6e3
Lrwxrwxrwx     0     0      0 2016-12-14 12:54:30 /bin -> usr/bin
dr-xr-xr-x     0     0      0 2016-12-14 12:59:29 /usr/bin
Lrwxrwxrwx     0     0      0 2016-07-03 14:41:12 /usr/lib/debug/bin -> usr/bin
drwxr-xr-x     0     0      0 2016-07-03 14:41:12 /usr/lib/debug/usr/bin
drwxr-xr-x     0     0      0 2016-07-03 14:44:42 /usr/lib64/pm-utils/bin
drwxr-xr-x     0     0      0 2016-11-05 16:38:36 /usr/local/bin
drwxr-xr-x     0     0      0 2016-11-05 16:38:37 /usr/share/locale/bin
Found matching entries in snapshot 0f65f72a
Lrwxrwxrwx     0     0      0 2016-12-14 10:23:13 /bin -> usr/bin
drwxrwxr-x  1000  1000      0 2016-10-13 20:53:56 /home/middelin/.atom/packages/build/node_modules/cross-spawn-async/node_modules/which/bin
drwxrwxr-x  1000  1000      0 2016-10-13 20:53:56 /home/middelin/.atom/packages/build/node_modules/cson-parser/node_modules/coffee-script/bin
drwxrwxr-x  1000  1000      0 2016-10-13 20:53:56 /home/middelin/.atom/packages/build/node_modules/getmac/bin
drwxrwxr-x  1000  1000      0 2016-10-13 20:53:56 /home/middelin/.atom/packages/build/node_modules/js-yaml/bin
drwxrwxr-x  1000  1000      0 2016-10-13 20:53:56 /home/middelin/.atom/packages/build/node_modules/js-yaml/node_modules/esprima/bin
...

The ls command used to return an error when no snapshot IDs are given, now it lists ALL snapshots

Not sure. I can make a case for either ^^

  1. most other commands play smart and iterate. (snapshots, forget, tag)
  2. if we want to allow for filtering, we need to allow for no snapshot ids. Which, granted, does not mean that no filtering means all snapshots.

Alright, lets not rock the boat too much. I put in a check for it.

What about my idea of bundling together filter criteria in a separate struct? Future work?

Uhm. I don't like it very much. First it will make the call to the function not at all easier to understand:
FindFilteredSnapshots(ctx, repo, opts.Host, opts.Tags, opts.Paths, args) vs FindFilteredSnapshots(ctx, repo, restic.Filter{Host:opts.Host, Tags:opts.Tags, Paths:opts.Paths, SnapshotIDs: args})
Not much gain. Granted, you get some idea what the arguments mean, but if that is the reason you should do it with every function in the API... And worse, if we decide to add a new filter in some future world, one would change the function signature and the compiler would find all call paths for you. Now you need to do greps and whatnot to find them all.
(was not able to find any preference indication when Googling though; so the internet is for once silent)

I know that it's hard to do, but it would be great if you could split such huge changes into sensible smaller commits

Ouch. So one PR with a commit for say find.go and then all the cmd_xxx's separate? Aside some ordering issues where I reuse some functions from other commands, I think it should be do-able. Or did you mean a PR per commit? If you want smaller reviews, I think you meant that one...

Pauline Middelink added 4 commits March 8, 2017 19:55
Add `tags` argument to `FindLatestSnapshot()`
Set up a cancelble context in global options, hook it into the ctrl-C handler
for proper cancel propegation.

Bump up minimal requirement for Go to version 1.7 in documentation
and test-build files.
@fd0
Copy link
Member

fd0 commented Mar 8, 2017

What about my idea of bundling together filter criteria in a separate struct? Future work?

Uhm. I don't like it very much. First it will make the call to the function not at all easier to understand:
FindFilteredSnapshots(ctx, repo, opts.Host, opts.Tags, opts.Paths, args) vs FindFilteredSnapshots(ctx, repo, restic.Filter{Host:opts.Host, Tags:opts.Tags, Paths:opts.Paths, SnapshotIDs: args})
Not much gain.

Hm, I think it's way easier to read:

FindFilteredSnapshots(ctx, repo, opts.Host, opts.Tags, opts.Paths, args)

versus:

f:= restic.Filter{Host:opts.Host, Tags:opts.Tags, Paths:opts.Paths, SnapshotIDs: args}
FindFilteredSnapshots(ctx, repo, f)

Granted, you get some idea what the arguments mean, but if that is the reason you should do it with every function in the API...

And for this reason I've for example used this pattern for the snapshot filter for forget:
https://github.com/restic/restic/blob/master/src/restic/snapshot_filter.go#L30-L35
https://github.com/restic/restic/blob/master/src/restic/snapshot_filter.go#L38

I don't like functions with long argument lists, I'd rather prefer to group them into logical chunks via custom types.

And worse, if we decide to add a new filter in some future world, one would change the function signature and the compiler would find all call paths for you.

That is indeed true, but:

Now you need to do greps and whatnot to find them all.

That's not necessary, there's guru, from vim-go you can just use :GoCallers to get all callsites for a function under the cursor.

@fd0
Copy link
Member

fd0 commented Mar 8, 2017

But we can also leave the filter thing as an optimization for the future.

Pauline Middelink added 5 commits March 8, 2017 20:19
This helper function takes a set of filters and/or a list of snapshots
from the commandline. It returns a channel of *Snapshot.

When snapshot ids are given, they are checked for validity and their
corresponding Snapshots returned. The snapshot id "latest" is handled
special to return either the last snapshot (no filters) or the last
snapshot matching the filters.

When no arguments are given, the filters are applied over all available
snapshots and these are returned.
Implement filtering by using `FindFilteredSnapshots()` to iterate over the snapshots

Refactor cmd_snapshots' `PrintSnapshots()` so its pretty printing can be used from
both `forget` and `snapshots`.

Use contexts.
Implement filtering by using `FindFilteredSnapshots()` to iterate over the snapshots

Refactor cmd_ls' `PrintNode()` into format.go, reuse its pretty printing in both `find`
and `ls` commands.

Use contexts.
Factor out and reuse `rebuildIndex()` in cmd_rebuild_index and cmd_prune.

Use contexts.
@fd0 fd0 merged commit 8a05de5 into restic:master Mar 9, 2017
fd0 added a commit that referenced this pull request Mar 9, 2017
Add --tag filtering to every command, where applicable
@fd0
Copy link
Member

fd0 commented Mar 9, 2017

Thanks, merged!

@middelink middelink deleted the widespread-tags branch March 10, 2017 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants