Filter support on the command line #19

Open
jrs65 opened this issue May 8, 2017 · 4 comments
Labels
client Issues relating to the alpenhorn client enhancement

Comments

@jrs65
Contributor

jrs65 commented May 8, 2017

At the moment the command line tools (at best) have an --acq option for restricting operations like sync and clean to subsets of the archive. It would be great to replace this with a more generic filter option. This should probably be kept very simple: it's not meant to be a reimplementation of SQL, just enough to express common operations a little more conveniently.

I haven't exactly figured out the syntax of how this would work. Something like this:

--filter 'acq=2017??01*; file=0000.h5,0001.h5'  # Filter on the acq (first day of month in 2017, either of two file names)
--filter 'acqtype=corr; filetype=log'  # Filter all log files in corr acquisitions

Thoughts:

  • Filtering allowed on acq, file, acqtype and filetype.
  • Basic wildcards should be allowed (use the SQL LIKE operation).
  • Clauses of different types are implicitly ANDed (e.g. acq= AND file=).
  • Do we want to allow OR between clauses? That may be getting too complex.
  • However, multiple alternatives within a clause should be allowed (e.g. acq=acq1,acq2 means acq=acq1 OR acq=acq2).
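
As a rough illustration of the points above, here is a minimal sketch of how such a filter string might be parsed and turned into a parameterised WHERE fragment. Nothing here is alpenhorn code: the function names (parse_filter, glob_to_like, to_where) are hypothetical, and it assumes for simplicity that each filter field maps to a column of the same name.

```python
# Hypothetical sketch, not alpenhorn's implementation.
ALLOWED_FIELDS = {"acq", "file", "acqtype", "filetype"}


def parse_filter(spec):
    """Parse 'field=alt1,alt2; other=alt3' into {field: [alternatives]}."""
    clauses = {}
    for clause in spec.split(";"):
        clause = clause.strip()
        if not clause:
            continue  # tolerate a trailing ';'
        field, sep, value = clause.partition("=")
        field = field.strip()
        if not sep or field not in ALLOWED_FIELDS:
            raise ValueError(f"bad filter clause: {clause!r}")
        alternatives = [v.strip() for v in value.split(",") if v.strip()]
        if not alternatives:
            raise ValueError(f"no value given in clause: {clause!r}")
        clauses.setdefault(field, []).extend(alternatives)
    return clauses


def glob_to_like(pattern):
    """Map shell wildcards onto SQL LIKE wildcards ('*' -> '%', '?' -> '_')."""
    # Escape LIKE's own metacharacters first so literal '%' and '_' survive.
    escaped = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped.replace("*", "%").replace("?", "_")


def to_where(clauses):
    """OR the alternatives within a clause, AND the clauses together."""
    parts, params = [], []
    for field, alternatives in clauses.items():
        # Assumption: the column is named after the filter field.
        ors = " OR ".join(f"{field} LIKE ? ESCAPE '\\'" for _ in alternatives)
        parts.append(f"({ors})")
        params.extend(glob_to_like(alt) for alt in alternatives)
    return " AND ".join(parts), params


clauses = parse_filter("acq=2017??01*; file=0000.h5,0001.h5")
where, params = to_where(clauses)
```

Keeping the patterns as bound parameters rather than pasting them into the SQL text avoids quoting problems with unusual file or directory names.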
@jrs65
Contributor Author

jrs65 commented May 8, 2017

@cubranic @kiyo-masui @ahincks any suggestions on this one?

@cubranic
Contributor

cubranic commented May 8, 2017 via email

@ahincks

ahincks commented May 8, 2017

I like the idea. I have no strong opinion on whether to use glob, but I would have reservations if it isn't fully translated/supported by the databases peewee works with.

I think the implicit AND/OR as Richard has it looks good.

Another approach would be for the argument of the --filter option to literally be an SQL expression that can be plugged right into a WHERE clause. Then alpenhorn doesn't need to do any work. But this may be me thinking too much in direct SQL mode rather than in database ORM mode ...
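
For comparison, the raw-SQL approach could look roughly like the sketch below. It uses the standard-library sqlite3 module and a made-up two-table schema purely so that it runs on its own; the table and column names, make_demo_db and select_files are all assumptions, not alpenhorn's actual schema or API.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory archive; the schema is invented for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE acq (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE file (id INTEGER PRIMARY KEY, acq_id INTEGER, name TEXT);
        INSERT INTO acq VALUES (1, '20170101T000000Z_demo_corr'),
                               (2, '20170215T000000Z_demo_corr');
        INSERT INTO file VALUES (1, 1, '0000.h5'), (2, 1, '0002.h5'),
                                (3, 2, '0000.h5');
        """
    )
    return conn


def select_files(conn, where_sql):
    """Return (acq name, file name) pairs matching a user-supplied WHERE body."""
    # CAUTION: where_sql is pasted in verbatim, so the user can run arbitrary
    # SQL and must know the table and column names.
    query = (
        "SELECT acq.name, file.name FROM file "
        "JOIN acq ON file.acq_id = acq.id "
        f"WHERE {where_sql} ORDER BY acq.name, file.name"
    )
    return conn.execute(query).fetchall()


rows = select_files(
    make_demo_db(),
    "acq.name LIKE '2017__01%' AND file.name IN ('0000.h5', '0001.h5')",
)
```

The tool itself does almost no work here, but the schema becomes part of the command line interface, and the expression cannot be validated before it reaches the database.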

@jrs65
Contributor Author

jrs65 commented May 9, 2017

Great. Thanks for the feedback!

Adam, I think Davor is suggesting we use the globre package that we already use to translate extended glob patterns into regular expressions, and then we use peewee's native regular expression support to do the query. That seems pretty reasonable to me, and I think everything should be fully translated.
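
A minimal sketch of the glob-to-regex idea follows. It deliberately does not use globre or peewee: fnmatch.translate from the standard library stands in for the glob translation, and sqlite3 with a user-defined REGEXP function stands in for the database side. The acq table and the find_acqs helper are hypothetical.

```python
import fnmatch
import re
import sqlite3


def regexp(pattern, value):
    """Called by SQLite to evaluate `value REGEXP pattern`."""
    return value is not None and re.match(pattern, value) is not None


def make_demo_db():
    """In-memory table of acquisition names; invented for illustration."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no built-in REGEXP implementation, so register one.
    conn.create_function("REGEXP", 2, regexp)
    conn.execute("CREATE TABLE acq (name TEXT)")
    conn.executemany(
        "INSERT INTO acq VALUES (?)",
        [
            ("20170101T000000Z_demo_corr",),
            ("20170201T000000Z_demo_corr",),
            ("20170315T000000Z_demo_corr",),
        ],
    )
    return conn


def find_acqs(conn, glob_pattern):
    """Translate a glob to a regex and let the database do the matching."""
    regex = fnmatch.translate(glob_pattern)  # stand-in for the globre step
    rows = conn.execute(
        "SELECT name FROM acq WHERE name REGEXP ? ORDER BY name", (regex,)
    )
    return [row[0] for row in rows]


matches = find_acqs(make_demo_db(), "2017??01*")
```

Whether a given backend evaluates the regular expression natively or through a registered function like the one above is exactly the translation concern raised earlier in the thread.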

I think you're always thinking too much in direct SQL mode! :)

I think another option for this would be to just break it out into standard command line arguments, e.g.

--acq="2017??01*" --file=0000.h5 --file=0001.h5  # Same as the first --filter example above
--acqtype=corr --filetype=log  # Filter all log files in corr acquisitions

I guess there are two advantages to doing this:

  • It minimises the amount of parsing we need to do ourselves, and will follow the usual conventions for how it's done (with regard to spaces and special characters in file/dir names).
  • It's backwards compatible with the current --acq argument.

Disadvantages:

  • It's less compact.
  • Requires adding more boilerplate code to each command that needs the filtering (i.e. adding the four arguments for each command, rather than one)
  • Seems less elegant somehow.
  • Less extensible, because the arguments are hardcoded. I could see in the long term having the ability to add other ways of filtering in.
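
To make the trade-off concrete, here is a small sketch of the separate-arguments variant in which a shared helper adds the four options to each command, which keeps the boilerplate to one call per command. It uses argparse only so that the example is self-contained; the client's real argument-parsing library may differ, and add_filter_arguments and collect_filters are hypothetical names.

```python
import argparse

# Adding a new filter type means extending this tuple, which is the
# "hardcoded" limitation noted in the list above.
FILTER_FIELDS = ("acq", "file", "acqtype", "filetype")


def add_filter_arguments(parser):
    """Attach one repeatable option per filter field to a command's parser."""
    for field in FILTER_FIELDS:
        parser.add_argument(
            f"--{field}",
            action="append",  # repeating the option gives OR alternatives
            metavar="PATTERN",
            help=f"restrict to matching {field} (may be repeated)",
        )


def collect_filters(args):
    """Gather the options that were given into {field: [patterns]}."""
    filters = {}
    for field in FILTER_FIELDS:
        values = getattr(args, field)
        if values:
            filters[field] = values
    return filters


# allow_abbrev=False keeps '--acq' from ever being read as a prefix of '--acqtype'.
parser = argparse.ArgumentParser(prog="sync-sketch", allow_abbrev=False)
add_filter_arguments(parser)
args = parser.parse_args(["--acq=2017??01*", "--file=0000.h5", "--file=0001.h5"])
filters = collect_filters(args)
```

The resulting dictionary has the same shape a parsed --filter string would produce, so either front end could feed the same query-building code.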

ketiltrout added the client label (Issues relating to the alpenhorn client) Aug 16, 2023
4 participants