Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement feature to search jobs with a sp- and doc-integrated filter. #188

Closed
wants to merge 5 commits into from

Conversation

csadorf
Copy link
Contributor

@csadorf csadorf commented May 19, 2019

Description

This patch enables the search for jobs with a statepoint- and document-
combined filter. To specify whether a key is a statepoint- or a
document-key use prefixes 'sp.' and 'doc.' respectively. No prefix is
equivalent to the 'sp.' prefix.

This patch is backwards compatible with the exception that the index
scheme was slightly modified.

Related to issue glotzerlab/signac-dashboard#25 .

Minimal example

This patch enables queries such as this one:

>>> project.find_jobs({'foo': 0, 'doc.bar': True})
signac.contrib.project.JobsCursor({'project': 'project', 'filter': '{'sp.foo': 0, 'doc.bar': True}'})
>>> # equivalent to:
>>> project.find_jobs({'foo': 0}, {'bar': True})
signac.contrib.project.JobsCursor({'project': 'project', 'filter': '{'sp.foo': 0, 'doc.bar': True}'})

Equivalently on the command line:

$ signac find foo 0 doc.bar true

All keys without a prefix are automatically interpreted as having the sp. prefix.

Motivation and Context

It is difficult to impossible to search jobs by filter and document-filter encoded by a simple string. That makes it impossible for example to search properly using signac-dashboard. This patch enables searching with one combined filter, while searching with two filters is still possible

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change1

1The change breaks (or has the potential to break) existing functionality.

Checklist:

If necessary:

  • I have updated the API documentation as part of the package doc-strings.
  • I have created a separate pull request to update the framework documentation on signac-docs and linked it here.
  • I have updated the changelog.

Tasks

  • Add warning about the use of sp and doc as root-level keys in documents.

@csadorf csadorf requested review from a team and mikemhenry May 19, 2019 02:11
@codecov
Copy link

codecov bot commented May 19, 2019

Codecov Report

Merging #188 into master will decrease coverage by 0.03%.
The diff coverage is 79.62%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #188      +/-   ##
==========================================
- Coverage   65.06%   65.03%   -0.04%     
==========================================
  Files          37       37              
  Lines        5542     5554      +12     
==========================================
+ Hits         3606     3612       +6     
- Misses       1936     1942       +6
Impacted Files Coverage Δ
signac/contrib/linked_view.py 85.71% <100%> (ø) ⬆️
signac/contrib/import_export.py 84.93% <100%> (ø) ⬆️
signac/contrib/schema.py 89.16% <100%> (ø) ⬆️
signac/contrib/filterparse.py 77.64% <76.47%> (+1.45%) ⬆️
signac/contrib/project.py 90.18% <80%> (-0.45%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f8a08c3...8487e60. Read the comment docs.

@codecov
Copy link

codecov bot commented May 19, 2019

Codecov Report

Merging #188 into master will increase coverage by 0.06%.
The diff coverage is 85.52%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #188      +/-   ##
==========================================
+ Coverage   65.03%   65.10%   +0.06%     
==========================================
  Files          37       37              
  Lines        5546     5574      +28     
==========================================
+ Hits         3607     3629      +22     
- Misses       1939     1945       +6     
Impacted Files Coverage Δ
signac/contrib/project.py 90.09% <84.00%> (-0.54%) ⬇️
signac/contrib/filterparse.py 81.44% <84.78%> (+5.25%) ⬆️
signac/contrib/import_export.py 84.93% <100.00%> (ø)
signac/contrib/linked_view.py 85.71% <100.00%> (ø)
signac/contrib/schema.py 89.16% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cbe589d...99a5847. Read the comment docs.

This patch enables the search for jobs with a statepoint- and document-
combined filter. To specify whether a key is a statepoint- or a
document-key use prefixes 'sp.' and 'doc.' respectively. No prefix is
equivalent to the 'sp.' prefix.

This patch is backwards compatible with the exception that the index
scheme was slightly modified.
To canonicalize all job-find filters at the JobsCursor stage.
@vyasr
Copy link
Contributor

vyasr commented May 21, 2019

I updated your note to point to what I think is the relevant signac-dashboard issue, but please correct it if that's not the one you wanted.

@csadorf
Copy link
Contributor Author

csadorf commented May 21, 2019

I updated your note to point to what I think is the relevant signac-dashboard issue, but please correct it if that's not the one you wanted.

Thx, I've edited it slightly to say "related to" instead of "resolves", because dashboard might still need a minor update to work with this properly (at least we would need to test that).

return _with_message(q, file)


def parse_simple(tokens):
Copy link
Contributor

@vyasr vyasr May 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to keep this "private" by renaming to _parse_simple and renaming the current _parse_simple to _parse_simple_single or so. I recognize that you probably did this to support importing into other modules where it's used, but I don't think we need to bloat the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the name change, however, I was actually hoping to expose this function publicly, to make it easier to implement custom user scripts like this:

import click

@click.command()
@click.argument('filter', nargs=-1)
def cli(filter):
    jobs = project.find_jobs(parse_simple(filter))

cli()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support making this a public method / official API - if it's not public then signac-dashboard has to rely on a private method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, in that case I support making it part of the API as well. I still like renaming the _parse_simple method to indicate that it operates on one filter component at a time, but we don't have to change the parse_simple method.

elif '.' in key and key.split('.', 1)[0] in ('sp', 'doc'):
yield key, value
elif key in ('sp', 'doc'):
yield key, value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this branch?

On a related note, should we explicitly disallow sp and doc as statepoint/document keys in addition to dots in keys?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these cases, we need to ensure to not add 'sp' or 'doc' as prefixes, for instance with a filter that is expressed like this: {"sp": {"foo": 0}}.

Concerning your second point, sp and doc are actually valid keys, you would need to "escape" them like this: {"sp.sp.foo": 0}. However, I agree that we should probably discourage that, possibly even with warnings, and make sure to explicitly test the correctness of the behavior in that case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't consider that using sp and doc as root-level keys in a standard query format was an option, but in hindsight that makes sense. I support discouraging the usage of those as keys to future-proof ourselves against future changes where that might cause problems, though.

@@ -1306,7 +1295,7 @@ def _read_cache(self):
return cache

def index(self, formats=None, depth=0,
skip_errors=False, include_job_document=True):
skip_errors=False, include_job_document=True, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need to support arbitrary kwargs here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that is an error. I was trying to introduce backwards compatibility here, and I think we should probably still do that in the long run, but as of right now, the kwargs are unnecessary.

Copy link
Contributor Author

@csadorf csadorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for your review @vyasr!

return _with_message(q, file)


def parse_simple(tokens):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the name change, however, I was actually hoping to expose this function publicly, to make it easier to implement custom user scripts like this:

import click

@click.command()
@click.argument('filter', nargs=-1)
def cli(filter):
    jobs = project.find_jobs(parse_simple(filter))

cli()

elif '.' in key and key.split('.', 1)[0] in ('sp', 'doc'):
yield key, value
elif key in ('sp', 'doc'):
yield key, value
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these cases, we need to ensure to not add 'sp' or 'doc' as prefixes, for instance with a filter that is expressed like this: {"sp": {"foo": 0}}.

Concerning your second point, sp and doc are actually valid keys, you would need to "escape" them like this: {"sp.sp.foo": 0}. However, I agree that we should probably discourage that, possibly even with warnings, and make sure to explicitly test the correctness of the behavior in that case.

@@ -1306,7 +1295,7 @@ def _read_cache(self):
return cache

def index(self, formats=None, depth=0,
skip_errors=False, include_job_document=True):
skip_errors=False, include_job_document=True, **kwargs):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that is an error. I was trying to introduce backwards compatibility here, and I think we should probably still do that in the long run, but as of right now, the kwargs are unnecessary.

@csadorf csadorf removed the request for review from mikemhenry July 5, 2019 14:02
@csadorf csadorf added the enhancement New feature or request label Jul 5, 2019
@csadorf csadorf modified the milestones: v2.0.0, v1.3.0 Jul 5, 2019
@csadorf
Copy link
Contributor Author

csadorf commented Sep 2, 2019

I'm going to close this until we work on 1.3.

@vyasr
Copy link
Contributor

vyasr commented Apr 25, 2020

@vishav1771 please comment on here and let me know if you're OK with switching to work on this for now (regarding my comments from #309).

@vishav1771
Copy link
Contributor

@vyasr That's okay. 👍

@csadorf csadorf modified the milestones: v1.3.0, v1.5.0 Apr 26, 2020
@vishav1771
Copy link
Contributor

This is moved to #332 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants