Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LIKE, IN, BETWEEN operators -- filter extension #178

Merged
merged 7 commits into from
Jan 5, 2024

Conversation

jamesfisher-gis
Copy link
Collaborator

@jamesfisher-gis jamesfisher-gis commented Dec 15, 2023

Related Issue(s):

Description:
Adds the CQL2 LIKE search operator utilizing the Elasticsearch wildcard query. The code supports CQL wildcard characters ('_' and '%'), Elasticsearch wildcard characters ('?' and '*'), and escape characters in queries.

PR Checklist:

  • [x ] Code is formatted and linted (run pre-commit run --all-files)
  • [x ] Tests pass (run make test)
  • [x ] Documentation has been updated to reflect changes, if applicable
  • Changes are added to the changelog

Copy link
Member

@gadomski gadomski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me, but I'd like to get @philvarner or @jonhealy1 's eyes on it as well as I don't have great context on ES.

@@ -24,6 +24,11 @@
settings = ElasticsearchSettings()
session = Session.create_from_settings(settings)

filter_extension = FilterExtension(client=EsAsyncBaseFiltersClient())
filter_extension.conformance_classes.append(
"http://www.opengis.net/spec/cql2/1.0/req/advanced-comparison-operators"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't be advertised unless IN and BETWEEN are also implemented.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I have implemented IN and BETWEEN in the latest commit.

The conformance class endpoint above aligns with what STAC Browser expects. The Filter extension README has the conformance class endpoint as http://www.opengis.net/spec/cql2/1.0/conf/advanced-comparison-operators. Which should be used? I get a 404 response from both...

I followed this example which has the BETWEEN bounds being passed as an array.

{
  "filter-lang": "cql2-json",
  "filter": {
    "op": "between",
    "args": [
      { "property": "eo:cloud_cover" },
      [ 0, 50 ]
    ]
  }
}

This required adjustments to the Clause filter class to allow lists and some typing checks in IN and BETWEEN

class Clause(BaseModel):
    """Filter extension clause."""

    op: Union[LogicalOp, ComparisonOp, AdvancedComparisonOp, SpatialIntersectsOp]
    args: List[Union[Arg, List[Arg]]]
elif self.op == AdvancedComparisonOp.between:
            if not isinstance(self.args[1], List):
                raise RuntimeError(f"Arg {self.args[1]} is not a list")
            return {
                "range": {
                    to_es(self.args[0]): {
                        "gte": to_es(self.args[1][0]),
                        "lte": to_es(self.args[1][1]),
                    }
                }
            }
elif self.op == AdvancedComparisonOp._in:
    if not isinstance(self.args[1], List):
        raise RuntimeError(f"Arg {self.args[1]} is not a list")
    return {
        "terms": {to_es(self.args[0]): [to_es(arg) for arg in self.args[1]]}
    }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I have implemented IN and BETWEEN in the latest commit.

nice, will review.

The conformance class endpoint above aligns with what STAC Browser expects. The Filter extension README has the conformance class endpoint as http://www.opengis.net/spec/cql2/1.0/conf/advanced-comparison-operators. Which should be used? I get a 404 response from both...

Ah, that seems to be a bug in STAC Browser - they should be the urls with conf in them, not req. They aren't resolvable as a URL -- many of the OGC URL-like conformance classes aren't.

I followed this example which has the BETWEEN bounds being passed as an array.

This has been changed in newer versions of CQL2 (or v1.0.0-beta.5 had bad examples?). The current version of Filter Extension is here, specifically BETWEEN: https://github.com/stac-api-extensions/filter/blob/main/README.md#example-11-using-between
I think you'll probably need to make some changes to work with this instead of the beta.5 version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Thank you, Phil.

Updated BETWEEN syntax and the conformance class endpoint in e7330dc

to_es(self.args[0]): {
"value": cql2_like_to_es(str(to_es(self.args[1]))),
"boost": 1.0,
"case_insensitive": "true",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be case sensitive

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that case_insensitive should be set to false?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, since AFAIK LIKE is a case sensitive operation (e.g., b% matches ba but not Ba

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Updated in e7330dc

@jamesfisher-gis jamesfisher-gis changed the title LIKE operator -- filter extension LIKE, IN, BETWEEN operators -- filter extension Dec 19, 2023
Copy link
Collaborator

@jonhealy1 jonhealy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @jamesfisher-gis

@jonhealy1 jonhealy1 merged commit ba11e0c into stac-utils:main Jan 5, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants