wildcard search in query parameter returning odd results #134

Closed
Tracked by #27
tloubrieu-jpl opened this issue May 16, 2022 · 9 comments
Labels: B13.0, bug (Something isn't working), s.high (High severity)

Comments

@tloubrieu-jpl
Member

🐛 Describe the bug

I am not sure how wildcards are implemented in the API, but I am submitting a bug because the result of this request does not make sense to me:

curl --location --request GET 'http://localhost:8080/products?limit=10&q=title eq "InSight RAD*"&fields=title' \
--header 'Accept: text/csv'

Returns:

title
"InSight RAD Derived Data Collection"
"InSight RAD Raw Data Collection"
"InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00014_20181211_073042"
"InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00122_20190401_123217"
"InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00213_20190703_052044"
"InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00305_20191006_053040"
"InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00390_20200101_120222"
"InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00478_20200401_121608"
"InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00567_20200701_125545"
"InSight HP3 Radiometer Experiment Raw Product:hp3_rad_raw_00004_20181130_085325"

🕵️ Expected behavior

Return only products whose title starts with "InSight RAD".

📚 Version of Software Used

version 1.0


@tloubrieu-jpl tloubrieu-jpl added bug Something isn't working needs:triage labels May 16, 2022
@tloubrieu-jpl tloubrieu-jpl added B13.0 s.high High severity labels May 16, 2022
@jordanpadams jordanpadams changed the title wildcard in q parameters do not work ? wildcard in q parameter does not work correctly May 17, 2022
@jordanpadams jordanpadams changed the title wildcard in q parameter does not work correctly wildcard search in query parameter returning odd results May 17, 2022
@al-niessner
Contributor

@tloubrieu-jpl

From a change that @jordanpadams may have approved, it seems that eq has been changed to like.

@jordanpadams
Member

@al-niessner @tloubrieu-jpl hmmm. apologies there. I think I may have described what I wanted incorrectly. 😥

The way I imagine "eq" functionality to work is it matches "eq" as you would think (e.g. ==) but also supports wildcards. So not quite "like" but does still evaluate wildcards. For instance:

title eq "InSight RAD*":

title matches?
InSight RAD 123 X
InSight RAD X
foobar InSight RAD X ???
insight rad 123 X
InSight foobar RAD

I don't know if that made any sense... We can discuss on Tuesday if it did not.

@al-niessner
Contributor

I vaguely remember the conversation, and it was implemented in PR 15.

You then had @tdddblog modify it on PR 72 and gave the "like" a thumbs up -- clearly agreeing that it was the best choice. I included the link so you could easily remind yourself why you thought that was better than eq to avoid re-introducing "like" the week after the changes are made and consequences are felt.

Warning: to undo like, we will have to reinstate much of what @tdddblog undid because, as the PR states, getting what you asked for back then required the quoting to be removed. We will probably get the spacing problems back too.

If you want to revert back, then plan on it being a bit of work -- weeks. Ultimately, antlr is limiting you to choices that do not satisfy your needs.

@tloubrieu-jpl
Member Author

tloubrieu-jpl commented Jun 14, 2022

  • For a wildcard search, the right request should be:
curl --location --request GET 'http://localhost:8080/products?limit=10&q=title like "InSight RAD*"&fields=title' \
--header 'Accept: text/csv'
  • like needs to be added to the user guide

  • The following request:

 curl --location --request GET 'http://localhost:8080/products?limit=10&q=title eq "InSight RAD*"&fields=title' \
 --header 'Accept: text/csv' 

needs to be fixed to return no results (since there is no such title as "InSight RAD*")
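
The distinction drawn above between eq and like can be sketched as follows. This is only an illustration in Python, not the registry-api code: the function names are hypothetical, and shell-style matching from the standard fnmatch module stands in for whatever wildcard handling like ends up using.

```python
from fnmatch import fnmatchcase


def eq(value: str, operand: str) -> bool:
    # Hypothetical "eq": plain string equality, so "*" is an ordinary character.
    return value == operand


def like(value: str, pattern: str) -> bool:
    # Hypothetical "like": case-sensitive match where "*" matches any run of characters.
    return fnmatchcase(value, pattern)


titles = [
    "InSight RAD Derived Data Collection",
    "InSight RAD Raw Data Collection",
    "InSight HP3 Radiometer Experiment Raw Product:hp3_rad_raw_00004_20181130_085325",
]

eq_hits = [t for t in titles if eq(t, "InSight RAD*")]
like_hits = [t for t in titles if like(t, "InSight RAD*")]
```

With these definitions eq_hits is empty, since no title is literally "InSight RAD*", while like_hits keeps only the two titles that start with "InSight RAD".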

@al-niessner
Contributor

al-niessner commented Jul 6, 2022

@jordanpadams and @tloubrieu-jpl this is in response to #134 and #144.

OpenSearch and Elasticsearch are both being too helpful, in the same way. Given the string "Insight RAD*" in our query, it would have to be turned into "Insight RAD"* in OpenSearch to do what you want. In our query, title like "Insight RAD*" means insight OR rad* in the field title. That is why you are getting the unwanted responses: they all contain rad. I turned off the fuzziness, but it hardly matters, and I cannot make it case sensitive.

If we want to be smart, we can leverage the search syntax offered by opensearch and elasticsearch. They want to use "" for binding phrases together so maybe we should change our "" to ''.
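
A rough way to see the insight OR rad* behaviour described above is to mimic it in a few lines of Python. This is a toy model, not how OpenSearch analyses text: the tokenizer (lowercase, split on non-word characters) and the helper names are assumptions made for illustration.

```python
import re


def tokens(text: str) -> list:
    # Crude stand-in for an analyzer: lowercase, then split on non-word characters.
    return re.findall(r"\w+", text.lower())


def term_matches(term: str, token: str) -> bool:
    # A trailing "*" makes the term a prefix match; otherwise require equality.
    if term.endswith("*"):
        return token.startswith(term[:-1])
    return token == term


def matches(query: str, title: str, operator: str = "OR") -> bool:
    # Split the query on whitespace and combine per-term results with OR or AND.
    combine = any if operator == "OR" else all
    title_tokens = tokens(title)
    return combine(
        any(term_matches(term, tok) for tok in title_tokens)
        for term in query.lower().split()
    )


unwanted = "InSight HP3 Radiometer Experiment Derived Product:hp3_rad_der_00014_20181211_073042"
wanted = "InSight RAD Derived Data Collection"

or_result = matches("Insight RAD*", unwanted)       # insight OR rad*
prefix_result = unwanted.startswith("InSight RAD")  # what the report expected
```

In this model or_result is True even though prefix_result is False, because the unwanted title contains the token insight (and radiometer also satisfies rad*).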

--> addendum

To get the search results you want without using quotes and allowing opensearch to help:

$ curl --location --request GET 'http://localhost:8080/gid/any?limit=10&fields=title&q=title%20like%20%22Insight%2BRAD%22' --header 'Accept: text/csv'
title
"InSight RAD Calibrated Data Collection"
"InSight RAD Derived Data Collection"
"InSight RAD Raw Data Collection"

To unwrap the URL encoding: q=title like "Insight+RAD"

Here the + acts as an AND operator and I have fuzziness turned off.
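
For reference, the percent-encoded q value used above can be reproduced with Python's standard library. This sketch only builds the URL string and sends no request; the host, path, and parameters are copied from the curl command.

```python
from urllib.parse import quote

query = 'title like "Insight+RAD"'

# safe="" percent-encodes every reserved character, including "+" (%2B),
# which could otherwise be read as a space in a query string.
encoded = quote(query, safe="")

url = f"http://localhost:8080/gid/any?limit=10&fields=title&q={encoded}"
print(url)
```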

@jordanpadams
Member

thanks @al-niessner . to clarify, when you say "fuzziness" turned off, is that part of your query to OpenSearch telling it to disable fuzzy search? Or is it a setting in OpenSearch?

I think either proposed solution works for me.

@al-niessner
Contributor

Sorry, I should have been clear. Part of the help that OpenSearch/Elasticsearch gives is that string queries allow for fuzzy results, so that if you ask for wnid it will still return wind and winds. They will be ordered that way too, because wind is a transposition and winds is an edit (added s). They then give a score, say 0 for wind, because transpositions are 0 edits, and 1 for winds, which is 1 edit. The actual scores are more complex, but this is a good enough approximation. Because they return a list of scored results, it is a fuzzy search with fuzzy results.

What I mean by turning fuzziness off is that, in the Java code, I set an option to make transpositions count as 1 edit and to allow only 0-edit results, so that the like operator behaves as you want.

The string queries are not the same as matching or term searches.
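
To make the idea of counting edits concrete, here is a small Python sketch of a textbook edit distance (optimal string alignment) with transpositions switchable. It is not OpenSearch's scoring and not the registry-api Java code, and its counts differ from the simplified scores above: it charges 1 edit for a swap of adjacent characters when transpositions are enabled and 2 when they are not. The function names and the max_edits parameter are hypothetical.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    # d[i][j] holds the distance between the first i chars of a and first j of b.
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Optionally treat a swap of two adjacent characters as one edit.
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_filter(query, candidates, max_edits, transpositions=True):
    # Keep candidates within max_edits of the query, closest first.
    scored = [(edit_distance(query, c, transpositions), c) for c in candidates]
    return [c for dist, c in sorted(scored) if dist <= max_edits]


words = ["wind", "winds", "wnid"]
loose = fuzzy_filter("wnid", words, max_edits=2)
strict = fuzzy_filter("wnid", words, max_edits=0, transpositions=False)
```

Allowing up to 2 edits returns all three words, closest first, while allowing 0 edits leaves only the exact term, which is the effect described above as turning fuzziness off.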

@tloubrieu-jpl
Member Author

This ticket can be closed since the eq operator works again.

However, the like operator still does not work; ticket #170 has been created to handle that.

@tloubrieu-jpl
Member Author

@gxtchen this should be tested from:

  • the latest stable version of the registry-api (1.0.2) which can be deployed using the latest stable registry docker compose (1.0.2)
  • the latest development version should also be tested but the CICD did not work on registry-api
