Add suggestions with fuzzy text search when no relationship is found #2583

Merged (8 commits) on Dec 15, 2022

@laurenceisla (Member) commented Dec 10, 2022

Closes #2569

Contemplating two options:

  • Suggest either the parent or child one at a time. If no parent is found, suggest the closest match and stop there; otherwise, continue looking for the child and suggest its closest match.
  • Suggest both parent and child, if neither is found then suggest the closest relationship between them.

Done so far:

  • Simple suggestions for parents or children (one at a time)
  • Doctests
  • Spectests

src/PostgREST/Error.hs (outdated, resolved)
@laurenceisla laurenceisla marked this pull request as ready for review December 12, 2022 23:01
@laurenceisla laurenceisla marked this pull request as draft December 12, 2022 23:07
    else (<> "' instead of '" <> parent <> "'.") <$> suggestParent
  where
    findParent = HM.lookup (QualifiedIdentifier schema parent, schema) allRels
    fuzzySetOfParents = Fuzzy.fromList [qiName (fst p) | p <- HM.keys allRels, snd p == schema]

I wonder if this is expensive if allRels is big, in which case perhaps the hint could be disabled by looking at the map length?

Maybe this could be tested with the apflora schema, by doing time curl .. on a request with bad names.
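
A minimal sketch of the size guard suggested above, assuming a plain Data.Map in place of the real allRels HashMap. The names Rels, maxRelsForHint, and hintIfCheap are hypothetical, and the cutoff value is arbitrary; the thread does not propose one.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for the relationships map in the schema cache:
-- key = (parent table, schema), value = names of the related tables.
type Rels = M.Map (String, String) [String]

-- Hypothetical cutoff (assumption, not a value from the PR).
maxRelsForHint :: Int
maxRelsForHint = 1000

-- Run the possibly expensive suggestion function only when the map is
-- small enough; otherwise give no hint at all.
hintIfCheap :: (Rels -> Maybe String) -> Rels -> Maybe String
hintIfCheap suggest rels
  | M.size rels <= maxRelsForHint = suggest rels
  | otherwise                     = Nothing
```

With a guard like this, the error response would still be returned for very large schema caches, only without the "hint" field filled in by the fuzzy search.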

@laurenceisla (Member, Author) commented Dec 14, 2022

In apflora, allRels has a size of 305. These are the times that I get:

  1. Relationship is found (no fuzzy search):

    $ curl -w "%{time_total}s" -s "http://localhost:3000/ziel?select=ap(*)"

    Response:

    []

    Times:

    0.002485s
    0.002125s
    
  2. Relationship not found (fuzzy search for parent)

    $ curl -w "%{time_total}s" -s "http://localhost:3000/ziels?select=ap(*)"

    Response:

    {
        "code": "PGRST200",
        "details": "Searched for a foreign key relationship between 'ziels' and 'ap' in the schema 'apflora', but no matches were found.",
        "hint": "Perhaps you meant 'ziel' instead of 'ziels'.",
        "message": "Could not find a relationship between 'ziels' and 'ap' in the schema cache"
    }

    Times:

    0.072776s
    0.056077s
    
  3. Relationship not found (fuzzy search for child)

    $ curl -w "%{time_total}s" -s "http://localhost:3000/ziel?select=app(*)"

    Response:

    {
        "code": "PGRST200",
        "details": "Searched for a foreign key relationship between 'ziel' and 'app' in the schema 'apflora', but no matches were found.",
        "hint": "Perhaps you meant 'ap' instead of 'app'.",
        "message": "Could not find a relationship between 'ziel' and 'app' in the schema cache"
    }

    Times:

    0.035553s
    0.027014s
    

(fuzzy search for parent)
0.072776s
0.056077s
(fuzzy search for child)
0.035553s
0.027014s

So the parent search takes longer than the child search because it runs over the full allRels, while the child search only covers the relationships that belong to the parent that was found. Makes sense.
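
The difference in candidate counts can be sketched as follows, assuming a simplified map from (parent table, schema) to child table names. Rels, parentCandidates, and childCandidates are hypothetical names, and Data.Map stands in for the HashMap used in the PR.

```haskell
import qualified Data.Map.Strict as M

type Schema = String
type Table  = String

-- Hypothetical simplification: key = (parent table, schema),
-- value = names of the tables related to that parent.
type Rels = M.Map (Table, Schema) [Table]

-- Parent search: every key of the map is visited, so the candidate list
-- grows with the size of the whole schema cache.
parentCandidates :: Schema -> Rels -> [Table]
parentCandidates schema rels = [t | (t, s) <- M.keys rels, s == schema]

-- Child search: one lookup, then only that parent's relationships.
childCandidates :: Schema -> Table -> Rels -> [Table]
childCandidates schema parent = M.findWithDefault [] (parent, schema)
```

The fuzzy set for parents is built from the first list and the one for children from the second, which is usually much shorter.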

Comment on lines 220 to 223
-- If no relationship is found between a parent and a child, then it looks for the parent first.
-- If the parent is not found then it does a fuzzy search to all the parents in the schema cache and
-- gives the best match as suggestion. Otherwise, it does a fuzzy search to all the corresponding children
-- of that parent and gives the best match as suggestion. If both are found, then no suggestion is given.

@steve-chavez (Member) commented Dec 13, 2022

Just to make sure I'm understanding this.

If no rel is found:

  • Looks for parent suggestions if parent not found
  • Looks for child suggestions if children not found and parent is found
  • No suggestions if both are found

Is that correct?

@laurenceisla (Member, Author) replied:

Yes, if both are found then there's an error with the hint, which is more complex to look for.
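
The flow confirmed above could be sketched like this. It is not the code in src/PostgREST/Error.hs: it swaps the fuzzy-set library for a plain Levenshtein distance, uses Data.Map instead of the HashMap, and the names Rels, editDistance, closest, and suggest are hypothetical.

```haskell
import qualified Data.Map.Strict as M
import Data.List (minimumBy)
import Data.Ord (comparing)

type Schema = String
type Table  = String
-- Hypothetical simplification: (parent table, schema) -> child table names.
type Rels = M.Map (Table, Schema) [Table]

-- Plain Levenshtein edit distance, a stand-in for fuzzy-set scoring.
editDistance :: String -> String -> Int
editDistance a b = last (foldl step [0 .. length a] b)
  where
    step [] _ = []
    step prev@(p : ps) c = scanl compute (p + 1) (zip3 a prev ps)
      where
        compute left (ac, diag, up) =
          minimum [up + 1, left + 1, diag + fromEnum (ac /= c)]

-- Closest candidate by edit distance. Unlike a matcher with a score
-- threshold, this returns a candidate whenever the list is non-empty.
closest :: String -> [String] -> Maybe String
closest _ [] = Nothing
closest q cs = Just (minimumBy (comparing (editDistance q)) cs)

-- Parent missing: suggest a parent. Parent found but child missing:
-- suggest one of that parent's children. Both found: no suggestion.
suggest :: Schema -> Table -> Table -> Rels -> Maybe String
suggest schema parent child rels =
  case M.lookup (parent, schema) rels of
    Nothing -> hint parent <$> closest parent parents
    Just children
      | child `elem` children -> Nothing
      | otherwise             -> hint child <$> closest child children
  where
    parents = [t | (t, s) <- M.keys rels, s == schema]
    hint wrong right =
      "Perhaps you meant '" <> right <> "' instead of '" <> wrong <> "'."
```

Given a map in which 'ziel' is a parent with 'ap' among its children, suggest "apflora" "ziels" "ap" and suggest "apflora" "ziel" "app" produce hints of the same shape as the two PGRST200 responses shown earlier in the thread.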

@steve-chavez (Member) commented:

Suggest both parent and child, if neither is found then suggest the closest relationship between them.

That one is really tricky and likely not worth it. Good that you didn't pursue it further.

CHANGELOG.md (outdated, resolved)

Successfully merging this pull request may close these issues.

reload the schema cache error message is misleading
2 participants