This repository has been archived by the owner on May 20, 2024. It is now read-only.

Full-text search in messages (and other content?) #1251

Closed
tiltec opened this issue Jan 28, 2019 · 2 comments

Comments

@tiltec
Member

tiltec commented Jan 28, 2019

I played around a bit with full-text search in messages. It seems like a useful feature for the mid-term future, so this is mostly a report on the backend technology, to give us a faster start next time.

Django has reasonable built-in support for Postgres text search features: https://docs.djangoproject.com/en/2.1/ref/contrib/postgres/search/
It's essentially:

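from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector

query = SearchQuery('my search term')
vector = SearchVector('content')
rank = SearchRank(vector, query)

# Keep only messages that match the query, best match first
ConversationMessage.objects.annotate(search=vector, rank=rank).filter(search=query).order_by('-rank')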

SearchQuery calls the Postgres plainto_tsquery function, which is documented here. The function splits the search term into "lexemes", depending on the language.
There's also a websearch_to_tsquery function, which has some nice properties, such as defining a small search syntax that supports "or", quoted phrases and exclusion. It was added in Postgres 11, so using it would make Postgres 11 a new minimum requirement.
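
To make that search syntax a bit more concrete, here is a toy tokenizer in plain Python. It is only an illustration of the three features mentioned above (quoted phrases, the word "or", and a leading dash for exclusion). It is not how Postgres implements websearch_to_tsquery, which additionally normalizes the words into lexemes. The function name parse_websearch and the token labels are made up for this sketch.

```python
import re

# A quoted phrase (optionally prefixed with "-") or a run of non-space characters.
_TOKEN = re.compile(r'(-?)"([^"]+)"|(\S+)')


def parse_websearch(text):
    """Split a web-search style query into (kind, text) pairs.

    Toy illustration only: the real websearch_to_tsquery also stems words
    and drops stop words.
    """
    tokens = []
    for match in _TOKEN.finditer(text):
        negated, phrase, word = match.groups()
        if phrase is not None:
            # "quoted text" is searched as a phrase; -"quoted text" excludes it
            tokens.append(("exclude" if negated else "phrase", phrase))
        elif word.lower() == "or":
            tokens.append(("or", word))
        elif word.startswith("-") and len(word) > 1:
            # a leading dash excludes the term
            tokens.append(("exclude", word[1:]))
        else:
            tokens.append(("term", word))
    return tokens


for token in parse_websearch('"food sharing" pickup or delivery -cancelled'):
    print(token)
```

Plain terms that are not separated by "or" would be combined with AND.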

We can search in multiple fields by combining several SearchVector instances.

To show a preview of the matched text, ts_headline returns an excerpt with the matched sections highlighted. This sounds very convenient; maybe it can even return Markdown directly! (E.g. setting StartSel and StopSel to ** should make the matches bold, but it might interfere with user formatting...)
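
One possible way around the formatting clash, as a rough sketch: let ts_headline wrap matches in neutral sentinel strings instead of **, escape the Markdown characters the user typed, and only then turn the sentinels into bold markers. The sentinel values, the helper name headline_to_markdown and the (deliberately small) set of escaped characters are all assumptions for illustration. I have not tried this against real ts_headline output.

```python
import re

# Hypothetical sentinels, assumed to be passed to ts_headline as
# StartSel / StopSel instead of "**". They are assumed not to occur
# in normal message text.
START_SEL = "@@HLSTART@@"
STOP_SEL = "@@HLSTOP@@"

# A small subset of characters that carry meaning in Markdown.
_MARKDOWN_SPECIALS = re.compile(r"([\\`*_\[\]<>])")


def headline_to_markdown(headline: str) -> str:
    """Escape user formatting first, then make the matched parts bold."""
    escaped = _MARKDOWN_SPECIALS.sub(r"\\\1", headline)
    return escaped.replace(START_SEL, "**").replace(STOP_SEL, "**")


print(headline_to_markdown("see the @@HLSTART@@pickup@@HLSTOP@@ *tomorrow*"))
# prints: see the **pickup** \*tomorrow\*
```

The excerpt then renders with only the match in bold, while the asterisks typed by the user show up literally.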

About performance: I didn't look much into indexing; there's something written about it here.

It might also be faster to not rank results, but just match them. Very basic testing on a small result set didn't show a notable difference, but that doesn't mean much.

Another question is about API structure. There was the idea to create a mega search endpoint /api/search/?q=term that can search in messages, but also in other data. The response should contain the results, related objects and pagination support. It could look similar to the conversation list endpoint, like this:

{
  "prev": null,
  "next": "....",
  "results": [
    {
      "messages": [],
      "conversations": [],
      "applications": [],
      "pickups": [],
      "meta": {
        "term": "my search term"
      }
    }
  ]
}
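
As a minimal sketch of how such a response could be assembled on the backend, here is a plain-Python helper that builds the structure above and serializes it. The function name build_search_response and the prev_cursor / next_cursor parameters are hypothetical. The sketch assumes that "prev" is a real JSON null on the first page, and it leaves out the actual searching and pagination logic.

```python
import json
from typing import Optional


def build_search_response(
    term: str,
    messages: list,
    conversations: list,
    applications: list,
    pickups: list,
    prev_cursor: Optional[str] = None,
    next_cursor: Optional[str] = None,
) -> dict:
    """Assemble the proposed /api/search/ response shape (illustrative only)."""
    return {
        "prev": prev_cursor,  # None is serialized as JSON null
        "next": next_cursor,
        "results": [
            {
                "messages": messages,
                "conversations": conversations,
                "applications": applications,
                "pickups": pickups,
                "meta": {"term": term},
            }
        ],
    }


payload = build_search_response("my search term", [], [], [], [])
print(json.dumps(payload, indent=2))
```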

The frontend is a completely different beast. We could take some inspiration from Discourse and Slack. In the beginning it would probably be a "dev UI" with just a few ugly buttons.

@tiltec tiltec added this to the Discussion (lower priority) milestone Jan 31, 2019
@github-actions

This issue is marked as stale because it has not had any activity for 90 days, remove the stale label or add a comment on it, otherwise it will be automatically closed in 7 days. Thanks!

@github-actions

This issue is marked as stale because it has not had any activity for 90 days.

It doesn't mean it's not important, so please remove the stale label if you like it, or add a comment saying what it means to you :)

However, if you just leave it like this, I'll close it in 7 days to help keep your issues tidy!

Thanks!
