lib-fts: generic simple tokeniser - distinguish "letters" from non-"letters"

prev_type is only compared against SINGLE_QUOTE, so there will be no
behavioural differences. However, maintaining the state that we've just
seen something we are prepared to search for (very loosely, a "letter")
rather than something that we threw away (word breaks) will be important
when it comes to explicit prefix query parsing.

Signed-off-by: Phil Carmody <phil@dovecot.fi>
Phil Carmody authored and villesavolainen committed Feb 12, 2019
1 parent 0757462 commit 6d292e2
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion src/lib-fts/fts-tokenizer-generic.c
@@ -255,7 +255,10 @@ fts_tokenizer_generic_simple_next(struct fts_tokenizer *_tok,
             start = i + char_size;
             shift_prev_type(tok, LETTER_TYPE_SINGLE_QUOTE);
         } else {
-            shift_prev_type(tok, LETTER_TYPE_NONE);
+            /* Lie slightly about the type. This is anything that
+               we're not skipping or cutting on and are prepared to
+               search for - it's "as good as" a letter. */
+            shift_prev_type(tok, LETTER_TYPE_ALETTER);
         }
     }
     /* word boundary not found yet */
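
The commit message's claim that nothing changes in behaviour can be checked with a small standalone sketch. It is not the Dovecot code: the reduced enum, the `sketch_tokenizer` struct, its two-slot history, and every `sketch_` function are hypothetical stand-ins, and the "searchable" check is only a guess at the kind of test that prefix query parsing might later need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the tokenizer's letter types;
   the real enum in fts-tokenizer-generic.c has more members. */
enum letter_type {
	LETTER_TYPE_NONE = 0,
	LETTER_TYPE_SINGLE_QUOTE,
	LETTER_TYPE_ALETTER
};

/* Hypothetical state: only the type history, nothing else. */
struct sketch_tokenizer {
	enum letter_type prev_prev_type;
	enum letter_type prev_type;
};

/* Assumed behaviour of shift_prev_type(): push one type into the history. */
static void
sketch_shift_prev_type(struct sketch_tokenizer *tok, enum letter_type lt)
{
	tok->prev_prev_type = tok->prev_type;
	tok->prev_type = lt;
}

/* The only kind of comparison the commit message says exists today. */
static bool sketch_prev_was_quote(const struct sketch_tokenizer *tok)
{
	return tok->prev_type == LETTER_TYPE_SINGLE_QUOTE;
}

/* Hypothetical later check: did we just see something searchable? */
static bool sketch_prev_was_searchable(const struct sketch_tokenizer *tok)
{
	return tok->prev_type == LETTER_TYPE_ALETTER;
}

/* Compare recording NONE (before the commit) with ALETTER (after). */
static void sketch_demo(void)
{
	struct sketch_tokenizer before = { LETTER_TYPE_NONE, LETTER_TYPE_NONE };
	struct sketch_tokenizer after = { LETTER_TYPE_NONE, LETTER_TYPE_NONE };

	sketch_shift_prev_type(&before, LETTER_TYPE_NONE);
	sketch_shift_prev_type(&after, LETTER_TYPE_ALETTER);

	/* Existing quote check: same answer either way. */
	assert(sketch_prev_was_quote(&before) == sketch_prev_was_quote(&after));
	/* Only the new state remembers that a "letter" was seen. */
	assert(!sketch_prev_was_searchable(&before));
	assert(sketch_prev_was_searchable(&after));
}
```

Calling `sketch_demo()` exercises both variants. The quote comparison cannot tell them apart, which matches the "no behavioural differences" statement, while the extra check shows what information the new value preserves.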