## Search{#search}

Picky offers a `Search` interface for the indexes. You instantiate it as follows:

Just searching over one index:

    books = Search.new books_index # searching over one index

Searching over multiple indexes:

    media = Search.new books_index, dvd_index, mp3_index

Such an instance can then search over all its indexes and returns a `Picky::Results` object:

    results = media.search "query", # the query text
                           20,      # number of ids
                           0        # offset (for pagination)

Please see the part about [Results](#results) to know more about that.

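Since the second and third arguments are the number of ids and the offset, paging comes down to a bit of arithmetic. Here is a minimal sketch; `page_arguments` is a hypothetical helper of my own, not part of Picky:

```ruby
# Hypothetical helper (not part of Picky): translates a 1-based page number
# into the [ids, offset] pair that is passed to `search`.
def page_arguments(page, per_page = 20)
  raise ArgumentError, "page must be >= 1" if page < 1
  [per_page, (page - 1) * per_page]
end

ids, offset = page_arguments(3) # third page, 20 ids per page
puts "ids: #{ids}, offset: #{offset}"

# With a real search instance this would be used as:
#   results = media.search "query", ids, offset
```
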
### Options{#search-options}

You use a block to set search options:

    media = Search.new books_index, dvd_index, mp3_index do
      searching tokenizer_options_or_tokenizer
      boost [:title, :author] => +2,
            [:author, :title] => -1
    end

#### Searching / Tokenizing{#search-options-searching}

See [Tokenizing](#tokenizing) for tokenizer options.

#### Boost{#search-options-boost}

The `boost` option defines what combinations to boost.

This is unlike boosting in most other search engines, where you can only boost a given field. I've found it much more useful to boost combinations.

For example, you have an index of addresses. The usual case is that someone is looking for a street and a number. So if Picky encounters that combination (in that order), it should promote the results containing that combination to a more prominent spot.
On the other hand, if Picky encounters a street number followed by a street name, which is unlikely to be a search for an address (where I come from), you might want to demote that result.

So let's boost `street, streetnumber`, while at the same time deboosting `streetnumber, street`:

    addresses = Picky::Search.new address_index do
      boost [:street, :streetnumber] => +2,
            [:streetnumber, :street] => -1
    end

If you still want to boost a single category, check out the [category weight option](#indexes-categories-weight).
For example:

    Picky::Index.new :addresses do
      category :street, weight: Picky::Weights::Logarithmic.new(+4)
      category :streetnumber
    end

This boosts the weight of the street category for all searches using the index with this category. So whenever the street category is found in results, Picky boosts those results.

##### Note on Boosting

Picky combines consecutive categories in searches for boosting. So if you search for "star wars empire strikes back" and you have defined `[:title] => +1`, then that boost is applied.

Why? In earlier versions of Picky we found that boosting specific combinations is less useful than boosting a specific _order_ of categories.

Let me give you an example from a movie search engine. Instead of having to say `boost [:title] => +1, [:title, :title] => +1, [:title, :title, :title] => +1`, it is far more useful to say "If you find any number of title words in a row, boost it". So, when searching for "star wars empire strikes back 1979", it is less important that the query contains 5 title words than that it contains a title followed by a release year. So in this particular case, a boost defined by `[:title, :release_year] => +3` would be applied.

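To make the "consecutive categories are combined" rule concrete, here is a small plain-Ruby model of it. This is only a sketch of the idea and not Picky's actual implementation; the method names `collapse` and `boost_for` are made up for illustration:

```ruby
# Illustrative model only; Picky's internals may differ.

# Collapse runs of the same category:
# [:title, :title, :release_year] becomes [:title, :release_year].
def collapse(categories)
  categories.chunk_while { |a, b| a == b }.map(&:first)
end

# Look up the boost for an allocation; 0 when no combination matches.
def boost_for(categories, boosts)
  boosts.fetch(collapse(categories), 0)
end

# The same kind of hash you would hand to `boost` in a Search block.
boosts = {
  [:title]                => +1,
  [:title, :release_year] => +3
}

title_only      = [:title] * 5                   # "star wars empire strikes back"
title_then_year = [:title] * 5 + [:release_year] # "star wars empire strikes back 1979"

puts boost_for(title_only, boosts)      # prints 1
puts boost_for(title_then_year, boosts) # prints 3
```
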
#### Ignoring Categories{#search-options-ignore}

There's a [full blog post](http://florianhanke.com/blog/2011/09/01/picky-case-study-location-based-ads.html) devoted to this topic.

In short, an `ignore :name` option makes that Search throw away (ignore) any tokens (words) that map to category `name`.

Let's say we have a search defined:

    names = Picky::Search.new name_index do
      ignore :first_name
    end

Now, if Picky finds the tokens "florian hanke" in both `:first_name, :last_name` and `:last_name, :last_name`, then it will throw away the solutions for `:first_name` ("florian" will be thrown away), leaving only "hanke", since that is a last name. The `[:last_name, :last_name]` combinations will be left alone, i.e. if "florian" and "hanke" are both found in `last_name`.

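As a rough mental model (plain Ruby, not Picky's real code), you can think of each solution as a list of token/category pairs from which the ignored category is filtered out. The method name `ignore_categories` is an assumption for this sketch:

```ruby
# Illustrative model only: an allocation is a list of [token, category] pairs.
def ignore_categories(allocation, ignored)
  allocation.reject { |_token, category| ignored.include?(category) }
end

first_last = [["florian", :first_name], ["hanke", :last_name]]
last_last  = [["florian", :last_name],  ["hanke", :last_name]]

p ignore_categories(first_last, [:first_name]) # "florian" is dropped
p ignore_categories(last_last,  [:first_name]) # left alone
```
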
#### Ignoring Combinations of Categories{#search-options-ignore-combination}

The `ignore` option also takes arrays. If you give it an array, it will throw away all solutions where that _order_ of categories occurs.

Let's say you want to throw away results where last name is found before first name, because your search form is in order: `[first_name last_name]`.

    names = Picky::Search.new name_index do
      ignore [:last_name, :first_name]
    end

So if somebody searches for "peter paul han" (each a last name as well as a first name), and Picky finds the following combinations:

    [:first_name, :first_name, :first_name]
    [:last_name, :first_name, :last_name]
    [:first_name, :last_name, :first_name]
    [:last_name, :first_name, :first_name]
    [:last_name, :last_name, :first_name]

then the combinations

    [:last_name, :first_name, :first_name]
    [:last_name, :last_name, :first_name]

will be thrown away, since they are in the order `[:last_name, :first_name]`. Note that `[:last_name, :first_name, :last_name]` is not thrown away since it is last-first-last.

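The example above is consistent with a simple rule: combine consecutive equal categories, then compare the result with the ignored array. That rule is my reading of the example, so treat the following plain-Ruby sketch (with made-up method names) as an illustration rather than Picky's implementation:

```ruby
# Illustrative model only; the matching rule is inferred from the example above.

# Collapse runs of the same category into one.
def collapse(categories)
  categories.chunk_while { |a, b| a == b }.map(&:first)
end

# Throw away every allocation whose collapsed order equals the ignored order.
def ignore_combination(allocations, ignored)
  allocations.reject { |categories| collapse(categories) == ignored }
end

found = [
  [:first_name, :first_name, :first_name],
  [:last_name,  :first_name, :last_name],
  [:first_name, :last_name,  :first_name],
  [:last_name,  :first_name, :first_name],
  [:last_name,  :last_name,  :first_name]
]

# Prints the three allocations that survive.
ignore_combination(found, [:last_name, :first_name]).each { |categories| p categories }
```
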
#### Keeping Combinations of Categories{#search-options-only-combination}

This is the opposite of the `ignore` option above.

Almost. The `only` option only takes arrays. If you give it an array, it will keep only solutions where that _order_ of categories occurs.

Let's say you want to keep only results where first name is found before last name, because your search form is in order: `[first_name last_name]`.

    names = Picky::Search.new name_index do
      only [:first_name, :last_name]
    end

So if somebody searches for "peter paul han" (each a last name as well as a first name), and Picky finds the following combinations:

    [:first_name, :first_name, :last_name]
    [:last_name, :first_name, :last_name]
    [:first_name, :last_name, :first_name]
    [:last_name, :first_name, :first_name]
    [:last_name, :last_name, :first_name]

then only the combination

    [:first_name, :first_name, :last_name]

will be kept, since it is the only one where first comes before last, in that order.

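Modelled in plain Ruby, `only` is the mirror image of the array form of `ignore`: keep an allocation if its categories, with consecutive repeats combined, equal the given order. Again, this is a sketch under that assumption, and `only_combination` is a hypothetical name:

```ruby
# Illustrative model only; the matching rule is inferred from the example above.
def only_combination(allocations, wanted)
  allocations.select do |categories|
    # Combine consecutive equal categories before comparing.
    categories.chunk_while { |a, b| a == b }.map(&:first) == wanted
  end
end

found = [
  [:first_name, :first_name, :last_name],
  [:last_name,  :first_name, :last_name],
  [:first_name, :last_name,  :first_name],
  [:last_name,  :first_name, :first_name],
  [:last_name,  :last_name,  :first_name]
]

p only_combination(found, [:first_name, :last_name])
# prints [[:first_name, :first_name, :last_name]]
```
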
#### Ignore Unassigned Tokens{#search-options-unassigned}

There's a [full blog post](http://florianhanke.com/blog/2011/09/05/picky-ignoring-unassigned-tokens.html) devoted to this topic.

In short, the `ignore_unassigned_tokens true/false` option makes Picky very lenient with your queries. Usually, if one of the search words is not found, say in a query "aston martin cockadoodledoo" in a car search, Picky will return an empty result set, because "cockadoodledoo" is not in any index.

By ignoring the "cockadoodledoo" that can't be assigned sensibly, you will still get results.

This could be used in a search for advertisements that are shown next to the results.

If you've defined an ads search like so:

    ads_search = Search.new cars_index do
      ignore_unassigned_tokens true
    end

then even if the regular search does not find anything for "aston martin cockadoodledoo", the ads search will still find an ad, simply ignoring the unassigned token.

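The difference between the two settings can be sketched in a few lines of plain Ruby. This is a simplified model, not Picky's code: `indexed_words` stands in for "words found in any category of the index", and `usable_tokens` is a made-up name:

```ruby
# Illustrative model only. Returns the tokens the search would go on with,
# or an empty list when an unassignable token sinks the whole query.
def usable_tokens(tokens, indexed_words, ignore_unassigned:)
  unassigned = tokens.reject { |token| indexed_words.include?(token) }
  return tokens if unassigned.empty?

  ignore_unassigned ? tokens - unassigned : []
end

indexed_words = %w[aston martin db9 vantage] # hypothetical index contents
query         = %w[aston martin cockadoodledoo]

p usable_tokens(query, indexed_words, ignore_unassigned: false) # prints []
p usable_tokens(query, indexed_words, ignore_unassigned: true)  # prints ["aston", "martin"]
```
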
#### Maximum Allocations{#search-options-maxallocations}

The `max_allocations(integer)` option cuts off calculation of allocations.

What does this mean? Say you have code like:

    phone_search = Search.new phonebook do
      max_allocations 1
    end

And someone searches for "peter thomas".

Picky then generates all possible allocations and sorts them.

It might get

* `[first_name, last_name]`
* `[last_name, first_name]`
* `[first_name, first_name]`
* etc.

with the first allocation being the most probable one.

So, with `max_allocations 1` it will only use the topmost one and throw away all the others.

It will only go through the first one and calculate results only for that one. This can be used to speed up Picky when the number of allocations explodes.

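In plain Ruby the idea boils down to "sort, then cut off". The following is only a sketch; the scores are invented for the example and `top_allocations` is not a Picky method:

```ruby
# Illustrative model only: allocations are [categories, score] pairs.
def top_allocations(scored_allocations, max_allocations = nil)
  sorted = scored_allocations.sort_by { |_categories, score| -score }
  max_allocations ? sorted.first(max_allocations) : sorted
end

# Hypothetical scores for the query "peter thomas".
scored = [
  [[:last_name,  :first_name], 2.1],
  [[:first_name, :last_name],  3.4],
  [[:first_name, :first_name], 0.7]
]

p top_allocations(scored, 1) # prints [[[:first_name, :last_name], 3.4]]
```

Only the allocations that survive the cut are then used to calculate ids.
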
#### Early Termination{#search-options-terminateearly}

The `terminate_early(integer)` or `terminate_early(with_extra_allocations: integer)` option stops Picky from calculating all ids of all allocations.

However, this will also return a wrong total.

So, important note: Only use when you don't display a total. Or you want to fool your users (not recommended).

Examples:

Stop as soon as you have calculated enough ids for the allocation.

    phone_search = Search.new phonebook do
      terminate_early # The default uses 0.
    end

Stop as soon as you have calculated enough ids for the allocation, and then calculate 3 allocations more (for example, to show to the user).

    phone_search = Search.new phonebook do
      terminate_early 3
    end

There's also a hash form to be more explicit. So the next coder knows what it does. (However, us cool Picky hackers _know_ ;) )

    phone_search = Search.new phonebook do
      terminate_early with_extra_allocations: 5
    end

This option speeds up Picky if you don't need a correct total.
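
To see why the total comes out wrong, here is a toy model in plain Ruby. It is not how Picky is implemented; `calculate` and its arguments are invented for this sketch, and each allocation is simply represented by the array of ids it would yield:

```ruby
# Illustrative model only. `needed` is ids + offset from the search call;
# `extra_allocations: nil` means "no early termination".
def calculate(allocations, needed, extra_allocations: nil)
  collected = []
  remaining_extra = extra_allocations

  allocations.each do |ids|
    if remaining_extra && collected.size >= needed
      break if remaining_extra.zero? # enough ids, no extras left: stop
      remaining_extra -= 1           # enough ids, but process one more
    end
    collected.concat(ids)
  end

  { ids: collected.first(needed), total: collected.size }
end

allocations = [[1, 2, 3], [4, 5], [6], [7, 8]]

p calculate(allocations, 3)[:total]                       # prints 8 (correct total)
p calculate(allocations, 3, extra_allocations: 0)[:total] # prints 3
p calculate(allocations, 3, extra_allocations: 1)[:total] # prints 5
```

The returned ids are the same in all three cases; only the total differs, because the skipped allocations are never counted.
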