<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<meta content="EN" http-equiv="Content-Language" />
<meta content="Florian Hanke, florianhanke.com" name="author" />
<meta content="picky, ruby, single field, semantic small text, search engine" name="keywords" />
<meta content="Picky: The fast and easy to configure Ruby search engine" name="abstract" />
<meta content="Picky: The fast and easy to configure Ruby search engine. Offers a server, a client, and a statistics interface." name="description" />
<meta content="index, follow" name="robots" />
<meta content="3 days" name="revisit-after" />
<link href="favicon.ico" rel="shortcut icon" />
<link href="stylesheets/basic.css" rel="stylesheet" type="text/css" />
<link href="stylesheets/specific.css" rel="stylesheet" type="text/css" />
<link href="stylesheets/grid.css" rel="stylesheet" type="text/css" />
<title>
Picky:
Walkthrough
</title>
<script type="text/javascript">
//<![CDATA[
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-20991642-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
//]]>
</script>
</head>
<body>
<div class="header">
<a href="http://github.com/floere/picky">
<img alt="Fork me on GitHub" src="images/forkme.png" style="position: fixed; top: -10px; right: 0; border: 0;" />
</a>
</div>
<div class="picky" title="Happy Picky (drawn on iPhone)"></div>
<div class="container_2">
<h1>Picky</h1>
<div class="navigation">
<a class="" href="index.html">about</a>
<a class="" href="details.html">semantic text?</a>
<a class="" href="getting_started.html">get started</a>
<a class="" href="features.html">features</a>
<a class="right" href="enterprise.html">enterprise?</a>
<a class="right" href="status.html">status/contributions</a>
<a class="right" href="videos.html">videos</a>
<a class="right" href="documentation.html">docs</a>
</div>
</div>
<div class="container_2">
<h2 id="walkthrough">Walkthrough Example: Server and client.</h2>
<div class="grid_1">
<h2>Client</h2>
<p>
This column uses a few examples to show how to set up a client and a front end for the Picky server, which is described in the right column.
</p>
<h3>Setup</h3>
<p>
The examples assume you're using a Sinatra/Padrino or Rails app.
</p>
<p>
Start by getting the picky-client gem and adding it to your Gemfile. You could go on without it but it helps a lot.
</p>
<code><pre># On the command line:
gem install picky-client

# In your Gemfile:
gem 'picky-client'</pre></code>
<p>
Don't forget to run
</p>
<code><pre>bundle update</pre></code>
<p>
And that's already it for the client setup! Easy, isn't it? The configuration isn't much harder.
</p>
</div>
<div class="grid_1">
<h2>Server</h2>
<p>
This column uses a few examples to show how to set up the Picky server. You can read both columns back and forth if you want. Like ping-pong. Played by two Chinese master ping-pong pandas. (Not by me, or you'd already stop at gem install. And the table would be on fire.)
</p>
<h3>Setup</h3>
<p>
It starts out the same as in the Getting Started section. But this time, we build an actual example Picky project called library_search. For that, we use the
</p>
<code><pre>picky generate unicorn_server &lt;project name&gt;</pre></code>
<p>
command, which is installed with the picky-generators gem.
</p>
<code><pre>gem install picky-generators

picky generate unicorn_server library_search

cd library_search

bundle install</pre></code>
<p>
You now have a nice directory (library_search) set up with all the needed gems, ready to go!
</p>
</div>
<div class="grid_2"></div>
<div class="grid_1">
<h3>Configuration (Sinatra/Rails etc. Controller)</h3>
<p>
The Picky client provides an API to access the server. It looks like this:
</p>
<code><pre># The options define where the Picky server that
# you have already set up is found.
# (Haven't set it up yet? See the right column on
# how to do this, then come back here.)
#
# Options are:
# * host # e.g. 'localhost'
# * port # e.g. 8080
# * path # e.g. '/books'
#
Picky::Client.new options</pre></code>
<p>
Usually, what I do is save the Picky client instance in a constant, like FullBooks, or BookSearch.
This is so I can reuse that instance.
</p>
<p>
Since this configuration is environment-specific, it is best (in Rails) to put it into development.rb, production.rb, or test.rb.
</p>
<code><pre># In development.rb:
#
BookSearch = Picky::Client.new(
 :host => 'localhost',
 :port => 8080,
 :path => '/books'
)</pre></code>
<p>
The BookSearch constant is ready for use in your controller actions!
</p>
<p>
Please continue below to see how to use the configured searches.
</p>
</div>
<div class="grid_1">
<h3>Configuration</h3>
<p>
The most important file in your project is
<strong>app/application.rb</strong>
</p>
<p>
It defines how all the indexing and the searching is handled, and even the routing.
</p>
<h4>Define how the indexing works</h4>
<p>Which characters pass through? Which words are removed (stopwords)? How is the text tokenized, i.e., split?</p>
<code><pre># In app/application.rb, find this stub
# and adapt the examples.
#
default_indexing removes_characters:
 /[^a-zA-Z0-9\s\/\-\"\&\.]/
 ...</pre></code>
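<p>
To make the three questions concrete, here is a minimal plain-Ruby sketch of such a pipeline. It does not call Picky: the stopword list, the split pattern and the tokenize helper are assumptions made up for illustration, and only the character filter is copied from the stub above.
</p>
<code><pre># Character filter copied from the stub above.
REMOVES_CHARACTERS = /[^a-zA-Z0-9\s\/\-\"\&\.]/

# Hypothetical stopword list and split pattern (not Picky defaults).
STOPWORDS      = /\b(and|the|of|or)\b/i
SPLITS_TEXT_ON = /[\s\/\-]+/

# Removes unwanted characters, drops stopwords, then splits into tokens.
def tokenize(text)
  text.gsub(REMOVES_CHARACTERS, '')
      .gsub(STOPWORDS, '')
      .split(SPLITS_TEXT_ON)
      .reject { |token| token.empty? }
      .map { |token| token.downcase }
end

p tokenize('Moby-Dick; or, The Whale (1851)!')</pre></code>
<p>
The order of the steps matters: characters are removed first, so the stopword and split patterns only ever see text that has already been cleaned.
</p>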
<h4>Define a few indexes.</h4>
<p>It's easy. If you have a filled database table ready, it's even easier.</p>
<code><pre># In app/application.rb, find this stub
# and adapt the examples.
#
# Indexes have an identifier, e.g., :books, a source,
# which here is a database table, and a number of categories.
# (With a database source, the categories are equivalent
# to the fields)
#
books_index = Index::Memory.new :books, Sources::DB.new(
 'SELECT id, title, author, description FROM books',
 :file => 'app/db.yml'
 ) do
 category :title, # identifier
 :qualifiers => [:t, :title],
 :similarity => Similarity::DoubleLevenshtone.new(3)
 category ...
end</pre></code>
<p>
An index has
<ol>
<li>an identifier (for index directory naming/referencing by Indexes[:identifier]),</li>
<li>
a data source (find out more on
<a href="http://github.com/floere/picky/wiki/Sources-Configuration">Sources in the Wiki</a>
), and
</li>
<li>
a number of categories (find out more on
<a href="http://github.com/floere/picky/wiki/Categories-Configuration">Categories in the Wiki</a>
), and finally,
</li>
<li>a number of options.</li>
</ol>
</p>
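<p>
If it helps to picture what an index with categories holds, the following plain-Ruby sketch builds one token-to-ids table per category. It is a conceptual illustration only, not how Picky stores its indexes; the BOOKS rows and the build_index helper are hypothetical.
</p>
<code><pre># Hypothetical rows, standing in for the result of the SQL query above.
BOOKS = [
  { id: 1, title: 'The Hobbit', author: 'J. R. R. Tolkien' },
  { id: 2, title: 'Ulysses',    author: 'James Joyce' },
  { id: 3, title: 'Dubliners',  author: 'James Joyce' }
]

# Builds one token-to-ids table per category.
def build_index(rows, categories)
  categories.each_with_object({}) do |category, index|
    table = Hash.new { |hash, token| hash[token] = [] }
    rows.each do |row|
      row[category].downcase.scan(/[a-z0-9]+/).each do |token|
        table[token].push(row[:id]) unless table[token].include?(row[:id])
      end
    end
    index[category] = table
  end
end

INDEX = build_index(BOOKS, [:title, :author])

p INDEX[:author]['joyce']  # ids of the books whose author contains "joyce"
p INDEX[:title]['hobbit']  # ids of the books whose title contains "hobbit"</pre></code>
<p>
Keeping a separate table per category is what makes it possible to tell a title hit from an author hit later on.
</p>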
<h4>Define how querying works, i.e., how query text is handled.</h4>
<p>
After having defined the indexing, this is a piece of cake, since it works the same way.
</p>
<code><pre># In app/application.rb, find this stub
# and adapt the examples.
#
default_querying removes_characters:
 /[^a-zA-Z0-9\s\"\~\*\:]/
...</pre></code>
<h4>Queries</h4>
<p>Define a few queries.</p>
<code><pre># In app/application.rb, find this stub
# and adapt the examples.
#
# A full search returns ids, while a live search doesn't.
#
# The options define weights which will give bonus points
# to certain combinations. If only title words are found,
# a hefty bonus of 6 is given, which is very high.
#
# If a title is found before the author, like
# "the hobbit, tolkien", 3 points are awarded.
#
options = {
 :weights => Query::Weights.new([:title] => 6,
 [:title, :author] => 3)
}
full_search = Query::Full.new(books_index, options)
live_search = Query::Live.new(books_index, options)

# It's possible to use multiple indexes in a query.
#
multi_search = Query::Full.new(
 books_index,
 dvd_index,
 mp3_index
 )</pre></code>
<p>
Find out more in the
<a href="http://github.com/floere/picky/wiki/Queries-Configuration">Wiki on Query Configuration</a>
</p>
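<p>
The weights can be read as a lookup from a combination of categories to a bonus. Here is a small plain-Ruby sketch of that idea. It does not use Query::Weights; the collapse and bonus_for helpers are hypothetical, and it assumes that consecutive hits in the same category count as one, so that a query with only title words matches [:title].
</p>
<code><pre># Same bonus table as in the options above, as a plain Hash.
WEIGHTS = {
  [:title]          => 6,
  [:title, :author] => 3
}

# Collapses runs of the same category: [:title, :title, :author]
# becomes [:title, :author]. (This is the assumption named above.)
def collapse(categories)
  categories.each_with_object([]) do |category, result|
    result.push(category) unless result.last == category
  end
end

# Returns the bonus for a combination, or 0 if none is configured.
def bonus_for(categories)
  WEIGHTS.fetch(collapse(categories), 0)
end

p bonus_for([:title, :title])   # only title words
p bonus_for([:title, :author])  # title before author
p bonus_for([:author, :title])  # other order, no bonus configured</pre></code>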
<h4>Map some URL paths</h4>
<p>Phew! Almost done :)</p>
<code><pre># In app/application.rb, find this stub
# and adapt the examples.
#
# The method "route" maps URL paths to queries.
# Use regexps or strings to define paths.
#
route %r{^/tracks/full} => full_search
route %r{^/tracks/live} => live_search</pre></code>
<p>
Find out more in the
<a href="http://github.com/floere/picky/wiki/Routing-configuration">Wiki on Routing Queries</a>
</p>
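<p>
As a rough mental model of mapping paths to queries, here is a tiny stand-alone router in plain Ruby. It is not Picky's routing code: the ROUTES table and the dispatch helper are hypothetical, and symbols stand in for the query objects.
</p>
<code><pre>ROUTES = {}

# Registers one or more path patterns, each mapped to a target.
def route(mapping)
  ROUTES.merge!(mapping)
end

# Returns the target of the first pattern that matches, or nil.
# Regexp#=== tests for a match, String#=== tests for equality.
def dispatch(path)
  ROUTES.each do |pattern, target|
    return target if pattern === path
  end
  nil
end

route(%r{^/tracks/full} => :full_search)
route('/tracks/live'    => :live_search)

p dispatch('/tracks/full')
p dispatch('/tracks/live')
p dispatch('/unknown')</pre></code>
<p>
A regexp anchored with ^ matches every path that starts with the given prefix, while a string only matches the exact path.
</p>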
<h3>Indexing</h3>
<p>Finally! Let picky have a look at the data!</p>
<code><pre>rake index</pre></code>
<h3>Gentlemen, start your engines</h3>
<code><pre>rake start</pre></code>
<p>
will start a Unicorn server.
</p>
<h3>Refine!</h3>
<p>Define similarity searches, more specific indexes, more searches, etc.</p>
</div>
<div class="grid_2"></div>
<div class="grid_1">
<h3>Usage (Controller)</h3>
<p>
Now that you have defined the constants, let's use them!
</p>
<p>
The client provides a handy #search method, with the signature
<strong>search(options)</strong>
where the options are:
<ul>
<li>query: the query text</li>
<li>offset: the result offset (default 0, only used in Full)</li>
</ul>
</p>
<code><pre># In a controller, e.g. the index action:
#
def index
 # A Picky client has a search method with some options:
 # * query: The query to be sent to Picky.
 # * offset: An offset on the result ids. # Default is 0.
 #
 results = FullBooks.search :query => 'hello picky'</pre></code>
<p>
If the server is running, just try it! The results should be a hash with the result data.
</p>
<p>
Now, this is nice, but not very useful, is it? Picky can make that hash a bit more accessible with Picky::Convenience™.
</p>
<code><pre># Still in the controller action:
#
results = FullBooks.search ...

# Make the hash a bit more self-aware.
#
results.extend Picky::Convenience

# Now you get:
#
results.empty?
results.ids 10 # First 10 ids. Default is 20.
results.clear_ids # Remove all ids.
results.allocations
results.allocations_size
results.total # The total amount of found ids.

# The method I use most often is
# populate_with, as this populates the results
# with rendered results (using the ids), not
# just the ids themselves.
#
# Note: Also clears the ids with clear_ids.
#
results.populate_with Book do |book|
 # book is a model. Render it however you want.
 book.to_s
end

# If you use the provided Picky JavaScript frontend,
# then encode it in JSON before sending it off.
#
ActiveSupport::JSON.encode results</pre></code>
<p>
And that was it for the controller. It looks large, but when reduced to the essential lines, it is just this:
</p>
<code><pre># In an initializer or environment.
#
FullBooks = Picky::Client::Full.new ...
LiveBooks = Picky::Client::Live.new ...

# In a controller action.
#
results = FullBooks.search ...
results.populate_with(Book) { |book| book.to_s }
ActiveSupport::JSON.encode results</pre></code>
<p>
Unbeatably easy, right?
</p>
<p>
If you don't want to render the results in the controller, use #entries to render them in a view and use #populate_with without the rendering block.
</p>
<code><pre># In a controller action.
#
results = FullBooks.search ...
results.populate_with Book

# In your view:
#
results.entries do |book|
 render book
end
ActiveSupport::JSON.encode results</pre></code>
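<p>
In case the extend call used above looks like magic: Ruby's Object#extend adds a module's methods to one single object. The sketch below shows the mechanism on a plain hash. ResultHelpers is a hypothetical module, not Picky::Convenience, and the :total and :ids keys are an invented layout, not the format the Picky server returns.
</p>
<code><pre># Hypothetical helper module. It is NOT Picky::Convenience, and the
# :total and :ids keys are an invented layout for this sketch.
module ResultHelpers
  def total
    self[:total]
  end

  # First `limit` ids.
  def ids(limit = 20)
    self[:ids].first(limit)
  end

  # Replaces the ids with whatever the block renders for each id.
  def populate
    self[:entries] = self[:ids].map { |id| yield id }
    self[:ids] = []
    self
  end
end

results = { total: 3, ids: [7, 12, 40] }
results.extend ResultHelpers   # only this one Hash gets the methods

p results.total
p results.ids(2)
results.populate { |id| "Book ##{id}" }
p results[:entries]</pre></code>
<p>
Other hashes in your app stay untouched, which is why extending the result object is a fairly unobtrusive way to add helpers.
</p>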
</div>
<div class="grid_1">
<h3>Usage (Of the Server)</h3>
<p>
Use the server either from Sinatra/Rails/Padrino/Camping etc. through the Picky client (see left column), or access the JSON data directly, for example with curl:
</p>
<code><pre>curl 'localhost:8080/books?offset=10&amp;query=test'</pre></code>
<p>
Or access it from any app server in any language. The data you get is JSON, for which lots of good libraries are available.
</p>
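<p>
For example, without the picky-client gem, a request like the curl call above could be sent from Ruby using only the standard library. This is a sketch under assumptions: search_uri and search are hypothetical helper names, and host, port and path are the example values from this page.
</p>
<code><pre>require 'json'
require 'net/http'
require 'uri'

# Builds the same kind of URL as the curl example above.
def search_uri(query, offset = 0, host: 'localhost', port: 8080, path: '/books')
  URI::HTTP.build(
    host:  host,
    port:  port,
    path:  path,
    query: URI.encode_www_form(query: query, offset: offset)
  )
end

# Sends the request and parses the JSON response into a Hash.
# Needs a running Picky server, so it is only defined here, not called.
def search(query, offset = 0)
  JSON.parse(Net::HTTP.get(search_uri(query, offset)))
end

puts search_uri('test', 10)</pre></code>
<p>
URI.encode_www_form takes care of escaping, so query texts with spaces or special characters end up correctly encoded in the URL.
</p>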
<h3>Is something not correctly indexed?</h3>
<code><pre>rake 'try[My Words That Do Not Work]'</pre></code>
<p>Words to find should be indexed in basically the same way as the query processes them.</p>
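<p>
A quick way to see why this matters is to run the same word through both character filters. The sketch below is plain Ruby, not Picky; it assumes lowercasing and splitting on whitespace for both sides, and uses the two example patterns from the configuration stubs.
</p>
<code><pre># The two character filters from the stubs in app/application.rb.
INDEXING_REMOVES = /[^a-zA-Z0-9\s\/\-\"\&\.]/
QUERYING_REMOVES = /[^a-zA-Z0-9\s\"\~\*\:]/

def indexed_tokens(text)
  text.downcase.gsub(INDEXING_REMOVES, '').split
end

def query_tokens(text)
  text.downcase.gsub(QUERYING_REMOVES, '').split
end

p indexed_tokens('U.S.A. Trilogy')  # the dots survive indexing
p query_tokens('u.s.a. trilogy')    # the dots are removed from the query</pre></code>
<p>
With these two patterns, the first word is indexed with its dots but queried without them, so an exact lookup of that word would miss. Aligning the two patterns avoids this kind of surprise.
</p>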
</div>
<div class="grid_2"></div>
<div class="grid_1">
<h3>The provided JS frontend.</h3>
<p>
Picky provides an HTML structure which is in turn used by the Picky JS frontend.
</p>
<p>
Add the following line to your views (here in Haml glory):
</p>
<code><pre>= Picky::Helper.cached_interface options</pre></code>
<p>or</p>
<code><pre>= Picky::Helper.interface options</pre></code>
<p>
The options (defaults after the ||) are
</p>
<code><pre>options[:button] || 'search'
options[:no_results] || 'Sorry, no results found!'
options[:more] || 'more'</pre></code>
<p>
This enables you to pass in your own translated texts. If you have only one language I suggest you use #cached_interface.
</p>
<p>
With the HTML structure in place, let's take a look at the JavaScript.
</p>
<p>
The simplest example that works is:
</p>
<code><pre>new PickyClient({
 full: '/search/full', // Displays the rendered results.
 live: '/search/live' // Just updates the count.
});</pre></code>
<p>
You'd of course use the URLs you want.
</p>
<p>
A more complicated example looks like this:
</p>
<code><pre>pickyClient = new PickyClient({
 // A full query displays the rendered results.
 //
 full: '/search/full',

 // A live query just updates the count.
 //
 live: '/search/live',

 // Optional. Default is 10.
 //
 showResultsLimit: 20,

 // Optional. Before Picky sends any data.
 //
 before: function(params, query, offset) {
 console.log('Going to send your query. Oh boy!');
 },

 // Optional. Just after Picky receives data.
 // (Get a PickyData object)
 //
 success: function(data, query) {
 console.log('Received the data.');
 },

 // Optional. After Picky has handled the
 // data and updated the view.
 //
 after: function(data, query) {
 console.log('Found what you were looking for?');
 },

 // This is used to generate the correct query
 // strings, localized. E.g. "subject:war".
 //
 // Optional. If you don't give these, the
 // category identifier given in the Picky server
 // is used.
 //
 qualifiers: {
 en:{
 subjects: 'subject'
 }
 },

 // This is used to explain the preceding word
 // in the suggestion text, localized.
 // E.g. "Peter (author)".
 //
 // Optional. Defaults are the category identifiers
 // from the Picky server.
 //
 explanations: {
 en:{
 title: 'titled',
 author: 'written by',
 isbn: 'ISBN-13',
 year: 'published in',
 publisher: 'published by',
 subjects: 'topics'
 }
 }
});

// An initial search text.
//
pickyClient.insert('initial search text');</pre></code>
<p>
And that's basically it. Wish you great success!
</p>
</div>
<div class="grid_1">
<h3>Usage (Become a Picky master)</h3>
<p>1. An asterisk (*) makes picky search for a partial hit. (If the index supports that)</p>
<code><pre>part*</pre></code>
<p>also finds partial, party, partogenesology.</p>
<p>2. The last word in a query is always partially searched.</p>
<code><pre>my beautiful query</pre></code>
<p>is actually</p>
<code><pre>my beautiful query*</pre></code>
<p>3. Asterisk searches can be stopped.</p>
<code><pre>"part"</pre></code>
<p>only finds "part", and nothing else.</p>
<p>4. If you have defined a similarity index on a category, a tilde (~) will look for similar matches.</p>
<code><pre>my beoootiful~ query</pre></code>
<p>will also find your "beautiful" query.</p>
<p>5. Qualifiers can be used with a colon (:)</p>
<code><pre>title:ulysses author:joyce</pre></code>
<p>will narrow the search space to complex novels.</p>
<p>6. The above options can be combined.</p>
<code><pre>name:flurion~ hank*</pre></code>
<p>will find me.</p>
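<p>
To summarize the six rules, here is a toy parser in plain Ruby that turns a query into tokens. It is only an illustration and not Picky's tokenizer: the Token struct and parse_query are hypothetical, quoted phrases containing spaces are not handled, and it assumes that a quoted or tilde word is never searched partially.
</p>
<code><pre>Token = Struct.new(:qualifier, :text, :partial, :similar)

def parse_query(query)
  words = query.split
  words.each_with_index.map do |word, index|
    # "title:ulysses" splits into qualifier and text (rule 5).
    qualifier, text = word.include?(':') ? word.split(':', 2) : [nil, word]

    quoted  = text.start_with?('"')    # rule 3: quotes stop partial search
    similar = text.end_with?('~')      # rule 4: tilde asks for similar words
    last    = index == words.size - 1  # rule 2: last word is partial

    # Assumption: a quoted or tilde word is never searched partially.
    partial = if quoted || similar
                false
              else
                text.end_with?('*') || last  # rules 1 and 2
              end

    Token.new(qualifier, text.delete('"~*'), partial, similar)
  end
end

parse_query('name:flurion~ hank*').each { |token| p token.to_a }</pre></code>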
<p>That is all, young grasshopper. Be on your way :)</p>
</div>
</div>
<div class="license">
Logos and all images are
<a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a>
licensed to Florian Hanke.
</div>
<div class="footer"></div>
</body>
</html>