-
Notifications
You must be signed in to change notification settings - Fork 49
/
application.rb
250 lines (237 loc) · 8.19 KB
/
application.rb
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
# = Picky Applications
#
# A Picky Application is where you configure the whole search engine.
#
# This is a step-by-step description on how to configure your Picky app.
#
# Start by subclassing Application:
# class MyGreatSearch < Application
# # Your configuration goes here.
# end
# The generator
# $ picky generate unicorn_server project_name
# will generate an example <tt>project_name/app/application.rb</tt> file for you
# with some example code inside.
#
# == Index::Memory.new(name, source)
#
# Next, define where your data comes from. You use the <tt>Index::Memory.new</tt> method for that:
# my_index = Index::Memory.new :some_index_name, some_source
# You give the index a name (or identifier), and a source (see Sources), where its data comes from. Let's do that:
# class MyGreatSearch < Application
#
# books = Index::Memory.new :books, Sources::CSV.new(:title, :author, :isbn, file:'app/library.csv')
#
# end
# Now we have an index <tt>books</tt>.
#
# That on itself won't do much good.
#
# Note that a Redis index is also available: Index::Redis.new.
#
# == index_instance.define_category(identifier, options = {})
#
# Picky needs us to define categories on the data.
#
# Categories help your user find data.
# It's best if you look at an example yourself: http://floere.github.com/picky/examples.html
#
# Let's go ahead and define a category:
# class MyGreatSearch < Application
#
# books = Index::Memory.new :books, Sources::CSV.new(:title, :author, :isbn, file:'app/library.csv')
# books.define_category :title
#
# end
# Now we could already run the indexer:
# $ rake index
#
# (You can define similarity or partial search capabilities on a category, see http://github.com/floere/picky/wiki/Categories-configuration for info)
#
# So now we have indexed data (the title), but nobody to ask the index anything.
#
# == Query::Full.new(*indexes, options = {})
#
# We need somebody who asks the index (a Query object, also see http://github.com/floere/picky/wiki/Queries-Configuration). That works like this:
# full_books_query = Query::Full.new books
# Full just means that the ids are returned with the results.
# Picky also offers a Query that returns live results, Query::Live. But that's not important right now.
#
# Now we have somebody we can ask about the index. But no external interface.
#
# == route(/regexp1/ => query1, /regexp2/ => query2, ...)
#
# Let's add a URL path (a Route, see http://github.com/floere/picky/wiki/Routing-configuration) to which we can send our queries. We do that with the route method:
# route %r{^/books/full$} => full_books_query
# In full glory:
# class MyGreatSearch < Application
#
# books = index :books, Sources::CSV.new(:title, :author, :isbn, file:'app/library.csv')
# books.define_category :title
#
# route %r{^/books/full$} => Query::Full.new(books)
#
# end
# That's it!
#
# Now run the indexer and server:
# $ rake index
# $ rake start
# Run your first query:
# $ curl 'localhost:8080/books/full?query=hello server'
#
# Nice, right? Your first query!
#
# Maybe you don't find everything. We need to process the data before it goes into the index.
#
# == default_indexing(options = {})
#
# That's what the <tt>default_indexing</tt> method is for:
# default_indexing options
# Read more about the options here: http://github.com/floere/picky/wiki/Indexing-configuration
#
# Same thing with the search text – we need to process that as well.
#
# == default_querying(options = {})
#
# Analog to the default_indexing method, we use the <tt>default_querying</tt> method.
# default_querying options
# Read more about the options here: http://github.com/floere/picky/wiki/Querying-Configuration
#
# And that's all there is. It's incredibly powerful though, as you can combine, weigh, refine to the max.
#
# == Wiki
#
# Read more in the Wiki: http://github.com/floere/picky/wiki
#
# Have fun!
#
# == Full example
#
# Our example, fully fleshed out with indexing, querying, and weights:
# class MyGreatSearch < Application
#
# default_indexing removes_characters: /[^a-zA-Z0-9\.]/,
# stopwords: /\b(and|or|in|on|is|has)\b/,
# splits_text_on: /\s/,
# removes_characters_after_splitting: /\./,
# substitutes_characters_with: CharacterSubstituters::WestEuropean.new,
# normalizes_words: [
# [/(.*)hausen/, 'hn'],
# [/\b(\w*)str(eet)?/, 'st']
# ]
#
# default_querying removes_characters: /[^a-zA-Z0-9\s\/\-\,\&\"\~\*\:]/,
# stopwords: /\b(and|the|of|it|in|for)\b/,
# splits_text_on: /[\s\/\-\,\&]+/,
# removes_characters_after_splitting: /\./,
# substitutes_characters_with: CharacterSubstituters::WestEuropean.new,
# maximum_tokens: 4
#
# books = Index::Memory.new :books, Sources::CSV.new(:title, :author, :isbn, file:'app/library.csv')
# books.define_category :title,
# qualifiers: [:t, :title, :titre],
# partial: Partial::Substring.new(:from => 1),
# similarity: Similarity::Phonetic.new(2)
# books.define_category :author,
# partial: Partial::Substring.new(:from => -2)
# books.define_category :isbn
#
# query_options = { :weights => { [:title, :author] => +3, [:author, :title] => -1 } }
#
# route %r{^/books/full$} => Query::Full.new(books, query_options)
# route %r{^/books/live$} => Query::Live.new(books, query_options)
#
# end
# That's actually already a full-blown Picky App!
#
class Application
class << self
# API
#
# Returns a configured tokenizer that
# is used for indexing by default.
#
def default_indexing options = {}
Internals::Tokenizers::Index.default = Internals::Tokenizers::Index.new(options)
end
# Returns a configured tokenizer that
# is used for querying by default.
#
def default_querying options = {}
Internals::Tokenizers::Query.default = Internals::Tokenizers::Query.new(options)
end
# Create a new index for indexing and for querying.
#
# Parameters:
# * name: The identifier of the index. Used:
# - to identify an index (e.g. by you in Rake tasks).
# - in the frontend to describe which index a result came from.
# - index directory naming (index/development/the_identifier/<lots of indexes>)
# * source: The source the data comes from. See Sources::Base.
#
# Options:
# * result_identifier: Use if you'd like a different identifier/name in the results JSON than the name of the index.
#
# TODO Obsolete. Phase out.
#
def index name, source, options = {}
Index::Memory.new name, source, options
end
# Routes.
#
delegate :route, :root, :to => :rack_adapter
#
# API
# A Picky application implements the Rack interface.
#
# Delegates to its routing to handle a request.
#
def call env
rack_adapter.call env
end
def rack_adapter # :nodoc:
@rack_adapter ||= Internals::FrontendAdapters::Rack.new
end
# Finalize the subclass as soon as it
# has finished loading.
#
attr_reader :apps # :nodoc:
def initialize_apps # :nodoc:
@apps ||= []
end
def inherited app # :nodoc:
initialize_apps
apps << app
end
def finalize_apps # :nodoc:
initialize_apps
apps.each &:finalize
end
# Finalizes the routes.
#
def finalize # :nodoc:
check
rack_adapter.finalize
end
# Checks app for missing things.
#
# Warns if something is missing.
#
# TODO Good specs.
#
def check # :nodoc:
warnings = []
warnings << check_external_interface
puts "\n#{warnings.join(?\n)}\n\n" unless warnings.all? &:nil?
end
def check_external_interface
"WARNING: No routes defined for application configuration in #{self.class}." if rack_adapter.empty?
end
# TODO Add more info if possible.
#
def to_s # :nodoc:
"#{self.name}:\n#{rack_adapter}"
end
end
end