-
Notifications
You must be signed in to change notification settings - Fork 1.3k
/
bulk_data.json
262 lines (262 loc) · 13.4 KB
/
bulk_data.json
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
[
{
"pk": 1,
"model": "core.mocktag",
"fields": {
"name": "search_test"
}
},
{
"pk": 1,
"model": "core.mockmodel",
"fields": {
"author": "daniel1",
"foo": "Registering indexes in Haystack is very similar to registering models and ``ModelAdmin`` classes in the `Django admin site`_. If you want to override the default indexing behavior for your model you can specify your own ``SearchIndex`` class. This is useful for ensuring that future-dated or non-live content is not indexed and searchable. Our ``Note`` model has a ``pub_date`` field, so let's update our code to include our own ``SearchIndex`` to exclude indexing future-dated notes:",
"pub_date": "2009-06-18 06:00:00",
"tag": 1
}
},
{
"pk": 2,
"model": "core.mockmodel",
"fields": {
"author": "daniel2",
"foo": "In addition, you may specify other fields to be populated along with the document. In this case, we also index the user who authored the document as well as the date the document was published. The variable you assign the SearchField to should directly map to the field your search backend is expecting. You instantiate most search fields with a parameter that points to the attribute of the object to populate that field with.",
"pub_date": "2009-07-17 00:30:00",
"tag": 1
}
},
{
"pk": 3,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "Every custom ``SearchIndex`` requires there be one and only one field with ``document=True``. This is the primary field that will get passed to the backend for indexing. For this field, you'll then need to create a template at ``search/indexes/myapp/note_text.txt``. This allows you to customize the document that will be passed to the search backend for indexing. A sample template might look like:",
"pub_date": "2009-06-18 08:00:00",
"tag": 1
}
},
{
"pk": 4,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "The exception to this is the TemplateField class. This take either no arguments or an explicit template name to populate their contents. You can find more information about them in the SearchIndex API reference.",
"pub_date": "2009-07-17 01:30:00",
"tag": 1
}
},
{
"pk": 5,
"model": "core.mockmodel",
"fields": {
"author": "daniel1",
"foo": "This will pull in the default URLconf for Haystack. It consists of a single URLconf that points to a SearchView instance. You can change this class’s behavior by passing it any of several keyword arguments or override it entirely with your own view.",
"pub_date": "2009-07-17 02:30:00",
"tag": 1
}
},
{
"pk": 6,
"model": "core.mockmodel",
"fields": {
"author": "daniel1",
"foo": "This will create a default SearchIndex instance, search through all of your INSTALLED_APPS for search_indexes.py and register all SearchIndexes with the default SearchIndex. If autodiscovery and inclusion of all indexes is not desirable, you can manually register models in the following manner:",
"pub_date": "2009-07-17 03:30:00",
"tag": 1
}
},
{
"pk": 7,
"model": "core.mockmodel",
"fields": {
"author": "daniel1",
"foo": "The SearchBackend class handles interaction directly with the backend. The search query it performs is usually fed to it from a SearchQuery class that has been built for that backend. This class must be at least partially implemented on a per-backend basis and is usually accompanied by a SearchQuery class within the same module.",
"pub_date": "2009-07-17 04:30:00",
"tag": 1
}
},
{
"pk": 8,
"model": "core.mockmodel",
"fields": {
"author": "daniel2",
"foo": "Takes a query to search on and returns dictionary. The query should be a string that is appropriate syntax for the backend. The returned dictionary should contain the keys ‘results’ and ‘hits’. The ‘results’ value should be an iterable of populated SearchResult objects. The ‘hits’ should be an integer count of the number of matched results the search backend found. This method MUST be implemented by each backend, as it will be highly specific to each one.",
"pub_date": "2009-07-17 05:30:00",
"tag": 1
}
},
{
"pk": 9,
"model": "core.mockmodel",
"fields": {
"author": "daniel1",
"foo": "The SearchQuery class acts as an intermediary between SearchQuerySet‘s abstraction and SearchBackend‘s actual search. Given the metadata provided by SearchQuerySet, SearchQuery build the actual query and interacts with the SearchBackend on SearchQuerySet‘s behalf. This class must be at least partially implemented on a per-backend basis, as portions are highly specific to the backend. It usually is bundled with the accompanying SearchBackend.",
"pub_date": "2009-07-17 06:30:00",
"tag": 1
}
},
{
"pk": 10,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "Most people will NOT have to use this class directly. SearchQuerySet handles all interactions with SearchQuery objects and provides a nicer interface to work with. Should you need advanced/custom behavior, you can supply your version of SearchQuery that overrides/extends the class in the manner you see fit. SearchQuerySet objects take a kwarg parameter query where you can pass in your class.",
"pub_date": "2009-07-17 07:30:00",
"tag": 1
}
},
{
"pk": 11,
"model": "core.mockmodel",
"fields": {
"author": "daniel1",
"foo": "The SearchQuery object maintains a list of QueryFilter objects. Each filter object supports what field it looks up against, what kind of lookup (i.e. the __’s), what value it’s looking for and if it’s a AND/OR/NOT. The SearchQuery object’s “build_query” method should then iterate over that list and convert that to a valid query for the search backend.",
"pub_date": "2009-07-17 08:30:00",
"tag": 1
}
},
{
"pk": 12,
"model": "core.mockmodel",
"fields": {
"author": "daniel2",
"foo": "The SearchSite provides a way to collect the SearchIndexes that are relevant to the current site, much like ModelAdmins in the admin app. This allows you to register indexes on models you don’t control (reusable apps, django.contrib, etc.) as well as customize on a per-site basis what indexes should be available (different indexes for different sites, same codebase).",
"pub_date": "2009-07-17 09:30:00",
"tag": 1
}
},
{
"pk": 13,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "If you need to narrow the indexes that get registered, you will need to manipulate a SearchSite. There are two ways to go about this, via either register or unregister. If you want most of the indexes but want to forgo a specific one(s), you can setup the main site via autodiscover then simply unregister the one(s) you don’t want.:",
"pub_date": "2009-07-17 10:30:00",
"tag": 1
}
},
{
"pk": 14,
"model": "core.mockmodel",
"fields": {
"author": "daniel2",
"foo": "The SearchIndex class allows the application developer a way to provide data to the backend in a structured format. Developers familiar with Django’s Form or Model classes should find the syntax for indexes familiar. This class is arguably the most important part of integrating Haystack into your application, as it has a large impact on the quality of the search results and how easy it is for users to find what they’re looking for. Care and effort should be put into making your indexes the best they can be.",
"pub_date": "2009-07-17 11:30:00",
"tag": 1
}
},
{
"pk": 15,
"model": "core.mockmodel",
"fields": {
"author": "daniel2",
"foo": "Unlike relational databases, most search engines supported by Haystack are primarily document-based. They focus on a single text blob which they tokenize, analyze and index. When searching, this field is usually the primary one that is searched. Further, the schema used by most engines is the same for all types of data added, unlike a relational database that has a table schema for each chunk of data. It may be helpful to think of your search index as something closer to a key-value store instead of imagining it in terms of a RDBMS.",
"pub_date": "2009-07-17 12:30:00",
"tag": 1
}
},
{
"pk": 16,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "Common uses include storing pertinent data information, categorizations of the document, author information and related data. By adding fields for these pieces of data, you provide a means to further narrow/filter search terms. This can be useful from either a UI perspective (a better advanced search form) or from a developer standpoint (section-dependent search, off-loading certain tasks to search, et cetera).",
"pub_date": "2009-07-17 13:30:00",
"tag": 1
}
},
{
"pk": 17,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "Most search engines that were candidates for inclusion in Haystack all had a central concept of a document that they indexed. These documents form a corpus within which to primarily search. Because this ideal is so central and most of Haystack is designed to have pluggable backends, it is important to ensure that all engines have at least a bare minimum of the data they need to function.",
"pub_date": "2009-07-17 14:30:00",
"tag": 1
}
},
{
"pk": 18,
"model": "core.mockmodel",
"fields": {
"author": "daniel1",
"foo": "As a result, when creating a SearchIndex, at least one field must be marked with document=True. This signifies to Haystack that whatever is placed in this field while indexing is to be the primary text the search engine indexes. The name of this field can be almost anything, but text is one of the more common names used.",
"pub_date": "2009-07-17 15:30:00",
"tag": 1
}
},
{
"pk": 19,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "One shortcoming of the use of search is that you rarely have all or the most up-to-date information about an object in the index. As a result, when retrieving search results, you will likely have to access the object in the database to provide better information. However, this can also hit the database quite heavily (think .get(pk=result.id) per object). If your search is popular, this can lead to a big performance hit. There are two ways to prevent this. The first way is SearchQuerySet.load_all, which tries to group all similar objects and pull them though one query instead of many. This still hits the DB and incurs a performance penalty.",
"pub_date": "2009-07-17 16:30:00",
"tag": 1
}
},
{
"pk": 20,
"model": "core.mockmodel",
"fields": {
"author": "daniel2",
"foo": "The other option is to leverage stored fields. By default, all fields in Haystack are both indexed (searchable by the engine) and stored (retained by the engine and presented in the results). By using a stored field, you can store commonly used data in such a way that you don’t need to hit the database when processing the search result to get more information.",
"pub_date": "2009-07-17 17:30:00",
"tag": 1
}
},
{
"pk": 21,
"model": "core.mockmodel",
"fields": {
"author": "daniel2",
"foo": "For example, one great way to leverage this is to pre-rendering an object’s search result template DURING indexing. You define an additional field, render a template with it and it follows the main indexed record into the index. Then, when that record is pulled when it matches a query, you can simply display the contents of that field, which avoids the database hit.:",
"pub_date": "2009-07-17 18:30:00",
"tag": 1
}
},
{
"pk": 22,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "However, sometimes, even more control over what gets placed in your index is needed. To facilitate this, SearchIndex objects have a ‘preparation’ stage that populates data just before it is indexed. You can hook into this phase in several ways. This should be very familiar to developers who have used Django’s forms before as it loosely follows similar concepts, though the emphasis here is less on cleansing data from user input and more on making the data friendly to the search backend.",
"pub_date": "2009-07-17 19:30:00",
"tag": 1
}
},
{
"pk": 23,
"model": "core.mockmodel",
"fields": {
"author": "daniel3",
"foo": "Each SearchIndex gets a prepare method, which handles collecting all the data. This method should return a dictionary that will be the final data used by the search backend. Overriding this method is useful if you need to collect more than one piece of data or need to incorporate additional data that is not well represented by a single SearchField. An example might look like:",
"pub_date": "2009-07-17 20:30:00",
"tag": 1
}
},
{
"pk": 1,
"model": "core.anothermockmodel",
"fields": {
"author": "daniel3",
"pub_date": "2009-07-17 21:30:00"
}
},
{
"pk": 2,
"model": "core.anothermockmodel",
"fields": {
"author": "daniel3",
"pub_date": "2009-07-17 22:30:00"
}
},
{
"pk": 1,
"model": "core.ScoreMockModel",
"fields": {
"score": "42"
}
}
]