-
-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add search with ElasticSearch #82
Conversation
What about using a more explicit name for dir |
@asaunier This conflicted with the package |
8eb43bb
to
6fd3963
Compare
There are still 2-3 issues that I have to work on, but besides this PR would be ready for a review. |
Looks like an excellent work! One question: is it possible to adapt the indexed fileds depending on the doc type? I know that in v5 with Solr, all documents are not indexed the same way. For instance (not sure if I remember well) only titles are indexed for summits (and routes?) but titles and "descriptions" (content) are both indexed for articles. I don't know if we want to keep this distinction in v6 or if it's OK/better to indexed title/summary/description for all doc types. I can ask. Please note that "route" docs are special: we most likely need to get the routes associated to a given waypoint when searching the waypoint title. But the waypoint title is not embedded in the route name (especially when you have several WP associated to a route). For instance "Mont Blanc":
|
Yes, that would be possible.
I created an issue for this: #90 |
What remains on my todo list is the fine-tuning. I would like to do this at the same time I am integrating the search in the UI, because there might be changes to the ElasticSearch configuration required for the auto-completion. So, please review! :) |
(I have only looked at the 2 additional commits you have pushed after my last comments) Stripping the BBcode when indexing the contents actually makes sense. The full parser is here: https://github.com/c2corg/camptocamp.org/blob/master/plugins/sfPunBBCodeParserPlugin/lib/sfPunBBCodeParser.class.php#L36 Possible formatting is described here: Here is a simple example of formated contend:
An advanced one:
Climbing routes may also be a bit complicated because of the "pitches" list that might be used:
but perhaps it's OK only removing tags such as |
The '#'s are no problem, if I search for '##' I am getting no results. The main idea to remove the bbcodes was that if you for example search for "col" that you are not getting results where the bbcode "[col]" was used. About tags like " [[routes/53781|Arête des Bosses]]": I'd just keep them, because they add useful information. For example if you search for "53781" you'd also find the documents that contain a link to this route. What do you think? |
OK, that indeed makes sense. |
@gberaudo, do you also want to take a look? |
@tsauerwein, I will have a quick look now. |
return [] | ||
|
||
documents = DBSession.\ | ||
query(model).\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all fields loaded from the database?
A great enhancement would be to only load the fields which will be actually sent in the response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this would also be relevant for the listings. I was thinking about creating a separate model for the listings which only contains the fields needed for the listings. I will create an issue for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could use http://docs.sqlalchemy.org/en/latest/orm/loading_columns.html#load-only-cols
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this could also be an option.
This PR looks great to me. +1 for merging. |
to use the ES 2.1 instance on the dev server.
Add search with ElasticSearch
Thanks for your comments. |
/search
Paging(not required for the simple search, only for advanced search)closes #81