Mega Enhancement: Search #28

sbrl · 2015-10-25T16:21:20Z

The current search facility only directs the user to a specific page. It would be nice if it actually searched the site when it can't find an exact match. Perhaps an upgrade is in order?

First, we should do some research on how to build a search engine in PHP that doesn't take ages to find anything.

sbrl · 2015-10-26T06:30:04Z

Basically, we need to construct an 'index' of sorts that contains the most common words on every page. We could include this in the page index, but the worry here is that this will bloat the size of the page index, increasing the time it takes for the pageindex to be read in and parsed.

sbrl · 2015-10-26T09:02:05Z

We might want to look into an inverted index. This apparently allows for 'fast full text searches'.
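To make the idea concrete, here is a minimal sketch of an inverted index. It is written in Python purely for illustration (not the PHP we would actually ship), and the names `tokenize`, `build_inverted_index` and `search`, the naive tokeniser, and the sample pages are all assumptions rather than anything in the codebase.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (naive tokeniser)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(pages):
    """Map each word to the set of page names that contain it.

    `pages` is a dict of page name -> page source (hypothetical structure).
    """
    index = defaultdict(set)
    for name, source in pages.items():
        for word in tokenize(source):
            index[word].add(name)
    return index

def search(index, query):
    """Return the set of pages that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        # Intersect so that only pages matching all terms survive
        results &= index.get(term, set())
    return results

# Made-up sample pages, just to exercise the functions
pages = {
    "Main Page": "Welcome to the wiki",
    "Search": "How the wiki search works",
}
index = build_inverted_index(pages)
```

The point is that a lookup only touches the entries for the query terms, instead of scanning the source of every page on each search.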

Information here:

sbrl · 2015-10-26T09:11:31Z

In order to use an inverted index though, we will have to assign an id to every page. Storing short ids instead of full page names keeps the inverted index small and easy to search.

We will have to come up with an algorithm to assign a short unique id to every page.

sbrl · 2015-10-27T07:49:13Z

Perhaps we could have a single pageids.json file that maps ids onto page names and back again. That would avoid us having to touch the pageindex, and would also allow us to make the id <---> page name translation process completely transparent.
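A rough sketch of how such a file could work, again in Python for illustration only. The `PageIds` class, its method names, and the "next free integer" id scheme are all hypothetical; the only thing taken from the discussion above is the idea of a single JSON file that translates in both directions.

```python
import json
import os

class PageIds:
    """Bidirectional page id <-> page name map, persisted as a JSON file."""

    def __init__(self, path):
        self.path = path
        self.id_to_name = {}
        self.name_to_id = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                # JSON object keys are always strings, so convert them back to ints
                self.id_to_name = {int(k): v for k, v in json.load(handle).items()}
            self.name_to_id = {v: k for k, v in self.id_to_name.items()}

    def get_id(self, name):
        """Return the id for a page name, assigning the next free integer if needed."""
        if name not in self.name_to_id:
            new_id = max(self.id_to_name, default=-1) + 1
            self.id_to_name[new_id] = name
            self.name_to_id[name] = new_id
            self.save()
        return self.name_to_id[name]

    def get_name(self, page_id):
        """Return the page name for an id, or None if the id is unknown."""
        return self.id_to_name.get(page_id)

    def save(self):
        """Write the id -> name map to disk (int keys are stored as strings)."""
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.id_to_name, handle)
```

Callers would only ever go through `get_id()` and `get_name()`, so the translation stays transparent and the pageindex is never touched.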

sbrl · 2015-11-01T14:34:16Z

We now have search! It isn't yet complete though. I have a few things that I think we really ought to deal with before closing this bug.

- Allow searching by tag
- Allow searching by title

It might also be smart to give tags and titles a weighting: an occurrence of a query term in the title or tags should bump the rank by, say, 5 for titles and 2 or 3 for tags. This should be configurable, though.

- Give a rank weighting to query terms found in the title / tags of a page
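The weighting could look something like the sketch below. It is Python for illustration rather than the real PHP, and `rank_page`, its parameters, and the "1 point per occurrence in the body" baseline are assumptions; the weights of 5 and 2 are just the example figures floated above.

```python
import re

# Configurable weights; 5 and 2 are only the example figures from the discussion
TITLE_WEIGHT = 5
TAG_WEIGHT = 2

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (naive tokeniser)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_page(query, title, tags, body,
              title_weight=TITLE_WEIGHT, tag_weight=TAG_WEIGHT):
    """Score a page for a query.

    Each occurrence of a query term in the body adds 1. A term that also
    appears in the title or in any tag adds the corresponding weight once.
    """
    title_words = set(tokenize(title))
    tag_words = {word for tag in tags for word in tokenize(tag)}
    body_words = tokenize(body)
    score = 0
    for term in set(tokenize(query)):
        score += body_words.count(term)
        if term in title_words:
            score += title_weight
        if term in tag_words:
            score += tag_weight
    return score
```

For example, a page titled "Search help", tagged "search", whose body mentions "search" twice would score 2 + 5 + 2 = 9 for the query "search".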

sbrl · 2015-11-01T15:07:40Z

Done! We have a fully integrated search function.

Since it's a hugely complex system (it has many more moving parts than other parts of Pepperminty Wiki) it ~~may~~ will contain some bugs - please open an issue if you find one!

sbrl added the enhancement Let's make it better! label Oct 26, 2015

sbrl mentioned this issue Nov 1, 2015

Adding new word to a page doesn't add it to the inverted index #29

Closed

sbrl closed this as completed Nov 1, 2015

sbrl modified the milestone: v0.9 Nov 12, 2015
