Mega Enhancement: Search #28

Closed
sbrl opened this issue Oct 25, 2015 · 6 comments
Labels
enhancement Let's make it better!
Comments

@sbrl
Owner

sbrl commented Oct 25, 2015

The current search facility only directs the user to a specific page. It would be nice if it actually searched the site when it can't find an exact match. Perhaps an upgrade is in order?

Firstly we should do some research on how to create a search engine in PHP which doesn't take ages to find anything.

@sbrl
Owner Author

sbrl commented Oct 26, 2015

Basically, we need to construct an 'index' of sorts that contains the most common words on every page. We could include this in the page index, but the worry here is that this will bloat the size of the page index, increasing the time it takes for the pageindex to be read in and parsed.
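
As a rough illustration of the "most common words on every page" idea, here is a minimal sketch. It is written in Python for brevity rather than PHP, and `top_terms`, the tokenising regex and the sample pages are all made up for this example, not taken from the project:

```python
import re
from collections import Counter

def top_terms(text, limit=5):
    """Return the `limit` most frequent lowercase words in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [word for word, _ in Counter(words).most_common(limit)]

# Hypothetical page contents, purely for illustration.
pages = {
    "Main Page": "Welcome to the wiki. The wiki is small.",
    "Search": "Search the wiki for a page by name.",
}

# One short word list per page; storing this inside the page index is
# what would make that file grow with every page added.
word_index = {name: top_terms(body, 3) for name, body in pages.items()}
```

Even with a small `limit`, the list is stored once per page, which is the bloat concern raised above.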

@sbrl sbrl added the enhancement Let's make it better! label Oct 26, 2015
@sbrl
Owner Author

sbrl commented Oct 26, 2015

We might want to look into an inverted index. This apparently allows for 'fast full text searches'.

Information here:

@sbrl
Owner Author

sbrl commented Oct 26, 2015

In order to use an inverted index though, we will have to assign an id to every page. This is so that the inverted index remains small and easy to search.

We will have to come up with an algorithm to assign a short unique id to every page.
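
To make the idea concrete, here is a minimal sketch of an inverted index keyed by short numeric page ids. It is in Python for illustration (the wiki itself is PHP). The sequential counter used for ids, the function names and the sample pages are assumptions for this sketch, not the algorithm the project settled on:

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_inverted_index(pages):
    """Return (page_ids, index).

    page_ids maps page name -> small integer id (assumed scheme: a counter).
    index maps term -> {page id -> list of token offsets in that page}.
    """
    page_ids = {}
    index = {}
    for name, body in pages.items():
        pid = page_ids.setdefault(name, len(page_ids))
        for offset, term in enumerate(tokenize(body)):
            index.setdefault(term, {}).setdefault(pid, []).append(offset)
    return page_ids, index

def search(index, query):
    """Return the ids of pages that contain every term in the query."""
    postings = [set(index.get(term, {})) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()

# Hypothetical pages, purely for illustration.
pages = {"Fruit": "red apple", "Baking": "green apple pie"}
page_ids, index = build_inverted_index(pages)
print(search(index, "green apple"))  # {1}
```

Because each posting stores an integer rather than a full page name, the index stays compact however long the page titles are.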

@sbrl
Owner Author

sbrl commented Oct 27, 2015

Perhaps we could have a single pageids.json file that maps ids onto page names and back again. That would avoid us having to touch the pageindex, and would also allow us to make the id <---> page name translation process completely transparent.
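
A minimal sketch of such a two-way mapping, again in Python for illustration only. The `PageIds` class, its method names and the JSON layout (`{"0": "Main Page", ...}`) are assumptions for this example and are not taken from the project's code:

```python
import json
import os
import tempfile

class PageIds:
    """Two-way id <-> page name map persisted to a single JSON file."""

    def __init__(self, path):
        self.path = path
        self.id_to_name = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                # JSON object keys are always strings, so convert them back.
                self.id_to_name = {int(k): v for k, v in json.load(f).items()}
        self.name_to_id = {v: k for k, v in self.id_to_name.items()}

    def get_id(self, name):
        """Return the id for a page name, assigning a new one if needed."""
        if name not in self.name_to_id:
            new_id = max(self.id_to_name, default=-1) + 1
            self.id_to_name[new_id] = name
            self.name_to_id[name] = new_id
            self.save()
        return self.name_to_id[name]

    def get_name(self, page_id):
        """Return the page name for an id, or None if the id is unknown."""
        return self.id_to_name.get(page_id)

    def save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.id_to_name, f)

# Usage with a throwaway file so the example leaves nothing behind.
ids_path = os.path.join(tempfile.mkdtemp(), "pageids.json")
ids = PageIds(ids_path)
main_id = ids.get_id("Main Page")
```

Callers only ever see `get_id` and `get_name`, so the page index never needs to know the ids exist.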

@sbrl
Owner Author

sbrl commented Nov 1, 2015

We now have search! It isn't yet complete though. I have a few things that I think we really ought to deal with before closing this bug.

  • Allow searching by tag
  • Allow searching by title

It might also be smart to give tags and titles a weighting. What I mean by this is that occurrences of a query term in the title or tags should bump the rank by, say, 5 for titles and 2 or 3 for tags. This should be configurable, though.

  • Give a rank weighting to query terms found in the title / tags of a page
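
The weighting described above could look something like the sketch below (Python for illustration, not the project's PHP). The function name, the use of body term counts as the base rank and the default weights of 5 and 2 are assumptions drawn from the rough numbers suggested in this comment:

```python
# Configurable defaults, using the example values floated above.
TITLE_WEIGHT = 5
TAG_WEIGHT = 2

def rank_page(query_terms, title, tags, body_counts,
              title_weight=TITLE_WEIGHT, tag_weight=TAG_WEIGHT):
    """Score one page for a query.

    body_counts maps term -> number of occurrences in the page body.
    A term found in the title or tags adds a fixed bonus to the score.
    """
    title_words = set(title.lower().split())
    tag_set = {tag.lower() for tag in tags}
    score = 0
    for term in query_terms:
        term = term.lower()
        score += body_counts.get(term, 0)
        if term in title_words:
            score += title_weight
        if term in tag_set:
            score += tag_weight
    return score

# 3 body hits plus the title bonus of 5.
print(rank_page(["search"], "Search Help", ["help"], {"search": 3}))  # 8
```

Passing `title_weight` and `tag_weight` as parameters keeps the bonuses adjustable without touching the ranking logic.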

@sbrl
Owner Author

sbrl commented Nov 1, 2015

Done! We have a fully integrated search function.

Since it's a hugely complex system (it has many more moving parts than other parts of Pepperminty Wiki) it may well contain some bugs - please open an issue if you find one!

@sbrl sbrl closed this as completed Nov 1, 2015
@sbrl sbrl modified the milestone: v0.9 Nov 12, 2015