Search Posts API #306

Closed
ErisDS opened this issue Jul 22, 2013 · 18 comments

6 participants
@ErisDS
Member

commented Jul 22, 2013

Our browse posts API should accept a query option that performs a full-text search on the post model. The search should cover the title, content, and possibly the meta_* fields of a post.

At the moment, posts are self-contained, but in the near future we will be adding additional tables, such as tags/categories, which will also need to be searchable.

Search should return a paginated set of matching posts, using the existing settings for limit, offset etc.
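
For illustration only, here is a minimal sketch of what a browse call with a query option and limit/offset pagination could look like. The names `Post`, `BrowseOptions`, `BrowseResult` and `browsePosts` are hypothetical, not Ghost's actual API, and the matching is a naive in-memory substring scan over title and content rather than real full-text search.

```typescript
// Hypothetical shapes for illustration; these are not Ghost's real models.
interface Post {
  id: number;
  title: string;
  content: string;
}

interface BrowseOptions {
  query?: string;  // search term (assumed option name)
  limit?: number;  // page size
  offset?: number; // number of matching posts to skip
}

interface BrowseResult {
  posts: Post[]; // the requested page of matches
  total: number; // number of matches before pagination
}

function browsePosts(all: Post[], opts: BrowseOptions = {}): BrowseResult {
  const limit = opts.limit === undefined ? 15 : opts.limit; // assumed default
  const offset = opts.offset === undefined ? 0 : opts.offset;
  const needle = (opts.query || "").trim().toLowerCase();

  // With no query every post matches; otherwise match on title or content.
  const matches = needle
    ? all.filter(
        (p) =>
          p.title.toLowerCase().indexOf(needle) !== -1 ||
          p.content.toLowerCase().indexOf(needle) !== -1
      )
    : all;

  return { posts: matches.slice(offset, offset + limit), total: matches.length };
}
```

A call such as `browsePosts(posts, { query: "ghost", limit: 10, offset: 10 })` would then return the second page of matches along with the total match count.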

@ghost ghost assigned julesbravo Jul 22, 2013

@ErisDS ErisDS referenced this issue Aug 9, 2013

Closed

MySQL Support #364

@ErisDS

Member Author

commented Aug 21, 2013

Do you have any details as to what schema changes you need yet? If you do, perhaps push up some stuff to a fork so I can take a look? I am doing a bit of a schema audit is all.

@ErisDS

Member Author

commented Sep 8, 2013

@julesbravo Any further updates to this?

@julesbravo


commented Sep 8, 2013

Hannah, I've been slacking. I'll try to wrap it up Tuesday.


@ErisDS

Member Author

commented Sep 9, 2013

No probs, I've just been wondering whether to put this into the current schema migration or not. I think it might be worth waiting, purely because there are so many other changes ongoing.

@julesbravo


commented Sep 11, 2013

Hannah,

I've got what I think should work done, but I'm having issues around Knex.
I'm going to try to hop into IRC tomorrow to hopefully get some pointers.
I've just been having the damnedest time with this.


@ErisDS

Member Author

commented Oct 24, 2013

This work was started in #489, but now needs to be picked up by a developer willing to give it some serious love, including considering how it might be written as a Bookshelf or Knex plugin.

@halfdan

Member

commented Oct 25, 2013

@ErisDS Can you assign me to this one?

@ErisDS

Member Author

commented Oct 25, 2013

FYI: This is for 0.5, and there is a search branch to submit PRs to. I recommend looking at what was done so far in julesbravo@d9944de (not merged).

The big questions I have are:

  • Should this be broken down into smaller chunks? It seems like a big chunk.
  • Is it possible to do this as a plugin for Bookshelf, or at least in Knex? Otherwise it will need to be re-implemented in Ghost for each and every data store. People are already using SQLite, MySQL and even Postgres (although we don't officially support that yet), so that's a lot of code. Seems to me it should live elsewhere.

@Swaagie

Contributor

commented Oct 25, 2013

Caught a bit of the discussion on IRC. I think the search should be handled by something like https://github.com/olivernn/lunr.js. This would allow plugins to be written against an API, basically solving part of the puzzle. Lunr.js provides a tf-idf algorithm that allows documents to be ranked as well, so it doesn't simply list the posts that contain the word but also sorts them by relevance.

For reference, the search for the nodejitsu handbook is done with lunr.js. It's just a matter of pushing text/titles in at startup and letting lunr.js do its magic.
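
To make the tf-idf idea concrete, here is a rough, self-contained sketch of ranking documents by summed tf-idf over the query terms. It does not use the lunr.js API, and lunr.js's real scoring and tokenization differ in detail; `Doc`, `tokenize` and `rankByTfIdf` are hypothetical names for illustration.

```typescript
// Hypothetical document shape for illustration.
type Doc = { id: string; text: string };

// Very naive tokenizer: lowercase, split on anything that is not a-z or 0-9.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Rank documents for a query by summed tf-idf over the query terms.
// Documents that contain none of the terms are left out of the result.
function rankByTfIdf(docs: Doc[], query: string): { id: string; score: number }[] {
  const tokenized = docs.map((d) => ({ id: d.id, tokens: tokenize(d.text) }));
  const terms = tokenize(query);
  const results: { id: string; score: number }[] = [];

  for (const doc of tokenized) {
    let score = 0;
    for (const term of terms) {
      // Term frequency: share of this document's tokens equal to the term.
      const count = doc.tokens.filter((t) => t === term).length;
      const tf = count / (doc.tokens.length || 1);
      // Document frequency: how many documents contain the term at all.
      const df = tokenized.filter((d) => d.tokens.indexOf(term) !== -1).length;
      if (df === 0) continue;
      // Smoothed inverse document frequency (one of several common variants).
      const idf = Math.log(1 + docs.length / df);
      score += tf * idf;
    }
    if (score > 0) results.push({ id: doc.id, score });
  }

  // Highest score first.
  return results.sort((a, b) => b.score - a.score);
}
```

The point of the sketch is only the ordering: a short post that mentions the search term twice ranks above a longer post that mentions it once, and posts without the term are dropped.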

@ErisDS

Member Author

commented Oct 25, 2013

A bit like Solr, but much smaller and not as bright
http://lunrjs.com

👯

Excellent thanks 👍 Think this is probably well worth having a bit of a play with?

@halfdan

Member

commented Oct 25, 2013

The issue with lunrjs seems to be that all items have to be indexed on app startup and be kept in-memory during the lifetime of the application. This results in an increased memory footprint and scalability issues at some point (imagine indexing 5000 posts at app startup).

@Swaagie

Contributor

commented Oct 26, 2013

Agree with @halfdan there, we discussed this a bit over IRC. Using in-database FTS would be best, but not every database is as capable. For smaller blogs lunr.js will do just fine; however, the upfront indexing will indeed increase the memory footprint, reducing scalability. Perhaps lunr.js could be provided as a plugin, as an intermediate API for databases which do not support FTS.
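
As a rough sketch of that fallback idea: search could go through a small provider interface, using a database-backed provider when one is registered for the current client and an in-memory provider otherwise. Everything here (`SearchProvider`, `makeInMemoryProvider`, `pickProvider`, the client names) is hypothetical, and the in-memory provider is a naive scan standing in for a lunr.js-style index.

```typescript
// Hypothetical provider interface; not part of Ghost, Knex or Bookshelf.
interface SearchProvider {
  name: string;
  search(query: string): string[]; // returns ids of matching posts
}

// Fallback provider: naive in-memory substring scan over post text.
function makeInMemoryProvider(posts: { id: string; text: string }[]): SearchProvider {
  return {
    name: "in-memory",
    search(query: string): string[] {
      const needle = query.toLowerCase();
      return posts
        .filter((p) => p.text.toLowerCase().indexOf(needle) !== -1)
        .map((p) => p.id);
    },
  };
}

// Prefer a database-backed provider when one is registered for the client,
// otherwise fall back to the in-memory provider.
function pickProvider(
  client: string,
  dbProviders: { [client: string]: SearchProvider },
  fallback: SearchProvider
): SearchProvider {
  return Object.prototype.hasOwnProperty.call(dbProviders, client)
    ? dbProviders[client]
    : fallback;
}
```

With this shape, the per-database FTS code stays behind the interface, and stores without FTS support still get a working (if memory-hungry) search.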

@ErisDS ErisDS modified the milestones: Future, Multi-user Apr 15, 2014

@ErisDS ErisDS added api and removed data labels Apr 15, 2014

@ErisDS

Member Author

commented Jan 5, 2015

This issue is pretty old and floundering. We're looking for someone to take the lead! See: https://ghost.org/contribute/search-lead/

@seesharper


commented Jan 28, 2015

You might want to check out my new project that adds full text search to the Ghost platform.
It is rather simple and only supports SQLite, but it works :)
https://github.com/seesharper/GhostSearch

@dwstevens


commented Apr 3, 2015

I think having a standalone DB for the search index would be best. This way you don't have to deal with the fuzzy text search inconsistencies (or nonexistence) across the various DBMSs out there.

A possible option would be using a LevelDB backed index (file based key/value store) using the search-index module.

https://github.com/fergiemcdowall/search-index

Yes it adds another file to the mix that could grow quite large, but I believe it would have a lower memory footprint than Lunr.js.

@ErisDS

Member Author

commented Apr 3, 2015

@dwstevens Tell me if I'm wrong, but I really don't think levelDB is an option. As far as I am aware it'll add a new dependency that is far more complicated even than sqlite3 to get installed (and that has the wonderful node-pre-gyp feature), meaning it's not suitable for our user base.

@dwstevens


commented Apr 3, 2015

@ErisDS Ah, my bad. You are correct.

@ErisDS

Member Author

commented Oct 8, 2015

Closing this issue in favour of #5321 which has plenty of discussion & traction. Having 2 issues is just confusing at this point.

@ErisDS ErisDS closed this Oct 8, 2015

@ErisDS ErisDS removed the api label Oct 8, 2015

@ErisDS ErisDS removed this from the Future Backlog milestone Oct 8, 2015
