Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for full text search indexes #203

Open
elliotf opened this Issue Mar 15, 2014 · 30 comments

Comments

Projects
None yet
@elliotf
Copy link
Contributor

elliotf commented Mar 15, 2014

Postgres, sqlite, and MySQL all support some form of full-text index. It would be useful to support generating schema for and querying based on these indexes.

This does not seem like a trivial feature, as the implementations are very different for index creation requirements (sqlite appears to be at the table-level, MySQL only supports it on MyISAM tables) and querying.

That said, this issue could at least be used to track interest for such a feature.

References:

Example index creation and querying:

  • PostgreSQL:
CREATE INDEX <INDEX NAME> ON <TABLE NAME> USING gin(to_tsvector(<COLUMN NAME>));

SELECT * FROM <TABLE NAME> WHERE <COLUMN NAME> @@ to_tsquery(<INPUT>);
  • MySQL:
CREATE FULLTEXT INDEX <INDEX NAME> ON <TABLE NAME>;

SELECT * FROM <TABLE NAME> WHERE MATCH <COLLUMN NAME> AGAINST <INPUT>;
  • sqlite:
CREATE VIRTUAL TABLE <TABLE> USING fts3(<COLUMN NAME> TEXT);

SELECT * FROM <TABLE NAME> WHERE <COLUMN NAME> MATCH <INPUT>;

@tgriesser tgriesser added the feature label Mar 18, 2014

@nfour

This comment has been minimized.

Copy link
Contributor

nfour commented Mar 19, 2014

Seconded of course, I thought it might have already been supported as it seems kinda important.

@tgriesser

This comment has been minimized.

Copy link
Owner

tgriesser commented Mar 19, 2014

Will be supported in the next minor release, I'm shooting for the end of March for that.

@elliotf

This comment has been minimized.

Copy link
Contributor Author

elliotf commented Mar 19, 2014

@tgriesser wow, that's unexpected but awesome. Thank you!

@ErisDS

This comment has been minimized.

Copy link
Contributor

ErisDS commented Apr 22, 2014

@tgriesser Did this actually happen?

@tgriesser

This comment has been minimized.

Copy link
Owner

tgriesser commented Apr 22, 2014

It's not released yet. Let me go though and make sure this got added though, trying to close out almost every remaining ticket here with the new version. I know you can now rename column and change types :)

@ErisDS

This comment has been minimized.

Copy link
Contributor

ErisDS commented Apr 22, 2014

Oooooooh exciting! 🎉 cc/ @sebgie

@tgriesser

This comment has been minimized.

Copy link
Owner

tgriesser commented Apr 30, 2014

@ErisDS so I'm looking at this again, just wanted to make sure how I was thinking about going about it would work... was thinking you'd do t.text('colName').fulltext() or t.fulltext(columnName)... just wanted to make sure you were aware though that in sqlite3 you lose the ability to do any other indexing on the table and the ability to add columns. Presumably, that functionality will be wanted, so I'll have to add it to the sqlite3 ddl stuff, but just wanted to point it out.

@ErisDS

This comment has been minimized.

Copy link
Contributor

ErisDS commented May 1, 2014

just wanted to make sure you were aware though that in sqlite3 you lose the ability to do any other indexing on the table and the ability to add columns. Presumably, that functionality will be wanted, so I'll have to add it to the sqlite3 ddl stuff, but just wanted to point it out.

You mean you lose ability to add columns normally, and have to do the whole table-copy thing?

@tgriesser

This comment has been minimized.

Copy link
Owner

tgriesser commented May 1, 2014

Yep. Seems to be the story with pretty much everything around modifying sqlite, except now it's even on adding columns.

@ErisDS

This comment has been minimized.

Copy link
Contributor

ErisDS commented May 1, 2014

Wooohoooooo 🎈 ... lol

In Ghost - there will be the post markdown, the title, and perhaps the tags names that we'll want to do FTS on I think to start with... not sure if that helps.

@ericclemmons

This comment has been minimized.

Copy link

ericclemmons commented May 8, 2014

I just discovered FULLTEXT ... MATCH in MySQL (such a nub), spent an hour trying to force a fullText: function(name) { this.isFullText = name || true; return this; } into SchemaBuilder then tried to shoehorn it into SchemaGrammar without scucess, then I found this thread ;)

Need some funding for 0.6 @tgriesser? I dunno how you've done this much already :)

@MetaMemoryT

This comment has been minimized.

Copy link

MetaMemoryT commented Jul 17, 2014

👍

@dwstevens

This comment has been minimized.

Copy link

dwstevens commented Apr 3, 2015

@tgriesser @ErisDS For sqllite and FTS maybe following this pattern would work? http://peewee.readthedocs.org/en/latest/peewee/playhouse.html?highlight=full%20text#FTSModel They create a separate table specifically to use in the full text search index. That way they avoid changing all columns to text in the main table.

@ErisDS

This comment has been minimized.

Copy link
Contributor

ErisDS commented Apr 16, 2015

@dwstevens that could work, not sure where we are on FTS here. Bookshelf supports plugins, perhaps something like this could be done with a plugin and knex raw? Would be an awesome thing for someone to have a play with and see if they could get a working version together.

@ErisDS

This comment has been minimized.

Copy link
Contributor

ErisDS commented May 24, 2015

I've just raised a discussion issue on Ghost about implementing search, it's linked here. It's quite literally the oldest issue on our repository and we desperately need to move forward with it.

@tgriesser what's the status here in knex? Are there any plans? We could really use your input.

@bretmattingly

This comment has been minimized.

Copy link

bretmattingly commented Jul 23, 2015

@ErisDS Our solution with Sequelize was to use raw Sequelize queries for our FT indexes. Now that I'm trying to switch to a DAO pattern and using Knex, I'll likely implement something similar. It's not as elegant as having built-in support, but it'll do.

@tgriesser I'd like to help if I can!

@ErisDS

This comment has been minimized.

Copy link
Contributor

ErisDS commented Jul 23, 2015

@bomattin I think most people are using raw queries, but are only having to support a single DB, where we need to have support for Sqlite3, MySQL & pg. Are you planning to write something generalised?

@bretmattingly

This comment has been minimized.

Copy link

bretmattingly commented Jul 24, 2015

@ErisDS I'm going to start looking into it. Unfortunately most of my programming time for the next week or so is going to be building my DAOs and trying to sell my fellow devs on it. I'm taking a look at the Knex codebase when I can so I can try to dive in and do it right quickly.

@s-stude

This comment has been minimized.

Copy link

s-stude commented Aug 9, 2015

Hello! Any luck on this?

@bretmattingly

This comment has been minimized.

Copy link

bretmattingly commented Aug 9, 2015

@s-stude No movement on my end, unfortunately. This is what I was using for the time being:

Users.find = function(params, options) {
    return new Promise(function(resolve, reject){
        var options = options? options : {};
        var whereclause;
        var lim = options.limit? options.limit : 65536; // Need a valid number here? 50 isn't a bad default.
        knex.select(cols.user)
            .from('users')
            .orWhereRaw('MATCH(firstname_preferred,lastname_preferred,username) AGAINST(? IN BOOLEAN MODE)', params.search)
            //.orWhereRaw('MATCH(groups.name) AGAINST(? IN BOOLEAN MODE)', '+matt*')
            .limit(lim)
            .then(function(userresults){
                resolve(new Users(userresults))
            })
            .catch(function(err){
                reject(err)
            });
    })
};
@hdzidic

This comment has been minimized.

Copy link

hdzidic commented Dec 14, 2015

Any updates here?

This is what I'm using as a workaround:

query.whereRaw('to_tsvector(parts.description) || to_tsvector(cars.engine_name)
        || to_tsvector(makes.name) || to_tsvector(models.name)
        || to_tsvector(part_types.name) || to_tsvector(part_types.description)
        || to_tsvector(part_categories.name)
        || to_tsvector(part_manufacturers.name) @@ to_tsquery(?)',req.query.search);
@cbrunnkvist

This comment has been minimized.

Copy link

cbrunnkvist commented Apr 26, 2016

I looked at http://knexjs.org/#Schema-index and missed the little detail about "index type is only supported on PostgreSQL", so I ran the following as part of my migration:

return knex.schema.table('host', t => {
  t.index('name', 'name_idx', 'FULLTEXT') # <<< I wish
})

resulting .debug() output against mysql:

[ { sql: 'alter table `host` add index name_idx(`name`)',
    bindings: [] } ]

where I suppose it should have said add FULLTEXT index... Is there anything in particular that is blocking the Type argument from being used for the mysql dialect?

@yamikuronue

This comment has been minimized.

Copy link

yamikuronue commented Sep 19, 2016

What happened with this? Is there a holdup? Can I be of help?

@elhigu

This comment has been minimized.

Copy link
Collaborator

elhigu commented Sep 20, 2016

@yamikuronue sure, looks like major databases support this, so common API to create index would be nice. Pull requests are welcome if you like to implement this.

@yamikuronue

This comment has been minimized.

Copy link

yamikuronue commented Sep 20, 2016

Was any work already done I can build on, or should I start over from scratch? I'm not sure I'll be able to do it, of course, but I'm willing to take a look at least

Repository owner deleted a comment from jocull Jul 18, 2017

Repository owner deleted a comment from AlexRex Jul 18, 2017

Repository owner deleted a comment from seeden Jul 18, 2017

Repository owner deleted a comment from mitchellporter Jul 18, 2017

Repository owner deleted a comment from mazahell Jul 18, 2017

@elhigu

This comment has been minimized.

Copy link
Collaborator

elhigu commented Jul 18, 2017

removed flood +1 and status update request comments, please use thumbs for voting

EDIT: if there are no new messages in this feed, then there is nothing new to tell

@Palisand

This comment has been minimized.

Copy link

Palisand commented Jun 25, 2018

So how can this be done in the meantime in one top-level invocation (i.e. createTable) without having to use raw for the entire CREATE TABLE statement? Or must we createTable, then:

knex.raw('alter table `table` add fulltext (`col`)');

?

This actually isn't all that annoying. I keep all my schemas in one directory, with file names corresponding to table names. So for some table foo that does not require any unsupported statements:

// foo.js

module.exports = function (table) {
  table.string(...);
};

and for some table bar that requires a FULLTEXT index:

// bar.js

module.exports = {
  builder: function (table) {
    table.string(...);
  },
  raw: function () {
    return knex.raw('alter table `bar` add fulltext (`col`)');
  }
};

Then I just:

function createTable(table) {
  const schema = require('schemas/' + table);
  return (
    typeof schema === 'function'
      ? db.schema.createTable(table, schema)
      : db.schema.createTable(table, schema.builder)
        .then(() => schema.raw())
  ).catch(err => {
    logError(err.message);
  })
}
@rapodaca

This comment has been minimized.

Copy link

rapodaca commented Sep 18, 2018

This issue was the top hit for many of the searches I did, but didn't address my question. For those who also end up here for the same reason...

Postgres users can add a full text index to a tsvector column using the third argument to table.index. For example:

exports.up = (knex) => {
  return knex.schema.createTable('foo', (table) => {
    table.increments('id');
    table.specificType('fulltext', 'tsvector');

    table.index('fulltext', null, 'gin');
  );
);

Also see:

https://knexjs.org/#Schema-index

and

https://stackoverflow.com/questions/45871474/how-to-add-gin-index-using-knex-js

@kibertoad

This comment has been minimized.

Copy link
Collaborator

kibertoad commented Oct 21, 2018

@rapodaca Thank you for your contribution, added this information to the wiki!

@montera82

This comment has been minimized.

Copy link

montera82 commented Jan 31, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.