Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Case sensitive find #6258

Closed
mikermcneil opened this issue Jan 22, 2014 · 49 comments
Closed

Case sensitive find #6258

mikermcneil opened this issue Jan 22, 2014 · 49 comments

Comments

@mikermcneil
Copy link
Member

User.find({
  role: 'admin',
  name: { 
    equals: 'Gabe'
    caseSensitive: true
  }
})
@mikermcneil
Copy link
Member Author

Originally posted in Sails:
#152

@campbellanderson
Copy link
Contributor

Hey guys any advance on this?

Can look at forking and implementing but don't want to double up on someone else's work

We are running a decent sails setup and serving some serious requests per minute. We hit this issue and thought lets just throw more tin at it for now, but that didn't solve it.

Moved to Native queries and everything is fine. Downside is it steps away from the model domain with the result.

Let me know if you want to know more about our real world experiences

@particlebanana
Copy link
Contributor

@campbellanderson I'd love to hear about some of the issues you have run up against. I'm guessing the case sensitive issue is because you want to use indexes correct? I know in general regex matches are going to be slow but the big issue is with using indexes and strings I think.

One solution I had for querying indexed strings is to store a lowercased version and index it and then store the original value. When querying indexed strings you still get case insensitive matching AND the ability to use indexes. Then when Waterline returns the values just strip out the lowercased field. This is a bigger solution though and for now we could add a caseSensitive flag and just have it passed down to the adapter to use when building up the query.

I don't think anyone is currently working on this as far as I know so go for it! If you have any other ideas for implementation I'm open for whatever works.

Another quick solution, if you don't mind writing native queries (which will probably always be faster), is to pass the results of the query into a _model constructor. Something like http://stackoverflow.com/a/21695563/909625

@kai23
Copy link

kai23 commented Aug 25, 2014

Hey,

I'm currently searching for this feature too. I'm using a unique case sensitive hash in my database, and I use this hash to find the row. Is there any advance on it ?

@winrid
Copy link

winrid commented Sep 12, 2014

Queries should be case sensitive by default and maybe the option for case insensitive queries could be added. This is a serious drawback when developing any sort of mid/large size application and completely defeats the purpose of using Waterline with Mongo.

@paddyabc
Copy link

I think waterline is brilliant. However, the case-insensitive search can cause a lot of problem for user. I think, in the real case, over 80% query are case sensitive. The case insensitive query cause a very poor performance since it uses regular expression for searching.
In my case, I am a mongodb user. I have nearly 100,000,000 data, and I only want to do a simple query search on a string field, but I never get the result because the regular expression is used. I think if I want to use case insensitive search, I would like to build a full text search index on mongo collection rather than using a regular expression for searching....
In my opinion, it is better to use a flag indicating whether the field should be queried case insensitively in the Model Object rather than using the case insensitive query by default.

@particlebanana
Copy link
Contributor

Guys this comes from the legacy version of Waterline when it was used with MySQL only, which is case insensitive. The api has stuck as all string queries are case insensitive because we didn't want to break the api and force all the adapters to update on each version. I think we all realize that it's a large issue, especially when dealing with indexes on strings.

The original format Mike posted looks good which allows you to select which format to use or setting search defaults somewhere would work. We just haven't sat down and come up with a solution that works yet. Open to ideas from everyone.

@winrid
Copy link

winrid commented Sep 16, 2014

What about having it as a global setting for the adapter?
Much better than explicitly saying each query is case sensitive. :)

@tjwebb
Copy link
Contributor

tjwebb commented Sep 16, 2014

Much better than explicitly saying each query is case sensitive. :)

+1

@jmls
Copy link

jmls commented Nov 18, 2014

is there any movement on this issue ? Was wondering what the state of play regarding this is

@campbellanderson
Copy link
Contributor

I never found the time to implement unfortunately and just moved forward with the native call. If your experiencing any real throughput you have to step away from the case insensitive calls especially where multikey indexes are used.

I still rate this pretty high. Pro's for node are its speed and if the platform you choose hinders that its a big Con for sails.

To be honest I would have thought that case insensitive would be the edge case.

@pr1ntr
Copy link

pr1ntr commented Jan 19, 2015

afaik, you can use BINARY in mysql to get a case sensitive query.

@eabovsky
Copy link

+1 for being able to specify case sensitivity on the attribute

@dmarcelino
Copy link
Member

👍

0.11 could be a good opportunity for this since it will require API/adapter changes (I believe).

@dmarcelino
Copy link
Member

If everyone agrees with @mikermcneil's suggestion:

User.find({
  role: 'admin',
  name: { 
    equals: 'Gabe'
    caseSensitive: true
  }
})

I don't think we need any waterline core changes but only adapter and waterline-adapter-tests changes. Is this a spec everyone agrees on?

@winrid
Copy link

winrid commented May 21, 2015

I don't agree. It should be case sensitive by default because that is how Mongo behaves.
Trying to make an ORM function entirely the same way for different storage mechanisms simply won't work. There will always be inconsistencies and trying to hide them will only cause confusion.

@dmarcelino
Copy link
Member

@winrid, as explained by @particlebanana in https://github.com/balderdashy/waterline/issues/239#issuecomment-55673429 the case insensitive default is a legacy issue and one I doubt it will go away. I was hoping we could be constructive and figure a way to improve this in the future without breaking everyone's applications. @mikermcneil's suggestion seems a good one for query level case sensitivity changes.

sails-postgresql already allows changing the default case sensitivity through configuration. A similar config has been discussed for sails-mongo in balderdashy/sails-mongo#260 (comment) and everybody on the thread agreed upon it.

Summarising:

  1. @mikermcneil's suggestion would change case sensitivity at the query level (which can be useful);
  2. Through each adapters' configuration we can change the default case sensitivity of each adapter.

@winrid
Copy link

winrid commented May 21, 2015

Sometimes you have to release breaking changes for the sake of good development practices. This would be one of them.

Anyways the adapter level change would be ideal for me.

@jfaylon
Copy link

jfaylon commented Jul 6, 2015

any updates on this?

@Jolo510
Copy link

Jolo510 commented Jul 6, 2015

I have resulted to use using .native() to write mongodb queries. Seems like a lot of the others have done the same. For SQL queries, I the method used is called .query().

http://sailsjs.org/documentation/reference/waterline-orm/models/native

http://sailsjs.org/documentation/reference/waterline-orm/models/query

@kai23
Copy link

kai23 commented Jul 6, 2015

That's what we are using too ( .query() )

@winrid
Copy link

winrid commented Jul 6, 2015

This bypasses the ORM level validations and defeats the purpose of using waterline.

@Jolo510
Copy link

Jolo510 commented Jul 6, 2015

That is true.

@haavardw
Copy link
Contributor

Case insensitive came as a big surprise to me. I ran into problems when looking up a user by name implementing login functionality. When creating users it passes the unique constraint. When searching it all falls apart.

@Ehesp
Copy link

Ehesp commented Sep 2, 2015

Also causing problems here, we're trying to do an exact match on a string and it's doing a regexp... taking 200ms for a super simple (or should be) search...

Profiler result:

"query" : {
        "device_id" : /^5ad2771a\\-35c7\\-4010\\-8420\\-4b8b378050df$/i, 
        "role" : NumberInt(0)
    }, 

@sailsbot
Copy link

sailsbot commented Oct 3, 2015

Thanks for posting, @mikermcneil. I'm a repo bot-- nice to meet you!

It has been 30 days since there have been any updates or new comments on this page. If this issue has been resolved, feel free to disregard the rest of this message. On the other hand, if you are still waiting on a patch, please:

  • review our contribution guide to make sure this submission meets our criteria (only verified bugs with documented features, please; no questions, commentary, or bug reports about undocumented features or unofficial plugins)
  • create a new issue with the latest information, including updated version details with error messages, failing tests, and a link back to the original issue. This allows GitHub to automatically create a back-reference for future visitors arriving from search engines.

Thanks so much for your help!

@sailsbot sailsbot closed this as completed Oct 3, 2015
@devinivy
Copy link

devinivy commented Oct 3, 2015

@mikermcneil @dmarcelino probably oughtta stay open, at least until it's PRed into the roadmap? Some serious labels on this one, and it's in a milestone.

@dmarcelino
Copy link
Member

👍 for reopening. This is a big one and lots of people have shown interest in this being done.

@dmarcelino dmarcelino reopened this Oct 3, 2015
@tjwebb
Copy link
Contributor

tjwebb commented Oct 3, 2015

+1

@sailsbot sailsbot closed this as completed Nov 3, 2015
@sailsbot
Copy link

sailsbot commented Nov 3, 2015

Thanks for posting, @mikermcneil. I'm a repo bot-- nice to meet you!

It has been 30 days since there have been any updates or new comments on this page. If this issue has been resolved, feel free to disregard the rest of this message. On the other hand, if you are still waiting on a patch, please:

  • review our contribution guide to make sure this submission meets our criteria (only verified bugs with documented features, please; no questions, commentary, or bug reports about undocumented features or unofficial plugins)
  • create a new issue with the latest information, including updated version details with error messages, failing tests, and a link back to the original issue. This allows GitHub to automatically create a back-reference for future visitors arriving from search engines.

Thanks so much for your help!

@chenweiyj
Copy link

Since I am using utf8 and my data has full-angle data, how can I set the collate for search on a field to do the same thing as following?

SELECT * FROM tablename where name COLLATE utf8_unicode_ci like '%(商品名:舒澈)%' ;

@FewKinG
Copy link

FewKinG commented Apr 27, 2016

Ran into this issue too, started out pretty well, but not knowing about this issue, as our database slowly grew bigger, some queries got slower and slower. So we got suspicious, I investigated and found out, that quite a few of our indices didn't do any good because queries were transformed to regular expressions. When I found out, that this is expected behavior and the issue known for so long, I couldn't believe it at first. This should be made far more obvious and possible to circumvent by configuration if not by default.

@neoadventist
Copy link

Yeah this is a big problem for me, compose was complaining about this and i'm going to try to use the native implementation.

@particlebanana
Copy link
Contributor

@FewKinG @neoadventist depending on the adapter being used you can toggle the case-sensitivity on/off using the proper flag in your connection configuration. So for example in postgres you would use:

postgresql: {
  url: 'postgres://username:password@hostname:port/database',
  wlNext: {
    caseSensitive: true
  }
}

It's being worked on for MySQL but should work in the Mongo adapter.

Case sensitivity on the query by query level isn't planned for the near future and would need to go through the proposal process.

@neoadventist
Copy link

thanks @particlebanana, can anyone confirm that this works for the monogo adaptor?

@neoadventist
Copy link

already implemented the .native but interested in the workaround.

@wkhu
Copy link

wkhu commented May 4, 2016

+1

@Salakar
Copy link

Salakar commented Jun 15, 2016

Can confirm @particlebanana 's solution works on the mongo adapter. 👍

@theoinsight
Copy link

I'm having the same issue with mongo doing a regexp. Did anyone find a fix for this?

@albertpeiro
Copy link

albertpeiro commented Dec 18, 2017

@Salakar what version of sails were you using?
With Sails 1.0 and this config:

envMongodbServer: {
      adapter   : 'sails-mongo',
      host      : 'localhost',
      schema: true,
      wlNext: {
        caseSensitive: true
      }
    },

I get this:

error: Failed to lift app: Error: Invalid configuration for datastore `envMongodbServer`:  Unrecognized options (`wlNext`) specified as config overrides.
This adapter expects only whitelisted properties.
--
See http://sailsjs.com/config/datastores#?the-connection-url for info,

UPDATE:
I guess this is relevant for latest news:
balderdashy/sails-mongo#464

@luislobo
Copy link
Contributor

luislobo commented Dec 26, 2017

@albertpeiro With the latest Waterline, all queries are Case Insensitive. Check the last paragraph in https://github.com/balderdashy/sails-docs/blob/552f6c07da17a2569bcac947e3e5a864fcc8dc81/concepts/ORM/Querylanguage.md

Case-sensitivity
All queries inside of Waterline are case-insensitive. This allows for easier querying but makes indexing strings tough. This is something to be aware of if you are indexing and searching on string fields.
Currently, the best way to execute case-sensitive queries is using the .native() or .query() method.

@anterodev
Copy link

I cant't run a insensitive query with orm using sails-mongo, need any configuration or option?

"sails": "^1.0.0-46",
"sails-hook-orm": "^2.0.0-23",
"sails-mongo": "^1.0.0-11",

@winrid
Copy link

winrid commented Feb 5, 2018

Waterline cannot be recommended for production environments due to design decisions such as this. Telling users to bypass the ORM to make use of indexes is insane.

@devinivy
Copy link

devinivy commented Feb 5, 2018

Waterline is free, modifiable software, and like any tool it's important to understand its limitations upon adopting it. The reason this is hard is largely due to the fact that waterline is adapter-based, which is also the ORM's defining feature. While I don't use waterline anymore and haven't contributed to it in ages, I think it's important that someone express that calling OSS collaborators "insane" for their design decisions, support, or documentation-writing is not cool and definitely not constructive. So here I am expressing it! A lot of people avoid collaborating on OSS due to comments like this. If there's some idea about how to make it easier to implement a feature like this across adapters, that would be very useful and interesting!

@winrid
Copy link

winrid commented Feb 5, 2018

Provided constructive criticism years ago on the same issue :) It is insane.

The reason this is hard is largely due to the fact that waterline is adapter-based

No, the reason this is hard is because it is the wrong thing to do. What needs to happen is that it needs to be accepted that you should not be designing a layer of abstraction that functions exactly the same no matter what data store is behind the scenes. Having the interface the same is great - that's the point of an ORM - but for example enforcing what you think is best instead of the DB creators in terms of configuration is not good.

@devinivy
Copy link

devinivy commented Feb 6, 2018

I'm not so interested in the technical aspect here, and I'm not saying you're wrong– the point is that it's more important to be good to each other and respectful of all the free effort that has gone into this tool than to be right.

@luislobo
Copy link
Contributor

luislobo commented Feb 6, 2018

@winrid this is OS, and MIT, feel free to send your PR! :)

@johnabrams7 johnabrams7 transferred this issue from balderdashy/waterline May 20, 2019
@OnbitShashan
Copy link

@albertpeiro did you find a solution for mongodb? I'm stuck on this. I have an existing database that need to query with case-insensitivity

@vrajasekhar1
Copy link

Hi, is this issue resolved? I want to do a case insensitive search with postgresql, but it does not work. Appreciate any pointers how to get this working.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests