Skip to content

Standalone ToLower/ToUpper Support (Take 2) #110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ray2k
Copy link

@ray2k ray2k commented May 26, 2012

Only modified 2 files, PredicateTranslator, and SelectQueryTests. Not sure why the others are included in this request- not a git expert.

@craiggwilson
Copy link
Contributor

Thanks Ray. I'll get this merged in this week.

@craiggwilson
Copy link
Contributor

Ray,
Thanks so much for doing this. After much deliberation internally, we have decided to not implement this feature. The caseMismatch actually showed us the issue. There are two ways of reading a query that looks like this:

  1. make sure that the lowercase/uppercase of this field on the left matches the string on the right
  2. make a case-insensitive comparison to the string on the right

You implemented it as #1, but some others at 10gen felt it should be implemented as #2. In the light that we will be missing someones expectations, we felt it prudent at this time to not support the feature, especially because there are other ways of supporting case-insensitive queries.

Once again, thanks for your time and effort. It was not wasted.

@ray2k
Copy link
Author

ray2k commented Jun 1, 2012

I'm not following, can you elaborate?

Edit: removing example, too distracted, will comment later

Edit2: OK so let me get this straight. For:

where Name.ToLower() == "acme.corp"

you'd want a simple

{ name : /acme.corp/i }

? Because that will work when you provide a lower case query, but if you have

Name.ToLower() == "Acme.Corp"

Suddenly you're getting results when Acme.Corp is obviously not a lower case string. So I put logic in there to take care of those scenarios. What alternate way of doing this would accomplish 2) ?

@craiggwilson
Copy link
Contributor

I understood your example. Let's use it for discussion:

where p.Name.ToLower() == "acme.corp" select p

turns into a simple case insensitive search:

{ name : /acme.*corp/i }

This is true only when the right side is all lowercase to begin with. If the right side is not all lowercase, for instance this: where p.Name.ToLower() == "Acme.Corp" select p, then there are two options. 1) always return false because that how it would be handled in C# or 2) assume the user screwed up and treat the whole thing as case insensitive because presumably their intent was that they wanted case insensitive search.

The end result in either case is that they probably simply want a case-insensitive search. We don't want to try and guess at their meaning, so because we already have a good way of performing case-insensitive searches, then we don't really have any need to incorporate this in. In addition, using ToLower to perform case insensitive searches is bad practice anyways due to both the performance overhead as well as the culture issues involved. So, by not implementing this, we are promoting best practices subliminally.

Anyways, please keep the discussion going with your thoughts.

@ray2k
Copy link
Author

ray2k commented Jun 1, 2012

then there are two options. 1) always return false because that how it would be handled in C# or 2) assume the user screwed up and treat the whole thing as case insensitive because presumably their intent was that they wanted case insensitive search.

Right, I did option 1 and have code in there for the catch-none scenarios, which is semantically the same as returning false for all documents (matching where _id doesn't exist, as was already being done in other translations).

And my problem isn't that I need case insensitive queries per se, its that I specifically need ToLower() support. Clients that are out of my control (nuget) are calling a nuget feed (mine) which is a WCF data service backed by mongo, with linq queries that include .ToLower() on string properties to do case-insensitive searches on package ids. So all the 'bad practice' and 'performance overhead' is irrelevant if I don't have a service clients can actually query at the end of the day.

And since you mentioned performance, I'm guessing that translating to toLowerCase()/toUpperCase() (as fluentmongo does) is off the table as well.

Looks like i'm either ditching mongo, using fluentmongo (which I assume is dead or dying now that there's official linq support especially given this conversation is with the guy who manages fluentmongo), or venturing into ugly expression rewriting land, none of which are very appealing.

@craiggwilson
Copy link
Contributor

Ok, I understand your issue now and am well aware of the nuget issue with ToLower. I'll bring the issue up again and see if we can come to a consensus.

@ray2k
Copy link
Author

ray2k commented Jun 1, 2012

OK...

Possibly a shot in the dark, but- does the convention stuff allow any logic to be applied to the query generation? Or is that solely for mapping/serialization? It would be interesting to have a convention (I assume you'd opt into) that when enabled would translate where p.Name.ToLower() == "foo" into where LoweredName == "foo". Then the driver-user would be knowingly acknowledging the queries will be altered as well as not hitting the bad practice/performance issues, with the trade off of having an additional field to support in their documents.

Pipe dream? MIght not make much sense outside of my scenario, because if you're aware that ToLower() isn't formally supported you'd probably juse use LoweredName to begin with. Dunno

@craiggwilson craiggwilson reopened this Jun 1, 2012
@craiggwilson
Copy link
Contributor

No, I like the idea. (actually, I like it a lot). I'll have to figure out where it makes sense and how it would get integrated. However, to accomplish this, you'll be needing to rewrite queries anyway.

However, we are going to get this pulled in, probably with a couple of tweaks. So, you won't need to do anything :).

@rstam
Copy link
Contributor

rstam commented Jun 2, 2012

These are the semantics we think ToLower/ToUpper string comparisons should have:

  1. where c.S.ToLower() == "abc" => { S : /^abc$/i }
  2. where c.S.ToLower() == "Abc" => { _id : { $exists : false } } // matches no documents
  3. where c.S.ToLower() == null => { S : null }
  4. where c.S.ToLower() != "abc" => { S : { $not : /^abc$/i } }
  5. where c.S.ToLower() != "Abc" => { } // matches all documents
  6. where c.S.ToLower() != null => { S : { $ne : null } }

and similarly for ToUpper.

They are slightly different than yours in that there are no explicit checks for field existence. This makes them behave the same way as their case sensitive equivalents.

@ray2k
Copy link
Author

ray2k commented Jun 2, 2012

That all sounds fair. Is there anything else you need from me?

@rstam
Copy link
Contributor

rstam commented Jun 2, 2012

Nope. Working on it. You can see my proposed changes at:

https://github.com/rstam/mongo-csharp-driver/commits/tolower

@craiggwilson
Copy link
Contributor

This has been integrated in Ray. Thanks for your ideas and your persistence. We can't possibly think of all use cases and this was definitely one we missed. If you can, please pull down master, try it out, and let us know if it fixes your issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants