Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make specifying a native filter more explicit in 7.0 #5001

Closed
sdlime opened this issue Sep 20, 2014 · 28 comments
Closed

Make specifying a native filter more explicit in 7.0 #5001

sdlime opened this issue Sep 20, 2014 · 28 comments

Comments

@sdlime
Copy link
Member

sdlime commented Sep 20, 2014

@tbonfort, @rouault, @dmorissette, I'd like to suggest a breaking change for 7.0 as a result of the RFC 91 work. Presently if you use a database driver and it sees filter->string set it takes that as SQL. From setting via a mapfile the expression is of type MS_STRING. For RFC 91 I added an explicit native_string structure member and changed db drivers to only use that if not NULL. Works ok...

However, digging into the SLD and WMS time code I found that native expressions were being created as type MS_EXPRESSION (e.g. MapServer expression syntax ('[column]' eq 'foo')). That was a problem since those were the types of expressions we want to be able to translate to native SQL. Ugh... I had to force the filter type to MS_STRING in those instances.

It would be nice to be more explicit. OGR uses a "WHERE " qualifier and I think it would make sense to do the same universally. That what the load expression function can strip off the WHERE, set native_string and a specific type (MS_NATIVE_EXPRESSION). It'll be more convenient for MapScript users.

This is a relatively simple change but it can break some stuff: mapfiles that use database drivers and FILTERs (there's little reason to do so) and MapScript scripts that use setFilter() with database drivers. I think those should be relatively limited.

I know the RFC has been a pain in the ass but now is the time...

Steve

@rouault
Copy link
Contributor

rouault commented Sep 20, 2014

@sdlime I'm always feel intimated w.r.t. MapServer filtering, so no strong opinion, but a few questions :

  • What would be the valid values for expressionObj::type : MS_STRING ( for FILTERITEM filtering ?), MS_EXPRESSION, MS_REGEX and MS_NATIVE_EXPRESSION ? (I see other values in the 2XXX range in mapserver.h).
  • Would expressionObj::native_string still needed (i.e. wouldn't ::string be enough) ?

@sdlime
Copy link
Member Author

sdlime commented Sep 22, 2014

Hi @rouault: Yes, the allowed type would be MS_STRING, MS_EXPRESSION, MS_REGEX and MS_NATIVE_EXPRESSION (constant overlap is another issue we should fix at some point). We could get by then with just expressionObj::string although I was thinking it would be nice to have both the original (in expressionObj::string) and the translated (expressionObj::native_string) versions of the source expression available in cases where we translate from one form to the other. Otherwise we’d overwrite the original. I could go either way with that.

Steve

@tbonfort
Copy link
Member

I'm not sure I understand the implications either, but am not opposed to a breaking change for 7.0.

Why is the fact that the WMS / time functions are setting MS_EXPRESSION filters a bad thing? I would have thought the contrary, i.e. it is bad for them to set MS_STRING expressions (as the RFC changes are supposed to translate the MS_EXPRESSION into well formed SQL for the specific backend)

@sdlime
Copy link
Member Author

sdlime commented Sep 22, 2014

The problem is not knowing what is database SQL and what is a MapServer logical expression. The WFS and WMS code were writing, for example, PostgreSQL/PostGIS SQL using type MS_EXPRESSION so you couldn’t rely on expressionObj::type to know if you should translate or not. My suggestion would make that more clear. With the change, translation would be from MS_EXPRESSION to MS_NATIVE_EXPRESSION. The “WHERE “ prefix would tell the msLoadExression… functions to expect native SQL.

Post-RFC we do want the WMS/WFS code writing MS_EXPRESSIONs (or even MS_STRING/MS_REGEX + filter item).

Steve

@tbonfort
Copy link
Member

Just to make sure I understand what you mean ...

If the FILTER starts with "where", and there is no FILTERITEM set, then it's classified as type MS_NATIVE_EXPRESSION (the WFS/WMS code would need updating right away to prefix it's generated SQL with "where"). There would be a backwards incompatibility for postgis/oracle?/sde?/mssql? for people who are filtering with FILTER "column='value'", but OGR would be fine.
If that's correct, won't you have an issue in the parser as you may or may not have yet encountered the FILTERITEM when you're parsing the FILTER?

@sdlime
Copy link
Member Author

sdlime commented Sep 22, 2014

If the FILTER starts with “where “ we’d set type=MS_NATIVE_EXPRESSION regardless of FILTERITEM. The msLoadExpression…() functions don’t consider FILTERITEM or CLASSITEM anyway. Any errors with a type/no item mis-match are trapped later. The syntax change only impacts a FILTER set in the mapfile or via MapScript. This is how OGR worked since it tried to support all variants. The WMS/WFS code in master would need little if any change since they create MS_EXPRESSION filters now.

You are correct about the regression for the database backends. To fix they would have to add the “where “ prefix.

I can prepare a patch if you’d like to see change beforehand.

Steve

@tbonfort
Copy link
Member

I'm a bit weary about using magic keywords to alter behaviour in the parser. msLoadExpressionString is used quite a bit throughout the code, and in those cases we'd probably fail for user supplied strings that start with "where " (allegedly uncommon but not impossible scenario). I think you'd also fail for mapfiles with

FILTERITEM "company_name"
FILTER "where group"

Since we're making a breaking change, should we rather not use a special delimiter for the parser for these native strings? e.g.

FILTER |column_name='foobar'|

@sdlime
Copy link
Member Author

sdlime commented Sep 22, 2014

I thought about that too. I only questioned how big a breaking change to make. Just using the prefix would leave OGR behavior intact and that’s the one driver where this is really used. Others (e.g. PostGIS, Oracle, …) really shouldn’t need FILTERs since they can add the where clause to the DATA definition.

@tbonfort
Copy link
Member

Tough choice... I see it as having to choose between maintaining maximum backwards compatibility, and long term maintenance ease. Given we're breaking backwards compatibility anyways , and that the unified filtering is here to stay in the long run I think I would favor the second option. I don't have a strong opinion on this however. We might add a backwards compatibility hack in the layer->end parser in mapfile.c and in layer->setfilter for mapscripts, with a warning message.

@dmorissette
Copy link
Contributor

I do not claim to understand all the implications either, but since we're at 7.0 which is a good time to break things, I'd also lean towards a more explicit syntax that will be easier to maintain in the long run. The OGR "WHERE ..." legacy syntax could be deprecated and supported for a few more releases with some code in mapfile.c with a warning message as suggested.

@sdlime
Copy link
Member Author

sdlime commented Sep 22, 2014

Anyone got a favorite delimiter then? Implications are relatively minor but it’s worth fixing I think. FWIW, drivers would just need to examine filter->type to know whether they should act on it.

@dmorissette
Copy link
Contributor

I was going to suggest curly braces {...} as I don't think we use them anywhere yet, but the Scribe syntax does use them extensively since Scribe is a form of JSON encoding (the Scribe python script uses a JSON parser), so if we want to support Scribe someday then we should avoid using curly braces in mapfile syntax.

@yjacolin
Copy link
Contributor

@dmorissette curly braces are already used in mapfile e.g. EXPRESSION {list1,vale1,val3}

@sdlime
Copy link
Member Author

sdlime commented Sep 23, 2014

How about <>?

@tbonfort
Copy link
Member

I like <>. We also have [] that is unused, but I think we should keep it spare if/when we need intervals some day.

@sdlime
Copy link
Member Author

sdlime commented Sep 23, 2014

[] is also used as a binding delimiter…

From: Thomas Bonfort [mailto:notifications@github.com]
Sent: Tuesday, September 23, 2014 8:47 AM
To: mapserver/mapserver
Cc: Lime, Steve D (MNIT)
Subject: Re: [mapserver] Make specifying a native filter more explicit in 7.0 (#5001)

I like <>. We also have [] that is unused, but I think we should keep it spare if/when we need intervals some day.


Reply to this email directly or view it on GitHubhttps://github.com//issues/5001#issuecomment-56521594.

@dmorissette
Copy link
Contributor

< > works for me, but I guess we'll need a way to escape those chars inside native expressions (whatever delimiter we use escaping is likely to be required anyway)

@sdlime
Copy link
Member Author

sdlime commented Sep 23, 2014

Flex regex’s are greedy so they match the longest possible chunk of text. We should be able to copy the one for MS_EXPRESSION and work from that. You don’t have to escape ()’s within a logical expression.

From: Daniel Morissette [mailto:notifications@github.com]
Sent: Tuesday, September 23, 2014 9:20 AM
To: mapserver/mapserver
Cc: Lime, Steve D (MNIT)
Subject: Re: [mapserver] Make specifying a native filter more explicit in 7.0 (#5001)

< > works for me, but I guess we'll need a way to escape those chars inside native expressions (whatever delimiter we use escaping is likely to be required anyway)


Reply to this email directly or view it on GitHubhttps://github.com//issues/5001#issuecomment-56526668.

@rouault
Copy link
Contributor

rouault commented Sep 23, 2014

Hum, thinking about that more. What is the point of having a native (=SQL in practice) filter specified in FILTER and not in DATA (apart backward compatibility) ? Aren't there too many ways of doing the same thing ? OGR supports SQL syntax like 'SELECT ... FROM ... WHERE ...'. Perhaps the need to change the WHERE part dynamically without reloading the whole layer ?

Regarding < > and escaping, there's perhaps no need of escaping if we just need to look at the first and last character of the string.

@sdlime
Copy link
Member Author

sdlime commented Sep 23, 2014

That’s kind of been my feeling all along – there’s little reason for a native FILTER when you have the DATA statement. The code would be a whole heckuva lot simpler if we only supported MapServer filters definitions via mapfile/mapscript and applied translations under the hood. Drivers would still look for the filter string. For example, we wouldn’t have to worry about these new delimiters nor would we have to deal with filter merging for anything but shapefiles.

That would be a more radical change… I’d support it actually.

@tbonfort
Copy link
Member

Why would shapefiles be the only driver needing filter merging? wouldn't we support FILTER ([my_column] = 'foo') in the mapfile for e.g. postgis layers?

@sdlime
Copy link
Member Author

sdlime commented Sep 23, 2014

Good point, duh… However we wouldn’t have to deal with the merging of native filters which simplifies the merging process. It would be the same for all drivers. You’d 1) merge then 2) translate (if necessary). With those pesky native filters you have to do the merging after translation and if translation doesn’t work then you leave the native filter in place and apply the MapServer filter separately. Messy…

@tbonfort
Copy link
Member

that's what I thought... and the merging is just ((original expr) AND (wfs expression)) which is pretty simple. It would be nice if we could save the original expression to be able to reset to its original value once the filter has run, it makes it easier to re-use mapfile instances across requests (that's one of the reasons we can't efficiently re-use a mapobj in fastcgi or the apache module). I'd be +1 for the change as it simplifies the codebase greatly. I expect folks using OGR use the "where" filter syntax quite often so the upgrade to 7.0 will not be effortless for them. #4969 would be fixed for all drivers which is a bonus.

@tbonfort
Copy link
Member

I think we'd need to take this one to the mailing list and have a re-vote given the backwards incompatibility is not trivial.

@sdlime
Copy link
Member Author

sdlime commented Sep 23, 2014

I concur… I’ll write something up and send to mapserver-dev.

From: Thomas Bonfort [mailto:notifications@github.com]
Sent: Tuesday, September 23, 2014 11:52 AM
To: mapserver/mapserver
Cc: Lime, Steve D (MNIT)
Subject: Re: [mapserver] Make specifying a native filter more explicit in 7.0 (#5001)

I think we'd need to take this one to the mailing list and have a re-vote given the backwards incompatibility is not trivial.


Reply to this email directly or view it on GitHubhttps://github.com//issues/5001#issuecomment-56552038.

@sdlime
Copy link
Member Author

sdlime commented Oct 23, 2014

I have a branch that implements this change for PostGIS and simplifies the msQueryByFilter() function a good bit. Merging is much more straight forward when you know a FILTER can only be a MapServer expression. See: https://github.com/sdlime/mapserver/tree/issue-5001.

This approach uses a PROCESSING directive (NATIVE_FILTER=...) to define SQL that is always added to the full SQL statement. If you look at the mappostgis.c code you see that native SQL can be provided via translation or via processing. The way that code was written it was easy to just tag on a second potential SQL source. If a driver doesn't support translation then it would only need to look to the processing source.

This should be adaptable to the other drivers - although I could use some advice/help for each on the best way to proceed.

Steve

cc: @msmitherdc @szekerest

@sdlime
Copy link
Member Author

sdlime commented Jan 28, 2015

Has anyone been able to consider this for other drivers (e.g. Oracle, SQL Server, etc...)? I'd like to know if this work can be done before the sprint. The work on the PostGIS driver really makes things clearer for users and simplifies the query code.

Steve

sdlime added a commit to MapServer/msautotest_DEPRECATED that referenced this issue Feb 12, 2015
@sdlime sdlime closed this as completed Feb 12, 2015
@tbonfort
Copy link
Member

Support for NATIVE_FILTER in oracle added in 1842327

havatv pushed a commit to MapServer/MapServer-documentation that referenced this issue Jun 11, 2015
…R, and updated the documentation of LAYER-> FILTER to reflect the changes made in MapServer/MapServer#5001 (#123)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants