Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken AND ( + ) filtering #2694

Merged
merged 1 commit into from Jun 15, 2017
Merged

Conversation

nitriques
Copy link
Member

Right now, usign a single + for the AND filter did not worked.
The + symbol we used is a space character in urls, where most of the
filters comes from.

So, the idea is to add a new split token, namely "+++" which makes
"filter 1 + filter 2" work.

This change remove code duplication in the publish page and in the section data source.

Fixes #2561

Right now, usign a single + for the AND filter did not worked.
The + symbol we used is a space character in urls, where most of the
filters comes from.

So, the idea is to add a new split token, namely "+++" which makes
"filter 1 + filter 2" work.

This change remove code duplication in the publish page and in the section data source.

Fixes symphonycms#2561
@brendo
Copy link
Member

brendo commented Jun 1, 2017

I feel like there is an Apache setting at play here. I recall encountering this before but there was some other configuration that needed tweaked. Let me come back to you.

@nitriques
Copy link
Member Author

@brendo Great! I've tested it on WAMP and on my DEV box (Centos 6) and both needed the commit. I've tried magic_quotes on and off.

@nitriques nitriques self-assigned this Jun 2, 2017
Copy link
Member

@michael-e michael-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of using +++ in URLs (which translates to 3 spaces, right?). Couldn't we embrace the standards and use an encoded + character , i.e. %2B? The rules for URLs are simple in this respect:

  • if you want a space, use +
  • if you want a +, use %2B

Or do I get something wrong here?

@brendo
Copy link
Member

brendo commented Jun 2, 2017

@nitriques, I believe this might of been it?

Do you have a test case that I can try locally to see if I can reproduce the behaviour you are experiencing?

@nitriques
Copy link
Member Author

Or do I get something wrong here?

Yes. %2B which is %20%2B%20 comes in PHP as +++

string(23) "relation-1+++relation-2"

@nitriques, I believe this might of been it?

Same result :( Adding the B flag to all rewrite rules...

And filtering is working on your boxes ?

@nitriques
Copy link
Member Author

@brendo looks like [B,BNP] would work!!! BUT, it's not available in my Apache 2.4.17 :( So I guess it won't work in 2.2 either and I have a real big bunch of sites on 2.2 as well...

@michael-e
Copy link
Member

Yes. %2B which is %20%2B%20 comes in PHP as +++

You mean it isn't possible to translate a query string to what it means (per the spec)? Differentiating in PHP between + (meaning space) and %2B (meaning +) as parts of a query string? Excuse my question, but this really hard to believe.

@nitriques
Copy link
Member Author

I don't like the idea of using +++ in URLs

We would not, and neither should anybody ;)

@nitriques
Copy link
Member Author

Excuse my question, but this really hard to believe.

From my testing, yes, that's it the case.

And from mod_rewrite's doc

mod_rewrite has to unescape URLs before mapping them, so backreferences are unescaped at the time they are applied.
The [BNP] flag instructs RewriteRule to escape the space character in a backreference to %20 rather than '+'. Useful when the backreference will be used in the path component rather than the query string.

@michael-e
Copy link
Member

michael-e commented Jun 2, 2017

Why do we have spaces in the URL? I would have thought:

  • ?foo=value1,value2 is translated to OR
  • ?foo=value1%2Bvalue2 means + (which is then translated to AND in the datasource logic)

@nitriques
Copy link
Member Author

Why do we have spaces in the URL

Because we want '?foo=value with spaces' to be a search for 'value with spaces' not a search for 'value' AND 'with' AND 'spaces' (remember, we get spaces like '+')

@michael-e
Copy link
Member

No, no. Honestly, if we could not differentiate between those query strings, Symphony would do something fundamentally wrong. This is so simple, it MUST be possible.

Look at the XML!

For ?debug=raw&foo=value1,value2 I see:

<current-query-string>
  <![CDATA[ debug=raw&foo=value1,value2 ]]>
</current-query-string>
<url-foo>value1,value2</url-foo>

For ?debug=raw&foo=value1%2Bvalue2 I see:

<current-query-string>
  <![CDATA[ debug=raw&foo=value1%2Bvalue2 ]]>
</current-query-string>
<url-foo>value1+value2</url-foo>

For ?debug=raw&foo=value1+value2 I see:

<current-query-string>
  <![CDATA[ debug=raw&foo=value1+value2 ]]>
</current-query-string>
<url-foo>value1 value2</url-foo>

You see that PHP can handle these query strings very well.

@nitriques
Copy link
Member Author

Indeed it does when you do not use '+' for a special treatment...
And using a single + does not work at the moment.

@michael-e
Copy link
Member

What do you need that is not covered by the above examples?

(Which can be translated to: Why would you need special treatment? And why does a single + not work?)

@nitriques
Copy link
Member Author

Why would you need special treatment?

Because Apache converts all spaces and plus sign into the same char.

And why does a single + not work?)

Because you loose the ability to search for a value that contains the plus sign, which is currently working.

Your ?debug=raw&foo=value1+value2 come out as value1 value2 after proper decoding, which is not a AND filter. Since we test BEFORE decoding (so we have the + in the url), we could change the regexp to match + unescaped and %2B unescaped instead of +++ ?

@michael-e
Copy link
Member

I admit that I didn't have the actual implementation in mind, I have not even looked at it. I just wanted to say that it is technically possible to differentiate between the two characters (which are pretty standard in URLs), and therefore it would hurt to see any custom strings (supposed to have the same meaning as a single character).

If Symphony's regex filter implementation makes it hard to work with URL standards, we should really improve the implementation.

Since we test BEFORE decoding

Honestly, this sounds like a major design flaw. URLs should always be decoded first.

@nitriques
Copy link
Member Author

I totally agree with what you say, except:

any custom strings (supposed to have the same meaning as a single character).

Changing this would require a lot of effort. But my patch just fixes a bug. The +++ should never get documented anyways. And we can't url decode in the DS because the value might not come from the request at all.

When this system will get its overhaul, this should not be a problem, but right now, it's the only way to fix it I can came up with.

@michael-e
Copy link
Member

Changing this would require a lot of effort. But my patch just fixes a bug.

Now I understand. :-(

@nitriques
Copy link
Member Author

I will be happy to fix this later on... Count on me ;)

@michael-e
Copy link
Member

Count on me ;)

You bet!

@nitriques nitriques merged commit 759050a into symphonycms:2.7.x Jun 15, 2017
@nitriques nitriques deleted the issue-2561 branch June 15, 2017 14:55
nitriques added a commit that referenced this pull request Jun 16, 2017
Right now, using a single + for the AND filter did not worked.
The + symbol we used is a space character in urls, where most of the
filters comes from.

So, the idea is to add a new split token, namely "+++" which makes
"filter 1 + filter 2" work.

This change remove code duplication in the publish page and in the section data source.

Fixes #2561

Picked from 7ccc080
Picked from 759050a
nitriques added a commit that referenced this pull request Jun 16, 2017
Right now, using a single + for the AND filter did not worked.
The + symbol we used is a space character in urls, where most of the
filters comes from.

So, the idea is to add a new split token, namely "+++" which makes
"filter 1 + filter 2" work.

This change remove code duplication in the publish page and in the section data source.

Fixes #2561

Picked from 7ccc080
Picked from 759050a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants