Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug found in http_trim_filter-module. #368

Closed
makailol opened this issue Dec 27, 2013 · 74 comments
Closed

Bug found in http_trim_filter-module. #368

makailol opened this issue Dec 27, 2013 · 74 comments

Comments

@makailol
Copy link

Hello,

I just tried the trim module with nginx-echo module to test few cases. I am not sure how it works but when there are more number of space " " characters, it just replaces the word after the spaces on page refresh.

Test case:


location  /htmlTest {
    default_type text/html;
    echo "";
    echo "hello Moki\n\n\n";
    echo "2nd line";
    echo "";
    echo "";
    echo "";
    echo "3rd                                                                                                                             linetest";  #Long space before  "linetest" word.
    echo "last line";
    trim on;
}

Result (first time page reload):
We get first line blank and all contents in second line
1
2 hello Moki 2nd line 3rd linetest last line

Result (After reloading page 8-10 times, we get final result like below) :
1 hello Moki
2 2nd line 3rd linetest linetest linetest linetest linetest linetest linetest linetest linetest linetest linetest linetest linetest linetestinetest last line

It would be good if someone can look into this because it seems to be major bug.

Thanks,
Makailol

@makailol
Copy link
Author

Hello,

Can anyone check this bug? I think this bug can cause issue for almost all users who are using this module. As mentioned in above example, multiple continuous spaces are replaced with the word next to the multiple spaces.

Thanks,

@yaoweibin
Copy link
Member

Thank you. We are tracking this bug. It seems there is a compatible problem with echo module.

@cfsego
Copy link
Member

cfsego commented Dec 31, 2013

leaving first line blank is all right, we expect that.

we will track the second case, and will you please post your whole conf file?

@makailol
Copy link
Author

Thanks for some updates on this.

Only the first result had first line blank. After reloading page first blank line was no more exist.

There is no spacial settings in my conf file. I just have two locations

  1. /htmlTest (with above code)
  1. / (to cache all other request to pass to backend apache server).

I have used this module just for /htmlTest location where I am generating response with echo module.

I would be thankful to you if you could just keep updating progress here on ticket.

@cfsego
Copy link
Member

cfsego commented Dec 31, 2013

we have found the reason.
module trim always modifies the buffer contents directly, however, module echo never notices its contents have been changed and still uses them for the next request.
so, we are sorry to say, module trim is not compatible to module echo.

maybe we should add a "method=copy" parameter to "trim", so that it will not change the source.

@makailol
Copy link
Author

I really appreciate your quick updates on this today.
Is it possible to develop this module such that it can run on final output just like substitution output filter module? Not sure I may be wrong :) .

I would like to use that new feature "method=copy" if you would develop(add) this parameter and it can solve this issue.

I need one small help. This module replaces single/multiple line breaks with single ' ' (space) character. I would like to make a small change here. I want to replace ONLY "multiple line breaks" with "single line break". Would you suggest me what should be changed in the "ngx_http_trim_filter_module.c" file?

Thanks

@yaoweibin
Copy link
Member

On 2013/12/31 21:45, makailol wrote:

I really appreciate your quick updates on this today.
Is it possible to develop this module such that it can run on final
output just like substitution output filter module? Not sure I may be
wrong :) .

I would like to use that new feature "method=copy" if you would
develop(add) this parameter and it can solve this issue.

I need one small help. This module replaces single/multiple line
breaks with single ' ' (space) character. I would like to make a small
change here. I want to replace ONLY "multiple line breaks" with
"single line break". Would you suggest me what should be changed in
the "ngx_http_trim_filter_module.c" file?

Thanks


Reply to this email directly or view it on GitHub
#368 (comment).

It could work with substitution module. Could you explain why you want
it work with echo module? Or do you have any special filter module?

Thanks
-YWB

@makailol
Copy link
Author

makailol commented Jan 1, 2014

I am not sure how does substitution module work but as it runs on final output so it could work with echo module too. So I just suggested if it is possible to make changes in trim module just like substitution module.

For some URI, I use echo subrequest directive.

Would you suggest me changes in the trim module for replacing "multiple line breaks" with "single line break" and to avoid replacing single line break with ' ' (space) character (as I mentioned in previous post)? I just want to trim just like pagespeed does.

@yaoweibin
Copy link
Member

On 2014/1/1 21:23, makailol wrote:

I am not sure how does substitution module work but as it runs on
final output so it could work with echo module too. So I just
suggested if it is possible to make changes in trim module just like
substitution module.

For some URI, I use echo subrequest directive.

Would you suggest me changes in the trim module for replacing
"multiple line breaks" with single line break and to avoid replacing
single line break with ' ' (space) character (as I mentioned in
previous post)? I just want to trim just like pagespeed does.


Reply to this email directly or view it on GitHub
#368 (comment).

Yes. We have the same target:-) .

Is there any problem when replacing all the linefeeds to be one space?

Weibin Yao
Thanks.

@makailol
Copy link
Author

makailol commented Jan 2, 2014

I don't have any test case where replacing all linefeeds could cause the issue right now. But I think this change should be easy to do in this trim module.

I would like to change this because it can make source more readable.

Would you suggest me changes in trim module c code to achieve this?

@makailol
Copy link
Author

makailol commented Jan 6, 2014

Hello,

Could anyone get time to check this yet?

@taoyuanyuan
Copy link
Contributor

I'm sorry, trim module can't work with echo module, it is different with substitution as cfsego says.
there is many changes in order to work with echo module,and it will use more memory.

About make source more readable, you can enter args "http_trim=off" to read the original http code.
e.g. http://ip/htmlTest?http_trim=off

@makailol
Copy link
Author

makailol commented Jan 6, 2014

Hi,

Thanks for reply. Yes I checked "http_trim=off" argument but I want below change in module with "http_trim=on".

This module replaces single/multiple line breaks with single ' ' (space) character. I would like to make a small change here. I want to replace ONLY "multiple line breaks" with "single line break".

Would you suggest me where should I make change in C file to achieve this?

Thanks

@taoyuanyuan
Copy link
Contributor

try this patch.

2014/1/6 makailol notifications@github.com

Hi,

Thanks for reply. Yes I checked "http_trim=off" argument but I want below
change in module with "http_trim=on".

This module replaces single/multiple line breaks with single ' ' (space)
character. I would like to make a small change here. I want to replace ONLY
"multiple line breaks" with "single line break".

Would you suggest me where should I make change in C file to achieve this?

Thanks


Reply to this email directly or view it on GitHubhttps://github.com//issues/368#issuecomment-31648521
.

@makailol
Copy link
Author

makailol commented Jan 7, 2014

Hello,

Where is the patch file? I can not find link.

@taoyuanyuan
Copy link
Contributor

diff --git a/src/http/modules/ngx_http_trim_filter_module.c b/src/http/modules/ngx_http_trim_filter_module.c
index 873626c..d769e0e 100644
--- a/src/http/modules/ngx_http_trim_filter_module.c
+++ b/src/http/modules/ngx_http_trim_filter_module.c
@@ -394,16 +394,8 @@ ngx_http_trim_parse(ngx_http_request_t *r, ngx_buf_t *buf,
             case '\r':
                 continue;
             case '\n':
-                if (ctx->first_line) {
-                    ctx->first_line = 0;
-                    break;
-
-                } else {
-                    *read = ' ';
-                }
-
                 ctx->state = trim_state_text_whitespace;
-                if (ctx->prev == ch || ctx->prev == ' ') {
+                if (ctx->prev == ch) {
                     continue;

                 } else {
@@ -1548,12 +1540,11 @@ ngx_http_trim_parse(ngx_http_request_t *r, ngx_buf_t *buf,
             case ' ':
                 continue;
             case '\n':
-                if (ctx->first_line) {
-                    ctx->first_line = 0;
-                    break;
+                if (ctx->prev == ch) {
+                    continue;

                 } else {
-                    continue;
+                    break;
                 }
             case '<':
                 ctx->state = trim_state_tag;

@makailol
Copy link
Author

makailol commented Jan 7, 2014

The given patch seems to be working but I have noticed that this trim module breaks the nested pre tag.

For example,
pre>
sometext
pre> 2nd line

         pre>
         pre>3rd line      </pre>
</pre>
</pre>

In above html code, this trim module removes blank line after "pre>2nd line" .
It also doesn't remove extra spaces in tag lines like below
'script SPACES language="JavaScript" SPACES type="text/javascript" >'
'style SPACES type SPACES = SPACES "text/css" >'

Unable to paste html so I removed '<' from starting tag.

@taoyuanyuan
Copy link
Contributor

Thank you very much.
it's a bug. I don't know 'pre' can be nested, I'm sorry for my carelessness.
I will fix this bug.

@makailol
Copy link
Author

makailol commented Jan 7, 2014

Thank you very much for quick updates.

Please also check another issue which I have mentioned in my previous post. The extra spaces in script and style tags(probably other tags too). I think we can remove extra spaces from any tag lines. For example spaces between two attributes and spaces after < and spaces before > .

Once you fix this bug, let us know I will test it.

@cfsego
Copy link
Member

cfsego commented Jan 7, 2014

First of all, thank you for your support.

Yet I have a doubt if 'pre' can really be nested, because it makes no use, i suppose.
Will you please give me some reference, or some case that someone really uses like this?

@makailol
Copy link
Author

makailol commented Jan 7, 2014

It is okay to nest the pre tag inside a div tag. It can be used for different formatting for different text.

If you want to format the pre text to display in a different font or color, you can easily define CSS styles for it by using the "pre" selector in a CSS rule. And that style is used with class="" attribute of pre tag.

I have also noticed that cloudflare minification preserves nested 'pre'.

@cfsego
Copy link
Member

cfsego commented Jan 7, 2014

we test it, and it's okay. so we will look into these two points:

  1. space within a tag should be minified
  2. support nested 'pre'

thank you

@makailol
Copy link
Author

makailol commented Jan 7, 2014

Thank you!

I will wait for updated version of module and it would be good if you can import the patch given by "taoyuanyuan" in default module.

Basically google page speed and cloudflare preserve single line break.

taoyuanyuan added a commit to taoyuanyuan/patch that referenced this issue Jan 14, 2014
@taoyuanyuan
Copy link
Contributor

https://github.com/taoyuanyuan/patch/blob/master/trim_140114.patch
try this patch.

  1. remove extra whitespaces in tag.
  2. support nest 'pre'.

@makailol
Copy link
Author

Hello,

First of all Thanks for all your efforts and work on this module. I really appreciate your quick updates on this.

I rebuilt my Nginx with the updated patch of trim module and tried to perform some tests. Out of two issues second one (support nest 'pre') seems to be fixed now.

For first one I have used below test html code :


<script
type="text/javascript" 
src="jsfile"></script>
<script 
type="text/javascript" 
src="jsfile"></script>
<script 
>str='testing';alert(str);
</script>

And result was as below :


<script
type="text/javascript" src="jsfile">
</script>
<script type="text/javascript" src="jsfile"></script>
<script >str='testing';alert(str);
</script>

In the result, first line break in the first script tag is not removed. And in last script tag there is extra space before '>'. I think we can remove spaced after starting < and before closing > in the tag. I'm not sure how difficult this would be. :)

I have just tested these cases with script tag but I think result will be same with other tags too.

Regards,

@taoyuanyuan
Copy link
Contributor

Trim replaces more than one spaces with a space, so.
this is my test case.

<script    type="text/javascript"   src="jsfile"></script>
<script    type="text/javascript"      src="jsfile"></script>
<script    >str='testing';    alert(str);

</script>

result:

<script type="text/javascript" src="jsfile"></script>
<script type="text/javascript" src="jsfile"></script>
<script >str='testing';    alert(str);

</script>

About "first line break in the first script tag is not removed", this is your suggest.

  • replace ONLY "multiple line breaks" with "single line break".

@makailol
Copy link
Author

Yes it replaces multiple spaces with single space.

But have you noticed the replacement of line breaks in tag line? The test case which I have posted in my previous post, you can see only first line break in first script tag is not remove which can be removed just like other line breaks in tag line(which are being removed already).

About space before > character , it is just my suggestion that we can remove it too but I am not sure if it is possible to implement in this module.

@taoyuanyuan
Copy link
Contributor

About space before > character, I have considered before , it's rare in the case, it's diffrent from code logic of "replaces multiple spaces with single space" and there is some trouble doing it, so I don't remove this single space specially.

I don't understand your test case. my result is diffrent with yours, I don't replace line break with space .
can you upload your test case on your github ? I go to download. ^_^

@makailol
Copy link
Author

To reproduce the case , you can add this code in your HTML.
https://gist.github.com/makailol/a313330beef2c61a4e05

This module removes the line break in tag lines with a space. This is proper way too. But with this test case you can see the first line break in the first (script)tag line is not being removed.

Current result is like https://gist.github.com/makailol/2d89d1edb962a49d621e
Result should be like this https://gist.github.com/makailol/e7b52cfc3d5e7dedb7d1

@makailol
Copy link
Author

I reinstalled Nginx with updated version of this module. New options are available and seems to be working.

I will do more testing on inline js and css trimming and if I find any issue, I will give you updates.

I have just noticed that in tag lines spaces around '=' can be removed if it is for attribute values.
i.e.
code :

 src    =    "some value"

Can be changed to :

 src="some value"

In current module it is already replacing multiple spaces with single space in this test code too so it is not a major thing if we can not remove space around '=' in tag lines.

@taoyuanyuan
Copy link
Contributor

download the code in my github again, I have update the code for you suggest.

code:

<pre    style  =  "color:   blue"  >Welcome    to    nginx!</pre  >
<pre    style=
    "color:blue"  >Welcome    to    nginx!</pre  >
<textarea   style 
  ="color:blue">Welcome    to    nginx!</textarea  >

result:

<pre style="color:   blue">Welcome    to    nginx!</pre>
<pre style="color:blue">Welcome    to    nginx!</pre>
<textarea style="color:blue">Welcome    to    nginx!</textarea>

this is my all test cases
https://github.com/taoyuanyuan/tengine/blob/master/tests/test-nginx/cases/trim.t

thank you, have a nice weekend.

@makailol
Copy link
Author

Good job!

I really appreciate this quick update again. Your quick updates motivates me for more testing!
I will build Nginx with latest code and do further testing.

Thanks & Have a nice weekend

@makailol
Copy link
Author

Hello,

I have built nginx with updated version of trim module and tested it. It seems like all issues which we had noticed are fixed in this update.

Is there any plan to merge this changes in tengine? So that we can always get updated version of this module from tengine?

It would be good if someone can add English version of the trim document.

Thanks for developing this open source module and providing support for fixing all issues and changing features as needed.

Regards,
Makailol

@taoyuanyuan
Copy link
Contributor

I'm changing the code for trim js and css these days, we will merge this changes in tengine this week.
English doc will be added try my best.
thank you , so many good ideas !

@makailol
Copy link
Author

Do you mean are you still making changes to trim module? If yes then can you just tell me what are those changes?

Tengine trim module will have all these changes available or will there be some differences?

Also please notify me (on this ticket) when changes are merged to tengine.

Thanks,
Makilol

@taoyuanyuan
Copy link
Contributor

remove space around some characters ( '{' '}' ',' ':' ';' '>' ) in css.

All changes will be merged to tengine this week.

cfsego pushed a commit that referenced this issue Jan 23, 2014
trim module feature and bugfix (#368)
@taoyuanyuan
Copy link
Contributor

hi, all changes have been merged to tengine.
And a simple English doc, thanks to cfsego!
https://github.com/alibaba/tengine/blob/master/docs/modules/ngx_http_trim_filter_module.md

@makailol
Copy link
Author

Thanks for information.

Thanks cfsego for English doc. Would it be possible to make it complete ? i.e. Chinese version contains more information about js and css trimming.

So far I have done some testing on inline js and css and all cases were passed though I would like to check how we trim inline js and css.

@cfsego
Copy link
Member

cfsego commented Jan 27, 2014

Oh, such a long list about trimming rules. I dont like it, though I know it is important.
It will be done after the Chinese Spring Festival, a long wait, I think :)

@makailol
Copy link
Author

Hi,

Would you check the possibility to use variable value to on|off trim module dynamically?

For example,
set $html_trim "on";
if (Some Condition) { set $html_trim "off" }
trim $html_trim;
trim_js $html_trim;
trim_css $html_trim;

Currently it gives error on nginx restart : invalid value "$html_trim" in "trim" directive, it must be "on" or "off" .

Thanks,
Makailol

@makailol
Copy link
Author

Hello!
Could anyone check this possibility to use variable value to on|off trim module dynamically yet?

@yaoweibin
Copy link
Member

Thank you @makailol . All of us are having Chinese New Year holiday. @taoyuanyuan will track your issue.

@taoyuanyuan
Copy link
Contributor

Happy New Year!
I have updated doc for trim rule, It will be merged to tengine soon.

Generally, nginx don't set boolean value(on/off) as variable value, I don't recommend do it.
But I can modify code support 'if' context to achieve a similar result.
For example, (default trim off)
location / {
if (Some Condition) {
trim on;
trim_js on;
trim_css on;
}
}

is this ok?

@makailol
Copy link
Author

makailol commented Feb 7, 2014

Happy New Year 😄

Thanks for updating doc.

I tried below block to set and test the boolean value(on/off) as variable value and it works fine too.


location = /varTest {
    default_type 'text/plain';
    set $html_trim "Off";
    echo $html_trim;
}

I usually avoid using "if" in nginx (because http://wiki.nginx.org/IfIsEvil), but I am not sure if it was the case with older Nginx version.

It would be good to have support of using variable value (on/off) with trim. Additionally it would be best if you can add support for 'if' context as you mentioned. :)

Thanks,

@taoyuanyuan
Copy link
Contributor

ok
waitting for next update last week.

@makailol
Copy link
Author

Thanks for accepting this change request.

I have tested this with variable value today and it is working fine.
Have you already made changes to support 'if' context?

Please let me know once after you merge this change to tengine.

@cfsego
Copy link
Member

cfsego commented Feb 11, 2014

I believe that enable the trimming function in "if" context is duplicate to enable it with variable, so I think we will not do this.

@yaoweibin
Copy link
Member

Could we add a directive like trim_bypass, just similar like proxy_cache_bypass.

@makailol
Copy link
Author

@taoyuanyuan, are you going to make changes suggested by yaoweibin?

Please let me know when you merge the changes made to support variable value in trim directive into tengine.

@makailol
Copy link
Author

Hello,

I have noticed an issue with assigning variable value to trim_js directive.

While assigning value dynamically from "on" to "off", it works but it sets trim_js "off" permanently until nginx is restarted. So it can not be "on" dynamically.

For example,


location /example.html {
    root /path/;
    set $html_trim 'on';
    set $js_trim $arg_js;
    set $css_trim 'on';
    trim $html_trim;
    trim_js $js_trim;
    trim_css $css_trim;
}

URL to access : www.example.com/example.html?js=on

Changing js GET parameter from "on" to "off" works but it can not be "on" again.

Thanks,
Makailol

@taoyuanyuan
Copy link
Contributor

hi, this bug has be fixed last week, you can update code in my github

@makailol
Copy link
Author

Hi,
I could not find any recent commit in your github for below repository
https://github.com/taoyuanyuan/ngx_http_trim_filter_module
Can you give me exact link from where I can download code?
Thanks,

@taoyuanyuan
Copy link
Contributor

@taoyuanyuan
Copy link
Contributor

hi, these changes have merged to tengine.

@yaoweibin
Copy link
Member

If you still have the problem, you can reopen this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants