Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-export-config creates invalid configuration file #679

Closed
AnrDaemon opened this issue Feb 17, 2018 · 9 comments
Closed

-export-config creates invalid configuration file #679

AnrDaemon opened this issue Feb 17, 2018 · 9 comments
Labels
Milestone

Comments

@AnrDaemon
Copy link

HTML Tidy 5.6 (windows build)

_config="$( mktemp --tmpdir XXXXXX.tidy )"
test "$_config" || exit 1
trap "rm '$_config'" EXIT HUP INT QUIT ABRT TERM
HTML_TIDY=
tidy -xml --new-blocklevel-tags "b1 b2" --new-empty-tags e1,e2 --new-inline-tags i1,i2 --new-pre-tags p1,p2 --priority-attributes "a1 a2" -export-config > "$_config"
grep -E "^(new\-[a-z]+\-tags|priority\-attributes|):" "$_config"
tidy -xml -config "$_config" -export-config | diff -bd "$_config" -

Sample output

new-blocklevel-tags: b2
: b1
new-empty-tags: e2
: e1
new-inline-tags: i2
: i1
new-pre-tags: p2
: p1
priority-attributes:
14c14
< css-prefix: c
---
> css-prefix: c-
56d55
< : b1
58d56
< : e1
60d57
< : i1
62d58
< : p1
@geoffmcl
Copy link
Contributor

@AnrDaemon thanks for this issue...

In a quick test using multiple values like --new-blocklevel-tags "b1 b2", with latest 5.7.3, the output service printOptionExportValues will incorrectly output two lines -

new-blocklevel-tags: b2
: b1

This is a bug!

The code -

            TidyIterator pos = tidyOptGetDeclTagList( tdoc );
            while ( pos )
            {
                d->def = tidyOptGetNextDeclTag(tdoc, optId, &pos);
                if ( pos )
                {
                    printf( "%s: %s\n", d->name, d->def );
                    d->name = "";
                    d->type = "";
                }
            }

will output the first line above, obviously with a newline... and as you can see the d->name is cleared... the next call to tidyOptGetNextDeclTag will get the 2nd value, b1 in this case, but will clear pos...

The code will fall down to -

    /* fix for http://tidy.sf.net/bug/873921 */
    if ( *d->name || *d->type || (d->def && *d->def) )
    {
        if ( ! d->def )
            d->def = "";
        printf( "%s: %s\n", d->name, d->def );
    }

which will now generate the 2nd line with no name, so we get just : d1\n... it seems in this case the d->name should not be cleared... or something...

If you add 3 values ``--new-blocklevel-tags "b1 b2 b3"`, then you will get 3 lines, etc...

new-blocklevel-tags: b3
: b2
: b1

I would expect new-blocklevel-tags: b3 b2 b1\n, or maybe at least -

new-blocklevel-tags: b3
new-blocklevel-tags: b2
new-blocklevel-tags: b1

So we are definitely in need of a fix for this service...

Comments, patches, PR very welcome... thanks...

@geoffmcl geoffmcl added the Bug label Feb 17, 2018
@geoffmcl geoffmcl added this to the 5.7 milestone Feb 17, 2018
@AnrDaemon
Copy link
Author

There's two more issues. css-prefix is changed and the priority-attributes (the only other list option) is not saved at all.

@geoffmcl
Copy link
Contributor

@AnrDaemon yes, thanks for pointing out the additional items...

The css-prefix is strange, in that the option setter, namely ParseCSS1Selector, deliberately adds, appends that -, with the code and comment - buf[i++] = '-'; /* Make sure any escaped Unicode is terminated */...

Actually the full code and comment is -

    buf[i++] = '-';  /* Make sure any escaped Unicode is terminated */
    buf[i] = 0;      /* so valid class names are generated after */
                     /* Tidy appends last digits. */

    SetOptionValue( doc, option->id, buf );

One wonders about the need for this addition... there is already a value validator, IsCSS1Selector, which tests the string input, and has a big clear comment over it, which makes it clear things like /* ab\555\444 is 4 chars {'a', 'b', \555, \444} */ is allowed and validated...

But this setter would automatically add a 5th - character! ALWAYS!! WHY???

Either the setter should not do this, or when getting the value it is removed... probably need to look closely at where tidy fetches this value, and uses it to generate CSS... Lots to ponder here...

Also why is there no value output for --priority-attributes "id name"? On reading that config, the value is definitely set to "id, name", but in output printOptionExportValues the d->def is never set!

It seems there must be another switch case added, case TidyPriorityAttributes: to fetch the value! Like there is in the ParseList() service, DeclareListItem. It uses its own DefinePriorityAttribute service...

There exists a tidyOptGetPriorityAttrList/getPriorityAttrList service but it seems it is never called...

Finally there may be another bug, and that is the order of such multiple string items...

A config of --new-blocklevel-tags "b1 b2 b3" will output them as b3 b2 b1, and this was true even back at the last cvs tidy release 2009... all those years ago... we did not have -export-config then but we did have --show-config so, for example ...

$ tidy-cvs -v
HTML Tidy for Windows released on 25 March 2009
$ tidy-cvs --new-blocklevel-tags "b1 b2 b3" -show-config
[skipped]
new-blocklevel-tags         Tag names  b3
                                       b2
                                       b1

So unless that order is corrected the following will fail

$ tidy --new-blocklevel-tags "b1 b2 b3" -export-config > tempcfg1.txt
$ tidy -config tempcfg1.txt -export-config > tempcfg2.txt
$ diff -u tempcfg1.txt tempcfg2.txt

This would show up this different order bug, that would remain even when the output is fixed, unless something is done about reversing the output order at the same time...

So there are certainly a number of bugs here in this -export-config option... hopefully all solvable...

As stated comments, patches, PR very welcome... I too will experiment... thanks...

geoffmcl added a commit that referenced this issue Feb 19, 2018
@geoffmcl
Copy link
Contributor

@AnrDaemon this intrigued me, so I went on to create an issue-679 branch... and coded some changes...

The full diff is commit ea4ae0d

So I built a HTML Tidy for Windows version 5.7.3.I679 and ran some tests...

First there was no change in the Regression tests, but this just means maybe there is no tests that tests this output, but it is still a good thing...

Then I ran -

$ tidy --new-blocklevel-tags "b1 b2 b3" --css-prefix d --priority-attributes "id name" -export-config 1>tempcfg1.txt
$ tidy -config tempcfg1.txt -export-config  1>tempcfg2.txt
$ diff -u tempcfg1.txt tempcfg2.txt

and there were no differences...

Some relevant output in the tempcfg1.txt -

css-prefix: d
new-blocklevel-tags: b1 b2 b3
priority-attributes: id name

This is a first cut fix, and more testing must be done...

It would be appreciated if users could checkout the issue-679 branch, build, test and report... thanks...

@geoffmcl
Copy link
Contributor

Commit bb7c494 adds display of --priority-attributes "id name" to the -show-config output, but the order of the --new-*-tags has been left inverted. The order does not seem so important to this simple -show-config output... eg...

$ tidy --new-blocklevel-tags "b1 b2 b3" --css-prefix d --priority-attributes "id name" -show-config
...
css-prefix                  String     d
new-blocklevel-tags         Tag Names  b3                                      
                                       b2                                      
                                       b1
priority-attributes         Attribute  id                                      
                                       name

As indicated, this does not seem a problem...

Also it is noted -xml-config does not, in fact has never, shown the current values - it will always just show <default /> - but will leave this is a separate issue, if it is one...

The only test remaining is to setup and try say --css-prefix "ab\555\444" and see what is done...

Look forward to any other user tests of this issue-679 branch... thanks...

@geoffmcl
Copy link
Contributor

Concerning css-prefix, first the default value is just the letter c, and when generating CSS, due to the -clean option, on the sample in_679.html tidy will generate the likes of -

 span.c3 {color: blue; background: yellow}
 span.c2 {color: green; background: yellow}
 span.c1 {font-size: 120%}

Now when you set the prefix like --css-prefix d, in 2009 tidy-cvs you would get -

 span.d-3 {color: blue; background: yellow}
 span.d-2 {color: green; background: yellow}
 span.d-1 {font-size: 120%}

Maybe this looks better, but now that I have removed this auto addition of the - in the option setter it will be back to like the default case -

 span.d3 {color: blue; background: yellow}
 span.d2 {color: green; background: yellow}
 span.d1 {font-size: 120%}

So what about the strange case mention in the source code comment, namely --css-prefix "ab\555\444". Again 2009 tidy-cvs perhaps looks better with -

 span.ab\555\444-3 {color: blue; background: yellow}
 span.ab\555\444-2 {color: green; background: yellow}
 span.ab\555\444-1 {font-size: 120%}

But on the other hand this modified tidy 5.7.3.I679 is not wrong with -

 span.ab\555\4443 {color: blue; background: yellow}
 span.ab\555\4442 {color: green; background: yellow}
 span.ab\555\4441 {font-size: 120%}

AND the HTML generated with the likes of <li><span class="ab\555\4441">y2</span></li> passes the W3C validation, so it seems the ab\555\4441 string is a valid class name...

Passing the output through the validator pointed out another small thing, that tidy generates a style element with an attribute, <style type="text/css">, to which the validator issues a warning -

Warning: The type attribute for the style element is not needed and should be omitted.

But will leave that as a separate issue...

In summary, as far as I can tell the removal of the auto addition of a - character is cosmetic only. If a user really wants it back then they can add it to the css-prefix "d-" option...

At the moment, unless I think of something else to try, or others have a suggestion, this concludes my testing of this issue-679 branch, and will get around to setting up a PR to merge it...

As stated, look forward to any other user tests of this issue-679 branch... thanks...

@AnrDaemon
Copy link
Author

According to https://drafts.csswg.org/css-syntax-3/#tokenization (if I read it correctly), the parser is right. So, no surprize here.

@geoffmcl geoffmcl mentioned this issue Mar 15, 2018
@geoffmcl
Copy link
Contributor

PR #695 created...

geoffmcl added a commit that referenced this issue Apr 13, 2018
@geoffmcl
Copy link
Contributor

@AnrDaemon this fix is now in next... thanks...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants