[Question] How to use tidy for multiple files? #668

Closed
Kristinita opened this issue Jan 21, 2018 · 7 comments
Comments

@Kristinita

1. Briefly

I can't find how to modify multiple files in multiple folders using HTML Tidy.

2. Structure

Example site structure:

output
    SashaTest
        SashaTest.html
    TidyTest
        TidyTest.html

In my real project I have more than 100 HTML files.

3. Expected behavior

Tidy modifies both SashaTest.html and TidyTest.html.

It works for me for a single file:

D:\SashaPelicanTest>tidy -mq "output/TidyTest/TidyTest.html"
line 12 column 16 - Warning: trimming empty <span>

4. Actual behavior

I can't use glob patterns. Examples:

D:\SashaPelicanTest>tidy -mq "output/**/*.html"
Error: Can't open "output/**/*.html"
D:\SashaPelicanTest>tidy -mq *.html
Error: Can't open "*.html"

5. Did not help

I use Grunt to build my site. grunt-htmltidy supports multiple files, but the plugin is buggy and possibly no longer maintained; see #6.

Thanks.

@geoffmcl
Contributor

@Kristinita you are correct, console tidy does not support glob patterns, but as tidy -h points out, tidy does support multiple input HTML files on the command line -

tidy [options...] [file...] [options...] [file...]

But that would be a very long command line if you have more than 100 files... and on Windows there may be a limit on the maximum command-line length... but it should work...

Alternatively, since you appear to be on Windows, why not write a batch file, using for %%i in (*.html) do tidy -mq %%i. Running for /? will show the syntax...

Or you could make a list file of all HTML files, like dir /s /b output\*.html > templist.txt...

Then again use a batch file with something like for /F %%i in (templist.txt) do tidy -mq %%i...

You may need to add double quotes if the file paths or names include spaces...

There are just so many options using batch files...

And a word of warning: while I have considerable confidence in tidy, be aware that the -m (modify) option will overwrite the existing file with the tidied version, and there is no going back if tidy changes something you do not want changed, or makes a change you do not like... you should make a backup of ALL files first...

And such a batch file system could even copy the original to a safe place first...
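For readers on a Unix-like shell, the back-up-first idea might be sketched roughly as below. This is only an illustration: the function name `tidy_with_backup`, the `TIDY` variable, and the `.bak` suffix are assumptions made for the example, not anything tidy provides.

```shell
#!/bin/sh
# Sketch: copy each file aside before tidy -m overwrites it.
# TIDY, tidy_with_backup and the .bak suffix are hypothetical names.
TIDY=${TIDY:-tidy}

tidy_with_backup() {
    for f in "$@"; do
        # Keep an untouched copy next to the original
        cp -p "$f" "$f.bak" || return 1
        status=0
        "$TIDY" -mq "$f" || status=$?
        # tidy exits 1 on warnings and 2 on errors; stop only on errors
        if [ "$status" -ge 2 ]; then
            echo "tidy reported errors in $f" >&2
            return "$status"
        fi
    done
    return 0
}
```

Calling `tidy_with_backup output/TidyTest/TidyTest.html` would leave `TidyTest.html.bak` beside the tidied file, so an unwanted change can be undone by copying the backup over it.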

Also note you can set HTML_TIDY=d:\path\tidy-def.conf in the environment, and tidy will use this config file for each file processed...

HTH...

@geoffmcl geoffmcl added this to the 5.7 milestone Jan 21, 2018
@Kristinita
Author

@geoffmcl , thanks for the answer.

A batch file is a Windows-specific solution, not usable on UNIX.

Is there a cross-platform solution, for both Windows and UNIX?

Thanks.

@balthisar
Member

This isn't really a Unix or DOS support site, but try man for in Unix to learn how to do things multiple times in POSIX environments. The Unix philosophy favors composability as opposed to monolithic design, meaning that your shell provides tools such as for, xargs, and others that can address these needs. It would be highly inappropriate for Tidy to support "globs" directly.
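As a rough illustration of that composability, a POSIX shell could let `find` locate the files and hand them to tidy in batches. The `-exec ... {} +` form used here plays the same role as piping into `xargs`; the function name `tidy_tree` and the `TIDY` variable are assumptions made for the sketch.

```shell
#!/bin/sh
# Sketch: find walks the tree, tidy only ever sees explicit file names.
# TIDY and tidy_tree are hypothetical names used for illustration.
TIDY=${TIDY:-tidy}

tidy_tree() {
    # "{} +" passes many paths per tidy invocation, much like xargs,
    # and keeps paths containing spaces intact
    find "$1" -type f -name '*.html' -exec "$TIDY" -mq {} +
}
```

`tidy_tree output` would then process every `.html` file under `output`. Because tidy exits non-zero on warnings, `find` is likely to report a non-zero status as well even when the files were rewritten.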

@geoffmcl
Contributor

geoffmcl commented Feb 2, 2018

@Kristinita yes, my answer suggested using batch files since the example you gave, D:\SashaPelicanTest>tidy -mq *.html, indicated Windows usage...

Had it been in Unix, the normal Unix shell will already expand a command like $ tidy -mq *.html to a list of files matching the glob, if any exist in the current directory... and/or you can use shell scripting...

The important issue is that console tidy has no built-in ability to expand globs, and as @balthisar points out, it would be "inappropriate for Tidy to support globs directly"... that must be done outside tidy...

As mentioned it does accept multiple files on its command line... $ tidy [options] path/file1.html path/file2.html another/path/file3.html ... and so on...
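A minimal sketch of that division of labour in POSIX sh follows: the shell expands the pattern and tidy receives plain file names. The names `tidy_dir` and `TIDY` are made up for the example. The existence check matters because a POSIX shell leaves an unmatched pattern as literal text, which tidy would then fail to open, much like the error in the original question.

```shell
#!/bin/sh
# Sketch: the shell, not tidy, turns the glob into a list of files.
# TIDY and tidy_dir are hypothetical names used for illustration.
TIDY=${TIDY:-tidy}

tidy_dir() {
    # Expand DIR/*.html into this function's positional parameters
    set -- "$1"/*.html
    # With no match the pattern stays literal, so skip the tidy call
    if [ ! -e "$1" ]; then
        echo "no .html files found" >&2
        return 0
    fi
    "$TIDY" -mq "$@"
}
```

This covers a single directory only. In bash 4 or later, `shopt -s globstar` makes a pattern such as `output/**/*.html` recurse into subdirectories, but that is a bash feature rather than plain sh.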

HTH...

@geoffmcl
Contributor

2020-11-23: @Kristinita, the question seems to have been asked and answered, so closing this...

Please provide further feedback if there is still some outstanding question... thanks...

@placoderm

For anyone else who is trying to do this, ChatGPT gave me this to work on Windows Command Prompt:

@echo off
setlocal enabledelayedexpansion

set "tidy_command=tidy --tidy-mark 0 -m"

for %%f in (*.html) do (
    echo Processing file: %%f
    !tidy_command! "%%f"
)

endlocal

I improved it by adding --quiet yes, which stops all the verbiage about improving Tidy, etc.

@Kristinita
Author

Type: Solution

1. Cross-platform commands for Windows and UNIX

Use fd for running Tidy recursively.

For example, I want to modify and then check all .html files in the output directory recursively.

The command for modifying files:

fd . output --extension html --exec tidy -config tidy.conf -modify || true

The command for validating files:

fd . output --extension html --exec tidy -config tidy.conf --markup-no

2. Commands explanation

  1. || true is a cross-platform way to force exit code 0. Tidy returns a non-zero exit code whenever it detects markup that triggers a warning. I’m not sure that it’s the best behavior.

    If I run eslint --fix, ESLint fixes problems and then lints the files. If ESLint fixed all problems, it returns exit code 0; if any problems remain, it returns exit code 1. That is, ESLint’s exit code depends on the files after modification, not on the files before it. Possibly, behavior like ESLint’s would be better for Tidy. If Tidy development continues, I’ll open a new issue.

  2. If I use the --markup-no CLI argument together with -modify, Tidy does not modify files. It looks like a bug, and if Tidy development continues, I’ll send a bug report.
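One possible refinement of `|| true`, sketched here for a POSIX shell only (so it is not a cmd.exe solution): tidy's documented exit statuses are 0 for success, 1 for warnings, and 2 for errors, and `|| true` hides all of them. The wrapper below accepts warnings but still reports errors. The names `tidy_allow_warnings` and `TIDY` are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: treat tidy's "warnings" status as success, keep real errors.
# TIDY and tidy_allow_warnings are hypothetical names.
TIDY=${TIDY:-tidy}

tidy_allow_warnings() {
    status=0
    "$TIDY" "$@" || status=$?
    # 0 = clean, 1 = warnings only, 2 = errors
    if [ "$status" -le 1 ]; then
        return 0
    fi
    return "$status"
}
```

Saved as a small wrapper script, something like this could stand in for the bare `tidy` after `--exec`, so that a run fails only when tidy reports actual errors.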

Thanks.
