Tidy 5.7.16 -> empty result #740

Closed
ogolovanov opened this issue May 31, 2018 · 2 comments

Comments

@ogolovanov

Hello.

I don't know how to reopen my previous ticket, so I created a new one.
Please take a look.
These things make me nervous :)

root@6cf6ead63be0:/app# php -v

PHP 7.2.3-1ubuntu1 (cli) (built: Mar 14 2018 22:03:58) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.2.0, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.2.3-1ubuntu1, Copyright (c) 1999-2018, by Zend Technologies

root@6cf6ead63be0:/app# php -r 'phpinfo();' | grep -i tidy

/etc/php/7.2/cli/conf.d/20-tidy.ini,
tidy
Tidy support => enabled
libTidy Version => 5.7.16
libTidy Release => 2018/04/27
tidy.clean_output => 0 => 0
tidy.default_config => no value => no value

test code:

$content = file_get_contents('/tmp/1.html');
$tidy = new tidy();
$tidy->parseString($content, array());
var_dump($tidy->cleanRepair());
var_dump(strlen($content), strlen($tidy->value));

result:

bool(true)
int(56467)
int(0)

File uploaded at: https://seo-utils.ru/storage/get_document/39b732df607ef1eb490a5dd1ee324e82/?download=1
Thanks.

@geoffmcl
Contributor

@ogolovanov thanks for the issue...

That tidy_test.html has errors so no html will be output... so your result is expected...

Add --force-output yes to your config, and you should get some output...
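For the PHP binding used in the test code above, the same option can probably be passed through the configuration array given to parseString(). The following is only a minimal sketch under that assumption: the helper name tidyRepair and the sample markup are hypothetical, and the unknown <foo> element is there only to provoke a Tidy error. It also returns the error count and diagnostic log, which may help explain why the output was empty.

```php
<?php
// Hypothetical helper (not part of the original report): repair markup
// and return the output together with Tidy's diagnostics.
function tidyRepair(string $html): array
{
    $tidy = new tidy();

    // 'force-output' is the Tidy option suggested in this thread; the PHP
    // extension accepts Tidy options as an associative array.
    $tidy->parseString($html, ['force-output' => true], 'utf8');
    $tidy->cleanRepair();

    return [
        'errors'   => tidy_error_count($tidy),    // errors found while parsing
        'warnings' => tidy_warning_count($tidy),  // warnings found while parsing
        'log'      => (string) $tidy->errorBuffer, // human-readable diagnostics
        'html'     => tidy_get_output($tidy),     // repaired markup
    ];
}

// Sample input chosen for illustration only; replace it with
// file_get_contents('/tmp/1.html') to reproduce the report.
$result = tidyRepair('<foo>text</foo><p>unclosed paragraph');

echo "errors: {$result['errors']}, warnings: {$result['warnings']}\n";
echo $result['log'], "\n";
echo strlen($result['html']), " bytes of output\n";
```

If the error count is non-zero and the output is empty without 'force-output', that matches the behaviour described here.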

And I hope the web site adopts a better generator of HTML pages... It has no DOCTYPE, so tidy will choose HTML5... it has lots of trivial warnings... it is still using legacy <table> elements for layout, where it should use CSS... tidy does its best to render a clean document, but only if you add --force-output yes...

HTH

Regards, Geoff.

@ogolovanov
Author

Heh, thank you.
I was not expecting this problem, because everything works with tidy 5.4.x but not with 5.6.x and later.
Do I need to set any other config options for badly formatted input HTML?

@geoffmcl geoffmcl added this to the 5.7 milestone Jun 1, 2018