
Parse error #20

Closed
sagan opened this issue Nov 4, 2013 · 4 comments

sagan commented Nov 4, 2013

Hi, I'm running into a problem when parsing HTML that contains certain elements:

HTML:

<!DOCTYPE html>
<html>
<body>
<div>
<table style="width: 520px; height: 361px;" border="1px solid">
<tbody>
        <tr>
                <td>a</td>
                <td>b</td>
                <td>c</td>
                <td>d</td>
                <td>d</td>
                <td>f</td>
        </tr>
</tbody>
</table>
</div>
</body>
</html>

PHP:

require_once(__DIR__ . "/vendor/autoload.php");

$html = file_get_contents("1.html");
$dom = HTML5::loadHTML($html); //DOMDocument
echo HTML5::saveHTML($dom);

The result I get is wrong:

<html><body>
<div>
<table style="width: 520px; height: 361px;" border="1px solid"></table>
<tbody></tbody>
        <tr></tr>
                <td></td>a
                <td></td>b
                <td></td>c
                <td></td>d
                <td></td>d
                <td></td>f



</div>
</body>

</html>

Parsing the same test HTML file with DOMDocument::loadHTML works correctly.
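For comparison, here is a minimal sketch of the DOMDocument baseline mentioned above. It uses only PHP's built-in DOM extension, not the HTML5 library. The reduced two-cell table, the variable names, and the XPath check are illustrative assumptions, not part of the original report.

```php
<?php
// Baseline check with PHP's built-in DOMDocument (libxml), not the HTML5 library.
// The markup is a shortened version of the table from the report.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<body>
<div>
<table border="1">
<tbody>
        <tr>
                <td>a</td>
                <td>b</td>
        </tr>
</tbody>
</table>
</div>
</body>
</html>
HTML;

libxml_use_internal_errors(true); // collect libxml warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Count <td> cells that sit inside table > tbody > tr, versus all <td> cells.
$nested = $xpath->query('//table/tbody/tr/td')->length;
$total  = $xpath->query('//td')->length;

echo "nested: $nested, total: $total\n";
```

If the two counts match, every cell is still nested inside its row. The same XPath check could be run against the document returned by the HTML5 parser to confirm the flattening shown in the output above.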

@mattfarina
Member

Looking into this. Stay tuned.

@mattfarina
Member

I've verified this is happening.

@mattfarina
Member

To add a little context, the parser is reporting the following errors:

[errors] => Array
        (
            [0] => Line 0, Col 0: Could not find closing tag for td
            [1] => Line 0, Col 0: Could not find closing tag for td
            [2] => Line 0, Col 0: Could not find closing tag for td
            [3] => Line 0, Col 0: Could not find closing tag for td
            [4] => Line 0, Col 0: Could not find closing tag for td
            [5] => Line 0, Col 0: Could not find closing tag for td
            [6] => Line 0, Col 0: Could not find closing tag for tr
            [7] => Line 0, Col 0: Could not find closing tag for tbody
            [8] => Line 0, Col 0: Could not find closing tag for table
        )

mattfarina added a commit that referenced this issue Nov 4, 2013
…he element to use as the current element. This caused the current parser element to get messed up.
@mattfarina
Member

10e129b fixes this issue.
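
The commit message above is truncated, so the exact change is not visible here. As a rough illustration only, the toy tree builder below (hypothetical code, not taken from html5-php) shows how a parser that never moves its "current element" pointer into a newly opened element would produce the kind of flattened output reported in this issue. The `build` function, its `$descend` flag, and the token format are all assumptions made for this sketch.

```php
<?php
// Toy tree builder (hypothetical, not html5-php code) showing why the
// "current element" pointer matters when handling start and end tags.
function build(array $tokens, bool $descend): string
{
    $dom = new DOMDocument();
    $root = $dom->appendChild($dom->createElement('root'));
    $current = $root;

    foreach ($tokens as [$type, $value]) {
        if ($type === 'start') {
            $el = $current->appendChild($dom->createElement($value));
            if ($descend) {
                $current = $el; // correct: the new element becomes current
            }
        } elseif ($type === 'text') {
            $current->appendChild($dom->createTextNode($value));
        } elseif ($type === 'end' && $current->nodeName === $value) {
            $current = $current->parentNode; // pop back to the parent
        }
        // An end tag that does not match the current element is ignored here;
        // a real parser would record an error at this point.
    }

    return $dom->saveXML($root);
}

$tokens = [
    ['start', 'table'], ['start', 'tr'], ['start', 'td'],
    ['text', 'a'],
    ['end', 'td'], ['end', 'tr'], ['end', 'table'],
];

echo build($tokens, true), "\n";  // nested tree
echo build($tokens, false), "\n"; // flattened siblings, text outside <td>
```

With `$descend` set to `false`, every element ends up as an empty sibling and the text lands next to its cell rather than inside it, which resembles the `<td></td>a` pattern in the reported output.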
