img alt text is included in p-name #54

Closed
aaronpk opened this issue Jan 12, 2018 · 2 comments

aaronpk commented Jan 12, 2018

This issue is part of #52.

The parser includes the alt text of img tags in the parsed name value, and I have no way to remove it from that value. As a result, XRay thinks this is a named post when it tries to dedupe the content and name values.

HTML

<html>
  <head>
    <title>Test</title>
  </head>
  <body class="h-entry">
    <p class="e-content p-name">This is a photo post with an <code>img</code> tag inside the content. <img class="u-photo" src="http://target.example.com/photo.jpg" alt="a photo"></p>
  </body>
</html>

mf2 json

        {
            "type": [
                "h-entry"
            ],
            "properties": {
                "name": [
                    "This is a photo post with an img tag inside the content. a photo"
                ],
                "photo": [
                    "http://target.example.com/photo.jpg"
                ],
                "content": [
                    {
                        "html": "This is a photo post with an <code>img</code> tag inside the content. <img class=\"u-photo\" src=\"http://target.example.com/photo.jpg\" alt=\"a photo\">",
                        "value": "This is a photo post with an img tag inside the content. a photo"
                    }
                ]
            }
        }

https://pin13.net/mf2/?id=20180112184043608

This would be solved by microformats/microformats2-parsing#16

aaronpk commented Jan 12, 2018

This is solved in 66adfbe by XRay doing its own plaintext conversion of the HTML after first using the parsed mf2 to dedupe name/content.
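
The two-step approach described above could look roughly like the following. This is only an illustrative Python sketch, not XRay's actual code (XRay's implementation in 66adfbe is not reproduced here). The names `html_to_text`, `normalize`, and `simplify_entry` are hypothetical, and the sketch assumes the entry has a single `content` value. It also leaves the HTML untouched rather than removing the `img` element.

```python
from html.parser import HTMLParser


class PlainText(HTMLParser):
    """Collects text nodes only, so attributes such as img alt are ignored."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Hypothetical helper: plaintext conversion that never emits alt text."""
    parser = PlainText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace and trim both ends.
    return " ".join("".join(parser.parts).split())


def normalize(text):
    """Hypothetical helper: whitespace-insensitive form used for comparison."""
    return " ".join(text.split())


def simplify_entry(item):
    """Hypothetical helper: dedupe name/content first, then rebuild the text."""
    props = item["properties"]
    content = props["content"][0]  # assumption: exactly one content value
    result = {"type": "entry"}

    # Step 1: dedupe with the values exactly as the mf2 parser produced them.
    # Both still contain the alt text here, so they compare equal.
    names = props.get("name", [])
    if names and normalize(names[0]) != normalize(content["value"]):
        result["name"] = names[0]

    if "photo" in props:
        result["photo"] = props["photo"]

    # Step 2: only now derive the plaintext from the HTML ourselves.
    result["content"] = {
        "text": html_to_text(content["html"]),
        "html": content["html"],
    }
    return result
```

The ordering is the point of the sketch: comparing name and content before the custom plaintext conversion means both sides still carry the same alt text, so the comparison is not thrown off by it.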

aaronpk closed this as completed Jan 12, 2018

aaronpk commented Jan 12, 2018

Now parsed as:

{
    "type": "entry",
    "photo": [
        "http://target.example.com/photo.jpg"
    ],
    "content": {
        "text": "This is a photo post with an img tag inside the content.",
        "html": "This is a photo post with an <code>img</code> tag inside the content."
    }
}

aaronpk removed the blocked label Jan 12, 2018