Order of HTML attributes in a content of feed entry is unstable #100

AndreyMZ opened this issue May 3, 2017 · 0 comments

AndreyMZ commented May 3, 2017

Problem description

feedparser neither preserves the original order of HTML attributes in the content of a feed entry nor emits them in a deterministic order.
As a result, the content value that feedparser returns for the same feed entry can differ from one run to the next.

Impact

It is not possible to use a hash of the content to detect whether the content has changed over time.

Steps to reproduce

Run the test:

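import subprocess
import sys
import textwrap
import unittest

# Script run in two separate interpreter processes; it prints the content of one fixed entry.
code = textwrap.dedent('''\
    import feedparser
    parsed = feedparser.parse('https://savannah.gnu.org/news/atom.php?group=tar')
    entry = next(filter((lambda entry: entry.id == 'http://savannah.gnu.org/forum/forum.php?forum_id=8545'), parsed.entries))
    content_value = entry.content[0].value
    print(content_value)
''')

class MyTestCase(unittest.TestCase):
    def test(self):
        # sys.executable runs the child under the same interpreter as the test itself.
        res1 = subprocess.check_output([sys.executable, '-c', code])
        res2 = subprocess.check_output([sys.executable, '-c', code])
        # The same entry should yield byte-identical output on every run.
        self.assertEqual(res1, res2)

if __name__ == '__main__':
    unittest.main()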

Actual result

AssertionError: b'<p>[2158 chars]nput readonly="readonly" class="verbatim" size[2149 chars]\r\n' != b'<p>[2158 chars]nput value="     OLDNAME NEWNAME[:NEWID] " cla[2149 chars]\r\n'

Expected result

The test passes.

Possible solution

Either preserve the original order of HTML attributes in the content of a feed entry, or emit them in a deterministic order (for example, sorted by name).
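
Until the order is stable, a caller-side workaround is to canonicalize the content before hashing it. The following is only a minimal sketch using the standard library's html.parser, not part of feedparser: the names _AttributeSorter and stable_content_hash are hypothetical, and the re-serialization is assumed to be good enough for change detection rather than a faithful HTML round trip.

```python
import hashlib
from html import escape
from html.parser import HTMLParser


class _AttributeSorter(HTMLParser):
    """Hypothetical helper: re-serialize HTML with each tag's attributes sorted by name."""

    def __init__(self):
        # Keep character references as written so text is passed through unchanged.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _render_tag(self, tag, attrs, closing):
        rendered = "".join(
            " {}".format(name) if value is None
            else ' {}="{}"'.format(name, escape(value, quote=True))
            for name, value in sorted(attrs, key=lambda item: item[0])
        )
        return "<{}{}{}>".format(tag, rendered, closing)

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._render_tag(tag, attrs, ""))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._render_tag(tag, attrs, " /"))

    def handle_endtag(self, tag):
        self.parts.append("</{}>".format(tag))

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&{};".format(name))

    def handle_charref(self, name):
        self.parts.append("&#{};".format(name))

    def handle_comment(self, data):
        self.parts.append("<!--{}-->".format(data))


def stable_content_hash(html_text):
    """Return a SHA-256 hex digest that ignores the order of HTML attributes."""
    sorter = _AttributeSorter()
    sorter.feed(html_text)
    sorter.close()
    canonical = "".join(sorter.parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this sketch, stable_content_hash(entry.content[0].value) could replace a hash of the raw string. Two values that differ only in attribute order then hash identically, while a change to any attribute value or to the text still changes the digest.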
