Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling for link rel dns-prefetch and preconnect #536

Merged
merged 4 commits into from
Dec 9, 2021

Conversation

vdwijngaert
Copy link
Contributor

I changed a few things in regards to handling link tags, in particular dns-prefetch and preconnect.

  • rel attribute can have multiple values, separated by a space, so switched to substring comparison
  • preconnect usually uses the root URL (eg https://fonts.googleapis.com) for setting up a connection, but that URL can return a 404 in some cases, resulting in a false positive. I changed it to a DNS check for now. This could be changed in the future to use a check for a valid http connection.

I added some tests, too :)

@@ -157,7 +157,7 @@ def html_element(self, tag, attrs, element_text, lineno, column):
base = self.base_ref
# note: value can be None
value = attrs.get(attr)
if tag == 'link' and attrs.get('rel') == 'dns-prefetch':
if tag == 'link' and ('dns-prefetch' in attrs.get('rel') or 'preconnect' in attrs.get('rel')):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does attrs.get('rel') return when there's no rel attr? If None, the proposed code will crash with a TypeError.

Ideally there'd be a unit test with a <link> without rel, to see that this works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, welp, flake8 complains:

./linkcheck/htmlutil/linkparse.py:160:89: E501 line too long (106 > 88 characters)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. I added a fallback for this, and a test that makes sure no TypeError is raised (there's probably a better way, but it does the job).

@mgedmin
Copy link
Contributor

mgedmin commented May 19, 2021

Looks like we ran out of Travis CI credits and need to migrate to something else (GitHub Actions) before we can see whether tests pass on pull requests. sigh more work...

@mgedmin
Copy link
Contributor

mgedmin commented May 20, 2021

We now have working GitHub actions. Let's see what happens if I push the 'Update branch' button GitHub shoves at me...

Copy link
Contributor

@cjmayo cjmayo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking DNS only is allowed even in the Resource Hints Editor's Draft
https://w3c.github.io/resource-hints/#preconnect

@cjmayo cjmayo removed the request for review from mgedmin December 8, 2021 19:53
@cjmayo cjmayo closed this Dec 8, 2021
@cjmayo cjmayo reopened this Dec 8, 2021
@cjmayo cjmayo merged commit 900586d into linkchecker:master Dec 9, 2021
@cjmayo
Copy link
Contributor

cjmayo commented Dec 9, 2021

Thanks for the improvement and especially the tests.

github-actions bot pushed a commit that referenced this pull request Dec 9, 2021
…536)

preconnect is only DNS checked.

This is allowed even in the Resource Hints Editor's Draft
https://w3c.github.io/resource-hints/#preconnect 900586d
@vdwijngaert
Copy link
Contributor Author

You are most welcome! :) Glad to see this merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants