Links with %2A in them are incorrectly reported as 301 redirects #37

Open
videochums opened this issue Jul 24, 2020 · 7 comments

@videochums

Whenever the checker processes a link that contains %2A, it appears to convert the %2A to * and then reports a 301 redirect, even though the URL with %2A in it actually returns a 200 response.

Example: https://validator.w3.org/checklink?uri=https%3A%2F%2Fvideochums.com%2Freview%2Fgal-gun-double-peace

In the tested page above, the link in the source code is https://www.esrb.org/ratings/34456/Gal%2AGun%3A+Double+Peace/ which returns a 200 response. However, the tool treats the link as https://www.esrb.org/ratings/34456/Gal*Gun%3A+Double+Peace/ which 301 redirects.

@dontcallmedom
Member

In a URL, %2A is the URI-syntax escape equivalent of *. Since * is not a reserved character in the URI syntax, the linkchecker replaces the escaped value with *. The strange thing here is the server on esrb.org redirecting '*' to '%2A' - I don't think this is a bug in the checker.
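
The normalization described above can be sketched in Python. This is only an illustration, not the checker's actual code: `decode_unreserved` and the two character sets are hypothetical names introduced here. One set follows RFC 2396, whose "mark" characters include `*`; the other follows RFC 3986, which lists `*` among the reserved sub-delims instead. Which set the checker's URI library really uses is an assumption.

```python
import re
import string

# RFC 2396 "unreserved": alphanumerics plus the "mark" characters, which include "*".
UNRESERVED_RFC2396 = set(string.ascii_letters + string.digits + "-_.!~*'()")
# RFC 3986 "unreserved": alphanumerics plus "-", ".", "_", "~" only.
UNRESERVED_RFC3986 = set(string.ascii_letters + string.digits + "-._~")

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(url, unreserved):
    """Decode only the percent-escapes whose character is in `unreserved`."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        # Keep the escape as written unless the character counts as unreserved.
        return char if char in unreserved else match.group(0)
    return _ESCAPE.sub(repl, url)


link = "https://www.esrb.org/ratings/34456/Gal%2AGun%3A+Double+Peace/"
print(decode_unreserved(link, UNRESERVED_RFC2396))  # %2A becomes *, %3A is kept
print(decode_unreserved(link, UNRESERVED_RFC3986))  # link is left unchanged
```

With the older set the link comes out as `Gal*Gun%3A+Double+Peace`, which matches what the checker displays; with the newer set the `%2A` is left alone.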

@videochums
Author

videochums commented Sep 22, 2020

> In a URL, %2A is the URI-syntax escape equivalent of *. Since * is not a reserved character in the URI syntax, the linkchecker replaces the escaped value with *. The strange thing here is the server on esrb.org redirecting '*' to '%2A' - I don't think this is a bug in the checker.

Hello. It is a bug in the checker because the link to esrb has %2A in it and the checker is incorrectly converting it to * then detecting a redirect to %2A.

To be clear, the URL in the source website is:
https://www.esrb.org/ratings/34456/Gal%2AGun%3A+Double+Peace/
This directly links without redirect to:
https://www.esrb.org/ratings/34456/Gal%2AGun%3A+Double+Peace/ (the exact same URL)

However, the checker thinks that the source website has the link of:
https://www.esrb.org/ratings/34456/Gal*Gun%3A+Double+Peace/ (which is completely incorrect and causes a redirect)

@dontcallmedom
Member

Again, in a URL, %2A and * are one and the same. For instance, if you go to https://videochums.com/review/gal-gun-double-peace and hover over the said link, you'll see the link appear in your browser status bar with * in it, even if the HTML markup uses %2A.

@videochums
Author

> Again, in a URL, %2A and * are one and the same. For instance, if you go to https://videochums.com/review/gal-gun-double-peace and hover over the said link, you'll see the link appear in your browser status bar with * in it, even if the HTML markup uses %2A.

I understand that but the link checker is still erroneous in that it's reporting a 200 response as a 301 redirect.

@dontcallmedom
Member

Sorry, I should have been clearer - if you want to avoid the 301, you need to double-escape "%2A" in your link, that is, use "%252A" where you're currently using "%2A" ("%25" being interpreted as "%").

What happens right now is this: you link to "*-foo", but the page is really at "%2A-foo", so the server redirects you from the former to the latter. To hit "%2A-foo" directly, you need to percent-escape that name, which gives "%252A-foo".

Hope this is clearer.
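
The escaping arithmetic in that explanation can be checked with Python's standard `urllib.parse`. This is only a sketch of the encoding step; it says nothing about how the esrb.org server actually maps names to pages, and the `-foo` name is the placeholder from the example above.

```python
from urllib.parse import quote, unquote

name = "%2A-foo"                 # the literal name from the example above
escaped = quote(name, safe="")   # "%" itself is escaped as "%25"

print(escaped)           # %252A-foo
print(unquote(escaped))  # %2A-foo  (one round of decoding gives the name back)
print(unquote(name))     # *-foo    (decoding the single-escaped form gives "*")
```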

@videochums
Author

> Sorry, I should have been clearer - if you want to avoid the 301, you need to double-escape "%2A" in your link, that is, use "%252A" where you're currently using "%2A" ("%25" being interpreted as "%").
>
> What happens right now is this: you link to "*-foo", but the page is really at "%2A-foo", so the server redirects you from the former to the latter. To hit "%2A-foo" directly, you need to percent-escape that name, which gives "%252A-foo".
>
> Hope this is clearer.

Thanks for the suggestion. I changed it to https://www.esrb.org/ratings/34456/Gal%252AGun%3A+Double+Peace/ and it's doing a 301 redirect and the link checker is reporting a 301 redirect.

So I'm going to change it back to what it was, which gives a 200 response. The link checker reporting a 301 for that URL is a bug.

@videochums
Author

videochums commented Sep 22, 2020

Just to be clear - all of these are reported as 301 redirects using the link checker:

https://www.esrb.org/ratings/34456/Gal*Gun%3A+Double+Peace/
https://www.esrb.org/ratings/34456/Gal%2AGun%3A+Double+Peace/
https://www.esrb.org/ratings/34456/Gal%252AGun%3A+Double+Peace/

So, no matter what I change it to, the link checker reports a 301 redirect.

Also, the 2nd URL is displayed as the 1st URL in the checker results.

I'm going to stick with the 2nd URL because that results in a 200 response. Hopefully, the link checker can be updated to reflect that.
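
For anyone who wants to reproduce this outside the checker, here is a rough Python sketch that requests each URL exactly as written and reports the first response without following redirects. The helper names `first_response` and `report` are made up for this example, and using HEAD is an assumption: some servers answer HEAD differently from GET, so results may not match a browser.

```python
import http.client
from urllib.parse import urlsplit

URLS = (
    "https://www.esrb.org/ratings/34456/Gal*Gun%3A+Double+Peace/",
    "https://www.esrb.org/ratings/34456/Gal%2AGun%3A+Double+Peace/",
    "https://www.esrb.org/ratings/34456/Gal%252AGun%3A+Double+Peace/",
)


def first_response(url, connection_cls=http.client.HTTPSConnection):
    """Return (status, Location header) of the first response, redirects not followed."""
    parts = urlsplit(url)  # urlsplit does not decode percent-escapes
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = connection_cls(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", target)  # the target is sent as written
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def report(urls=URLS):
    """Print the first status and redirect target for each URL (needs network access)."""
    for url in urls:
        print(url, first_response(url))
```

Calling `report()` prints one line per URL, so the three variants can be compared side by side.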
