HTTP_403 while curl gives HTTP_200 #58

Open
dbogatov opened this issue Jan 2, 2017 · 11 comments

dbogatov commented Jan 2, 2017

I have encountered a link that blc reports as broken but that opens fine in curl or a browser.

Here it is:

blc https://www.nginx.com

curl works fine:

$ curl -I https://www.nginx.com
HTTP/1.1 200 OK
Date: Mon, 02 Jan 2017 06:52:15 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
X-Pingback: https://www.nginx.com/xmlrpc.php
Link: <https://www.nginx.com/wp-json/>; rel="https://api.w.org/"
Link: <https://www.nginx.com/>; rel=shortlink
Link: <https://www.nginx.com/wp-json>; rel="https://github.com/WP-API/WP-API"
X-User-Agent: standard
X-Cache-Config: 0 0
Vary: Accept-Encoding, User-Agent
X-Cache-Status: MISS
Server: nginx
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
X-Sucuri-ID: 14010

but blc does not:

$ blc https://www.nginx.com
Getting links from: https://www.nginx.com/
Error: HTML could not be retrieved

Setting a user agent does not help:

$ blc --input https://www.nginx.com --user-agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.4.3 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.3"
Getting links from: https://www.nginx.com/
Error: HTML could not be retrieved

What is the problem?
Is this specific to the NGINX site, or is a larger set of websites affected?

Owner

stevenvachon commented Jan 3, 2017

It could be caused by bhttp, which does not have an https agent and only sort of works with https.

Author

dbogatov commented Jan 3, 2017

Thanks!

So the issue will persist until bhttp releases a fix, right?

Owner

stevenvachon commented Jan 3, 2017

Yeah. I've spoken with that project's creator, and he's been working on the next major version. Not sure when it will be released, though.

Author

dbogatov commented Jan 3, 2017

Great, thank you!

I would like to leave the issue open, if you don't mind.
I'll close it as soon as they release a fix.

Owner

stevenvachon commented Jan 3, 2017

No problem. I think it makes sense to keep it open, as it is an issue that needs fixing, and one he plans to fix along with breaking changes in a major release.

Author

dbogatov commented Apr 11, 2017

Hi!

It's been more than 3 months. Has bhttp released the fix?

Owner

stevenvachon commented Apr 11, 2017

Nope 👎
I'll have to switch to something else in 0.8.x, and so far I've been looking at axios.

vkotovv commented May 12, 2017

@stevenvachon can I exclude links with a 403 error from the blc report?

Owner

stevenvachon commented May 12, 2017

@vkotovv Not currently in the CLI. You can create a custom report with the programmatic API, though.
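
For anyone landing here, a rough sketch of what such a custom report might look like. It assumes link results shaped like those of the 0.7.x programmatic API (`broken`, `brokenReason`, `url.resolved`); the names `IGNORED_REASONS`, `shouldReport`, and `createReport` are made up for this example and are not part of blc.

```javascript
// Sketch: a custom report that leaves HTTP_403 results out.
// Assumption: each link result looks like
//   { broken, brokenReason, url: { resolved } }

// Reasons to leave out of the report (hypothetical choice for this thread).
const IGNORED_REASONS = new Set(["HTTP_403"]);

// True when a link result should be listed as broken.
function shouldReport(result) {
  return Boolean(result.broken) && !IGNORED_REASONS.has(result.brokenReason);
}

// Builds handlers that collect reportable links into `broken`.
// `log` is injectable so the report can be captured instead of printed.
function createReport(log = console.log) {
  const broken = [];
  const handlers = {
    // Called once per link found on a page.
    link(result) {
      if (shouldReport(result)) {
        broken.push({ url: result.url.resolved, reason: result.brokenReason });
      }
    },
    // Called when the queue is empty; prints the filtered summary.
    end() {
      for (const { url, reason } of broken) {
        log(`BROKEN ${url} (${reason})`);
      }
      log(`${broken.length} broken link(s), ignoring ${[...IGNORED_REASONS].join(", ")}`);
    },
  };
  return { broken, handlers };
}

// Wiring, assuming broken-link-checker is installed:
//   const blc = require("broken-link-checker");
//   const report = createReport();
//   new blc.HtmlUrlChecker({}, report.handlers).enqueue("https://www.nginx.com/");
```

Note that this only filters individual link results. When the page itself cannot be fetched ("HTML could not be retrieved"), no link results are produced, so the filter does not help with the original report in this issue.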

Glavin001 commented Jul 2, 2019

Is there a workaround for this? I am currently blocked. node-fetch works, bhttp simply does not.

Owner

stevenvachon commented Jul 16, 2019

Is this fixed in the v0.8.0 branch?
