Using proxy through http fails (https works) #4505
Comments
Could you share a code snippet to reproduce this issue? |
Does your |
Apparently, this was the issue. My What I want to mention is that this behavior can be confusing. I will describe 4 cases:
In my case, the solution was to add the 'http' in front of my |
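For readers hitting the same error, the workaround above (adding the scheme in front of the proxy address) could be sketched as a small helper. This is only an illustration: `ensure_proxy_scheme` is a hypothetical name, not part of Scrapy or Twisted, and the addresses used are placeholders.

```python
def ensure_proxy_scheme(proxy_url: str, default_scheme: str = "http") -> str:
    """Return proxy_url unchanged if it already names a scheme,
    otherwise prepend default_scheme (hypothetical helper)."""
    # A URL that already contains "://" is assumed to carry its own scheme.
    if "://" in proxy_url:
        return proxy_url
    return f"{default_scheme}://{proxy_url}"


# Placeholder addresses, for illustration only.
print(ensure_proxy_scheme("127.0.0.1:8080"))
print(ensure_proxy_scheme("https://127.0.0.1:8080"))
```

The normalized value can then be used wherever the proxy address is configured, for example in a request's `meta["proxy"]` entry.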
I guess we can treat this as an enhancement to support scheme-less HTTP proxy URLs. I checked, and there is no bug: the logic that handles HTTP proxies differs from the logic that handles HTTPS proxies, and the HTTPS one is implemented in such a way that the scheme is not needed in the proxy URL. As a reference for people wishing to work on this, it should be as simple as modifying |
@Gallaecio I would like to contribute. I will take this on as my first open source contribution. |
@liveprasad are you still working on this? If not I can take it. |
Hi, I'm pretty new to open source. I have something that is working, but I'm having trouble implementing a test case as required by the contributing docs. |
@HausCloud Create a pull request with the current state of your changes. Maybe we can help you with the rest. |
@Gallaecio Will do! Thanks. UPDATE: Done! If any adjustments are needed, I can fix it! I'd probably need a hint towards the right direction for testing however. |
I noticed the pull request was closed accidentally, and the branch @HausCloud was working on seems to have been deleted. Is the issue open to work on, or is someone already on it? @Gallaecio |
Description
When I scrape without a proxy, both https and http URLs work.
Using a proxy with https URLs works just fine. My problem is when I try http URLs.
At that point I get the
twisted.web.error.SchemeNotSupported: Unsupported scheme: b''
error. As far as I can see, most people have this issue the other way around.
Steps to Reproduce
Expected behavior: Get a 200 with the desired data.
Actual behavior:
Reproduces how often: Every time I scrape with proxy
Versions
Additional context
I tried adding some breakpoints near the end to see where it breaks.
I added the following lines in "twisted/web/client.py", just before the point of failure:
Apparently, at this point there is no scheme. If I run the same code with an https URL, this code is never reached. It seems that reaching this point at all is wrong, and that the proxy is not being used.
(edited to apply formatting)